## Tackling the Minimal Superpermutation Problem

2014-08-22 § 5 Comments

What’s the shortest string that contains every possible permutation of ABCD somewhere inside it? As it happens, it’s 33 letters long: ABCDABCADBCABDCABACDBACBDACBADCBA. A string like this is called a *minimal superpermutation*.

So what’s the shortest string that contains every possible permutation of ABCDE? It was recently shown that 153 letters is the shortest possible, and that there are eight different superpermutations of this length.

Okay, what about ABCDEF? The answer is that **nobody knows**. Until this week the shortest known superpermutation of ABCDEF was 873 letters long:

ABCDEFABCDEAFBCDEABFCDEABCFDEABCDFEABCDAEFBCDAEBFCDAEBCFDAEBCDFAEB CDAFEBCDABEFCDABECFDABECDFABECDAFBECDABFECDABCEFDABCEDFABCEDAFBCED ABFCEDABCFEDABCADEFBCADEBFCADEBCFADEBCAFDEBCADFEBCADBEFCADBECFADBE CAFDBECADFBECADBFECADBCEFADBCEAFDBCEADFBCEADBFCEADBCFEADBCAEFDBCAE DFBCAEDBFCAEDBCFAEDBCAFEDBCABDEFCABDECFABDECAFBDECABFDECABDFECABDC EFABDCEAFBDCEABFDCEABDFCEABDCFEABDCAEFBDCAEBFDCAEBDFCAEBDCFAEBDCAF EBDCABEFDCABEDFCABEDCFABEDCAFBEDCABFEDCABACDEFBACDEBFACDEBAFCDEBAC FDEBACDFEBACDBEFACDBEAFCDBEACFDBEACDFBEACDBFEACDBAEFCDBAECFDBAECDF BAECDBFAECDBAFECDBACEFDBACEDFBACEDBFACEDBAFCEDBACFEDBACBDEFACBDEAF CBDEACFBDEACBFDEACBDFEACBDAEFCBDAECFBDAECBFDAECBDFAECBDAFECBDACEFB DACEBFDACEBDFACEBDAFCEBDACFEBDACBEFDACBEDFACBEDAFCBEDACFBEDACBFEDA CBADEFCBADECFBADECBFADECBAFDECBADFECBADCEFBADCEBFADCEBAFDCEBADFCEB ADCFEBADCBEFADCBEAFDCBEADFCBEADCFBEADCBFEADCBAEFDCBAEDFCBAEDCFBAED CBFAEDCBAFEDCBA

and it was thought that might be the shortest possible.

But we now know it isn’t, because I found a shorter one:

ABCDEFABCDEAFBCDEABFCDEABCFDEACBFDEACFBDEACFDBEACFDEBACFDEABCDFEAB CDAEFBCDAEBFCDAEBCFDAEBCDFAEBCDAFEBCDABEFCDABECFDABECDFABECDAFBECD ABFECDABCEFDABCEDFABCEDAFBCEDABFCEDABCFEDACBFEDCABFDECAFBDCEAFBDCA EFBDCAFEBDCAFBEDCAFBDECAFDBECADFBECADBFECADBEFCADBECFADBECAFDEBCAD FEBCADEFBCADEBFCADEBCFADEBCAFDECBAFDECABFDCEABFDCAEBFDCABEFDCBAEFD BCAEDFBCAEDBFCAEDBCFAEDBCAFEDBCAEFDBACEFDBAECFBDAECFBADECFBAEDCFBA ECDFBACEDFBACDEFBACDFEBACDFBEACDFBAECFDBAEFCDBAFECDBAFCEDBAFCDEBAF CDBEAFCDBAEFDCBEAFDCBEFADCBEFDACBEFDCABFEDCBAFEDCBFAECDBFACEDBFACD EBFACDBEFACDBFEACDBFAECBDFEACBDFECABDFCEABDFCAEBDFCABEDFCBAEDFCBEA DFCBEDAFCBEDFACBEDFCABDEFCBADEFCBDAEFCBDEAFCBDEFACBDEFCABDFECBADFE CBDAFECBDFAECBFDAECBFADECBFAEDCBFEADCFBEADCFEBADCEFBADCEBFADCEBAFD CEBADFCEBADCFEABDCFAEBDCFABEDCFABDECFABDCEFABDCFEADBCEFADBCEAFDBCE ADFBCEADBFCEADBCFEADCBFEDACFBEDACFEBDACEFBDACEBFDACEBDFACEBDAFCEBD ACFEDBACFEDABC

I’ve uploaded a short note about it to the arxiv.

Nice achievement, Robin, congrats!

I have a few comments.

The standard notation to describe a superpermutation is with numbers. Is there any reason why you changed the standard notation to the alphabet? Is it because the phrasing of the problem is with strings and substrings and therefore using the alphabet is more pertinent?

Talking about your note on the Arxiv, thanks for section 4 (first paragraph), it is very enlightening! It is a shame that Concorde algorithm failed for n=6. But at least you found a counterexample to the conjectured minimum by the randomized method. The counterexample makes the problem more intriguing.

There is an obvious connection of this problem to the De Bruijn sequences (http://en.wikipedia.org/wiki/De_Bruijn_sequence) as Nathaniel Johnston describes in his paper. In De Bruijn sequences you have to fit in all possible sequences of length n (including the ones with repetitions). BTW, one of the standard methods of obtaining De Bruijn sequences is through Hamiltonian cycle in the De Bruijn graph. Did you get the great idea of using the TSP from there? Anyway, there is another difference between both problems, De Bruijn squences are cyclical, while in superpermutations the beginning and the end are not linked.

So, finally, have you considered the problem of having a “cyclical superpermutation”? That is, a string with no beginning and no end of min length such that all permutations of n symbols appear as contiguos substrings? This problem is more related to De Bruijn sequences and IMO it is more natural. The problem you are attacking has some boundaries (beginning & end) which make it a bit ackward. It is a much less symmetric problem. I have searched a little bit the literature but found nothing about a “cyclical superpermutation” problem or something like that. Of course the TSP is easier to apply because you don’t have to “change the weights of certain edges to 0″.

One last thing, you apply the TSP to a complete weighted directed graph with n! vertices. But can’t you delete edges that have a weight of 3 or more? I am not sure about this, but intuitively it seems that the discrepancy can not be very high at any step, right?

Thanks, Roberto. Nice comment.

There’s no good reason I used ABCD rather than 1234 here. It was just for the sake of variety. As you say, there’s no particular reason to prefer one set of symbols over another. I think I was influenced by this tweet of Matt Parker’s from March.

I wasn’t explicitly thinking about De Bruijn sequences. A reddit discussion last year explicitly relates the minimal superpermutation problem to finding a minimal-weight Hamiltonian path, so that wasn’t an original idea. I think in any case it’s a natural way to think about the problem.

It’s very natural to think about cyclical superpermutations. One of the reasons I thought the conjecture must be false was that the corresponding conjecture for cyclical superpermutations had already been falsified at n=5 by Benjamin Chaffin’s examples.

I played around with deleting edges of weight above a particular threshold. If you look at my Python script for generating the instances, you’ll see options to do that. Interestingly I found that LKH performed less well with this change.

~~Another observation is that the edges that occur in the known minimal paths are all of the form~~. Edit: not quite true. But restricting the graph to edges of this form again seems to hinder rather than help the search.>It’s very natural to think about cyclical superpermutations. One of the

>reasons I thought the conjecture must be false was that the

>corresponding conjecture for cyclical superpermutations had already

>been falsified at n=5 by Benjamin Chaffin’s examples.

That was a great insight! So looking at Ben Chaffin’s examples (http://www.njohnston.ca/superperm5.txt) the first 2 would give a cyclical superpermutation of length 152, and the other 6 a length of 151. But it looks that a cyclical superpermutation of length 150 could be possible: it would be obtained from a classical superpermutation of length 154 that ends with …1234.

> I played around with deleting edges of weight above a particular

> threshold. … Interestingly I found that LKH performed less well with

> this change.

Surprising.

> Another observation is that the edges that occur in the known minimal

> paths are all of the form s.t\rightarrow t.s^{\mathrm{rev}}.

There must be a reason for that.

> But restricting the graph to edges of this form again seems to hinder

> rather than help the search.

Again surprising!

There isn’t a five-symbol cyclical superpermutation of length 150. It isn’t hard to argue that such a thing would imply a classical superpermutation of length 152; or of course one can throw Concorde at it.

Well, after some more searching I’ve found something similar to what I was calling “cyclical superpermutation” in the previous comment. It is the concept of “universal cycles” (http://www.math.ucsd.edu/~ronspubs/92_06_universal_cycles.pdf and http://arxiv.org/pdf/0710.5611.pdf).

A “universal cycle” is a cyclic sequence where you have to fit in all the substrings of a set (for example the set of all permutations of [n]={1,2,..,n}), but as in De Bruijn sequences each substring has to appear *exactly* once. It is proved that for permutations this can not be done so they attack the problem of k-permutations of [n] for k<n. The case k=n-1 is important and can be solved.

The condition of a unique appearance of each substring in "universal cycles" makes the problem different than the "cyclic superpermutations".

So, have “cyclical superpermutation” already been studied?