I wrote about superpermutations here: a superpermutation is a string that has as substrings all the permutations of some set of symbols.
For example, there are six permutations of the symbols 1, 2, 3. They are: 123, 132, 213, 231, 312, 321. The string
123121321 has all six of these as substrings – and it’s the shortest string that does, with only one three-character substring that isn’t a permutation: the 121 in the middle.
For larger numbers of symbols it’s impossible to do quite so well. For example, there are 24 different permutations of the symbols 1, 2, 3, 4, and the shortest superpermutation is
123412314231243121342132413214321, which has 33 characters. So there are 30 different four-character substrings, and six of them contain repeated symbols:
1213 (which overlap with each other);
Here it is again, with the non-permutations highlighted:
No general formula is known for the length of the shortest superpermutation on n symbols. If we denote this length s(n), then it is known that for n ≥ 3,
and it is known that neither bound is tight in general. We know the lower bound is not tight because s(5) has been computed exactly and is equal to 153, whereas the formula for the lower bound gives 152. We know the upper bound is not tight because there are superpermutations on 6 symbols of length 872, so we know that s(6) ≤ 872, whereas the formula for the upper bound gives 873.
Since I wrote the above, there has been a major improvement. Yesterday (19 October 2018) Greg Egan announced on Twitter the results of a new algorithm he has found for constructing superpermutations, which appears to give an improved upper bound
This new technique gives superpermutations shorter than any that were previously known for n≥7.
The original purpose of this post was to prove the lower bound
but unfortunately there were several errors in my attempt: thanks to Greg Egan for alerting me to one, and Zizoz in the comments below for spotting a couple more. Also, my approach – even if the general strategy can somehow be made to work – is a less elegant argument than the anonymous proof on Wikia, which on rereading is not as hard to understand as I originally found it.
I’ll leave the preliminaries here, as a reference for the relevant definitions.
Define the permutation graph (on n symbols) to be a weighted directed graph whose nodes are all the permutations of the symbols. So there are n! nodes. We’ll regard it as a complete graph, i.e. for any two nodes A and B there is an edge A→B, and the weight of the edge is the number of symbols you need to add to A to get to B. More formally, it’s the length of the shortest string that has A as a prefix and B as a suffix, minus n.
The point of this, of course, is that superpermutations correspond to Hamiltonian paths in the permutation graph, where the length of the superpermutation is n plus the weight of the path.
I’ll use terminology in what I hope is a standard way. A walk is a sequence of edges that connect a sequence of vertices. A path is a walk all of whose vertices are distinct. A cycle is a walk whose first and last vertices are the same. The length of a walk is the number of edges it contains; the weight of a walk is the sum of the weights of those edges. A Hamiltonian path is a path that visits all nodes; a Hamiltonian tour is a cycle that visits each node precisely once.
For example, for n=3 the six nodes are labelled 123, 132, 213, 231, 312, and 321. In the diagram below, only the edges of weights 1 and 2 are shown; if no edge is drawn between two nodes, that means the edge between them has weight 3.
There are two different types of weight-2 edges shown here. The edge 123→312 can be broken down into a longer path of the same weight (123→231→312), but the edge 123→321 cannot be broken into a longer path of the same weight. Let’s say an edge is improper if there is a longer path of the same weight joining the same two nodes, and proper if not. (In this example all the weight-3 edges are improper.)
Every node has precisely one proper and one improper weight-2 edge coming out of it. (This is true in general, not only for n=3.)
There are no known minimal superpermutations that use improper edges, though I don’t think it has been proved that there cannot be.
Define a 1-cycle in the permutation graph to be a cycle consisting entirely of edges of weight 1. Notice that 1-cycles have length n and weight n; and that every node belongs to exactly one of these cycles, so the graph is partitioned into 1-cycles.
Define a 2-cycle to be a cycle of n(n-1) edges, alternating between sequences of (n-1) weight-1 edges and proper weight-2 edges. The n=3 example above consists of a single 2-cycle 123→231→312⇒213→132→321⇒123.