Superpermutations: lower bound

I wrote about superpermutations here: a superpermutation is a string that has as substrings all the permutations of some set of symbols.

For example, there are six permutations of the symbols 1, 2, 3. They are: 123, 132, 213, 231, 312, 321. The string 123121321 has all six of these as substrings – and it’s the shortest string that does, with only one three-character substring that isn’t a permutation: the 121 in the middle.

For larger numbers of symbols it’s impossible to do quite so well. For example, there are 24 different permutations of the symbols 1, 2, 3, 4, and the shortest superpermutation is 123412314231243121342132413214321, which has 33 characters. So there are 30 different four-character substrings, and six of them contain repeated symbols: 1231, 2312; 3121 and 1213 (which overlap with each other); 2132, and 1321.

Here it is again, with the non-permutations highlighted:


No general formula is known for the length of the shortest superpermutation on n symbols. If we denote this length s(n), then it is known that for n ≥ 3,

n! + (n-1)! + (n-2)! + (n-3) \le s(n)\le \sum_{i=1}^ni!

and it is known that neither bound is tight in general. We know the lower bound is not tight because s(5) has been computed exactly and is equal to 153, whereas the formula for the lower bound gives 152. We know the upper bound is not tight because there are superpermutations on 6 symbols of length 872, so we know that s(6) ≤ 872, whereas the formula for the upper bound gives 873.

Stop Press!

Since I wrote the above, there has been a major improvement. Yesterday (19 October 2018) Greg Egan announced on Twitter the results of a new algorithm he has found for constructing superpermutations, which appears to give an improved upper bound

s(n)\le n! + (n-1)! + (n-2)! + (n-3)! + (n-3)

This new technique gives superpermutations shorter than any that were previously known for n≥7.

Mea culpa

The original purpose of this post was to prove the lower bound

n! + (n-1)! + (n-2)! + (n-3) \le s(n)

but unfortunately there were several errors in my attempt: thanks to Greg Egan for alerting me to one, and Zizoz in the comments below for spotting a couple more. Also, my approach – even if the general strategy can somehow be made to work – is a less elegant argument than the anonymous proof on Wikia, which on rereading is not as hard to understand as I originally found it.

I’ll leave the preliminaries here, as a reference for the relevant definitions.


Define the permutation graph (on n symbols) to be a weighted directed graph whose nodes are all the permutations of the symbols. So there are n! nodes. We’ll regard it as a complete graph, i.e. for any two nodes A and B there is an edge A→B, and the weight of the edge is the number of symbols you need to add to A to get to B. More formally, it’s the length of the shortest string that has A as a prefix and B as a suffix, minus n.

The point of this, of course, is that superpermutations correspond to Hamiltonian paths in the permutation graph, where the length of the superpermutation is n plus the weight of the path.

I’ll use terminology in what I hope is a standard way. A walk is a sequence of edges that connect a sequence of vertices. A path is a walk all of whose vertices are distinct. A cycle is a walk whose first and last vertices are the same. The length of a walk is the number of edges it contains; the weight of a walk is the sum of the weights of those edges. A Hamiltonian path is a path that visits all nodes; a Hamiltonian tour is a cycle that visits each node precisely once.

For example, for n=3 the six nodes are labelled 123, 132, 213, 231, 312, and 321. In the diagram below, only the edges of weights 1 and 2 are shown; if no edge is drawn between two nodes, that means the edge between them has weight 3.


There are two different types of weight-2 edges shown here. The edge 123→312 can be broken down into a longer path of the same weight (123→231→312), but the edge 123→321 cannot be broken into a longer path of the same weight. Let’s say an edge is improper if there is a longer path of the same weight joining the same two nodes, and proper if not. (In this example all the weight-3 edges are improper.)

Every node has precisely one proper and one improper weight-2 edge coming out of it. (This is true in general, not only for n=3.)

There are no known minimal superpermutations that use improper edges, though I don’t think it has been proved that there cannot be.

Cycle structures

Define a 1-cycle in the permutation graph to be a cycle consisting entirely of edges of weight 1. Notice that 1-cycles have length n and weight n; and that every node belongs to exactly one of these cycles, so the graph is partitioned into 1-cycles.

Define a 2-cycle to be a cycle of n(n-1) edges, alternating between sequences of (n-1) weight-1 edges and proper weight-2 edges. The n=3 example above consists of a single 2-cycle 123→231→312⇒213→132→321⇒123.

This entry was posted in chatter. Bookmark the permalink.

4 Responses to Superpermutations: lower bound

  1. Zizoz says:

    Hi! I found this post via Twitter.
    The omitted case seems unnecessary to me: if we let w be the maximal weight of an edge, then for any tour with w < 3, weight ≥ n! + (n-1)! + (n-2)! implies weight ≥ n! + (n-1)! + (n-2)! + (w-3) because w-3 < 0.
    Unfortunately, I think I see a problem elsewhere: you write "Suppose it [the bound of (n+1)(n-1) weight for a path of length n(n-1)] were attained: then each contiguous subpath of length n would consist of (n-1) edges of weight 1 and one (proper) edge of weight 2." This is not true, and in fact the bound can be obtained; consider, for n=4, the path
    1234→2341 (weight 1)
    2341→3412 (weight 1)
    3412→4123 (weight 1)
    4123→2314 (weight 2)
    2314→1432 (weight 2)
    1432→4321 (weight 1)
    4321→3214 (weight 1)
    3214→2143 (weight 1)
    2143→4312 (weight 2)
    4312→3124 (weight 1)
    3124→1243 (weight 1)
    1243→2431 (weight 1)
    which has a length of 12 and a total weight of 15.
    I believe the Wikia proof has an analogous flaw in that it assumes that you do not leave a 1-loop until it is completed. (If that strategy could be proven optimal, I think it would fix the proof here as well.)

    • Zizoz says:

      I realize now that I had misinterpreted the Wikia proof, and in fact it dealt with the issue I pointed out already (by not counting incomplete 1-loops other than the current one).

    • I think you are right on both counts. Thanks! I should probably just delete this whole failed proof attempt. The 4chan/wikia proof is nicer anyway.

  2. rcolombari says:

    I wasn’t aware of the Haruhi problem up to these days, when someone on 4chan gave a kind of solution.

    I had a look on it and I’d like to post here my approach.

    The shortest string containing all the permutations of n digits occurs when you are able to take advantage of the maximum overlap between each permutation, obviously.

    Given a permutation of n digits, I can find at maximum (n – 1) different permutations that grant the maximum overlap (n – 1) with an overall length of (2n – 1) digits.

    Than I have to lower down to an overlap of (n – 2) for finding a new different permutation that grants other (n – 1) different ones with maximum overlap (n – 1); this new series has length equal to 2n – 1 – (n – 2) (I’m subtracting n – 2 because of the overlap with the previous permutation series).

    Lets call L a string of n permutations with overlap (n – 1) between them.

    I can put in a row (n – 1) L strings with overlap between them equal to (n – 2) than I have to lower down to (n – 3) for being able to create a new block.

    Here it follows the L string lengths for n = 5:

    L1 => length 9 (2n – 1)

    L2 => length 6 (2n – 1 – (n – 2) for the overlap with L1)

    L3 => length 6

    L4 => length 6


    L5 => length 7 (2n – 1 – (n – 3) for the overlap with L4)

    L6 => length 6

    L7 => length 6

    L8 => length 6


    L9 => length 7 (2n – 1 – (n – 3) for the overlap with L8)

    L10 => length 6

    L11 => length 6

    L12 => length 6


    L13 => length 7 (2n – 1 – (n – 3) for the overlap with L12)

    L14 => length 6

    L15 => length 6

    L16 => length 6


    L17 => length 7 (2n – 1 – (n – 3) for the overlap with L16)

    L18 => length 6

    L19 => length 6

    L20 => length 6


    L21 => length 7 (2n – 1 – (n – 3) for the overlap with L20)

    L22 => length 6

    L23 => length 6

    L24 => length 6

    Total = 152

    Lower bound formula = n^2(n – 2)! + n – 3

    Hope it helps.

    Best regards.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s