I wrote about superpermutations here: a superpermutation is a string that has as substrings all the permutations of some set of symbols.

For example, there are six permutations of the symbols 1, 2, 3. They are: 123, 132, 213, 231, 312, 321. The string `123121321`

has all six of these as substrings – and it’s the shortest string that does, with only one three-character substring that isn’t a permutation: the 121 in the middle.

For larger numbers of symbols it’s impossible to do quite so well. For example, there are 24 different permutations of the symbols 1, 2, 3, 4, and the shortest superpermutation is `123412314231243121342132413214321`

, which has 33 characters. So there are 30 different four-character substrings, and six of them contain repeated symbols: `1231`

, `2312`

; `3121`

and `1213`

(which overlap with each other); `2132`

, and `1321`

.

Here it is again, with the non-permutations highlighted:

`1234`

**1231**4**2312**4**31213**4**2132**4**1321**4321

No general formula is known for the length of the shortest superpermutation on n symbols. If we denote this length *s*(*n*), then it is known that for *n* ≥ 3*,*

and it is known that neither bound is tight in general. We know the lower bound is not tight because *s*(5) has been computed exactly and is equal to 153, whereas the formula for the lower bound gives 152. We know the upper bound is not tight because there are superpermutations on 6 symbols of length 872, so we know that *s*(6) ≤ 872, whereas the formula for the upper bound gives 873.

## Stop Press!

Since I wrote the above, there has been a major improvement. Yesterday (19 October 2018) Greg Egan announced on Twitter the results of a new algorithm he has found for constructing superpermutations, which appears to give an improved upper bound

This new technique gives superpermutations shorter than any that were previously known for *n≥7.*

## Mea culpa

The original purpose of this post was to prove the lower bound

but unfortunately there were several errors in my attempt: thanks to Greg Egan for alerting me to one, and Zizoz in the comments below for spotting a couple more. Also, my approach – even if the general strategy *can *somehow be made to work – is a less elegant argument than the anonymous proof on Wikia, which on rereading is not as hard to understand as I originally found it.

I’ll leave the preliminaries here, as a reference for the relevant definitions.

## Preamble

Define the **permutation graph** (on *n* symbols) to be a weighted directed graph whose nodes are all the permutations of the symbols. So there are *n*! nodes. We’ll regard it as a complete graph, i.e. for any two nodes A and B there is an edge A→B, and the weight of the edge is the number of symbols you need to add to A to get to B. More formally, it’s the length of the shortest string that has A as a prefix and B as a suffix, minus *n*.

The point of this, of course, is that superpermutations correspond to Hamiltonian paths in the permutation graph, where the length of the superpermutation is *n* plus the weight of the path.

I’ll use terminology in what I hope is a standard way. A **walk** is a sequence of edges that connect a sequence of vertices. A **path** is a walk all of whose vertices are distinct. A **cycle** is a walk whose first and last vertices are the same. The **length** of a walk is the number of edges it contains; the **weight** of a walk is the sum of the weights of those edges. A **Hamiltonian path** is a path that visits all nodes; a **Hamiltonian tour** is a cycle that visits each node precisely once.

For example, for *n*=3 the six nodes are labelled 123, 132, 213, 231, 312, and 321. In the diagram below, only the edges of weights 1 and 2 are shown; if no edge is drawn between two nodes, that means the edge between them has weight 3.

There are two different types of weight-2 edges shown here. The edge 123→312 can be broken down into a longer path of the same weight (123→231→312), but the edge 123→321 cannot be broken into a longer path of the same weight. Let’s say an edge is **improper** if there is a longer path of the same weight joining the same two nodes, and **proper** if not. (In this example all the weight-3 edges are improper.)

Every node has precisely one proper and one improper weight-2 edge coming out of it. (This is true in general, not only for *n*=3.)

There are no known minimal superpermutations that use improper edges, though I don’t think it has been proved that there cannot be.

## Cycle structures

Define a **1-cycle** in the permutation graph to be a cycle consisting entirely of edges of weight 1. Notice that 1-cycles have length *n* and weight *n;* and that every node belongs to exactly one of these cycles, so the graph is partitioned into 1-cycles.

Define a **2-cycle** to be a cycle of *n*(*n*-1) edges, alternating between sequences of (*n*-1) weight-1 edges and proper weight-2 edges. The *n*=3 example above consists of a single 2-cycle 123→231→312⇒213→132→321⇒123.

Hi! I found this post via Twitter.

The omitted case seems unnecessary to me: if we let w be the maximal weight of an edge, then for any tour with w < 3, weight ≥ n! + (n-1)! + (n-2)! implies weight ≥ n! + (n-1)! + (n-2)! + (w-3) because w-3 < 0.

Unfortunately, I think I see a problem elsewhere: you write "Suppose it [the bound of (n+1)(n-1) weight for a path of length n(n-1)] were attained: then each contiguous subpath of length n would consist of (n-1) edges of weight 1 and one (proper) edge of weight 2." This is not true, and in fact the bound can be obtained; consider, for n=4, the path

1234→2341 (weight 1)

2341→3412 (weight 1)

3412→4123 (weight 1)

4123→2314 (weight 2)

2314→1432 (weight 2)

1432→4321 (weight 1)

4321→3214 (weight 1)

3214→2143 (weight 1)

2143→4312 (weight 2)

4312→3124 (weight 1)

3124→1243 (weight 1)

1243→2431 (weight 1)

which has a length of 12 and a total weight of 15.

I believe the Wikia proof has an analogous flaw in that it assumes that you do not leave a 1-loop until it is completed. (If that strategy could be proven optimal, I think it would fix the proof here as well.)

I realize now that I had misinterpreted the Wikia proof, and in fact it dealt with the issue I pointed out already (by not counting incomplete 1-loops other than the current one).

I think you are right on both counts. Thanks! I should probably just delete this whole failed proof attempt. The 4chan/wikia proof is nicer anyway.

Hi.

I wasn’t aware of the Haruhi problem up to these days, when someone on 4chan gave a kind of solution.

I had a look on it and I’d like to post here my approach.

The shortest string containing all the permutations of n digits occurs when you are able to take advantage of the maximum overlap between each permutation, obviously.

Given a permutation of n digits, I can find at maximum (n – 1) different permutations that grant the maximum overlap (n – 1) with an overall length of (2n – 1) digits.

Than I have to lower down to an overlap of (n – 2) for finding a new different permutation that grants other (n – 1) different ones with maximum overlap (n – 1); this new series has length equal to 2n – 1 – (n – 2) (I’m subtracting n – 2 because of the overlap with the previous permutation series).

Lets call L a string of n permutations with overlap (n – 1) between them.

I can put in a row (n – 1) L strings with overlap between them equal to (n – 2) than I have to lower down to (n – 3) for being able to create a new block.

Here it follows the L string lengths for n = 5:

L1 => length 9 (2n – 1)

L2 => length 6 (2n – 1 – (n – 2) for the overlap with L1)

L3 => length 6

L4 => length 6

_______________________________________

L5 => length 7 (2n – 1 – (n – 3) for the overlap with L4)

L6 => length 6

L7 => length 6

L8 => length 6

_______________________________________

L9 => length 7 (2n – 1 – (n – 3) for the overlap with L8)

L10 => length 6

L11 => length 6

L12 => length 6

_______________________________________

L13 => length 7 (2n – 1 – (n – 3) for the overlap with L12)

L14 => length 6

L15 => length 6

L16 => length 6

_______________________________________

L17 => length 7 (2n – 1 – (n – 3) for the overlap with L16)

L18 => length 6

L19 => length 6

L20 => length 6

_______________________________________

L21 => length 7 (2n – 1 – (n – 3) for the overlap with L20)

L22 => length 6

L23 => length 6

L24 => length 6

Total = 152

Lower bound formula = n^2(n – 2)! + n – 3

Hope it helps.

Best regards.