Asymptotically half of binary words are shuffle squares

Series
Stochastics Seminar
Time
Thursday, April 16, 2026 - 3:30pm for 1 hour (actually 50 minutes)
Location
Skiles 006
Speaker
Logan Post – Georgia Institute of Technology – lpost3@gatech.eduhttps://sites.google.com/view/loganpost
Organizer
Benjamin McKenna

A binary shuffle square is a binary word of even length that can be partitioned into two disjoint, identical subwords. While recognizing shuffle squares is NP-hard, we show that they are surprisingly ubiquitous. We prove that a uniformly random binary word $s$ of length $2n$ is a shuffle square with probability $\frac 12-o(n^{-1/15})$, verifying a conjecture of He, Huang, Nam, and Thaper. In particular, almost every binary word is at most two bit-deletions away from a shuffle square, giving the best possible average case for the “Longest Twin” problem.

 

By revealing the bits of $s$ sequentially,  we reformulate the problem as a discrete stochastic process. We track the evolution of a “buffer set”, a collection of suffixes produced by the revealed bits. In this setting, there is a simple greedy algorithm which behaves like a SSRW; we define a local optimization which creates a negative bias. We also show that the buffer set is robust enough to absorb small defects, yielding a perfect partition with high probability.