Sensitivity conjecture (Problem 14.16) solved

Let $Q^n$ be the $n$-dimensional hypercube graph, whose vertex set consists of vectors in $\{0, 1\}^n$, and two vectors are adjacent if they differ in exactly one coordinate.

In 1988, Chung, Füredi, Graham, and Seymour [1] proved that if $H$ is an induced subgraph on more than $2^{n-1}$ vertices of $Q^n$, then the maximum degree of $H$ is at least $(1/2-o(1)) \log_2 n$. Moreover, they constructed a $(2^{n-1}+1)$-vertex induced subgraph whose maximum degree is $\lceil \sqrt{n}\, \rceil$.

In 2019, Huang [2] proved the optimality of their construction. Note that the $2^{n-1}$ even vertices of $Q^n$ induces an empty subgraph. The following theorem shows that any subgraph with just one more vertex would have its maximum degree suddenly jump to $\sqrt{n}$.

Theorem 1 (Huang [2]): Any set $S$ of $|S|=2^{n-1}+1$ vertices of $Q^n$ contains a vertex with at least $\sqrt{n}$ neighbors in $S$.

By the Gostman-Linial result (Thm. 14.18⁽¹⁾ applied with $h(n)=\sqrt{n}$), this gives the inequality \[ s(f)\geq \sqrt{\deg{f}} \] for every Boolean function $f$, where $s(f)$ is the sensitivity of $f$, and $\mbox{deg}(f)$ is the degree of $f$ [the minimum degree of a multilinear real polynomial $p$ that takes the same values as $f$ on all binary vectors]. Nisan and Szegedy [4] showed that $bs(f) \le 2 \deg{f}^2$, where $bs(f)$ is the block sensitivity of $f$. This bound was later improved by Tal [5] to $bs(f) \le \deg{f}^2$. We thus have the following relations \[ \sqrt{\deg{f}}\leq s(f)\leq bs(f)\leq \deg{f}^2\,. \] In particular, for every boolean function $f$, \[ s(f)\leq bs(f) \le s(f)^4\,. \]

The proof in [2] is amazingly simple! The crucial idea is, instead of adjacency $\{0,1\}$-matrix of the graph $Q^n$ to consider a sign $\{-1,0,1\}$-version $A_n$ of this matrix defined inductively as follows: \[ A_1=\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix},~~A_{n}=\begin{bmatrix} A_{n-1} & I_{n-1} \\ I_{n-1} & -A_{n-1} \end{bmatrix}, \] where $I_{n-1}$ is the $2^{n-1}\times 2^{n-1}$ identity matrix. By the iterative construction of $A_n$, it is not hard to see that when changing every $(-1)$-entry of $A_n$ to $1$, we get exactly the adjacency matrix of $Q^n$. This is because the graphs $Q^n$ can be inductively defined by taking two copies of the graph $Q_{n-1}$ (whose vertices are vectors $u\in\{0,1\}^{n-1}$), attach $0$ (resp., $1$) to each vertex in the first (resp., second) copy, and add a matching between these copies matching vertices $(u,0)$ of the first copy with vertices $(u,1)$ of the second copy; then the submatrix $I_{n-1}$ of $A_n$ is the adjacency matrix of this matching.

By easy induction, it can be shown that \begin{equation} A_n^2=n I_{n}\,. \end{equation} Indeed, for $n=1$, we have $A_1^2=I_1$. Suppose Eq. (1) holds for $n-1$, that is, $A_{n-1}^2=(n-1)I_{n-1}$, then \[ A_n^2=\begin{bmatrix} A_{n-1}^2+I_{n-1} & 0 \\ 0 & A_{n-1}^2+I_{n-1}\end{bmatrix} =\begin{bmatrix} nI_{n-1} & 0 \\ 0 & nI_{n-1}\end{bmatrix} =nI_{n}\,. \]

The original proof of Huang [2] is based on the Cauchy’s Interlace Theorem⁽²⁾ relating the eigenvalues of a symmetric matrix with the eigenvalues of its principal submatrices, and then uses the fact that the maximum degree of a graph $H$ is at least the largest eigenvalue of any its $\{-1,0,1\}$ adjacency matrix (which has $0$'s at the entries corresponding to the non-edge of $H$, and has any of the entries $-1,0,1$ in the remaining positions).

A direct proof given shortly by Knuth [3] (which, as the author mentions, essentially fleshes out the idea of Shalev Ben-David) is even simpler, and avoids the use of Cauchy’s Interlace Theorem. Consider the $2^n\times 2^{n-1}$ matrix \[ B_n=\begin{bmatrix} A_{n-1}+\sqrt{n}I_{n-1} \\ I_{n-1} \end{bmatrix}\,. \] The matrix $B_n$ has rank $2^{n-1}$, and we have \begin{align*} A_nB_n&=\begin{bmatrix} A_{n-1}(A_{n-1}+\sqrt{n}I_{n-1}) +I_{n-1}^2 \\ I_{n-1}(A_{n-1}+\sqrt{n}I_{n-1}) -A_{n-1}I_{n-1} \end{bmatrix} =\begin{bmatrix} A_{n-1}^2+\sqrt{n}A_{n-1}+I_{n-1} \\ \sqrt{n} I_{n-1} \end{bmatrix}\\[1ex] &\overset{\mathrm{by}~(1)}{=} \begin{bmatrix}\sqrt{n}A_{n-1}+nI_{n-1} \\ \sqrt{n} I_{n-1} \end{bmatrix} =\sqrt{n} B_n\,. \end{align*} Let $B^*$ be the $(2^{n-1}-1)\times 2^{n-1}$ matrix consisting of the $2^{n-1}-1$ rows of $A_n$ that do not belong to the set $S$. Since the homogeneous system $B^*x=0$ of linear equations has more variables than equations, it has a solution $x$. The $2^n\times 1$ vector $y=B_nx$ has zeros outside $S$, and $A_ny=A_nB_ny=\sqrt{n}y$. Take a position $u$ of vector $y$ at which the absolute value $|y_u|$ is maximal. Then \[ |\sqrt{n}y_u|=|(A_ny)_u|=\left|\sum_{v=1}^{2^n} A_{u,v}y_v\right| =\sum_{v\in S} |A_{u,v}\cdot y_u| \leq\sum_{v\in S} |A_{u,v}|\cdot|y_u| =|y_u|\sum_{v\in S} [\mbox{$v$ is adjacent to $u$}]\,. \] In other words, $u$ has at least $\sqrt{n}$ neighbors in $S$. $\Box$

Remark: Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{2^n}$ be the eigenvalues of the matrix $A_n$. Equation $A_n^2=n I_{n}$ implies that each $\lambda_i$ is either $\sqrt{n}$ or $-\sqrt{n}$. Since the trace (sum of all diagonal entries) of $A_n$ is zero, and since $\lambda_1+\cdots+\lambda_{2^n}=\mathrm{Trace}(A_n)=0$, the matrix $A_n$ has only two eigenvalues $\sqrt{n}$ and $-\sqrt{n}$, each with multiplicity $2^{n-1}$. Equality $A_nB_n=\sqrt{n}B_n$ implies that columns of the matrix $B_n$ are eigenvectors for the positive eigenvalues $\sqrt{n}$ of $A_n$. Then the fact that $|\{0,1\}^n\setminus S|$ is smaller than the number of columns of $B_n$ is used to find a linear combination $y=B_nx$ of these "positive" eigenvectors to get an eigenvector $y=A_nx$ such that $y_u=0$ for all positions $u\not\in S$.

Footnotes:

^{(1)
Let $\Delta(n)$ denote the maximal number $D$ such that, for every subset
$S\subseteq \{0,1\}^n$ of size $|S|\neq 2^{n-1}$,
the maximum degree of $Q^n[S]$ or $Q^n[\bar{S}]$ is at
least $D$.}

Theorem [Gostman and Linial 1992] The following are equivalent for any monotone function $h: \mathbb{N}\to\mathbb{R}$: $\Delta(n)\geq h(n)$ if and only if $s(f)\geq h(\deg{f})$ holds for any boolean function $f$.

Jump back ☝

^{(2)
Given a $n \times n$ matrix $A$, a principal submatrix of $A$ is obtained by deleting the same set of rows and columns from $A$.}

Theorem (Cauchy's Interlace Theorem): Let $A$ be a symmetric $n \times n$ matrix, and $B$ be a $m \times m$ principal submatrix of $A$, for some $m\lt n$. If the eigenvalues of $A$ are $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, and the eigenvalues of $B$ are $\mu_1 \ge \mu_2 \ge \cdots \ge \mu_m$, then for all $1 \le i \le m$, \[ \lambda_i \ge \mu_i \ge \lambda_{i+n-m}\,. \]

Jump back ☝

References:

[1] F. Chung, Z. Füredi, R. Graham, P. Seymour, On induced subgraphs of the cube, J. Comb. Theory, Ser. A, 49 (1) (1988), 180-187. Local copy
[2] H. Huang, Induced subgraphs of hypercubes and a proof of the Sensitivity Conjecture, Annals of Mathematics 190 (2019), 949-955. Preprint in arxiv.
[3] D. Knuth, A computational proof of Huang's degree theorem, local copy
[4] N. Nisan, M. Szegedy, On the degree of Boolean functions as real polynomials, Comput. Complexity, 4 (1992), 462-467. local copy
[5] A. Tal, Properties and applications of boolean function composition, in: Proc. of the 4th conf. on Innovations in Theoretical Computer Science, ITCS '13, (2013) 441-454. local copy

⇦ Back to the comments page