The first lower bound for monotone depth-3 $AC^0$ circuits

As we know, first super-polynomial lower bounds on constant-depth circuits (monotone and non-monotone) where proved almost simultaneously by several authors in 1983-1984; see Chapters 11 and 12 in the book and references herein. In fact, such a bound for monotone depth-3 circuits was proved 3 years earlier by Russian mathematician G.A. Tkachov [1], a student of O.B. Lupanov. His paper was published 1980 in local proceedings of Gorkii university, and was not available in the West (and even hardly available within the former USSR). The goal of this (more historical) comment is to sketch what he proved, and how.

The function he considered is an iterated AND-OR function $f_n(X)$ of $n=4^h$ variables, and is defined inductively as follows. For $n=4$, \[ f_4(x_1,x_2,x_3,x_4)=x_1x_2\lor x_3x_4. \]

Tkachov's function
For general $n=4^h$, \[ f_{4n}(X)=f_1(f_n(X_1),f_n(X_2),f_n(X_3),f_n(X_4)) =f_n(X_1)\land f_n(X_2)\lor f_n(X_3)\land f_n(X_4) \] where $X_1, X_2,X_3,X_4$ is a partition of the set $X$ of $|X|=4n$ variables into four consequent intervals, each of length $n$.

Theorem (Tkachov 1980): Every monotone $\Sigma_3$-circuit for $f_n$ has $2^{\Omega( n^{1/4})}$ gates.
We will sketch his argument when translated into "modern" terms, used in the book. Roughly speaking, his argument is as follows: To achieve this last goal, Tkachov uses a very special property of maxterms of his function $f_n$. By its definition, $f_n$ can be computed by a monotone formula which is read-once: no variable appears twice as a leaf. Thus, $f_n$ can be computed by a monotone $\pi$-scheme in which no two contacts are labeled by the same variable. That is, we have a 1-to-1 correspondence between variables and contacts.

On the other hand, every $\pi$-scheme (i.e. DeMorgan formula) has the following easy verifiable properties, not shared by arbitrary contact schemes. A chain in a scheme is a path from its source to a sink. A cut of a set of chains is a set of contacts intersecting each of these paths; such a cut is minimal (is a min-cut) if no its proper subset does the job. A global cut is a min-cut of all paths in the scheme. Then we have:

  1. Every global cut intersects every chain in exactly one contact.
  2. For every set $P$ of chains, every min-cut of $P$ lies within at least one global cut.
  3. No two min-cuts of $P$ can lie in the same global cut.
Properties (i) and (ii) can be easily proved by induction on the number of contacts. If our scheme $S$ is an AND of two sub-schemes, then every min-cut (global or not) lies entirely in exactly one of these (strictly smaller) sub-schemes. If $S$ is an OR of two sub-schemes, then the corresponding parts of every min-cut are min-cuts for the coresponding (disjoint) subsets of chains in strictly smaller sub-schems. In both cases we can apply the induction hypothesis. Property (iii) follows from (i): contained some global cut two distinct min-cuts of the same set $P$ of chains, then this global cut would contain at least two distinct contacts of some chain in $P$, a cotradiction with (ii). $\Box$

Now suppose we have a monotone $\Sigma_3$-circuit computing $f_n$. This circuit is just an OR of monotone CNFs. Every minterm of $f_n$ must be produced by at least one of these CNFs. So, take one of these CNFs $F$. Minterms of $f_n$ correspond to chains, and maxterms of $f_n$ to global cuts in the $\pi$-scheme for $f_n$. Let $P$ be the set of minterms of $f_n$ produced by $F$, and $P^{\perp}$ denote the set of all min-cuts of $P$. Let $|F|$ be the number of clauses in $F$. Lemma 2 in his paper states the following.

Lemma A: $|F|\geq |P^{\perp}|$.
Proof: It is enough to show that every min-cut $Y$ of $P$ lies in at least one clause of $F$, and that no two min-cuts can lie in the same clause. By (ii), we know that $Y\subseteq X$ for some maxterm $X$ of $f_n$. On the other hand, there must be at least one clause $C$ of $F$ such that $C\subseteq X$. To see this, take an assignment $a\in \{0,1\}^N$ defined by: $a_i=0$ if $i\in X$, and $a_i=1$ otherwise. Then $F(a)\leq f_n(a)=0$ since $X$ is a maxterm of $f_n$. Had we however $C\not\subseteq X$ for all clauses $C$ of $F$, then $a$ would set to $1$ at least one variable in each clause, and we would get $F(a)=1$, a contradiction. Thus, both $Y$ and $C$ are contained in $X$. The clause $C$ is a cut of $P$. Hence, it contains some min-cut $Y'\subseteq C\subseteq X$ of $P$. Held now $Y\not\subseteq C$, then the maxterm $X$ would contain two two distinct min-cuts $Y$ and $Y'$ of the same set of minterms $P$, contradicting Property (i). We have thus shown that every min-cut $Y$ of $P$ must be contained in its own clause, as claimed. $\Box$

The main technical contribution of the paper is the following lemma, which is the Main Lemma in his paper. He gives its proof in the appendix. The proof itself uses a clever counting, but is too long to at least sketch it here.

Lemma B: For every subset $P$ of minterms of $f_n$, \[ |P^{\perp}| \gtrsim |P|\cdot 2^{n^{1/4}-\sqrt{n}}. \]
Having this lemma, we can finish the proof of the theorem as follows. For the number $\mu_n$ of minterms of $f_n$, we have the following recursion (recall that $n$ is of the form $n=4^h$): $\mu_4=2$, and $\mu_{4n}=2\mu_n^2$. This yields $$\displaystyle \mu_n\geq 2^{\sum_{i=0}^{h-1}2^i}=2^{2^h-1}=2^{\sqrt{n}-1}. $$ Let $s$ be the top fanin of our circuit, i.e. the number of CNFs in it. Then there exists a CNF $F$, the family $P$ of minterms of $f_n$ produced by which is large enough: \[ |P|\geq \frac{1}{s}\cdot \mu_n\gtrsim \frac{1}{s}\cdot 2^{\sqrt{n}}. \] Lemmas A and B imply \[ |F|\geq |P^{\perp}|\gtrsim |P|\cdot 2^{n^{1/4}-\sqrt{n}} \gtrsim \frac{1}{s}\cdot 2^{\sqrt{n}}\cdot 2^{n^{1/4}-\sqrt{n}} = \frac{1}{s}\cdot 2^{n^{1/4}} \] Thus, $\max\{s, |F|\}=2^{\Omega( n^{1/4})}$. $\Box$

Note (observed by Igor Sergeev): Recall that the function $f_n$ is defined as a read-once formula having the form of alternating AND and OR gates (see how it looks). In 1984, Valiant has shown that the majority function $Maj_m$ of $n$ variables can be computed by an alternating (not read-once) AND/OR tree of depth $5.3\log m$. Thus, $Maj_m$ is a "subfunction" of $f_n$ for $n$ about $m^{5.3}$ which can be obtained from $f_n$ by setting some variables to constants, and identifying some of the remaining variables.

Acknowledgment:
My thanks to Igor Sergeev for scanning and repairing this (old, hardly available, and "scan-resistant") paper, and for very interesting discussions on the paper itself.

Reference:

  1. G.A. Tkachov (1980): On complexity of realization of a sequence of boolean functions by schemas of functional elements and $\pi$-schemes under additional restrictions on the structure of the schemas, in: Kombinatorno-Algebraicheskie Metody v Prikladnoj Matematike (Combinatorial-Algebraic Methods in Applied Mathematics), Gorkij, 161--207. For those who can read Russian, here is a scanned .djvu copy of his paper (23MB, be patient).

April 2013