On Shannon functions for $AC^0$ circuits and $AC^0$ formulas
(A joint comment with Igor Sergeev)
Eric Allender informed us that the asymptotic equations for Shannon functions $\mathrm{L}_d(n)$ and $\mathrm{C}_d(n)$ of
bounded depth formulas and circuits, given on page 304 of the book and referred to Lupanov (1961,1977), are apparently not accurate: Uri Zwick in his lecture notes shows that Lupanon's method of synthesis yields depth-$3$
circuits of size about $\sqrt{2^n}$, if all gates of the form
$(x_1^{a_1}\land\cdots\land x_m^{a_m})^b$ with $a_i,b\in\{0,1\}$
can be used as gates.
We re-checked original papers and, indeed, what Lupanov actually proved is:
- Lupanov (1961): $\mathrm{L}(n)\sim \mathrm{L}_3(n)\sim 2^n/\log n$ for the
leafsize of $AC^0$ formulas (number of input literals), not for the number of gates.
- Lupanov (1977): $\mathrm{C}(n)\sim \mathrm{C}_3(n)\sim 2^n/n$ for fanin-$2$ circuits over
$\{\land,\lor,\neg\}$, where the depth is defined as $1$ plus the maximum number of alternations between AND and OR gates along input-output paths.
In the case when we count gates in $AC^0$ circuits, Vladimir Dančik [1] has proved
the following bounds for $\mathrm{C}_d(n)$:
\[
1.914\cdot 2^{n/2}-4n \leq \mathrm{C}(n)\leq \mathrm{C}_3(n)\leq c\cdot 2^{n/2}+n+1\,,
\]
where $c=2$ for even $n$, and $c=2.12$ for odd $n$. Actually, the lower bound was proved in [1] for a more general model, where negation gates can appear anywhere in the circuit (not only on the input variables).
Recall that inputs in a "standard" $AC^0$ circuit are literals (variables and their negations) and the gates unbounded fanin AND and OR functions. That is, negations are only on input variables.
The goal of this comment is to give simple proofs of these estimates
on $\mathrm{C}_d(n)$ for "standard" $AC^0$ circuits, and to show that
$\mathrm{L}_d(n)=\Theta(2^n/n)$ holds for the size (the number of gates, not the number of leaves) of $AC^0$ formulas of any depth $d=d(n)\geq 3$.
Theorem 1 (circuits):
For the size of $AC^0$ circuits, we have
\[
2\cdot 2^{n/2}\lesssim \mathrm{C}(n)\leq \mathrm{C}_3(n)
\leq c \cdot 2^{n/2}+1\,,
\]
where $c=2$ for even $n$, and $c=3/\sqrt{2}\approx 2.12$ for odd $n$.
Proof: Upper bound $\mathrm{C}_3 (n)\leq c \cdot 2^{n/2}$.
Split the variables into two blocks $X=\{x_1,\ldots,x_k\}$ and $Y=\{y_1,\ldots,y_m\}$ with $k+m=n$.
Associate with every vector $b\in\{0,1\}^m$ a monomial and a clause:
\[
X^b=x_1^{b_1}\land x_2^{b_2}\land\cdots\land x_m^{b_m}\
\mbox{ and }\
X_b=x_1^{1-b_1}\lor x_2^{1-b_2}\lor\cdots\lor x_m^{1-b_m}\,.
\]
Note that $X^b(c)=1$ iff $c= b$, and $X_b(c)=0$ iff $c= b$.
For every vector $a\in\{0,1\}^k$, the subfunction $f_a(Y):=f(a,Y)$ of our function $f(X,Y)$ can be written as a CNF
\[
f_a(Y)=\bigwedge_{b\colon f(a,b)=0} Y_b\,.
\]
So, the function $f$ itself can be computed using the formula
\[
f(X,Y)=\bigvee_{a\in\{0,1\}^k} X^a\land f_a(Y)=
\bigvee_{a\in\{0,1\}^k}\ X^a\land \bigwedge_{b\colon f(a,b)=0} Y_b\,.
\]
- At the first layer, use $2^m$ OR gates to compute all $2^m$ $Y$-clauses $Y_b$.
- At the second layer, for each vector
$a\in\{0,1\}^k$, take one AND gate whose inputs
are the $k$ literals of the monomial $X^a$ and all clauses $Y_b$
for vectors $b\in\{0,1\}^m$ such that $f(a,b)=0$.
At this gate, the CNF $X^a\land f_a(Y)$ is computed.
- Finally, use one OR gate (of fanin $2^k$) to compute the entire function
The resulting OR-AND-OR circuit has depth three, and uses
only $\ell=2^m+2^k+1$ gates. If $n=2t$ is even, then we take $k=m=t$
and obtain $\ell\leq 2\cdot 2^t+1=2^{n/2+1}+1$. If $n=2t+1$ is odd, then we take $k=t$, $m=t+1$,
and obtain $\ell\leq 3\cdot 2^{t}+1=2^{n/2-1/2+\log 3}+1\leq 2.12\cdot 2^{n/2}+1$, as desired.
Lower bound $\mathrm{C}(n)\gtrsim 2\cdot 2^{n/2}$.
Let $N(s)$ be the number of distinct $AC^0$ circuits with at most $s$ gates. Let $s=a+b$ where $a$ is the number of AND gates and $b$ is the number of OR gates.
We can w.l.o.g. assume that gates "alternate" in the sense that
no AND gate is entered from an AND gate, and no OR gate is entered from an OR gate. Indeed, if $(u,v)$ is an edge in a circuit, where both gates $u$ and $v$ are of the same type, then remove this edge, and add
and edge $(w,v)$ for each edge $(w,u)$ entering $u$.
- There are only two types of gates: AND and OR. This gives $2^s$ possibilities.
- There are at most $a\cdot b$ (ordered) pairs of gates of different types. Each such pair is either connected in a circuit via an edge, or remains disconnected. This gives $2^{ab}$ possibilities. By the arithmetic-geometric mean inequality, we have $ab\leq (a+b)^2/4=s^2/4$.
- For each of the $s$ gates, there are at most $3^{n}$ possibilities to choose the input literals entering this gate.
This gives $(3^n)^s=3^{ns}$ possibilities.
- So, $N(s)\leq 2^{s^2/4}\cdot 3^{ns}\cdot 2^s$.
- Since all boolean functions of $n$ variables
must be computed, the size $s$ must satisfy $N(s)\geq 2^{2^n}$, that is,
$s^2/4+ns\log 3+s\geq 2^n$ or equivalently
$s^2+4ns\log 3+4s\geq 2^{n+2}$. This gives
$s^2\geq 2^{n+2}-\Delta$ for $\Delta:=4s(1+n\log 3)$, and hence, also $s\geq 2^{n/2+1}-\sqrt{\Delta}$.
- Set $\alpha:=\sqrt{4(1+n\log 3)}/2^{n/4+1/2}=2^{-\Omega(n)}$.
- If $\sqrt{\Delta}\geq \alpha 2^{n/2+1}$, then
$4s(1+n\log 3)\geq \alpha^2 2^{n+2}$, implying that then
$s\geq 2^{n/2+1}$.
- If $\sqrt{\Delta}\leq \alpha 2^{n/2+1}$, then $s\geq (1-\alpha)2^{n/2+1}$ follows from the previous inequality
$s\geq 2^{n/2+1}-\sqrt{\Delta}$.
$\Box$
We now consider $AC^0$ formulas. Recall that then
we have the asymptotic $\mathrm{L}(n)\sim \mathrm{L}_3(n)\sim 2^n/\log n$, as long as we count the number of input literals in a formula.
The following theorem shows that, if we count only gates, then
the behavior of the Shannon function is different.
Theorem 2 (formulas): For the size of $AC^0$ formulas, we have
\[
0.38\cdot 2^n/n \lesssim \mathrm{L}(n)\leq \mathrm{L}_4(n)\lesssim 2^n/n\,.
\]
Proof:
Upper bound $\mathrm{L}_4(n)\lesssim 2^n/n$.
We will use one result of Lupanov from 1956. A rectangle in a boolean matrix $A$ is an all-$1$ submatrix. If this is an $a\times b$ submatrix, then the weight of the rectangle is $a+b$. A decomposition of $A$ is a collection of mutually disjoint rectangles in $A$ covering all $1$-entries of $A$.
The weight of such a decomposition is the sum of weights of
its rectangles.
Lemma (Lupanov [2]): Every boolean $n\times m$ matrix $A$
with $\log n\ll m\leq n$ has a decomposition of weight at most
$(1+o(1))nm/\log n$. The number of rectangles in this decomposition is $O(nm/\log^3 n)$.
See here for a proof. $\Box$
Given a boolean function $f$ of $n$ variables, we now construct
a depth-$4$ $AC^0$ formula for $f$ of size about $2^n/n$ as follows.
- Split the variables into two blocks $X=\{x_1,\ldots,x_p\}$ and $Y=\{y_1,\ldots,y_q\}$ for $p+q=n$ and $\log p\ll q\leq p$.
- Represent our function $f(X,Y)$ as a boolean $2^{p}\times 2^{q}$ matrix whose entries are the values of $f$.
- Lupanov's lemma gives us a decomposition of the matrix into
rectangles $A_1\times B_1,\ldots,A_t\times B_t$ such that
\[
N:=\sum_{i=1}^t (|A_i|+|B_i|) \lesssim \frac{2^{p}\cdot 2^{q}}{|X|}=\frac{2^n}{p}
\]
and $t=O(2^n/p^3)=o(2^n/p)$.
- Let $r_i(X)$ be the OR of the monomials $X^a$ over all rows $a\in A_i$, and $c_i(Y)$ be the OR of the monomials $Y^b$ over all columns $b\in B_i$. We can represent $f$ as
\[
f = \bigvee_{i=1}^t r_i(X) \land c_i(Y)
=\bigvee_{i=1}^t\left(\bigvee_{a\in A_i} X^a \right) \land
\left(\bigvee_{b\in B_i} Y^b \right)\,.
\]
- On the first layer of the formula, we compute all monomials
participating in some $r_i(X)$ or some $c_i(Y)$. There are at most
$N$ such monomials.
- On the second layer, we compute all disjunctions of these monomials corresponding to the rows and columns of the rectangles. For this, $2t$ OR gates suffice.
- On the third layer, we take all $t$ ANDs of these disjunctions, and finally compute the or of these ANDs.
- Note that all $r_i(X)$ and $c_i(Y)$ can be simultaneously computed by an $AC^0$ formula of depth $2$ using at most $N$ gates.
- We thus have an $AC^0$ formula of depth $4$ with at most $N+3t+1 \leq 2^n/p+o(2^n/p)+1$ gates. Taking $\log n \ll q = n^{o(1)}$, we have $p=n-q\geq n-n^{o(1)}$, and the desired upper bound $l\leq (1+o(1))2^n/n$ follows.
$\Box$
Lower bound $\mathrm{L}(n)\gtrsim 0.38\cdot 2^n/n$.
Let $N(s)$ be the number of distinct $AC^0$ formulas with at most $s$ gates.
- Every gate of a formula has only one outgoing edge.
- So, for every gate, there are at most $s$ possibilities to choose where this (leaving) edge should go. This gives at most $s^s$ possibilities.
- For every gate, there are only two possibilities to choose its type (AND or OR gate), and at most $3^n$ possibilities to choose what input literals should enter this gate.
- So, $N(s)\leq (2s3^n)^s$. Since $FNs)$ must be at least $2^{2^n}$, the size $s$ must satisfy the inequality
$s\log s + sn\log 3+ s\geq 2^n$.
- Since we can assume that $s\lesssim 2^n/n$, this implies $sn(1+\log 3)\geq s(n-\log n+ n\log 3+1)\gtrsim 2^n$, and the desired lower bound $s\gtrsim 2^n/n\log 6\geq 0.38\cdot 2^n/n$ follows.
$\Box$
The proofs of upper bounds $\mathrm{C}_3(n)
\leq 2.12 \cdot 2^{n/2}+1$ and $\mathrm{L}_4(n)\lesssim 2^n/n$
in Theorems 1 and 2 are direct and elementary: no "tricks" are used in the constructions. By using some of the tricks (introduced by Lupanov and by Nechiporuk), one can prove $\mathrm{C}_4(n)
\lesssim 2 \cdot 2^{n/2}$ for all $n$ (even and odd) and $\mathrm{L}_3(n)\lesssim 2^n/n$. This
yields asymptotic equations $\mathrm{C}(n)\sim \mathrm{C}_4(n)
\sim 2 \cdot 2^{n/2}$ for $AC^0$ circuits.
To finish the picture, we have to prove an upper bound $\mathrm{C}_3(n)\lesssim 2 \cdot 2^{n/2}$ (for odd values of $n$), and to prove the lower bound $\mathrm{L}(n)\gtrsim 2^n/n$ for $AC^0$ formulas.
Then we would have the asymptotic equalities $\mathrm{C}(n)\sim \mathrm{C}_3(n)\sim 2\cdot 2^{n/2}$ and $\mathrm{L}(n)\sim \mathrm{L}_3(n)\sim 2^n/n$ for the size of $AC^0$ circuits and formulas.
Reference:
- V. Dančik, Complexity of Boolean functions over bases of unbounded fan-in gates, Information Processing Letters 57 (1996) 31-34. A local copy.
- O.B. Lupanov, On rectifier and switching-and-rectifier schemes,
Doklady Akad. Nauk SSSR, 111 (1956), 1171-1174 (in Russian).
English translation (by Igor Sergeev) is here.
S. Jukna and I. Sergeev, 18.07.2017