Since my original, direct and fairly simple combinatorial argument in [6] (via so-called finite limits) does not had this "absence of the cross-intersection property" issue, I present it below. We have already seen the concept of finite limits in the context of depth-3 circuits (Sect. 11.3). Here we show how this concept - told me by Mike Sipser many years ago - can be used for monotone (and even real) circuits. We consider circuits using any monotone $d$-ary real functions $\varphi:\RR^d\to\RR$ as gates; such a function is monotone if $x_1\leq y_1,\ldots,x_d\leq y_d$ implies $\phi(x_1,\ldots,x_d)\leq \phi(y_1,\ldots,y_d)$. Let us call such circuits $d$-local monotone real circuits. It is clear that this class includes all monotone boolean circuits that are allowed to use arbitrary monotone $d$-ary boolean functions $\phi:\{0,1\}^d\to\{0,1\}$ as gates.
Using counting arguments, Rosenbloom [7] has show that already $2$-local monotone real circuits can be exponentially more powerful than even non-monotone boolean $\{\lor,\land,\neg\}$ circuits!
On the other hand, we have shown in [6] that that (similarly as in the case of monotone boolean $\{\lor,\land\}$ circuits) functions computable by "small" monotone real $d$-local circuits are "simple" in the following sense.
Definition 3: Let $f$ be a monotone boolean function of $n$ variables. We say that $f$ is $(d,t)$-simple if for every pair of integer parameters $2\leq r,s\leq n$ there exists an exact $s$-CNF $C(x)$ with $|C|\leq t\cdot (dr-1)^{s}$ clauses, an exact $r$-DNF $D(x)$ with $|D|\leq t\cdot (ds-1)^{r}$ monomials, and a clause $c(x)=\bigvee_{i\in S}x_i$ with $|S|\leq s-1$ variables such that at least one of the following holds.For a monotone boolean function $f$, let $\real{f}{d}$ denote the minimum size of a monotone read $d$-local circuit computing $f$.
- (a) $C(u)=0$ for all vectors $u\in f^{-1}(0)$.
- (b) $D(v)\lor c(v)=1$ for all vectors $v\in f^{-1}(1)$.
Theorem 1 (The Criterion): Let $f$ be a monotone boolean function. If $\real{f}{d}\leq t$, then $f$ is $(d,t)$-simple.Thus, as we have done it in the case of boolean circuits (Theorem 9.17), this reduces the lower bounds problem for real circuits to an instance of SET COVER problem: show that for some choice of parameters $s$ and $r$, either no exact $s$-CNF can reject ``too many'' negative inputs, or no exact $r$-DNF can accept ``too many'' positive inputs.
Now fix a monotone boolean function $f:\{0,1\}^n\to\{0,1\}$. This function splits the binary cube into two parts $f^{-1}(0)$ and $f^{-1}(1)$. We call vectors in $f^{-1}(0)$ $0$-inputs, and those in $f^{-1}(1)$ $1$-inputs. The support of a vector $u\in \{0,1\}^n$ is the set $I(u):=\{i \colon u_i=f(u)\}$ of positions holding the value $f(u)$. Thus,
We will be interested in the case when vectors $u$ and sets $A$ come from different parts $f^{-1}(0)$ and $f^{-1}(1)$. Namely, if $u\in f^{-1}(\epsilon)$ and $A\subseteq f^{-1}(\epsilon\oplus 1)$, then we say that a witness $S\subseteq[n]$ of the vector $u$ against the set $A$ is a legal witness if $S\subseteq I(u)$. Thus, $S$ is a legal witness of vector $u$ against the set $A$ iff $S\cap I(v)\neq \emptyset$ for all $v\in A$. We will be interested in the minimal possible size $|S|$ of subsets $S\subseteq I(u)$ with this property.
Definition 1: A vector $u\in f^{-1}(\epsilon)$ is a $k$-limit for a set $A\subseteq f^{-1}(\epsilon\oplus 1)$ if for every set $S\subseteq I(u)$ of $|S|\lt k$ positions there is a vector $v\in A$ such that $u_i=v_i$ for all $i\in S$.In other words, a vector $u$ is a $k$-limit for $A$ if $|S|\geq k$ holds for every legal witness $S$ of $u$ against $A$. That is, if $u\in f^{-1}(1)$ and $A\subseteq f^{-1}(0)$, then for any set of fewer than $k$ $1$-positions of vector $u$ at least one vector of $A$ also has $1$s in all these positions. This means that the fact that $u\not\in A$ cannot be detected by looking at fewer than $k$ $1$-positions of vector $u$; in the case when $u\in f^{-1}(0)$ and $A\subseteq f^{-1}(1)$, the same holds for sets of $0$-positions of vector $u$. The empty set $A=\emptyset$ has no limit vectors.
Property 1 If a vector $u$ is a $k$-limit for a set $A$ and if $A\subseteq B$, then $u$ is a $k$-limit for the set $B$. If a vector $u$ is a $(dk)$-limit for a union of $d$ sets, then then $u$ is a $k$-limit for at least one of these sets.Proof: If a vector $u$ has legal witnesses $S_1,\ldots,S_d$ for the sets $A_1,\ldots,A_d$, then the set $S=S_1\cup\cdots\cup S_d$ is a legal witness of vector $u$ against the entire set $A=A_1\cup\cdots\cup A_d$. So, if $|S_i|\lt k$ for all $i$ (i.e., if vector $u$ is a $k$-limit for none of the sets $A_i$), then the set $S$ of size $|S|\lt dk$ is a legal witness of vector $u$ agains the entire set $A$. $\Box$
Definition 2: A set $A\subseteq f^{-1}(\epsilon)$ is $(a,b)$-closed if there exists a set $B\subseteq f^{-1}(\epsilon\oplus 1)$ such that:Remark 1: In terms of CNFs/DNFs, $(a,b)$-closure of a set $A\subseteq f^{-1}(1)$ means that there is a nonempty $(b-1)$-CNF $C$ such that $C(u)=1$ for every vector $u\in A$ but $C(w)=0$ for every vector $w\leq u$ with fewer than $a$ ones; to see this, just take $B:=C^{-1}(0)$. If $A\subseteq f^{-1}(0)$, then there is a nonempty $(b-1)$-DNF $D$ such that $D(u)=0$ for every vector $u\in A$ but $D(w)=1$ for every vector $w\geq u$ with fewer than $a$ zeros; to see this, just take $B:=D^{-1}(1)$. $\Box$
- (i) every vector of $A$ is an $a$-limit for the set $B$,
- (ii) no vector of $B$ is a $b$-limit for the set $A$.
Main property of closed sets is given by the following
Closure Lemma: If $A\subseteq f^{-1}(\epsilon)$ is $(a,b)$-closed, then there exist a family $\f\subseteq 2^{[n]}$ of $|\f|\leq (b-1)^{a}$ $a$-element sets such that the support $I(u)$ of each vector $u\in A$ contains at least one of the sets $F\in\f$.That is, for every vector $u\in A$ there is a set $F\in\f$ such that $u_i=\epsilon$ holds for all positions $i\in F$.
To prove the Closure Lemma, we will use the following property of transversal of families of sets. A transversal of a family ${\cal A}\subseteq 2^X$ of sets is a set $T\subseteq X$ intersecting all members of the family. Such a transversal is $r$-critical if no its subset $S\subseteq T$ of size $|S| \lt r$ is a transversal of ${\cal A}$.
Transversal Lemma ([6]): Let ${\cal A}$ be a nonempty finite family of sets, each with $\leq s$ elements. Then, for every integer $r\geq 1$ there is a family $\f$ of $|\f|\leq s^r$ $r$-element sets such that every $r$-critical transversal of ${\cal A}$ contains at least one set of $\f$.Proof: See here. $\Box$
Proof of Closure Lemma: By property (2) of the $(a,b)$-closure, for every vector $v\in B$ there is a subset $S_{v}\subseteq I(v)$ of $|S_{v}|\leq b-1$ positions such that the vector $v$ differs from every vector $u\in A$ in at least one position $i\in S_{v}$. This, in particular, means that the support $I(u)$ of every vector $u\in A$ intersects all sets of the family $\A:=\{S_{v}\colon v\in B\}$, that is, each the sets $I(u)$ with $u\in A$ is a a transversal of the family $\A$. Thus, by the Transversal Lemma applied with parameters $s:=b-1$ and $r:=a$, it is enough to show that for each vector $u\in A$ the set $I(u)$ is, in fact, an $a$-critical transversal for $\A$.
To show this, suppose that some set $I(u)$ is not an $a$-critical transversal for $\A$. Then some subset $T\subseteq I(u)$ of size $|T| \lt a-1$ also intersect all sets of $\A$. This means that the set $T$ is a legal witness (of size $\lt a-1$) of vector $u$ against the set $B$, a contradiction with the fact given by Property (1) that vector $u$ is an $a$-limit against the set $B$. $\Box$
The Closure Lemma shows that every $(a,b)$-closed set $A$ has the following property similar to that given by the Switching Lemma (Lemma 9.15 in the book):
Fix any two integer parameters $2\leq r,s\leq n$. Let $U^0:=f^{-1}(0)$ and $U^1:= f^{-1}(1)$. To capture the progress made by one gate, we associate with the $i$th gate the bipartite graph \[ G_i\df \left\{(u,v)\in U^0\times U^1\colon f_i(u) \lt f_i(v)\right\} \] whose edges are pairs of negative/positive inputs separated ``correctly'' at that gate. Since the entire circuit correctly computes the target boolean function $f$, the last graph is the bipartite complete graph $G_t=U^0\times U^1$. Let $U^0_i\subseteq U^0$ and $U^1_i\subseteq U^1$ denote the sets of vertices of the $i$th graph $G_i$ (on the corresponding side of the bipartition) that are non-isolated in the graph $G_i$. That is, $U^{\epsilon}_i$ is the set of input vectors correctly separated by the $i$th gate $f_i$ from at least one vector on the other side of the bipartition. The monotonicity of gates gives us the following property of graphs $G_i$: \begin{equation}\label{eq:monotonicity} \mbox{If $f_i=\phi(f_{i_1},\ldots,f_{i_d})$, then $G_i\subseteq G_{i_1}\cup\ldots\cup G_{i_d}$.} \end{equation} Indeed, observe that if some edge $(u,v)\in U^0\times U^1$ appears in none of the graphs $G_{i_1},\ldots,G_{i_d}$, then $f_{i_j}(u)\geq f_{i_j}(v)$ holds for all $j=1,\ldots,d$. Since the function $\phi:\RR^d\to\RR$ is nondecreasing, this implies that $f_i(u)\geq f_i(v)$, i.e. that $(u,v)\not\in G_i$, as desired.
We define what input vectors that are hard for a given gate $f_i$ by exploring the gates $f_1,\ldots,f_t$ one-by-one. Initially no input vector is hard. Suppose we already know what input vectors are hard for the first $i-1$ gates $f_1,\ldots,f_{i-1}$, i.e. that for every $j=1,\ldots,i-1$ and $\epsilon=0,1$, we already know the set $H_j^{\epsilon}$ of input vectors that were hard for the gate $f_j$. Let \[ E^{\epsilon}_i\df U^{\epsilon}_i\setminus\left(H_{1}^{\epsilon}\cup\ldots \cup H_{i-1}^{\epsilon}\right)\subseteq U^{\epsilon} \] denote the set of all vectors that were easy for all previous gates $f_1,\ldots,f_{i-1}$, that is, were hard for none of these gates. We define the sets of input vectors that are hard for the $i$th gate by: \begin{align*} H_i^0&\df \{u\in U^0_i\colon \mbox{$u$ is an $s$-limit for the set $E^1_i\subseteq U^1$}\}\,;\\ H_i^1&\df \{v\in U^1_i\colon \mbox{$v$ is an $r$-limit for the set $E^0_i\subseteq U^0$}\}\,. \end{align*} That is, a vector $u\in U^0_i$ is hard for the $i$th gate $f_i$ if it was hard for none of the previous gates, and the fact "$u\not\in E^0_i$" cannot be detected by looking at fewer than $s$ $0$-positions of $u$; for vectors from the other part $U^1$, we look at their $1$-positions. In particular, $H^0_i=\emptyset$ if $E^1_i=\emptyset$, and $H^1_i=\emptyset$ if $E^0_i=\emptyset$.
Claim 1: For every $i=1,\ldots,t$ the set $H_i^0$ is either empty or is $(s,dr)$-closed, and the set $H_i^1$ is either empty or is $(r,ds)$-closed.Thus, by the Closure Lemma, $H_i^0\subseteq C^{-1}(0)$ holds for some exact $s$-CNF $C(x)=\bigwedge_{F\in\f}\bigvee_{i\in F}x_i$ consisting of $|\f|\leq (dr-1)^s$ clauses, and $H_i^1\subseteq D^{-1}(1)$ holds for some exact $r$-DNF $D(x)=\bigvee_{F\in\f}\bigwedge_{i\in F}x_i$ consisting of $|\f|\leq (ds-1)^r$ monomials.
Proof of Claim 1: We will prove the claim only for sets $H_i^0$ of hard inputs on the left side of the bipartition; for sets $H_i^1$ the argument is the same (with parameters $s$ and $r$ interchanged). If the $i$-th gate $f_i$ is one of the variables $x_l$ $(1\leq l\leq n)$, then $H_i^0=\emptyset$ (initially, there are no hard input vectors). So, assume that $f_i=\phi(f_{i_1},\ldots,f_{i_d})$ is the $i$-th gate. We are going to show that both the items (i) and (ii) in Definition 2 (of closed sets) hold with $a \df s$, $b \df dr$, $A\df H_i^0$ and $B\df E^1_i$ for $j=1,\ldots,m$.
Item (i) holds by the definition of $H_i^0$: every vector $u\in H_i^0$ is an $s$-limit for the set $E^1_i$. To verify the second item (ii), suppose the opposite that some input $v\in E^1_i$ is a $(dr)$-limit for the set $H^0_i$. Since $H^0_i\subseteq E^0_i$, the vector $v$ is then also a $(dr)$-limit for the set $E^0_i$. Since $v\in E^1_i$, we know that vector $v$ was hard for none of the previous gates $f_1,\ldots,f_{i-1}$, that is, \begin{equation}\label{eq:not-hard} v\not\in H_1^1\cup\cdots\cup H_{i-1}^1 \end{equation} By the property \eqref{eq:monotonicity} of the graphs $G_i$ associated with the gates $f_i$, we have the inclusion $G_i\subseteq G_{i_1}\cup\ldots\cup G_{i_d}$ and, hence, also the inclusion $E^1_i\subseteq E^1_{i_1}\cup\ldots\cup E^1_{i_d}$. Hence, vector $v$ is a $(dr)$-limit for the union $E^1_{i_1}\cup\ldots\cup E^1_{i_d}$ of $d$ sets and, by Property 1 of finite limits, vector $v$ was an $r$-limit for some of these sets $E^1_{i_{\ell}}$, meaning that, $v\in H^1_{i_{\ell}}$, a contradiction with \eqref{eq:not-hard}. This finishes the proof of Claim 1. ∎
If $E^0\subseteq U^0$ and $E^1\subseteq U^1$ denote the sets of vectors which were hard for none of the gates $f_1,\ldots,f_t$, then $U^0=E^0 \cup H_1^0\cup \cdots H_t^0$ and $U^1=E^1 \cup H_1^1\cup \cdots H_t^1$.
Claim 2: Either $E^0=\emptyset$ or there exists an $(s-1)$-element subset $S_0$ of $\{1,\ldots,n\}$ such that $v(S_0)\not\equiv 0$ for all vectors $v\in E^1$. Dually, either $E^1=\emptyset$ or there exists an $(r-1)$-element subset $S_0$ of $\{1,\ldots,n\}$ such that $u(S_0)\not\equiv 1$ for all vectors $u\in E^0$.Proof: If $E^0=\emptyset$, there is nothing to do. Suppose therefore that $E^0\neq\emptyset$ and take an arbitrary vector $u\in E^0$. By the definition of $E^0$, this vector $u$ was hard for none of the gates, and in particular, was not hard for the last gate $f_t$. Since our circuit computes the function $f$ we have that $f_t=f$, and hence (by the definition of hardness), vector $u$ has a legal witness $S_0$ of size $|S_0|\leq s-1$ against the set $f^{-1}(1)\setminus\left(H_1^1\cup \cdots H_{t-1}^1\right)$. Since $E^1$ is a subset of this set, $S_0$ is a legal witness of $u$ also against $E^1$. Since $f(u)=0$, the legality of $S_0$ means that $u(S_0)\equiv 0$. Hence, $v(S_0)\not\equiv 0$ for all vectors $v\in E^1$, as claimed. ∎
We can now finally show that our target function $f=f_t$ is $(d,t)$-simple.
Case 1: $E^0=\emptyset$. Then the left part is the union $U^0 = H_1^0\cup \cdots H_t^0$ of the sets of negative input vectors that were hard for the gates $f_1,\ldots,f_t$. Since we only have $t$ gates, Claim 1 gives us an exact $s$-CNF $C$ consisting of $\leq t\cdot (dr-1)^s$ clauses such that $C(u)=0$ holds for all vectors $u\in U^0$.
Case 2: $E^0\neq\emptyset$. The right part is the union $U^1=E^1 \cup H_1^1\cup \cdots H_t^1$. By Claim 2, there is a clause $c(x)=\bigvee_{i\in S_0}x_i$ with $|S_0|\leq s-1$ variables such that $c(v)=1$ holds for all vectors $v\in E^1$. Since we only have $t$ gates, Claim 1 gives us an exact $r$-DNF $D$ consisting of $\leq t\cdot (ds-1)^r$ clauses such that $D(v)=1$ holds for all vectors $u\in U^1\setminus E^1$. Hence, $D(v)\lor c(v)=1$ holds for all vectors $v\in U^1$. This completes the proof of the theorem. ∎
Remark 2: Note that the only place where the monotonicity of operations $\varphi:\RR^d\to \RR$ performed at the gates was used was to show that if $f_i=\varphi(f_{i_1},\ldots,f_{i_d})$, then the inclusion $G_i\subseteq G_{i_1}\cup\cdots\cup G_{i_d}$ holds. That is, if a pair $(u,v)\in f^{-1}(0)\times f^{-1}(1)$ is correctly separated by the gate $f_i$, then it was correctly separated by at least one of the previous gates $f_{i_j}$. $\Box$
It is known that this graph is $(n-1)/2$-regular, and has the following ``uniform neighbourhood'' property: for every two disjoint sets $A,B$ one part $V_1$ or $V_2$ of the bipartition, with $|A|+|B|=k$, $k \lt (\log n)/4$, the number $\bound(k)$ of vertices (in the opposite part) that are adjacent with all vertices in $A$ and are nonadjacent with all vertices in $B$ is very close to $n/2^k$, namely \begin{equation}\label{eq:paley} |\bound(k) - n/2^{k}|\leq k\sqrt{n}. \end{equation} In the case of usual (non-bipartite) Paley graphs this property was established by Bollobas and Thomason [3]; the bound in this case is even better: $|\bound(k) - n/2^{k}|\leq k\sqrt{n}/2+k/2$. In the bipartite case one must be more careful because now we have two copies of $\gf{n}$, and hence, no of the edges $(x,x)$ is present in the graph. Still, with slight modification, the proof carries over also to bipartite case (with slightly worse upper bound $k\sqrt{n}/2+k$, which is still $\leq k\sqrt{n}$ for $n\geq 5$); an analysis for the bipartite case is given in [1,2].
Define $\paley(n,k)$ to be the function of $2n$ Boolean variables representing the vertices in $V_1\cup V_2$, which accepts a set of vertices iff this set contains some $k$-element subset $A\subseteq V_1$ together with the set of its common neighbours \[ \all{A}\df\{y\in V_2\colon (x,y)\in E\mbox{ for all vertices } x\in A\}. \] Thus, the sets of the form $A\cup\all{A}$, with $A\subseteq V_1$ and $|A|=k$ are $1$ inputs of $\paley(n,k)$. To define a set of $0$-inputs, let $\no{B}$ denote the set of all common non-neighbours, i.e. \[ \no{A}\df\{y\in V_2\colon (x,y)\not\in E\mbox{ for all vertices } x\in A\}. \] If $k\leq \tfrac{1}{6}\log n$ and $n$ is sufficiently large, then, by the above mentioned uniform neighbourhood property \eqref{eq:paley} of Paley graphs, we have that \[ \bound(2k) \geq n/2^{2k}- 2k\sqrt{n} \geq n/n^{1/3}-\tfrac{1}{3}\sqrt{n}\log n \gt 0\,. \] Hence, for any pair of disjoint $k$-element sets $A,B\subseteq V_1$ there must be a vertex $y\in V_2$ which is a common neighbour of all the vertices in $A$ and is isolated from all the vertices in $B$. Thus, \[ \mbox{$A\cap B=\emptyset$ iff $\all{A}\cap\no{B}\neq\emptyset$.} \] This implies that sets of the form $B\cup\no{B}$ with $B\subseteq V_1$ and $|B|=k$ intersect all the $1$-inputs and contain none of them; hence, these sets are $0$-inputs for $\paley(n,k)$.
For $k=\Theta(\log n)$ the minterms of $\paley(n,k)$ have size at most $k+\bound(k)\leq n$. Hence, the function can be computed by a trivial monotone circuit using $n{{n}\choose{k}}\leq n^{O(\log n)}$ fanin-2 And and Or gates. Is this trivial upper bound optimal? It was proved in [1,2,4] that in the case of, so-called, monotone span programs we cannot do better: any such program for $\paley(n,k)$ with $k=\Theta(\log n)$, requires size $n^{\Omega(\log n)}$. We prove that the same holds also in the case of monotone real circuits.
Theorem 2: Let $n\equiv 1 \mod 4$ be a sufficiently large odd prime power, $k=\lfloor(\log n)/6\rfloor$ and $f=\paley(n,k)$. Then $\real{f}{d}=n^{\Omega(\log n)}$ holds for any fanin $d\leq n^{1/14}$ of gates.Proof: We are going to apply Theorem 1 with equal parameters $s=r:=k$.
Claim 1: Any $k$-element set $S\subseteq V_1\cup V_2$ is contained in at most $n^{k(1-c)}$ positive inputs, and in at most $n^{k(1-c)}$ negative inputs, where $c\geq 1/13$.Proof: Let $A_i\cup\all{A_i}$, $i=1,\ldots,m$ be all positive inputs containing the set $S$. Then \[ \bigcap_{i=1}^m A_i\supseteq S\cap V_1\ \ \mbox{ and }\ \ S\cap V_2\supseteq \bigcap_{i=1}^m\all{A_i}\ \ \mbox{ or equivalently }\ \ \bigcup_{i=1}^m A_i\subseteq \all{S\cap V_2}. \] If less than half of the vertices of $S$ lie in $V_2$ then $|S\cap V_1|\geq k/2$, and hence, $m\leq \binom{n-k/2}{k-k/2} \leq n^{k/2}$. Otherwise, $|S\cap V_2|\geq k/2$, and we have $m\leq \binom{\bound(k/2)}{k} \leq \bound(k/2)^k$. Since $k\leq \tfrac{1}{6}\log n$, the above mentioned uniformity property \eqref{eq:paley} of Paley graphs gives an upper bound \[ \bound(k/2)\leq n/2^{k/2}+(k/2)\sqrt{n} \leq n^{1-1/12}+\tfrac{1}{12}\sqrt{n}\log n\leq n^{1-1/13}\,. \] Thus, $m\leq \bound(k/2)^k\leq n^{k(1-1/13)}$. The proof for the negative inputs is the same using the sets $\no{A_i}$ of common non-neighbors instead of sets $\all{A_i}$ of common neighbors. $\Box$
Claim 2: At least a half of all positive inputs avoid any fixed $k$-element set $S_0\subseteq V_1\cup V_2$.Proof: The number of positive (as well as of negative) inputs is $\binom{n}{k}$. To estimate the number of positive inputs avoiding the set $S_0$, let $A_i\cup\all{A_i}$, $i=1,\ldots,m$ be the inputs containing a fixed vertex $x\in S_0$. If $x\in V_1$ then clearly, $m\leq \binom{n-1}{k-1}=\frac{k}{n}\binom{n}{k}$. If $x\in V_2$ then $\left|\bigcup_{i=1}^mA_i\right|$ cannot exceed the degree $(n-1)/2$ of vertex $x$, because every vertex of $\bigcup_{i=1}^mA_i\subseteq V_1$ is a neighbor of vertex $x$. Hence, $m\leq \binom{(n-1)/2}{k} = \binom{n-b}{k}$ for $b:=(n+1)/2$. Since $\binom{n-b}{k}\left/\binom{n}{k}\right. \leq \left(\frac{n-k}{n}\right)^b\leq e^{-(k/n)b}\leq e^{-k/2}$, we have that in both cases at least $1-|S_0|\cdot \max\{k/n, e^{-k/2}\}\geq 1-o(1)$ fraction of all $\binom{q}{k}$ positive inputs avoid the set $S_0$. $\Box$
Together with Claims 1 and 2, Theorem 1 implies that our circuit must have size (recall that $k=\lfloor(\log n)/6\rfloor$ and $d\leq n^{1/14}$): \[ t\geq \frac{1}{2}\cdot \frac{\binom{n}{k}}{(dk)^k\cdot n^{k(1-1/13)}} \geq \frac{(n/k)^k}{(dk)^k\cdot n^{k(1-1/13)}} =\left(\frac{n^{1/13}}{dk^2} \right)^k =n^{\Omega(\log n)}\,. \] This completes the proof of Theorem 2. ∎
Remark 3 (Small vs. large fanin): Note that the lower bound in Theorem 2 holds when gates of large fanin $d$ are allowed. Hrubes and Pudlák [5] have show that for every monotone boolean function $f:\{0,1\}^n\to\{0,1\}$, $\real{f}{d}$ is at most a constant times $n^{d-2}\cdot \real{f}{2}$. Thus, any lower bound $\real{f}{2}\geq t$ for fanin-$2$ circuits yield a lower bound $\real{f}{d}\Geq t/n^{d-2}$ for fanin-$d$ circuits. Note, however, that for the Paley function $f=\paley(n,k)$, we have an upper bound $\real{f}{2}\leq n^{O(\log n)}$, the lower bound $\real{f}{d}\Geq \real{f}{2}/n^{d-2}$ for fanin-$d$ circuits can only be non-trivial for $d\leq \log n$. Still, Theorem 2 gives a lower bound $\real{f}{d}\geq n^{\Omega(\log n)}$ even for circuits of fanin $d=n^{1/14} \gg \log n$. $\Box$