Sketch of the solution of Problem 1.33

Recall that a monotone CNF \begin{equation} F(x)=\Big(\bigvee_{w\in S_1}x_w\Big)\land \Big(\bigvee_{w\in S_2}x_w\Big) \land \cdots\land \Big(\bigvee_{w\in S_t}x_w\Big)\ \ \ \ \ \ \ \ \ \ (*) \end{equation} where $S_i\subseteq U\cup V$, accepts a pair of vertices $(u,v)\in U\times V$ iff $\{u,v\}\cap S_i\neq \emptyset$ for all $i=1,\ldots,t$. The CNF represents a bipartite graph $G\subseteq U\times V$ iff it accepts exactly the edges of $G$. Let $cnf(G)$ is the smallest number $t$ of cluases in a monotone CNF representing $G$.

By slightly weakening a question of Pudlak, Rödl and Savicky (1988), the Research Problem 1.33 in the book asked:

Does there exist a constant $\epsilon>0$ such that $cnf(G)\geq d^{\ \epsilon}$ for every bipartite $K_{2,2}$-free $n\times n$ graph $G$ of average degree $d$?
By using probabilistic arguments, Nets Hawk Katz solved this problem negatively:
Theorem (Katz): For every constant $\epsilon >0$, there exists a bipartite $K_{2,2}$-free $n\times n$ graph $G$ of average degree $d\geq n^{1/10-\epsilon}$ such that $cnf(G)=O(\log d)$.
This, however, does not completely exclude that $K_{2,2}$-free graphs might be good candidates, because the argument requires that the average degree in this construction is rather small. On the other hand, we already know constructions of bipartite $K_{2,2}$-free $n\times n$ graphs of almost optimal average degree about $n^{1/2}$ (see e.g. page 208). So, a positive answer to the following modified problem would have the same consequences in circuit complexity.
Research Problem 1.33 (modified):
Does there exist a constant $\epsilon>0$ such that $cnf(G)\geq d^{\ \epsilon}$ holds for every bipartite $K_{2,2}$-free graph $G$ of average degree $d\geq n^{1/2-\epsilon}$?
In fact, it would be enough to show this for subgraphs of a any (fixed) known bipartite $K_{2,2}$-free $n\times n$ graph of average degree $d=\Omega(n^{1/2})$.

Proof sketch of the theorem

The proof idea is to show that, with nonzero probability, a random monotone CNF with $t=O(\log n)$ clauses represents a relatively dense graph with relatively few copies of $K_{2,2}$. By shortening the clauses in the corresponding CNF one obtains the desired graph.

Consider a random clause $C=\bigvee_{w\in S}x_w$, where $S\subseteq U\cup V$ is a random subset of vertices obtained by including each vertex independently with probability $1-p$. We want to upper bound the probability $q$ that the clause $C$ accepts a fixed copy of $K_{2,2}$.

The probability that $C$ accepts one fixed edge of $K_{2,2}$ is the probability $1-p^2$ that $C$ does not reject both its endpoints. However, the four events corresponding to the four edges of $K_{2,2}$ are not independent, and it does not follow that $q\leq (1-p^2)^4$ (we pick vertices, not edges in the CNF). Still using the inclusion-exclusion principle we can show that $q$ is not much larger than $(1-p^2)^4$:

Lemma: A random clause accepts a given $K_{2,2}$ with probability $q\leq (1-p^2)^4+6p^3-3p^4$.
Proof: If we let $A_i$ (for $i=1,2,3,4$) to denote the event that a fixed subset of $i$ edges of $K_{2,2}$ are excluded by the clause, then $Pr(A_1)=p^2$, $Pr(A_2)\leq p^3$ and $Pr(A_3)=Pr(A_4)=p^4$, because two edges of $K_{2,2}$ are incident with at least three vertices, and three edges are already incident with all four vertices; all these vertices must be excluded by the clause. The inclusion-exclusion principle gives \begin{align*} q&\leq 1-\binom{4}{1}Pr(A_1)+\binom{4}{2} Pr(A_2)-\binom{4}{3} Pr(A_3)+\binom{4}{4} Pr(A_4)\\ &=1-\binom{4}{1}p^2+\binom{4}{2}p^3-\binom{4}{3}p^4+\binom{4}{4}p^4\\ &=1-4p^2+6p^3-3p^4. \end{align*}

The inequality $1-4p^2\leq (1-p^2)^4$ follows from the Bernoulli inequality $1+nx\leq (1+x)^n$ holding for any nonnegative integer $n$ and every real number $x\geq -1$. Q.E.D.

Now let $G$ be a random bipartite $n\times n$ graph represented by a monotone CNF (*) consisting of $t$ independently chosen random clauses. We know that each edge appears in $G$ with probability $(1-p^2)^t$. We take an integer $k=\Theta(n^{1/10})$, and let $t$ be the nearest integer to $p^{-2}\ln (n/k)$; hence, $t=O(\log n)$. Then \[ (1-p^2)^t \sim \frac{k}{n}. \] Thus, we can expect at most \[ [(1-p^2)^4+6p^3]^tn^4\leq (k/n)^{4-\delta}n^4=n^{\delta}k^{4-\delta} \] copies of $K_{2,2}$ in $G$, where $\delta=\delta(p)$ is small. We can make $n^{\delta}k^{4-\delta}$ at most $k^5$ by picking $p$ in $\delta=\delta(p)$ sufficiently small. But what can be said about the average degree of $G$? We know that the expected number of edges in $G$ is $(k/n)n^2=kn$. Unfortunately, edges are not independent, and we cannot apply Chernoff bounds to the edges. But we can try to estimate the degrees of vertices. So, take a vertex $u\in U$, and let $I_u=\{i\colon u\not\in S_i\}$. Then \[ d_u=\Big|\bigcap_{i\in I_u}B_i \Big|. \] Since $Pr(w\in S_i)=1-p$, the expected degree is $d_u=(1-p)^m n$ where $m=|I_u|$. Since $Pr(u\not\in S_i)=p$ and the events are now independent, Chernoff-Hoeffding bounds imply that, with large probability, $pt -c\sqrt{t}\leq m\leq pt +c\sqrt{t}$, if the constant $c$ is large enough. Thus, $d_u=(1-p)^m n\geq (1-p)^{pt +c\sqrt{t}}n$ holds with large probability.

Now let $\epsilon >0$ be an arbitrarily small constant. Using the fact that $(1-p)^p=1-p^2+o(p^2)$, and that $(1-p^2)^t\sim k/n$ (by our choice of $t$), one can then show that, with probability $>1/2$, at least half of vertices in $U$ have degree between about $k^{1-\epsilon}$ and $k^{1+\epsilon}$. Thus, there exists an $n\times n$ graph $G$ with at least $\Omega(k^{1-\epsilon}n)$ edges, and with at most $O(k^5)$ copies of $K_{2,2}$ such that $cnf(G)=t=O(\log n)$. By removing two vertices in $U$ from each copy, we obtain a $K_{2,2}$-free graph $G'$ of average degree at least about $k^{1-\epsilon}=n^{1/10-\epsilon}$ such that $cnf(G')\leq t=O(\log n)$, as claimed. Q.E.D.

Note: Note that, even making more precise computations, the construction cannot produce a $K_{2,2}$-free graph of average degree much larger than $n^{1/3}$, because then the expected number $n^{\delta}k^{4-\delta}$ of $K_{2,2}$'s would exceed the expected number $kn$ of edges.