Contents

- 1 Highest Lower Bounds via Easy Arguments
- 2 Approximating circuits
- 2.1 Things get harder: exponential gaps
- 2.2 The balanced decomposition lemma
- 2.3 Decomposition lemma for approximating circuits
- 2.4 Explicit lower bounds for approximating circuits
- Appendix A: The rectangle lemma of Hyafil and Valiant
- Appendix B: Homogenization
- Appendix C: Non-monotone circuit lower bounds from monotone circuit lower bounds
- References

In this comment, we recall the simplified proof of Gashkov-Sergeev general lower bound. We then turn to monotone arithmetic circuits that, instead of computing a given multilinear polynomial $P$ *exactly*, are only required to *approximate* it within a given factor $r\geq 1$: compute any polynomial $Q$ such that every monomial of $Q$ is a part of at least one monomial of $P$, and every monomial of $P$ shares at least a $1/r$ portion of its variables with at least one monomial of $Q$. There are no further restrictions on the polynomial $Q$ to be computed. Motivation: lower bounds on approximating monotone arithmetic circuits are lower bounds on the approximating tropical $(\max,+)$ circuits and, hence, on approximating pure dynamic programming algorithms.

Let $\NN=\{0,1,2,\ldots\}$. The *Minkowski sum* or *sumset*
of two sets $X$ and $Y$ of vectors in $\NN^n$ is the set
$X+Y=\{x+y\colon x\in X\ \mbox{ and }\ y\in Y\}$
of vectors in $\NN^n$, where $x+y=(x_1+y_1,\ldots,x_n+y_n)$ is the
componentwise sum of vectors $x$ and $y$. That is, we add every vector
of $X$ to every vector of $Y$. A *Minkowski circuit* has the $n+1$ singleton sets $\{\vec{0}\},\{\vec{e}_1\},\ldots,\{\vec{e}_n\}$ as inputs, where $\vec{e}_i=(0,\ldots,0,1,0,\ldots,0)$ is the $i$th unit vector. Gates are fanin-2 union $(\cup)$ and Minkowski sum $(+)$ gates. The *size* of a circuit is the number of its (non input) gates.

At each node $v$ of such a circuit, some set $X_v\subseteq \NN^n$ of
vectors is *produced* in a natural way. If $v$ is a source node holding
a single-element set $\{x\}$, then $X_v=\{x\}$. If $v$ is a gate
entered by nodes $u$ and $w$, then $X_v=X_u\cup X_w$ if $v$ is a
union gate, and $X_v=X_u+ X_w$ if $v$ is a Minkowski sum gate. The set
produced by the entire circuit is the set $A=X_o$ produced at the output gate $o$.
For a set $A\subseteq \NN^n$ of
vectors, let $\Un{A}$ denote the minimum size of a Minkowski circuit producing $A$.

There is a natural relation between Minkowski and monotone arithmetic $(+,\times)$ circuits. For a polynomial $f(x_1,\ldots,x_n)$, let $A_f\subseteq \NN^n$ denote the set of its exponent vectors. Hence, $f$ is of the form $f(x)=\sum_{a\in A_f}\, c_a\prod_{i=1}^n x_i^{a_i}$ for some coefficients $c_a > 0$. Let $\arithm{f}$ denote the minimum size of a monotone arithmetic $(+,\times)$ circuit computing the polynomial $f$. The following simple fact motivates Minkowski circuits.

Fact 1:For every polynomial $f$, we have $\arithm{f}\geq \Un{A_f}$.

In fact, almost all known lower bounds on the monotone algebraic complexity of polynomials $f$ were obtained by proving lower bounds on $\Un{A_f}$.

**Example ** (due to Schnorr):
Let $n=\tbinom{m}{2}$, and let $A$ be the set of all $|A|=\tbinom{m}{k}$
characteristic $0$-$1$ vectors of $k$-cliques, viewed as sets of
their edges in a complete graph $K_m$ on $\{1,\ldots,m\}$ nodes. It is easy to verify that no union of two $k$-cliques can
contain some third $k$-clique. Indeed, the latter clique must then
have a node $u$ not in the first clique and a node $v$ not in the
second clique. If $u=v$ then the node $u$ is not covered, and if
$u\neq v$ then the edge $\{u,v\}$ is not covered by the first two
cliques, a contradiction.

By taking the clique size $k=m/2$, we have an explicit set $A\subset\{0,1\}^n$ with $\Un{A}\geq 2^m/2\sqrt{m}= 2^{\sqrt{n}-O(\log n)}$. Explicit lower bounds of the form $\Un{A}=2^{\Omega(\sqrt{n})}$ were also proved by Jerrum and Snir (1982) using different arguments. Gashkov (1987) and then Gashkov and Sergeev (2012) have extended Schnorr's lower bound to a much larger class of so-called "thin" sets, which led to explicit almost maximal possible lower bound of the form $\Un{A}\geq 2^{n-o(n)}$.Theorem 1(Schnorr 1976): For every cover-free set $A\subset\NN^n$, we have $\Un{A}\geq |A|-1$.

A set $A\subseteq \NN^n$ is $(k,l)$-*thin* ($1\leq k\leq l$) if $X+Y\subseteq A$ implies that $|X|\leq k$ or $|Y|\leq l$. In other words, if $\chi_A:\NN^n\to\{0,1\}$ is the characteristic function of a set $A\subset\NN^n$ (with $\chi_A(x)=1$ iff $x\in A$), then $A$ is $(k,l)$-thin iff for every $k+1$ distinct vectors $a_1,\ldots,a_{k+1}\in\NN^n$, the system of equations $\chi_A(a_1+x)=1,\ldots, \chi_A(a_{k+1}+x)=1$ has at most $l$ solutions $x\in\NN^n$. In particular, a set $A$ is $(1,1)$-thin iff for every $a\neq b\in\NN^n$, the system $\chi_A(a+x)=1,\chi_A(b+x)=1$ has at most one solution $x\in\NN^n$.

**Remark:** In terms of polynomials,
a polynomial $P$ is $(k,l)$-thin if it cannt be written in
the form $P=P_1\cdot P_2+P_3$ with both $|\mon{P_1}|>k$ and
$|\mon{P_2}|>l$. (We only consider polynomials with positive coefficients; so, there are no cancellations.)

**Remark:** The concept of thin sets is motivated
by the classical combinatorial Zarankiewicz problem: how many ones can a $0$-$1$ matrix have if it does not have any $(k,l)$ all-$1$ submatrix? Now associate with a finite set $A\subset\NN^n$ of vectors the (infinite) $0$-$1$ matrix $M_A$ whose rows and columns are labeled by vectors in $\NN^n$. Let $M_A[x,y]=1$ iff $x+y\in A$. Then $A$ is $(k,l)$-thin iff the matrix $M_A$ has no $(k+1,l+1)$ all-$1$ submatrix.

For every set $A\subseteq \NN^n$, the number $|A+A|$ of distinct vectors in the sumset $A+A$ is at most the number of all sets $\{a,b\}$ with $a,b\in A$, which is at most the number $\binom{|A|}{2}$ of sets with $a\neq b$ plus the number $|A|$ of sets with $a=b$. So,
\[
|A+A|\leq \binom{|A|}{2}+|A|=\binom{|A|+1}{2}\,.
\]
(This can be also shown by an easy induction on $|A|$; see here). Sidon sets are those sets $A$ for which
the equality $|A+A|=\binom{|A|+1}{2}$ holds. In other words,
a set $A\subseteq \NN^n$ is a *Sidon set* if, for every vectors $a,b\in A$,
\[
\tag{$\ast$}
\mbox{the sum $a+b$ determines the pair $\{a,b\}$,}
\]
that is, if $a+b=c+d$ for some $c,d\in A$, then $\{c,d\}=\{a,b\}$.
Note that in order to show that a set $A$ is a Sidon set, it is enough to only show that $(\ast)$ holds for all pairs of *distinct* vectors $a\neq b\in A$: if the sum $a+a$ of some vector $a\in A$ with itself does not determine the vector $a$, then $a+a=c+d$ but $\{a\}\neq \{c,d\}$ holds for some $c,d\in A$. Since $x+x=y+y$ over the semigroup $(\NN^n,+)$ implies $x=y$, we then have $c\neq d$. But then the condition $(\ast)$ is violated by the pair $c\neq d$ of distinct vectors of $A$.

One can also show that $(1,1)$-thin $\Rightarrow$ Sidon; see here.Fact 2:Cover-free $\Rightarrow$ Sidon $\Rightarrow$ $(1,1)$-thin.

**Proof:**
If a set $A\subseteq \NN^n$ is cover-free, then it is also a Sidon set just because $a+b=c+d$ yields both inequalities $a+b\geq c$ and $a+b\geq d$.
Now let $A\subseteq\NN^n$ be a Sidon set, and assume to the contrary that $A$ is not $(1,1)$-thin. Then
$\{x,x'\}+\{y,y'\}\subseteq A$ holds for some vectors $x\neq x'$ and
$y\neq y'$. The sum $a+b$ of the two vectors $a=x+y$ and $b=x'+y'$
of $A$ is then equal to the sum $c+d$ of the two vectors $c=x+y'$
and $d=x'+y$ of $A$. Since $A$ is a Sidon set, at least one of
$x+y'=x+y$ or $x+y'=x'+y'$ must hold, contradicting that $x\neq x'$
and $y\neq y'$.
$\Box$

Gashkov proved the following lower bound on the Minkowski complexity of $(k,l)$-thin sets when $k=l$.

Later, Gashkov and Sergeev (2012) extended this lower bound to the inbalanced case when $1\leq k\leq l$.Theorem 2(Gashkov 1987): For every $(k,k)$-thin set $A\subset\NN^n$, we have $\Un{A}\geq |A|/k^3-1$.

The proofs of these lower bounds use clever, but technically involved, analysis of monotone arithmetic circuits. Motivated by the argument used by Pippenger (1980), we gave a short and amazingly simple proof of the following lower bound.Theorem 3(Gashkov-Sergeev 2012): For every $(k,l)$-thin set $A\subset\NN^n$, we have $\Un{A}\geq |A|/\max\{k^3,l^2\}-1$.

This lower bound holds for slightly more general Minkowski circuits, whereTheorem 4(Jukna 2017): For every $(k,l)$-thin set $A\subset\NN^n$ of size $|A|>k$, we have $\Un{A}\geq |A|/2lk^2$.

What we actually show is that every $(k,l)$-thin set $A\subset\NN^n$ of Minkowski circuit complexity $t=\Un{A}$ is a union of at most $2t$ sumsets $X+Y\subseteq A$ such that $|X|\leq k^2$ and $|Y|\leq l$.

In the next section, we show how either of Theorems 3 and 4, together with known construction of large sufficiently thin sets $A$, yield truly exponential lower bounds on $\Un{A}$. Then we give an easy proof of Theorem 4.

** Example** (Greedy construction) A Sidon set $A\subseteq\{0,1\}^n$ of size $|A|\geq c 2^{n/3}$ for a constant $c > 0$ can be constructed by a simple greedy algorithm. Start with an empty set $A=\emptyset$, and at each step, include a vector $x$ in $A$ if $x$ is not of the form $x=a+b-c$ for some vectors $a,b,c$ already in the set $A$. If we already have a set of size $|A|=K$, and if $K+K^3 < 2^n$, at least one new vector $x$ will be included in $A$. So, the final Sidon set will have $|A|\geq c2^{n/3}$ vectors for a constant $c>0$.

** Example ** (Cubic parabola)
Let $q$ be an even prime power, and let $\FF=\gf{q}$ be the field with $q$ elements. The (graph of the) cubic parabola over $\FF$ is the set $A=\{v(x)\colon x\in\FF\}$ with $v(x):=(x,x^3)$. We claim that $A$ is a Sidon set over the group $(\FF,+)$. To show this, fix a pair $(a,b)\in \FF^2$ of elements, and
suppose that $v(x)+v(y)=(a,b)$ holds for two elements $x\neq y$ in $\FF$. Then $x+y=a\neq 0$. Since $x+x+x=x$ over $\FF$ ($q$ is even), we also have $x^3+y^3=(x+y)^3-3xy(x+y)=a^3-xyb$. So, the second equation $x^3+y^3=b$ yields $a^3-xya = b$, that is, $xy=c$ for $c=(b/a)-a^2$. Since the system $x+y=a, x\cdot y=c$ cannot have
more than two solutions in the field, we find that $\{x, y\}$ is uniquely determined by $(a,b)$.
By taking $n=2m$, $q=2^m$ and identifying the elements of $\gf{q}=\gf{2^m}$ with vectors in $\{0,1\}^n$,
we obtain a set $A\subseteq\{0,1\}^n$ of size $|A|=2^m=2^{n/2}$
such that for every vectors $a\neq b\in A$, the vector $a+b$ determines the pair $\{a,b\}$ over $(\FF,+)$ and, hence, also over $(\NN^n,+)$. Since over $(\NN^n,+)$, each sum $a+a$ also determines $a$,
$A$ is a Sidon set.

Corollary 1:Let $f(x_1,\ldots,x_n)$ be a multilinear polynomial whose set of exponent vectors forms a cubic parabola in $\gf{2^{n}}$.Then $\arithm{f}\geq 2^{n/2}$.

** Example** (Norm sets)
Let $q$ be a prime-power, $t\geq 2$ an integer, and consider the field
$\gf{q^t}$ with $q^t$ elements. The *norm* is a mapping $N:\gf{q^t}\to \gf{q}$
given by $N(a)=a\cdot a^q\cdots a^{q^{t-1}}=a^{(q^t-1)/(q-1)}$. Consider the set $A=\{a\in\gf{q^t}\colon N(a)=1\}$. It is known (see Lidl and Niederreiter 1986) that $|A|=(q^t-1)/(q-1)$.
Kollár, Rónyai and Szabó (1996) proved that
for every $t\geq 2$ distinct elements $a_1,\ldots,a_t$ of $\gf{q^t}$, the system of equations
$
N(a_1+x)=1, N(a_2+x)=1,\ldots,N(a_t+x)=1
$
has at most $t!$ solutions $x\in \gf{q^t}$. So, the set $A$ is $(t+1,t!)$-thin over the group $(\gf{2^n},+)$.

Now let $n$ be a positive integer of the form $n=r\cdot t$ for $t\geq 2$, and let $q=2^r$.
Hence, $\gf{q^t}=\gf{2^n}$.
By viewing elements of $\gf{2^n}$ as vectors in $\{0,1\}^n$,
we obtain a set $A$
of $|A|=(q^t-1)/(q-1)\geq q^{t-1}= 2^{n-r}$
vectors in $\{0,1\}^n$ which is $(t+1,t!)$-thin over $(\gf{2^n},+)$, and hence, also over the semigroup $(\NN^n,+)$. Call the set $A$ the $(n,t)$-*norm set*.

For $t=\sqrt{n}$, we obtain $\arithm{f}\geq 2^{n-o(n)}$.Corollary 2:Let $f(x_1,\ldots,x_n)$ be a multilinear polynomial whose set of exponent vectors forms an $(n,t)$-norm set $A\subset\{0,1\}^n$. Then $\arithm{f}\geq 2^{n-n/t-2t\log t}$.

**Proof:** We know that the set $A$ is $(k,l)$-thin for
$k=t+1$ and $l=t!\sim(t/e)^t$. Thus, Theorem 4 yields
$\Un{A}\geq |A|/2lk^2\geq |A|/t^{2t}\geq 2^{n-n/t-2t\log t}$.
$\Box$

**Remark:** Lower bounds of the form $\Un{A}\geq 2^{cn}$ for various constants $0 < c < 1$ were also proved using different arguments.
Raz and Yehudayoff (2011) used discrepancy
arguments and exponential sum estimates to derive such a bound.
Jukna (2014) proved such a bound using graphs with good expansion properties (Ramanujan graphs). Cavalar, Kumar and Rossman (2020) proved such a bound using recently constructed error-correcting codes with very good parameters, Srinivasan (2020) proved such a bound using communication complexity arguments.

**Remark:** Actually, a lower bound $\Un{A}\geq 2^{n/6}$ already follows from a lower bound $2^{n/3}$ of Kuznetsov (1981) on the size of Boolean $\{\land,\lor,\neg\}$ circuits without null-chains. Every Boolean circuit not only computes a unique Boolean function, but also produces (purely syntactically) some DNF $D$. When forming this DNF, we allow the laws $x\cdot x=x\lor xy=x$ be used but the law $x\land\overline{x}=0$ is not used. So, in general, the produced DNF may contain zero terms, i.e. those containing a variable together with its negation. A circuit is null-chain free if $D$ contains no zero terms. He proved that if every two vectors in every two vectors accepted by a Boolean function $f(x_1,\ldots,x_n)$ differ in at least $d$ positions, then every Boolean $\{\land,\lor,\neg\}$ circuits without null-chains computing $f$ must have at least $|f^{-1}(1)|^{d/(n+d)}$ gates. For characteristic functions $f$ of linear codes, this leads to lower bounds of the form $2^{\Omega(\sqrt{n})}$ (say, for Reed-Muller code with $d=\sqrt{n}$) and even of the form $2^{\Omega(n)}$ for less trivial codes.

Using another argument, Kuznetsov (1981) proved a lower bound of the form $2^{n/3}$. A *forbidden pattern* is a set $B\subset\{0,1\}^n$ of $|B|=4$ vectors such that, after appropriate permutation of positions, the set is of the form
$B=\{x,x'\}\times\{y,y'\}\times\{z\}$ with $x\neq x'$ and $y\neq y'$.
Using a greedy procedure (as we have done for Sidon sets), it is possible to construct a subset $A\subseteq \{0,1\}^n$ of $|A|\geq 2^{n/3}$ vectors such that $A$ contains no forbidden patterns. Kuznetsov (1981) proved that any null-chain free circuit computing a Boolean function $f_n(x_1,\ldots,x_n)=\bigvee_{a\in A}\bigwedge_{i=1}^nx_i^{a_i}$ requires at least $|A|\geq 2^{n/3}$ gates; here, $x_i^1=x_i$ and $x_i^0=\overline{x}_i$.

To translate this lower bound to one for monotone arithmetic circuits, consider the multilinear polynomial $P_m(x,y)=\sum_{a\in A}\prod_{i=1}^nz_i^{a_i}$ of $m=2n$ variables, where $z_i^1=x_i$ and $z_i^0=y_i$. Note that no monomial of $P_m$ contains both variables $x_i$ and $y_i$. So, given a monotone arithmetic circuit for $P_m$, we can transform it to a Boolean circuit without null-chains by replacing each input variable $y_i$ by a negated Boolean variable $\overline{x}_i$, and by replacing the $+$ and $\times$ gates by $\lor$ and $\land$ gates. The obtained circuit computes $f_n$, and the lower bound $\arithm{P_m}\geq 2^{n/3}=2^{m/6}$ follows.

Apparently unaware of this Kuznetsov's more general 40 years old result, a lower bound of the form $2^{\Omega(n)}$ for a very special case of Boolean circuits without null-chains (so-called DNNFs) was recently proved by Bova et al. (2015). These circuit are, in fact, read-once DeMorgan circuits.

Let $\Phi$ be a Minkowski $(\cup,+)$ circuit producing some set $A\subset\NN^n$; hence, $X_o=A$ holds for the output gate $o$. If $v$ is a non-output gate of this circuit,
then the set
$X_v\subset\NN^n$ of vectors produced at a non-output gate $v$ needs not lie in the set $A$ produced by the entire circuit, but at least one of its translates $X_v+y$ for some vector $y\in\NN^n$ must already lie in $A$. We collect all such vectors $y$ and call the set
\[
Y_v:=\{y\in\NN^n\colon X_v+y\subseteq A\}
\]
the *residue* of $X_v$ (within $A$). The sumset $\cont{v}:=X_v+Y_v$ is the *content* of the gate $v$. Note that $\cont{v}\subseteq A$ already holds for every gate $v$.
For example, if $v$ is the output gate, then
$X_v=A$ and $Y_v=\{0\}$. If $v$ is a source node holding a set $\{x\}$
with $x\in\NN^n$, then $X_v=\{x\}$ and $Y_v$ consists of all vectors
$y\in\NN^n$ such that $x+y\in A$.

The sets $X_v$ produced at gates $v$ and their residues $Y_v$ have the following properties. Let $v$ be a gate entered from gates $u$ and $w$.

- If $v$ is a union $(\cup)$ gate, then $X_v=X_u\cup X_w$ and $Y_v= Y_u\cap Y_w$.
- If $v$ is a Minkowski sum $(+)$ gate, then $X_v=X_u+X_w$ and the residue set $Y_v$ satisfies the inclusions \begin{equation}\label{eq:inclusion} \tag{*} X_{w}+Y_{v}\subseteq Y_{u}\qquad \mbox{and}\qquad X_{u}+Y_{v}\subseteq Y_{w}\,. \end{equation}

When defining the contents $\cont{v}= X_v+Y_v$ of *nodes* $v$ (gates in the circuit), it is irrelevant whether $v$ is a union $(\cup)$ or a Minkowski sum $(+)$ gate: $X_v$ is the set of vectors produced at $v$, and $Y_v$ consists of all vectors $y\in\NN^n$ for which the set $X_v+\{y\}$ lies in $A$.
When defining the contents $\cont{e}$ of *edges* $e=(u,v)$, we break this symmetry. Namely, if the head $v$ of $e$ is a Minkowski sum $(+)$ gate, then we let the content of $e$ be the content $\cont{e}:=X_v+Y_v$ of the head $v$. But if the head $v$ of $e$ is a union $(\cup)$ gate, then we let $\cont{e}:=X_u+Y_v$.

Note that in both case we have $\cont{e}\subseteq A$. Indeed, if the head $v$ of the edge $e=(u,v)$ is a Minkowski sum gate, then $\cont{e}=\cont{v}\subseteq A$. If $v$ of $e$ is a union gate, and $w$ is the second gate entering $v$, then $\cont{e}=X_u+Y_v=X_u+(Y_u\cap Y_w)\subseteq X_u+Y_u=\cont{u}\subseteq A$.

The reason for the asymmetry in this definition is explained by the following lemma.

Lemma 1(The content propagation lemma): Let $v$ be a gate and $a\in\cont{v}$ a vector in its content.

- If $v$ is a union gate, then vector $a$ belongs to the content of
at least oneedge entering $v$, as well as to the content of the tail of this edge.- If $v$ is a Minkowski sum gate, then vector $a$ belongs to the contents of
bothedges entering $v$, as well as to the contents of the tails of these edges.

Now let $v=u+w$ be a Minkowski sum gate. In this case, the contents of both edges $(u,v)$ and $(w,v)$ coincide with the content $\cont{v}$ of the gate $v$. So, it is enough to show that $\cont{v}\subseteq \cont{u}\cap \cont{w}$ holds in this case. This follows directly from the inclusions (*): \[ \cont{v}= X_v+Y_v =(X_u+X_w)+Y_v = X_u+(X_w+Y_v)\subseteq X_u+Y_u=\cont{u}\,, \] and similarly, $\cont{v}=X_v+Y_v\subseteq X_w+Y_w=\cont{w}$. $\Box$

**Remark:** What Lemma 1 ensures is the following. Take an arbitrary vector $a\in A$ in the set $A$ produced by the circuit. This vector belongs to the content $\cont{o}=A$ of the output gate $o$. Start at the output gate $o$, and construct a path in the underlying directed acyclic graph by going
backwards and by always choosing an edge whose content contains vector $a$. Lemma 1 ensures that no such path will get stuck, that is, every such path will eventually reach an input node.

Given this claim, the theorem follows easily. Indeed, if $E$ is the set of all cheap edges in the circuit, then Claim 1 implies that $A\subseteq \bigcup_{e\in E} \cont{e}$. Since $|\cont{e}|\leq lk^2$ holds for every edge $e\in E$, this implies $|A|\leq \sum_{e\in E}|\cont{e}|\leq lk^2|E|$. So, there must be at least $|E|\geq |A|/lk^2$ edges in the circuit. Since every gate (non-source node) in the circuit is entered by exactly two edges, and every edge must enter some gate, the total number of edges in the circuit is twice the total number $t$ of gates. Hence, $t\geq |E|/2\geq |A|/2lk^2$, as claimed.Claim 1:Every vector $a\in A$ belongs to the content of at least one cheap edge.

So, it remains to prove Claim 1. Call a node $u$ in the circuit
*cheap* if $|X_u|\leq k$, and *expensive* otherwise. We
can assume that the output gate is expensive because otherwise the
set $A$
produced by the circuit would be also cheap, and there would be
nothing to prove.
Fix a vector $a\in A$. Start at the output gate of the circuit, and
construct a path in the underlying directed acyclic graph by going
backwards and using the following rule, where $v$ is the last previously reached node.

- If $v$ is a union $(\cup)$ gate, then follow that edge whose content contains vector $a$.
- If $v$ is a Minkowski sum $(+)$ gate, then go to either of the two inputs if they are both expensive or are both cheap, and go to the expensive input if the second input is cheap.

If $v$ is a union gate, then $|\cont{e}|=|X_u+Y_v|\leq |X_u|\cdot|Y_v| \leq k\cdot l$, which is clearly at most $\leq k^2l$. So, the edge $e$ is cheap in this case. If $v$ is a Minkowski sum gate, and $w$ is its second input, then the content of $e$ is the Minkowski sum $\cont{e}=X_v+Y_v=X_u+X_w+Y_v$ of three sets. Since the node $u$ is cheap, rule (ii) in the construction of the path implies that the second node $w$ entering $v$ must have also been cheap, that is, $|X_w|\leq k$ must hold (for otherwise, we would have had chosen $w$ instead of $u$ during the construction of the path). We thus have $|\cont{e}|\leq |X_u|\cdot|X_w|\cdot|Y_v|\leq k^2\cdot l$, as desired. $\Box$

The set $B$ $r$-*approximates* the set $A$ if $B$ lies below $A$ and for every $a\in A$ there is a vector $b\in B$ such that $\skal{a,b}\geq \tfrac{1}{r}\cdot |a|$, where $|a|=\skal{a,a}$; $r\geq 1$ is here the approximation factor. Let $\Unr{A}{r}$ denote the minimum size of a Minkowski circuit $r$-approximating the set $A$:
\[
\Unr{A}{r}=\min\{\Un{B}\colon \mbox{$B\subset\NN^n$ $r$-approximates $A$}\}\,.
\]

**Motivation** (Approximation limits of dynamic programming)
It is known (Jukna and Seiwert 2020) that $\Unr{A}{r}$ is a lower bound on the size of tropical $(\max,+)$ circuits approximating the maximization problem $f(x)=\max_{a\in A} \skal{a,x}$ within the factor $r$. Tropical $(\max,+)$ circuits is a mathematical model to simulate so-called "pure" dynamic programming algorithms, those using only basic operations $\max$ and $+$ in their recursion equations. So, lower bounds on $\Unr{A}{r}$ and, hence, on the size of approximating $(\max,+)$ circuits show the weakness of DP algorithms approximating maximization problems.

Fact:There are explicit Sidon sets $A\subseteq\{0,1\}^n$ such that $\Un{A}\geq 2^{n/4}$ but $\Unr{A}{r}\leq n$ already for $r=2$.

The following fact shows that even Shannon type counting arguments fail for approximating circuits already for very small factors $r=1+o(1)$.

Fact:There exist $2^{2^{\Omega(n)}}$ sets $A\subseteq\{0,1\}^n$ such that $\Un{A}\geq 2^{\Omega(n)}$ but each of these sets can be approximated within a factor $r=1+o(1)$ by a single Minkowski circuit of size $O(n^2)$.

A *norm-measure* of vectors in $\NN^n$ is any assignment $\Nor:\NN^n\to\RR_+$ of
nonnegative real numbers to vectors in $\NN^n$ such that every $0$-$1$
vector with at most one $1$ gets norm at most $1$, and the norm is
monotone and subadditive: $\nor{x}\leq \nor{x+y}\leq \nor{x}+\nor{y}$
holds for all vectors $x,y\in\NN^n$.

Lemma 3(Balanced decomposition lemma, Jukna 2016): If a set $B\subset\NN^n$ can be produced by a Minkowski circuit of size $t$, then $B$ is a union of $t$ sumsets $X+Y\subseteq B$ with the following property.

- $(\ast)$ For every norm-measure $\Nor:\NN^n\to\RR_+$, for every vector $\a\in B$ of norm $\nor{\a}>0$, and every $\bbal$ satisfying $1/\nor{\a}\leq \bbal < 1$ at least one of these sumsets $X+Y$ contains vectors $x\in X$ and $y\in Y$ such that $x+y=\a$ and $\tfrac{\bbal}{2}\cdot\nor{\a}<\nor{x}\leq \bbal\cdot \nor{\a}$.

Note the subadditivity of norms also yields $(1-\bbal)\cdot\nor{\a}\leq \nor{y} < (1-\tfrac{\bbal}{2})\cdot \nor{\a}$. The main difference from previous "balanced decomposition" results for monotone arithmetic circuits (starting from the classical Hyafil-Valiant decomposition) is that one can choose *different* norm-measures $\Nor$ for *different* vectors $b\in B$.

In the following lemma, $0<\bal<1$ is an arbitrary fixed ``balance''
parameter.
Say that a $0$-$1$ vector $a$ *appears $r$-balanced* in $X+Y$ if
\begin{equation}\label{eq:balanced}
\skal{a,x+y}\geq \tfrac{1}{r}\cdot |a|\,,\ \skal{a,x} > \tfrac{\bal}{2r}\cdot|a| \ \mbox{ and }\ \skal{a,y}\geq
\tfrac{1-\bal}{r}\cdot |a|\,.
\end{equation}

Lemma 4:Let $A\subseteq \{0,1\}^n$ be an $m$-homogeneous set, and $1\leq r\leq \bal m$. If $\Unr{A}{r}\leq t$, then there exist $\leq t$ rectangles lying below the set $A$ such that every vector of $A$ appears $r$-balanced in at least one of these rectangles.

- (i) if $b\in B$, then $b\leq a$ for some $a\in A$;
- (ii) if $a\in A$, then $\skal{a,b}\geq \frac{1}{r}\cdot|a|=m/r$ for some $b\in B$.

The balanced decomposition lemma (Lemma 3) gives us a collection of $t$ sumsets $X+Y\subseteq B$ with the following property holding for every norm-measure $\Nor:\NN^n\to\RR_+$, for every vector $b\in B$ of norm $\nor{b}>0$, and every real number $\bbal$ satisfying $1/\nor{\a}\leq \bbal < 1$: at least one of the sumsets $X+Y$ contains vectors $x\in X$ and $y\in Y$ such that $x+y=b$ and $\tfrac{\bbal}{2}\cdot\nor{b}<\nor{x}\leq \bbal\cdot \nor{b}$.

Since the set $A$ consists of only $0$-$1$ vectors, property (i) implies that the set $B$ and, hence, each of the sumsets $X+Y$ consists of $0$-$1$ vectors. In particular, we then have $\skal{x,y}=0$ for all $x\in X$ and $y\in Y$, i.e. $X+Y$ is a rectangle. It remains to show that every vector of $A$ appears $r$-balanced in at least one of these rectangles.

To show this, fix an arbitrary vector $a\in A$, and consider the norm-measure $\nor{x}=\nnor{x}{a}:=\skal{a,x}$. By (ii), for a shadow $b\in B$ of vector $a$ in $B$, we have $\nor{b}\geq m/r$. Since $r\leq \bal m$, the condition $1/\nor{\a}\leq \bbal < 1$ of Lemma 3 is fulfilled. Thus, at least one of the $t$ rectangles $X+Y$ contains vectors $x\in X$ and $y\in Y$ such that $x+y=\a$ and \begin{equation}\label{eq:balanced1} \tfrac{\bbal}{2}\cdot\nor{\a}<\nor{x}\leq \bbal\cdot \nor{\a}\,. \end{equation} Since $\nor{b}\geq m/r$ holds for the shadow $b$ of $a$, we have $\skal{a,x+y}=\skal{a,b}\geq m/r$; this gives us the first inequality of \eqref{eq:balanced}. The second inequality in \eqref{eq:balanced} follows from the first inequality in \eqref{eq:balanced1}: \[ \skal{a,x}=\nor{x} > \tfrac{\bbal}{2}\cdot\nor{b}=\tfrac{\bbal}{2}\cdot \skal{a,b}\geq \tfrac{\bbal}{2}\cdot\frac{m}{r}=\tfrac{\bal}{2r}\cdot m. \] Since $\skal{x,y}=0$, the third inequality in \eqref{eq:balanced} follows from $\skal{a,x+y}\geq m/r$ and the second inequality in \eqref{eq:balanced1}. $\Box$

- $m$-uniformity: every vector has exactly $|a|=m$ ones;
- $d$-disjointness: $\skal{a,b} < d$ for all $a\neq b\in A$.

Lemma 5:Let $A\subseteq \{0,1\}^n$ be an $(m,d)$-design with $d\leq m/2$. Then for every approximation factor $1\leq r\leq m/2d$, we have $\Unr{A}{r}\geq |A|/\ddeg{A}{l}$ for $l=m/4r$.

Since $|z|\geq l$, the claim yields the desired inequality $|A_R|\leq \ddeg{A}{l}$. So, it remains to prove the claim.Claim:The componentwise OR $z$ of all vector in $X$ is contained in all vectors $a\in A_R$.

To do this, take an arbitrary vector $a\in A_R$. We have to show that $z\leq a$ holds. Then $\skal{a,y_a}\geq d$ for some vector $y_a\in Y$. On the other hand, since the rectangle $R$ lies below $A$, every vector $x\cup y_a$ with $x\in X$ must be contained in some vector of $A$. Since all these vectors contain the vector $y_0$ with $|y_0|\geq d$ ones, the $d$-disjointness of $A$ implies that *all* vectors of $X+\{y_0\}$ and, hence, also the vector $z$ must be contained in *one* vector $a'$ of $A$.
Since both vectors $a$ and $a'$ of $A$ contain the same vector $y_0$ with $|y_0|\geq d$ ones, and since the set $A$ is
$d$-disjoint, the equality $a=a'$ and, hence, the desired inequality $z\leq a$ follows.
$\Box$

Construction of large designs are known. A simplest one is based on the fundamental fact about polynomials: no univariate polynomial of degree $s$ can have $s+1$ roots.

**Example** (Polynomial designs)
Let $n=m^2$ for a prime power $m$, $1\leq d\leq m$ an
integer, and consider the vectors $a\in\{0,1\}^n$ whose positions are pairs $(i,j)$ of elements in $\gf{m}$. Let $P$ be the set of all univariate polynomials $p(z)$ of degree at most
$d-1$ over $\gf{m}$. The
*polynomial $(n,d)$-design* $A\subseteq \{0,1\}^n$ consists of all $|A|=m^d$ vectors $a\in\{0,1\}^n$ such that for some polynomial $p\in P$ and for every pair $(i,j)\in \gf{m}\times\gf{m}$, we have $a_{i,j}=1$ iff $j=p(i)$. That is, the set $A$ consists of the characteristic $0$-$1$ vectors of the graphs of polynomials in $P$.

It is well known (and easy to show) that $\ddeg{\f}{l}\leq m^{d-l}$ holds for every $0\leq l\leq d$. Together with Lemma 5, this yields the following lower bound on the size of Minkowski circuits approximating polynomial designs.

Corollary 3:Let $A\subseteq \{0,1\}^n$ be the polynomial $(n,d)$-design with $d\leq \sqrt{n}/2$. Then for every approximation factor $1\leq r\leq \sqrt{n}/2d$, we have $\Unr{A}{r}\geq n^{\Omega(d)}$.

The *degree* of a vector is the sum of its
entries, and a set of vectors is *homogeneous* of degree $m$
if all its vectors have the same degree $m$. Let $B\subset\NN^n$ be a homogeneous set of degree $m$, and
$0 < \bal < 1$ be a parameter. A sumset $X+Y\subseteq B$ is $\bal$-*balanced* if every vector of $X$ has degree
between $\tfrac{\bal}{2}\cdot m$ and $\bal\cdot m$. The following lemma is the well-known Hyafil-Valiant rectangle lemma for monotone arithmetic circuits.

Rectangle Lemma(Hyafil-Valiant): Let $B\subset\NN^n$ be a homogeneous set of degree $m\geq 2$, and $0 < \bal < 1$. Then $B$ is a union of at most $\Un{B}$ $\bal$-balanced sumsets.

**Proof:** We will apply Lemma 3 with the degree $|x|=x_1+\cdots+x_n$ of vectors as a single norm-measure used for *all* vectors $b\in B$. By Lemma 3, there are at most $t$ sumsets $X+Y\subseteq B$ such that for every vector $b\in B$ at least one of
these sumsets $X+Y$ contains vectors $x\in X$ and $y\in Y$ such
that $x+y=\a$, $\bal m/2 \leq |x| \leq \bal m$. Since $X+Y\subseteq B$ and the set $B$ is homogeneous, the set $X$ is $r$-homogeneous for $r=|x|$ and $Y$ is $(m-r)$-homogeneous. So, the sumset $X+Y$ is $\bal$-balanced.
$\Box$

Homogenization Lemma(Strassen 1973): Let $A\subset\NN^n$ be a set of degree $d$. If the set $A$ can be produced by a Minkowski circuit of size $s$, then all its homogeneous parts $A^{(0)},A^{(1)},\ldots, A^{(d)}$ can be simultaneously produced by a Minkowski circuit of size $O(d^2s)$.

**Proof:** Let $\Phi$ be a Minkowski circuit of size $s$ producing the set $A$. For every gate $v$ in $\Phi$, we take $d+1$ copies $v_0,v_1,\ldots,v_d$ of this gate in the new circuit $\Psi$. The goal is to achieve the situation where at the $r$th copy $v_r$ of a gate $v$ in the circuit $\Phi$ the $r$th homogeneous part $X_v^{(r)}$ of the set $X_v\subset\NN^n$ of vectors produced at the gate $v$ in $\Phi$ is produced. This can be achieved by appropriately connecting the gates in the new circuits. Namely, if $v=u\circ w$ is a gate in $\Phi$ with $\circ\in\{\cup,+\}$, then we have the copies $u_0,u_1,\ldots,u_d$ and $w_0,w_1,\ldots,w_d$ of its two input gates in $\Psi$. If $v$ is a union gate, then just take $v_r=u_r\cup w_r$. If $v$ is a Minkowski sum gate then let $v_r$ be the union of all Minkowski sums $u_i+w_{r-i}$ for $i=0,1,\ldots,r$. Induction implies that the circuit $\Psi$ has the claimed functionality. Each gate in $\Phi$ correspond to at most $O(d^2)$ gates in $\Psi$. So, the new circuit $\Psi$ has at most $O(d^2s)$ gates.
$\Box$

In other words, in order to prove a lower bound on arithmetic circuit size of $f$, it is enough to prove a monotone circuit lower bound on $g_{\epsilon}(x)$ for $\epsilon>0$ sufficiently small.Theorem 5(Hrubeš 2020): If an $n$-variate polynomial $f$ of degree $d$ has an arithmetic circuit of size $s$, then the polynomial \[ g_{\epsilon}(x)=(x_1+\cdots+x_n+1)^d+\epsilon\cdot f(x) \] has a monotone arithmetic circuit of size $O(sd^2 + n \log n)$, for some constant $\epsilon > 0$.

Another way to interpret the result is as follows. It is well known that every arithmetic circuit can be simulated by an arithmetic circuit with only one subtraction (see Valiant 1980). That is, if $f$ has a circuit of size $s$, then $f$ can be written as $f=g-h$, where $g$ and $h$ are monotone polynomials of monotone circuit size $O(s)$. Then Theorem 5 simply asserts that $h$ can be chosen as a scalar multiple $\epsilon^{-1}\cdot U$ of a fixed universal polynomial $U=(x_1+\cdots+x_n+1)^d$ independent of $f$: $f=\epsilon^{-1}\cdot g_{\epsilon} -\epsilon^{-1}\cdot U$.

Theorem 5 is also reminiscent of the so-called slice functions in Boolean complexity. Recall that a Boolean function $f$ is a slice function, if there exists a $k$ such that $f$ accepts on all inputs of Hamming weight $> k$, rejects on inputs of Hamming weight $< k$, and is arbitrary on inputs with weight $k$. Valiant (1986) has shown that for slice functions, general and monotone Boolean circuits are of essentially the same power. This resembles Theorem 5 because the universal polynomial $U$ is constant on inputs $x$ with fixed $x_1+\cdots+x_n$.

Theorem 5 is also "bad news" for the arguments we used above to prove lower bounds on the size of monotone arithmetic circuits: they explore the structure of monomials of polynomials but totally ignore their coefficients.

"Good news" are that, still, even these "free from coefficients" arguments allow us to prove nontrivial lower bounds for approximating tropical circuits (simulating pure dynamic programming algorithms) and, hence, are of some practical importance.

- S. Bova, F. Capelli, S. Mengel, F. Slivovsky (2015), A strongly exponential separation of DNNFs from CNF formulas. arXiv:1411.1995
- B. P. Cavalar, M. Kumar and B. Rossman (2020), Monotone circuit lower bounds from robust sunflowers. Proc. of 14th Latin American Theoretical Informatics Symposium (LATIN 2020). arXiv:2012.03883
- S. Gashkov (1987), On one method of obtaining lower bounds on the
monotone complexity of polynomials.
*Vestnik Moskov. Univ., Ser. 1 Math., Mekh.*, 5, 7-13 (in Russian). paper - S. Gashkov and I. Sergeev (2012), A method for deriving lower bounds for the complexity of monotone arithmetic circuits computing real polynomials,
*Sb. Math.*, 203, 1411-1147. paper - P. Hrubeš (2020), On $\epsilon$-sensitive monotone computations.
*Comput. Complexity*, 29:2. ECCC TR19-034. - M. Jerrum and M. Snir (1982), Some exact complexity results for straight-line computations over semirings.
*J. ACM*, 29, 874-897. paper - S. Jukna (2014), Lower bounds for tropical circuits and dynamic programs.
*Theory of Computing Systems*, 57:1, 160-194. paper - $\Rule{1.5cm}{0.1pt}{1pt}$ (2016), Tropical complexity, Sidon sets, and dynamic programming. SIAM J. on Discrete Math. 30:4, 2064-2085. paper
- $\Rule{1.5cm}{0.1pt}{1pt}$ (2017), Minkowski complexity of sets: an easy lower bound.
*Amer. Math. Monthly*, 124:8 (2017) 749-753. paper - S. Jukna and H. Seiwert, Approximation limitations of pure dynamic programming.
*SIAM J. on Computing*, 49:1, 170-207. paper - O. Kasim-Zade (1986), O.M.: On arithmetical complexity of monotone polynomials. In: Proceedings of All-Union Conference on Theoretical Problems in Cybernetics, vol. 1, pp. 68-69. (in Russian) paper
- J. Kollár, L. Rónyai, and T. Szabó (1996),
Norm-graphs and bipartite Turán numbers.
*Combinatorica*, 16, 399-406. paper - S. E. Kuznetsov (1981), Circuits composed of functional elements
without zero paths in the basis $\{\&,\lor,-\}$.
*Izv. Vyssh. Uchebn. Zaved. Mat.,*1981, no. 5, 56-63;*Soviet Math. (Iz. VUZ)*, 25:5, 62-73. paper - R. Lidl and H. Niederreiter (1986),
*Introduction to Finite Fields and their Applications*. Cambridge University Press, Cambridge. - N. Pippenger (1980), On another Boolean matrix.
*Theoret. Comput. Sci.*, 11, 49-56. paper - R. Raz and A. Yehudayoff (2011), Multilinear formulas, maximal-partition discrepancy and mixed-sources
extractors.
*J. Comput. Syst. Sci.*77(1), 167-190. Preliminary version in: Proceedings of 49th FOCS, 2008. paper - C. Schnorr (1976), A lower bound on the number of additions in monotone computations.
*Theoret. Comput. Sci.*, 2, 305-315. paper - S. Srinivasan (2020), Strongly exponential separation between monotone VP and monotone VNP,
*ACM Transactions on Computation Theory*, September 2020 Article No.: 23. arXiv:1903.01630 - V. Strassen (1973), Vermeidung von Divisionen.
*J. Reine Angew. Math.*, v. 264, 184-202. - L. G. Valiant (1980), Negation can be exponentially powerful.
*Theor. Comput. Sci.*, 12:303-314. paper - $\Rule{1.5cm}{0.1pt}{1pt}$ (1986), Negation is powerless for Boolean slice functions.
*SIAM J. Comput.*15(2), 531-535.

S. Jukna

February 18, 2021

Back to the home page of the book.