Contents
In this comment, we recall the simplified proof of Gashkov-Sergeev general lower bound. We then turn to monotone arithmetic circuits that, instead of computing a given multilinear polynomial $P$ exactly, are only required to approximate it within a given factor $r\geq 1$: compute any polynomial $Q$ such that every monomial of $Q$ is a part of at least one monomial of $P$, and every monomial of $P$ shares at least a $1/r$ portion of its variables with at least one monomial of $Q$. There are no further restrictions on the polynomial $Q$ to be computed. Motivation: lower bounds on approximating monotone arithmetic circuits are lower bounds on the approximating tropical $(\max,+)$ circuits and, hence, on approximating pure dynamic programming algorithms.
Let $\NN=\{0,1,2,\ldots\}$. The Minkowski sum or sumset of two sets $X$ and $Y$ of vectors in $\NN^n$ is the set $X+Y=\{x+y\colon x\in X\ \mbox{ and }\ y\in Y\}$ of vectors in $\NN^n$, where $x+y=(x_1+y_1,\ldots,x_n+y_n)$ is the componentwise sum of vectors $x$ and $y$. That is, we add every vector of $X$ to every vector of $Y$. A Minkowski circuit has the $n+1$ singleton sets $\{\vec{0}\},\{\vec{e}_1\},\ldots,\{\vec{e}_n\}$ as inputs, where $\vec{e}_i=(0,\ldots,0,1,0,\ldots,0)$ is the $i$th unit vector. Gates are fanin-2 union $(\cup)$ and Minkowski sum $(+)$ gates. The size of a circuit is the number of its (non input) gates.
At each node $v$ of such a circuit, some set $X_v\subseteq \NN^n$ of vectors is produced in a natural way. If $v$ is a source node holding a single-element set $\{x\}$, then $X_v=\{x\}$. If $v$ is a gate entered by nodes $u$ and $w$, then $X_v=X_u\cup X_w$ if $v$ is a union gate, and $X_v=X_u+ X_w$ if $v$ is a Minkowski sum gate. The set produced by the entire circuit is the set $A=X_o$ produced at the output gate $o$. For a set $A\subseteq \NN^n$ of vectors, let $\Un{A}$ denote the minimum size of a Minkowski circuit producing $A$.
There is a natural relation between Minkowski and monotone arithmetic $(+,\times)$ circuits. For a polynomial $f(x_1,\ldots,x_n)$, let $A_f\subseteq \NN^n$ denote the set of its exponent vectors. Hence, $f$ is of the form $f(x)=\sum_{a\in A_f}\, c_a\prod_{i=1}^n x_i^{a_i}$ for some coefficients $c_a > 0$. Let $\arithm{f}$ denote the minimum size of a monotone arithmetic $(+,\times)$ circuit computing the polynomial $f$. The following simple fact motivates Minkowski circuits.
Fact 1: For every polynomial $f$, we have $\arithm{f}\geq \Un{A_f}$.Proof: There is a natural homomorphism from the semiring of polynomials to the semiring $(2^{\mathbb{N}^n},\cup,+)$ of finite subsets of vectors that maps every polynomial $f$ to the set $A_f\subset\NN^n$ of its exponent vectors. In particular, every single variable $x_i$ is mapped to $A_{x_i}=\{\vec{e}_i\}$, and every input constant $c$ is mapped to $A_c=\{\vec{0}\}$. That this is indeed a homomorphism follows from easily verifiable equalities $A_{f+h}=A_f\cup A_h$ and $A_{f\cdot h}=A_f+A_h$, the latter sum being the Minkowski sum of $A_f$ and $A_h$. $\Box$
In fact, almost all known lower bounds on the monotone algebraic complexity of polynomials $f$ were obtained by proving lower bounds on $\Un{A_f}$.
Example (due to Schnorr): Let $n=\tbinom{m}{2}$, and let $A$ be the set of all $|A|=\tbinom{m}{k}$ characteristic $0$-$1$ vectors of $k$-cliques, viewed as sets of their edges in a complete graph $K_m$ on $\{1,\ldots,m\}$ nodes. It is easy to verify that no union of two $k$-cliques can contain some third $k$-clique. Indeed, the latter clique must then have a node $u$ not in the first clique and a node $v$ not in the second clique. If $u=v$ then the node $u$ is not covered, and if $u\neq v$ then the edge $\{u,v\}$ is not covered by the first two cliques, a contradiction.
Theorem 1 (Schnorr 1976): For every cover-free set $A\subset\NN^n$, we have $\Un{A}\geq |A|-1$.By taking the clique size $k=m/2$, we have an explicit set $A\subset\{0,1\}^n$ with $\Un{A}\geq 2^m/2\sqrt{m}= 2^{\sqrt{n}-O(\log n)}$. Explicit lower bounds of the form $\Un{A}=2^{\Omega(\sqrt{n})}$ were also proved by Jerrum and Snir (1982) using different arguments. Gashkov (1987) and then Gashkov and Sergeev (2012) have extended Schnorr's lower bound to a much larger class of so-called "thin" sets, which led to explicit almost maximal possible lower bound of the form $\Un{A}\geq 2^{n-o(n)}$.
A set $A\subseteq \NN^n$ is $(k,l)$-thin ($1\leq k\leq l$) if $X+Y\subseteq A$ implies that $|X|\leq k$ or $|Y|\leq l$. In other words, if $\chi_A:\NN^n\to\{0,1\}$ is the characteristic function of a set $A\subset\NN^n$ (with $\chi_A(x)=1$ iff $x\in A$), then $A$ is $(k,l)$-thin iff for every $k+1$ distinct vectors $a_1,\ldots,a_{k+1}\in\NN^n$, the system of equations $\chi_A(a_1+x)=1,\ldots, \chi_A(a_{k+1}+x)=1$ has at most $l$ solutions $x\in\NN^n$. In particular, a set $A$ is $(1,1)$-thin iff for every $a\neq b\in\NN^n$, the system $\chi_A(a+x)=1,\chi_A(b+x)=1$ has at most one solution $x\in\NN^n$.
Remark: In terms of polynomials, a polynomial $P$ is $(k,l)$-thin if it cannt be written in the form $P=P_1\cdot P_2+P_3$ with both $|\mon{P_1}|>k$ and $|\mon{P_2}|>l$. (We only consider polynomials with positive coefficients; so, there are no cancellations.)
Remark: The concept of thin sets is motivated by the classical combinatorial Zarankiewicz problem: how many ones can a $0$-$1$ matrix have if it does not have any $(k,l)$ all-$1$ submatrix? Now associate with a finite set $A\subset\NN^n$ of vectors the (infinite) $0$-$1$ matrix $M_A$ whose rows and columns are labeled by vectors in $\NN^n$. Let $M_A[x,y]=1$ iff $x+y\in A$. Then $A$ is $(k,l)$-thin iff the matrix $M_A$ has no $(k+1,l+1)$ all-$1$ submatrix.
For every set $A\subseteq \NN^n$, the number $|A+A|$ of distinct vectors in the sumset $A+A$ is at most the number of all sets $\{a,b\}$ with $a,b\in A$, which is at most the number $\binom{|A|}{2}$ of sets with $a\neq b$ plus the number $|A|$ of sets with $a=b$. So, \[ |A+A|\leq \binom{|A|}{2}+|A|=\binom{|A|+1}{2}\,. \] (This can be also shown by an easy induction on $|A|$; see here). Sidon sets are those sets $A$ for which the equality $|A+A|=\binom{|A|+1}{2}$ holds. In other words, a set $A\subseteq \NN^n$ is a Sidon set if, for every vectors $a,b\in A$, \[ \tag{$\ast$} \mbox{the sum $a+b$ determines the pair $\{a,b\}$,} \] that is, if $a+b=c+d$ for some $c,d\in A$, then $\{c,d\}=\{a,b\}$. Note that in order to show that a set $A$ is a Sidon set, it is enough to only show that $(\ast)$ holds for all pairs of distinct vectors $a\neq b\in A$: if the sum $a+a$ of some vector $a\in A$ with itself does not determine the vector $a$, then $a+a=c+d$ but $\{a\}\neq \{c,d\}$ holds for some $c,d\in A$. Since $x+x=y+y$ over the semigroup $(\NN^n,+)$ implies $x=y$, we then have $c\neq d$. But then the condition $(\ast)$ is violated by the pair $c\neq d$ of distinct vectors of $A$.
Fact 2: Cover-free $\Rightarrow$ Sidon $\Rightarrow$ $(1,1)$-thin.One can also show that $(1,1)$-thin $\Rightarrow$ Sidon; see here.
Proof: If a set $A\subseteq \NN^n$ is cover-free, then it is also a Sidon set just because $a+b=c+d$ yields both inequalities $a+b\geq c$ and $a+b\geq d$. Now let $A\subseteq\NN^n$ be a Sidon set, and assume to the contrary that $A$ is not $(1,1)$-thin. Then $\{x,x'\}+\{y,y'\}\subseteq A$ holds for some vectors $x\neq x'$ and $y\neq y'$. The sum $a+b$ of the two vectors $a=x+y$ and $b=x'+y'$ of $A$ is then equal to the sum $c+d$ of the two vectors $c=x+y'$ and $d=x'+y$ of $A$. Since $A$ is a Sidon set, at least one of $x+y'=x+y$ or $x+y'=x'+y'$ must hold, contradicting that $x\neq x'$ and $y\neq y'$. $\Box$
Gashkov proved the following lower bound on the Minkowski complexity of $(k,l)$-thin sets when $k=l$.
Theorem 2 (Gashkov 1987): For every $(k,k)$-thin set $A\subset\NN^n$, we have $\Un{A}\geq |A|/k^3-1$.Later, Gashkov and Sergeev (2012) extended this lower bound to the inbalanced case when $1\leq k\leq l$.
Theorem 3 (Gashkov-Sergeev 2012): For every $(k,l)$-thin set $A\subset\NN^n$, we have $\Un{A}\geq |A|/\max\{k^3,l^2\}-1$.The proofs of these lower bounds use clever, but technically involved, analysis of monotone arithmetic circuits. Motivated by the argument used by Pippenger (1980), we gave a short and amazingly simple proof of the following lower bound.
Theorem 4 (Jukna 2017): For every $(k,l)$-thin set $A\subset\NN^n$ of size $|A|>k$, we have $\Un{A}\geq |A|/2lk^2$.This lower bound holds for slightly more general Minkowski circuits, where any single-element sets $\{x\}$ with $x\in\NN^n$ can be used as inputs, not necessarily only unit vectors $\vec{e}_i$. In the context of arithmetic circuits, the bound $\arithm{f_A}\geq |A|/2lk^2$ holds for circuits that can use (for free) arbitrary monomial (of arbitrary degree) as inputs.
What we actually show is that every $(k,l)$-thin set $A\subset\NN^n$ of Minkowski circuit complexity $t=\Un{A}$ is a union of at most $2t$ sumsets $X+Y\subseteq A$ such that $|X|\leq k^2$ and $|Y|\leq l$.
In the next section, we show how either of Theorems 3 and 4, together with known construction of large sufficiently thin sets $A$, yield truly exponential lower bounds on $\Un{A}$. Then we give an easy proof of Theorem 4.
Example (Greedy construction) A Sidon set $A\subseteq\{0,1\}^n$ of size $|A|\geq c 2^{n/3}$ for a constant $c > 0$ can be constructed by a simple greedy algorithm. Start with an empty set $A=\emptyset$, and at each step, include a vector $x$ in $A$ if $x$ is not of the form $x=a+b-c$ for some vectors $a,b,c$ already in the set $A$. If we already have a set of size $|A|=K$, and if $K+K^3 < 2^n$, at least one new vector $x$ will be included in $A$. So, the final Sidon set will have $|A|\geq c2^{n/3}$ vectors for a constant $c>0$.
Example (Cubic parabola) Let $q$ be an even prime power, and let $\FF=\gf{q}$ be the field with $q$ elements. The (graph of the) cubic parabola over $\FF$ is the set $A=\{v(x)\colon x\in\FF\}$ with $v(x):=(x,x^3)$. We claim that $A$ is a Sidon set over the group $(\FF,+)$. To show this, fix a pair $(a,b)\in \FF^2$ of elements, and suppose that $v(x)+v(y)=(a,b)$ holds for two elements $x\neq y$ in $\FF$. Then $x+y=a\neq 0$. Since $x+x+x=x$ over $\FF$ ($q$ is even), we also have $x^3+y^3=(x+y)^3-3xy(x+y)=a^3-xyb$. So, the second equation $x^3+y^3=b$ yields $a^3-xya = b$, that is, $xy=c$ for $c=(b/a)-a^2$. Since the system $x+y=a, x\cdot y=c$ cannot have more than two solutions in the field, we find that $\{x, y\}$ is uniquely determined by $(a,b)$. By taking $n=2m$, $q=2^m$ and identifying the elements of $\gf{q}=\gf{2^m}$ with vectors in $\{0,1\}^n$, we obtain a set $A\subseteq\{0,1\}^n$ of size $|A|=2^m=2^{n/2}$ such that for every vectors $a\neq b\in A$, the vector $a+b$ determines the pair $\{a,b\}$ over $(\FF,+)$ and, hence, also over $(\NN^n,+)$. Since over $(\NN^n,+)$, each sum $a+a$ also determines $a$, $A$ is a Sidon set.
Corollary 1: Let $f(x_1,\ldots,x_n)$ be a multilinear polynomial whose set of exponent vectors forms a cubic parabola in $\gf{2^{n}}$.Then $\arithm{f}\geq 2^{n/2}$.
Example (Norm sets) Let $q$ be a prime-power, $t\geq 2$ an integer, and consider the field $\gf{q^t}$ with $q^t$ elements. The norm is a mapping $N:\gf{q^t}\to \gf{q}$ given by $N(a)=a\cdot a^q\cdots a^{q^{t-1}}=a^{(q^t-1)/(q-1)}$. Consider the set $A=\{a\in\gf{q^t}\colon N(a)=1\}$. It is known (see Lidl and Niederreiter 1986) that $|A|=(q^t-1)/(q-1)$. Kollár, Rónyai and Szabó (1996) proved that for every $t\geq 2$ distinct elements $a_1,\ldots,a_t$ of $\gf{q^t}$, the system of equations $ N(a_1+x)=1, N(a_2+x)=1,\ldots,N(a_t+x)=1 $ has at most $t!$ solutions $x\in \gf{q^t}$. So, the set $A$ is $(t+1,t!)$-thin over the group $(\gf{2^n},+)$.
Now let $n$ be a positive integer of the form $n=r\cdot t$ for $t\geq 2$, and let $q=2^r$. Hence, $\gf{q^t}=\gf{2^n}$. By viewing elements of $\gf{2^n}$ as vectors in $\{0,1\}^n$, we obtain a set $A$ of $|A|=(q^t-1)/(q-1)\geq q^{t-1}= 2^{n-r}$ vectors in $\{0,1\}^n$ which is $(t+1,t!)$-thin over $(\gf{2^n},+)$, and hence, also over the semigroup $(\NN^n,+)$. Call the set $A$ the $(n,t)$-norm set.
Corollary 2: Let $f(x_1,\ldots,x_n)$ be a multilinear polynomial whose set of exponent vectors forms an $(n,t)$-norm set $A\subset\{0,1\}^n$. Then $\arithm{f}\geq 2^{n-n/t-2t\log t}$.For $t=\sqrt{n}$, we obtain $\arithm{f}\geq 2^{n-o(n)}$.
Proof: We know that the set $A$ is $(k,l)$-thin for $k=t+1$ and $l=t!\sim(t/e)^t$. Thus, Theorem 4 yields $\Un{A}\geq |A|/2lk^2\geq |A|/t^{2t}\geq 2^{n-n/t-2t\log t}$. $\Box$
Remark: Lower bounds of the form $\Un{A}\geq 2^{cn}$ for various constants $0 < c < 1$ were also proved using different arguments. Raz and Yehudayoff (2011) used discrepancy arguments and exponential sum estimates to derive such a bound. Jukna (2014) proved such a bound using graphs with good expansion properties (Ramanujan graphs). Cavalar, Kumar and Rossman (2020) proved such a bound using recently constructed error-correcting codes with very good parameters, Srinivasan (2020) proved such a bound using communication complexity arguments.
Remark: Actually, a lower bound $\Un{A}\geq 2^{n/6}$ already follows from a lower bound $2^{n/3}$ of Kuznetsov (1981) on the size of Boolean $\{\land,\lor,\neg\}$ circuits without null-chains. Every Boolean circuit not only computes a unique Boolean function, but also produces (purely syntactically) some DNF $D$. When forming this DNF, we allow the laws $x\cdot x=x\lor xy=x$ be used but the law $x\land\overline{x}=0$ is not used. So, in general, the produced DNF may contain zero terms, i.e. those containing a variable together with its negation. A circuit is null-chain free if $D$ contains no zero terms. He proved that if every two vectors in every two vectors accepted by a Boolean function $f(x_1,\ldots,x_n)$ differ in at least $d$ positions, then every Boolean $\{\land,\lor,\neg\}$ circuits without null-chains computing $f$ must have at least $|f^{-1}(1)|^{d/(n+d)}$ gates. For characteristic functions $f$ of linear codes, this leads to lower bounds of the form $2^{\Omega(\sqrt{n})}$ (say, for Reed-Muller code with $d=\sqrt{n}$) and even of the form $2^{\Omega(n)}$ for less trivial codes.
Using another argument, Kuznetsov (1981) proved a lower bound of the form $2^{n/3}$. A forbidden pattern is a set $B\subset\{0,1\}^n$ of $|B|=4$ vectors such that, after appropriate permutation of positions, the set is of the form $B=\{x,x'\}\times\{y,y'\}\times\{z\}$ with $x\neq x'$ and $y\neq y'$. Using a greedy procedure (as we have done for Sidon sets), it is possible to construct a subset $A\subseteq \{0,1\}^n$ of $|A|\geq 2^{n/3}$ vectors such that $A$ contains no forbidden patterns. Kuznetsov (1981) proved that any null-chain free circuit computing a Boolean function $f_n(x_1,\ldots,x_n)=\bigvee_{a\in A}\bigwedge_{i=1}^nx_i^{a_i}$ requires at least $|A|\geq 2^{n/3}$ gates; here, $x_i^1=x_i$ and $x_i^0=\overline{x}_i$.
To translate this lower bound to one for monotone arithmetic circuits, consider the multilinear polynomial $P_m(x,y)=\sum_{a\in A}\prod_{i=1}^nz_i^{a_i}$ of $m=2n$ variables, where $z_i^1=x_i$ and $z_i^0=y_i$. Note that no monomial of $P_m$ contains both variables $x_i$ and $y_i$. So, given a monotone arithmetic circuit for $P_m$, we can transform it to a Boolean circuit without null-chains by replacing each input variable $y_i$ by a negated Boolean variable $\overline{x}_i$, and by replacing the $+$ and $\times$ gates by $\lor$ and $\land$ gates. The obtained circuit computes $f_n$, and the lower bound $\arithm{P_m}\geq 2^{n/3}=2^{m/6}$ follows.
Apparently unaware of this Kuznetsov's more general 40 years old result, a lower bound of the form $2^{\Omega(n)}$ for a very special case of Boolean circuits without null-chains (so-called DNNFs) was recently proved by Bova et al. (2015). These circuit are, in fact, read-once DeMorgan circuits.
Let $\Phi$ be a Minkowski $(\cup,+)$ circuit producing some set $A\subset\NN^n$; hence, $X_o=A$ holds for the output gate $o$. If $v$ is a non-output gate of this circuit, then the set $X_v\subset\NN^n$ of vectors produced at a non-output gate $v$ needs not lie in the set $A$ produced by the entire circuit, but at least one of its translates $X_v+y$ for some vector $y\in\NN^n$ must already lie in $A$. We collect all such vectors $y$ and call the set \[ Y_v:=\{y\in\NN^n\colon X_v+y\subseteq A\} \] the residue of $X_v$ (within $A$). The sumset $\cont{v}:=X_v+Y_v$ is the content of the gate $v$. Note that $\cont{v}\subseteq A$ already holds for every gate $v$. For example, if $v$ is the output gate, then $X_v=A$ and $Y_v=\{0\}$. If $v$ is a source node holding a set $\{x\}$ with $x\in\NN^n$, then $X_v=\{x\}$ and $Y_v$ consists of all vectors $y\in\NN^n$ such that $x+y\in A$.
The sets $X_v$ produced at gates $v$ and their residues $Y_v$ have the following properties. Let $v$ be a gate entered from gates $u$ and $w$.
When defining the contents $\cont{v}= X_v+Y_v$ of nodes $v$ (gates in the circuit), it is irrelevant whether $v$ is a union $(\cup)$ or a Minkowski sum $(+)$ gate: $X_v$ is the set of vectors produced at $v$, and $Y_v$ consists of all vectors $y\in\NN^n$ for which the set $X_v+\{y\}$ lies in $A$. When defining the contents $\cont{e}$ of edges $e=(u,v)$, we break this symmetry. Namely, if the head $v$ of $e$ is a Minkowski sum $(+)$ gate, then we let the content of $e$ be the content $\cont{e}:=X_v+Y_v$ of the head $v$. But if the head $v$ of $e$ is a union $(\cup)$ gate, then we let $\cont{e}:=X_u+Y_v$.
Note that in both case we have $\cont{e}\subseteq A$. Indeed, if the head $v$ of the edge $e=(u,v)$ is a Minkowski sum gate, then $\cont{e}=\cont{v}\subseteq A$. If $v$ of $e$ is a union gate, and $w$ is the second gate entering $v$, then $\cont{e}=X_u+Y_v=X_u+(Y_u\cap Y_w)\subseteq X_u+Y_u=\cont{u}\subseteq A$.
The reason for the asymmetry in this definition is explained by the following lemma.
Lemma 1 (The content propagation lemma): Let $v$ be a gate and $a\in\cont{v}$ a vector in its content.Proof: Let first $v=u\cup w$ be a union gate, and take a vector $a\in\cont{v}$ in the content of $v$. Since in this case, $\cont{v}=X_{v}+Y_{v} =(X_{u}\cup X_{w})+(Y_{u}\cap Y_w)$, the vector $a$ must belong at last one of the sets \[ \mbox{$\cont{(u,v)}=X_{u}+Y_{v}\subseteq X_u+Y_u=\cont{u}$ or $\cont{(w,v)}=X_{w}+Y_{v}\subseteq X_w+Y_w=\cont{w}$,} \] as desired.
- If $v$ is a union gate, then vector $a$ belongs to the content of at least one edge entering $v$, as well as to the content of the tail of this edge.
- If $v$ is a Minkowski sum gate, then vector $a$ belongs to the contents of both edges entering $v$, as well as to the contents of the tails of these edges.
Now let $v=u+w$ be a Minkowski sum gate. In this case, the contents of both edges $(u,v)$ and $(w,v)$ coincide with the content $\cont{v}$ of the gate $v$. So, it is enough to show that $\cont{v}\subseteq \cont{u}\cap \cont{w}$ holds in this case. This follows directly from the inclusions (*): \[ \cont{v}= X_v+Y_v =(X_u+X_w)+Y_v = X_u+(X_w+Y_v)\subseteq X_u+Y_u=\cont{u}\,, \] and similarly, $\cont{v}=X_v+Y_v\subseteq X_w+Y_w=\cont{w}$. $\Box$
Remark: What Lemma 1 ensures is the following. Take an arbitrary vector $a\in A$ in the set $A$ produced by the circuit. This vector belongs to the content $\cont{o}=A$ of the output gate $o$. Start at the output gate $o$, and construct a path in the underlying directed acyclic graph by going backwards and by always choosing an edge whose content contains vector $a$. Lemma 1 ensures that no such path will get stuck, that is, every such path will eventually reach an input node.
Claim 1: Every vector $a\in A$ belongs to the content of at least one cheap edge.Given this claim, the theorem follows easily. Indeed, if $E$ is the set of all cheap edges in the circuit, then Claim 1 implies that $A\subseteq \bigcup_{e\in E} \cont{e}$. Since $|\cont{e}|\leq lk^2$ holds for every edge $e\in E$, this implies $|A|\leq \sum_{e\in E}|\cont{e}|\leq lk^2|E|$. So, there must be at least $|E|\geq |A|/lk^2$ edges in the circuit. Since every gate (non-source node) in the circuit is entered by exactly two edges, and every edge must enter some gate, the total number of edges in the circuit is twice the total number $t$ of gates. Hence, $t\geq |E|/2\geq |A|/2lk^2$, as claimed.
So, it remains to prove Claim 1. Call a node $u$ in the circuit cheap if $|X_u|\leq k$, and expensive otherwise. We can assume that the output gate is expensive because otherwise the set $A$ produced by the circuit would be also cheap, and there would be nothing to prove. Fix a vector $a\in A$. Start at the output gate of the circuit, and construct a path in the underlying directed acyclic graph by going backwards and using the following rule, where $v$ is the last previously reached node.
If $v$ is a union gate, then $|\cont{e}|=|X_u+Y_v|\leq |X_u|\cdot|Y_v| \leq k\cdot l$, which is clearly at most $\leq k^2l$. So, the edge $e$ is cheap in this case. If $v$ is a Minkowski sum gate, and $w$ is its second input, then the content of $e$ is the Minkowski sum $\cont{e}=X_v+Y_v=X_u+X_w+Y_v$ of three sets. Since the node $u$ is cheap, rule (ii) in the construction of the path implies that the second node $w$ entering $v$ must have also been cheap, that is, $|X_w|\leq k$ must hold (for otherwise, we would have had chosen $w$ instead of $u$ during the construction of the path). We thus have $|\cont{e}|\leq |X_u|\cdot|X_w|\cdot|Y_v|\leq k^2\cdot l$, as desired. $\Box$
The set $B$ $r$-approximates the set $A$ if $B$ lies below $A$ and for every $a\in A$ there is a vector $b\in B$ such that $\skal{a,b}\geq \tfrac{1}{r}\cdot |a|$, where $|a|=\skal{a,a}$; $r\geq 1$ is here the approximation factor. Let $\Unr{A}{r}$ denote the minimum size of a Minkowski circuit $r$-approximating the set $A$: \[ \Unr{A}{r}=\min\{\Un{B}\colon \mbox{$B\subset\NN^n$ $r$-approximates $A$}\}\,. \]
Motivation (Approximation limits of dynamic programming) It is known (Jukna and Seiwert 2020) that $\Unr{A}{r}$ is a lower bound on the size of tropical $(\max,+)$ circuits approximating the maximization problem $f(x)=\max_{a\in A} \skal{a,x}$ within the factor $r$. Tropical $(\max,+)$ circuits is a mathematical model to simulate so-called "pure" dynamic programming algorithms, those using only basic operations $\max$ and $+$ in their recursion equations. So, lower bounds on $\Unr{A}{r}$ and, hence, on the size of approximating $(\max,+)$ circuits show the weakness of DP algorithms approximating maximization problems.
Fact: There are explicit Sidon sets $A\subseteq\{0,1\}^n$ such that $\Un{A}\geq 2^{n/4}$ but $\Unr{A}{r}\leq n$ already for $r=2$.Proof: see here.
The following fact shows that even Shannon type counting arguments fail for approximating circuits already for very small factors $r=1+o(1)$.
Fact: There exist $2^{2^{\Omega(n)}}$ sets $A\subseteq\{0,1\}^n$ such that $\Un{A}\geq 2^{\Omega(n)}$ but each of these sets can be approximated within a factor $r=1+o(1)$ by a single Minkowski circuit of size $O(n^2)$.Proof: see here.
A norm-measure of vectors in $\NN^n$ is any assignment $\Nor:\NN^n\to\RR_+$ of nonnegative real numbers to vectors in $\NN^n$ such that every $0$-$1$ vector with at most one $1$ gets norm at most $1$, and the norm is monotone and subadditive: $\nor{x}\leq \nor{x+y}\leq \nor{x}+\nor{y}$ holds for all vectors $x,y\in\NN^n$.
Lemma 3 (Balanced decomposition lemma, Jukna 2016): If a set $B\subset\NN^n$ can be produced by a Minkowski circuit of size $t$, then $B$ is a union of $t$ sumsets $X+Y\subseteq B$ with the following property.Proof: see here.
- $(\ast)$ For every norm-measure $\Nor:\NN^n\to\RR_+$, for every vector $\a\in B$ of norm $\nor{\a}>0$, and every $\bbal$ satisfying $1/\nor{\a}\leq \bbal < 1$ at least one of these sumsets $X+Y$ contains vectors $x\in X$ and $y\in Y$ such that $x+y=\a$ and $\tfrac{\bbal}{2}\cdot\nor{\a}<\nor{x}\leq \bbal\cdot \nor{\a}$.
Note the subadditivity of norms also yields $(1-\bbal)\cdot\nor{\a}\leq \nor{y} < (1-\tfrac{\bbal}{2})\cdot \nor{\a}$. The main difference from previous "balanced decomposition" results for monotone arithmetic circuits (starting from the classical Hyafil-Valiant decomposition) is that one can choose different norm-measures $\Nor$ for different vectors $b\in B$.
In the following lemma, $0<\bal<1$ is an arbitrary fixed ``balance'' parameter. Say that a $0$-$1$ vector $a$ appears $r$-balanced in $X+Y$ if \begin{equation}\label{eq:balanced} \skal{a,x+y}\geq \tfrac{1}{r}\cdot |a|\,,\ \skal{a,x} > \tfrac{\bal}{2r}\cdot|a| \ \mbox{ and }\ \skal{a,y}\geq \tfrac{1-\bal}{r}\cdot |a|\,. \end{equation}
Lemma 4: Let $A\subseteq \{0,1\}^n$ be an $m$-homogeneous set, and $1\leq r\leq \bal m$. If $\Unr{A}{r}\leq t$, then there exist $\leq t$ rectangles lying below the set $A$ such that every vector of $A$ appears $r$-balanced in at least one of these rectangles.Proof: Let $\Phi$ be a Minkowski circuit of size $t$ approximating the set $A$ within the factr\or $r$, and let $B\subset\NN^n$ be the set of vectors produced by $\Phi$. Hence, we know that the set $B$ has the following two properties:
The balanced decomposition lemma (Lemma 3) gives us a collection of $t$ sumsets $X+Y\subseteq B$ with the following property holding for every norm-measure $\Nor:\NN^n\to\RR_+$, for every vector $b\in B$ of norm $\nor{b}>0$, and every real number $\bbal$ satisfying $1/\nor{\a}\leq \bbal < 1$: at least one of the sumsets $X+Y$ contains vectors $x\in X$ and $y\in Y$ such that $x+y=b$ and $\tfrac{\bbal}{2}\cdot\nor{b}<\nor{x}\leq \bbal\cdot \nor{b}$.
Since the set $A$ consists of only $0$-$1$ vectors, property (i) implies that the set $B$ and, hence, each of the sumsets $X+Y$ consists of $0$-$1$ vectors. In particular, we then have $\skal{x,y}=0$ for all $x\in X$ and $y\in Y$, i.e. $X+Y$ is a rectangle. It remains to show that every vector of $A$ appears $r$-balanced in at least one of these rectangles.
To show this, fix an arbitrary vector $a\in A$, and consider the norm-measure $\nor{x}=\nnor{x}{a}:=\skal{a,x}$. By (ii), for a shadow $b\in B$ of vector $a$ in $B$, we have $\nor{b}\geq m/r$. Since $r\leq \bal m$, the condition $1/\nor{\a}\leq \bbal < 1$ of Lemma 3 is fulfilled. Thus, at least one of the $t$ rectangles $X+Y$ contains vectors $x\in X$ and $y\in Y$ such that $x+y=\a$ and \begin{equation}\label{eq:balanced1} \tfrac{\bbal}{2}\cdot\nor{\a}<\nor{x}\leq \bbal\cdot \nor{\a}\,. \end{equation} Since $\nor{b}\geq m/r$ holds for the shadow $b$ of $a$, we have $\skal{a,x+y}=\skal{a,b}\geq m/r$; this gives us the first inequality of \eqref{eq:balanced}. The second inequality in \eqref{eq:balanced} follows from the first inequality in \eqref{eq:balanced1}: \[ \skal{a,x}=\nor{x} > \tfrac{\bbal}{2}\cdot\nor{b}=\tfrac{\bbal}{2}\cdot \skal{a,b}\geq \tfrac{\bbal}{2}\cdot\frac{m}{r}=\tfrac{\bal}{2r}\cdot m. \] Since $\skal{x,y}=0$, the third inequality in \eqref{eq:balanced} follows from $\skal{a,x+y}\geq m/r$ and the second inequality in \eqref{eq:balanced1}. $\Box$
Lemma 5: Let $A\subseteq \{0,1\}^n$ be an $(m,d)$-design with $d\leq m/2$. Then for every approximation factor $1\leq r\leq m/2d$, we have $\Unr{A}{r}\geq |A|/\ddeg{A}{l}$ for $l=m/4r$.Proof: We will apply Lemma 4 with the balance parameter $\bal:=1/2$. Take an arbitrary rectangle $R=X+Y$ lying below the set $A$, and consider the set $A_R\subseteq A$ of all vectors of $A$ appearing $r$-balanced in the rectangle $R$. Thus, $A_R$ consists of all vectors $a\in A$ for which there are vectors $x\in X$ and $y\in Y$ with $\skal{a,x+y}\geq \tfrac{1}{r}\cdot |a|=\tfrac{m}{r}$ (we will not use this in the proof) and the bounds: \[ \skal{a,x} > \tfrac{\bal}{2r}\cdot|a|= \tfrac{m}{4r} =l \qquad \mbox{ and } \qquad \skal{a,y}\geq \tfrac{1-\bal}{r}\cdot |a|=\tfrac{m}{2r}\geq d\,. \] By Lemma 4, it is enough to show that $|A_R|\leq \ddeg{A}{l}$. We can assume that $|x|\geq l$ and $|y|\geq d$ holds for all $x\in X$ and $y\in Y$: vectors $x\in X$ with $|x|< l$ ones cannot fulfill the inequality $\skal{a,x} \leq l$, and similarly for vectors $y\in Y$.
Claim: The componentwise OR $z$ of all vector in $X$ is contained in all vectors $a\in A_R$.Since $|z|\geq l$, the claim yields the desired inequality $|A_R|\leq \ddeg{A}{l}$. So, it remains to prove the claim.
To do this, take an arbitrary vector $a\in A_R$. We have to show that $z\leq a$ holds. Then $\skal{a,y_a}\geq d$ for some vector $y_a\in Y$. On the other hand, since the rectangle $R$ lies below $A$, every vector $x\cup y_a$ with $x\in X$ must be contained in some vector of $A$. Since all these vectors contain the vector $y_0$ with $|y_0|\geq d$ ones, the $d$-disjointness of $A$ implies that all vectors of $X+\{y_0\}$ and, hence, also the vector $z$ must be contained in one vector $a'$ of $A$. Since both vectors $a$ and $a'$ of $A$ contain the same vector $y_0$ with $|y_0|\geq d$ ones, and since the set $A$ is $d$-disjoint, the equality $a=a'$ and, hence, the desired inequality $z\leq a$ follows. $\Box$
Construction of large designs are known. A simplest one is based on the fundamental fact about polynomials: no univariate polynomial of degree $s$ can have $s+1$ roots.
Example (Polynomial designs) Let $n=m^2$ for a prime power $m$, $1\leq d\leq m$ an integer, and consider the vectors $a\in\{0,1\}^n$ whose positions are pairs $(i,j)$ of elements in $\gf{m}$. Let $P$ be the set of all univariate polynomials $p(z)$ of degree at most $d-1$ over $\gf{m}$. The polynomial $(n,d)$-design $A\subseteq \{0,1\}^n$ consists of all $|A|=m^d$ vectors $a\in\{0,1\}^n$ such that for some polynomial $p\in P$ and for every pair $(i,j)\in \gf{m}\times\gf{m}$, we have $a_{i,j}=1$ iff $j=p(i)$. That is, the set $A$ consists of the characteristic $0$-$1$ vectors of the graphs of polynomials in $P$.
It is well known (and easy to show) that $\ddeg{\f}{l}\leq m^{d-l}$ holds for every $0\leq l\leq d$. Together with Lemma 5, this yields the following lower bound on the size of Minkowski circuits approximating polynomial designs.
Corollary 3: Let $A\subseteq \{0,1\}^n$ be the polynomial $(n,d)$-design with $d\leq \sqrt{n}/2$. Then for every approximation factor $1\leq r\leq \sqrt{n}/2d$, we have $\Unr{A}{r}\geq n^{\Omega(d)}$.
The degree of a vector is the sum of its entries, and a set of vectors is homogeneous of degree $m$ if all its vectors have the same degree $m$. Let $B\subset\NN^n$ be a homogeneous set of degree $m$, and $0 < \bal < 1$ be a parameter. A sumset $X+Y\subseteq B$ is $\bal$-balanced if every vector of $X$ has degree between $\tfrac{\bal}{2}\cdot m$ and $\bal\cdot m$. The following lemma is the well-known Hyafil-Valiant rectangle lemma for monotone arithmetic circuits.
Rectangle Lemma (Hyafil-Valiant): Let $B\subset\NN^n$ be a homogeneous set of degree $m\geq 2$, and $0 < \bal < 1$. Then $B$ is a union of at most $\Un{B}$ $\bal$-balanced sumsets.
Proof: We will apply Lemma 3 with the degree $|x|=x_1+\cdots+x_n$ of vectors as a single norm-measure used for all vectors $b\in B$. By Lemma 3, there are at most $t$ sumsets $X+Y\subseteq B$ such that for every vector $b\in B$ at least one of these sumsets $X+Y$ contains vectors $x\in X$ and $y\in Y$ such that $x+y=\a$, $\bal m/2 \leq |x| \leq \bal m$. Since $X+Y\subseteq B$ and the set $B$ is homogeneous, the set $X$ is $r$-homogeneous for $r=|x|$ and $Y$ is $(m-r)$-homogeneous. So, the sumset $X+Y$ is $\bal$-balanced. $\Box$
Homogenization Lemma (Strassen 1973): Let $A\subset\NN^n$ be a set of degree $d$. If the set $A$ can be produced by a Minkowski circuit of size $s$, then all its homogeneous parts $A^{(0)},A^{(1)},\ldots, A^{(d)}$ can be simultaneously produced by a Minkowski circuit of size $O(d^2s)$.
Proof: Let $\Phi$ be a Minkowski circuit of size $s$ producing the set $A$. For every gate $v$ in $\Phi$, we take $d+1$ copies $v_0,v_1,\ldots,v_d$ of this gate in the new circuit $\Psi$. The goal is to achieve the situation where at the $r$th copy $v_r$ of a gate $v$ in the circuit $\Phi$ the $r$th homogeneous part $X_v^{(r)}$ of the set $X_v\subset\NN^n$ of vectors produced at the gate $v$ in $\Phi$ is produced. This can be achieved by appropriately connecting the gates in the new circuits. Namely, if $v=u\circ w$ is a gate in $\Phi$ with $\circ\in\{\cup,+\}$, then we have the copies $u_0,u_1,\ldots,u_d$ and $w_0,w_1,\ldots,w_d$ of its two input gates in $\Psi$. If $v$ is a union gate, then just take $v_r=u_r\cup w_r$. If $v$ is a Minkowski sum gate then let $v_r$ be the union of all Minkowski sums $u_i+w_{r-i}$ for $i=0,1,\ldots,r$. Induction implies that the circuit $\Psi$ has the claimed functionality. Each gate in $\Phi$ correspond to at most $O(d^2)$ gates in $\Psi$. So, the new circuit $\Psi$ has at most $O(d^2s)$ gates. $\Box$
Theorem 5 (Hrubeš 2020): If an $n$-variate polynomial $f$ of degree $d$ has an arithmetic circuit of size $s$, then the polynomial \[ g_{\epsilon}(x)=(x_1+\cdots+x_n+1)^d+\epsilon\cdot f(x) \] has a monotone arithmetic circuit of size $O(sd^2 + n \log n)$, for some constant $\epsilon > 0$.In other words, in order to prove a lower bound on arithmetic circuit size of $f$, it is enough to prove a monotone circuit lower bound on $g_{\epsilon}(x)$ for $\epsilon>0$ sufficiently small.
Another way to interpret the result is as follows. It is well known that every arithmetic circuit can be simulated by an arithmetic circuit with only one subtraction (see Valiant 1980). That is, if $f$ has a circuit of size $s$, then $f$ can be written as $f=g-h$, where $g$ and $h$ are monotone polynomials of monotone circuit size $O(s)$. Then Theorem 5 simply asserts that $h$ can be chosen as a scalar multiple $\epsilon^{-1}\cdot U$ of a fixed universal polynomial $U=(x_1+\cdots+x_n+1)^d$ independent of $f$: $f=\epsilon^{-1}\cdot g_{\epsilon} -\epsilon^{-1}\cdot U$.
Theorem 5 is also reminiscent of the so-called slice functions in Boolean complexity. Recall that a Boolean function $f$ is a slice function, if there exists a $k$ such that $f$ accepts on all inputs of Hamming weight $> k$, rejects on inputs of Hamming weight $< k$, and is arbitrary on inputs with weight $k$. Valiant (1986) has shown that for slice functions, general and monotone Boolean circuits are of essentially the same power. This resembles Theorem 5 because the universal polynomial $U$ is constant on inputs $x$ with fixed $x_1+\cdots+x_n$.
Theorem 5 is also "bad news" for the arguments we used above to prove lower bounds on the size of monotone arithmetic circuits: they explore the structure of monomials of polynomials but totally ignore their coefficients.
"Good news" are that, still, even these "free from coefficients" arguments allow us to prove nontrivial lower bounds for approximating tropical circuits (simulating pure dynamic programming algorithms) and, hence, are of some practical importance.
Back to the home page of the book.