\( \def\<#1>{\left<#1\right>} \newcommand{\OR}[1]{\mathrm{OR}(#1)} \newcommand{\XOR}[1]{\mathrm{XOR}(#1)} \newcommand{\twoXOR}[1]{\mathrm{XOR}_2(#1)} \newcommand{\SUM}[1]{\mathrm{SUM}(#1)} \newcommand{\dSUM}[2]{\mathrm{SUM}_{#2}(#1)} \newcommand{\GEN}{\mathrm{GEN}} \newcommand{\Cov}{c} \newcommand{\Dec}{d} \newcommand{\tr}{\mathrm{tr}} \newcommand{\pr}{\mathrm{rk}_+} \newcommand{\br}{\mathrm{rk}_{\lor}} \newcommand{\Geq}{\succcurlyeq} \newcommand{\Leq}{\preccurlyeq} \newcommand{\normf}[1]{\|#1\|_{F}} \newcommand{\norm}[1]{\|#1\|} \newcommand{\skal}[1]{\langle #1\rangle} \newcommand{\ZZ}{{\Bbb Z}} \newcommand{\NN}{{\Bbb N}} \newcommand{\RR}{{\Bbb R}} \)

Determinant and bounded-depth SUM circuits

In the book, we have proved Theorem 3.3 only in a very special case: when all layers have the same number of nodes, that is, we only considered products $A=A_1\cdot A_2\cdots A_d$ of square matrices.

The original proof in [1] (for the general case) is short but somewhat "encrypted." Igor suggested an "explicit" proof (via orthonormal bases) showing what is actually going on, and we present this proof below (Lemma 1).

A classical inequality of Hadamard for real matrices $A=(a_{ij})$ states that \begin{equation}\label{eq:det-Hadamard-inequality} |\det(A)| \leq \prod_{j=1}^n\sqrt{\sum_{i=1}^na_{ij}^2}\,. \end{equation} The Frobenius norm of a matrix $A$ is just the Euclidean norm $\normf{A}:= \sqrt{\sum_{i,j}a_{ij}^2}$ of the corresponding vector, that is, the square root of the sum of the squares of its entries. Using the arithmetic-geometric mean inequality \begin{equation}\label{eq:arith-geom} \left(\prod_{j=1}^n x_j\right)^{1/n}\ \leq\ \frac{1}{n}\sum_{j=1}^n x_j\,, \end{equation} Hadamard's inequality \eqref{eq:det-Hadamard-inequality} yields \begin{align*} |\det(A)|^{2/n}&\leq \left[\prod_{j=1}^n\sqrt{\sum_{i=1}^na_{ij}^2}\right]^{2/n} = \left[\prod_{j=1}^n\sum_{i=1}^na_{ij}^2\right]^{1/n} \overset{\eqref{eq:arith-geom}}{\leq} \frac{1}{n}\sum_{j=1}^n \sum_{i=1}^n a_{ij}^2 =\frac{\normf{A}^2}{n}\,, \end{align*} that is, \begin{equation}\label{eq:det1} |\det(A)|\leq \left(\frac{\normf{A}^2}{n}\right)^{n/2}\,. \end{equation} The following lemma extends \eqref{eq:det1} to products of matrices.
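As a quick numerical sanity check of \eqref{eq:det1}, here is a minimal sketch, assuming Python with numpy (the helper name det_bound is ours, chosen only for this illustration):

```python
import numpy as np

def det_bound(A):
    # Right-hand side of the inequality: (||A||_F^2 / n)^(n/2).
    n = A.shape[0]
    return (np.linalg.norm(A, 'fro') ** 2 / n) ** (n / 2)

# On random square matrices, |det(A)| never exceeds the bound.
rng = np.random.default_rng(0)
for _ in range(1000):
    n = int(rng.integers(2, 8))
    A = rng.standard_normal((n, n))
    assert abs(np.linalg.det(A)) <= det_bound(A) + 1e-9
```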

Lemma 1 [Pudlák 2000] Let a real $n\times n$ matrix $A=A_1\cdot A_2\cdots A_d$ be a product of $d$ matrices $A_1, A_2,\ldots,A_d$ (of compatible, but not necessarily equal, dimensions). Then \[ |\det(A)|\leq \left(\frac{\normf{A_1}^2}{n}\right)^{n/2} \left(\frac{\normf{A_2}^2}{n}\right)^{n/2}\cdots \left(\frac{\normf{A_d}^2}{n}\right)^{n/2}\,. \]
Proof: Induction on $d$. The basis $d=1$ is inequality \eqref{eq:det1}. Suppose the statement holds for $d-1$, and consider a product of $d$ matrices. Let $A_1$ be an $n\times m$ matrix. We can assume that $A_1$ has full row rank $n$ (in particular, that $m\geq n$): otherwise the matrix $A$ is singular, i.e., $\det(A)=0$, and the inequality holds trivially.

Linear combinations of the rows $\vec{a}_1,\ldots, \vec{a}_n$ of matrix $A_1$ generate an $n$-dimensional vector subspace $V$ of $\RR^m$ (here we use that $A_1$ has rank $n$). By a standard orthogonalization procedure (e.g., Gram-Schmidt) we can find an orthonormal basis $\vec{e}_1,\ldots,\vec{e}_n\in\RR^m$ of $V$; hence, $\skal{\vec{e}_{i},\vec{e}_{i}}=1$ and $\skal{\vec{e}_{i},\vec{e}_{j}}=0$ for all $i\neq j$. Extend it to an orthonormal basis $\vec{e}_1,\ldots,\vec{e}_m\in\RR^m$ of $V\oplus V^{\perp}=\RR^m$. Representing each row $\vec{a}_i$ of the matrix $A_1$ as a linear combination $\vec{a}_i=c_{i,1}\vec{e}_1+ \cdots+c_{i,n}\vec{e}_n$ of basis vectors, we obtain \[ A_1=A_1'\cdot E\,, \] where $A_1':=(c_{i,j})$ is the $n\times n$ matrix of the coefficients of these linear combinations, and $E:=[\vec{e}_1,\cdots,\vec{e}_n]^T$ is the $n\times m$ matrix whose rows are the first $n$ basis vectors $\vec{e}_1,\ldots,\vec{e}_n\in\RR^m$. Since the basis is orthonormal, we have $E\cdot E^T=I_n$, where $E^T$ is the transpose of matrix $E$, and $I_n$ denotes the $n\times n$ identity matrix. Multiplying both sides of the equality $A_1=A_1'\cdot E$ from the right by the matrix $E^T$, we obtain \[ A_1'=A_1\cdot E^T\,. \] Hence \[ (A_1E^T)\cdot E=A_1'\cdot E=A_1\,. \] In particular, $(A_1E^T)(EA_2)=A_1A_2$. So, by taking $A_2':=E\cdot A_2$, we can represent our matrix $A$ as \[ A=A_1\cdot A_2\cdots A_d=(\overbrace{A_1\cdot E^T}^{A_1'})\cdot (\overbrace{E\cdot A_2}^{A_2'}\cdot A_3\cdots A_d)=(A_1')\cdot (A_2'\cdot A_3\cdots A_d)\,. \] The right-hand side is a product of two $n\times n$ matrices. We can therefore apply the inequality \eqref{eq:det1} to the first $n\times n$ matrix $A_1'$ and the induction hypothesis to the second $n\times n$ matrix $A_2'\cdot A_3\cdots A_d$, and obtain \begin{align*} |\det(A)|&=|\det(A_1')|\cdot |\det(A_2'A_3\cdots A_d)|\\ &\leq \left(\frac{\normf{A_1'}^2}{n}\right)^{n/2} \left(\frac{\normf{A_2'}^2}{n}\right)^{n/2} \left(\frac{\normf{A_3}^2}{n}\right)^{n/2}\cdots \left(\frac{\normf{A_d}^2}{n}\right)^{n/2}\,. \end{align*} It therefore remains to show that $\normf{A_1'}^2\leq \normf{A_1}^2$ and $\normf{A_2'}^2\leq \normf{A_2}^2$. An $m\times m$ matrix $U$ is orthonormal if $U\cdot U^T=U^T\cdot U=I_m$ (such matrices are also called \emph{orthogonal} matrices). An important property of such matrices is that multiplication by them preserves the Euclidean norm: \begin{align*} \norm{Ux}_2^2&=(Ux)^T\cdot(Ux)=x^T(U^T\cdot U)x =x^T\cdot x=\norm{x}_2^2\,;\\ \norm{x^TU}_2^2&=\norm{(x^TU)^T}_2^2 =\norm{U^Tx}_2^2 =(U^Tx)^T\cdot(U^Tx) =x^T(UU^T)x =x^T\cdot x=\norm{x}_2^2\,. \end{align*} Hence, multiplication by an orthonormal matrix from the left preserves the Euclidean norms of the columns, and multiplication from the right preserves the Euclidean norms of the rows. In both cases, the Frobenius norm is preserved.
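In matrix terms, the decomposition $A_1=A_1'\cdot E$ used above is what a reduced QR factorization of $A_1^T$ delivers: writing $A_1^T=QR$ with $Q$ having orthonormal columns gives $E=Q^T$ and $A_1'=R^T$. A minimal numerical sketch of this step, assuming numpy (the variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 7
A1 = rng.standard_normal((n, m))   # has full row rank with probability 1

# Reduced QR of A1^T: A1^T = Q @ R, where Q (m x n) has orthonormal columns.
Q, R = np.linalg.qr(A1.T)
E = Q.T          # n x m matrix whose rows are an orthonormal basis of V
A1_prime = R.T   # n x n matrix of coefficients

assert np.allclose(E @ E.T, np.eye(n))        # E * E^T = I_n
assert np.allclose(A1_prime @ E, A1)          # A1 = A1' * E
assert np.allclose(A1 @ E.T, A1_prime)        # A1' = A1 * E^T
assert np.isclose(np.linalg.norm(A1_prime, 'fro'),
                  np.linalg.norm(A1, 'fro'))  # row norms, hence ||.||_F, preserved
```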

Consider the $m\times m$ matrix $\hat{E}:= [\vec{e}_1,\cdots,\vec{e}_m]^T$ whose rows are all $m$ vectors $\vec{e}_1,\ldots,\vec{e}_m\in\RR^m$ of our chosen orthonormal basis of the entire linear space $V\oplus V^{\perp}=\RR^m$. Since the basis is orthonormal, we have $\hat{E}\cdot\hat{E}^T=I_m$. Since $\hat{E}$ is a square matrix, $\hat{E}^T$ is a two-sided inverse of $\hat{E}$ and, hence, also $\hat{E}^T\cdot \hat{E}=I_m$. Thus, $\hat{E}$ is an orthonormal matrix. Let $\hat{A}_1$ be the $n\times m$ matrix obtained from $A_1'$ by appending $m-n$ zero columns; then $A_1=A_1'\cdot E = \hat{A}_1\cdot \hat{E}$. Since $\hat{E}$ is an orthonormal matrix, the corresponding rows of $A_1$ and $\hat{A}_1$ and, hence, also of $A_1$ and $A_1'$ have the same Euclidean norms. In particular, $\normf{A_1'}^2\leq \normf{A_1}^2$ holds. Again, since the matrix $\hat{E}$ is orthonormal, the columns of $\hat{E}\cdot A_2$ have the same norms as the corresponding columns of matrix $A_2$. Since each column of $A_2'=E\cdot A_2$ is obtained from the corresponding column of $\hat{E}\cdot A_2$ by deleting its last $m-n$ entries, the norms of the columns of $A_2'$ do not exceed the norms of the corresponding columns of $A_2$. In particular, $\normf{A_2'}^2\leq \normf{A_2}^2$. $\Box$
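Lemma 1 is also easy to test numerically on random rectangular factors; here is a minimal sketch, again assuming numpy (the helper name frob_bound is ours):

```python
import numpy as np

def frob_bound(factors, n):
    # Right-hand side of Lemma 1: the product of (||A_i||_F^2 / n)^(n/2).
    return np.prod([(np.linalg.norm(M, 'fro') ** 2 / n) ** (n / 2)
                    for M in factors])

rng = np.random.default_rng(2)
n, d = 4, 3
for _ in range(1000):
    # Random inner dimensions: the factors need not be square.
    dims = [n] + [int(k) for k in rng.integers(n, 9, size=d - 1)] + [n]
    factors = [rng.standard_normal((dims[i], dims[i + 1])) for i in range(d)]
    A = np.linalg.multi_dot(factors)
    assert abs(np.linalg.det(A)) <= frob_bound(factors, n) + 1e-9
```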

Lemma 1 immediately yields the following lower bound for bounded-depth SUM circuits (this is Theorem 3.3 in the book).

Corollary [Pudlák 2000] For every Boolean $n\times n$ matrix $A$, and every integer $d\geq 1$, \[ \dSUM{A}{d}\geq dn\cdot |\det(A)|^{\frac{2}{dn}}\,. \]
Proof: The computation of the linear operator $x\mapsto Ax$ defined by a matrix $A$ by depth-$d$ circuits corresponds to a factorization of the matrix $A$ into a product of $d$ matrices $A_1,\ldots,A_d$ (each $A_i$ being the adjacency matrix of one level of the circuit). The number of edges in the circuit is $S=\sum_{i=1}^d|A_i|$, where $|A_i|$ denotes the number of $1$-entries in matrix $A_i$. Since the matrices $A_i$ are Boolean, we have $\normf{A_i}^2=|A_i|$. Hence, Lemma 1 together with the arithmetic-geometric mean inequality \eqref{eq:arith-geom}, applied to the $d$ numbers $x_i=|A_i|/n$, gives \[ |\det(A)|\leq \prod_{i=1}^d\left(\frac{|A_i|}{n}\right)^{n/2} \leq \left(\frac{\sum_{i=1}^d|A_i|}{dn} \right)^{dn/2} = \left(\frac{S}{dn} \right)^{dn/2}\,, \] from which the desired lower bound $S\geq dn\cdot|\det(A)|^{2/(dn)}$ follows. $\Box$
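To see what the corollary gives numerically, here is a small sketch, assuming numpy (the random matrix and the helper name are ours, purely for illustration):

```python
import numpy as np

def depth_d_lower_bound(A, d):
    # The lower bound d*n*|det(A)|^(2/(d*n)) from the corollary.
    n = A.shape[0]
    return d * n * abs(np.linalg.det(A)) ** (2.0 / (d * n))

rng = np.random.default_rng(3)
n = 16
A = (rng.random((n, n)) < 0.5).astype(float)   # a random Boolean 0-1 matrix
for d in (1, 2, 3, 4):
    print(f"d = {d}: number of wires >= {depth_d_lower_bound(A, d):.1f}")
```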

References:

  1. P. Pudlák, A note on the use of determinant for proving lower bounds on the size of linear circuits, Information Processing Letters 74 (2000), 197-201.

