Chernoff's bounds for dependent random variables

Recently, Dmitry Gavinsky, Shachar Lovett, Michael Saks, and Srikanth Srinivasan proved the following extension of Chernoff bounds to the case when some dependencies between the random variables are allowed.
Definition: A family $Y_1,\ldots,Y_r$ of random variables is read-$k$ if there exists a sequence $X_1,\ldots,X_m$ of independent random variables, and a sequence $S_1,\ldots,S_r$ of subsets of $[m]=\{1,\ldots,m\}$, such that each $Y_i$ is a function of the variables $(X_j\colon j\in S_i)$ only, and every element $j\in [m]$ belongs to at most $k$ of the sets $S_1,\ldots,S_r$.
Note that for $k=1$ the variables $Y_i$ are independent. The other extreme is when some element $j\in [m]$ appears in all $r$ sets $S_i$ (i.e. when $k=r$): then the $Y_i$'s may be arbitrary.
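As a concrete illustration (not taken from the paper; the sliding-window construction and all names below are chosen only for this sketch), here is a minimal Python example of a read-$k$ family: each $Y_i$ is the parity of a window of $k$ consecutive independent bits, so every $X_j$ is read by at most $k$ of the $Y_i$'s.

```python
import random

def read_k_family(m, k, seed=0):
    """Build a read-k family: Y_i is the parity of the window X_i, ..., X_{i+k-1}.

    Each independent bit X_j falls into at most k windows, so the family
    Y_1, ..., Y_{m-k+1} is read-k in the sense of the definition above.
    """
    rng = random.Random(seed)
    X = [rng.randint(0, 1) for _ in range(m)]               # independent X_1, ..., X_m
    S = [list(range(i, i + k)) for i in range(m - k + 1)]   # S_i = {i, ..., i+k-1}
    Y = [sum(X[j] for j in S_i) % 2 for S_i in S]           # Y_i depends only on X_j, j in S_i
    return X, S, Y

# Sanity check: every index j lies in at most k of the sets S_i.
m, k = 20, 3
X, S, Y = read_k_family(m, k)
assert max(sum(j in S_i for S_i in S) for j in range(m)) <= k
```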
Theorem: Let $Y_1,\ldots,Y_r$ be a family of read-$k$ indicator variables with $Pr[Y_i =1] = p_i$, and let $p$ be the average of $p_1,\ldots,p_r$. Then for any $\epsilon > 0$, both probabilities \[ Pr[Y_1+\cdots +Y_r\geq (p+\epsilon)r]\ \ \mbox{and}\ \ Pr[Y_1+\cdots +Y_r\leq (p-\epsilon)r] \] are at most \[ e^{-2\epsilon^2 r/k}. \]
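To get a feeling for the theorem, one can estimate the upper-tail probability of such a family by simulation and compare it with $e^{-2\epsilon^2 r/k}$. The following sketch does this for the window-parity family above; the parameters are arbitrary and the Monte Carlo estimate is only illustrative.

```python
import math
import random

def upper_tail_estimate(m, k, eps, trials=10000, seed=1):
    """Estimate Pr[Y_1 + ... + Y_r >= (p + eps) r] for the window-parity family."""
    rng = random.Random(seed)
    r = m - k + 1
    p = 0.5                        # each window parity is a fair bit, so p_i = 1/2
    hits = 0
    for _ in range(trials):
        X = [rng.randint(0, 1) for _ in range(m)]
        Y = [sum(X[i:i + k]) % 2 for i in range(r)]
        if sum(Y) >= (p + eps) * r:
            hits += 1
    return hits / trials, math.exp(-2 * eps**2 * r / k)

estimate, bound = upper_tail_estimate(m=200, k=4, eps=0.05)
print("empirical upper tail:", estimate, "  read-k Chernoff bound:", bound)
```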
The theorem thus gives the same bound as the standard Chernoff bound, except that the exponent is divided by $k$. This is clearly tight: let $Y_1= \ldots = Y_k = X_1$, $Y_{k+1}= \ldots = Y_{2k} = X_2$, and so on, where $X_1,X_2,\ldots,X_{r/k}$ are independent with $Pr[X_i =1] = p$ (assuming $k$ divides $r$). Then, for example, \[ Pr[Y_1+\cdots +Y_r\geq (p+\epsilon)r]= Pr[X_1+\cdots +X_{r/k}\geq (p+\epsilon)r/k], \] and since the Chernoff bound is tight up to constant factors in the exponent for the $r/k$ independent variables $X_i$, the exponent $2\epsilon^2 r/k$ cannot be improved by more than a constant factor.
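A tiny numerical check of this block example (with parameters chosen only for illustration): for $r=12$, $k=3$, $p=1/2$ and $\epsilon=1/4$, exact enumeration over the $2^{r/k}$ outcomes shows that the upper tail already exceeds the bound $e^{-2\epsilon^2 r}$ one would get by pretending the $Y_i$'s were independent, while it stays below the read-$k$ bound $e^{-2\epsilon^2 r/k}$.

```python
import itertools
import math

r, k, p, eps = 12, 3, 0.5, 0.25
n = r // k                      # number of independent blocks X_1, ..., X_{r/k}

# Y_1 = ... = Y_k = X_1, Y_{k+1} = ... = Y_{2k} = X_2, etc., hence
# Y_1 + ... + Y_r = k * (X_1 + ... + X_{r/k}); enumerate all 2^{r/k} outcomes.
tail = sum(1 for xs in itertools.product([0, 1], repeat=n)
           if k * sum(xs) >= (p + eps) * r) / 2**n

print("exact upper tail:              ", tail)                           # 5/16 = 0.3125
print("bound without the k, e^{-2eps^2 r}:   ", math.exp(-2 * eps**2 * r))       # ~0.22, violated
print("read-k bound, e^{-2eps^2 r/k}: ", math.exp(-2 * eps**2 * r / k))  # ~0.61, holds
```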

Reference: D. Gavinsky, S. Lovett, M. Saks, and S. Srinivasan, A tail bound for read-k families of functions, Random Structures and Algorithms 47:1 (2015) 99-108.

