Tail Inequalities

Problems

4.1

Suppose you are given a biased coin that has $\Pr[\text{HEADS}] = p \geq a$, for some fixed $a > 0$, without being given any other information about $p$. Devise a procedure for estimating $p$ by a value $\tilde{p}$ such that you can guarantee that

$$\Pr[\,|p - \tilde{p}| > \varepsilon p\,] < \delta$$

for any choice of the constants $a, \varepsilon, \delta \in (0, 1)$. Let $N$ be the number of times you need to flip the biased coin to obtain the estimate. What is the smallest value of $N$ for which you can still give this guarantee?
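One natural procedure is to flip the coin $N$ times and return the empirical frequency of HEADS; since $p \geq a$, a Chernoff bound shows that $N = \Theta\!\big(\tfrac{1}{a\varepsilon^2}\ln\tfrac{1}{\delta}\big)$ flips suffice. A minimal sketch, assuming that sample-size formula (the constant 3 is a convenient choice, not the optimal one):

```python
import math
import random

def estimate_bias(flip, a, eps, delta):
    # Chernoff-style sample size: since p >= a, taking
    # N = ceil(3 * ln(2/delta) / (eps^2 * a)) flips guarantees
    # Pr[|p_hat - p| > eps * p] < delta.
    n = math.ceil(3 * math.log(2 / delta) / (eps * eps * a))
    heads = sum(flip() for _ in range(n))
    return heads / n, n

random.seed(7)
p = 0.3  # unknown to the estimator; we only promise p >= a
p_hat, flips = estimate_bias(lambda: random.random() < p, a=0.1, eps=0.1, delta=0.01)
assert abs(p_hat - p) <= 0.1 * p
```

The dependence $\Theta\!\big(\tfrac{1}{a\varepsilon^2}\ln\tfrac{1}{\delta}\big)$ is also the shape of the answer to the "smallest $N$" question one expects from the Chernoff bound.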


4.2

Let $X$ be a random variable. Define the $k$th factorial moment of $X$, denoted $E[X^{(k)}]$, as the expected value of

$$X(X-1)(X-2)\cdots(X-k+1).$$

Let $G_{n,p}$ denote a random graph on $n$ vertices where each edge is independently present with probability $p$, and let $G_{n,N}$ denote a graph on $n$ vertices that has $N$ edges chosen uniformly at random. Let $X$ denote the number of isolated vertices in $G_{n,p}$, and let $Y$ be the number of isolated vertices in $G_{n,N}$. Consider the case

$$p = \frac{\ln n + c}{n}$$

and

$$N = \frac{n(\ln n + c)}{2}$$

for a real value $c$. Prove that $E[X^{(k)}]$ and $E[Y^{(k)}]$ are asymptotically equal to $\mu^k$, where $\mu = e^{-c}$.
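The first moment ($k = 1$) can be checked numerically: in $G_{n,p}$ a fixed vertex is isolated with probability $(1-p)^{n-1}$, so $E[X] = n(1-p)^{n-1}$, which tends to $e^{-c}$ when $p = (\ln n + c)/n$. A quick sanity check:

```python
import math

def expected_isolated(n, c):
    # E[X] = n * (1 - p)^(n - 1), with p = (ln n + c) / n
    p = (math.log(n) + c) / n
    return n * (1 - p) ** (n - 1)

for n in (10**3, 10**5, 10**7):
    print(n, expected_isolated(n, 1.0))
print("limit:", math.exp(-1.0))
```

The printed values approach $e^{-1} \approx 0.3679$ as $n$ grows, matching $\mu = e^{-c}$ at $c = 1$.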


4.3

For $\delta$ in the range $(0, 1]$, use (4.1) to obtain a closed-form upper bound for $\Pr[X > (1+\delta)\mu]$, as a function of $\mu$ and $\delta$, that is within a constant factor of the best possible.
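A candidate closed form in this range, stated here as an assumption to be proved, is $e^{-\mu\delta^2/3}$; one can at least confirm numerically that it dominates the bound of (4.1) for $0 < \delta \leq 1$:

```python
import math

def bound_41(mu, d):
    # The Chernoff bound (4.1): (e^delta / (1+delta)^(1+delta))^mu
    return (math.exp(d) / (1 + d) ** (1 + d)) ** mu

def closed_form(mu, d):
    # Candidate closed-form bound for 0 < delta <= 1 (assumed, to be proved)
    return math.exp(-mu * d * d / 3)

for d in (0.1, 0.5, 1.0):
    assert bound_41(20.0, d) <= closed_form(20.0, d)
```

The proof amounts to showing $\delta - (1+\delta)\ln(1+\delta) \leq -\delta^2/3$ on $(0, 1]$.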


4.4

Let $X_1, \ldots, X_n$ be independent geometrically distributed random variables, each having expectation 2; that is, each of the $X_i$ is an independent experiment counting the number of tosses of an unbiased coin up to and including the first HEADS. Let

$$X = \sum_{i=1}^{n} X_i$$

and let $\delta$ be a positive real constant. Use moment generating functions and the Chernoff technique to derive the best upper bound you can on

$$\Pr[X > (1+\delta) \cdot 2n].$$
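Whatever bound you derive can be sanity-checked numerically: the MGF of one fair-coin geometric variable is $E[e^{tX_i}] = e^t/(2 - e^t)$ for $t < \ln 2$, and minimizing $e^{n \ln M(t) - ts}$ over $t$ bounds $\Pr[X > s]$. For $\delta = 1/2$ this optimization works out to $(27/32)^n$ (a derived value, easily re-checked by calculus); a sketch with a grid search standing in for the calculus:

```python
import math

def geo_mgf(t):
    # E[e^{tX}] for X geometric with p = 1/2 on {1, 2, ...}; requires e^t < 2
    return math.exp(t) / (2 - math.exp(t))

def chernoff_upper(n, s, grid=2000):
    # Chernoff: Pr[X > s] <= min over 0 < t < ln 2 of exp(n * ln M(t) - t * s)
    ts = (math.log(2) * i / grid for i in range(1, grid))
    return min(math.exp(n * math.log(geo_mgf(t)) - t * s) for t in ts)

n, delta = 50, 0.5
s = int((1 + delta) * 2 * n)  # threshold 150; the mean of X is 2n = 100
assert math.isclose(chernoff_upper(n, s), (27 / 32) ** n, rel_tol=1e-3)
```

The optimal $t$ here is $\ln(4/3)$, found by setting the derivative of the exponent to zero.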


4.5

The result of Theorem 4.2 bounds the probability of the sum of Poisson trials deviating far below its expectation. Use this to give a bound on the probability of the sum of independent geometric random variables deviating above its expectation, thus providing an alternative approach to that in Problem 4.4.
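The key observation is that the sum of $n$ fair-coin geometric variables exceeds $m$ exactly when $m$ coin tosses contain fewer than $n$ HEADS, so a lower-tail bound for Poisson trials, here taken in the form $\Pr[Y < (1-\gamma)\mu] < e^{-\mu\gamma^2/2}$ (the usual statement of Theorem 4.2; treat the exact constant as an assumption), bounds the upper tail of the geometric sum:

```python
import math

def exact_geo_sum_tail(n, m):
    # Pr[X_1 + ... + X_n > m] = Pr[Bin(m, 1/2) <= n - 1]: the sum of n
    # fair-coin geometrics exceeds m iff m tosses contain fewer than n HEADS.
    return sum(math.comb(m, k) for k in range(n)) / 2 ** m

def lower_tail_bound(m, n):
    # Theorem 4.2-style bound on Pr[Bin(m, 1/2) < n], with mu = m/2
    mu = m / 2
    gamma = 1 - n / mu
    return math.exp(-mu * gamma * gamma / 2)

n, m = 50, 150
assert exact_geo_sum_tail(n, m) <= lower_tail_bound(m, n)
```

Choosing $m = (1+\delta)\,2n$ turns this into a bound on the deviation asked for in Problem 4.4.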


4.6

Hoeffding’s Bound [202].

Suppose $X_1, \ldots, X_n$ are independent Poisson trials such that $\Pr[X_i = 1] = p_i$. Let

$$X = \sum_{i=1}^{n} X_i, \qquad p = \frac{1}{n}\sum_{i=1}^{n} p_i.$$

Our goal is to show that, from the standpoint of deviations from the mean, the worst case is when the $p_i$'s are all equal. Let $Y$ be the sum of $n$ independent Bernoulli trials each having probability $p$ of assuming the value 1. Then, for any $a \geq np + 1$ and any $b \leq np - 1$, show that

$$\Pr[X \geq a] \leq \Pr[Y \geq a]$$

and

$$\Pr[X \leq b] \leq \Pr[Y \leq b].$$
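The claim can be tested exactly on small instances by computing both distributions with dynamic programming; for example, with hypothetical probabilities $p_i \in \{0.1, 0.3, 0.5, 0.9\}$ (so $p = 0.45$ and the mean is $1.8$):

```python
def bernoulli_sum_dist(ps):
    # Exact distribution of a sum of independent Bernoulli(p_i) variables.
    dist = [1.0]
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)
            new[k + 1] += q * p
        dist = new
    return dist

ps = [0.1, 0.3, 0.5, 0.9]
hom = [sum(ps) / len(ps)] * len(ps)   # every p_i replaced by the average p
x = bernoulli_sum_dist(ps)
y = bernoulli_sum_dist(hom)
# mean is 1.8, so a = 3 >= mean + 1 and b = 0 <= mean - 1
assert sum(x[3:]) <= sum(y[3:])       # upper tail
assert sum(x[:1]) <= sum(y[:1])       # lower tail
```

Equalizing the $p_i$'s spreads the distribution out, which is exactly why the homogeneous case dominates both tails.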


4.7

Due to W. Hoeffding [202].

This problem deals with a useful generalization of the Hoeffding bound in Problem 4.6.

  1. A function $f$ is said to be convex if, for any $x_1$, $x_2$, and $0 \leq \lambda \leq 1$, the following inequality is satisfied:

$$f(\lambda x_1 + (1-\lambda)x_2) \leq \lambda f(x_1) + (1-\lambda)f(x_2).$$

Show that the function $f(x) = e^{tx}$ is convex for any $t > 0$. What can you say when $t < 0$?

  2. Let $X$ be a random variable that assumes values in the interval $[0, 1]$, and let $p = E[X]$. Define the Bernoulli random variable $Y$ such that $\Pr[Y = 1] = p$ and $\Pr[Y = 0] = 1 - p$. Show that for any convex function $f$, $E[f(X)] \leq E[f(Y)]$.

  3. Let $X_1, \ldots, X_n$ be independent and identically distributed random variables over $[0, 1]$, and define

$$X = \sum_{i=1}^{n} X_i.$$

Using parts (1) and (2), derive upper and lower tail bounds for the random variable $X$ using the Chernoff bound technique. In particular, show that the bounds of Theorems 4.1 and 4.2 apply to $X$ with $\mu = nE[X_1]$.

Remark: While the results in this problem hold for continuous random variables, they may be a bit easier to prove in the case where the $X_i$ take on a discrete set of values in the interval $[0, 1]$. Also, it should be easy to generalize this to distributions defined over arbitrary intervals $[a, b]$. See also Problem 4.21.
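Part (2) can be spot-checked with $X$ uniform on $[0, 1]$ (so $p = 1/2$) against $Y \sim \text{Bernoulli}(1/2)$, using the functions $f(x) = e^{tx}$ from part (1): here $E[e^{tX}] = (e^t - 1)/t$ while $E[e^{tY}] = (1 + e^t)/2$.

```python
import math

# E[f(X)] <= E[f(Y)] for f(x) = e^{tx}, X uniform on [0, 1], Y ~ Bernoulli(1/2).
# The case t = -1.0 checks that the inequality also holds for t < 0,
# since e^{tx} is convex for every real t.
for t in (0.5, 1.0, 2.0, -1.0):
    e_fx = (math.exp(t) - 1) / t   # E[e^{tX}] for uniform X
    e_fy = (1 + math.exp(t)) / 2   # E[e^{tY}] for Bernoulli(1/2) Y
    assert e_fx <= e_fy
```

This is exactly the step that lets the Chernoff machinery for Bernoulli variables absorb arbitrary $[0,1]$-valued variables in part (3).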


4.8

Consider a BPP algorithm that has an error probability of $1/2 - 1/p(n)$, for some polynomially bounded function $p(n)$ of the input size $n$. Using the Chernoff bound on the tail of the binomial distribution, show that a polynomial number of independent repetitions of this algorithm, taking the majority of the answers, suffice to reduce the error probability to $2^{-n}$.
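The amplification is concrete: with per-run error $1/2 - \varepsilon$ where $\varepsilon = 1/p(n)$, the Chernoff–Hoeffding bound gives majority-vote error at most $e^{-2t\varepsilon^2}$ after $t$ runs, so $t = \Theta(n\,p(n)^2)$ suffices. A numeric check of the exact majority error against that bound (assuming the Hoeffding form $e^{-2t\varepsilon^2}$):

```python
import math

def majority_error(eps, t):
    # Exact Pr[majority of t runs is wrong]; each run is correct w.p. 1/2 + eps.
    # The majority is wrong when at most floor(t/2) runs are correct
    # (t odd avoids ties).
    q = 0.5 + eps
    return sum(math.comb(t, k) * q**k * (1 - q)**(t - k) for k in range(t // 2 + 1))

eps = 0.1  # stands in for 1/p(n)
for t in (11, 101, 201):
    assert majority_error(eps, t) <= math.exp(-2 * t * eps * eps)
```

Since $\varepsilon$ is only polynomially small, the required $t$ stays polynomial in $n$.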


4.9

Consider now the following variant of the bit-fixing algorithm. Each packet randomly orders the bit positions in the label of its source, and then corrects the mismatched bits in that order. Show that there is a permutation for which with high probability this algorithm requires steps to complete the routing.


4.10

Suppose we run Valiant’s scheme on an -node network in which every node is of degree ; each packet first goes to a random destination chosen uniformly from all the nodes and then on to its final destination. Show that the expected number of steps for the completion of the first phase is


4.11

The lattice approximation problem is an extension of the set-balancing problem of Example 4.5. As before, we are given an $n \times n$ matrix $A$, all of whose entries are 0 or 1. In addition, we are given a column vector $\bar{p}$ with $n$ entries, all of which are in the interval $[0, 1]$. We wish to find a column vector $\bar{q}$ with $n$ entries, all of which are from the set $\{0, 1\}$, so as to minimize

$$\|A(\bar{p} - \bar{q})\|_\infty.$$

We think of the vector $\bar{q}$ as an integer approximation to the given real vector $\bar{p}$, in the sense that $A\bar{q}$ is close to $A\bar{p}$ in every component. This has applications to approximating certain integer programs given solutions to their linear programming relaxations, along the lines of Section 4.3. Derive a bound on $\|A(\bar{p} - \bar{q})\|_\infty$ assuming that $\bar{q}$ is derived from $\bar{p}$ using randomized rounding.
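Randomized rounding sets $q_i = 1$ with probability $p_i$, independently; each component of $A(\bar{p} - \bar{q})$ is then a sum of at most $n$ independent zero-mean terms bounded by 1, so Hoeffding-type bounds plus a union bound over the $n$ rows give $O(\sqrt{n \log n})$ deviation with high probability. A simulation on a random instance (fixed seed; the $\sqrt{n \ln n}$ ceiling is the target scaling, not a proved constant):

```python
import math
import random

random.seed(0)
n = 200
A = [[random.getrandbits(1) for _ in range(n)] for _ in range(n)]
p = [random.random() for _ in range(n)]

# Randomized rounding: q_i = 1 with probability p_i, independently.
q = [1 if random.random() < pi else 0 for pi in p]

deviation = max(
    abs(sum(a * (pi - qi) for a, pi, qi in zip(row, p, q))) for row in A
)
assert deviation <= math.sqrt(n * math.log(n))
```

Note that $E[q_i] = p_i$, so every row deviation has mean zero by construction.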


4.12

Consider the global wiring problem of Section 4.3. We wish to approximate the best possible solution without the restriction that only one-bend routes are used. Adapt the approach in Section 4.3 to devise an algorithm running in time polynomial in the number of gates and nets, achieving an approximation similar to that in Theorem 4.8.


4.13

The set-cover problem is the following: given sets $S_1, \ldots, S_n$ over a universe $U$, find the smallest set $T \subseteq U$ such that for $1 \leq i \leq n$, $T \cap S_i \neq \emptyset$. An alternative formulation of this problem is the following: given a 0-1 matrix $A$, find a 0-1 column vector $\bar{c}$ such that the dot product of each row of $A$ with $\bar{c}$ is positive while minimizing the number of 1's in $\bar{c}$. The matrix $A$ has $n$ rows, and the $i$th row is the incidence vector of the set $S_i$.

Given a matrix $A$, let $t(A)$ denote the size of the smallest set-cover for $A$. Let $n$ be the number of rows in $A$. Show that we can adapt the technique of linear programming followed by randomized rounding to find a set-cover of size $O(\log n)$ times $t(A)$.
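A toy sketch of the rounding step, taking a feasible fractional cover as given rather than solving the LP (the instance and the fractional solution below are hypothetical): repeat $O(\log n)$ independent rounds, in each round picking element $u$ with probability $x_u$, and take the union of everything picked.

```python
import math
import random

random.seed(3)

# Hypothetical instance over U = {0,...,9}: every set must be hit.
sets = [{0, 1, 2}, {2, 3, 4}, {4, 5, 6}, {6, 7, 8}, {8, 9, 0}, {1, 3, 5, 7, 9}]

# Hypothetical fractional LP solution: for every set, its x-values sum to >= 1.
x = {u: 0.5 for u in set().union(*sets)}

rounds = math.ceil(2 * math.log(len(sets)))  # O(log n) independent rounds
chosen = set()
for _ in range(rounds):
    for u, xu in x.items():
        if random.random() < xu:
            chosen.add(u)

assert all(s & chosen for s in sets)  # every set is hit
```

Each round misses a fixed set with probability at most $1/e$ (since its $x$-values sum to at least 1), so $O(\log n)$ rounds cover everything with high probability, and the expected size of `chosen` is at most $O(\log n)$ times the LP optimum, which lower-bounds $t(A)$.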


4.14

Show that the RandQS algorithm of Chapter 1 runs in time $O(n \log n)$ with high probability.
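For reference, a compact RandQS with a comparison counter; the high-probability claim says the count concentrates around $2n \ln n$ (the $8\,n\log_2 n$ ceiling below is a loose empirical cushion for the seeded run, not the theorem's constant):

```python
import math
import random

def randqs(items, counter):
    # Randomized quicksort: uniformly random pivot, recurse on both sides.
    if len(items) <= 1:
        return list(items)
    pivot = random.choice(items)
    counter[0] += len(items) - 1  # every other element is compared to the pivot
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return randqs(less, counter) + equal + randqs(greater, counter)

random.seed(4)
n = 2000
data = list(range(n))
random.shuffle(data)
counter = [0]
assert randqs(data, counter) == sorted(data)
assert counter[0] <= 8 * n * math.log2(n)
```

The Chernoff-based argument bounds the recursion depth by $O(\log n)$ with high probability, which immediately gives the $O(n \log n)$ total.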


4.15

Redesign the parameters of the LazySelect algorithm of Chapter 3 and invoke the Chernoff bound to show that, with high probability, it finds the $k$th smallest of $n$ elements in

$$2n + o(n)$$

steps.


4.16

Prove Lemmas 4.9 and 4.10. Also, formulate and prove their generalizations to the case where the conditioning is done on more than one random variable. Finally, using these, prove Lemma 4.11.


4.17

Prove Theorem 4.12.


4.18

Prove Lemma 4.14.


4.19

Using Lemma 4.14, prove Theorem 4.13.


4.20

Derive the tail bounds described in Problem 4.7(3) by applying Azuma's inequality, Corollary 4.17, to the Doob martingale sequence $Z_0, Z_1, \ldots, Z_n$ obtained from $X = \sum_{i=1}^{n} X_i$ by setting

$$Z_0 = E[X]$$

and, for $1 \leq i \leq n$,

$$Z_i = E[X \mid X_1, \ldots, X_i].$$
How does this bound compare with the one obtained in Problem 4.7?


4.21

Prove Azuma's inequality, Theorem 4.16, for the case where $c_k = 1$ for all $k$. Note that this is the same as Corollary 4.17 with $c = 1$. Do you see how to generalize this to the case of arbitrary $c_k$'s? Hint: Concentrate on the upper tail bound, since the lower tail bound can be obtained by negating the random variables. Consider the martingale difference sequence obtained by setting $Y_i = X_i - X_{i-1}$, and note that

$$X_t - X_0 = \sum_{i=1}^{t} Y_i.$$

You can essentially mimic the proof of Theorem 4.1, but be careful to use conditional expectations and the martingale property in going from the analog of equation (4.2) to that of equation (4.3). Since the random variables $Y_i$ could have arbitrary distributions over the interval $[-1, 1]$, you will also have to make use of an argument similar to that in Problem 4.7.
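For the $c_k = 1$ case the statement being proved is $\Pr[X_t - X_0 \geq \lambda] \leq e^{-\lambda^2/2t}$ (assuming that standard form of the corollary); a fair $\pm 1$ random walk is the canonical martingale to test it on:

```python
import math
import random

random.seed(2)
t, lam, trials = 100, 25, 20000
hits = 0
for _ in range(trials):
    # X_t - X_0 for a fair +/-1 walk: a martingale with |X_k - X_{k-1}| = 1
    walk = sum(random.choice((-1, 1)) for _ in range(t))
    hits += walk >= lam
assert hits / trials <= math.exp(-lam * lam / (2 * t))
```

The empirical frequency sits well below the bound, as expected, since Azuma's inequality is not tight for this walk.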


4.22

Due to A. Kamath, R. Motwani, K. Palem, and P. Spirakis [228].

Consider again the issue of tail bounds on the number of empty bins studied in Theorem 4.18. In this setting, $m$ balls are thrown independently and uniformly into $n$ bins; let $Z_i$ be the indicator variable whose value is 1 if and only if bin $i$ is empty, and define

$$Z = \sum_{i=1}^{n} Z_i$$

as the number of empty bins.

Define

$$p = \left(1 - \frac{1}{n}\right)^m,$$

and let $Y_1, \ldots, Y_n$ be mutually independent Bernoulli random variables that take value 1 with probability $p$ and value 0 with probability $1 - p$; note that the sum

$$Y = \sum_{i=1}^{n} Y_i$$

has the binomial distribution with parameters $n$ and $p$.

  1. Show that for all $t > 0$,

$$E[e^{tZ}] \leq E[e^{tY}].$$

Conclude that any Chernoff bound on the upper tail of $Y$'s distribution also applies to the upper tail of $Z$'s distribution, even though the Bernoulli variables $Z_1, \ldots, Z_n$ are not mutually independent. The point is that their correlation is negative and only helps to reduce the tail probability. How does the resulting bound on the upper tail of $Z$'s distribution compare with the bound given in Theorem 4.18?

  2. Can you show that for all $t < 0$,

$$E[e^{tZ}] \leq E[e^{tY}]?$$

Repeat the exercise in part (1) for the lower tail.
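Part (1)'s inequality can be verified by brute force on tiny cases, comparing the exact MGF of $Z$ (enumerating all $n^m$ equally likely placements) with the binomial MGF of $Y$:

```python
import math
from itertools import product

def mgf_empty_bins(n, m, t):
    # Exact E[e^{tZ}]: enumerate all n^m placements of m balls into n bins;
    # the number of empty bins is n minus the number of distinct bins used.
    total = sum(math.exp(t * (n - len(set(w)))) for w in product(range(n), repeat=m))
    return total / n ** m

def mgf_binomial(n, p, t):
    # E[e^{tY}] for Y ~ Bin(n, p)
    return (1 - p + p * math.exp(t)) ** n

n, m = 4, 5
p = (1 - 1 / n) ** m
for t in (0.5, 1.0, 2.0):
    assert mgf_empty_bins(n, m, t) <= mgf_binomial(n, p, t)
```

Each $Z_i$ has exactly the marginal $\Pr[Z_i = 1] = p$, so the comparison isolates the effect of the negative correlation.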