Expectation
Prev: Random variables and their distributions Next: Continuous random variables
Problems
Exercises marked as solved in the book have detailed solutions at http://stat110.net.
Expectations and variances
4.1
Bobo, the amoeba from Chapter 2, currently lives alone in a pond. After one minute Bobo will either die, split into two amoebas, or stay the same, with equal probability. Find the expectation and variance for the number of amoebas in the pond after one minute.
4.2
In the Gregorian calendar, each year has either 365 days (a normal year) or 366 days (a leap year). A year is randomly chosen, with probability of being a normal year and of being a leap year. Find the mean and variance of the number of days in the chosen year.
4.3
(a) A fair die is rolled. Find the expected value of the roll.
(b) Four fair dice are rolled. Find the expected total of the rolls.
4.4
A fair die is rolled some number of times. You can choose whether to stop after 1, 2, or 3 rolls, and your decision can be based on the values that have appeared so far. You receive the value shown on the last roll of the die, in dollars. What is your optimal strategy (to maximize your expected winnings)? Find the expected winnings for this strategy.
Hint: Start by considering a simpler version of this problem, where there are at most 2 rolls. For what values of the first roll should you continue for a second roll?
4.5
Find the mean and variance of a Discrete Uniform r.v. on .
Hint: See the math appendix for some useful facts about sums.
4.6
Two teams are going to play a best-of-7 match (the match will end as soon as either team has won 4 games). Each game ends in a win for one team and a loss for the other team. Assume that each team is equally likely to win each game, and that the games played are independent. Find the mean and variance of the number of games played.
4.7
A certain small town, whose population consists of 100 families, has 30 families with 1 child, 50 families with 2 children, and 20 families with 3 children. The birth rank of one of these children is 1 if the child is the firstborn, 2 if the child is the secondborn, and 3 if the child is the thirdborn.
(a) A random family is chosen (with equal probabilities), and then a random child within that family is chosen (with equal probabilities). Find the PMF, mean, and variance of the child’s birth rank.
(b) A random child is chosen in the town (with equal probabilities). Find the PMF, mean, and variance of the child’s birth rank.
4.8
A certain country has four regions: North, East, South, and West. The populations of these regions are 3 million, 4 million, 5 million, and 8 million, respectively. There are 4 cities in the North, 3 in the East, 2 in the South, and there is only 1 city in the West. Each person in the country lives in exactly one of these cities.
(a) What is the average size of a city in the country? (This is the arithmetic mean of the populations of the cities, and is also the expected value of the population of a city chosen uniformly at random.)
Hint: Give the cities names (labels).
(b) Show that without further information it is impossible to find the variance of the population of a city chosen uniformly at random. That is, the variance depends on how the people within each region are allocated between the cities in that region.
(c) A region of the country is chosen uniformly at random, and then a city within that region is chosen uniformly at random. What is the expected population size of this randomly chosen city?
Hint: First find the selection probability for each city.
(d) Explain intuitively why the answer to (c) is larger than the answer to (a).
4.9
Consider the following simplified scenario based on Who Wants to Be a Millionaire?, a game show in which the contestant answers multiple-choice questions that have 4 choices per question. The contestant (Fred) has answered 9 questions correctly already, and is now being shown the 10th question. He has no idea what the right answers are to the 10th or 11th questions. He has one “lifeline” available, which he can apply on any question, and which narrows the number of choices from 4 down to 2. Fred has the following options available.
(a) Walk away with $16,000.
(b) Apply his lifeline to the 10th question, and then answer it. If he gets it wrong, he will leave with $1,000. If he gets it right, he moves on to the 11th question. He then leaves with $32,000 if he gets the 11th question wrong, and $64,000 if he gets the 11th question right.
(c) Same as the previous option, except not using his lifeline on the 10th question, and instead applying it to the 11th question (if he gets the 10th question right).
Find the expected value of each of these options. Which option has the highest expected value? Which option has the lowest variance?
4.10
Consider the St. Petersburg paradox (Example 4.3.14), except that you receive $n rather than $2^n if the game lasts for rounds. What is the fair value of this game? What if the payoff is $n^2?
4.11
Martin has just heard about the following exciting gambling strategy: bet $1 that a fair coin will land Heads. If it does, stop. If it lands Tails, double the bet for the next toss, now betting $2 on Heads. If it does, stop. Otherwise, double the bet for the next toss to $4. Continue in this way, doubling the bet each time and then stopping right after winning a bet. Assume that each individual bet is fair, i.e., has an expected net winnings of 0. The idea is that
so the gambler will be $1 ahead after winning a bet, and then can walk away with a profit.
Martin decides to try out this strategy. However, he only has $31, so he may end up walking away bankrupt rather than continuing to double his bet. On average, how much money will Martin win?
4.12
Let be a discrete r.v. with support for some positive integer . Suppose that the PMF of satisfies the symmetry property for all integers . Find .
4.13
Stat110 solution available.
Are there discrete random variables and such that but is greater than with probability at least 0.99?
4.14
Let have PMF
where is a parameter with and is a normalizing constant. We have
as seen from the Taylor series
This distribution is called the Logarithmic distribution (because of the log in the above Taylor series), and has often been used in ecology. Find the mean and variance of .
4.15
Player A chooses a random integer between 1 and 100, with probability of choosing (for ). Player B guesses the number that player A picked, and receives from player A that amount in dollars if the guess is correct (and 0 otherwise).
(a) Suppose for this part that player B knows the values of . What is player B’s optimal strategy (to maximize expected earnings)?
(b) Show that if both players choose their numbers so that the probability of picking is proportional to , then neither player has an incentive to change strategies, assuming the opponent’s strategy is fixed. (In game theory terminology, this says that we have found a Nash equilibrium.)
(c) Find the expected earnings of player B when following the strategy from (b). Express your answer both as a sum of simple terms and as a numerical approximation. Does the value depend on what strategy player A uses?
4.16
The dean of Blotchville University boasts that the average class size there is 20. But the reality experienced by the majority of students there is quite different: they find themselves in huge courses, held in huge lecture halls, with hardly enough seats or Haribo gummi bears for everyone. The purpose of this problem is to shed light on the situation. For simplicity, suppose that every student at Blotchville University takes only one course per semester.
(a) Suppose that there are 16 seminar courses, which have 10 students each, and 2 large lecture courses, which have 100 students each. Find the dean’s-eye-view average class size (the simple average of the class sizes) and the student’s-eye-view average class size (the average class size experienced by students, as it would be reflected by surveying students and asking them how big their classes are). Explain the discrepancy intuitively.
(b) Give a short proof that for any set of class sizes (not just those given above), the dean’s-eye-view average class size will be strictly less than the student’s-eye-view average class size, unless all classes have exactly the same size.
Hint: Relate this to the fact that variances are nonnegative.
4.17
The sociologist Elizabeth Wrigley-Field posed the following puzzle [29]:
American fertility fluctuated dramatically in the decades surrounding the Second World War. Parents created the smallest families during the Great Depression, and the largest families during the postwar Baby Boom. Yet children born during the Great Depression came from larger families than those born during the Baby Boom. How can this be?
(a) For a particular era, let be the number of American families with exactly children, for each . (Assume for simplicity that American history has cleanly been separated into eras, where each era has a well-defined set of families, and each family has a well-defined set of children; we are ignoring the fact that a particular family’s size may change over time, that children grow up, etc.) For each , let
For a family selected randomly in that era (with all families equally likely), find the expected number of children in the family. Express your answer only in terms of the ‘s.
(b) For a child selected randomly in that era (with all children equally likely), find the expected number of children in the child’s family, only in terms of the ‘s.
(c) Give an intuitive explanation in words for which of the answers to (a) and (b) is larger, or whether they are equal. Explain how this relates to the Wrigley-Field puzzle.
Named distributions
4.18
Stat110 solution available.
A fair coin is tossed repeatedly, until it has landed Heads at least once and has landed Tails at least once. Find the expected number of tosses.
4.19
Stat110 solution available.
A coin is tossed repeatedly until it lands Heads for the first time. Let be the number of tosses that are required (including the toss that landed Heads), and let be the probability of Heads, so that . Find the CDF of , and for sketch its graph.
4.20
Let . For each of the following parts, construct an example showing that it is possible, or explain clearly why it is impossible. In this problem, is a random variable on the same probability space as ; note that and are not necessarily independent.
(a) Is it possible to have with ?
(b) Is it possible to have with ?
(c) Is it possible to have with ?
4.21
Stat110 solution available.
Let and , independently.
(a) Let be the smaller of and , and let be the larger of and . So if crystallizes to and crystallizes to , then crystallizes to and crystallizes to . Find .
(b) Show that , with notation as in (a).
(c) Compute in two different ways.
4.22
Stat110 solution available.
Raindrops are falling at an average rate of 20 drops per square inch per minute. What would be a reasonable distribution to use for the number of raindrops hitting a particular region measuring in minutes? Why? Using your chosen distribution, compute the probability that the region has no rain drops in a given 3-second time interval.
4.23
Stat110 solution available.
Alice and Bob have just met, and wonder whether they have a mutual friend. Each has 50 friends, out of 1000 other people who live in their town. They think that it’s unlikely that they have a friend in common, saying “each of us is only friends with 5% of the people here, so it would be very unlikely that our two 5%‘s overlap.”
Assume that Alice’s 50 friends are a random sample of the 1000 people (equally likely to be any 50 of the 1000), and similarly for Bob. Also assume that knowing who Alice’s friends are gives no information about who Bob’s friends are.
(a) Compute the expected number of mutual friends Alice and Bob have.
(b) Let be the number of mutual friends they have. Find the PMF of .
(c) Is the distribution of one of the important distributions we have looked at? If so, which?
4.24
Let and . Using a story about a sequence of Bernoulli trials, prove that .
4.25
Stat110 solution available.
Calvin and Hobbes play a match consisting of a series of games, where Calvin has probability of winning each game (independently). They play with a “win by two” rule: the first player to win two games more than his opponent wins the match. Find the expected number of games played.
Hint: Consider the first two games as a pair, then the next two as a pair, etc.
4.26
Nick and Penny are independently performing independent Bernoulli trials. For concreteness, assume that Nick is flipping a nickel with probability of Heads and Penny is flipping a penny with probability of Heads. Let be Nick’s results and be Penny’s results, with and .
(a) Find the distribution and expected value of the first time at which they are simultaneously successful, i.e., the smallest such that .
Hint: Define a new sequence of Bernoulli trials and use the story of the Geometric.
(b) Find the expected time until at least one has a success (including the success).
Hint: Define a new sequence of Bernoulli trials and use the story of the Geometric.
(c) For , find the probability that their first successes are simultaneous, and use this to find the probability that Nick’s first success precedes Penny’s.
4.27
Stat110 solution available.
Let and be r.v.s, and . Suppose that and are not independent, and in fact . Prove or disprove the claim that in this scenario.
4.28
William is on a treasure hunt. There are pieces of treasure, each of which is hidden in one of locations. William searches these locations one by one, without replacement, until he has found all the treasure. (Assume that no location contains more than one piece of treasure, and that William will find the treasure piece when he searches a location that does have treasure.) Let be the number of locations that William searches during his treasure hunt. Find the distribution of , and find .
4.29
Let , and define the function by , for all real . Find . (The notation means first evaluate in terms of and , and then plug in for ; it is not correct to say "".)
4.30
(a) Use LOTUS to show that for and any function ,
assuming that both sides exist. This is called the Stein-Chen identity for the Poisson.
(b) Find the third moment for by using the identity from (a) and a bit of algebra to reduce the calculation to the fact that has mean and variance .
4.31
In many problems about modeling count data, it is found that values of zero in the data are far more common than can be explained well using a Poisson model (we can make large for by making small, but that also constrains the mean and variance of to be small since both are ). The Zero-Inflated Poisson distribution is a modification of the Poisson to address this issue, making it easier to handle frequent zero values gracefully.
A Zero-Inflated Poisson r.v. with parameters and can be generated as follows. First flip a coin with probability of Heads. Given that the coin lands Heads, . Given that the coin lands Tails, is distributed . Note that if occurs, there are two possible explanations: the coin could have landed Heads (in which case the zero is called a structural zero), or the coin could have landed Tails but the Poisson r.v. turned out to be zero anyway.
For example, if is the number of chicken sandwiches consumed by a random person in a week, then for vegetarians (this is a structural zero), but a chicken-eater could still have occur by chance (since they might happen not to eat any chicken sandwiches that week).
(a) Find the PMF of a Zero-Inflated Poisson r.v. .
(b) Explain why has the same distribution as , where is independent of .
(c) Find the mean of in two different ways: directly using the PMF of , and using the representation from (b). For the latter, you can use the fact (which we prove in Chapter 7) that if r.v.s and are independent, then .
(d) Find the variance of .
4.32
Stat110 solution available.
A discrete distribution has the memoryless property if for a random variable with that distribution,
for all nonnegative integers .
(a) If has a memoryless distribution with CDF and PMF , find an expression for in terms of , , , .
(b) Name a discrete distribution which has the memoryless property. Justify your answer with a clear interpretation in words or with a computation.
4.33
Find values of such that the Negative Hypergeometric distribution with parameters reduces to a Discrete Uniform on . Justify your answer both in terms of the story of the Negative Hypergeometric and in terms of its PMF.
Indicator r.v.s
4.34
Stat110 solution available.
Randomly, distinguishable balls are placed into distinguishable boxes, with all possibilities equally likely. Find the expected number of empty boxes.
4.35
Stat110 solution available.
A group of 50 people are comparing their birthdays (as usual, assume their birthdays are independent, are not February 29, etc.). Find the expected number of pairs of people with the same birthday, and the expected number of days in the year on which at least two of these people were born.
4.36
Stat110 solution available.
A group of people are comparing their birthdays (as usual, assume their birthdays are independent, are not February 29, etc.). Let be the indicator r.v. of and having the same birthday (for ). Is independent of ? Is independent of ? Are the independent?
4.37
Stat110 solution available.
A total of 20 bags of Haribo gummi bears are randomly distributed to 20 students. Each bag is obtained by a random student, and the outcomes of who gets which bag are independent. Find the average number of bags of gummi bears that the first three students get in total, and find the average number of students who get at least one bag.
4.38
Each of people puts their name on a slip of paper (no two have the same name). The slips of paper are shuffled in a hat, and then each person draws one (uniformly at random at each stage, without replacement). Find the average number of people who draw their own names.
4.39
Two researchers independently select simple random samples from a population of size , with sample sizes and (for each researcher, the sampling is done without replacement, with all samples of the prescribed size equally likely). Find the expected size of the overlap of the two samples.
4.40
In a sequence of independent fair coin tosses, what is the expected number of occurrences of the pattern HTH (consecutively)? Note that overlap is allowed, e.g., HTHTH contains two overlapping occurrences of the pattern.
4.41
You have a well-shuffled 52-card deck. On average, how many pairs of adjacent cards are there such that both cards are red?
4.42
Suppose there are types of toys, which you are collecting one by one. Each time you collect a toy, it is equally likely to be any of the types. What is the expected number of distinct toy types that you have after you have collected toys? (Assume that you will definitely collect toys, whether or not you obtain a complete set before then.)
4.43
A building has floors, labeled . At the first floor, people enter the elevator, which is going up and is empty before they enter. Independently, each decides which of floors to go to and presses that button (unless someone has already pressed it).
(a) Assume for this part only that the probabilities for floors are equal. Find the expected number of stops the elevator makes on floors .
(b) Generalize (a) to the case that floors have probabilities (respectively); you can leave your answer as a finite sum.
4.44
Stat110 solution available.
There are 100 shoelaces in a box. At each stage, you pick two random ends and tie them together. Either this results in a longer shoelace (if the two ends came from different pieces), or it results in a loop (if the two ends came from the same piece). What are the expected number of steps until everything is in loops, and the expected number of loops after everything is in loops? (This is a famous interview problem; leave the latter answer as a sum.)
Hint: For each step, create an indicator r.v. for whether a loop was created then, and note that the number of free ends goes down by 2 after each step.
4.45
Show that for any events ,
Hint: First prove a similar-looking statement about indicator r.v.s, by interpreting what the events and mean.
4.46
You have a well-shuffled 52-card deck. You turn the cards face up one by one, without replacement. What is the expected number of non-aces that appear before the first ace? What is the expected number between the first ace and the second ace?
4.47
You are being tested for psychic powers. Suppose that you do not have psychic powers. A standard deck of cards is shuffled, and the cards are dealt face down one by one. Just after each card is dealt, you name any card (as your prediction). Let be the number of cards you predict correctly. (See Diaconis [5] for much more about the statistics of testing for psychic powers.)
(a) Suppose that you get no feedback about your predictions. Show that no matter what strategy you follow, the expected value of stays the same; find this value. (On the other hand, the variance may be very different for different strategies. For example, saying “Ace of Spades” every time gives variance 0.)
Hint: Indicator r.v.s.
(b) Now suppose that you get partial feedback: after each prediction, you are told immediately whether or not it is right (but without the card being revealed). Suppose you use the following strategy: keep saying a specific card’s name (e.g., “Ace of Spades”) until you hear that you are correct. Then keep saying a different card’s name (e.g., “Two of Spades”) until you hear that you are correct (if ever). Continue in this way, naming the same card over and over again until you are correct and then switching to a new card, until the deck runs out. Find the expected value of , and show that it is very close to .
Hint: Indicator r.v.s.
(c) Now suppose that you get complete feedback: just after each prediction, the card is revealed. Call a strategy “stupid” if it allows, e.g., saying “Ace of Spades” as a guess after the Ace of Spades has already been revealed. Show that any non-stupid strategy gives the same expected value for ; find this value.
Hint: Indicator r.v.s.
4.48
Stat110 solution available.
Let be Hypergeometric with parameters .
(a) Find by thinking, without any complicated calculations.
(b) Use (a) to find the variance of . You should get
where , , .
4.49
There are prizes, with values $1, $2, …, $n. You get to choose random prizes, without replacement. What is the expected total value of the prizes you get?
Hint: Express the total value in the form , where the are constants and the are indicator r.v.s. Or find the expected value of the th prize received directly.
4.50
Ten random chords of a circle are chosen, independently. To generate each of these chords, two independent uniformly random points are chosen on the circle (intuitively, “uniformly” means that the choice is completely random, with no favoritism toward certain angles; formally, it means that the probability of any arc is proportional to the length of that arc). On average, how many pairs of chords intersect?
Hint: Consider two random chords. An equivalent way to generate them is to pick four independent uniformly random points on the circle, and then pair them up randomly.
4.51
Stat110 solution available.
A hash table is being used to store the phone numbers of people, storing each person’s phone number in a uniformly random location, represented by an integer between 1 and (see Exercise 27 from Chapter 1 for a description of hash tables). Find the expected number of locations with no phone numbers stored, the expected number with exactly one phone number, and the expected number with more than one phone number (should these quantities add up to ?).
4.52
A coin with probability of Heads is flipped times. The sequence of outcomes can be divided into runs (blocks of H’s or blocks of T’s), e.g., HHHTTHTTTH becomes HHH TT H TTT H, which has 5 runs. Find the expected number of runs.
Hint: Start by finding the expected number of tosses (other than the first) where the outcome is different from the previous one.
4.53
A coin with probability of Heads is flipped 4 times. Let be the number of occurrences of HH (for example, THHT has 1 occurrence and HHHH has 3 occurrences). Find and .
4.54
A population has people, with ID numbers from 1 to . Let be the value of some numerical variable for person , and
be the population average of the quantity. For example, if is the height of person then is the average height in the population, and if is 1 if person holds a certain belief and 0 otherwise, then is the proportion of people in the population who hold that belief. In this problem, are thought of as constants rather than random variables.
A researcher is interested in learning about , but it is not feasible to measure for all . Instead, the researcher gathers a random sample of size , by choosing people one at a time, with equal probabilities at each stage and without replacement. Let be the value of the numerical variable (e.g., height) for the th person in the sample. Even though are constants, is a random variable because of the random sampling. A natural way to estimate the unknown quantity is using
Show that in two different ways:
(a) by directly evaluating using symmetry;
(b) by showing that can be expressed as a sum over the population by writing
where is the indicator of person being included in the sample, and then using linearity and the fundamental bridge.
4.55
Stat110 solution available.
Consider the following algorithm, known as bubble sort, for sorting a list of distinct numbers into increasing order. Initially they are in a random order, with all orders equally likely. The algorithm compares the numbers in positions 1 and 2, and swaps them if needed, then it compares the new numbers in positions 2 and 3, and swaps them if needed, etc., until it has gone through the whole list. Call this one “sweep” through the list. After the first sweep, the largest number is at the end, so the second sweep (if needed) only needs to work with the first positions. Similarly, the third sweep (if needed) only needs to work with the first positions, etc. Sweeps are performed until sweeps have been completed or there is a swapless sweep.
For example, if the initial list is 53241 (omitting commas), then the following 4 sweeps are performed to sort the list, with a total of 10 comparisons:
53241 -> 35241 -> 32541 -> 32451 -> 32415.
32415 -> 23415 -> 23415 -> 23145.
23145 -> 23145 -> 21345.
21345 -> 12345.(a) An inversion is a pair of numbers that are out of order (e.g., 12345 has no inversions, while 53241 has 8 inversions). Find the expected number of inversions in the original list.
(b) Show that the expected number of comparisons is between and .
Hint: For one bound, think about how many comparisons are made if sweeps are done; for the other bound, use Part (a).
4.56
A certain basketball player practices shooting free throws over and over again. The shots are independent, with probability of success.
(a) In shots, what is the expected number of streaks of 7 consecutive successful shots? (Note that, for example, 9 in a row counts as 3 streaks.)
(b) Now suppose that the player keeps shooting until making 7 shots in a row for the first time. Let be the number of shots taken. Show that .
Hint: Consider the first 7 trials as a block, then the next 7 as a block, etc.
4.57
Stat110 solution available.
An urn contains red, green, and blue balls. Balls are chosen randomly with replacement (each time, the color is noted and then the ball is put back). Let be the probabilities of drawing a red, green, blue ball, respectively ().
(a) Find the expected number of balls chosen before obtaining the first red ball, not including the red ball itself.
(b) Find the expected number of different colors of balls obtained before getting the first red ball.
(c) Find the probability that at least 2 of balls drawn are red, given that at least 1 is red.
4.58
Stat110 solution available.
Job candidates are interviewed one by one, and the interviewer compares them and keeps an updated list of rankings (if candidates have been interviewed so far, this is a list of the candidates, from best to worst). Assume that there is no limit on the number of candidates available, that for any the candidates are equally likely to arrive in any order, and that there are no ties in the rankings given by the interview.
Let be the index of the first candidate to come along who ranks as better than the very first candidate (so is better than , but the candidates after 1 but prior to (if any) are worse than . For example, if and are worse than but is better than , then . All orderings of the first 4 candidates are equally likely, so it could have happened that the first candidate was the best out of the first 4 candidates, in which case .
What is (which is a measure of how long, on average, the interviewer needs to wait to find someone better than the very first candidate)?
Hint: Find by interpreting what says about how compares with other candidates, and then apply the result of Theorem 4.4.8.
4.59
People are arriving at a party one at a time. While waiting for more people to arrive they entertain themselves by comparing their birthdays. Let be the number of people needed to obtain a birthday match, i.e., before person arrives there are no two people with the same birthday, but when person arrives there is a match.
Assume for this problem that there are 365 days in a year, all equally likely. By the result of the birthday problem from Chapter 1, for 23 people there is a 50.7% chance of a birthday match (and for 22 people there is a less than 50% chance). But this has to do with the median of (defined below); we also want to know the mean of , and in this problem we will find it, and see how it compares with 23.
(a) A median of a random variable is a value for which and (this is also called a median of the distribution of ; note that the notion is completely determined by the CDF of ). Every distribution has a median, but for some distributions it is not unique. Show that 23 is the unique median of .
(b) Show that , where is the indicator r.v. for the event . Then find in terms of ‘s defined by and for ,
(c) Compute numerically. In R, the pithy command cumprod(1-(0:364)/365) produces the vector .
(d) Find the variance of , both in terms of the ‘s and numerically.
Hint: What is , and what is for ? Use this to simplify the expansion
Note: In addition to being an entertaining game for parties, the birthday problem has many applications in computer science, such as in a method called the birthday attack in cryptography. It can be shown that if there are days in a year and is large, then
4.60
Elk dwell in a certain forest. There are elk, of which a simple random sample of size is captured and tagged (so all sets of elk are equally likely). The captured elk are returned to the population, and then a new sample is drawn. This is an important method that is widely used in ecology, known as capture-recapture. If the new sample is also a simple random sample, with some fixed size, then the number of tagged elk in the new sample is Hypergeometric.
For this problem, assume that instead of having a fixed sample size, elk are sampled one by one without replacement until tagged elk have been recaptured, where is specified in advance (of course, assume that ). An advantage of this sampling method is that it can be used to avoid ending up with a very small number of tagged elk (maybe even zero), which would be problematic in many applications of capture-recapture. A disadvantage is not knowing how large the sample will be.
(a) Find the PMFs of the number of untagged elk in the new sample (call this ) and of the total number of elk in the new sample (call this ).
(b) Find the expected sample size using symmetry, linearity, and indicator r.v.s.
(c) Suppose that are such that is an integer. If the sampling is done with a fixed sample size equal to rather than sampling until exactly tagged elk are obtained, find the expected number of tagged elk in the sample. Is it less than , equal to , or greater than (for )?
LOTUS
4.61
Stat110 solution available.
For , find (the average factorial of ), if it is finite.
4.62
For , find , if it is finite.
4.63
For , find (if it is finite) and (if it is finite). For each, make sure to clearly state what the values of are for which it is finite.
4.64
Stat110 solution available.
Let and let be a constant. Find , as a function of (this is known as the moment generating function; we will see in Chapter 6 how this function is useful).
4.65
Stat110 solution available.
The number of fish in a certain lake is a random variable. Worried that there might be no fish at all, a statistician adds one fish to the lake. Let be the resulting number of fish (so is 1 plus a random variable).
(a) Find .
(b) Find .
4.66
Stat110 solution available.
Let be a random variable, where is fixed but unknown. Let , and suppose that we are interested in estimating based on the data. Since is what we observe, our estimator is a function of , call it . The bias of the estimator is defined to be , i.e., how far off the estimate is on average; the estimator is unbiased if its bias is 0.
(a) For estimating , the r.v. itself is an unbiased estimator. Compute the bias of the estimator . Is it unbiased for estimating ?
(b) Show that is an unbiased estimator for . (In fact, it turns out to be the only unbiased estimator for .)
(c) Explain intuitively why is a silly choice for estimating , despite (b), and show how to improve it by finding an estimator for that is always at least as good as and sometimes strictly better than . That is,
with the inequality sometimes strict.
Poisson approximation
4.67
Stat110 solution available.
Law school courses often have assigned seating to facilitate the Socratic method. Suppose that there are 100 first-year law students, and each takes the same two courses: Torts and Contracts. Both are held in the same lecture hall (which has 100 seats), and the seating is uniformly random and independent for the two courses.
(a) Find the probability that no one has the same seat for both courses (exactly; you should leave your answer as a sum).
(b) Find a simple but accurate approximation to the probability that no one has the same seat for both courses.
(c) Find a simple but accurate approximation to the probability that at least two students have the same seat for both courses.
4.68
Stat110 solution available.
A group of people play “Secret Santa” as follows: each puts their name on a slip of paper in a hat, picks a name randomly from the hat (without replacement), and then buys a gift for that person. Unfortunately, they overlook the possibility of drawing one’s own name, so some may have to buy gifts for themselves (on the bright side, some may like self-selected gifts better). Assume .
(a) Find the expected value of the number of people who pick their own names.
(b) Find the expected number of pairs of people, A and B, such that A picks B’s name and B picks A’s name (where and order doesn’t matter).
(c) What is the approximate distribution of if is large (specify the parameter value or values)? What does converge to as ?
4.69
A survey is being conducted in a city with a million () people. A sample of size 1000 is collected by choosing people in the city at random, with replacement and with equal probabilities for everyone in the city. Find a simple, accurate approximation to the probability that at least one person will get chosen more than once (in contrast, Exercise 26 from Chapter 1 asks for an exact answer).
Hint: Indicator r.v.s are useful here, but creating 1 indicator for each of the million people is not recommended since it leads to a messy calculation. Feel free to use the fact that .
4.70
Stat110 solution available.
Ten million people enter a certain lottery. For each person, the chance of winning is one in ten million, independently.
(a) Find a simple, good approximation for the PMF of the number of people who win the lottery.
(b) Congratulations! You won the lottery. However, there may be other winners. Assume now that the number of winners other than you is , and that if there is more than one winner, then the prize is awarded to one randomly chosen winner. Given this information, find the probability that you win the prize (simplify).
4.71
In a group of 90 people, find a simple, good approximation for the probability that there is at least one pair of people such that they share a birthday and their biological mothers share a birthday. Assume that no one among the 90 people is the biological mother of another one of the 90 people, nor do two of the 90 people have the same biological mother. Express your answer as a fully simplified fraction in the form , where and are positive integers and .
Make the usual assumptions as in the birthday problem. To simplify the calculation, you can use the approximations and , and the fact that for .
4.72
Use Poisson approximations to investigate the following types of coincidences. The usual assumptions of the birthday problem apply.
(a) How many people are needed to have a 50% chance that at least one of them has the same birthday as you?
(b) How many people are needed to have a 50% chance that there is at least one pair of people who not only were born on the same day of the year, but also were born at the same hour (e.g., two people born between 2 pm and 3 pm are considered to have been born at the same hour)?
(c) Considering that only of pairs of people born on the same day were born at the same hour, why isn’t the answer to (b) approximately ?
(d) With 100 people, there is a 64% chance that there is at least one set of 3 people with the same birthday (according to R, using pbirthday(100,classes=365,coincident=3) to compute it). Provide two different Poisson approximations for this value, one based on creating an indicator r.v. for each triplet of people, and the other based on creating an indicator r.v. for each day of the year. Which is more accurate?
4.73
A chess tournament has 100 players. In the first round, they are randomly paired to determine who plays whom (so 50 games are played). In the second round, they are again randomly paired, independently of the first round. In both rounds, all possible pairings are equally likely. Let be the number of people who play against the same opponent twice.
(a) Find the expected value of .
(b) Explain why is not approximately Poisson.
(c) Find good approximations to and , by thinking about games in the second round such that the same pair played each other in the first round.
*Existence
4.74
Stat110 solution available.
Each of 111 people names their 5 favorite movies out of a list of 11 movies.
(a) Alice and Bob are 2 of the 111 people. Assume for this part only that Alice’s 5 favorite movies out of the 11 are random, with all sets of 5 equally likely, and likewise for Bob, independently. Find the expected number of movies in common to Alice’s and Bob’s lists of favorite movies.
(b) Show that there are 2 movies such that at least 21 of the people name both of these movies as favorites.
4.75
Stat110 solution available.
The circumference of a circle is colored with red and blue ink such that of the circumference is red and is blue. Prove that no matter how complicated the coloring scheme is, there is a way to inscribe a square in the circle such that at least three of the four corners of the square touch red ink.
4.76
Stat110 solution available.
A hundred students have taken an exam consisting of 8 problems, and for each problem at least 65 of the students got the right answer. Show that there exist two students who collectively got everything right, in the sense that for each problem, at least one of the two got it right.
4.77
Stat110 solution available.
Ten points in the plane are designated. You have ten circular coins (of the same radius). Show that you can position the coins in the plane (without stacking them) so that all ten points are covered.
Hint: Consider a honeycomb tiling of the plane (this is a way to divide the plane into hexagons). You can use the fact from geometry that if a circle is inscribed in a hexagon then the ratio of the area of the circle to the area of the hexagon is .
4.78
Stat110 solution available.
Let be a set of binary strings of length (where juxtaposition means concatenation). We call -complete if for any indices and any binary string of length , there is a string in such that . For example, for , the set is 2-complete since all 4 patterns of 0’s and 1’s of length 2 can be found in any 2 positions. Show that if
then there exists a -complete set of size at most .
Mixed practice
4.79
A hacker is trying to break into a password-protected website by randomly trying to guess the password. Let be the number of possible passwords.
(a) Suppose for this part that the hacker makes random guesses (with equal probability), with replacement. Find the average number of guesses it will take until the hacker guesses the correct password (including the successful guess).
(b) Now suppose that the hacker guesses randomly, without replacement. Find the average number of guesses it will take until the hacker guesses the correct password (including the successful guess).
Hint: Use symmetry.
(c) Show that the answer to (a) is greater than the answer to (b) (except in the degenerate case ), and explain why this makes sense intuitively.
(d) Now suppose that the website locks out any user after incorrect password attempts, so the hacker can guess at most times. Find the PMF of the number of guesses that the hacker makes, both for the case of sampling with replacement and for the case of sampling without replacement.
4.80
A fair 20-sided die is rolled repeatedly, until a gambler decides to stop. The gambler receives the amount shown on the die when the gambler stops. The gambler decides in advance to roll the die until a value of or greater is obtained, and then stop (where is a fixed integer with ).
(a) What is the expected number of rolls (simplify)?
(b) What is the expected square root of the number of rolls (as a sum)?
4.81
Stat110 solution available.
A group of 360 people is going to be split into 120 teams of 3 (where the order of teams and the order within a team don’t matter).
(a) How many ways are there to do this?
(b) The group consists of 180 married couples. A random split into teams of 3 is chosen, with all possible splits equally likely. Find the expected number of teams containing married couples.
4.82
Stat110 solution available.
The gambler de Mere asked Pascal whether it is more likely to get at least one six in 4 rolls of a die, or to get at least one double-six in 24 rolls of a pair of dice. Continuing this pattern, suppose that a group of fair dice is rolled times.
(a) Find the expected number of times that “all sixes” is achieved (i.e., how often among the rolls it happens that all dice land 6 simultaneously).
(b) Give a simple but accurate approximation of the probability of having at least one occurrence of “all sixes”, for large (in terms of but not ).
(c) de Mere finds it tedious to re-roll so many dice. So after one normal roll of the dice, in going from one roll to the next, with probability he leaves the dice in the same configuration and with probability he re-rolls. For example, if and the 7th roll is , then of the time the 8th roll remains and of the time the 8th roll is a new random outcome. Does the expected number of times that “all sixes” is achieved stay the same, increase, or decrease (compared with (a))? Give a short but clear explanation.
4.83
Stat110 solution available.
Five people have just won a $100 prize, and are deciding how to divide the $100 up between them. Assume that whole dollars are used, not cents. Also, for example, giving $50 to the first person and $10 to the second is different from vice versa.
(a) How many ways are there to divide up the $100, such that each gets at least $10?
(b) Assume that the $100 is randomly divided up, with all of the possible allocations counted in (a) equally likely. Find the expected amount of money that the first person receives.
(c) Let be the event that the th person receives more than the first person (for ), when the $100 is randomly allocated as in (b). Are and independent?
4.84
Stat110 solution available.
Joe’s iPod has 500 different songs, consisting of 50 albums of 10 songs each. He listens to 11 random songs on his iPod, with all songs equally likely and chosen independently (so repetitions may occur).
(a) What is the PMF of how many of the 11 songs are from his favorite album?
(b) What is the probability that there are 2 (or more) songs from the same album among the 11 songs he listens to?
(c) A pair of songs is a match if they are from the same album. If, say, the 1st, 3rd, and 7th songs are all from the same album, this counts as 3 matches. Among the 11 songs he listens to, how many matches are there on average?
4.85
Stat110 solution available.
Each day that the Mass Cash lottery is run in Massachusetts, 5 of the integers from 1 to 35 are chosen (randomly and without replacement).
(a) When playing this lottery, find the probability of guessing exactly 3 numbers right, given that you guess at least 1 of the numbers right.
(b) Find an exact expression for the expected number of days needed so that all of the possible lottery outcomes will have occurred.
(c) Approximate the probability that after 50 days of the lottery, every number from 1 to 35 has been picked at least once.
4.86
A certain country has three political parties, denoted by A, B, and C. Each adult in the country is a member of exactly one of the three parties. There are adults in the country, consisting of members of party A, members of party B, and members of party C, where are positive integers with .
A simple random sample of size is chosen from the adults in the country (the sampling is done without replacement, and all possible samples of size are equally likely). Let be the number of members of party A in the sample, be the number of members of party B in the sample, and be the number of members of party C in the sample.
(a) Find , for nonnegative integers with .
(b) Find .
(c) Find , and briefly explain why your answer makes sense in the extreme cases and .
4.87
The U.S. Senate consists of 100 senators, with 2 from each of the 50 states. There are Democrats in the Senate. A committee of size is formed, by picking a random set of senators such that all sets of size are equally likely.
(a) Find the expected number of Democrats on the committee.
(b) Find the expected number of states represented on the committee (by at least one senator).
(c) Find the expected number of states such that both of the state’s senators are on the committee.
(d) Each state has a junior senator and a senior senator (based on which of them has served longer). A committee of size 20 is formed randomly, with all sets of 20 senators equally likely. Find the distribution of the number of junior senators on the committee, and the expected number of junior senators on the committee.
(e) For the committee from (d), find the expected number of states such that both senators from that state are on the committee.
4.88
A certain college has good courses and bad courses, where and are positive integers. Alice, who is hoping to find a good course, randomly shops courses one at a time (without replacement) until she finds a good course.
(a) Find the expected number of bad courses that Alice shops before finding a good course (as a simple expression in terms of and ).
(b) Should the answer to (a) be less than, equal to, or greater than ? Explain this using properties of the Geometric distribution.
4.89
A DNA sequence can be represented as a sequence of letters, where the alphabet has 4 letters: A, C, T, G. Suppose such a sequence is generated randomly, where the letters are independent and the probabilities of A, C, T, G are , respectively.
(a) In a DNA sequence of length 115, what is the variance of the number of occurrences of the letter C?
(b) In a DNA sequence of length 115, what is the expected number of occurrences of the expression CATCAT? Note that, for example, the expression CATCATCAT counts as 2 occurrences.
(c) In a DNA sequence of length 6, what is the probability that the expression CAT occurs at least once?
4.90
Alice is conducting a survey in a town with population size 1000. She selects a simple random sample of size 100 (i.e., sampling without replacement, such that all samples of size 100 are equally likely). Bob is also conducting a survey in this town. Bob selects a simple random sample of size 20, independent of Alice’s sample. Let be the set of people in Alice’s sample and be the set of people in Bob’s sample.
(a) Find the expected number of people in .
(b) Find the expected number of people in .
(c) The 1000 people consist of 500 married couples. Find the expected number of couples such that both members of the couple are in Bob’s sample.
4.91
The Wilcoxon rank sum test is a widely used procedure for assessing whether two groups of observations come from the same distribution. Let group 1 consist of i.i.d. with CDF and group 2 consist of i.i.d. with CDF , with all of these r.v.s independent. Assume that the probability of 2 of the observations being equal is 0 (this will be true if the distributions are continuous).
After the observations are obtained, they are listed in increasing order, and each is assigned a rank between 1 and : the smallest has rank 1, the second smallest has rank 2, etc. Let be the rank of among all the observations for , and let be the sum of the ranks for group 1.
Intuitively, the Wilcoxon rank sum test is based on the idea that a very large value of is evidence that observations from group 1 are usually larger than observations from group 2 (and vice versa if is very small). But how large is “very large” and how small is “very small”? Answering this precisely requires studying the distribution of the test statistic .
(a) The null hypothesis in this setting is that . Show that if the null hypothesis is true, then .
(b) The power of a test is an important measure of how good the test is about saying to reject the null hypothesis if the null hypothesis is false. To study the power of the Wilcoxon rank sum test, we need to study the distribution of in general. So for this part, we do not assume . Let . Find in terms of .
Hint: Write in terms of indicator r.v.s for being greater than various other r.v.s.
4.92
The legendary Caltech physicist Richard Feynman and two editors of The Feynman Lectures on Physics (Michael Gottlieb and Ralph Leighton) posed the following problem about how to decide what to order at a restaurant. You plan to eat meals at a certain restaurant, where you have never eaten before. Each time, you will order one dish.
The restaurant has dishes on the menu, with . Assume that if you had tried all the dishes, you would have a definite ranking of them from 1 (your least favorite) to (your favorite). If you knew which your favorite was, you would be happy to order it always (you never get tired of it).
Before you’ve eaten at the restaurant, this ranking is completely unknown to you. After you’ve tried some dishes, you can rank those dishes amongst themselves, but don’t know how they compare with the dishes you haven’t yet tried. There is thus an exploration-exploitation tradeoff: should you try new dishes, or should you order your favorite among the dishes you have tried before?
A natural strategy is to have two phases in your series of visits to the restaurant: an exploration phase, where you try different dishes each time, and an exploitation phase, where you always order the best dish you obtained in the exploration phase. Let be the length of the exploration phase (so is the length of the exploitation phase). Your goal is to maximize the expected sum of the ranks of the dishes you eat there (the rank of a dish is the “true” rank from 1 to that you would give that dish if you could try all the dishes). Show that the optimal choice is
or this rounded up or down to an integer if needed. Do this in the following steps:
(a) Let be the rank of the best dish that you find in the exploration phase. Find the expected sum of the ranks of all the dishes you eat (including both phases), in terms of , , and .
(b) Find the PMF of , as a simple expression in terms of binomial coefficients.
(c) Show that
Hint: Use Example 1.5.2 (about the team captain) and Exercise 20 from Chapter 1 (about the hockey stick identity).
(d) Use calculus to find the optimal value of .