Transformations


Problems

Exercises marked "Stat110 solution available" have detailed solutions at http://stat110.net.

Change of Variables

8.1

Find the PDF of for .


8.2

Find the PDF of for .


8.3

Find the PDF of for .


8.4

Stat110 solution available.

Find the PDF of for .


8.5

Find the PDF of for .


8.6

Stat110 solution available.

Let . Find the PDFs of and .


8.7

Let . Find the PDF of .


8.8

(a) Find the distribution of for .

(b) Find the distribution of for .


8.9

Let and let and be constants with . Find a simple transformation of that yields an r.v. that equals with probability and equals with probability .


8.10

Let and be the indicator of being odd. Find the PMF of .

Hint: Find by writing and as series and then using the fact that is 1 if is even and if is odd.


8.11

Let be a continuous r.v. and . Show that their CDFs are related as follows:


8.12

Let $T$ be the ratio $X/Y$ of two i.i.d. $\mathcal{N}(0,1)$ r.v.s $X, Y$. This is the Cauchy distribution and, as shown in Example 7.1.25, it has PDF

$$f_T(t) = \frac{1}{\pi(1+t^2)}.$$

(a) Use the result of the previous problem to find the CDF of $1/T$. Then use calculus to find the PDF of $1/T$. (Note that the one-dimensional change of variables formula does not apply directly, since the function $g(t) = 1/t$, even though it has $g'(t) < 0$ for all $t \neq 0$, is undefined at $t = 0$ and is not a strictly decreasing function on its domain.)

(b) Show that $1/T$ has the same distribution as $T$ without using calculus, in 140 characters or fewer.
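Before attempting part (b), a simulation can make the claim plausible. The sketch below assumes the setup is the ratio $T = X/Y$ of i.i.d. standard Normals and compares the empirical CDFs of $T$ and $1/T$ at a few points. The function names, sample size, and evaluation points are illustrative choices, not part of the exercise.

```python
import random

def cauchy_ratio_samples(n, seed=0):
    """Draw n samples of T = X / Y, with X and Y i.i.d. standard Normal."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Skip the probability-zero cases where T or 1/T would be undefined.
        if x != 0.0 and y != 0.0:
            samples.append(x / y)
    return samples

def empirical_cdf(samples, point):
    """Fraction of samples that are <= point."""
    return sum(s <= point for s in samples) / len(samples)

if __name__ == "__main__":
    t_samples = cauchy_ratio_samples(200_000)
    reciprocal_samples = [1.0 / t for t in t_samples]
    # The two columns of estimates should agree up to simulation noise.
    for point in (-3.0, -1.0, -0.25, 0.5, 2.0):
        print(point,
              round(empirical_cdf(t_samples, point), 3),
              round(empirical_cdf(reciprocal_samples, point), 3))
```

Agreement of the two columns is evidence, not a proof; the exercise still asks for a short argument.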


8.13

Let and be i.i.d. , and . Find the CDF and PDF of .


8.14

Let and have joint PDF , and transform linearly by letting

where are constants such that .

(a) Find the joint PDF (in terms of , though your answer should be written as a function of and ).

(b) For the case where , , show that


8.15

Stat110 solution available.

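Let $X, Y$ be continuous r.v.s with a spherically symmetric joint distribution, which means that the joint PDF is of the form $f(x,y) = g(x^2+y^2)$ for some function $g$. Let $(R,\theta)$ be the polar coordinates of $(X,Y)$, so $R^2 = X^2+Y^2$ is the squared distance from the origin and $\theta$ is the angle (in $[0,2\pi)$), with $X = R\cos\theta$, $Y = R\sin\theta$.

(a) Explain intuitively why $R$ and $\theta$ are independent. Then prove this by finding the joint PDF of $(R,\theta)$.

(b) What is the joint PDF of $(R,\theta)$ when $(X,Y)$ is Uniform in the unit disk $\{(x,y): x^2+y^2 \leq 1\}$?

(c) What is the joint PDF of $(R,\theta)$ when $X$ and $Y$ are i.i.d. $\mathcal{N}(0,1)$?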


8.16

Let $X$ and $Y$ be i.i.d. $\mathcal{N}(0,1)$ r.v.s, $T = X+Y$, and $W = X-Y$. We know from Example 7.5.8 that $T$ and $W$ are independent $\mathcal{N}(0,2)$ r.v.s (note that $(T,W)$ is Multivariate Normal with $\text{Cov}(T,W) = 0$). Give another proof of this fact, using the change of variables theorem.


8.17

Let and be i.i.d. r.v.s, and be the polar coordinates for the point , so and with and . Find the joint PDF of and . Also find the marginal distributions of and , giving their names (and parameters) if they are distributions we have studied before.


8.18

Let and be independent positive r.v.s, with PDFs and , respectively. Let be the ratio .

(a) Find the joint PDF of and , using a Jacobian.

(b) Find the marginal PDF of , as a single integral.


8.19

Let and be i.i.d. , and transform them to , .

(a) Find the joint PDF of and . Are they independent?

(b) Find the marginal PDFs of and .


Convolutions

8.20

Let and , independently. Find the PDF of .


8.21

Let and be i.i.d. . Use a convolution integral to show that the PDF of is for all real ; this is known as the Laplace distribution.


8.22

Use a convolution integral to show that if $X \sim \mathcal{N}(\mu_1,\sigma^2)$ and $Y \sim \mathcal{N}(\mu_2,\sigma^2)$ are independent, then $T = X + Y \sim \mathcal{N}(\mu_1+\mu_2, 2\sigma^2)$ (to simplify the calculation, we are assuming that the variances are equal). You can use a standardization (location-scale) idea to reduce to the standard Normal case before setting up the integral.

Hint: Complete the square.


8.23

Stat110 solution available.

Let $X$ and $Y$ be independent positive r.v.s, with PDFs $f_X$ and $f_Y$, respectively, and consider the product $T = XY$. When asked to find the PDF of $T$, Jacobno argues that: “It’s like a convolution, with a product instead of a sum. To have $T = t$ we need $X = x$ and $Y = t/x$ for some $x$; that has probability $f_X(x) f_Y(t/x)$, so summing up these possibilities we get that the PDF of $T$ is $\int_0^\infty f_X(x) f_Y(t/x)\,dx$.”

Evaluate Jacobno’s argument, while getting the PDF of $T$ (as an integral) in 2 ways:

(a) using the continuous version of the law of total probability to get the CDF, and then taking the derivative (you can assume that swapping the derivative and integral is valid);

(b) by taking the log of both sides of $T = XY$ and doing a convolution (and then converting back to get the PDF of $T$).
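One quick sanity check for any proposed PDF is whether it integrates to 1. The sketch below applies that check numerically to a candidate of the form $\int f_X(x) f_Y(t/x)\,dx$, under the illustrative assumption that $X$ and $Y$ are both Unif(0,1) (a choice made here only because it is a simple pair of positive r.v.s). The helper names and the midpoint-rule step counts are hypothetical.

```python
def unif_pdf(x):
    """PDF of Unif(0, 1), used here as an illustrative choice for f_X and f_Y."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def candidate_pdf(t, f_x, f_y, upper=1.0, steps=1000):
    """Midpoint-rule value of the integral of f_x(x) * f_y(t / x) over 0 < x < upper."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoints are strictly positive, so t / x is defined
        total += f_x(x) * f_y(t / x)
    return total * h

def total_mass(pdf, upper=1.0, steps=200):
    """Midpoint-rule integral of pdf(t) over 0 < t < upper."""
    h = upper / steps
    return sum(pdf((i + 0.5) * h) for i in range(steps)) * h

if __name__ == "__main__":
    # For Unif(0, 1) factors the product lies in (0, 1), so upper=1.0 covers the support.
    mass = total_mass(lambda t: candidate_pdf(t, unif_pdf, unif_pdf))
    print("total mass of the candidate:", mass)  # compare against 1
```

A total mass different from 1 shows that something is missing from the candidate, but it does not say what; parts (a) and (b) supply the derivation.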


8.24

Let and be i.i.d. Discrete Uniform r.v.s on , where is a positive integer. Find the PMF of .

Hint: In finding , it helps to consider the cases and separately. Be careful about the range of summation in the convolution sum.


8.25

Let $X$ and $Y$ be i.i.d. $\text{Unif}(0,1)$, and let $W = X - Y$.

(a) Find the mean and variance of $W$, without yet deriving the PDF.

(b) Show that the distribution of $W$ is symmetric about 0, without yet deriving the PDF.

(c) Find the PDF of $W$.

(d) Use the PDF of $W$ to verify your results from (a) and (b).

(e) How does the distribution of $W$ relate to the distribution of $X + Y$, the Triangle distribution derived in Example 8.2.5? Give a precise description, e.g., using the concepts of location and scale.


8.26

Let $X$ and $Y$ be i.i.d. $\text{Unif}(0,1)$, and $T = X + Y$. We derived the distribution of $T$ (a Triangle distribution) in Example 8.2.5, using a convolution integral. Since $(X,Y)$ is Uniform in the unit square $\{(x,y): 0 < x < 1, 0 < y < 1\}$, we can also interpret $P((X,Y) \in A)$ as the area of $A$, for any region $A$ within the unit square. Use this idea to find the CDF of $T$, by interpreting the CDF (evaluated at some point) as an area.
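The area interpretation can be checked by simulation: the fraction of uniformly scattered points in the unit square that satisfy $x + y \leq t$ approximates the area of that region, which is the CDF of the sum at $t$. The sketch below assumes $X$ and $Y$ are i.i.d. Unif(0,1); the function name and sample size are illustrative.

```python
import random

def area_fraction(t, n=200_000, seed=0):
    """Estimate P(X + Y <= t) as the fraction of uniform points (x, y)
    in the unit square that land in the region x + y <= t."""
    rng = random.Random(seed)
    hits = sum(rng.random() + rng.random() <= t for _ in range(n))
    return hits / n

if __name__ == "__main__":
    # Compare these estimates with the areas you compute geometrically.
    for t in (0.25, 0.5, 1.0, 1.5, 1.75):
        print(t, round(area_fraction(t), 3))
```

Sketching the region for a value of $t$ below 1 and for one above 1 shows why the geometric answer has two cases.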


8.27

Let $X, Y, Z$ be i.i.d. $\text{Unif}(0,1)$, and $W = X + Y + Z$. Find the PDF of $W$.

Hint: We already know the PDF of $X + Y$. Be careful about limits of integration in the convolution integral; there are 3 cases that should be considered separately.


Beta and Gamma

8.28

Stat110 solution available.

Let $B \sim \text{Beta}(a,b)$. Find the distribution of $1 - B$ in two ways: (a) using a change of variables and (b) using a story proof. Also explain why the result makes sense in terms of Beta being the conjugate prior for the Binomial.


8.29

Stat110 solution available.

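Let $X \sim \text{Gamma}(a,\lambda)$ and $Y \sim \text{Gamma}(b,\lambda)$ be independent, with $a$ and $b$ integers. Show that $X + Y \sim \text{Gamma}(a+b,\lambda)$ in three ways: (a) with a convolution integral; (b) with MGFs; (c) with a story proof.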
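A simulation can serve as a check on the target of the three proofs. The sketch below assumes the claim is that the sum of independent Gamma($a$, $\lambda$) and Gamma($b$, $\lambda$) r.v.s is Gamma($a+b$, $\lambda$), and compares the sample mean and variance of the sum with those of direct Gamma($a+b$, $\lambda$) draws. The parameter values and helper names are illustrative. Note that Python's `random.gammavariate` takes a scale parameter, so the rate $\lambda$ enters as $1/\lambda$.

```python
import random

def gamma_sum_samples(a, b, lam, n, seed=0):
    """Samples of X + Y with X ~ Gamma(a, lam), Y ~ Gamma(b, lam) independent (lam is a rate)."""
    rng = random.Random(seed)
    scale = 1.0 / lam  # gammavariate(alpha, beta) uses beta as a scale, not a rate
    return [rng.gammavariate(a, scale) + rng.gammavariate(b, scale) for _ in range(n)]

def gamma_samples(shape, lam, n, seed=1):
    """Samples drawn directly from Gamma(shape, lam), with lam a rate."""
    rng = random.Random(seed)
    return [rng.gammavariate(shape, 1.0 / lam) for _ in range(n)]

def mean_and_variance(samples):
    """Sample mean and unbiased sample variance."""
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
    return m, v

if __name__ == "__main__":
    a, b, lam = 2, 3, 2.0
    print("sum   :", mean_and_variance(gamma_sum_samples(a, b, lam, 100_000)))
    print("direct:", mean_and_variance(gamma_samples(a + b, lam, 100_000)))
```

Matching the first two moments does not establish equality in distribution, which is why the exercise asks for a convolution, an MGF argument, and a story.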


8.30

Let $B \sim \text{Beta}(a,b)$. Use integration by pattern recognition to find $E(B^k)$ for positive integers $k$. In particular, show that

$$\text{Var}(B) = \frac{ab}{(a+b)^2(a+b+1)}.$$


8.31

Stat110 solution available.

Fred waits $X \sim \text{Gamma}(a,\lambda)$ minutes for the bus to work, and then waits $Y \sim \text{Gamma}(b,\lambda)$ for the bus going home, with $X$ and $Y$ independent. Is the ratio $X/Y$ independent of the total wait time $X + Y$?


8.32

Stat110 solution available.

The -test is a very widely used statistical test based on the distribution, which is the distribution of with , . Find the distribution of for .


8.33

Stat110 solution available.

Customers arrive at the Leftorium store according to a Poisson process with rate customers per hour. The true value of is unknown, so we treat it as a random variable. Suppose that our prior beliefs about can be expressed as . Let be the number of customers who arrive at the Leftorium between 1 pm and 3 pm tomorrow. Given that is observed, find the posterior PDF of .


8.34

Stat110 solution available.

Let and be independent, positive r.v.s with finite expected values.

(a) Give an example where , computing both sides exactly.

Hint: Start by thinking about the simplest examples you can think of!

(b) If and are i.i.d., then is it necessarily true that ?

(c) Now let and . Show without using calculus that

for every real .


8.35

Let , with and . So has a Weibull distribution, as discussed in Example 6.5.5. Using LOTUS and the definition of the gamma function, we showed in this chapter that for ,

for all real . Use this result to show that


8.36

Alice walks into a post office with 2 clerks. Both clerks are in the midst of serving customers, but Alice is next in line. The clerk on the left takes an time to serve a customer, and the clerk on the right takes an time to serve a customer. Let be the time until the clerk on the left is done serving their current customer, and define likewise for the clerk on the right.

(a) If , is independent of ?

Hint: Note that .

(b) Find (do not assume here or in the next part, but do check that your answers make sense in that special case).

(c) Find the expected total amount of time that Alice spends in the post office (assuming that she leaves immediately after she is done being served).


8.37

Let $X \sim \text{Pois}(\lambda t)$ and $Y \sim \text{Gamma}(j,\lambda)$, where $j$ is a positive integer. Show using a story about a Poisson process that

$$P(X \geq j) = P(Y \leq t).$$


8.38

Visitors arrive at a certain scenic park according to a Poisson process with rate visitors per hour. Fred has just arrived (independent of anyone else), and will stay for an number of hours. Find the distribution of the number of other visitors who arrive at the park while Fred is there.


8.39

(a) Let , where and are positive real numbers. Find , fully simplified ( should not appear in your final answer).

Two teams, and , have an upcoming match. They will play five games and the winner will be declared to be the team that wins the majority of games. Given , the outcomes of games are independent, with probability of team winning and of team winning. But you don’t know , so you decide to model it as an r.v., with a priori (before you have observed any data).

To learn more about , you look through the historical records of previous games between these two teams, and find that the previous outcomes were, in chronological order, AAABBAABAB. (Assume that the true value of has not been changing over time and will be the same for the match, though your beliefs about may change over time.)

(b) Does your posterior distribution for , given the historical record of games between and , depend on the specific order of outcomes or only on the fact that won exactly 6 of the 10 games on record? Explain.

(c) Find the posterior distribution for , given the historical data.

The posterior distribution for from (c) becomes your new prior distribution, and the match is about to begin!

(d) Conditional on , is the indicator of winning the first game of the match positively correlated with, uncorrelated with, or negatively correlated with the indicator of winning the second game? What about if we only condition on the historical data?

(e) Given the historical data, what is the expected value for the probability that the match is not yet decided when going into the fifth game (viewing this probability as an r.v. rather than a number, to reflect our uncertainty about it)?


8.40

An engineer is studying the reliability of a product by performing a sequence of trials. Reliability is defined as the probability of success. In each trial, the product succeeds with probability and fails with probability . The trials are conditionally independent given . Here is unknown (else the study would be unnecessary!). The engineer takes a Bayesian approach, with as prior.

Let be a desired reliability level and be the corresponding confidence level, in the sense that, given the data, the probability is that the true reliability is at least . For example, if , , we can be 95% sure, given the data, that the product is at least 90% reliable. Suppose that it is observed that the product succeeds all times. Find a simple equation for as a function of .


Order Statistics

8.41

Stat110 solution available.

Let $X \sim \text{Bin}(n,p)$ and $B \sim \text{Beta}(j, n-j+1)$, where $n$ is a positive integer and $j$ is a positive integer with $j \leq n$. Show using a story about order statistics that

$$P(X \geq j) = P(B \leq p).$$

This shows that the CDF of the continuous r.v. $B$ is closely related to the CDF of the discrete r.v. $X$, and is another connection between the Beta and Binomial.


8.42

Show that for i.i.d. continuous r.v.s ,


8.43

Show that

without using calculus, for all and positive integers with .


8.44

Let $X_1, \dots, X_n$ be i.i.d. continuous r.v.s with PDF $f$ and a strictly increasing CDF $F$. Suppose that we know that the $j$th order statistic of $n$ i.i.d. $\text{Unif}(0,1)$ r.v.s is a $\text{Beta}(j, n-j+1)$, but we have forgotten the formula and derivation for the distribution of the $j$th order statistic of $X_1, \dots, X_n$. Show how we can recover the PDF of $X_{(j)}$ quickly using a change of variables.


8.45

Stat110 solution available.

Let $X$ and $Y$ be independent $\text{Expo}(\lambda)$ r.v.s and $M = \max(X,Y)$. Show that $M$ has the same distribution as $X + \frac{1}{2}Y$, in two ways: (a) using calculus and (b) by remembering the memoryless property and other properties of the Exponential.
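A simulation gives a quick plausibility check before either proof. The sketch below assumes the exercise concerns independent Expo($\lambda$) r.v.s $X$ and $Y$ and compares the empirical CDF of $\max(X,Y)$ with that of $X + \frac{1}{2}Y$, using fresh independent draws for each. The rate, sample size, and function names are illustrative assumptions.

```python
import random

def max_and_candidate_samples(lam, n, seed=0):
    """Return n samples of max(X, Y) and n samples of X + Y / 2,
    each built from its own independent Expo(lam) draws."""
    rng = random.Random(seed)
    maxima = [max(rng.expovariate(lam), rng.expovariate(lam)) for _ in range(n)]
    candidates = [rng.expovariate(lam) + 0.5 * rng.expovariate(lam) for _ in range(n)]
    return maxima, candidates

def empirical_cdf(samples, point):
    """Fraction of samples that are <= point."""
    return sum(s <= point for s in samples) / len(samples)

if __name__ == "__main__":
    maxima, candidates = max_and_candidate_samples(lam=1.0, n=200_000)
    # The two columns of estimates should agree up to simulation noise.
    for point in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(point,
              round(empirical_cdf(maxima, point), 3),
              round(empirical_cdf(candidates, point), 3))
```

The comparison is about distributions only: on any single draw, $\max(X,Y)$ and $X + \frac{1}{2}Y$ are generally different numbers.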


8.46

Stat110 solution available.

(a) If and are i.i.d. continuous r.v.s with CDF and PDF , then has PDF . Now let and be discrete and i.i.d., with CDF and PMF . Explain in words why the PMF of is not .

(b) Let and be i.i.d. r.v.s, and . Find the joint PMF of and , i.e., , and the marginal PMFs of and .


8.47

Let be i.i.d. r.v.s with CDF , and let . Find the joint distribution of and , for each .


8.48

Stat110 solution available.

Let $X_1, \dots, X_n$ be i.i.d. r.v.s with CDF $F$ and PDF $f$. Find the joint PDF of the order statistics $X_{(i)}$ and $X_{(j)}$ for $1 \leq i < j \leq n$, by drawing and thinking about a picture.


8.49

Stat110 solution available.

Two women are pregnant, both with the same due date. On a timeline, define time 0 to be the instant when the due date begins. Suppose that the time when each woman gives birth has a Normal distribution, centered at 0 and with standard deviation 8 days. The two birth times are i.i.d. Let $T$ be the time of the first of the two births (in days).

(a) Show that $E(T) = -8/\sqrt{\pi}$.

Hint: For any two random variables $X$ and $Y$, we have $\max(X,Y) + \min(X,Y) = X + Y$ and $\max(X,Y) - \min(X,Y) = |X - Y|$. Example 7.2.3 derives the expected distance between two i.i.d. $\mathcal{N}(0,1)$ r.v.s.

(b) Find $\text{Var}(T)$, in terms of integrals. You can leave your answers unsimplified for this part, but it can be shown that the answer works out to $\text{Var}(T) = 64\left(1 - \frac{1}{\pi}\right)$.


8.50

We are about to observe random variables $Y_1, Y_2, \dots, Y_n$, i.i.d. from a continuous distribution. We will need to predict an independent future observation $Y_{\text{new}}$, which will also have the same distribution. The distribution is unknown, so we will construct our prediction using $Y_1, Y_2, \dots, Y_n$ rather than the distribution of $Y_{\text{new}}$. In forming a prediction, we do not want to report only a single number; rather, we want to give a predictive interval with “high confidence” of containing $Y_{\text{new}}$. One approach to this is via order statistics, where $Y_{(j)}$ denotes the $j$th order statistic.

(a) For fixed $j$ and $k$ with $1 \leq j < k \leq n$, find $P(Y_{\text{new}} \in [Y_{(j)}, Y_{(k)}])$.

Hint: By symmetry, all orderings of $Y_1, \dots, Y_n, Y_{\text{new}}$ are equally likely.

(b) Let . Construct a predictive interval, as a function of , such that the probability of the interval containing is 0.95.
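The coverage of an order-statistic interval can be estimated by simulation, which gives a way to check an answer to (a) and a proposed interval for (b). The sketch below draws the sample and the new observation from Expo(1), an arbitrary continuous distribution chosen here for illustration, and counts how often the new observation lands between the $j$th and $k$th order statistics. The function name, trial count, and example values of $n$, $j$, $k$ are assumptions.

```python
import random

def coverage(n, j, k, trials=20_000, seed=0):
    """Estimate the probability that a new draw falls in [Y_(j), Y_(k)],
    where Y_(1) <= ... <= Y_(n) are the order statistics of n i.i.d. draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        ordered = sorted(rng.expovariate(1.0) for _ in range(n))
        y_new = rng.expovariate(1.0)
        # Order statistics are 1-indexed in the text, lists are 0-indexed.
        if ordered[j - 1] <= y_new <= ordered[k - 1]:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    # Try several (j, k) pairs and compare with your formula from part (a).
    for j, k in ((1, 19), (2, 18), (5, 15)):
        print(j, k, round(coverage(19, j, k), 3))
```

Swapping `expovariate` for another continuous sampler such as `rng.gauss(0.0, 1.0)` is a useful experiment: the hint's symmetry argument suggests the estimates should not depend on the distribution.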


8.51

Let $X_1, \dots, X_n$ be i.i.d. continuous r.v.s with $n$ odd. Show that the median of the distribution of the sample median of the $X_i$'s is the median of the distribution of the $X_i$'s.

Hint: Start by reading the problem carefully; it is crucial to distinguish between the median of a distribution (as defined in Chapter 6) and the sample median of a collection of r.v.s (as defined in this chapter). Of course they are closely related: the sample median of i.i.d. r.v.s is a very natural way to estimate the true median of the distribution that the r.v.s are drawn from. Two approaches to evaluating a sum that might come up are (i) use the first story proof example and first story proof exercise from Chapter 1, or (ii) use the fact that, by the story of the Binomial, $Y \sim \text{Bin}(n, 1/2)$ implies $n - Y \sim \text{Bin}(n, 1/2)$.


Mixed Practice

8.52

Let be i.i.d. , and let for all .

(a) Find the distribution of . What is its name?

(b) Find the distribution of the product .

Hint: First take the log.


8.53

Stat110 solution available.

A DNA sequence can be represented as a sequence of letters, where the alphabet has 4 letters: A,C,T,G. Suppose such a sequence is generated randomly, where the letters are independent and the probabilities of A,C,T,G are $p_1, p_2, p_3, p_4$, respectively.

(a) In a DNA sequence of length 115, what is the expected number of occurrences of the expression “CATCAT” (in terms of the $p_j$)? (Note that, for example, the expression “CATCATCAT” counts as 2 occurrences.)

(b) What is the probability that the first A appears earlier than the first C appears, as letters are generated one by one (in terms of the $p_j$)?

(c) For this part, assume that the are unknown. Suppose we treat as a r.v. before observing any data, and that then the first 3 letters observed are “CAT”. Given this information, what is the probability that the next letter is C?


8.54

Stat110 solution available.

Consider independent Bernoulli trials with probability of success for each. Let be the number of failures incurred before getting a total of successes.

(a) Determine what happens to the distribution of as , using MGFs; what is the PDF of the limiting distribution, and its name and parameters if it is one we have studied?

Hint: Start by finding the MGF. Then find the MGF of , and use the fact that if the MGFs of converge to the MGF of , then the CDF of converges to the CDF of .

(b) Explain intuitively why the result of (a) makes sense.