STAB22 section 5.1

5.1 We go through the 1500 students one at a time and ask each one “did you use the Internet to find a place to live?”. So n, the number of trials, is 1500. Each trial gives us a “success” or a “failure”, which we are free to define however we like: we could define a success as a Yes answer, or we could define a success as a No answer. Since we are really interested in the Yes answers here, it makes more sense to define success as Yes. In that case, X is the number of Yes answers observed, which, in this sample, is 525. Then p̂ = 525/1500 = 0.35. (If you chose to define success as No, your X is the number of No answers, 975, and your p̂ is 975/1500 = 0.65.)

5.2 200 seniors were questioned, so n = 200. p̂ is the fraction in your sample that were successes (said that they had taken a statistics course): 40% or 0.40. The number of successes in your sample must have been 40% of 200, that is 80, which is your value of X.

5.3 Each coin toss is independent, with the same probability 0.5 of getting a head each time. So the number of heads in 20 tosses has a binomial distribution with n = 20 and p = 0.5. When you actually do it, there’s no knowing what number of heads you’ll get, but the values near the middle (10) are more likely: you’d “expect” to get about half heads and half tails. So 11 heads is more likely than 19, though both are possible.

5.4 According to the genetic theory, each child inherits genes from its parents independently of other children. So we have n = 4 “trials” (children) who each have probability 0.25 of having type O blood, so the number of children who actually do end up having type O blood has a binomial distribution with n = 4 and p = 0.25. (You say that something has a particular distribution before you observe any data; once you have the data, you have a value like “2 children with type O blood”, not a distribution.)

5.5 To solve (a), look at Table C, page T-7. Find the rows of the table with n = 4 (the third block down), and the column with p = 0.3. The numbers in that column give the probabilities of observing each possible value k in a binomial distribution with that n and that p. Thus for n = 4 and p = 0.3, the probabilities of observing 0, 1, 2, 3, 4 successes are 0.2401, 0.4116, 0.2646, 0.0756 and 0.0081 respectively. P (X = 0) you just read off the table as 0.2401, and for P (X ≥ 3) you add up the ones you want: P (X ≥ 3) = P (X = 3) + P (X = 4) = 0.0756 + 0.0081 = 0.0837.

If you are going to use Table C for (b), you’ll need to arrange things so that the success probability is 0.5 or less. What you do is to interchange successes and failures: subtract p from 1 to get 1 − 0.7 = 0.3, and subtract the numbers of successes from n so that 4 successes in the question becomes 4 − 4 = 0 successes in your calculation, and X ≤ 1 becomes X ≥ 4 − 1 = 3. (Note that the ≤ becomes ≥.) Since the values for n, p, k are now the same as in (a), the answers will be the same too.

The thought process in doing part (b) by Table C explains why the answers are the same: whatever is not a success is a failure, and you can count either.

You can use Minitab instead of Table C. Select Calc, Probability Distributions and Binomial. Fill in the Number of Trials (n) and the Probability of Success (p). At the top you can select either Individual Probability (for working out P (X = 0)) or Cumulative Probability (for working out P (X ≤ 1): always ≤). If you are working out something like ≥, you have to rewrite what you want as a ≤. For instance, P (X ≥ 3) = 1 − P (X ≤ 2) since X has to be either 3 or bigger, or 2 or smaller. Finally, at the bottom of the dialog box click on Input Constant, and put in your value of k.

So for the first part of (a), click on Probability, put in n = 4, p = 0.3, click on Input Constant and enter 0 in the box. My result, 0.2401, is shown in Figure 1. For the second part, get the dialog box again, click on Cumulative Probability, make sure n and p are correct, and enter 2 next to Input Constant (which should still be selected). This gives 0.9163, so the answer you want is 1 − 0.9163 = 0.0837. For (b), first click on Probability again, enter n = 4 and p = 0.7 now, and next to Input Constant enter 4. This again gives 0.2401. Then click on Cumulative Probability, ensure that n = 4 and p = 0.7 are correct, and enter 1 (for ≤ 1) next to Input Constant. This gives 0.0837, and there’s no subtracting from 1 this time. These answers are all the same as you get from Table C. The advantage of using Minitab is that it will give you answers for all combinations of n and p, not just the ones that happen to appear in Table C. In fact, Minitab can handle large values of n as well, so that even if you are using the normal approximation by hand, you can get Minitab to give you the exact answer (and you can see how good your approximation was).

    Probability Density Function
    Binomial with n = 4 and p = 0.3
    x    P( X = x )
    0    0.2401

Figure 1: Minitab binomial probability output

5.6 Here n = 100 and p = 0.5 (fair coin). Use the formulas in the box at the top of page 322 to find that µp̂ = p = 0.5 and σp̂ = √((0.5)(0.5)/100) = 0.05: that is, the fraction of heads you’ll get will be close to 0.5 (50%) almost certainly, because the SD of p̂ is small.

This is not the same as the mean and SD of the count of the number of heads, because the proportion of heads will be about 0.50 (regardless of the number of times you toss the coin), whereas the number of heads will be about half the number of tosses: if you toss the coin 100 times, you’d expect about 50 heads. If you want the mean and SD of the count, use the formulas in the middle of page 320: the count of the number of heads has mean np = (100)(0.5) = 50 and SD √((100)(0.5)(0.5)) = 5. This says that the number of heads should be relatively close to 50, which is what you’d expect.

5.7 The first thing we need when using a normal approximation is the mean and SD of the thing we’re normal-approximating, here the proportion of heads in 100 tosses. We can use the answers from 5.6 for this: mean 0.5, SD 0.05. Then turn the given values into z-scores and use Table A as in §1.3.

0.4 has z-score (0.4 − 0.5)/0.05 = −2, and 0.6 has z-score (0.6 − 0.5)/0.05 = 2. These give 0.0228 and 0.9772 in Table A, so subtract these to give the probability: 0.9772 − 0.0228 = 0.9544. There is a better than 95% chance that the proportion of heads after 100 tosses will be between 0.4 and 0.6, which may strike you as surprisingly high, but that’s the way it works.

For (b), follow the same steps: 0.45 has z-score (0.45 − 0.5)/0.05 = −1 and 0.55 has z-score (0.55 − 0.5)/0.05 = 1, so the chance of ending up between these is 0.8413 − 0.1587 = 0.6826.

You might also notice that “between 0.4 and 0.6” is “within 2 SDs of the mean”, and “between 0.45 and 0.55” is “within 1 SD of the mean”, so that the 68-95-99.7 rule gives the answers (without using a table) as “about 95%” and “about 68%” respectively.

Since we know we’re tossing the coin 100 times, the question could also have asked “find the probability that the number of heads is between 40 and 60, between 45 and 55”, which we would have expected to give the same answers. To do it this way, we use the mean and SD of the count of heads, 50 and 5, as I found at the end of 5.6. Using these figures with the proper mean and SD gives the same z-scores as working with the proportions, and so the same answers.

You can get the exact answers from Minitab, for comparison with your normal-approximation answers, but you have to work with the success counts (you can’t work with the sample proportions). Use Cumulative Probability with n = 100 and p = 0.5. Before you fill in the dialog box, though, you can make your life easier by filling column C1 with the values 40, 60, 45 and 55 (the values you want the probabilities for). Then you go to the dialog box (Calc, Probability Distributions, Binomial), select Cumulative Probability, enter n and p, and at the bottom click on Input Column and type C1 into the box. This gets you all four probabilities of ≤ all at once. See Figure 2. Then to find the probabilities of “between”, you subtract: the chance of being between 40 and 60 is 0.982400 − 0.028444 = 0.953956 and the chance of being between 45 and 55 is 0.864373 − 0.184101 = 0.680272. These are close to what we found by the normal approximation; indeed, they are close to the 68% and 95% we got without using any tables or software at all.

    Cumulative Distribution Function
    Binomial with n = 100 and p = 0.5
     x    P( X <= x )
    40    0.028444
    60    0.982400
    45    0.184101
    55    0.864373

Figure 2: Cumulative binomial probabilities for 5.7

(You might be wondering whether we are finding the probability of “between 40 and 60 inclusive”, or if we are omitting the endpoints. This is a discrete distribution, so it makes a difference; recall, for instance, exercise 4.55. Actually, here, we are including 60 and excluding 40; if we were going to include 40 as well, we’d have to get P (X ≤ 39) and subtract that. Because n is so large, though, the individual probabilities are very small (you can get Minitab to show you how small), so including or excluding just one doesn’t make much difference.)

5.9 (a) A properly tossed fair coin has no memory, so that individual tosses are independent: what happened in the past (three consecutive heads) has no influence over what’s going to happen this time. “Tails are due” is a fallacy. (b) has the same reasoning: the probability is still exactly 0.5. (c) p̂ is the sample proportion: you perform the experiment and see how many successes you get, so that p̂ is a number that you know afterwards. This is unlike p, one of the parameters of the binomial distribution, the probability of success, which would be known before you toss any coins, roll any dice, etc.

5.10 (a) X is the number of successes, a number like 19, and not a proportion at all (which would be a number like 19/50 = 0.38). (b) is wrong two ways: the given quantity is an SD not a variance (because of the square root), and it is the SD of the proportion and not the count. (c) The accuracy of the normal approximation depends on p as well as n: if p is very close to 0 or 1, even n = 10000 might not be large enough. (If p is 0.5, even a small n like n = 20 would do. Try this n and p, and also n = 10000, p = 0.0001, in the rule of thumb in the box on page 323.)

5.11 (a) If the poll is a simple random sample, this one will be OK, with n = 200 and p being some reasonable value for the probability of a randomly chosen student being “usually irritable in the morning”. (b) is no good because the number of trials (tosses) is not fixed: every time you do this experiment, you’ll need a different number of tosses. (c) is OK, again because it is a random sample, with n = 500 and p = 1/12.

5.12 (a) There is no notion of “success” here. If a count were made of the number of students with mean systolic blood pressure greater (or less) than some target value, that count could be binomial. (b) looks OK (random sampling) with a fixed sample size (20), and a clear definition of “success” (defect) and “failure” (no defect). (c) also looks OK, for the same reason: a student will either report that they eat the required amount of fruits and vegetables (success), or report that they don’t (failure).

5.13 The number of errors caught will have a binomial distribution with n = 20 and p = 0.7. The number of errors missed also has a binomial distribution with n = 20 and p = 0.3, interchanging successes and failures.

(You wouldn’t tell the student proofreader that there were 20 errors, because he or she might keep trying until finding all 20, but from your point of view there are 20 opportunities to catch an error, and each time the student may or may not succeed. Some errors might be easier to catch than others, which would make the probability of success at each trial unequal, but we’re not worrying about that here.)

In (b), we’re counting the number of errors missed, so p = 1 − 0.7 = 0.3. Table C (starting at page T-6 in the back of the textbook) has binomial probabilities; the second table on page T-9, with n = 10 and values of p from 0.10 to 0.50, is the one you need. Look in the p = 0.30 column, and add up the probabilities from 4 on (to get “4 or more errors missed”): this is 0.2001 + 0.1029 + 0.0368 + 0.0090 + 0.0014 + 0.0001 = 0.3503. This is quite high, because 10 errors isn’t very many, and it’s quite likely to have this poor a performance by chance.

Notice that Table C doesn’t have any probabilities for p > 0.5. This is because you can always rephrase a problem to use a p of 0.5 or less. Another way to ask the question in (b) is: “how likely is it that the proofreader will catch 6 or fewer errors of the 10?”. The connection is: interchange successes and failures (6 errors caught is 10 − 6 = 4 errors missed, and “or fewer” becomes “or more”), and replace p (here 0.7) with 1 − p (1 − 0.7 = 0.3). Either you’ll have a p you can use directly, or you can get one by this recipe.

If you want to, you can do this question using Minitab. Pages 96–97 of the Minitab manual show you how.

5.14 The number of visitors has a binomial distribution with n = 15 and p = 0.5, approximately.

Consult Table C, page T-9, with n = 15 and use the p = 0.5 column. Take the probabilities for k = 8 onwards, and add them up. This gives 0.1964 + 0.1527 + 0.0916 + 0.0417 + 0.0139 + 0.0032 + 0.0005 = 0.5000. (Because p = 0.5, the numbers in Table C are the same ones going up and down, and they add up to 1, and the numbers you want are just the half that go down. So you could guess that the answer is 0.5 without adding them up.)

5.15 The mean is np. For the number of errors caught, this is (20)(0.7) = 14 and for the number of errors missed the mean is (20)(0.3) = 6. (These add up to 20 as they should.)

The SD of the number of errors caught is √(np(1 − p)) = √((20)(0.7)(0.3)) = 2.05. (The SD of the number of errors missed is the same, because the formula has the same numbers multiplied together in a different order.)

If p goes up to 0.9, the SD becomes √((20)(0.9)(0.1)) = 1.34, which is smaller. If p goes up to 0.99, the SD decreases further to 0.44. If the probability of success gets closer and closer to 1, the proofreader will make fewer and fewer mistakes, so the number of errors caught will get (almost certainly) closer and closer to 20. The spread will decrease to nothing, so the SD should (and does) approach zero.

5.16 The mean of the count is np = (15)(0.5) = 7.5. The mean of the proportion p̂ is np/n = p = 0.5 no matter what n is. When n = 150, the mean count of people visiting is np = (150)(0.5) = 75, and the mean proportion of people visiting is p = 0.5. When n = 1500, the mean count of people visiting is np = (1500)(0.5) = 750, and the mean proportion of people visiting is still p = 0.5.

The mean count of people visiting goes up as n goes up, but the mean proportion of people visiting is constant at 0.5. (If you work out the standard deviations using the formulas on pages 320 and 322, you’ll find that the ones for the counts go up, and the ones for the proportions go down. With a larger n, that is, a larger sample, the sample proportion becomes more predictably close to 0.5.)

When you take an actual sample of male internet users, the actual behaviour is not quite as predictable as this as n increases, but you can say, almost certainly, that the count of successes will increase as n increases, and the proportion of successes will head towards 0.5 as n increases.
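The Table C and Minitab lookups in 5.5, 5.7 and 5.14, and the count-versus-proportion comparison in 5.16, can also be cross-checked with a few lines of code. What follows is only a sketch in Python (not Minitab, and not part of the textbook's method); the function names binom_pmf, binom_cdf, normal_cdf, count_mean_sd and proportion_mean_sd are made up for this illustration. It uses the binomial probability formula directly and gets the normal curve from the standard library's error function.

```python
from math import comb, erf, sqrt


def binom_pmf(k, n, p):
    """P(X = k) for a binomial count X (Minitab's 'Probability' option)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) (Minitab's 'Cumulative Probability' option): always <=."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))


def normal_cdf(z):
    """Area to the left of z under the standard normal curve (Table A)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def count_mean_sd(n, p):
    """Mean and SD of the count of successes: np and sqrt(np(1 - p))."""
    return n * p, sqrt(n * p * (1 - p))


def proportion_mean_sd(n, p):
    """Mean and SD of the sample proportion: p and sqrt(p(1 - p)/n)."""
    return p, sqrt(p * (1 - p) / n)


# 5.5(a): P(X = 0) and P(X >= 3) = 1 - P(X <= 2), with n = 4 and p = 0.3.
p_zero = binom_pmf(0, 4, 0.3)
p_three_plus = 1 - binom_cdf(2, 4, 0.3)

# 5.5(b): the same answers again, counting with p = 0.7 instead.
p_four_b = binom_pmf(4, 4, 0.7)
p_at_most_one_b = binom_cdf(1, 4, 0.7)

# 5.14: P(X >= 8) with n = 15 and p = 0.5.
p_eight_plus = 1 - binom_cdf(7, 15, 0.5)

# 5.7: exact binomial (on counts) versus normal approximation (on proportions).
exact_40_60 = binom_cdf(60, 100, 0.5) - binom_cdf(40, 100, 0.5)
mean, sd = proportion_mean_sd(100, 0.5)
approx_40_60 = normal_cdf((0.6 - mean) / sd) - normal_cdf((0.4 - mean) / sd)

print("5.5:", round(p_zero, 4), round(p_three_plus, 4))
print("5.7: exact", round(exact_40_60, 6), "normal", round(approx_40_60, 4))

# 5.16: the SD of the count grows with n, the SD of the proportion shrinks.
for n in (15, 150, 1500):
    print(n, count_mean_sd(n, 0.5), proportion_mean_sd(n, 0.5))
```

The normal-curve figure can differ from the Table A answer in the fourth decimal place, because the table entries are rounded before subtracting.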
5.19 For “0” in the question, read “any particular digit, such as a 5”. The number of 5’s in a group of 5 digits has a binomial distribution with n = 5 and p = 0.10. So go into Table C (page T-7). Probability of at least one five is one minus probability of no five; this is 1 − 0.5905 = 0.4095. (Or take the probabilities for k = 1, 2, 3, 4, 5 and add them up. Or, if you really want to, use the binomial probability formula to get 1 − (0.9)^5 = 0.4095.)

In lines 40 digits long, n = 40, so the mean number of fives is (40)(0.1) = 4. About one-tenth of all digits are fives, so in a line of 40 digits, about 4 of them will be fives. (This doesn’t mean that exactly 4 will be fives; occasionally you’ll get 4, but usually you’ll get more or fewer.)

5.21 n = 4 and p = 0.25 (one quarter). Copy the five entries from Table C for n = 4, p = 0.25 (page T-7), and draw yourself the histogram as shown in Figure 3. The vertical scale (labelled “frequency”) is actually the probability times 10000; I had to do this to get Minitab to plot it. The mean is np = (4)(0.25) = 1, which goes right under the 1 bar on the histogram; this bar happens to be the tallest.

Figure 3: Probability histogram for blood type data

5.22 n is large and p is near 0.5, so use the normal approximation. The mean (for the sample proportion) is µp̂ = 0.49 and the SD is σp̂ = √((0.49)(0.51)/1016) = 0.0157. A sample proportion of 0.46 gives a z of (0.46 − 0.49)/0.0157 = −1.91, and 0.52 gives z = 1.91. Table A gives the chance of being between these values as 0.9713 − 0.0287 = 0.9426. (To use Minitab, 1016(0.46) = 467 successes, and 1016(0.52) = 528 successes. In a binomial distribution with n = 1016 and p = 0.49, the chance of a number of successes between these values is 0.972824 − 0.028387 = 0.944437, so the normal approximation is very close.)

The chance of getting a sample proportion here between 0.46 and 0.52, that is, within 0.03 of the true value 0.49 of p, is high, about 95%. But it is not a certainty. Some of the possible samples you could draw will have a sample proportion less than 0.46 or bigger than 0.52 (that is, outside the poll’s stated margin of error of 0.03). We will see in §6.1 that it is impossible to be certain, because anything could happen in a sample, so what we do is to offer something like a margin of error that is correct “19 times out of 20”, that is, it has probability 0.95 of being correct, over all the possible samples that we could take. If we wanted to have a higher chance of being correct, say 99%, we’d have to accept a larger margin of error.

5.23 The calculations here are like those in 5.22: use the normal approximation to the binomial. Here, n is large but p is not that close to 0.50, so we should check the rules of thumb: for p = 0.3, np = 1011(0.3) = 303.3 and n(1 − p) = 1011(0.7) = 707.7, both of which are safely greater than 10, and for p = 0.06, np = 1011(0.06) = 60.66 and n(1 − p) = 1011(0.94) = 950.34, so this is also safe.

When n = 1011 and p = 0.30, the mean and SD (for p̂) are 0.30 and √((0.3)(0.7)/1011) = 0.0144. The z values for 0.28 and 0.32 are ±(0.32 − 0.30)/0.0144 = ±1.39, so the probability of being between is 0.9177 − 0.0823 = 0.8354.

If p = 0.06, the mean and SD are 0.06 and √((0.06)(0.94)/1011) = 0.0075. The z-values for 0.04 and 0.08 are ±(0.08 − 0.06)/0.0075 = ±2.68, so the probability of being between is 0.9963 − 0.0037 = 0.9926.

The probability of being within 0.02 of p appears to be getting larger as p gets smaller. You can reason this out in a couple of ways: first, as p gets closer to 0, σp̂ is getting closer to 0 as well, which means that p̂ is almost certainly close to p, and the chance of being “within” anything will get closer to 1. Second, as p gets closer to 0, the chance of observing any successes at all in your sample gets smaller, and thus the sample proportion p̂ is more likely to be close to 0 (ie. very close to p) as well.

5.24 The calculations here are the same idea again: use the normal approximation to the binomial and the mean and SD of p̂ to get z-values and probabilities for the various values of n. Since you have to do the calculations several times over, you can use a spreadsheet to do the repetitive calculations for you. Mine is shown in Figure 4. If you can follow the calculations in 5.22, you’ll be able to see what I’m doing here.

Figure 4: Spreadsheet for calculations of 5.24

The probability of getting a sample proportion within 0.03 of the true p appears to be (and is) heading towards 1. That is, with a larger sample, the sample proportion is more likely to be close to the population proportion. If you were able to take an infinitely large sample (or sample the whole population), you would be certain to get p̂ = p. In real life, though, you’ll have to accept that your sample proportion won’t be exactly equal to the population proportion.

5.25 The sample proportion is 140/200 = 0.7 or 70%. You can use a normal approximation to find the chance that 140 or more students in a sample of 200 would support the crackdown, if p = 0.67. Or you can pull the answer out of Minitab: the probability of 139 or less is 0.7951, so the chance of 140 or more is 0.2049. A normal approximation gives 1 − 0.8165 = 0.1835, which is not bad.

The upshot of this calculation is that if the proportion of students favouring the crackdown is really 0.67, it is quite likely (the probability is about 0.20) that you will get as many as 140 = 70% in favour in your sample, just by chance. So this, by itself, is not evidence that the proportion of students in favour at “your college” is higher than 0.67. (Your letter needs to make this point: because of random sampling, the result that was observed could easily have happened by chance.)

If you really wanted evidence that “your college” was different, you would have to either (a) get a sample proportion quite a bit bigger, or (b) get the same result (70%) as here with a bigger sample. With a bigger sample, a sample proportion as high as 0.70 becomes progressively less likely if p = 0.67, and so with a bigger sample you would be more entitled to conclude that p is not equal to 0.67 after all. (This is the logic of a test of significance, which we’ll see a lot more of in §6.2.)

5.27 (a) There are four shapes, of which the subject guesses one, so p = 1/4. (b) The number of shapes guessed out of 20 has a binomial distribution with n = 20 and p = 1/4 = 0.25, so from Table C (page T-10), the probability of 10 or more correct guesses is 0.0099 + 0.0030 + 0.0008 + 0.0002 = 0.0139. (c) This is just the mean and SD of the binomial distribution here, ie. mean is np = 20(0.25) = 5, variance is np(1 − p) = 20(0.25)(0.75) = 3.75, and SD is √3.75 = 1.94 guesses. (d) Knowing that the deck has exactly 5 of each card might change things: for instance, if the subject hasn’t seen a star in the first 10 cards, he/she knows that 5 of the last 15 cards are stars and may start guessing stars, with a higher chance of being correct (that is, the chance of guessing a card correctly isn’t constant all the way through, and therefore a binomial distribution is no good). This is the same strategy as “counting face cards” if you are playing blackjack; counting cards is a winning strategy in a casino if you are discreet enough not to get thrown out!

5.28 The mean is np = 1200(0.75) = 900, variance is 1200(0.75)(0.25) = 225, so SD is √225 = 15.

Here, n = 1200 is larger than Table C has, so we need to use the normal approximation. The idea is to find the mean and SD (as we just did) and “pretend” that the count has a normal distribution with this mean and SD. (It actually has a binomial distribution, of course, but with a large n we can often get away with it. See the “rule of thumb” calculations below.)

950 is a “value”, so turn it into a z and then look it up in Table A, using the mean and SD you just found. This gives: z = (950 − 900)/15 = 3.33.

From Table A, the probability of “less than” is 0.9996, so the probability of “more than” is 1 − 0.9996 = 0.0004. So the college will rarely get caught out: most of the time, they won’t end up with more than 950 students following this strategy.

We’re using the normal approximation to the binomial here because n is large (1200) and p is not too far from 0.5. The rule of thumb on page 323 says this will be OK if np ≥ 10 and n(1 − p) ≥ 10. Here np = 1200(0.75) = 900 and n(1 − p) = 1200(0.25) = 300, so we are safe.

(You might be concerned that the binomial distribution deals with whole numbers, whereas the normal distribution deals with fractional numbers. Or you might be thinking that “at least 950” and “more than 950” are different: in the first you include 950, and in the second you don’t. But the normal approximation above treats them the same way. You can get a more accurate answer using a “continuity correction”: ask yourself “what decimal number would round off to the whole number I want?” In this case, “more than 950” means “bigger than 950.5”, so use 950.5 instead of 950 to get z = (950.5 − 900)/15 = 3.37. With large n, this often doesn’t make much difference; here the probability is a little smaller, but is the same to 4 decimals. If you are not concerned about this, don’t worry; you don’t need to know continuity corrections in this course.)

Change n to 1300, so the mean changes to np = 1300(0.75) = 975, variance becomes np(1 − p) = 1300(0.75)(0.25) = 243.75 and SD is √243.75 = 15.6125. Now z = (950 − 975)/15.6125 = −1.60, so prob. is 1 − 0.0548 = 0.9452. The college is now very likely to end up with too many students.

Or use the continuity correction and start from 950.5, so get z = (950.5 − 975)/15.6125 = −1.57 and a prob. of 1 − 0.0582 = 0.9418. This time the continuity correction makes more of a difference (though still not much).

5.29 The success prob. is 1/5 = 0.2, so the mean is 900(0.2) = 180, the variance is 900(0.2)(0.8) = 144, and the SD is √144 = 12.

For the proportion, which you get by dividing the count by n, divide the mean and SD by n as well to get a mean of 180/900 = 0.2 and an SD of 12/900 = 0.0133. (Or you can use the formulas for p̂ in the box on page 323.)

For (c), use the mean and SD you got in (b) along with the normal approximation, so z = (0.24 − 0.2)/0.0133 = 3.00 and the prob. is 1 − 0.9987 = 0.0013. You may not think that 24% is a very impressive performance, but see the discussion below.

The last part has you working backwards, from the table to a z to a proportion. The probability to be looked up backwards in the table is 1 − 0.01 = 0.99 (we want “this well or better”), which goes with z = 2.33 (the closest value). Turn this back into a value by multiplying by the SD and adding the mean, to get (2.33)(0.0133) + 0.2 = 0.231; that is, a subject must get about 23% or more successes, or 208 out of 900, to have evidence of ESP. (You might think that this is not much more than the 20% a subject could get by guessing, but with so many attempts (trials), it’s very unlikely that someone could do this well by guessing alone.)

Sanity-checking: the chance of at least 24% successes is 0.0013, which is less than 0.01, the chance of at least 23% successes.

5.33 You can use the normal approximation to the binomial for this one; we are dealing with proportions, so be sure to use the right formulas for the mean and SD.

For Jodi, n = 100 and p = 0.85, so for the proportion she gets correct, the mean is p = 0.85, and the variance is p(1 − p)/n = (0.85)(0.15)/100 = 0.001275, so the SD is √0.001275 = 0.0357. To get 80% or lower, her z is z = (0.80 − 0.85)/0.0357 = −1.40, giving a probability from the table of 0.0808.

(b) is the same thing, but with n = 250, so mean is 0.85, variance (0.85)(0.15)/250 = 0.00051. z = (0.80 − 0.85)/√0.00051 = −2.21, so probability is now 0.0136. With more questions, Jodi’s proportion of correct answers should be closer to her p of 0.85, so the chance of her getting 80% or lower, which is less than she would expect, goes down compared with n = 100.

To cut the SD for the proportion in half, n has to be multiplied by 2² = 4 (because of the √n on the bottom of the formula). Try it by calculation or algebra if you don’t believe me. You’d need to solve

    0.0357/2 = √((0.85)(0.15)/n)

for n. That means that 400 questions would be needed (but there is the little matter of how long a 400-question exam would take!) This holds true for any p, including the p = 0.75 of Laura in part (d). To do that one by calculation, figure out Laura’s SD for 100 questions, divide that in half, and put it equal to √((0.75)(0.25)/n), solving for n.

5.34 For this question, the binomial distribution no longer applies because we no longer have a fixed number of trials: the number of rolls of the die is whatever it needs to be to get a 1. Thus for (a), 5/6 × 1/6 = 5/36; (b) is 5/6 × 5/6 × 1/6 = 25/216; and for (c) the answers are 5/6 × 5/6 × 5/6 × 1/6 and 5/6 × 5/6 × 5/6 × 5/6 × 1/6. To get the first 1 being on the k-th roll, you need k − 1 non-1’s followed by a 1, and this has probability (5/6)^(k−1) (1/6). This distribution for the number of rolls to get the 1st 1 is called a geometric distribution; see exercise 5.35 for more.

5.35 Y could be 1, 2, 3, 4 and so on (it could be very large because you might wait a long time for the first success). Using the same ideas as in 5.34, Y = 1 if you get a success on the first trial, which happens with probability p; Y = 2 if you get a failure followed by a success, which happens with probability (1 − p)p, and Y = k in general if you get k − 1 failures followed by a success in that order (because if you get your success any earlier, Y is not equal to k any more), which has probability (1 − p)^(k−1) p. These probabilities form an “infinite series”, because Y could be as big as you like; all the infinite number of probabilities add up to 1, as they should (the rationale being that you can get the total as close to 1 as you like by taking enough probabilities and adding them up).
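The geometric probabilities of 5.34 and 5.35 are easy to check numerically. The sketch below is in Python with exact fractions; the function name geometric_pmf is made up for this illustration and does not come from the textbook or from Minitab. It reproduces the products in 5.34 and shows the running total of the probabilities creeping up towards 1.

```python
from fractions import Fraction


def geometric_pmf(k, p):
    """P(Y = k): k - 1 failures followed by one success, (1 - p)^(k - 1) * p."""
    if k < 1:
        raise ValueError("the first success cannot come before trial 1")
    return (1 - p) ** (k - 1) * p


# 5.34: rolling a die until the first 1, so the success probability is 1/6.
p = Fraction(1, 6)

# First 1 on roll 2, 3, 4 and 5: the products written out in 5.34.
for k in range(2, 6):
    print(k, geometric_pmf(k, p))

# 5.35: the running total gets as close to 1 as you like.
# After K terms the total is exactly 1 - (1 - p)^K.
for K in (5, 20, 50):
    partial = sum(geometric_pmf(k, p) for k in range(1, K + 1))
    print(K, float(partial))
```

Using Fraction rather than decimals keeps answers like 5/36 and 25/216 exact, so they can be compared directly with the hand calculation.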