Statistical Inference for Two Populations–Independent Samples

Part of the book series: Springer Texts in Statistics (STS)

Abstract

William Shakespeare penned the famous quote “A rose by any other name would smell as sweet”. Does this sentiment carry over to the name given to you by your parents? Christenfeld et al. (1999) were not so sure after studying the possible effect that the initials of your name may have on your life expectancy! Dividing names into those with “bad initials”, such as DED, SIC, UGH, ROT, etc., and those with “good initials”, such as GOD, HUG, VIP, WIN, etc., they studied California death certificates to see if there appeared to be a difference in age at death for people with initials in these two general categories.

Notes

  1. The only requirement is that the variances exist for both populations. Neither underlying normality nor equal population variances is required for the procedures of this section.

Bibliography

  • Abeles, H. F., & Porter, S. Y. (1978). The sex-stereotyping of musical instruments. Journal of Research in Music Education, 26, 65–75.

  • Ali, A., Rasheed, A., Siddiqui, A. A., Naseer, M., Wasim, S., & Akhtar, W. (2015). Non-parametric test for ordered medians: The Jonckheere Terpstra test. International Journal of Statistics in Medical Research, 4, 203–207.

  • Archer, V. E. (1979). Anencephalus, drinking water, geomagnetism and cosmic radiation. American Journal of Epidemiology, 109, 88–97.

  • Arellano, L., Castillo-Guevara, C., Huerta, C., Germán-García, A., & Lara, C. (2015). Effect of using different types of animal dung for feeding and nesting by the dung beetle Onthophagus lecontei (Coleoptera: Scarabaeinae). Canadian Journal of Zoology, 93, 337–343.

  • Ault, R. G., Hudson, E. J., Linehan, D. J., & Woodward, J. D. (1967). A practical approach to the assessment of head retention of bottled beers. Journal of the Institute of Brewing, 73(6), 558–566.

  • Badawy, M. E. I., Kenawy, A., & El-Aswad, A. F. (2013). Toxicity assessment of Buprofezin, Lufenuron, and Triflumuron to the earthworm Aporrectodea caliginosa. International Journal of Zoology, Article ID 174523, 9 pages.

  • Bennett, P. E. (1957). The statistical measurement of a stylistic trait in Julius Caesar and As You Like It. Shakespeare Quarterly, 8(1), 33–50.

  • Borden, P., Nyland, J., Caborn, D. N. M., & Pienkowski, D. (2003). Biomechanical comparison of the FasT-Fix meniscal repair suture system with vertical mattress sutures and meniscus arrows. The American Journal of Sports Medicine, 31(3), 374–378.

  • Christenfeld, N., Phillips, D. P., & Glynn, L. M. (1999). What’s in a name: Mortality and the power of symbols. Journal of Psychosomatic Research, 47(3), 241–254.

  • Chu, S. (2001). Pricing the C’s of diamond stones. Journal of Statistics Education, 9(2), 12 pages online.

  • Dearwater, S. R., Coben, J. H., Campbell, J. C., Nah, G., Glass, N., McLoughlin, E., & Bekemeier, B. (1998). Prevalence of intimate partner abuse in women treated at community hospital emergency departments. Journal of the American Medical Association, 280(5), 433–438.

  • Eirk, K. G. (1972). An experimental evaluation of accepted methods for removing spots and stains from works of art on paper. Bulletin of the American Group International Institute for Conservation of Historic and Artistic Works, 12(2), 82–87.

  • Elwood, J. M. (1977). Anencephalus and drinking water composition. American Journal of Epidemiology, 105(5), 460–468.

  • Hoffman, D. L., & Novak, T. P. (1998). Bridging the racial divide on the internet. Science, 280, 390–391.

  • Johnson, B. (1984). Personal communication for report in Statistics 661. Columbus: Ohio State University.

  • Kamimura, A., Takahashi, T., & Watanabe, Y. (2000). Investigation of topical application of procyanidin B-2 from apple to identify its potential use as a hair growing agent. Phytomedicine, 7(6), 529–536.

  • Leichliter, J. S., Meilman, P. W., Presley, C. A., & Cashin, J. R. (1998). Alcohol use and related consequences among students with varying levels of involvement in college athletics. Journal of American College Health, 46, 257–262.

  • Mackowiak, P. A., Wasserman, S. S., & Levine, M. M. (1992). A critical appraisal of 98.6°F, the upper limit of the normal body temperature, and other legacies of Carl Reinhold August Wunderlich. Journal of the American Medical Association, 268(12), 1578–1580.

  • Moore, T. L. (2006). Paradoxes in film ratings. Journal of Statistics Education, 14(1), 8 pages online.

  • Mrosovsky, N., & Shettleworth, S. J. (1974). Further studies of the sea-finding mechanism in green turtle hatchlings. Behaviour, 51, 195–208.

  • Nsor, C. A., & Obodai, E. A. (2014). Environmental determinants influencing seasonal variations of bird diversity and abundance in wetlands, Northern Region (Ghana). International Journal of Zoology, 2014, 1–10, Article ID 548401, 10 pages.

  • O’Neill, S. A., & Boulton, M. J. (1996). Boys’ and girls’ preferences for musical instruments: A function of gender? Psychology of Music, 24, 171–183.

  • Pérez-Stable, E. J., Herrera, B., Jacob, P., III, & Benowitz, N. L. (1998). Nicotine metabolism and intake in black and white smokers. Journal of the American Medical Association, 280, 152–156.

  • Pye, A. E. (1974). Microbial activation of prophenoloxidase from immune insect larvae. Nature, 251, 610–613.

  • Salit, S. A., Kuhn, E. M., Hartz, A. J., Vu, J. M., & Mosso, A. L. (1998). Hospitalization costs associated with homelessness in New York City. New England Journal of Medicine, 338, 1734–1740.

  • Shoemaker, A. L. (1996). What’s normal?—Temperature, gender, and heart rate. Journal of Statistics Education, 4(2), 4 pages online.

  • Storm, L., & Thalbourne, M. A. (2005). The effect of a change in pro attitude on paranormal performance: A pilot study using naïve and sophisticated skeptics. Journal of Scientific Exploration, 19(1), 11–29.

  • Tarbill, G. L., Manley, P. N., & White, A. M. (2015). Drill, baby, drill: The influence of woodpeckers on post-fire vertebrate communities through cavity excavation. Journal of Zoology, 296, 95–103.

  • Thalbourne, M. A. (1995). Further studies of the measurement and correlates of belief in the paranormal. Journal of the American Society for Psychical Research, 89, 234–237.

  • Thalbourne, M. A. (2004). The common thread between ESP and PK. New York: The Parapsychology Foundation.

  • Whelan, R. J. (1982). An artificial medium for feeding choice experiments with slugs. Journal of Applied Ecology, 19(1), 89–94.

  • Wolfe, J., Martinez, R., & Scott, W. A. (1998). Baseball and beer: An analysis of alcohol consumption patterns among male spectators at major-league sporting events. Annals of Emergency Medicine, 31, 629–632.

  • Woodard, R., & Leone, J. (2008). A random sample of Wake County, North Carolina residential real estate plots. Journal of Statistics Education, 16(3), 3 pages online.

  • Wypijewski, J. (1997). Painting by numbers: Komar and Melamid’s scientific guide to art. Farrar, Straus, & Giroux, Inc.

  • Zelazo, P. R., Zelazo, N. A., & Kolb, S. (1972). “Walking” in the newborn. Science, 176, 314–315.

Chapter 9 Comprehensive Exercises

9.A. Conceptual

9.A.1. Let \( X_1, \ldots, X_m \) and \( Y_1, \ldots, Y_n \) denote independent random samples from two distinct (X and Y) continuous populations. Let W and U (9.20) be the rank sum statistic and counting statistic, respectively, discussed in Sect. 2. Show that \( W = U + \frac{n(n+1)}{2} \) when there are no tied values among the X’s and Y’s.
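Before attempting the proof, it can help to check the identity numerically. The following Python sketch uses made-up data (not taken from the text) and assumes that W is the sum of the joint ranks of the Y observations and that U counts the pairs \( (X_i, Y_j) \) with \( X_i < Y_j \); the function names are illustrative only, and the definition in (9.20) should be consulted for the exact form of U.

```python
def rank_sum_w(xs, ys):
    """Sum of the joint ranks of the Y observations (assumes no ties)."""
    combined = sorted(xs + ys)
    # The rank of a value is its 1-based position in the sorted combined sample.
    return sum(combined.index(y) + 1 for y in ys)


def counting_u(xs, ys):
    """Number of (X_i, Y_j) pairs with X_i < Y_j (assumed form of U)."""
    return sum(1 for x in xs for y in ys if x < y)


# Hypothetical data with no ties: m = 4 X values and n = 3 Y values.
xs = [2.1, 3.7, 5.0, 8.4]
ys = [1.5, 4.2, 9.3]
n = len(ys)

w = rank_sum_w(xs, ys)
u = counting_u(xs, ys)
print(w, u, u + n * (n + 1) // 2)  # prints: 12 6 12
```

A check on one data set is not a proof, but it suggests where the constant comes from: each Y is outranked by the X’s below it (counted by U) and by the Y’s at or below it (counted by the constant).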

9.A.2. Let \( X_1, \ldots, X_m \) and \( Y_1, \ldots, Y_n \) be independent random samples from two distinct (X and Y) continuous populations. Let \( S_1, \ldots, S_m \) and \( R_1, \ldots, R_n \) denote the joint ranks of \( X_1, \ldots, X_m \) and \( Y_1, \ldots, Y_n \), respectively, among the combined sample of N = (m + n) X and Y observations. Even though each individual \( S_i \) and \( R_j \) rank is random, explain why the total sum of the ranks, \( \sum_{i=1}^{m} S_i + \sum_{j=1}^{n} R_j \), is not random. Show, either in general or for m = 5 and n = 6, that this total sum of ranks is always equal to the constant N(N + 1)/2.
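A small simulation illustrates the claim for m = 5 and n = 6. This is only a sketch: the data are randomly generated integers (drawn without replacement so that there are no ties), and the helper name `joint_ranks` is hypothetical.

```python
import random


def joint_ranks(xs, ys):
    """Return (S, R): joint ranks of the X's and of the Y's in the combined sample."""
    order = sorted(xs + ys)
    rank = {value: i + 1 for i, value in enumerate(order)}  # valid when there are no ties
    return [rank[x] for x in xs], [rank[y] for y in ys]


random.seed(1)  # arbitrary seed, used only for reproducibility
m, n = 5, 6
N = m + n

totals = set()
for _ in range(1000):
    values = random.sample(range(1000), N)  # N distinct values, so no ties
    s, r = joint_ranks(values[:m], values[m:])
    totals.add(sum(s) + sum(r))

print(totals)  # prints: {66}
```

The individual ranks change from sample to sample, but the total never does, because the combined ranks are always some arrangement of the integers 1 through N.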

9.A.3. The rank sum statistic W discussed in Sect. 2 uses only the sum of the combined samples ranks \( R_1, \ldots, R_n \) of the Y observations. Wouldn’t the statistic \( V = \sum_{j=1}^{n} R_j - \sum_{i=1}^{m} S_i \), where \( S_1, \ldots, S_m \) are the combined samples ranks of \( X_1, \ldots, X_m \), respectively, be a more informative statistic to use in testing \( H_0\colon [\eta_Y = \eta_X] \)? Explain why this is not the case.

9.A.4. Consider the 100CL% confidence interval for the difference in population means, \( \mu_Y - \mu_X \), as given in (9.26).

(a) For fixed sample sizes m and n and a given set of data, how does the length of this confidence interval vary as a function of the confidence level CL?

(b) For a fixed set of data and confidence level CL, how does the length of this confidence interval vary as a function of the two sample sizes, m and n?

9.A.5. Consider the 100CL% confidence interval for the difference in population means, \( \mu_Y - \mu_X \), as given in (9.26) and the corresponding 100CL% upper confidence bound for \( \mu_Y - \mu_X \). For fixed sample sizes m and n and a given set of data, compare the upper endpoint of the 100CL% confidence interval with the 100CL% upper confidence bound.

9.A.6. Notice that both of the 100CL% confidence intervals for the difference in population means, \( \mu_Y - \mu_X \), given in (9.26) and (9.36) are centered at the point estimator for \( \mu_Y - \mu_X \), namely, \( \overline{Y} - \overline{X} \). Explain why this is not necessarily the case for the point estimator \( \tilde{D} \) (9.14) for the difference in population medians \( \eta_Y - \eta_X \) and the 100CL% confidence interval for \( \eta_Y - \eta_X \) given in (9.18). What can be said about \( \tilde{D} \) relative to the confidence interval in (9.18)?

9.B. Data Analysis/Computational

9.B.1. Binge Drinking and Athletics. Find an approximate 95% lower confidence bound for \( p_Y - p_X \) for the athlete/non-athlete binge drinking data in Example 9.1.

9.B.2. Driving Under the Influence and Athletics. In Example 9.1, we used sample data from Leichliter et al. (1998) to compare the percentages of intercollegiate athletes and non-athletes who were involved in binge drinking in the 2 weeks prior to completing a Core and Alcohol Survey. Those authors also reported sample data on whether the respondents had driven under the influence during the past year. These results are presented in Table 9.9 for participants and non-participants in intercollegiate sports.

Table 9.9 Numbers of students reporting that they have driven under the influence in the last year prior to completing the core and alcohol survey

Let \( p_{par} \) and \( p_{nonpar} \) denote the percentages of all participants and non-participants, respectively, in intercollegiate athletics who have driven under the influence during the prior year.

(a) Find a lower confidence bound for \( p_{par} - p_{nonpar} \). Choose your own reasonable confidence level.

(b) Find the approximate P-value for an appropriate test of the conjecture that participants in intercollegiate athletics are more likely to have driven while under the influence in the prior year than non-participants. What is your decision at significance level .06?

(c) How do you think these results would compare with today’s campuses?

9.B.3. Gender and Musical Instrument Choice. Consider the instrument opinion data in Table 9.2. Find a confidence interval for \( p_Y - p_X \), where \( p_X \) = [proportion of 9–11-year-old boys who believe that girls should not play the trumpet] and \( p_Y \) = [proportion of 9–11-year-old girls who believe that girls should not play the trumpet]. Choose your own reasonable confidence level.

9.B.4. Gender and Musical Instrument Choice. Consider the instrument opinion data in Table 9.2. Find the approximate P-value for a test of the hypothesis that both girls and boys agree that boys should not play the flute against the general alternative that they disagree on that issue.

9.B.5. Insect Infection by Parasites. Infection of an insect by a parasite can either lead directly to a lethal disease in the insect itself or it can be transmitted further to a vertebrate host by the insect. (Malaria is an example of the latter case, since it is a disease that is transmitted to man through various mosquito species carrying the infecting Plasmodia parasite.) An important part of such a host-parasite relationship is the defense system of the host. In the case of insects, the presence of the enzyme phenoloxidase has been suggested as a possible deterrent to infection by parasites. This enzyme produces quinones that react with proteins to produce a black pigment, melanin, which is then deposited on a parasite by the insect as part of its defense against it. However, large amounts of active phenoloxidase can also kill insects. Hence, for generation of an adequate, but not lethal, supply of quinones to respond effectively to a parasite, close control of the phenoloxidase activity by the insect is essential. With this in mind, Pye (1974) studied the activation of prophenoloxidase in the plasma of immune Galleria mellonella larvae in response to exposure to a variety of microbial products. For each product, ten Galleria mellonella larvae were involved, with five of them serving as controls (no immunization) and the other five being first immunized through injection of 1.0-μg doses of Shigella flexneri lipopolysaccharide B (Difco). Both the control and immune larvae were then exposed to the microbial product. Using a quick freezing method with acetone-dry ice, the level of prophenoloxidase activation was then obtained for all ten larvae by using a Guilford recording spectrophotometer. The data in Table 9.10 represent the results obtained for five control larvae and five immune larvae exposed to a 1 mg/ml water mixture of the microbial product Zymosan (a yeast polysaccharide). The measurements are in units of prophenoloxidase activity per .20 ml plasma of the larvae.

Table 9.10 Prophenoloxidase activation (units per .20 ml plasma) for control larvae and immune larvae, both exposed to a 1 mg/ml water mixture of the microbial product zymosan

(a) Estimate the difference in the median prophenoloxidase activation levels for the control and immune larvae populations after exposure to the stipulated dose of Zymosan.

(b) Estimate the probability that a randomly selected control larva will exhibit a smaller prophenoloxidase activation level than a randomly selected immune larva after exposure of each to a 1 mg/ml water mixture of Zymosan.

(c) Find a confidence interval for the difference in the median prophenoloxidase activation levels for the control larvae population and the immune larvae population after exposure to the stipulated dose of Zymosan. Choose your own reasonable confidence level.

(d) Find the P-value for a test of the conjecture that immunization of the larvae leads to an increase in prophenoloxidase activation resulting from exposure to a 1 mg/ml water mixture of Zymosan. What is your decision at significance level .05?

9.B.6. Anencephalus and Magnesium in Tap Water. Anencephalus is a fatal, congenital birth anomaly where a child is born without an effectively functioning brain. Links between the occurrence of this disease and a number of environmental factors were investigated in Elwood (1977) and later by Archer (1979). One of the factors that Archer considered to be a possible influence on the anencephalus rate for a region was the magnesium content of its water. He obtained anencephalus rates (deaths from anencephalus / 1000 total births) for 36 cities in Canada for the period (1950-1969), as well as the average magnesium content of their water (parts per million) during that period of time. These two quantities are presented for these Canadian cities in Table 9.11.

Table 9.11 Rate of death from anencephalus per 1000 total births and magnesium in tap water (ppm) for thirty-six cities in Canada for the period (1950–1969)

For this exercise, we divide the Canadian cities into those considered to have unusually high magnesium tap water levels (≥ 7.6 ppm) and those with low magnesium levels (< 7.6 ppm) and search for potential differences in rates of death from anencephalus. (We consider an alternative approach to analyzing these same data without grouping by high or low magnesium level in Chap. 11.)

(a) Provide a list of the anencephalus death rates for the two samples created by this high/low magnesium criterion. What are the two sample sizes?

(b) Find a confidence interval for the difference in mean anencephalus death rate for areas with high magnesium tap water levels and those with low magnesium levels. Choose your own reasonable confidence level.

(c) Find the P-value for a test of the conjecture that cities with high magnesium tap water levels will have greater anencephalus death rates than those with low magnesium levels. What is your decision at significance level .025?

9.B.7. Binge Drinking Athletes: Leaders or Not? In Example 9.1, we used sample data from Leichliter et al. (1998) to compare the percentages of intercollegiate athletes and non-athletes who were involved in binge drinking in the 2 weeks prior to completing a Core and Alcohol Survey. Those authors also differentiated between whether an athlete was simply a member of the team or was considered a leader on the team. The binge drinking data for these two subgroups are presented in Table 9.12. Let \( p_{member} \) and \( p_{leader} \) denote the percentages of all intercollegiate athletes who are team members only and who are team leaders, respectively, that have engaged in binge drinking in the previous 2 weeks.

Table 9.12 Numbers of intercollegiate athletic team members and leaders reporting involvement in binge drinking in the 2 weeks prior to completing the core and alcohol survey

(a) Find a confidence interval for \( p_{leader} - p_{member} \). Choose your own reasonable confidence level.

(b) Find the approximate P-value for an appropriate test of the conjecture that leaders on intercollegiate athletic teams are more likely to have been involved in binge drinking in the previous 2 weeks than are athletes in lesser positions on their teams. What is your decision at significance level .045?

9.B.8. Insect Infection by Parasites. In his study of an insect’s prophenoloxidase activation response to microbial products (see Exercise 9.B.5), Pye (1974) also considered the microbial product Pseudomonas aeruginosa. The prophenoloxidase activation values (units per .20 ml plasma) for five control larvae and five immunized larvae after exposure to .10 ml aliquots of Pseudomonas aeruginosa are given in Table 9.13.

Table 9.13 Prophenoloxidase activation (units per .20 ml plasma) for control larvae and immune larvae, both exposed to .10 ml aliquots of Pseudomonas aeruginosa

(a) Estimate the difference in median prophenoloxidase activation for the control and immune larvae populations after exposure to the stipulated dose of Pseudomonas aeruginosa.

(b) Estimate the probability that a randomly selected control larva will exhibit a smaller prophenoloxidase activation level than a randomly selected immune larva after exposure of each to .10 ml aliquots of Pseudomonas aeruginosa.

(c) Find a confidence interval for the difference in median prophenoloxidase activation for the control larvae population and the immune larvae population after exposure to the stipulated dose of Pseudomonas aeruginosa. Choose your own reasonable confidence level.

(d) Find the P-value for a test of the conjecture that immunization of the larvae leads to an increase in prophenoloxidase activation resulting from exposure to .10 ml aliquots of Pseudomonas aeruginosa. What is your decision at significance level .05?

9.B.9. Anencephalus and Geomagnetic Flux. In his study of factors affecting anencephalus death rates (see Exercise 9.B.6), Archer (1979) also considered the possible linkage between these rates and the horizontal geomagnetic flux of a region. The horizontal geomagnetic flux of a region has a strong influence on where incoming charged cosmic particles strike the earth’s atmosphere, with higher flux regions diverting the particles to those with low flux. Since ionizing radiation is a known mutagen and carcinogen, it is possible that some of the geographical differences in congenital anomalies, such as anencephalus, could be accounted for by the differing intensities of cosmic radiation for the geographical regions. Dividing the 36 cities into those with high (≥ .0162) and low (< .0162) horizontal geomagnetic flux, respectively, the corresponding anencephalus death rates are given in Table 9.14.

Table 9.14 Rate of death from anencephalus per 1000 total births for thirty-six cities in Canada, divided into groups with high (≥ .0162) and low (< .0162) horizontal geomagnetic flux values, for the period (1950–1969)

(a) Estimate the difference in mean death rates from anencephalus for cities with high horizontal geomagnetic flux values and those with low flux values.

(b) Find a lower confidence bound for the difference in mean anencephalus death rate for areas with high horizontal geomagnetic flux values and those with low flux values. Choose your own reasonable confidence level.

(c) Find the P-value for a test of the conjecture that cities with high horizontal geomagnetic flux values will have greater anencephalus death rates than those with low flux levels. What is your decision at significance level .030?

9.B.10. Driving Under the Influence and Gender. In Exercise 9.B.2, we used sample data from Leichliter et al. (1998) to compare the percentages of intercollegiate athletes and non-athletes who had driven while under the influence during the prior year. Those authors also reported the gender of the respondents. These results are presented in Table 9.15 for participants in intercollegiate sports.

Table 9.15 Numbers of male and female intercollegiate athletes reporting that they have driven under the influence in the last year prior to completing the core and alcohol survey

Let \( p_{femalepar} \) and \( p_{malepar} \) denote the percentages of all female and male participants in intercollegiate athletics, respectively, who have driven under the influence during the prior year.

(a) Find a confidence interval for \( p_{femalepar} - p_{malepar} \). Choose your own reasonable confidence level.

(b) Find the approximate P-value for an appropriate test of the conjecture that female participants in intercollegiate athletics are less likely to have driven while under the influence in the prior year than are male participants in intercollegiate athletics. What is your decision at significance level .06?

9.B.11. If You Have Seen One Slug, Have You Seen Them All? In Examples 9.3, 9.4, 9.5, and 9.6 we discussed statistical analyses of the data collected by Whelan (1982) on how the woodland site and waste site slugs responded to the toxic plant Allium ursinum, commonly found in woodland but not waste sites, as the test gel. It would, of course, also be of interest to see how the two types of slugs responded to a toxic plant that was commonly found in waste, but not woodland, sites. In Table 9.16 we present precisely those data for the toxic waste site plant Rumex obtusifolius.

Table 9.16 Acceptability indices (AI) for Arion subfuscus from woodland and waste sites with the toxic waste site plant Rumex obtusifolius as test gel

Conduct the same statistical analyses as in Examples 9.3, 9.4, 9.5, and 9.6 for the data on the toxic waste site plant Rumex obtusifolius in Table 9.16. Discuss the similarities and differences between your findings and those obtained in Examples 9.3, 9.4, 9.5, and 9.6 for the toxic woodland plant Allium ursinum.

9.B.12. Hospital Admissions: Substance Abuse and/or Mental Illness. In a study of hospital admissions and related costs, Salit et al. (1998) collected hospital discharge and admissions records from New York City public and private hospitals for the 2 years 1992 and 1993. Among other things, they found that 44,959 of the 244,345 public hospital admissions during that period were for substance abuse and/or mental illness. For private hospitals, 37,982 out of 139,641 admissions were for substance abuse and/or mental illness.

(a) Viewing these data from New York City as reasonably representative of data from all public and private hospitals, estimate the difference in the percentages of admissions due to substance abuse and/or mental illness for private and public hospitals.

(b) Find an approximate 95% confidence interval for the difference in the percentages of admissions due to substance abuse and/or mental illness for private and public hospitals.
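Because the counts for this exercise are given in full, the computation can be sketched directly. The Python fragment below assumes the usual large-sample interval \( \hat{p}_1 - \hat{p}_2 \pm z^{*} \sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2} \); the chapter’s own formula is not reproduced here and may differ in detail. The function name `diff_prop_ci` and the choice to take the difference as private minus public are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def diff_prop_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample interval for p1 - p2 (an assumed standard form)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # two-sided critical value
    est = p1 - p2
    return est, est - z * se, est + z * se


# Private minus public, using the counts quoted in the exercise.
est, lo, hi = diff_prop_ci(37982, 139641, 44959, 244345)
print(f"estimate {est:.4f}, approximate 95% CI ({lo:.4f}, {hi:.4f})")
```

With sample sizes this large the interval is very narrow, so the practical question is less about sampling variability and more about whether New York City is representative of hospitals elsewhere.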

9.B.13. Baseball and Beer! Baseball is the American pastime, but what goes with watching a baseball game? The well-known song says peanuts and crackerjack, but how about some beer to wash those snacks down? Wolfe et al. (1998) conducted a study to see just how much beer and baseball have become synonymous. Male spectators of drinking age were sampled over a three-game period (a Friday night, a Saturday afternoon, and a Monday night) during the 1993 season at two major league ballparks. Wolfe et al. found that 65 out of 166 sampled spectators in the age group 20–35 had consumed alcohol immediately prior to entering the ballpark. For the age group 36–50, they found that 44 of the 145 sampled individuals had consumed alcohol immediately prior to entering the ballpark. Find the approximate P-value for a test of the conjecture that fans in the age group 20–35 are more likely to consume alcohol prior to going to a major league ball game than are fans in the age group 36–50.
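One way to organize this calculation is sketched below in Python. It assumes the common large-sample test that pools the two samples to estimate the standard error under the null hypothesis of equal proportions; the procedure presented in the chapter may differ in detail, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_upper_pvalue(x1, n1, x2, n2):
    """Approximate P-value for H0: p1 = p2 versus HA: p1 > p2 (pooled z test)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion estimated under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 1 - NormalDist().cdf(z)  # upper-tail area


# Ages 20-35 (65 of 166) versus ages 36-50 (44 of 145).
z, p_value = two_prop_upper_pvalue(65, 166, 44, 145)
print(f"z = {z:.3f}, approximate P-value = {p_value:.4f}")
```

The alternative is one-sided, so only the upper tail of the standard normal distribution is used.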

9.B.14. Baseball and Beer and Age. In their study of beer and baseball (see Exercise 9.B.13), Wolfe et al. (1998) also found that 28 out of 212 sampled spectators in the age group 20–35 were legally intoxicated at the end of the fifth inning of the baseball game. The analogous sampling for the age group 51–65 yielded 4 out of 16 sampled spectators who were legally intoxicated at that stage of the ball game. Find an approximate 90% confidence interval for the difference in percentages of baseball fans in the age groups 20–35 and 51–65 who will be legally intoxicated at the end of the fifth inning of a baseball game.

9.B.15. Did All Americans Have the Same Access to a Home Computer? Internet usage is the norm for Americans today, but were there differences between groups within America in the 1990s as far as Internet access was concerned? Hoffman and Novak (1998) considered data provided by Nielsen Media Research from the Spring 1997 CommerceNet/Nielsen Internet Demographic Study (IDS), conducted from December 1996 through 1997. Among other things, the study found that 2173 of 4906 white respondents owned a home computer, while the corresponding figures for African Americans were 143 home computer owners out of 493 respondents. Find the approximate P-value for a test of the conjecture that the percentage of home computer owners was greater for white Americans than for African Americans in the 1990s.

9.B.16. Buying a Personal Computer. Hoffman and Novak (1998) considered data provided by Nielsen Media Research from the Spring 1997 CommerceNet/Nielsen Internet Demographic Study (IDS), conducted from December 1996 through 1997. One part of the data collected involved the number of respondents who planned to buy a personal computer in the next 6 months. Those figures for white Americans and African Americans were 819 out of 4906 and 134 out of 493, respectively. Find an approximate 97.5% confidence interval for the difference in percentages of white Americans and African Americans who planned to buy a personal computer in the 6 months following completion of the survey data collection in 1997. Comment on this finding in conjunction with the result of Exercise 9.B.15.

9.B.17. Removing Spots and Stains From Works of Art on Paper. Consider the study by Eirk (1972) in which she compared various approaches to removing stains or spots from works of art on paper, as previously discussed in Exercise 9.4.8. A second feature used for comparison of these treatments was the bursting strength in pounds per square inch of the dried paper following treatment. Again using the relatively white ledger paper without disfiguring effects, the observed average bursting strength for ten replicates of the powdered sodium formaldehyde sulfoxylate (SFS) treatment was \( {\overline{x}}_{SFS} \) = 36.4 pounds per square inch, with standard deviation \( s_{SFS} \) = 5.17 pounds per square inch, while the average bursting strength for ten replicates of the 1:2 aqueous 5% hypochlorite/5% sodium metabisulfite (HSM) treatment was \( {\overline{x}}_{HSM} \) = 18.5 pounds per square inch, with standard deviation \( s_{HSM} \) = 1.63 pounds per square inch. Assume that bursting strengths for the SFS and HSM treatments are normally distributed with means \( \mu_{SFS} \) and \( \mu_{HSM} \), respectively, and common variance \( \sigma^2 \).

(a) Estimate the difference in mean bursting strengths \( \mu_{SFS} - \mu_{HSM} \).

(b) Find a lower confidence bound for \( \mu_{SFS} - \mu_{HSM} \). Choose your own reasonable confidence level.

(c) Find the P-value for a test of \( H_0\colon \mu_{SFS} = \mu_{HSM} \) against the one-sided alternative \( H_A\colon \mu_{SFS} > \mu_{HSM} \). What is your decision at significance level .001?

(d) Do you feel comfortable with the assumption of common variance for the SFS and HSM bursting strengths? Why or why not? What alternative could you pursue if you are not comfortable with the assumption?
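Since only summary statistics are available for this exercise, the test statistic has to be built from them by hand. The Python sketch below assumes the standard pooled-variance two-sample t statistic with m + n − 2 degrees of freedom, which matches the common-variance assumption stated in the exercise; the function name `pooled_t` is illustrative and not from the text.

```python
from math import sqrt


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Pooled two-sample t statistic under an assumed common variance."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance
    se = sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of the mean difference
    return (mean1 - mean2) / se, df, sqrt(sp2)


# Summary statistics quoted in the exercise: SFS first, then HSM.
t_stat, df, sp = pooled_t(36.4, 5.17, 10, 18.5, 1.63, 10)
print(f"t = {t_stat:.2f} on {df} df, pooled SD = {sp:.2f}")
```

The P-value for part (c) is then the upper-tail area beyond this statistic for a t distribution with the stated degrees of freedom, which can be read from a table or obtained from statistical software.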

9.B.18. Will My Hair EVER Grow Again? One of the major concerns for men as they age is whether they will lose some or all of their hair. While it is well known that much of male baldness can be blamed on genetic inheritance from mom (guys, look at the men on your mother’s side for clues), hair restoration after initial loss of hair has become an important cosmetic industry for men. Kamimura et al. (2000) studied the effect that topical application of procyanidin B-2 (PB-2) isolated from apple juice might have on new hair growth. For 6 months they treated one group of 19 balding men twice a day with 1.8 ml of agent containing 1% PB-2, corresponding to 30 mg of PB-2 daily. A second group of 10 balding men served as a control group. They were treated in exactly the same way, except that the agent contained no PB-2. No other hair care products except shampoos and rinses were permitted during the study. Before and after the six-month period, hairs at a predetermined site were clipped from each participating subject and the diameters of the collected hairs were measured. The change in total hairs per .25 cm² and the change in terminal hairs (defined as >60 μm in diameter) for each of the participants were recorded and are presented in Table 9.17.

Table 9.17 Total and terminal hair growth in each subject

  (a) Estimate the difference in the medians for total hair growth in the control and PB-2 treated populations.

  (b) Estimate the probability that a randomly selected control individual will exhibit a smaller amount of total hair growth than a randomly selected individual treated with PB-2.

  (c) Find a confidence interval for the difference in the medians in total hair growth for the control and PB-2 treated populations. Choose your own reasonable confidence level.

  (d) Find the P-value for a test of the conjecture that treatment with PB-2 improves the amount of total hair growth for balding individuals. What is your decision at significance level .025?
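For parts (a) and (b), one common rank-based approach is to work with all pairs of observations, one from each sample. The sketch below assumes the Hodges-Lehmann estimator (the median of all pairwise differences) for the shift between the two populations and the proportion of pairs with the control value below the treated value for the probability; the text may intend a different estimator. The data are hypothetical placeholders, not the values from Table 9.17, and the function names are introduced here for illustration.

```python
from itertools import product
from statistics import median


def hodges_lehmann_shift(x, y):
    """Median of all pairwise differences y_j - x_i."""
    return median(yj - xi for xi, yj in product(x, y))


def prob_x_less_than_y(x, y):
    """Proportion of (x_i, y_j) pairs with x_i < y_j, counting ties as one half."""
    score = sum(
        1.0 if xi < yj else 0.5 if xi == yj else 0.0
        for xi, yj in product(x, y)
    )
    return score / (len(x) * len(y))


# Hypothetical changes in hair counts (NOT the data from Table 9.17)
control = [1.0, -2.0, 0.5, 3.0]
treated = [4.0, 2.5, 6.0, 1.5, 5.0]

print("estimated shift (treated - control):", hodges_lehmann_shift(control, treated))
print("estimated P(control < treated):", prob_x_less_than_y(control, treated))
```

Note the sign convention: the shift here is treated minus control, so reverse the arguments if the opposite difference is wanted.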

9.B.19. Will My Hair EVER Grow Again? Consider the hair growth study by Kamimura et al. (2000) discussed in Exercise 9.B.18. Answer parts (a) through (d) of that Exercise again for total terminal hair growth.

9.B.20. Will My Hair EVER Grow Again? Consider the hair growth study by Kamimura et al. (2000) discussed in Exercise 9.B.18. Assume that total terminal hair growth for the Control and PB-2 treated populations is normally distributed with means \( \mu_{Control} \) and \( \mu_{PB-2} \) and variances \( {\sigma}_{Control}^2 \) and \( {\sigma}_{PB-2}^2 \), respectively.

  (a) Estimate the difference in mean hair growth \( \mu_{PB-2} - \mu_{Control} \).

  (b) Find an upper confidence bound for \( \mu_{Control} - \mu_{PB-2} \). Choose your own reasonable confidence level.

  (c) Find the P-value for a test of \( H_0: \mu_{Control} = \mu_{PB-2} \) against the one-sided alternative \( H_A: \mu_{PB-2} > \mu_{Control} \). What is your decision at significance level .010?
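When the two population variances are allowed to differ, as in this exercise, a standard choice is the unpooled (Welch) t statistic with Satterthwaite's approximate degrees of freedom. The sketch below assumes that approach; the text's procedure may differ in detail, for example in how it sets the degrees of freedom. The function name `welch` and the data are hypothetical and are not taken from Table 9.17.

```python
from math import sqrt
from statistics import mean, variance


def welch(x, y):
    """Return (mean(y) - mean(x), standard error, t statistic, approximate df).

    Uses separate sample variances and the Welch-Satterthwaite
    approximation for the degrees of freedom.
    """
    vx = variance(x) / len(x)  # estimated variance of the mean of x
    vy = variance(y) / len(y)  # estimated variance of the mean of y
    diff = mean(y) - mean(x)
    std_err = sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx**2 / (len(x) - 1) + vy**2 / (len(y) - 1))
    return diff, std_err, diff / std_err, df


# Hypothetical hair-growth changes (NOT the data from Table 9.17)
control = [1.0, 3.0, 2.0, 4.0]
treated = [5.0, 9.0, 7.0, 6.0, 8.0]

diff, std_err, t_stat, df = welch(control, treated)
print(f"difference = {diff:.2f}, standard error = {std_err:.3f}")
print(f"t statistic = {t_stat:.2f}, approximate df = {df:.2f}")
```

The one-sided P-value is then the upper-tail area beyond `t_stat` under a t distribution with the approximate `df`, obtained from a table or statistical software.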

9.B.21. Will My Hair EVER Grow Again? Consider the hair growth study by Kamimura et al. (2000) discussed in Exercise 9.B.18. Assume that total hair growth for the Control and PB-2 treated populations is normally distributed with means \( \mu_{Control} \) and \( \mu_{PB-2} \) and variances \( {\sigma}_{Control}^2 \) and \( {\sigma}_{PB-2}^2 \), respectively.

  (a) Estimate the difference in mean total hair growth \( \mu_{PB-2} - \mu_{Control} \).

  (b) Find an upper confidence bound for \( \mu_{Control} - \mu_{PB-2} \). Choose your own reasonable confidence level.

  (c) Find the P-value for a test of \( H_0: \mu_{Control} = \mu_{PB-2} \) against the one-sided alternative \( H_A: \mu_{PB-2} > \mu_{Control} \). What is your decision at significance level .010?

9.C. Activities

9.C.1. Are Female College Students More Liberal With Regard to Social Issues Than Male College Students? Design an experiment (including the appropriate data to collect and how to collect it) that will enable you to statistically address this question. Collect the relevant data for samples of 10 men and 10 women, conduct an appropriate set of statistical analyses, and write a two-page report describing your experiment and statistical conclusions.

9.C.2. Do College Science/Math Majors Spend Less Time Exercising Per Week than College Non-Science/Non-Math Majors? Design an experiment (including the appropriate data to collect and how to collect it) that will enable you to statistically address this question. Collect the relevant data for samples of 10 college science/math majors and 10 college non-science/non-math majors, conduct an appropriate set of statistical analyses, and write a two-page report describing your experiment and statistical conclusions.

9.C.3. Just For You! Find a journal article in a field of your interest that presents the results of a study that involved independent samples from two distinct populations. Prepare a short (2-3 pages) summary report of the statistical findings in the article and attach a copy of the original paper with your summary.

9.C.4. M&M Colors—Peanuts Versus Plain. Mars, Inc. makes both M&M’s Plain and M&M’s Peanut candies. They claim that their production processes provide for the same percentage red pieces for both the plain and the peanut candies. Design an experiment (including the appropriate data to collect and how to collect it) that will enable you to statistically address this claim. Collect adequate relevant data, conduct an appropriate set of statistical analyses, and write a three-page report describing your experiment and statistical conclusions. (You can eat the M&M’s upon completion of your report!)
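Once counts of red pieces are in hand, one standard large-sample analysis of this claim is a two-sided z test for equality of two proportions. The sketch below assumes that test with a pooled proportion under the null hypothesis; the counts are hypothetical, and the function name `two_proportion_z` is introduced here for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(red_1, n_1, red_2, n_2):
    """Two-sided large-sample z test of H0: p1 = p2 using the pooled proportion."""
    p1, p2 = red_1 / n_1, red_2 / n_2
    pooled = (red_1 + red_2) / (n_1 + n_2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 30 red among 200 plain pieces, 20 red among 200 peanut pieces
z, p_value = two_proportion_z(30, 200, 20, 200)
print(f"z = {z:.3f}, two-sided P-value = {p_value:.3f}")
```

The normal approximation is reasonable only when each sample contains a fair number of both red and non-red pieces, which is one consideration when deciding how much data is "adequate".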

9.C.5. Lasting Power—Pennies or Nickels? Is there a difference between the length of time that U.S. pennies and nickels stay in common circulation? Design an experiment (including the appropriate data to collect and how to collect it) that will enable you to statistically address this question. Collect adequate relevant data, conduct an appropriate set of statistical analyses, and write a three-page report describing your experiment and statistical conclusions.

9.C.6. Do Female College Students Study More Than Male College Students? Design an experiment (including the appropriate data to collect and how to collect it) that will enable you to statistically address this question. Collect the relevant data for samples of 10 male college students and 10 female college students, conduct an appropriate set of statistical analyses, and write a two-page report describing your experiment and statistical conclusions.

9.C.7. Do Male College Students Get Better Grades Than Female College Students? Design an experiment (including the appropriate data to collect and how to collect it) that will enable you to statistically address this question. Collect the relevant data for samples of 10 male college students and 10 female college students, conduct an appropriate set of statistical analyses, and write a two-page report describing your experiment and statistical conclusions.

9.C.8. Does Smoking Participation Decrease with College Advancement? Design an experiment (including the appropriate data to collect and how to collect it) that will enable you to statistically address this question. Collect the relevant data for samples of 10 underclassmen (freshmen or sophomores) and 10 upperclassmen (juniors or seniors), conduct an appropriate set of statistical analyses, and write a two-page report describing your experiment and statistical conclusions.

9.C.9. Who Has More Friends on Facebook—Men or Women? Design an experiment (including the appropriate data to collect and how to collect it) that will enable you to statistically address this question. Collect the relevant data for samples of 10 men and 10 women, conduct an appropriate set of statistical analyses, and write a two-page report describing your experiment and statistical conclusions.

9.C.10. Do College Students Sleep Later on Weekends Than Their Parents? Design an experiment (including the appropriate data to collect and how to collect it) that will enable you to statistically address this question. Collect the relevant data for samples of 20 college students and 20 parents (from different families), conduct an appropriate set of statistical analyses, and write a two-page report describing your experiment and statistical conclusions.

9.D. Internet Archives

9.D.1. Surveys. Identify three organizations that routinely collect survey data on current topics and locate the Internet sites where they periodically present the results of their surveys. Select one such survey of interest to you that involves comparison of percentages for at least two groups and prepare a brief report on its findings.

9.D.2. Federal Government. Identify three government agencies that routinely gather national data and locate the Internet sites where they periodically present the updates to their data collections. Select one specific data collection that is of interest to you and prepare a brief report using the data to compare two groups.

9.D.3. Professional Societies. Identify three professional societies that routinely gather information relevant to their membership and locate the Internet sites where they report their findings. Select one specific data collection that is of interest to you and prepare a brief report using the data to compare two groups.

9.D.4. Nonprofit Organizations. Identify three nonprofit organizations that routinely gather information relevant to their cause and locate the Internet sites where they report their findings. Select one specific data collection that is of interest to you and prepare a brief report using the data to compare two groups.

9.D.5. Academic Organizations. Identify three academic entities that routinely gather information relevant to their ongoing research projects and locate the Internet sites where they report their findings. Select one specific data collection that is of interest to you and prepare a brief report using the data to compare two groups.

9.D.6. Medical Research. Use the Internet to locate a paper published in a medical field within the past 2 years that presents a study involving data collection and comparison of two groups. If the data are not actually available in the published article, contact the authors to see if they will allow you to access the data. If you are successful, use the data to verify the statistical summary in the published article.

9.D.7. Climate Change Research. Use the Internet to locate a paper published within the past 2 years on a topic related to climate change that presents a study involving data collection and comparison of two groups. If the data are not actually available in the published article, contact the authors to see if they will allow you to access the data. If you are successful, use the data to verify the statistical summary in the published article.

9.D.8. Social Science Research. Use the Internet to locate a paper published in a social science field within the past 2 years that presents a study involving data collection and comparison of two groups. If the data are not actually available in the published article, contact the authors to see if they will allow you to access the data. If you are successful, use the data to verify the statistical summary in the published article.

9.D.9. Humanities Research. Use the Internet to locate a paper published in a humanities field within the past 2 years that presents a study involving data collection and comparison of two groups. If the data are not actually available in the published article, contact the authors to see if they will allow you to access the data. If you are successful, use the data to verify the statistical summary in the published article.

9.D.10. STEM Research. Use the Internet to locate a paper published in a STEM field (science, technology, engineering, or mathematics) within the past 2 years that presents a study involving data collection and comparison of two groups. If the data are not actually available in the published article, contact the authors to see if they will allow you to access the data. If you are successful, use the data to verify the statistical summary in the published article.

Copyright information

© 2017 Springer International Publishing AG

Cite this chapter

Wolfe, D.A., Schneider, G. (2017). Statistical Inference for Two Populations–Independent Samples. In: Intuitive Introductory Statistics. Springer Texts in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-56072-4_9
