Statistics Primer

Introduction
Representation of Data
Descriptive Statistics
Correlational Statistics
Inferential Statistics
Summary

### Introduction

To understand God's thoughts we must study statistics; for these are the measure of his purpose.
(Florence Nightingale, 1820-1910)

Sociological research can have three distinct goals: description, explanation, and prediction. Description is always an important part of research, but most sociologists attempt to explain and predict what they observe. The three research methods most commonly used by sociologists are observational techniques, surveys, and experiments. In each case, measurement is involved that yields a set of numbers, which are the findings, or data, produced by the research study. Sociologists and other scientists summarize data, find relationships between sets of data, and determine whether experimental manipulations have had an effect on some variable of interest.

The word statistics has two meanings: (1) the field that applies mathematical techniques to the organizing, summarizing, and interpreting of data, and (2) the actual mathematical techniques themselves. Knowledge of statistics has many practical benefits. Even a rudimentary knowledge of statistics will make you better able to evaluate statistical claims made by science reporters, weather forecasters, television advertisers, political candidates, government officials, and other persons who may use statistics in the information or arguments they present.

### Representation of Data

Because a list of raw data may be difficult to interpret, sociologists prefer to represent their data in an organized way. Two of the most common ways are frequency distributions and graphs.

Frequency Distributions

Suppose that you had a set of 20 scores from a 100-point sociology exam. You might arrange them in a frequency distribution, listing the frequency of each score or group of scores in a set of scores. Using the set of scores in Table B.1, you would set up a column including the highest and lowest scores, as well as the possible scores in between. In this case, the highest score is 94 and the lowest is 80. You would then count the frequency of each score and list it in a separate column. The total of the frequencies in the distribution is symbolized by the letter N.

The frequency distribution might show a pattern in the set of scores that is not apparent when simply examining the individual scores. In this example (presented in Table B.1), the exam scores do not bunch up toward the lower, middle, or upper portions of the distribution. In some cases, typically when the difference between the highest score and the lowest score is greater than 15, you might prefer to use a grouped frequency distribution. The scores are grouped into intervals, and the frequency of scores in each interval is listed in a separate column. The intervals can be of any size, but, for ease of construction, a grouped frequency distribution should end up with no more than about 10 groups. A grouped frequency distribution provides less precise information than does an ungrouped one, because the individual scores are lost. However, the benefit of a grouped frequency distribution is that one can understand any trends in the data at a quick glance.
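As a rough illustration of the two kinds of table described above, the following sketch builds an ungrouped and a grouped frequency distribution. The 20 scores are hypothetical (they are not the data from Table B.1), and the interval width of 5 is an arbitrary choice made for the example.

```python
from collections import Counter

# Hypothetical exam scores (NOT the data from Table B.1).
scores = [94, 92, 92, 90, 90, 90, 88, 88, 87, 86,
          86, 85, 84, 84, 83, 82, 82, 81, 80, 80]

# Ungrouped distribution: one row per possible score, highest to lowest.
freq = Counter(scores)
for score in range(max(scores), min(scores) - 1, -1):
    print(score, freq[score])

n = sum(freq.values())  # N, the total of the frequencies
print("N =", n)

# Grouped distribution: intervals of width 5 (an arbitrary choice).
width = 5
grouped = Counter((s // width) * width for s in scores)
for low in sorted(grouped, reverse=True):
    print(f"{low}-{low + width - 1}", grouped[low])
```

The grouped table is shorter and easier to scan, but it no longer shows which individual scores fell inside each interval.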

Learning Check #1: Suppose we ask 23 students how many music CDs they own. Present the following data in a frequency distribution: 43, 15, 52, 24, 84, 36, 75, 70, 98, 44, 56, 60, 48, 41, 38, 7, 62, 49, 32, 71, 25, 46, 58.

Graphs

If a picture is worth a thousand words, then a graph is worth several paragraphs in a research report. Because it provides a pictorial representation of the distribution of scores, a graph can be an even more effective representation of research data than a frequency distribution. Among the most common kinds of graphs are pie graphs, frequency histograms, frequency polygons, and line graphs.

Pie Graph

A simple, but visually effective, way of representing data is the pie graph. It represents data as percentages of a pie-shaped graph. The total of the slices of the pie must add up to 100 percent.
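The arithmetic behind a pie graph can be sketched as follows. The category names and counts are hypothetical; the point is only that each slice is a count divided by the total, and that the slices together make up 100 percent (or 360 degrees).

```python
# Hypothetical category counts (not taken from the text).
counts = {"Sociology": 30, "Psychology": 45, "History": 15, "Biology": 10}

total = sum(counts.values())
percentages = {name: 100 * count / total for name, count in counts.items()}

for name, pct in percentages.items():
    # Each slice's angle is its share of the full 360-degree circle.
    print(f"{name}: {pct:.1f}% ({360 * pct / 100:.0f} degrees)")
```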

Learning Check #2: Suppose in a class of 150 students there are 13 First-year students, 68 Sophomores, 50 Juniors, and 19 Seniors. Construct a pie chart to illustrate these data.

Frequency Polygon

A frequency polygon serves the same purpose as a frequency histogram. As shown in Figure B.2, the frequency polygon is drawn by connecting the points, representing frequencies, located above the scores. Note that the polygon is completed by extending it to the abscissa one score below the lowest score and one score above the highest score in the distribution.

An advantage of the frequency polygon over the frequency histogram is that it permits the plotting of more than one distribution on the same set of axes. Plotting more than one frequency histogram on a set of axes would create a confusing graph. If more than one frequency polygon is plotted on a set of axes, they should be distinguished from one another. This can be done by drawing a different kind of line for each polygon (perhaps a solid line for one and a broken line for the other), drawing the lines in different colors (perhaps red for one polygon and blue for the other), or representing the points above the scores with geometric shapes (perhaps a circle for one polygon and a triangle for the other).

There are a few shapes that a frequency polygon can take that are particularly interesting to sociologists and other social researchers. A graph in which scores bunch up toward either end of the abscissa (as shown in Figure B.3) is said to be skewed. The skewness of a graph is in the direction of its "tail." If the scores bunch up toward the high end, the graph has a negative skew. If the scores bunch up toward the low end, the graph has a positive skew. A distribution is said to be normal (or bell-shaped) if the scores bunch up in the middle and then taper off fairly equally on each side. Finally, a distribution is called a rectangular distribution if the scores are fairly evenly distributed throughout the graph.

Learning Check #3: Remember the 23 students who reported the number of music CDs that they own? Present the following data in a frequency polygon: 43, 15, 52, 24, 84, 36, 75, 70, 98, 44, 56, 60, 48, 41, 38, 7, 62, 49, 32, 71, 25, 46, 58.

Learning Check #4: What shape is the distribution graphed in Learning Check #3?
Learning Check #5: What would have made the distribution take on a positive skew? A negative skew? A rectangular shape?

Line Graph

Whereas pie graphs, frequency histograms, and frequency polygons are useful for plotting frequency data, a line graph is useful for plotting data generated by experimental social research. It uses lines to represent the relationship between independent variables and dependent variables. If you skim through your introductory sociology textbook, you will see several examples of line graphs. The graph shown in Figure B.4 represents the data from an investigation of the relationship between exercise and weight loss. In this figure, one line represents a group of people who agree to exercise regularly and the other line represents a group of people who do not engage in exercise. In all other ways these two groups are equal. They are weighed one week after agreeing to participate in the study and again two weeks after agreeing to participate. The graph lets the reader see quickly the benefits of exercise on weight loss.

### Descriptive Statistics

Suppose you gained access to the hundreds, or thousands, of high school grade point averages of all the freshmen at your college or university. What is the most typical score? How similar are the scores? Simply scanning the scores would provide, at best, gross approximations of the answers to these questions. To obtain precise answers, sociologists use descriptive statistics, which include measures of central tendency and measures of variability.

Measures of Central Tendency

A measure of central tendency is a single score that best represents an entire set of scores. The measures of central tendency include the mode, the median, and the mean.

Mode

The mode is the most frequently occurring score in a set of scores. In the frequency distribution of exam scores discussed, the mode is 90. If two scores occur equally often, the distribution is bimodal. If the data set is made up of a counting of categories, then the category with the most cases is considered the mode. For example, in determining the most common academic major at your school, the mode is the major with the most students. The winner of a presidential primary election in which there are several candidates would represent the mode--the person selected by more voters than any other.

The mode can be the best measure of central tendency for practical reasons. Imagine a car dealership given the option of carrying a particular model, but limited to selecting just one color. The dealership owner would be wise to choose the modal color.
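A minimal sketch of finding the mode, for both category data and numeric scores, might look like this. The car colors and quiz scores are hypothetical, and the helper name `modes` is made up for the example; it returns every value tied for the highest frequency so that a bimodal set is reported as such.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

# Hypothetical data, used only for illustration.
car_colors = ["red", "blue", "silver", "blue", "white", "blue", "red"]
quiz_scores = [7, 8, 8, 9, 9, 10]

print(modes(car_colors))   # a single modal category
print(modes(quiz_scores))  # two values tie, so the set is bimodal
```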

Learning Check #5: A researcher is interested in the effect of family size on self-esteem. To begin this study, 10 students are each asked how many brothers and sisters they have. The responses are as follows: 2, 3, 1, 0, 9, 2, 3, 2, 4, 2. What is the mode for this set of data?

Mean

The mean is the arithmetic average, or simply the average, of a set of scores. You are probably more familiar with it than any other measure of central tendency. You encounter the mean in everyday life whenever you calculate your exam average, batting average, gas mileage average, or a host of other averages.

The mean of a sample is calculated by adding all the scores and dividing by the number of scores.

Exam Scores: 99, 92, 93, 94, 97
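Worked through for the five exam scores listed above, the calculation can be sketched as:

```python
exam_scores = [99, 92, 93, 94, 97]  # the scores listed above

total = sum(exam_scores)            # 99 + 92 + 93 + 94 + 97 = 475
mean = total / len(exam_scores)     # 475 / 5

print(mean)  # 95.0
```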

Learning Check #6: What is the mean number of brothers and sisters listed in Learning Check #5?

Median

The median is the middle score in a distribution of scores that have been ranked in numerical order. If the median is located between two scores, it is assigned the value of the midpoint between them (for example, the median of 23, 34, 55, and 68 would equal 44.5). The median is the best measure of central tendency for skewed distributions, because it is unaffected by extreme scores. Note that in the example below the median is the same in both sets of exam scores, even though the second set contains an extreme score. The mean is quite different, due to the one extreme score on Exam B.

Exam A: 23, 25, 63, 64, 67

Exam B: 23, 25, 63, 64, 98
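The comparison can be checked with a short sketch that uses Python's standard `statistics` module on the two sets of exam scores above, plus the four-score example from the text:

```python
from statistics import mean, median

exam_a = [23, 25, 63, 64, 67]
exam_b = [23, 25, 63, 64, 98]  # same as Exam A except for one extreme score

print("Exam A: median", median(exam_a), "mean", mean(exam_a))  # 63 and 48.4
print("Exam B: median", median(exam_b), "mean", mean(exam_b))  # 63 and 54.6

# With an even number of scores, the median is the midpoint of the middle two.
print(median([23, 34, 55, 68]))  # 44.5
```

The single extreme score raises the mean by more than six points while leaving the median untouched.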

When Disraeli pointed out the ease of lying with statistics, he might have been referring, in particular, to measures of central tendency. Suppose a baseball general manager is negotiating with an agent about a salary for a baseball catcher of average ability. Both might use a measure of central tendency to prove their own points, perhaps based on the salaries of the top seven catchers, as shown in Table B.2. The general manager might claim that a salary of \$340,000 (the median) would provide the player with what he deserves, based on an average salary of the other players. The agent might counter that a salary of \$900,000 (the mean) would provide the player with what he deserves, based on an average salary of the other players. Note that neither would technically be lying: they would simply be using statistics that favored their position. As Scottish writer Andrew Lang (1844-1912) warned, beware of anyone who "uses statistics as a drunken man uses lampposts--for support rather than for illumination."

Learning Check #7: What is the median number of brothers and sisters listed in Learning Check #5?

Learning Check #8: Note that the mean number of brothers and sisters is quite a bit different from the median number of brothers and sisters. In this case, which measure of central tendency would be most appropriate to report? Why?

Measures of Variability

Although a measure of central tendency is certainly important, it does not completely represent a distribution by itself. Given a measure of central tendency, you have an idea of where scores tend to fall, but you don’t know to what extent the scores differ from one another. A measure of the amount of dispersion contained within a data set is called a measure of variability. Except when all scores in a data set are identical, all sets of scores vary to some degree. Consider the members of your sociology class. They would vary on a host of measures, including height, weight, and grade point average. Measures of variability include the range, the variance, and the standard deviation.

Range

The range is the difference between the highest and lowest scores in a distribution. The range provides limited information, because distributions in which scores bunch up toward the beginning, middle, or end of the distribution might have the same range. Of course the range is useful as a rough estimate of how a score compares with the highest and lowest in a distribution. For example, a student might find it useful to know whether he or she did near the best or the worst on an exam. The range of scores in the distribution of 20 grades in the earlier example in Table B.1 would be the difference between 94 and 80, or 14.

Learning Check #9: A social researcher would like to know how many digits people in different age categories can recall with only one presentation of a list. She creates random lists of digits and presents them to participants. The number of digits recalled by the first 10 participants is as follows: 5, 9, 6, 10, 9, 7, 8, 7, 9, 12. What is the range of this data set?

Variance

A more informative measure of variability is the variance, which represents the variability of scores around their group mean. Unlike the range, the variance takes into account every score in the distribution. Technically, the variance is the average of the squared deviations from the mean.

Suppose you wanted to calculate the variance for the sets of 10-point quiz scores in Quiz A and Quiz B (Table B.3). First, find the group mean. Second, find the deviation of each score from the group mean. Note that deviation scores will be negative for scores that are below the mean. As a check on your calculations, the sum of the deviation scores should equal zero. Third, square the deviation scores. By squaring the scores, negative scores are made positive and extreme scores are given relatively more weight. Fourth, find the sum of the squared deviation scores. Fifth, divide the sum by the number of scores. This yields the variance. Note that the variance for Quiz A is larger than that for Quiz B, indicating the students were more varied in their performances on Quiz A.
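The five steps can be sketched in code as follows. The two quiz score sets here are hypothetical stand-ins, not the data from Table B.3; they were chosen to have the same mean but different spreads.

```python
def variance(scores):
    """Average of the squared deviations from the mean (divides by N)."""
    n = len(scores)
    mean = sum(scores) / n                       # Step 1: group mean
    deviations = [x - mean for x in scores]      # Step 2: deviation scores
    assert abs(sum(deviations)) < 1e-9           # Check: deviations sum to zero
    squared = [d ** 2 for d in deviations]       # Step 3: square each deviation
    sum_of_squares = sum(squared)                # Step 4: sum the squares
    return sum_of_squares / n                    # Step 5: divide by N

# Hypothetical 10-point quiz scores (NOT the data from Table B.3).
quiz_a = [1, 3, 5, 7, 9]   # widely spread around a mean of 5
quiz_b = [3, 4, 5, 6, 7]   # tightly clustered around a mean of 5

print(variance(quiz_a))  # 8.0
print(variance(quiz_b))  # 2.0
```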

Standard Deviation

The standard deviation, or S, is the square root of the variance. The standard deviation of Quiz A would be

S = 3.19.

The standard deviation of Quiz B would be

S = 1.414.

Why not simply use the variance? One reason is that, unlike the variance, the standard deviation is in the same units as the raw scores. This makes the standard deviation more meaningful. Thus, it would make more sense to discuss the variability of a set of IQ scores in IQ points than in squared IQ points. The standard deviation is used in the calculation of many other statistics.
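As a small sketch of the relationship, the standard deviation can be computed by taking the square root of the variance. The scores below are hypothetical. Note that `statistics.pvariance` and `statistics.pstdev` divide by N, as the text does, whereas `statistics.variance` and `statistics.stdev` divide by N - 1.

```python
import math
from statistics import pstdev, pvariance

# Hypothetical quiz scores (not the data from Table B.3).
scores = [3, 4, 5, 6, 7]

var = pvariance(scores)   # average squared deviation, in squared points
sd = math.sqrt(var)       # back in the original units (points)

print(var)           # 2
print(round(sd, 3))  # 1.414
```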

Learning Check #10: The exam scores for two sections of introductory sociology are listed below. Compute the standard deviation for each section. Section #1: 42, 45, 56, 56, 60, 62, 67, 68, 70, 71. Section #2: 57, 57, 57, 70, 75, 77, 79, 83, 83, 92.

Learning Check #11: Suppose that there were two groups that discussed issues related to abortion. Each member of each group rated on a scale of 1 to 10 their opinion regarding abortion (1 = Totally against abortion; 5 = Neutral; 10 = Totally in favor of abortion). The mean for Group A was found to be 5 with a standard deviation of .02. For Group B the mean was also 5, but the standard deviation was 3.42. Which group would have the more lively debates?

The Normal Curve and Percentiles

As illustrated in Figure B.5, the normal curve is a bell-shaped graph that represents a hypothetical frequency distribution in which the frequency of scores is greatest near the mean and progressively decreases toward the extremes. In essence, the normal curve is a smooth frequency polygon based on an infinite number of scores. The mean, median, and mode of a normal curve are the same. Many variable human characteristics, such as height, weight, and intelligence, fall on a normal curve.

One useful characteristic of a normal curve is that certain percentages of scores fall at certain distances (measured in standard deviation units) from its mean. A special statistical table makes it a simple matter to determine the percentage of scores that fall above or below a particular score or between two scores on the curve. For example, about 68 percent of scores fall between plus and minus one standard deviation from the mean; about 95 percent fall between plus and minus two standard deviations from the mean; and more than 99 percent (about 99.7 percent) fall between plus and minus three standard deviations from the mean.

For example, consider an aptitude test, with a mean of 100 and a standard deviation of 15. What percentage of people score above 115? Because aptitude scores fall on a normal curve, about 34 percent of the scores fall between the mean and one standard deviation (in this case 15 points) above the mean. We also know that for a normal distribution 50 percent of the scores fall above the mean and 50 percent fall below the mean. Thus, about 84 percent (50 percent below the mean and 34 percent between the mean and a score of 115) of the scores fall below 115. If 84 percent fall below 115, then 16 percent (100 percent minus 84 percent) must fall above a score of 115.
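In place of a printed statistical table, the same percentages can be looked up with `statistics.NormalDist` from Python's standard library (available in Python 3.8 and later). This is only a sketch of the aptitude test example above, with a mean of 100 and a standard deviation of 15.

```python
from statistics import NormalDist

aptitude = NormalDist(mu=100, sigma=15)

# cdf(x) is the proportion of scores at or below x.
below_115 = aptitude.cdf(115)
above_115 = 1 - below_115
print(f"{below_115:.0%} below 115, {above_115:.0%} above 115")  # 84% and 16%

# Proportion of scores within 1, 2, and 3 standard deviations of the mean.
within = {}
for k in (1, 2, 3):
    within[k] = aptitude.cdf(100 + k * 15) - aptitude.cdf(100 - k * 15)
    print(f"within {k} SD: {within[k]:.1%}")
```

The value returned by `cdf`, expressed as a percentage, is also the percentile rank of the score.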

Learning Check #12: An introductory sociology teacher who has taught for years has developed a comprehensive final exam that is normally distributed with a mean of 200 points and a standard deviation of 25 points. (a) What percentage of the students score above 200 points? (b) What percentage of the students score below 175 points? (c) What percentage of the students score more than 250 points?

Scores along the abscissa of the normal curve also represent percentiles--the scores at or below which particular percentages of scores fall. Percentiles are frequently used, as they give us a quick idea of how a score compares with the rest of the data set. If a score is equal to the 10th percentile, then you know that 10 percent of the scores fell at or below that value and 90 percent of the scores were above that value. In the aptitude test example above, a score of 115 would have a percentile rank of 84.

Learning Check #13: What are the percentile ranks for the three scores listed in Learning Check #12: 200, 175, and 250?

Learning Check #14: Suppose you take your daughter Emily to the doctor’s office for a well-check and find out that she is in the 5th percentile for height and 7th percentile for weight. What do you now know about Emily, as compared with other children her age?

### Correlational Statistics

So far, you have been reading about statistics that describe sets of data. In many research studies, sociologists might want to know the extent to which two variables are related. Correlational statistics do just that. Correlational statistics yield a number called the coefficient of correlation. The size of the coefficient may vary from 0.00 to 1.00, and its sign may be either positive or negative, so the coefficient itself ranges from -1.00 to +1.00. In a positive correlation, scores on two different variables increase and decrease together. For example, there is a positive correlation between high school average and freshman grade point average in college. In a negative correlation, as scores for one variable decrease, they increase for the other variable. For example, there is a negative correlation between absenteeism and course performance. The strength of a correlation depends on its size, not its sign. For example, a correlation of -.72 is stronger than a correlation of +.53.

Correlational statistics are important because they permit us to determine the strength and direction of the relationship between different sets of data or to predict scores on one distribution based on our knowledge of scores on another. If the correlation between two sets of data were a perfect 1.00, we could predict one score from another with complete accuracy. But because correlations are almost always less than perfect, we predict one score from another only with a particular probability of being correct--the higher the correlation, the higher the probability.

It cannot be stressed strongly enough that correlation does not mean causation. For example, years ago, authorities presumed that autism, which is marked by poor social and communication skills, was caused by "refrigerator mothers." Mothers of autistic children were aloof from them. This was taken as a sign that the children suffered from mothers who were emotionally cold. Knowing that this is simply a correlation, you might wonder whether causality was in the opposite direction. Perhaps autistic children, who do not respond to their mothers, cause their mothers to become aloof from them. Moreover, why would a mother have several normal children, then an autistic child, and then several more normal ones? It would be difficult to believe she was a warm parent to all but one. Today, evidence indicates that autism is a neurological problem that has nothing to do with the mother's emotionality.

As another example, although there is a positive correlation between smoking and cancer in human beings, this correlation by itself is not scientifically acceptable evidence that smoking causes cancer. Perhaps another factor (such as a level of stress tolerance) might make someone prone to both smoking and cancer, without smoking's necessarily causing cancer. Of course, correlation does not imply the absence of causation. For example, there may indeed be a causal relationship between smoking and cancer. The point is that if two variables are strongly correlated, one of the variables may cause the other, or there may not be a causal link: we just cannot tell for sure based on a correlation coefficient. But remember that knowing that two variables are related is still an important piece of information.

Learning Check #15: Many studies have determined that there is a positive correlation between viewing violence on television and violent behavioral patterns. What does this mean?

Learning Check #16: Given that there is a positive correlation between viewing violence on television and violent behavior, can we conclude from these data that watching the violence on television causes children to behave violently?

Learning Check #17: Researchers used to believe that there was a negative correlation between age and IQ. Recently, this correlation has turned out to be much weaker than we originally thought. Describe what is meant by a negative correlation between age and IQ.

Scatter Plots

Correlational data are graphed using a scatter plot, also known as a scattergram or scatter diagram. In a scatter plot, one variable is plotted on the abscissa and the other on the ordinate. Each participant's scores on both variables are indicated by a dot placed at the junction between those scores on the graph. This produces one dot for each participant. The pattern of the dots gives a rough impression of the size and direction of the correlation. In fact, a line drawn through the dots, or line of best fit, helps estimate this. The closer the dots lie to a straight line, the stronger the correlation. Figure B.6 illustrates several kinds of correlation.

Pearson's Product-Moment Correlation

The most commonly used coefficient of correlation is Pearson's product-moment correlation (Pearson's r), named for the English statistician Karl Pearson. One formula for calculating it is presented in Figure B.7. The example assesses the relationship between home runs and stolen bases by five baseball players during one month of a season. Recall that correlation coefficients range in size from 0 to 1.00 and can be either negative or positive. The correlation in this example, -.23, is considered to be a weak, negative correlation.
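One way to compute Pearson's r is sketched below, using the deviation-score form of the formula (which may differ in layout from the formula in Figure B.7, though the result is the same). The home run and stolen base figures are hypothetical, not the data from the figure, so the result differs from the -.23 reported above.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: co-variation of x and y scaled by their variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviation scores.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical monthly totals for five players (NOT the data in Figure B.7).
home_runs = [2, 4, 5, 7, 9]
stolen_bases = [6, 5, 7, 3, 4]

r = pearson_r(home_runs, stolen_bases)
print(round(r, 2))  # about -0.64: a negative correlation in this made-up data
```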

Learning Check #18: In a large study of twins, the Minnesota Twin study found a correlation of +.71 between the IQ scores of identical twins. Another study found that family income is correlated +.30 with the IQ of children. What do these correlation coefficients mean?

Coefficient of Determination

One last number that can be helpful in understanding the relationship between two variables is the coefficient of determination. The coefficient of determination is the amount of variability that can be accounted for in one variable by knowing a second variable. Think for a moment of all the things that can have an impact on an exam score: amount of time spent studying, how you feel the day of the exam, amount of sleep the previous night, whether you were sick or felt well, as well as a host of other factors. This means that the variability in your exam scores (your scores are usually not all exactly the same) is due to many factors. A certain amount of the variability may be due to the number of hours you studied for the exam. Suppose that you compute the Pearson correlation between the number of hours you spent studying for the exam and the score on the exam and find a correlation of +.70. To get the coefficient of determination you simply square the Pearson correlation, which in this case is the square of .70, or .49. If you multiply this result by 100 percent, you end up with 49 percent. This indicates that 49 percent of the variability in exam scores can be accounted for by the amount of time spent studying; the remaining 51 percent is associated with other factors.
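The calculation for the studying example is short enough to sketch directly, using the correlation of +.70 from the paragraph above:

```python
r = 0.70  # Pearson correlation between hours studied and exam score

r_squared = r ** 2            # coefficient of determination
unexplained = 1 - r_squared   # variability associated with other factors

print(f"{r_squared:.2f}")     # 0.49
print(f"{r_squared:.0%} accounted for, {unexplained:.0%} not accounted for")
```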

Learning Check #19: Given the correlation coefficients in Learning Check #18 of +.71 and +.30, explain what you can determine with respect to the coefficient of determination.

Learning Check #20: Suppose that the correlation coefficient between two variables is -.80. Would this lead to a different conclusion based on the coefficient of determination than a correlation of +.80?

### Inferential Statistics

Inferential statistics help us determine whether the difference we find between our experimental and control groups is caused by the manipulation of the independent variable or by chance variation in the performances of the groups. If the difference has a low probability of being caused by chance variation, we can feel confident in the inferences we make from our samples to the populations they represent.

Hypothesis Testing

In experimental research, sociologists use inferential statistics to test the null hypothesis. The null hypothesis states that the independent variable has no effect on the dependent variable. Consider an experimental study of the effect of overlearning on examination performance in college students. When we use overlearning, we study material until we know it perfectly, and then continue to study it some more. At the beginning of the experiment, the participants would be selected from the same population (college students) and randomly assigned to either the experimental group (overlearning) or the control group (normal studying). Thus, the independent variable would be the method of studying (overlearning versus normal studying). The dependent variable might be the score on a 100-point exam on the material studied.

Learning Check #21: Identify the independent variable, dependent variable, and the null hypothesis from the following scenario:

A researcher would like to know if highlighting a textbook helps students to score better on the exams. She randomly selects one-half of the students in an introductory class and instructs them to highlight their textbooks as they read. The other students are instructed to do NO highlighting as they read.

If the experimental manipulation has no effect, the experimental and control groups would not differ significantly in their performance on the exam. In that case, we would fail to reject the null hypothesis. If the experimental manipulation has an effect, the two groups would differ significantly in their performance on the exam. In that case, we would reject the null hypothesis. This would indirectly support the research hypothesis, which would predict that overlearning improves exam performance. But how large must a difference be between groups for it to be significant? To determine whether the difference between groups is large enough to minimize chance variation as an alternative explanation of the results, we must determine the statistical significance of the difference between them.

Statistical Significance

The characteristics of samples drawn from the population they represent will almost always vary somewhat from those of the true population. This is known as sampling error. Thus, a sample of five students taken from your sociology class (the population) would vary somewhat from the class means in age, height, weight, intelligence, grade point average, and other characteristics.

If we repeatedly took random samples of five students, we would continue to find that they differ from the population. But what of the difference between the means of two samples, presumably representing different populations, such as a population of students who practice overlearning and a population of students who practice normal study habits? How large would the differences have to be before we attributed them to the independent variable rather than to chance? In this example, how much difference in the performance of the experimental group and the control group would be needed before we could confidently attribute the difference to the practice of overlearning?

The larger the difference between the means of two samples, the less likely it would be attributable to chance. Sociologists typically accept a difference between sample means as statistically significant if it has a probability of less than 5 percent of occurring by chance. This is known as the .05 level of statistical significance. In regard to the example, if the difference between the experimental group and the control group has less than a 5 percent probability of occurring by chance, we would reject the null hypothesis. Our research hypothesis would be supported: overlearning is effective. Scientists who wish to use a stricter standard employ the .01 level of statistical significance. This means that a difference would be statistically significant if it had a probability of less than 1 percent of being obtained by chance alone.

The difference between the means of two groups will more likely be statistically significant under the following conditions:

1. When the samples are large.
2. When the difference between the means is large.
3. When the variability within the groups is small.
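The three conditions can be illustrated with a test statistic. The text does not name a specific test, so the sketch below swaps in Welch's t statistic as one common choice: the difference between the two means divided by its standard error. The exam scores are hypothetical, and the larger samples are simulated simply by repeating the same scores. A larger t means the difference is less likely to be due to chance; converting t to an exact probability requires the t distribution, which is not shown here.

```python
import math
from statistics import mean, variance

def t_statistic(group_a, group_b):
    """Welch's t: difference between means divided by its standard error."""
    standard_error = math.sqrt(variance(group_a) / len(group_a)
                               + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

def shrink(group):
    """Halve each score's distance from its group mean (less variability)."""
    m = mean(group)
    return [m + (x - m) / 2 for x in group]

# Hypothetical exam scores for the two study methods.
overlearning = [82, 85, 88, 90, 95]
normal_study = [75, 80, 83, 85, 87]

base = t_statistic(overlearning, normal_study)

# 1. Larger samples (the same scores repeated four times).
larger_samples = t_statistic(overlearning * 4, normal_study * 4)
# 2. Larger difference between the means (5 points added to one group).
larger_difference = t_statistic([x + 5 for x in overlearning], normal_study)
# 3. Smaller variability within each group.
less_variability = t_statistic(shrink(overlearning), shrink(normal_study))

for label, t in [("baseline", base),
                 ("larger samples", larger_samples),
                 ("larger difference", larger_difference),
                 ("less variability", less_variability)]:
    print(f"{label}: t = {t:.2f}")
```

Each of the three changes produces a larger t than the baseline, which matches the three conditions listed above.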

Note that statistical significance is a statement of probability. We can never be certain that what is true of our samples is true of the population they represent. This is one of the reasons why all scientific findings are tentative. Moreover, statistical significance does not indicate practical significance. A statistically significant effect may be too small or be produced at too great a cost of time or money to be useful. What if those who practice overlearning must study two extra hours each day to improve their exam performance by a statistically significant, yet relatively small, 3 points? Knowing this, students might choose to spend their time in another way. As the American statesman Henry Clay (1777-1852) noted, in determining the importance of research findings, by themselves "statistics are no substitute for judgment."

Learning Check #22: Suppose that the researcher in Learning Check #21 rejected the null hypothesis and concluded that there was a significant difference due to highlighting. What would this mean in terms of probability?

Learning Check #23: Can research demonstrate statistical significance, yet have no real practical value?

Learning Check #24: Could two groups have a difference that looked important, yet not be statistically different from one another? That is, could the difference between two groups appear to have practical value, yet not achieve statistical significance?

### Summary

Representation of Data

Data are often represented in frequency distributions, which indicate the frequency of each score in a set of scores. Sociologists also use graphs to represent data. These include pie graphs, frequency histograms, frequency polygons, and line graphs. Line graphs are important in representing the results of experiments, because they are used to illustrate the relationship between independent and dependent variables.

Descriptive Statistics

Descriptive statistics summarize and organize research data. Measures of central tendency represent the typical score in a set of scores. The mode is the most frequently occurring score, the median is the middle score, and the mean is the arithmetic average of the set of scores. Measures of variability represent the degree of dispersion of scores. The range is the difference between the highest and lowest scores. The variance is the average of the squared deviations from the mean of the set of scores. And the standard deviation is the square root of the variance.

Many kinds of measurements fall on a normal, or bell-shaped, curve. A certain percentage of scores fall below each point on the abscissa of the normal curve. Percentiles identify the percentage of scores that fall below a particular score.

Correlational Statistics

Correlational statistics assess the relationship between two or more sets of scores. A correlation may be positive or negative and vary from 0.00 to plus or minus 1.00. The existence of a correlation does not necessarily mean that one of the correlated variables causes changes in the other. Nor does the existence of a correlation preclude that possibility. Correlations are commonly graphed on scatter plots. Perhaps the most common correlational technique is Pearson's product-moment correlation. You square Pearson's product-moment correlation to get the coefficient of determination, which will indicate the amount of variance in one variable accounted for by another variable.

Inferential Statistics

Inferential statistics permit social researchers to determine whether their findings can be generalized from their samples to the populations they represent. Consider a simple investigation in which an experimental group that is exposed to a condition is compared with a control group that is not. For the difference between the means of the two groups to be statistically significant, the difference must have a low probability (usually less than 5 percent) of occurring by normal random variation.