| bar graph | A graph on which data from groups of subjects are represented by bars of differing heights tied to the value of the dependent variable for the group. |
| bivariate linear regression | A statistical technique for fitting a straight line to a set of data points representing the paired values of two variables. |
| boxplot | A graphical display of the values of the five-number summary of a distribution. |
| coefficient of nondetermination | Statistic indicating the proportion of variance in one variable not accounted for by variation in a second variable. |
| correlation matrix | A matrix giving the set of all possible bivariate correlations among three or more variables. |
| descriptive statistics | Statistics that allow you to summarize the properties of an entire distribution of scores with just a few numbers. |
| dummy code | In a data file, numbers used to stand for category values; for example, 0 = male, 1 = female. |
| exploratory data analysis (EDA) | Examining data for potentially important patterns and relationships, especially through the use of simple graphical techniques and numerical summaries. |
| five-number summary | A set of five numbers used to summarize the characteristics of a distribution: the minimum, first quartile, median, third quartile, and maximum. |
| frequency distribution | A graph or table displaying a set of values or range of values of a variable, together with the frequency of each. |
| histogram | A graph depicting a frequency distribution in which the frequencies of class intervals are represented by adjacent bars along the scale of measurement. |
| interquartile range | A measure of spread in which an ordered distribution of scores is divided into four groups. The score separating the lower 25% (the first quartile) is subtracted from the score separating the upper 25% (the third quartile). Dividing the resulting difference by 2 gives the semi-interquartile range. |
| least-squares regression line | Straight line, fit to data, that minimizes the sum of the squared distances between each data point and the line. |
| line graph | A graph on which data relating the variables are plotted as points connected by lines. |
| linear regression | Statistical technique used to determine the straight line that best fits a set of data. |
| mean | The arithmetic average of the scores in a distribution. The most frequently reported measure of center. |
| measure of center | A single score, computed from a data set, that represents the general magnitude of the scores in the distribution. |
| measure of spread | A single score, computed from a data set, that represents the amount of variability of the scores in the distribution (i.e., how spread out they are). |
| median | The middle score in an ordered distribution. |
| mode | The most frequent score in a distribution. The least informative measure of center. |
| normal distribution | A specific type of frequency distribution in which most scores fall around the middle category. Scores become less frequent as you move away from the middle category in either direction. Also referred to as a bell-shaped curve. |
| outliers | Values of a variable in a set of data that lie far from the other values. |
| Pearson product-moment correlation (Pearson r) | The most popular measure of correlation. Indicates the magnitude and direction of a correlational relationship between variables. |
| phi (φ) coefficient | Measure of correlation used when both variables are measured on a dichotomous scale. |
| pie graph | Type of graph in which a circle is divided into segments. Each segment represents the proportion or percentage of responses falling in a given category of the dependent variable. |
| point-biserial correlation | A variation of the Pearson correlation used when one variable is measured on a dichotomous scale and the other on an interval or ratio scale. |
| range | The least informative measure of spread; the difference between the lowest and highest scores in a distribution. |
| regression weight | Value computed in a linear regression analysis that provides the slope of the least-squares regression line. See also beta weight. |
| resistant measure | A statistic that is not strongly affected by the presence of outliers or skewness in the data. |
| scatter plot | A plot used to display correlational data from two measures. Each point represents the two scores provided by each subject, one for each measure, plotted against one another. |
| skewed distribution | A frequency distribution in which most scores fall into categories above or below the middle category. |
| Spearman rank order correlation (rho) | A measure of correlation used when variables are measured on at least an ordinal scale. |
| standard deviation | The most frequently reported measure of spread. The square root of the variance. |
| standard error of estimate | A measure of the accuracy of prediction in a linear regression analysis. It is a measure of the distance between the observed data points and the least-squares regression line. |
| stemplot | A graphical display of a distribution of scores consisting of a column of values (the stems) representing the leftmost digit or digits of the scores and, aligned with each stem, a row of values representing the rightmost digit of each score having that particular stem value. |
| variance | A measure of spread. The average squared deviation from the mean. |
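
Several of the terms defined above are simple computations, so a short sketch may help tie them together. The Python below (3.8 or later, standard library only) is an illustration under stated assumptions, not a prescribed method: the data sets, variable names, and the helper functions `five_number_summary` and `pearson_r` are hypothetical. The quartiles use the "inclusive" interpolation rule, which is one of several conventions, and the variance divides by N (the population formula) to match the "average squared deviation" definition given here.

```python
import math
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of scores."""
    ordered = sorted(data)
    # quantiles(n=4) returns the three quartile cut points. The "inclusive"
    # method is an assumption; other quartile conventions give other values.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    # Sum of cross-products and sums of squared deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores, made up for this illustration.
scores = [2, 4, 4, 5, 7, 9, 10, 12]

minimum, q1, median, q3, maximum = five_number_summary(scores)
score_range = maximum - minimum          # range: highest minus lowest score
iqr = q3 - q1                            # interquartile range
variance = statistics.pvariance(scores)  # average squared deviation (divides by N)
std_dev = math.sqrt(variance)            # standard deviation

# Hypothetical paired measures for a correlation example.
hours = [1, 2, 3, 4, 5]
grades = [52, 60, 57, 71, 70]
r = pearson_r(hours, grades)
nondetermination = 1 - r ** 2            # coefficient of nondetermination

print("five-number summary:", (minimum, q1, median, q3, maximum))
print("range:", score_range, "IQR:", iqr)
print("variance:", variance, "standard deviation:", std_dev)
print("Pearson r:", r, "coefficient of nondetermination:", nondetermination)
```

Reports based on samples usually divide by N − 1 instead; `statistics.variance` and `statistics.stdev` follow that convention.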