Correlation
The tendency for two measures or variables, such as height and weight, to vary together or be related for individuals in a group. If, as in the case of height and weight, people who are high on one variable (tall) tend to be high on the other (heavy), the correlation is said to be positive. Months of practice and golf scores, by contrast, would be negatively correlated: ordinarily, the higher the first variable (more practice), the lower the second (a golfer's score falls as the golfer improves). Correlation implies only association, not causation.
Correlation coefficient
The customary index for expressing the degree of relationship observed between two sets of measures for the same group. The coefficient can range from -1.00, showing perfect negative correlation, through zero, indicating no correlation, to +1.00, showing perfect positive correlation. If, for example, the correlation coefficient between height and weight for a group of men were +1.00, knowing a man's height would permit you to predict his weight without error. But if the correlation coefficient between height and weight were zero, you could not predict a man's weight by knowing his height any more accurately than by not knowing his height.
Most correlation coefficients between test scores and measures of academic success fall somewhere between zero and +1.00. With a coefficient in this range, knowledge of an individual's score on one variable enables you to predict that individual's standing on the other variable with greater accuracy than if the correlation were zero. The higher the coefficient, the lower the likelihood of error in prediction.
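As a rough illustration of how such a coefficient can be computed, the sketch below implements the Pearson product-moment coefficient, the most common form of correlation coefficient, in plain Python. The function name `pearson_r` and the height and weight figures are hypothetical and chosen only for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no spread")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights (inches) and weights (pounds) for five men
heights = [66, 68, 70, 72, 74]
weights = [140, 155, 160, 180, 185]
r = pearson_r(heights, weights)
print(f"r = {r:.2f}")  # strongly positive, but short of +1.00
```

For these made-up figures the coefficient is high but not perfect, so height would predict weight well but not without error.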
Equating
A statistical procedure that puts the raw scores of newly introduced forms of a test on a continuing scale and compensates for variations in difficulty among various forms of the test. Equating often involves comparison of the performance of an old and a new group of candidates on the same test material. Sometimes a random sample of new test-takers takes an entire old form of the test; or a representative sample of material from an old form may be added to a new form.
Frequency distribution
A tabulation of scores from high to low, or low to high, showing the number of individuals who obtain each score or whose scores fall in each score interval. Frequency distributions are used to determine tables of percentile ranks.
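A tabulation of this kind can be sketched in a few lines of Python. The helper name `frequency_distribution`, the interval width of 5, and the scores are assumptions made for illustration only.

```python
from collections import Counter

def frequency_distribution(scores, interval=1):
    """Return (interval lower bound, count) pairs, ordered from high to low."""
    # Map each score to the lower bound of the interval it falls in
    counts = Counter((s // interval) * interval for s in scores)
    return sorted(counts.items(), reverse=True)

# Hypothetical scores for ten test-takers
scores = [52, 47, 58, 61, 47, 55, 63, 58, 49, 52]
interval = 5
table = frequency_distribution(scores, interval)
for lower, count in table:
    print(f"{lower}-{lower + interval - 1}: {count}")
```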
Grade point average (GPA) or ratio
A system used by many schools for evaluating the overall scholastic performance of students. Grade points for each course are found by multiplying the number of hours given for the course by the numerical value of the grade; the GPA is the sum of all grade points divided by the total number of hours carried. The most common system of numerical values for grades is A = 4, B = 3, C = 2, D = 1, and E or F = 0. Also called a quality point average or ratio.
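The calculation can be sketched as follows, using the common A = 4 through E or F = 0 values given above. The function name `grade_point_average` and the sample transcript are hypothetical.

```python
# Numerical grade values from the most common system
GRADE_VALUES = {"A": 4, "B": 3, "C": 2, "D": 1, "E": 0, "F": 0}

def grade_point_average(courses):
    """courses: list of (credit_hours, letter_grade) pairs."""
    total_hours = sum(hours for hours, _ in courses)
    if total_hours == 0:
        raise ValueError("no credit hours carried")
    # Grade points per course = hours x numerical value of the grade
    grade_points = sum(hours * GRADE_VALUES[grade] for hours, grade in courses)
    return grade_points / total_hours

# Hypothetical transcript: 38 grade points over 12 hours
transcript = [(3, "A"), (4, "B"), (3, "C"), (2, "A")]
gpa = grade_point_average(transcript)
print(round(gpa, 2))
```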
Mean or arithmetic mean
The sum of a set of scores divided by the number of scores; commonly called the average.
Median
The score below which 50 percent of the cases in a score distribution fall. If the distribution of scores is distorted by the presence of a few aberrant cases of little importance, the median may be a better summary description of the group than the mean. If the distribution is symmetric, the median and mean will be almost identical. The median is also by definition the 50th percentile.
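The contrast between the mean and the median can be seen in a small example using Python's standard `statistics` module. The score lists are invented for illustration: the first is symmetric, and the second adds one aberrant low score.

```python
import statistics

# Hypothetical symmetric distribution of five scores
symmetric = [480, 500, 510, 520, 540]
# The same scores plus one aberrant low case
skewed = symmetric + [200]

mean_sym = statistics.mean(symmetric)
median_sym = statistics.median(symmetric)
mean_skew = statistics.mean(skewed)
median_skew = statistics.median(skewed)

print(mean_sym, median_sym)               # identical for the symmetric list
print(round(mean_skew, 1), median_skew)   # the mean drops sharply; the median barely moves
```

A single aberrant case pulls the mean well below most of the scores, while the median stays close to the center of the group.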
Norms
A statistical description of the test performance of a well-defined group that serves as a reference by which to gauge the performance of the other individuals who take the test. Most norms tables show, in descending order, various test scores and the percentage of people in the reference group who scored below each score level. Thus, knowing an individual's score, you can quickly determine how he or she compares with the reference group.
Obtained score
The score actually achieved by a person taking a test.
Percentile rank
The percentage of scores in a distribution that are lower than a particular obtained score. The remaining scores are at the same level or higher.
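Under this definition, which counts only scores strictly lower than the obtained score, a percentile rank can be computed as in the sketch below. The function name `percentile_rank` and the distribution are hypothetical.

```python
def percentile_rank(scores, obtained):
    """Percentage of scores in the distribution that are lower than `obtained`."""
    if not scores:
        raise ValueError("empty distribution")
    # Scores equal to the obtained score are not counted as lower
    below = sum(1 for s in scores if s < obtained)
    return 100 * below / len(scores)

# Hypothetical distribution of ten scores
distribution = [410, 450, 450, 500, 520, 560, 600, 640, 700, 750]
rank = percentile_rank(distribution, 560)
print(rank)
```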
Proficiency scale ratings
A proficiency scale represents hierarchical descriptions of performance in specific linguistic skills (i.e., speaking, listening, reading, and writing). The levels range from the ability to perform the simplest to the most advanced tasks. Each description is a representative sample of a particular range of ability, and each level subsumes all previous levels. The scale allows assessment of what an individual can and cannot do, regardless of how the language has been learned or acquired. Each rating describes the level of competence at which an individual is able to use the language for both basic communicative tasks and academic purposes.
PSAT/NMSQT® (Preliminary SAT®/National Merit Scholarship Qualifying Test)
A shorter version of the SAT, with an additional writing skills section as well as a diagnostic component providing skills feedback. Administered by high schools to sophomores and juniors each year in October, the PSAT/NMSQT aids high schools in the early guidance of students planning for college and serves as the qualifying test for scholarships awarded by the National Merit Scholarship Corporation.
Raw score
The number of correct responses minus a fraction of the incorrect responses. The raw score is converted to a scaled score for reporting.
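A minimal sketch of this calculation follows. The definition above does not state the size of the fraction, so the default `penalty` of 1/4 is an assumption used for illustration, as are the function name `raw_score` and the treatment of omitted questions (they neither add to nor subtract from the score in this sketch).

```python
from fractions import Fraction

def raw_score(num_correct, num_incorrect, penalty=Fraction(1, 4)):
    """Correct responses minus a fraction of the incorrect responses.

    `penalty` is an assumed value; omitted questions are simply not counted.
    """
    return num_correct - penalty * num_incorrect

# Hypothetical answer sheet: 40 correct, 12 incorrect
score = raw_score(40, 12)
print(score)
```

Using `Fraction` keeps fractional raw scores exact until a rounding rule, which is not specified here, is applied.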
Recentering
A one-time statistical adjustment that restored the distribution of scores to the center of the College Board 200-to-800 scales in 1995. This process set both the verbal and mathematical means for the SAT Reasoning Test™ at 500 (with a standard deviation of 110) for the 1990 reference group.
Reliability
The extent to which a test measures consistently.
Restriction of range
A statistical procedure (e.g., the Pearson-Lawley multivariate correction) used to estimate what score ranges, and the statistics that depend on them, would be for the total population by adjusting for the narrower score range observed in a subset of the population (e.g., the admitted freshman class).
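The multivariate Pearson-Lawley correction is too involved for a short example, so the sketch below substitutes the simpler univariate correction (often called Thorndike's Case 2), which adjusts a correlation observed in a restricted group using the ratio of the total-group standard deviation to the restricted-group standard deviation. The function name and all numbers are hypothetical.

```python
import math

def correct_for_range_restriction(r_restricted, sd_restricted, sd_total):
    """Univariate (Thorndike Case 2) correction, not the Pearson-Lawley method.

    r_restricted  : correlation observed in the selected subset
    sd_restricted : predictor standard deviation in the subset
    sd_total      : predictor standard deviation in the total population
    """
    if sd_restricted <= 0 or sd_total <= 0:
        raise ValueError("standard deviations must be positive")
    k = sd_total / sd_restricted
    return (r_restricted * k) / math.sqrt(1 - r_restricted ** 2 + (r_restricted * k) ** 2)

# Hypothetical values: r = 0.30 among admitted students, whose scores
# have SD 70, against SD 110 in the full applicant group
estimated_r = correct_for_range_restriction(0.30, 70, 110)
print(round(estimated_r, 2))
```

When the two standard deviations are equal, the correction leaves the coefficient unchanged.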
SAT
The test of developed language skills and mathematical reasoning abilities (formerly called SAT I), given on specified dates throughout the year at test centers in the United States and other countries. The SAT is required by many colleges and sponsors of financial aid programs.
SAT Question-and-Answer Service
A service of the College Board that provides students with a copy of their SAT test, their answers and the correct answers, scoring instructions, and information about the questions. The service is available only for certain test dates.
SAT Subject Tests™
Tests in specific subjects (formerly called the SAT II), given at test centers in the United States and other countries on specified dates throughout the year. The tests are used by colleges not only to help with decisions about admissions but also to assist in course placement and exemption of enrolled first-year students. They include the English Language Proficiency Test™ (ELPT™).
Scaling
A means of defining a system for transforming raw scores to reported (scaled) scores for a test or testing program.
Services for Students with Disabilities (SSD)
A College Board service that assists students by providing services and reasonable accommodations appropriate to the student's disability and the purpose of the exam the student is taking. SSD provides Advanced Placement Program® (AP®), PSAT/NMSQT, and SAT testing accommodations for students who have documented disabilities.
Standard deviation
A measure of the spread or extent of variability of a set of scores around their mean. The standard deviation reflects the degree of homogeneity of the group with respect to the variable in question. That is, the less the dispersion of scores, the smaller will be the standard deviation.
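The link between dispersion and the standard deviation can be illustrated with two invented groups that share the same mean. The sketch uses `statistics.pstdev`, which divides by the number of scores; `statistics.stdev` would divide by one less and is the usual choice for a sample.

```python
import statistics

# Two hypothetical groups with the same mean (500) but different spread
homogeneous = [490, 500, 500, 510, 500]
spread_out = [300, 400, 500, 600, 700]

sd_homogeneous = statistics.pstdev(homogeneous)
sd_spread_out = statistics.pstdev(spread_out)

print(round(sd_homogeneous, 1), round(sd_spread_out, 1))
```

The tightly clustered group yields a much smaller standard deviation than the widely dispersed one.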
Standard error of the difference
An indication of the extent to which the difference between the scores of two people on the same test or the scores of one person on two different tests may represent error due to the unreliability of the test. The user can be reasonably confident that the higher score represents greater ability or achievement as measured by the test if the difference between two scores exceeds 1.5 times the standard error of the difference for the test.
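The 1.5 rule of thumb can be sketched as below. The definition above gives no formula, so the sketch assumes the standard classical test theory expressions: a standard error of measurement equal to the standard deviation times the square root of one minus the reliability, and a standard error of the difference equal to the square root of the sum of the two squared standard errors of measurement. The function names, the standard deviation of 100, and the reliability of 0.91 are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Assumed classical formula: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

def standard_error_of_difference(sem_a, sem_b):
    """Assumed classical formula: sqrt(SEM_a^2 + SEM_b^2)."""
    return math.sqrt(sem_a ** 2 + sem_b ** 2)

def difference_is_meaningful(score_a, score_b, sed, multiple=1.5):
    """True if the score difference exceeds `multiple` times the SED."""
    return abs(score_a - score_b) > multiple * sed

# Hypothetical test: SD 100, reliability 0.91, two people on the same test
sem = standard_error_of_measurement(100, 0.91)
sed = standard_error_of_difference(sem, sem)

print(round(sed, 1))
print(difference_is_meaningful(550, 500, sed))  # 50-point gap
print(difference_is_meaningful(580, 500, sed))  # 80-point gap
```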
Validity
An indication of the extent to which a test or other measure does the job for which it was intended. There are several kinds of validity. Predictive validity, for instance, is the extent to which test scores are able to predict a criterion variable such as grades or faculty ratings. Validity is expressed as a correlation coefficient between the predictor variable, such as test scores, and the criterion variable. Validity coefficients, like all correlation coefficients, are heavily influenced by the extent to which the individuals studied are spread out on the predictor measure and on the criterion measure. In practice, the range of scores for admitted students is almost always smaller than that for the total applicant group, so a validity coefficient computed only for admitted students tends to understate the relationship in the full applicant group.
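The effect of a narrowed score range on a validity coefficient can be seen in a small simulation. Everything in the sketch is an assumption made for illustration: the simulated applicants, the built-in correlation of about 0.6 between test score and grade, and the rule that admits the top 30 percent by test score.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated applicants: standardized test score and later grade,
# constructed so that the population correlation is about 0.6
applicants = []
for _ in range(5000):
    test_score = rng.gauss(0, 1)
    grade = 0.6 * test_score + 0.8 * rng.gauss(0, 1)
    applicants.append((test_score, grade))

# Hypothetical admission rule: keep the top 30 percent by test score
applicants.sort(key=lambda pair: pair[0], reverse=True)
admitted = applicants[:1500]

r_all = pearson_r(*zip(*applicants))
r_admitted = pearson_r(*zip(*admitted))
print(f"all applicants: {r_all:.2f}  admitted only: {r_admitted:.2f}")
```

The coefficient for the admitted group comes out noticeably lower than for all applicants, even though the underlying relationship between score and grade is the same for everyone.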