This Document Contains Chapters 1 to 4

Chapter 1

1. Who is thought to have spread the testing movement in the United States?
a. Clark Wissler
b. E.L. Thorndike
c. James McKeen Cattell
d. Sir Francis Galton
Answer: c. James McKeen Cattell

2. Who is credited with the creation of the first formal personality test?
a. Alfred Binet
b. David Wechsler
c. Hermann Rorschach
d. Robert Woodworth
Answer: d. Robert Woodworth

3. ______ is an activity that involves judging or appraising the value or worth of something.
a. Assessment
b. Evaluation
c. Measurement
d. Testing
Answer: b. Evaluation

4. Which of the following would be an example of a subjective test?
a. Exam consisting of essay questions
b. Exam consisting of T/F questions
c. Stanford-Binet Intelligence Scale
d. Tests & Measurements Exam
Answer: a. Exam consisting of essay questions

5. Which of the following terms refers to the degree to which test scores are free from measurement error?
a. Reliability
b. Validity
c. Accuracy
d. Consistency
Answer: a. Reliability

6. A typical response test would be used to measure which of the following constructs?
a. Achievement
b. Aptitude
c. Attitudes
d. Intelligence
Answer: c. Attitudes

7. A student scored better than 85% of his or her peers. This is an example of which approach to score interpretation?
a. Construct-referenced
b. Criterion-referenced
c. Norm-referenced
d. Standard-referenced
Answer: c. Norm-referenced

8. Which of the following uses a criterion-referenced approach to score interpretation?
a. Driver’s license exam
b. Intelligence test
c. MMPI-2
Answer: a. Driver’s license exam

9. Which of the following assumptions of psychological assessment is correct?
a. Assessment procedures are essentially error free
b. One source of information is enough for the assessment process
c. Psychological constructs can be measured
d. There is only one way to measure a construct
Answer: c. Psychological constructs can be measured

10.
Which of the following involves situations where people are assigned to different tracks, ordered in some way?
a. Categorization
b. Classification
c. Placement
d. Selection
Answer: c. Placement

11. How many tests did the American Psychological Association (APA) estimate are developed every year?
a. 20
b. 200
c. 2000
d. 20000
Answer: d. 20000

12. Psychological assessment is:
a. broader in scope than testing.
b. one component of testing.
c. a less detailed and technical process than testing.
d. less precise and accurate than testing.
Answer: a. broader in scope than testing.

13. Testing is to assessment as _____________ is to ________________.
a. blood test; medical exam
b. placement; classification
c. X ray; MRI
d. intern; doctor
Answer: a. blood test; medical exam

14. Validity refers to:
a. the accuracy of the interpretation of test scores.
b. the stability or consistency of test scores.
c. the method in which norms for the tests were developed.
d. whether the test is a good measure of a construct.
Answer: a. the accuracy of the interpretation of test scores.

15. Reliability refers to:
a. the stability or consistency of test scores.
b. the method in which the test was developed.
c. the specialization of the test.
Answer: a. the stability or consistency of test scores.

16. Amy completes a 100-item questionnaire that asks her to respond “Yes” or “No” to statements about her typical ways of thinking and behaving. What type of test is this?
a. Aptitude Test
b. Objective Personality Test
c. Power Test
d. Speed Test
Answer: b. Objective Personality Test

17. __________ is defined as a systematic procedure for collecting information that can be used to make inferences about the characteristics of people or objects.
a. Evaluation
b. Measurement
c. Assessment
d. Testing
Answer: c. Assessment

18. Which of the following is NOT a right of a test taker according to the Joint Committee on Testing Practices?
a. The right to review their test questions
b.
The right to receive test administration by trained individuals
c. The right to receive information regarding their test results
d. The right to confidentiality of their results
Answer: a. The right to review their test questions

19. A psychological or educational professional who has specialized in the area of testing, measurement, and assessment is referred to as a/an __________.
a. academician
b. diagnostician
c. psychologist
d. psychometrician
Answer: d. psychometrician

20. Since all psychological tests contain a fixed number of items, they:
a. should be viewed as samples of behavior.
b. by definition, are reliable.
c. cannot be used to predict behavior.
d. by definition, are valid.
Answer: a. should be viewed as samples of behavior.

21. _______ typically contain test items that are all about the same level of difficulty.
a. Objective tests
b. Speed tests
c. Power tests
d. Projective tests
Answer: b. Speed tests

22. __________ tests are typically used to measure what has been learned at a specific point in time; _________ tests are often used to predict future performance or measure potential for learning.
a. Aptitude; achievement
b. Achievement; aptitude
c. Speed; power
d. Power; speed
Answer: b. Achievement; aptitude

23. Who was the German mathematician that first recognized measurement error?
a. Carl Gauss
b. Sigmund Freud
c. James Cattell
d. Clark Wissler
Answer: a. Carl Gauss

24. ______ is often considered the father of mental tests and measurements.
a. Carl Gauss
b. Clark Wissler
c. Sir Francis Galton
d. Alfred Binet
Answer: c. Sir Francis Galton

25. A power test:
a. is a type of typical response test.
b. requires a stringent time limit.
c. emphasizes the use of items of similar difficulty.
d. can focus on aptitude or achievement.
Answer: d. can focus on aptitude or achievement.

26. Typical response tests measure constructs such as:
a. attitudes.
b. achievement.
c. aptitude.
d. intelligence.
Answer: a. attitudes.

27.
Maximum performance tests are designed to:
a. classify students into ability levels.
b. assess students’ ability levels.
c. assess upper limits of examinee’s knowledge and abilities.
d. assess lower limits of examinee’s knowledge and abilities.
Answer: c. assess upper limits of examinee’s knowledge and abilities.

28. Which test below is considered a maximum performance test?
a. Achievement test
b. Depression test
c. Personality test
d. Interests test
Answer: a. Achievement test

29. Performance on pure ________ tests is assessed based on time, while performance on pure ________ tests is assessed based on difficulty.
a. speed; power
b. power; speed
c. achievement; maximum performance
d. maximum performance; achievement
Answer: a. speed; power

30. Johnny is shown a picture of two kids playing in the park and asked to describe what he believes each child is thinking. What type of test is this?
a. Objective personality test
b. Typical response test
c. Maximum performance test
d. Projective personality test
Answer: d. Projective personality test

31. Which scores would be interpreted appropriately for measuring a student’s mastery of a specific domain of knowledge?
a. Norm-referenced scores
b. Criterion-referenced scores
c. Standardized-referenced scores
d. Projective-referenced scores
Answer: b. Criterion-referenced scores

32. An assumption of educational assessment is that tests are designed to measure traits or characteristics, known as:
a. abilities.
b. behaviors.
c. constructs.
d. skills.
Answer: c. constructs.

33. A classroom teacher gives her students a final exam that is the basis for 50% of their final grades in the course. This is an example of which type of evaluation?
a. Projective evaluation
b. Summative evaluation
c. Formative evaluation
d. Feedback evaluation
Answer: b. Summative evaluation

34. Tim received his third exam score for Tests and Measurements and realizes that he needs to study more for the final.
What type of evaluation would help by providing instructive feedback to him?
a. Comprehensive evaluation
b. Feedback evaluation
c. Formative evaluation
d. Summative evaluation
Answer: c. Formative evaluation

35. The majority of assessment information collected by most teachers comes from:
a. professionally developed tests.
b. state-wide tests.
c. performance tests.
d. teacher-made tests.
Answer: d. teacher-made tests.

36. At the classroom level, ________ must be able to interpret assessment results accurately and use them appropriately.
a. counselors
b. diagnosticians
c. school psychologists
d. teachers
Answer: d. teachers

37. Susan has been evaluated and determined to be learning disabled. This is an example of:
a. assignment.
b. classification.
c. placement.
d. selection.
Answer: b. classification.

38. In reference to projective tests, what is the “projective hypothesis”?
a. Examinees’ responses to ambiguous stimuli reflect their genuine unconscious desires, motives, and drives without interference from the ego or conscious mind.
b. Examinees’ responses to specific stimuli reflect their genuine conscious desires, motives, and drives.
c. Examinees’ responses to specific stimuli reflect their genuine unconscious desires, motives, and drives without interference from the ego or conscious mind.
d. Examinees’ responses to ambiguous stimuli reflect their genuine conscious desires, motives, and drives.
Answer: a. Examinees’ responses to ambiguous stimuli reflect their genuine unconscious desires, motives, and drives without interference from the ego or conscious mind.

39. The Scholastic Assessment Test (SAT) is a(n):
a. pure speed test.
b. maximum performance test.
c. typical response test.
d. projective test.
Answer: b. maximum performance test.

40. _________ is any systematic procedure for collecting information that can be used to make inferences about the characteristics of people.
a. Appraisal
b. Assessment
c. Evaluation
d. Measurement
Answer: b.
Assessment

Chapter 2

1. The correct order of the following scales of measurement, from most to least precise, is:
a. nominal, ordinal, interval, ratio
b. interval, ratio, ordinal, nominal
c. ordinal, nominal, interval, ratio
d. ratio, interval, ordinal, nominal
Answer: d. ratio, interval, ordinal, nominal

2. A scale used in an experiment assigns a value of “1” to subjects that are female and a value of “2” to subjects that are male. What type of scale is used in this experiment?
a. Interval
b. Nominal
c. Ordinal
d. Ratio
Answer: b. Nominal

3. Which scale does not necessarily have equal distance between intervals?
a. Interval
b. Ordinal
c. Ratio
Answer: b. Ordinal

4. Addition and subtraction operations can be used with ________ scale(s) of measurement.
a. all
b. ordinal, interval, and ratio
c. interval and ratio
d. ratio
Answer: c. interval and ratio

5. With which scale(s) of measurement is it possible to correctly state that a score of 80 reflects twice as much as a score of 40?
a. All scales
b. No scales
c. Ratio, interval, and ordinal
d. Interval and ratio
e. Ratio
Answer: e. Ratio

6. Which scale of measurement has a true zero point?
a. Interval
b. Nominal
c. Ordinal
d. Ratio
Answer: d. Ratio

7. Weight in pounds is an example of which scale of measurement?
a. Ratio
b. Ordinal
c. Nominal
d. Interval
Answer: a. Ratio

8. The final standing of runners after a race would be an example of which scale of measurement?
a. Interval
b. Nominal
c. Ordinal
d. Ratio
Answer: c. Ordinal

9. Ratio scales are relatively rare in psychological measurement because:
a. people generally do not like negative numbers.
b. they frequently produce skewed distributions.
c. using equal scale units makes it possible to compare individuals.
d. it is difficult to define a true zero point with various psychological constructs.
Answer: d. it is difficult to define a true zero point with various psychological constructs.

10. A negatively skewed distribution is one with:
a.
few scores at the low end and many scores grouped at the high end.
b. few scores at the high end and many scores grouped at the low end.
c. few scores at the high end and many negative scores.
Answer: a. few scores at the low end and many scores grouped at the high end.

11. What is the preferred measure of central tendency for a distribution with a significant positive skew?
a. Mean
b. Median
c. Mode
d. Mid-percentile rank
Answer: b. Median

12. When one has a small set of scores that contains an extreme score (relative to the others), which measure of central tendency would be significantly impacted?
a. Mean
b. Median
c. Mode
Answer: a. Mean

13. Which measure of central tendency can be used with all four scales of measurement?
a. Mean
b. Median
c. Mode
Answer: c. Mode

14. The standard deviation can be interpreted as measuring:
a. the average distance that scores vary from the mean of the distribution.
b. the greatest distance that scores vary from the mean of the distribution.
c. the average percentage of the distance that the scores vary from the mean of the distribution.
d. the percentage of the distance that half the scores vary from the mean of the distribution.
Answer: a. the average distance that scores vary from the mean of the distribution.

15. A set of scores with a standard deviation (SD) of 4 would have a variance equal to:
a. 2.
b. 4.
c. 8.
d. 16.
Answer: d. 16.

16. What measure of central tendency must you be able to first calculate in order to calculate a standard deviation?
a. Mean
b. Median
c. Mode
d. Range
Answer: a. Mean

17. A set of scores has a variance equal to 25. The standard deviation would be equal to:
a. 5.
b. 12.5.
c. 50.
d. 100.
Answer: a. 5.

18. In a distribution that is positively skewed, what is the relationship between the mean and median?
a. There is no predictable relationship.
b. The mean is greater than the median.
c. The mean is less than the median.
d. The mean and median are equal.
Answer: b.
The mean is greater than the median.

19. If the distribution of scores on a classroom test has a strong positive skew, then for this group of students, the test was most likely:
a. of average difficulty.
b. too easy.
c. too difficult.
d. not enough information to determine.
Answer: c. too difficult.

20. Approximately what percentage of scores falls below 1 standard deviation above the mean?
a. 34%
b. 82%
c. 84%
d. 98%
Answer: c. 84%

21. A bimodal distribution is best described as:
a. a distribution with two scores that are the same.
b. a distribution with two scores of the same frequency.
c. a distribution with two scores that are equal in frequency and higher than other scores.
d. a distribution with two scores that are equal in frequency and lower than other scores.
Answer: c. a distribution with two scores that are equal in frequency and higher than other scores.

22. A ________ is a graph that visually displays the relationship between two variables.
a. bar graph
b. list
c. table
d. scatterplot
Answer: d. scatterplot

23. If a large decrease on one variable is associated with a correspondingly large increase on another variable, the correlation is likely:
a. Strong and negative
b. Weak and negative
c. Strong and positive
d. Weak and positive
Answer: a. Strong and negative

24. Assume a correlation between two tests is 0.40 (r = 0.40). What would the coefficient of determination equal?
a. 0.16
b. 0.20
c. 0.40
d. 0.60
Answer: a. 0.16

25. If the coefficient of determination is 0.30, what is the coefficient of non-determination?
a. 0.09
b. 0.30
c. 0.60
d. 0.70
Answer: d. 0.70

26. A ________ is appropriate when both variables are measured on an interval or ratio scale.
a. Pearson product-moment correlation
b. Spearman rank correlation coefficient
c. Point-Biserial correlation coefficient
d. Beta coefficient
Answer: a. Pearson product-moment correlation

27.
If you want to calculate the correlation between spelling skills and reading comprehension, which are both reported as percentile ranks, the appropriate coefficient is:
a. Beta Coefficient.
b. Cronbach’s Coefficient Alpha.
c. Pearson Product-Moment Correlation.
d. Spearman Rank-Difference Correlation.
Answer: d. Spearman Rank-Difference Correlation.

28. The appropriate correlation coefficient to use when one variable is dichotomous and one is measured on an interval or ratio scale is:
a. Alpha Coefficient.
b. Pearson Product-Moment Correlation.
c. Point-Biserial Correlation.
d. Spearman Rank-Difference Correlation.
Answer: c. Point-Biserial Correlation.

29. If the range of one or both variables is restricted, the resulting correlation coefficient will likely:
a. be decreased.
b. be increased.
c. remain the same.
d. either increase or decrease depending on the calculations.
Answer: a. be decreased.

30. Coefficients based on samples with ________ variances will generally produce _______ correlation coefficients than those based on samples with ________ variances.
a. Large; higher; small
b. Large; lower; small
c. Small; higher; large
d. Small; equal; large
Answer: a. Large; higher; small

31. Correlation _______ imply causation.
a. does
b. does not
c. might, depending on the scale of measurement
Answer: b. does not

32. If the correlation between math and writing achievement scores in the general population is 0.40, the correlation between these scores among students at MIT, Yale, and Stanford (selective and prestigious universities) would likely be:
a. approximately 0.40.
b. less than 0.40.
c. greater than 0.40.
d. there is insufficient information to determine.
Answer: b. less than 0.40.

33. The correlation between two variables is 0.70. Using the concept of the coefficient of determination, the proportion of variance that is determined or predictable from the relationship between the two measures is:
a. 14%.
b. 30%.
c. 49%.
d. 70%.
Answer: c. 49%.

34.
The correlation between two variables is 0.70. Using the concept of the coefficient of non-determination, the proportion of variance that is NOT determined or predictable from the relationship between the two variables is:
a. 14%.
b. 30%.
c. 51%.
d. 70%.
Answer: c. 51%.

35. A special mathematical procedure for predicting scores on one variable (criterion or Y) given a score on another (predictor or X) is:
a. correlational analysis.
b. linear regression.
c. regression analysis.
d. prediction constant.
Answer: b. linear regression.

36. According to the guidelines presented in the text, a correlation of .25 would be considered:
a. weak.
b. moderate.
c. strong.
Answer: a. weak.

37. Why would a psychologist feel that the variance might be difficult to interpret?
a. It is a nonlinear transformation.
b. It is an area transformation.
c. It may be a negative number.
d. It uses squared raw score units.
Answer: d. It uses squared raw score units.

38. What is the range of the data below?
8 10 12 14 16 18 20 22
a. 8
b. 12
c. 14
d. 22
Answer: c. 14

39. In a normal distribution, what percentage of scores will fall between one standard deviation below the mean and one standard deviation above the mean?
a. 16%
b. 34%
c. 68%
d. 84%
e. 98%
Answer: c. 68%

40. Most correlation coefficients assume a linear relationship. If a curvilinear relationship exists, traditional correlation coefficients will likely _________ this relationship.
a. overestimate
b. underestimate
c. not change
Answer: b. underestimate

Chapter 3

1. Norm-referenced interpretations are _________ while criterion-referenced interpretations are ________.
a. absolute; relative
b. not raw scores; raw scores
c. raw scores; not raw scores
d. relative; absolute
Answer: d. relative; absolute

2. Norm-referenced interpretations are considered to be ___________, while criterion-referenced interpretations are considered to be __________.
a. absolute; relative
b. relative; absolute
c. reliable; unreliable
d. unreliable; reliable
Answer: b.
relative; absolute

3. A test can produce interpretations for:
a. norm-referenced scores only.
b. criterion-referenced scores only.
c. both norm- and criterion-referenced scores.
d. neither norm-referenced nor criterion-referenced scores.
Answer: c. both norm- and criterion-referenced scores.

4. For norm-referenced interpretations, what is the most important consideration?
a. The relevance of the norm group to which the examinee is compared
b. The size of the norm group to which the examinee is compared
c. How clearly the knowledge or skill domain being assessed is specified
d. How clearly the group being compared to is similar in skill domain
Answer: a. The relevance of the norm group to which the examinee is compared

5. Norm-referenced interpretations can be applied to:
a. maximum performance tests only.
b. typical response tests only.
c. both maximum performance and typical response tests.
d. achievement tests only.
Answer: c. both maximum performance and typical response tests.

6. Standard scores are:
a. more reliable than raw scores.
b. more valid than raw scores.
c. linear transformations of raw scores.
Answer: c. linear transformations of raw scores.

7. Ultimately, it is the responsibility of the _______ to evaluate the adequacy of the standardization sample and the appropriateness of comparing the examinee’s score to this group.
a. test developer
b. test publisher
c. test taker
d. test user
Answer: d. test user

8. Which of the following is best used for norm-referenced score interpretation?
a. Percent correct
b. Mastery testing
c. Standard scores
d. Standards-based interpretations
Answer: c. Standard scores

9. A student receives a score of 78. What is the major concern in interpreting this score?
a. It is too low for the typical grading system.
b. It cannot be interpreted without knowing the distribution it comes from.
c. It is insufficiently precise for accurate assessment.
d. It is too precise for accurate assessment.
Answer: b.
It cannot be interpreted without knowing the distribution it comes from.

10. z- and T-scores are types of:
a. curvilinear transformations.
b. nominal measures.
c. ordinal measures.
d. standard scores.
Answer: d. standard scores.

11. Z-scores have a mean of ___ and a standard deviation of ____.
a. 0; 1
b. 0; 1.6
c. 1; 1
d. 1; 1.6
Answer: a. 0; 1

12. What is the variance of z-scores?
a. .01
b. .1
c. 0
d. 1
e. 10
Answer: d. 1

13. Jimmy received a z-score of +2.0 on a math test. What do we know about Jimmy's performance, assuming that the math test scores are distributed normally?
a. He scored better than 16% of the other students.
b. He scored better than 34% of the other students.
c. He scored better than 50% of the other students.
d. He scored better than 84% of the other students.
e. He scored better than 98% of the other students.
Answer: e. He scored better than 98% of the other students.

14. T-scores have an advantage over z-scores because:
a. T-scores have no negative numbers.
b. z-scores have no decimal points.
c. z-scores are less precise.
d. T-scores are reported in squared units.
Answer: a. T-scores have no negative numbers.

15. T-scores have a mean of ____ and a standard deviation of _____.
a. 0; 1
b. 10; 50
c. 50; 10
d. 100; 50
Answer: c. 50; 10

16. Approximately what percentage of scores falls below a T-score of 40 assuming the score distribution is normal?
a. 16%
b. 34%
c. 50%
d. 84%
e. 98%
Answer: a. 16%

17. Julie receives a T-score of 30 on a reading skills assessment. What can be determined from this information, assuming that the reading skills assessment scores are normally distributed?
a. She scored better than 2% of the other students.
b. She scored better than 16% of the other students.
c. She scored better than 34% of the other students.
d. She scored better than 50% of the other students.
Answer: a. She scored better than 2% of the other students.

18. Approximately what percentage of scores falls below an IQ of 115?
a. 16%
b. 34%
c. 50%
d. 84%
e.
98%
Answer: d. 84%

19. On a math test, the raw score mean is 70 and the standard deviation is 3. The raw scores are converted to T-scores. The mean and standard deviation, respectively, of the T-scores will be:
a. 0 and 1.
b. 10 and 3.
c. 50 and 10.
d. 70 and 3.
Answer: c. 50 and 10.

20. According to the text authors, ___________ are norm-referenced derived scores that need to be interpreted with extreme caution and are only ordinal in nature.
a. percentile ranks
b. standard scores
c. grade equivalents
d. all of the above
Answer: c. grade equivalents

21. A test designed to yield information about whether or not a student has mastered the ability to add and subtract three-digit numbers would most likely produce:
a. criterion-referenced scores.
b. norm-referenced scores.
c. standardized scores.
d. percentile ranks.
Answer: a. criterion-referenced scores.

22. For criterion-referenced interpretations, the most important consideration is:
a. the relevance of the group to which the examinee is compared.
b. how broad the knowledge or skill domain being assessed is.
c. how clearly the knowledge or skill domain being assessed is specified.
d. how well the examinee performs.
Answer: c. how clearly the knowledge or skill domain being assessed is specified.

23. Which of the following is an example of a criterion-referenced score?
a. Cut score
b. IQ
c. Standard score
d. T-score
Answer: a. Cut score

24. What can you conclude about two people with percentile rank scores of 55 and 30?
a. Their raw test scores differed by 25 points.
b. They differ by the same amount as people with percentile rank scores of 60 and 85.
c. The person with a rank of 55 had a higher score.
d. All of the above.
Answer: c. The person with a rank of 55 had a higher score.

25. A(n) _______ trait is some ability or characteristic that is inferred to exist based on theories of behavior as well as evidence of its existence.
a. indirect
b. latent
c. overt
d. subtle
Answer: b. latent

26.
If you want to calculate the correlation between reading skills and reaction speed, which are both reported as T-scores, the appropriate coefficient is:
a. Cronbach’s Coefficient Alpha.
b. Pearson Product-Moment Correlation.
c. Phi Coefficient.
d. Spearman Rank-Difference Correlation.
Answer: b. Pearson Product-Moment Correlation.

27. Criterion-referenced interpretations are commonly applied to:
a. maximum performance tests only.
b. typical response tests only.
c. both maximum performance and typical response tests.
Answer: a. maximum performance tests only.

28. The majority of percentile ranks are found where in the distribution?
a. Towards the left tail
b. Towards the right tail
c. Clustered in the middle
d. Evenly distributed across the distribution
Answer: c. Clustered in the middle

29. The “Flynn Effect” refers to:
a. decreases in IQ observed during the 20th century.
b. decreases in SAT scores observed during the 20th century.
c. increases in SAT scores observed during the 20th century.
d. increases in IQ observed during the 20th century.
Answer: d. increases in IQ observed during the 20th century.

30. What person or entity developed stanine scores?
a. US Air Force
b. US Army
c. David Wechsler
d. Alfred Binet
Answer: a. US Air Force

31. Wechsler subtest scaled scores have a mean of _____ and a standard deviation of ______.
a. 0; 1
b. 10; 3
c. 50; 10
d. 500; 100
Answer: b. 10; 3

32. Normalized standard scores are often:
a. linear transformations.
b. nonlinear transformations.
d. oblique transformations.
e. orthogonal transformations.
Answer: b. nonlinear transformations.

33. The norms for a standardized intelligence test describe the:
a. ideal level of performance.
b. minimum acceptable level of performance.
c. performance of a specified group.
d. performance of a successful group.
Answer: c. performance of a specified group.

34. A percentile rank of 80 implies that:
a. 20% of the individuals in the standardization group scored below this score.
b.
80% of the individuals in the standardization group scored below this score.
c. 80% of the individuals in the standardization group scored above this score.
Answer: b. 80% of the individuals in the standardization group scored below this score.

35. A percentile rank of _____ indicates performance at the median of the reference group.
a. 49
b. 49.5
c. 50
d. 51
Answer: c. 50

36. Which level of measurement do grade equivalents reflect?
a. Nominal
b. Ordinal
c. Interval
d. Ratio
Answer: b. Ordinal

37. ___________ interpretations can be applied to a wider variety of tests than __________ interpretations.
a. Criterion-referenced; norm-referenced
b. Norm-referenced; criterion-referenced
c. Both are about equal in their applications.
Answer: b. Norm-referenced; criterion-referenced

38. __________ is a model of mental measurement that holds that the responses to the items are accounted for by latent traits.
a. Criterion-referenced interpretation
b. Item Response Theory
c. Norm-referenced interpretation
Answer: b. Item Response Theory

39. CEEB (SAT/GRE) scores have a mean of _____ and a standard deviation of ______.
a. 0; 1
b. 50; 10
c. 100; 15
d. 500; 100
Answer: d. 500; 100

40. A z-score of 1 is equal to a T-score of:
a. 10.
b. 20.
c. 30.
d. 50.
e. 60.
Answer: e. 60.

Chapter 4

1. In Classical Test Theory, the X represents _______ and the T represents ________.
a. measurement error; observed score
b. observed score; stable test-taker characteristics
c. observed score; measurement error
d. stable test-taker characteristics; observed score
Answer: b. observed score; stable test-taker characteristics

2. _________ refers to the consistency or stability of test scores.
a. Measurement error
b. Reliability
c. Variance
d. Validity
Answer: b. Reliability

3. Classical Test Theory focuses our attention on ________ measurement error.
a. random
b. variable
c. standard
d. systematic
Answer: a. random

4. The mean of error scores in a population is equal to ________.
a. 0.1
b. 0
c. 1
d.
10
Answer: b. 0

5. There is (a) _______ relationship between an individual’s level on a construct and the amount of measurement error impacting their observed score.
a. no
b. weak
c. moderate
d. strong
Answer: a. no

6. ________ is/are usually considered the largest source of error in test scores.
a. Administrative errors
b. Clerical errors
c. Content sampling
d. Time sampling
Answer: c. Content sampling

7. On a test comprised of constructed response items, it is important to consider:
a. administrative errors.
b. clerical errors.
c. inter-rater differences.
d. time sampling.
Answer: c. inter-rater differences.

8. The reliability coefficient ($r_{xx}$) equals true score variance ($\sigma^2_T$) divided by the:
a. observed score.
b. measurement error.
c. variance due to measurement error.
d. variance of the total test.
Answer: d. variance of the total test.

9. What conclusion could be accurately drawn from a reliability coefficient of .80?
a. 18% of test score variance is due to true score variance.
b. 20% of test score variance is due to true score variance.
c. 64% of test score variance is due to true score variance.
d. 80% of test score variance is due to true score variance.
Answer: d. 80% of test score variance is due to true score variance.

10. If 6% of test scores’ observed variance is due to measurement error, the reliability coefficient of the test would be:
a. .06.
b. .36.
c. .60.
d. .94.
Answer: d. .94.

11. Test-retest reliability is primarily sensitive to measurement error due to:
a. content sampling.
b. content sampling and temporal instability.
c. factor invariance.
d. temporal instability.
Answer: d. temporal instability.

12. Alternate form reliability based on simultaneous administration is primarily sensitive to measurement error due to:
a. content sampling.
b. content sampling and temporal instability.
c. practice effects.
d. temporal instability.
Answer: a. content sampling.

13.
Alternate form reliability based on delayed administration is sensitive to measurement error due to:
a. content sampling.
b. content sampling and temporal instability.
c. practice effects.
d. temporal instability.
Answer: b. content sampling and temporal instability.

14. As a general rule, _________ tests produce more reliable scores than ______ tests.
a. brief; lengthy
b. intensive; extensive
c. longer; shorter
d. shorter; longer
Answer: c. longer; shorter

15. The uncorrected split-half reliability coefficient __________ the reliability of the full test score.
a. accurately reflects
b. overestimates
c. underestimates
d. provides an indeterminate reflection of
Answer: c. underestimates

16. Split-half reliability is primarily sensitive to measurement error due to:
a. content sampling.
b. content sampling and temporal instability.
c. practice effects.
d. temporal instability.
Answer: a. content sampling.

17. __________ is sensitive to the heterogeneity of the test content.
a. Alternate-form reliability with delayed administration
b. Coefficient Alpha
c. Split-half reliability
d. Test-retest reliability
Answer: b. Coefficient Alpha

18. _________ is applicable when test items are scored dichotomously while _________ can be used when test items produce multiple values.
a. Coefficient Alpha; KR 20
b. KR 20; Coefficient Alpha
c. Split-half reliability; test-retest reliability
d. Test-retest reliability; split-half reliability
Answer: b. KR 20; Coefficient Alpha

19. On a classroom essay test, __________ is a major concern.
a. inter-rater reliability
b. internal consistency reliability
c. split-half reliability
d. test-retest reliability
Answer: a. inter-rater reliability

20. The reliability of composite scores is generally ________ the reliability of the individual scores that contributed to the composite.
a. equal to
b. higher than
c. lower than
Answer: b. higher than

21.
Which of the following is a measure of inter-rater agreement that takes into consideration the degree of agreement expected by chance? a. KR 20 b. Strong-Campbell Beta c. Cronbach’s Coefficient alpha d. Cohen’s kappa Answer: d. Cohen’s kappa 22. Which of the following methods is necessary when estimating the reliability of a test score intended to predict performance at a future time? a. Alternate form reliability with simultaneous administration b. Coefficient alpha c. KR 20 d. Test-rest reliability Answer: d. Test-rest reliability 23. Which reliability estimate would be preferred for a score derived from a test with heterogeneous content? a. Coefficient Aplha b. KR 20 c. Split-Half Coefficient Answer: c. Split-Half Coefficient 24. Which method of rating reliability would be appropriate for scores from a speed test? a. Coefficient Alpha b. Kuder Richardson 20 c. Test-retest reliability d. Split-half reliability Answer: c. Test-retest reliability 25. Reliability coefficients based on a homogeneous sample would likely be ________ coefficients based on a heterogeneous sample. a. equal to b. larger than c. smaller than Answer: c. smaller than 26. As the reliability of a test score _______ the standard error of measurement _______. a. decreases; increases b. decreases; decreases c. increases; remains the same d. decreases; remains the same Answer: a. decreases; increases 27. Sally’s obtained scored on a statistics exam is 75. The SEM is 2. With what confidence interval would we capture her true score 68% of the time? a. 71 to 79 b. 73 to 77 c. 69 to 81 d. 70 to 80 Answer: b. 73 to 77 28. Generalizability Theory typically uses which statistical procedure to estimate reliability? a. Analysis of Variance (ANOVA) b. Correlation Coefficient c. Linear Regression d. Multivariate Analysis of Vvariance (MANOVA) Answer: a. Analysis of Variance (ANOVA) 29. The average of all possible split-half coefficients is known as: a. Coefficient alpha. b. correlation coefficient. c. 
alternate form reliability. d. Spearman-Brown coefficient. Answer: a. Coefficient alpha. 30. A limitation of the test-retest approach to estimating reliability is the influence of: a. administration effects. b. content effects. c. practice effects. d. temporal effects. Answer: c. practice effects. 31. The Spearman-Brown formula is used to: a. correct a split-half reliability coefficient. b. estimate construct reliability. c. perform a curvilinear transformation of the scores. d. perform a linear transformation of the scores. Answer: a. correct a split-half reliability coefficient. 32. As reliability increases, confidence intervals: a. decrease. b. do not change. c. increase. Answer: a. decrease. 33. _________ is a result of transient events in the test taker (fatigue, illness, etc.) and the testing environment (temperature, noise level, etc.). a. Administration error b. Content sampling error c. Temporal instability d. Systematic measurement error Answer: c. Temporal instability 34. The reliability of difference scores is typically _______ the reliability of the individual scores. a. equal to b. higher than c. lower than Answer: c. lower than 35. The reliability index reflects the correlation between: a. true scores and observed scores. b. true scores and measurement error. c. observed scores and measurement error. d. true scores and true scores. Answer: a. true scores and observed scores. 36. If the reliability coefficient equals .81, the reliability index equals: a. .19. b. .81. c. .90. d. 1.0. Answer: c. .90. 37. What happens to the size of confidence intervals as reliability coefficients increase? a. They decrease. b. They increase. c. They remain the same. d. It is indeterminate – it depends on the construct being measured. Answer: a. They decrease. 38. The ____________ is an index of the amount of measurement error in test scores and is used in calculating confidence intervals. a. Standard Error of Estimate b. Standard Error of Measurement c. 
Spearman-Brown Coefficient d. Skew Coefficient Answer: b. Standard Error of Measurement 39. ______________________ is a useful index when comparing the reliability of the scores produced by different tests, but when the focus is on interpreting the test scores of individuals, the ________________________ is more practical. a. Reliability Coefficient; Standard Error of Measurement b. Standard Error of Measurement; Reliability Coefficient c. Standard Error of Estimate; Coefficient Alpha d. Standard Error of Estimate; Reliability Coefficient Answer: a. Reliability Coefficient; Standard Error of Measurement 40. In Item Response Theory, information on the reliability of test scores is typically reported as a: a. Test Information Function b. Standard Error of Estimate c. Skew Coefficient d. Coefficient of Determination e. Coefficient of Non-determination Answer: a. Test Information Function
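
Worked check for the computational items: the arithmetic behind items 10, 27, and 36, and the correction named in item 31, can be verified with a short Python sketch. It assumes the standard classical test theory formulas (reliability = 1 minus the proportion of error variance; reliability index = square root of the reliability coefficient; Spearman-Brown correction for a doubled test length; SEM = SD × √(1 − r_xx)). The function names are illustrative and do not come from the test bank, and the split-half value of .60 is a hypothetical input.

```python
import math


def reliability_from_error(error_proportion):
    """Reliability coefficient = 1 - proportion of observed variance due to error."""
    return 1.0 - error_proportion


def reliability_index(rxx):
    """Correlation between true and observed scores = square root of rxx."""
    return math.sqrt(rxx)


def spearman_brown(r_half):
    """Correct a split-half correlation to estimate full-length reliability."""
    return 2.0 * r_half / (1.0 + r_half)


def sem(sd, rxx):
    """Standard error of measurement = SD * sqrt(1 - rxx)."""
    return sd * math.sqrt(1.0 - rxx)


def confidence_interval(obtained, sem_value, z=1.0):
    """Symmetric interval around the obtained score; z=1.0 gives roughly 68%."""
    return (obtained - z * sem_value, obtained + z * sem_value)


if __name__ == "__main__":
    print(round(reliability_from_error(0.06), 2))  # item 10
    print(round(reliability_index(0.81), 2))       # item 36
    print(confidence_interval(75, 2))              # item 27
    print(round(spearman_brown(0.60), 2))          # hypothetical split-half value
```

Because the SEM shrinks as r_xx rises, the interval returned by `confidence_interval` narrows with higher reliability, which is the relationship tested in items 26, 32, and 37.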

