This Document Contains Chapters 5 to 8

Chapter 5

1. _________ refers to the appropriateness or accuracy of the interpretations of test scores. a. Correlation b. Evidence c. Reliability d. Validity Answer: d. Validity
2. A 4th grade final exam for mathematics has extensive written instructions. In addition to math skills, this exam is probably measuring reading comprehension skills as well. This is an example of which threat to validity? a. Construct-irrelevant variance b. Construct underrepresentation c. Factor variance d. Item homogeneity Answer: a. Construct-irrelevant variance
3. A 3rd grade mathematics exam designed to be a comprehensive measure of the mathematics skills covered in the 3rd grade curriculum only includes questions involving fractions. This is an example of which threat to validity? a. Construct-irrelevant variance b. Construct underrepresentation c. Factor variance d. Item homogeneity Answer: b. Construct underrepresentation
4. Only true score variance is reliable, and only _________ can be related systematically to any construct the test is designed to measure. a. true score variance b. observed score variance c. error variance Answer: a. true score variance
5. Contemporary psychometric standards emphasize: a. three different types of validity. b. that different validity types apply to different assessments. c. that validity is a unitary construct. d. that validity is conceptualized as being all or none. Answer: c. that validity is a unitary construct.
6. What is produced by an examination of the relationship between the content of the test and the construct or domain the test is designed to measure? a. Validity evidence based on test content b. Validity evidence based on internal structure c. Validity evidence based on response processes d. Validity evidence based on relations to other variables e. Validity evidence based on consequences of testing Answer: a. Validity evidence based on test content
7. __________ refers to the degree to which a test 'appears' to measure what it is designed to measure. a. Content validity b. Construct validity c. Criterion-related validity d. Face validity Answer: d. Face validity
8. What type of validity study used to collect test-criterion evidence involves an administration of the test, a time interval, and then a measure of the criterion? a. Concurrent study b. Construct study c. Delayed evidence study d. Predictive study Answer: d. Predictive study
9. The GRE is given to students prior to entering their first year of graduate school. What type of validity evidence is most important for this test? a. Validity evidence based on consequences of testing b. Validity evidence based on test content c. Validity evidence based on relations to other variables d. Validity evidence based on response processes e. Validity evidence based on internal structure Answer: c. Validity evidence based on relations to other variables
10. According to the authors of the text, the minimum size for acceptable validity coefficients is: a. 0.60. b. 0.70. c. 0.80. d. 0.90. e. indeterminate — it depends on many factors. Answer: e. indeterminate — it depends on many factors.
11. A math test intended to cover the year's content was administered to a class. The teacher ran out of time the night before and only asked questions from the first semester's material. What type of threat to validity is present? a. Construct underrepresentation b. Construct-irrelevant variance c. Face validity d. Discriminant validity Answer: a. Construct underrepresentation
12. A statistic called the _____________ is used to describe the amount of prediction error resulting from the imperfect correlation between a test score and a criterion. a. error variance b. observed score variance c. standard error of estimate d. standard error of measurement Answer: c. standard error of estimate
13. ________ models help the test user determine how much information a predictor test can contribute when making classification decisions. a. Choice-theory b. Classification-theory c. Decision-theory d. Selection-theory Answer: c. Decision-theory
14. The __________ of a measure to a diagnostic condition is the ability of the test at a predetermined cut score to detect the presence of the disorder. a. correlation b. reliability c. sensitivity d. specificity Answer: c. sensitivity
15. Research has shown that validity coefficients: a. can rarely be generalized across samples. b. can be generalized more than previously thought. c. must be > .90 to have any utility. d. are of little utility in evaluating the validity of score interpretations. Answer: b. can be generalized more than previously thought.
16. Dr. Jones has developed a new test to measure intelligence in children and decides to see how well it correlates with the Wechsler Intelligence Scale for Children (WISC-IV). This would be an example of which type of validity evidence? a. Convergent evidence of validity b. Discriminant evidence of validity c. Validity evidence based on response processes d. Validity evidence based on test content Answer: a. Convergent evidence of validity
17. Reynolds and Kamphaus (2003) note that ___________ is a statistical approach that allows one to evaluate the presence and structure of any latent constructs existing among a set of variables. a. factor analysis b. linear regression c. multimatrix transformation d. multi-trait regression Answer: a. factor analysis
18. When deciding how many factors to retain in a factor analysis, a common approach, referred to as the Kaiser-Guttman criterion, is to examine eigenvalues and retain those greater than ______. a. 0.3 b. 0.5 c. 0.8 d. 1.0 Answer: d. 1.0
19. A test developer assembles a group of high school science teachers to evaluate and compare the themes and wording of the items on a science test with the intended objectives. Which category of validity evidence does this best represent? a. Validity evidence based on the consequences of testing b. Validity evidence based on test content c. Validity evidence based on relations to other variables d. Validity evidence based on internal structure e. Validity evidence based on response processes Answer: b. Validity evidence based on test content
20. Of the following, which is NOT a threat to validity? a. Coaching b. Examinee characteristics such as high test anxiety c. Appropriateness of reference group d. High reliability coefficients Answer: d. High reliability coefficients
21. Typically, qualitative approaches such as having experts judge the relevance of items on a test have been used to collect which type of validity evidence? a. Validity evidence based on consequences of testing b. Validity evidence based on test content c. Validity evidence based on relations to other variables d. Validity evidence based on response processes e. Validity evidence based on internal structure Answer: b. Validity evidence based on test content
22. ____________ is typically used to assess classroom academic achievement tests. a. Validity evidence based on consequences of testing b. Validity evidence based on test content c. Validity evidence based on relations to other variables d. Validity evidence based on response processes e. Validity evidence based on internal structure Answer: b. Validity evidence based on test content
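The standard error of estimate in item 12 has a simple closed form, SEest = SDy * sqrt(1 - r^2), where SDy is the standard deviation of the criterion and r the validity coefficient. A minimal Python sketch (the criterion SD of 15 and the coefficients are illustrative values, not from the text) showing the inverse relationship that item 38 below tests: as the validity coefficient grows, prediction error shrinks.

    import math

    def standard_error_of_estimate(sd_criterion, validity_coefficient):
        # SEest = SDy * sqrt(1 - r^2): prediction error around the regression line
        return sd_criterion * math.sqrt(1 - validity_coefficient ** 2)

    for r in (0.30, 0.60, 0.90):
        print(f"r = {r:.2f}  SEest = {standard_error_of_estimate(15.0, r):.2f}")
        # prints 14.31, 12.00, 6.54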
23. Which statistical procedure allows one to predict performance on one test from performance on another? a. factor analysis b. linear regression c. sensitivity analysis d. analysis of variance Answer: b. linear regression
24. When tests are used to make decisions regarding personnel selection, the proportion of applicants needed to fill positions is referred to as the: a. base rate. b. positive retention rate. c. negative retention rate. d. selection ratio. Answer: d. selection ratio.
25. When tests are used to make decisions regarding personnel selection, the proportion of applicants who can be successful on the criterion is referred to as the: a. base rate. b. positive retention rate. c. negative retention rate. d. selection ratio. Answer: a. base rate.
26. Dr. Handle has developed a new test designed to measure anxiety. He subsequently correlates his new test with a test that measures sensation seeking. He discovers a negative correlation between the two measures. This is an example of which type of validity evidence? a. Convergent evidence b. Discriminant evidence c. Validity evidence based on test content d. Validity evidence based on response processes Answer: b. Discriminant evidence
27. There exists a relatively sophisticated validation technique referred to as ___________ that combines convergent and divergent strategies. a. analysis of variance b. factor analysis c. linear regression d. multitrait-multimethod matrix Answer: d. multitrait-multimethod matrix
28. A new intelligence test is developed and then given to a group of students with no known disabilities and a group of students diagnosed with mental retardation. This method of obtaining validity evidence is referred to as a: a. contrasted groups study. b. test-retest study. c. alternate form study. d. norm referenced study. Answer: a. contrasted groups study.
29. Which factor analytic method analyzes only shared variance while excluding unique and error variance? a. Component Factor Analysis b. Confirmatory Factor Analysis c. Principal Component Analysis d. Principal Factor Analysis Answer: d. Principal Factor Analysis
30. Factor analysis is commonly used to collect validity evidence based on: a. consequences of testing. b. test content. c. relations to other variables. d. response processes. e. internal structure. Answer: e. internal structure.
31. After examinees complete a math reasoning test, they are interviewed by researchers and asked what processes and strategies they engaged in when completing the test. Which type of validity evidence are these researchers collecting? a. Validity evidence based on consequences of testing b. Validity evidence based on test content c. Validity evidence based on relations to other variables d. Validity evidence based on response processes e. Validity evidence based on internal structure Answer: d. Validity evidence based on response processes
32. Which type of validity evidence is most controversial and ignites considerable debate among the experts? a. Validity evidence based on consequences of testing b. Validity evidence based on test content c. Validity evidence based on relations to other variables d. Validity evidence based on response processes e. Validity evidence based on internal structure Answer: a. Validity evidence based on consequences of testing
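Items 13-14 and 24-25 (and item 34 below) all concern how a test behaves as a classifier at a given cut score. A minimal sketch under assumed, hypothetical counts of true/false positives and negatives; the function name and all numbers are illustrative, not from the text.

    def classification_stats(tp, fp, fn, tn, positions_to_fill=None):
        # tp/fn: criterion successes the test did / did not flag;
        # fp/tn: criterion failures the test flagged / correctly screened out
        total = tp + fp + fn + tn
        stats = {
            "sensitivity": tp / (tp + fn),    # detecting the condition when present (item 14)
            "specificity": tn / (tn + fp),    # ruling it out when absent (item 34 below)
            "base rate": (tp + fn) / total,   # proportion who can succeed on the criterion (item 25)
        }
        if positions_to_fill is not None:
            stats["selection ratio"] = positions_to_fill / total  # item 24
        return stats

    print(classification_stats(tp=40, fp=10, fn=20, tn=30, positions_to_fill=25))
    # sensitivity 0.67, specificity 0.75, base rate 0.60, selection ratio 0.25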
33. Which type of validity evidence can best be strengthened by the use of a table of specifications? a. Validity evidence based on consequences of testing b. Validity evidence based on test content c. Validity evidence based on relations to other variables d. Validity evidence based on response processes e. Validity evidence based on internal structure Answer: b. Validity evidence based on test content
34. The __________ of a measure is the ability of the test at a predetermined cut score to determine accurately the absence of the disorder. a. correlation b. reliability c. sensitivity d. specificity Answer: d. specificity
35. _______________ is the most common approach to establishing the validity of academic achievement tests. a. Validity evidence based on consequences of testing b. Validity evidence based on test content c. Validity evidence based on relations to other variables d. Validity evidence based on response processes e. Validity evidence based on internal structure Answer: b. Validity evidence based on test content
36. When discriminant measures are used in validation studies, we expect _________. a. positive correlations b. negative correlations c. indeterminate – it depends on the nature of the relationship Answer: b. negative correlations
37. Using the scree plot below, how many factors should be retained? a. 2 b. 4 c. 5 d. 7 Answer: c. 5
38. What is the relationship between the standard error of estimate and the validity coefficient? a. As the validity coefficient increases, so does the standard error of estimate. b. As the validity coefficient increases, the standard error of estimate decreases. c. No relationship exists. d. A relationship exists only when the correlation is orthogonal. Answer: b. As the validity coefficient increases, the standard error of estimate decreases.
39. Test-criterion validity evidence is an example of: a. validity evidence based on consequences of testing. b. validity evidence based on test content. c. validity evidence based on relations to other variables. d. validity evidence based on response processes. e. validity evidence based on internal structure. Answer: c. validity evidence based on relations to other variables.
40. In order to avoid __________ when collecting test-criterion evidence, it is important that predictor and criterion scores be obtained independently. a. construct underrepresentation b. criterion contamination c. construct-irrelevant variance d. criterion misrepresentation Answer: b. criterion contamination
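Items 17-18 and 37 both turn on the eigenvalues of a correlation matrix: the Kaiser-Guttman rule retains factors whose eigenvalues exceed 1.0, and a scree plot is simply those eigenvalues graphed in descending order. A numpy sketch with a made-up four-variable correlation matrix (the matrix and expected result are illustrative assumptions):

    import numpy as np

    def kaiser_guttman(corr_matrix):
        # Eigenvalues of the correlation matrix, sorted high to low;
        # retain as many factors as there are eigenvalues greater than 1.0
        eigenvalues = np.linalg.eigvalsh(corr_matrix)[::-1]
        return eigenvalues, int((eigenvalues > 1.0).sum())

    R = np.array([[1.0, 0.6, 0.5, 0.4],
                  [0.6, 1.0, 0.5, 0.4],
                  [0.5, 0.5, 1.0, 0.4],
                  [0.4, 0.4, 0.4, 1.0]])
    values, n_factors = kaiser_guttman(R)
    print(values.round(2), "-> retain", n_factors, "factor(s)")
    # one eigenvalue above 1.0, so one factor is retained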
Chapter 6

1. An oral examination, scored by examiners who use a manual and rubric, is an example of _________ scoring. a. objective b. subjective c. projective d. validity Answer: b. subjective
2. A fill-in-the-blank question is a ___________ item. a. constructed-response b. selected-response c. typical-response d. objective-response Answer: a. constructed-response
3. Which of the following formats is a selected-response format? a. Multiple-choice b. True-false c. Matching d. All of the above Answer: d. All of the above
4. How many distracters is it recommended that one provide for multiple-choice items? a. 2 b. 2 to 6 c. 3 to 5 d. 4 Answer: c. 3 to 5
5. When writing true-false items, one should include approximately _________ true and ________ false. a. 30%; 70% b. 50%; 50% c. 70%; 30% Answer: b. 50%; 50%
6. When developing matching items, one should keep the lists as ___________ as possible. a. heterogeneous b. homogeneous c. sequential d. simultaneous Answer: b. homogeneous
7. What is a strength of selected-response items? a. Selected-response items are easy and quick to write. b. Selected-response items can be used to assess all constructs. c. Selected-response items can be objectively scored. Answer: c. Selected-response items can be objectively scored.
8. ___________ require examinees to complete a process or produce a project in a real-life simulation. a. Projective tests b. Performance assessments c. Selected-response tests d. Multi-trait/multi-method tasks Answer: b. Performance assessments
9. A strength of constructed-response items is that they: a. eliminate random guessing. b. produce highly reliable scores. c. can be quickly completed by examinees. d. eliminate "feigning." Answer: a. eliminate random guessing.
10. You are creating a test designed to assess a flute player's ability. Which format would assess this domain most effectively? a. Performance assessment b. Matching c. Selected-response d. True-false Answer: a. Performance assessment
11. General guidelines for writing test items include: a. the frequent use of negative statements. b. the use of complex, compound sentences to challenge the examinees. c. the avoidance of inadvertent cues to the answers. d. arranging items in a non-systematic manner. Answer: c. the avoidance of inadvertent cues to the answers.
12. When developing maximum performance tests, it is best to arrange the items: a. from easiest to hardest. b. from hardest to easiest. c. in the order the information was taught. d. randomly. Answer: a. from easiest to hardest.
13. Including more selected-response and other time-efficient items can: a. enhance the sampling of the content domain and increase reliability. b. enhance the sampling of the content domain and decrease reliability. c. introduce construct-irrelevant variance. d. decrease validity. Answer: a. enhance the sampling of the content domain and increase reliability.
14. In order to determine the number of items to include on a test, one should consider the: a. age of examinees. b. purpose of test. c. types of items. d. type of test. e. All of the above Answer: e. All of the above
15. __________ are reported as the most popular selected-response items. a. Essays b. Matching c. Multiple-choice d. True-false Answer: c. Multiple-choice
16. When writing multiple-choice items, one advantage to the ______________ is that it may present the problem in a more concise manner. a. direct-question format b. incomplete sentence format c. indirect question format Answer: a. direct-question format
17. What would be the recommended multiple-choice format for the stem: 'What does 10 x 10 equal?' a. Best answer b. Correct answer c. Closed negative d. Double negative Answer: b. Correct answer
18. Which multiple-choice answer format requires the examinee to make subtle distinctions among distracters? a. Best answer b. Correct answer c. Closed negative d. Double negative Answer: a. Best answer
19. Which of the following is NOT a guideline for developing true-false items? a. Include more than one idea in the statement. b. Avoid using specific determiners such as all, none, or never. c. Ensure that true and false statements are approximately the same length. d. Avoid using moderate determiners such as sometimes and usually. Answer: a. Include more than one idea in the statement.
20. What is a strength of true-false items? a. They can measure very complex objectives. b. Examinees can answer many items in a short period of time. c. They are not vulnerable to guessing. Answer: b. Examinees can answer many items in a short period of time.
21. _________ scoring rubrics identify different aspects or dimensions, each of which is scored separately. a. Analytic b. Holistic c. Sequential d. Simultaneous Answer: a. Analytic
22. With a _______ rubric, a single grade is assigned based on the overall quality of the response. a. analytic b. holistic c. reliable d. structured Answer: b. holistic
23. One way to increase the reliability of short-answer items is to: a. give partial credit. b. provide a word bank. c. use the incomplete sentence format with multiple blanks. d. use a scoring rubric. Answer: d. use a scoring rubric.
24. What item format is commonly used in both maximum performance tests and typical response tests? a. Constructed-response b. Multiple-choice c. Rating scales d. True-false Answer: d. True-false
25. For typical-response tests, which format provides more information per item and thus can increase the range and reliability of scores? a. Constructed-response b. Frequency ratings c. True-false d. Matching Answer: b. Frequency ratings
26. Which format is the most popular when assessing attitudes? a. Constructed-response b. Forced choice c. Frequency scales d. Likert items e. True-false Answer: d. Likert items
27. What is a guideline for developing typical response items? a. Include more than one construct per item to increase variability. b. Include items that are worded in both positive and negative directions. c. Include more than 5 options on rating scales in order to increase reliability. d. Include statements that most people will endorse in a specific manner. Answer: b. Include items that are worded in both positive and negative directions.
28. Examinees tend to overuse the neutral response when Likert items use ________ and may omit items when Likert items use __________. a. an odd number of options; an even number of options b. an even number of options; an odd number of options c. homogenous options; heterogeneous options d. heterogeneous options; homogenous options Answer: a. an odd number of options; an even number of options
29. Which of the following items are difficult to score in a reliable manner and subject to feigning? a. Constructed-response b. True-false c. Selected-response d. Forced choice Answer: a. Constructed-response
30. Guttman scales provide which scale of measurement? a. Nominal b. Ordinal c. Interval d. Ratio Answer: b. Ordinal
31. Which assessment would best use a Thurstone scale? a. Constructed-response test b. Maximum performance test c. Speed test d. Power test e. Typical response test Answer: e. Typical response test
32. According to a study by Powers and Kaufman (2002) regarding the relationship between performance on the GRE and creativity, depth, and quickness, what were the findings? a. There is substantial evidence that creative, deep-thinkers are penalized by multiple-choice items. b. There was no evidence that creative, deep-thinkers are penalized by multiple-choice items. c. There was a significant negative correlation between GRE scores and depth. d. There was a significant negative correlation between GRE scores and creativity. Answer: b. There was no evidence that creative, deep-thinkers are penalized by multiple-choice items.
33. _________ are a form of performance assessment that involves the systematic collection and evaluation of work products. a. Rubrics b. Virtual exams c. Practical assessments d. Portfolio assessments Answer: d. Portfolio assessments
34. Distracters are: a. rubric grading criteria. b. the incorrect responses on a multiple-choice item. c. words inserted in an item intended to "trick" the examinee. d. unintentional clues to the correct answer. Answer: b. the incorrect responses on a multiple-choice item.
Chapter 7

1. Reliability relates to test ___________. a. items b. length c. scores d. constructs Answer: c. scores
2. On a maximum performance test administered to 100 students, 60 students correctly answer item #4. The item difficulty index equals: a. 0.40 b. 0.60 c. 40 d. 60 Answer: b. 0.60
3. Item 10 on an exam had an item difficulty index equal to .00. From this information, one knows that: a. the item was very difficult and all students answered it incorrectly. b. the item was very easy and all students answered it correctly. c. the item was of medium difficulty and half of the students answered it correctly. d. nothing since there is not enough information provided. Answer: a. the item was very difficult and all students answered it incorrectly.
4. Items with a difficulty index of _______ do not contribute to the measurement characteristics of a test. a. 0.25 b. 0.50 c. 0.75 d. 1.0 Answer: d. 1.0
5. Your employer wants a test that will help him to select the upper 30% of employees to consider for new positions. It would be beneficial for the item difficulty index to average around: a. 0.20 b. 0.30 c. 0.70 d. 0.80 e. 0.90 Answer: b. 0.30
6. On ___________ tests, measures of item difficulty largely reflect the position of the item in the test. a. power b. typical response c. multidimensional d. speed Answer: d. speed
7. As a general rule, the authors of the chapter suggest that items with D values greater than ______ are acceptable. a. 0.25 b. 0.30 c. 0.50 d. 0.70 Answer: b. 0.30
8. The item-total correlation is typically calculated using which correlation? a. Coefficient Alpha b. Pearson product moment correlation c. Point-biserial correlation d. Spearman rank order correlation Answer: c. Point-biserial correlation
9. The item-total correlation for an item is 0.50. Which of the following interpretations is correct? a. 2.5% of the total test variance is predictable from performance on the item. b. 5% of the total test variance is predictable from performance on the item. c. 25% of the total test variance is predictable from performance on the item. d. 75% of the total test variance is predictable from performance on the item. Answer: c. 25% of the total test variance is predictable from performance on the item.
10. What is the optimal Item Difficulty Index on a test consisting of only constructed-response items? a. 0.0 b. 0.25 c. 0.50 d. 0.75 e. 1.0 Answer: c. 0.50
11. What is the approximate optimal Item Difficulty Index for a multiple-choice item with 4 choices? a. 0.0 b. 0.25 c. 0.50 d. 0.75 e. 1.0 Answer: d. 0.75
12. A general recommendation is to use items with p values that have a range of approximately _________ around the optimal value. a. 0.05 b. 0.10 c. 0.15 d. 0.20 e. 0.25 Answer: d. 0.20
13. While the item difficulty index is only applicable for _____________, the percent endorsement statistic can be calculated for ____________. a. maximum performance tests; constructed-response tests b. maximum performance tests; typical-response tests c. typical-response tests; constructed-response tests d. typical-response tests; maximum performance tests Answer: b. maximum performance tests; typical-response tests
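The arithmetic behind items 2-5 and 9 is straightforward; a short sketch, using the numbers those items supply:

    def item_difficulty(item_scores):
        # p value: proportion of examinees answering the item correctly (items 2-5)
        return sum(item_scores) / len(item_scores)

    # Item 2 above: 60 of 100 students answer correctly
    print(item_difficulty([1] * 60 + [0] * 40))  # 0.6

    # Item 9 above: squaring an item-total correlation of 0.50 gives the
    # share of total test variance predictable from the item
    print(0.50 ** 2)  # 0.25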
14. On a reading comprehension exam, the proportion of examinees in the top group that answered item 5 correctly equaled 0.70 and the proportion of examinees in the bottom group that answered item 5 correctly equaled 0.10. What is the discrimination index for item 5? a. 0.36 b. 0.49 c. 0.60 d. 0.80 Answer: c. 0.60
15. When a test item has a discrimination index ______, it is considered to be acceptable by the chapter authors. a. greater than 0.30 b. less than 0.30 c. greater than 0.60 d. less than 0.60 Answer: a. greater than 0.30
16. A test item with p = .70 and D = .45 is: a. relatively easy and discriminates well. b. relatively easy and does not discriminate well. c. relatively difficult and discriminates well. d. relatively difficult and does not discriminate well. e. of intermediate difficulty and does not discriminate well. Answer: a. relatively easy and discriminates well.
17. A test item with p = .30 and D = .15 is: a. relatively easy and discriminates well. b. relatively easy and does not discriminate well. c. relatively difficult and discriminates well. d. relatively difficult and does not discriminate well. e. of intermediate difficulty and does not discriminate well. Answer: d. relatively difficult and does not discriminate well.
18. A test item with p = .80 and D = .40 is: a. relatively easy and discriminates well. b. relatively easy and does not discriminate well. c. relatively difficult and discriminates well. d. relatively difficult and does not discriminate well. e. of intermediate difficulty and does not discriminate well. Answer: a. relatively easy and discriminates well.
19. A test item with p = .50 and D = .40 is: a. relatively easy and discriminates well. b. relatively easy and does not discriminate well. c. relatively difficult and discriminates well. d. relatively difficult and does not discriminate well. e. of intermediate difficulty and discriminates well. Answer: e. of intermediate difficulty and discriminates well.
20. In a class of 100 students, 70 answer item #4 correctly and 30 answer it incorrectly. What is the p value of this item? a. 0.30 b. 0.40 c. 0.49 d. 0.70 Answer: d. 0.70
21. An effective distracter should be selected by: a. at least some examinees and demonstrate positive discrimination. b. at least some examinees and demonstrate negative discrimination. c. no examinees and demonstrate zero discrimination. d. all examinees and demonstrate positive discrimination. Answer: b. at least some examinees and demonstrate negative discrimination.
22. Item difficulty and distracter analysis are related to: a. Classical Test Theory. b. Factor Analytic Theory. c. Item Characteristic Curve Theory. d. Item Response Theory. Answer: a. Classical Test Theory.
23. An Item Characteristic Curve is a graph with ____________ reflected on the horizontal axis and ______________ reflected on the vertical axis. a. ability; probability of a correct response b. probability of a correct response; ability c. ability; probability of an incorrect response d. probability of an incorrect response; ability Answer: a. ability; probability of a correct response
24. The one-parameter IRT model assumes that items only differ in: a. the chance of a random correct answer. b. difficulty. c. discrimination. d. scree point. Answer: b. difficulty.
25. The ___________ IRT model takes into consideration the possibility of an examinee with no 'ability' answering the item correctly by chance. a. one-parameter b. Rasch model c. two-parameter d. three-parameter Answer: d. three-parameter
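The discrimination index in items 14-19 is just the difference between the passing proportions of the top and bottom scoring groups. A sketch checking the arithmetic of item 14, with the acceptability rule items 7 and 15 cite:

    def discrimination_index(p_top, p_bottom):
        # D = pT - pB: how much better high scorers do on the item than low scorers
        return p_top - p_bottom

    def acceptable(d_value):
        # Items 7 and 15: D values greater than 0.30 are treated as acceptable
        return d_value > 0.30

    d = discrimination_index(p_top=0.70, p_bottom=0.10)
    print(d, acceptable(d))  # 0.6 True, matching item 14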
26. On an Item Characteristic Curve, the point halfway between the lower and upper asymptotes is referred to as the __________ and represents the difficulty of the item. a. medial apex b. modal beta c. beta index d. inflection point Answer: d. inflection point
27. __________ illustrates the reliability of measurement at different points along the distribution. a. Classical Test Theory b. Item Characteristic Curve c. Reliability Curve d. Test Information Function Answer: d. Test Information Function
28. In a class of 100 students, only 20 answered item 10 correctly. The p value for this item equals: a. 0.20 b. 0.40 c. 0.64 d. 0.80 Answer: a. 0.20
29. For item 21, pT is 0.80 and pB is 0.30. What is the value of D for this item? a. 0.20 b. 0.25 c. 0.50 d. 0.70 Answer: c. 0.50
30. The optimal p value for a selected-response item with two choices is approximately: a. 0.55. b. 0.65. c. 0.75. d. 0.85. Answer: d. 0.85.
31. It is reasonable to present items with p = 1.0 at the beginning of an exam to: a. enhance the overall variability of the exam. b. enhance examinees' confidence. c. enhance the measurement characteristics of the exam. d. ensure examinees do not become overconfident. Answer: b. enhance examinees' confidence.
32. A test item's statistics are displayed in the following table. Based on this data, what is the most obvious problem with this item? a. All distracters are performing well but the item is too easy. b. Distracter B demonstrates positive discrimination and should be revised. c. Distracter D demonstrates negative discrimination and should be revised. d. No problem, retain the item as is. Answer: b. Distracter B demonstrates positive discrimination and should be revised.
33. A test analysis is displayed below. Based on this information, what is the most obvious problem? a. All distracters are performing well but the item is too easy. b. Distracter B demonstrates negative discrimination and should be revised. c. Distracter A demonstrates negative discrimination and should be revised. d. No problem, retain the item as is. Answer: d. No problem, retain the item as is.
34. A test analysis is displayed below. Based on this information, what is the most obvious problem? a. All distracters are performing well. b. Distracter A displays positive discrimination and should be revised. c. Distracter B displays positive discrimination and should be revised. d. No problem, retain the item as is. Answer: b. Distracter A displays positive discrimination and should be revised.
35. Which of the following was described as a qualitative approach to item analysis? a. Set the test aside and review at a later time. b. Have a colleague review the test. c. Allow examinees to provide feedback. d. All of the above. Answer: d. All of the above.
36. Which of the following statements regarding p values is correct? a. They can range from -1.0 to 1.0. b. Items with a high p value are more difficult. c. Items with a low p value are more difficult. d. Items with a p value of 1.0 were answered incorrectly by all students. Answer: c. Items with a low p value are more difficult.
37. Items with p values of _____ and ______ do not contribute to the variability of test scores. a. -1.0; 0.00 b. 0.00; 1.0 c. -1.0; 1.0 Answer: b. 0.00; 1.0
38. Item difficulty indexes on mastery tests tend to be ________ item difficulty indexes on tests designed to produce norm-referenced scores. a. larger than b. lower than c. equal to Answer: a. larger than
39. The optimal p value for true/false items is approximately: a. 0.55. b. 0.65. c. 0.75. d. 0.85. Answer: d. 0.85.
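Items 24-26 describe the logistic IRT family. A sketch of the common three-parameter (3PL) form, where c is the pseudo-guessing floor (item 25), b the difficulty, and a the discrimination; the parameter values below are illustrative assumptions. At theta = b the curve sits halfway between its asymptotes c and 1.0, which is the inflection point of item 26.

    import math

    def three_pl(theta, a, b, c):
        # P(correct) = c + (1 - c) / (1 + exp(-a * (theta - b)))
        return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

    # With c = 0.25, even a very low-ability examinee keeps roughly 1-in-4 odds
    # on a four-option item; at theta = b = 0 the probability is (1 + 0.25) / 2
    for theta in (-3.0, 0.0, 3.0):
        print(theta, round(three_pl(theta, a=1.0, b=0.0, c=0.25), 3))
        # prints 0.286, 0.625, 0.964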
40. How does including many items with p values close to 1.0 reduce reliability? a. The total score is transformed in an oblique manner. b. The total score is an area transformation. c. This results in a restricted range of scores. d. This results in an artificially inflated validity coefficient. Answer: c. This results in a restricted range of scores.

Chapter 8

1. A test designed to evaluate a student's knowledge in a domain in which they have received instruction is a(n): a. assessment. b. achievement test. c. aptitude test. d. standardized test. Answer: b. achievement test.
2. What is typically an advantage of group-administered standardized tests? a. Efficiency b. Very large normative samples c. Typically use objective items d. All of the above Answer: d. All of the above
3. _________ achievement tests typically provide a more thorough assessment of a student's skills than ________ tests. a. Commercial; state-developed b. State-developed; commercial c. Group; individual d. Individual; group Answer: d. Individual; group
4. CTB McGraw-Hill publishes the: a. California Achievement Tests, 5th edition b. Stanford Achievement Test Series, 10th edition c. Iowa Test of Basic Skills d. Metropolitan Tests of Achievement, 8th edition Answer: a. California Achievement Tests, 5th edition
5. Which test published by CTB McGraw-Hill combines selected-response and constructed-response items? a. Iowa Test of Basic Skills b. Stanford Achievement Test Series, 10th edition c. TerraNova CTBS, 4th edition d. TerraNova, 2nd edition (CAT/6) Answer: c. TerraNova CTBS, 4th edition
6. A test published by Pearson Assessment is the: a. Iowa Test of Basic Skills b. Stanford Achievement Test Series, 10th edition c. TerraNova, 2nd edition (CAT/6) d. California Achievement Tests, 5th edition Answer: b. Stanford Achievement Test Series, 10th edition
7. The TerraNova CTBS, 4th edition can be paired with ____________, a measure of academic aptitude. a. California Achievement Tests, 5th edition b. InView c. Iowa Test of Basic Skills d. Test of Cognitive Skills, 2nd edition Answer: d. Test of Cognitive Skills, 2nd edition
8. The Stanford Achievement Test Series, 10th edition can be used with which grade levels? a. Kindergarten – 8th b. Kindergarten – 12th c. 1st – 12th d. 1st – 7th Answer: b. Kindergarten – 12th
9. The Iowa Tests of Educational Development is designed for use with students in which grade levels? a. Kindergarten – 8th b. 1st – 8th c. 8th – 12th d. 9th – 12th Answer: d. 9th – 12th
10. In order to assess a student's specific strengths and weaknesses in mathematics, which assessment would be most appropriate? a. Iowa Mathematics Test, 3rd edition b. Iowa Test of Basic Skills c. Stanford Diagnostic Mathematics Test, 4th edition d. Test of Cognitive Skills, 2nd edition Answer: c. Stanford Diagnostic Mathematics Test, 4th edition
11. The state of California desires to compare the academic achievement levels of its students to those of students from the state of Texas. Which category of test would be appropriate? a. Commercially available achievement battery b. State-developed achievement battery c. School-developed achievement battery Answer: a. Commercially available achievement battery
12. What test would be most appropriate for severely impaired students in special education in Texas? a. Reading Proficiency Test in English b. State-Developed Alternative Assessment c. Texas Assessment of Knowledge and Skills d. Texas Special Education Assessment Answer: b. State-Developed Alternative Assessment
13. According to Education Week (2007), what was the only state that reported exclusively using an off-the-shelf test? a. California b. Colorado c. Iowa d. Ohio Answer: c. Iowa
14. Which of the following test preparation practices is recommended by the authors? a. Extensive use of practice forms of the test b. Instruction in generic test-taking skills c. Preparation emphasizing test content d. Preparation emphasizing test-specific item formats Answer: b. Instruction in generic test-taking skills
15. The __________ is a brief achievement test that can be administered in 30-45 minutes. a. Wechsler Individual Achievement Test, 2nd edition b. Wide Range Achievement Test 4 c. Woodcock-Johnson III Tests of Achievement d. Woodcock-Johnson Tests of Cognitive Abilities Answer: b. Wide Range Achievement Test 4
16. Of the following test preparation practices, which would most likely have a narrowing effect on instruction? a. Instruction in generic test-taking skills b. Preparation emphasizing test-specific item formats c. Preparation emphasizing test content d. Preparation using multiple instructional techniques Answer: c. Preparation emphasizing test content
17. Which of the following test preparation practices is most likely to produce increases in test scores associated with increases in mastery of the underlying domain of skills and knowledge? a. Extensive use of practice forms of the test b. Preparation using multiple instructional techniques c. Preparation emphasizing test-specific item formats d. Preparation emphasizing test content Answer: b. Preparation using multiple instructional techniques
18. Which technique can help make students more familiar and comfortable with the assessment process, thus enhancing the validity of the score interpretation? a. Instruction in generic test-taking skills b. Preparation emphasizing test-specific item formats c. Preparation emphasizing test content d. Preparation using multiple instructional techniques Answer: a. Instruction in generic test-taking skills
19. The Wechsler Individual Achievement Test (WIAT-II) contains which of the following subtests? a. Numerical Operations b. Letter-Word Recognition c. Math Fluency d. Story Recall Answer: a. Numerical Operations
20. Which comprehensive individual achievement test battery is available in two parallel forms? a. Wechsler Individual Achievement Test, 2nd edition b. Woodcock-Johnson III Tests of Achievement c. Wide Range Achievement Test 4 d. Stanford Achievement Test Series, 10th edition Answer: b. Woodcock-Johnson III Tests of Achievement
21. Which individual achievement test is used in the diagnosis of reading problems? a. Gray Diagnostic Reading Test, 4th edition b. Gray Oral Reading Test, 4th edition c. Wechsler Diagnostic Reading Test, 5th edition d. Wide Range Achievement Test 4 Answer: b. Gray Oral Reading Test, 4th edition
22. Which of the following tests is acceptable for screening purposes but not for in-depth diagnostic purposes? a. Stanford Achievement Test Series, 10th edition b. Wechsler Individual Achievement Test, 2nd edition c. Wide Range Achievement Test 4 d. Woodcock-Johnson III Tests of Achievement Answer: c. Wide Range Achievement Test 4
23. According to Stiggins & Conklin, approximately how much professional time do teachers devote to assessment-related activities? a. 1/4 b. 1/3 c. 1/2 d. 2/3 Answer: b. 1/3
24. When developing educational objectives, one should follow which of the following general guidelines? a. Objectives should cover a broad spectrum of knowledge and abilities. b. Objectives should specify a narrow spectrum of knowledge and abilities. c. Identify latent behaviors that lend themselves to indirect measurement. d. Objectives should be vague so as not to restrict the teacher too much. Answer: b. Objectives should specify a narrow spectrum of knowledge and abilities.
25. Dr. Mark addresses Susie and gives her specific feedback regarding the term paper she submitted and then allows her to resubmit her paper for grading. This initial evaluation is best described as: a. critique evaluation. b. feedback evaluation. c. formative evaluation. d. summative evaluation. Answer: c. formative evaluation.
26. The recommendation is that grades be assigned based on: a. academic achievement only. b. academic achievement and attendance. c. academic achievement, attendance, and participation. Answer: a. academic achievement only.
27. Ms. Wilson assigns the top 10% of her students an A, the next 20% a B, and so on. This is an example of which score interpretation? a. Norm-referenced approach b. Criterion-referenced approach c. Absolute referenced approach Answer: a. Norm-referenced approach
28. Who possesses the authority to license professionals? a. Cities b. Counties c. States d. Countries Answer: c. States
29. What percentage of the Examination for Professional Practice in Psychology covers the Biological Basis of Behavior content area? a. 5% b. 11% c. 21% d. 32% Answer: b. 11%
30. Which content area of the EPPP covers learning and motivation? a. Assessment and Motivation b. Biological Basis of Behavior c. Cognitive and Affective Basis of Behavior d. Growth and Lifespan Development Answer: c. Cognitive and Affective Basis of Behavior
31. Performance on the EPPP is reported in which score format? a. Raw scores b. Scaled scores c. T scores d. Z scores Answer: b. Scaled scores
32. How many stages or steps are involved in the United States Medical Licensing Examination? a. two b. three c. four d. five e. six Answer: b. three
33. What is Step 2 of the United States Medical Licensing Examination? a. Clinical knowledge and clinical skills b. Computer based multiple-choice assessment c. Computer based multiple-choice assessment with computer based case simulation d. Patient simulations Answer: a. Clinical knowledge and clinical skills
34. The State of Texas uses which of the following achievement tests? a. Hybrid tests and off-the-shelf tests only b. Off-the-shelf tests only c. State developed tests only d. State developed tests and hybrid tests Answer: c. State developed tests only
35. An important test-taking skill to teach students is: a. "Do not answer questions you are not sure of." b. "Never return to a question and change your answer." c. "Get through the test as quickly as possible." d. "Make informed guesses by process of elimination." Answer: d. "Make informed guesses by process of elimination."
36. Which of these measures is most often used in the hiring and promotion of employees? a. Stanford Aptitude Test b. Wechsler Selection Test c. Universal Attitude and Value Battery d. Wonderlic Personnel Test Answer: d. Wonderlic Personnel Test
37. The National Assessment of Educational Progress provides a comprehensive assessment of students' achievement at certain critical periods in their academic experience. These grade levels are: a. 2nd, 6th, and 10th. b. 4th, 8th, and 12th. c. 5th, 7th, and 11th. d. 6th, 8th, and 12th. Answer: b. 4th, 8th, and 12th.
38. The No Child Left Behind Act of 2001 allows states to administer alternative assessments to _______ of their students. a. 2% b. 3% c. 5% d. 9% Answer: b. 3%
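Item 27's norm-referenced grading can be made concrete with a short sketch. Only the top-10%-A and next-20%-B cutoffs come from the item; the remaining proportions and the scores are assumed for illustration.

    def norm_referenced_grades(scores):
        # Grade by rank: top 10% A, next 20% B (item 27); C/D/F splits are assumed
        bands = [(0.10, "A"), (0.20, "B"), (0.40, "C"), (0.20, "D"), (0.10, "F")]
        ranked = sorted(scores, reverse=True)
        grades, start = [], 0
        for proportion, grade in bands:
            end = start + round(proportion * len(ranked))
            grades += [(score, grade) for score in ranked[start:end]]
            start = end
        return grades

    # 20 students: the top 2 scores earn an A, the next 4 a B, and so on
    print(norm_referenced_grades(range(81, 101))[:6])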
39. What law mandates that any institution that receives federal funds must ensure that individuals with disabilities have equal access to all programs and services provided by the institution? a. No Child Left Behind Act of 2001 b. Section 504 of the Rehabilitation Act of 1973 c. The Individuals with Disabilities Education Improvement Act of 2004 d. The Education of All Handicapped Children Act of 1975 Answer: b. Section 504 of the Rehabilitation Act of 1973
40. In her research on the history of grading in the United States, Brookhart (2004) found that letter grades became common practice during the: a. 1820s. b. 1840s. c. 1910s. d. 1920s. Answer: d. 1920s.

Test Bank for Mastering Modern Psychological Testing: Theory and Methods, by Cecil R. Reynolds and Ronald B. Livingston (ISBN 9780205886081)
