Chapter 5 Statistical Analysis of Data 5.1 Individual Differences 1) Statistics are research tools for: A) generating experimental hypotheses. B) organizing and understanding large sets of data. C) identifying participants. D) generating hypotheses, but only at the experimental level. Answer: B Rationale: Statistics are primarily used for organizing and making sense of large sets of data collected during research. While they can aid in generating hypotheses indirectly by revealing patterns and relationships in data, their primary function is to summarize and analyze data rather than generating hypotheses directly. 2) Which of the following tasks do statistics NOT accomplish? A) summarize results B) evaluate data C) evaluate results D) represent and describe groups Answer: C Rationale: Statistics are used to summarize results, evaluate data for patterns or relationships, and represent and describe groups. However, the task of evaluating results typically involves interpretation and judgment based on statistical analysis, rather than being a direct function of statistics themselves. 3) Decisions concerning which statistical procedure(s) to use are made in the: A) procedures-design phase. B) observation phase. C) data-analysis phase. D) problem-definition phase. Answer: A Rationale: Decisions regarding which statistical procedures to employ are typically made during the procedures-design phase of a study. This phase involves planning the research methodology, including selecting appropriate statistical analyses based on research questions, study design, and the type of data collected. 4) Which type of statistics simplify and organize data? A) inferential statistics B) univariate statistics C) descriptive statistics D) multivariate statistics Answer: C Rationale: Descriptive statistics are used to simplify and organize data by summarizing key characteristics, such as central tendency, variability, and distribution. 
They provide concise summaries that allow researchers to understand and interpret the data more easily. 5) Virtually all organismic variables studied in psychology show: A) improvement after practice. B) decline under stress. C) individual differences. D) group differences. Answer: C Rationale: Organismic variables in psychology often exhibit individual differences, meaning that individuals within a population vary in their responses or characteristics. Understanding these individual differences is crucial for psychological research and practice. 6) Which type of statistics help us draw conclusions about the data? A) inferential statistics B) interpretational statistics C) descriptive statistics D) None of the above Answer: A Rationale: Inferential statistics are used to draw conclusions and make inferences about a population based on sample data. They allow researchers to generalize findings from a sample to a larger population and test hypotheses about relationships or differences between variables. 7) The two major classes of statistical procedures are: A) summary and inferential statistics. B) univariate and inferential statistics. C) univariate and descriptive statistics. D) inferential and descriptive statistics. Answer: D Rationale: The two major classes of statistical procedures are descriptive statistics, which summarize and describe data, and inferential statistics, which allow researchers to make inferences and test hypotheses about populations based on sample data. 8) Most variables manipulated in psychology make: A) large differences in how people behave compared with preexisting differences. B) individual differences larger. C) large differences in how people behave and increase preexisting individual differences. D) small differences in how people perform compared with preexisting individual differences. 
Answer: D Rationale: Variables manipulated in psychology experiments typically result in small differences in how people perform compared to preexisting individual differences. Experimental manipulations are often designed to isolate specific effects and minimize confounding factors. 9) The goal in psychological experimentation is to show that differences on dependent measures are due to: A) preexisting individual differences. B) research manipulations. C) research manipulations and measurement error. D) individual differences that were manipulated by the researcher. Answer: B Rationale: In psychological experimentation, the goal is to demonstrate that differences observed in dependent measures are attributable to the manipulations implemented by the researcher (i.e., independent variables). This helps establish causality and supports the validity of experimental findings. 10) The two major types of statistical procedures are: A) descriptive and prescriptive. B) descriptive and inferential. C) parametric and nonparametric. D) pure and applied. Answer: B Rationale: The two major types of statistical procedures are descriptive statistics, which summarize and describe data, and inferential statistics, which allow researchers to make inferences and test hypotheses about populations based on sample data. 11) Statistics used to summarize, simplify, and represent large numbers of measurements are called A) prescriptive statistics. B) parametric statistics. C) inferential statistics. D) descriptive statistics. Answer: D Rationale: Descriptive statistics involve methods for organizing, summarizing, and presenting data in a meaningful way, such as through measures of central tendency (e.g., mean, median, mode) and measures of variability (e.g., range, standard deviation). They are primarily concerned with describing and summarizing the characteristics of a dataset. 12) Which of the following tasks do descriptive statistics NOT accomplish? 
A) summarize B) simplify C) evaluate D) describe Answer: C Rationale: Descriptive statistics are used to summarize, simplify, and describe data. While they provide insight into the characteristics of a dataset, they do not involve evaluation or inference about relationships or hypotheses, which are tasks typically handled by inferential statistics. 13) Inferential statistics are used to A) summarize and simplify data. B) summarize and describe data. C) interpret and describe data. D) interpret data. Answer: D Rationale: Inferential statistics involve making inferences or generalizations about populations based on sample data. They are used to interpret data and draw conclusions beyond the observed data, such as testing hypotheses or making predictions. 14) Statistics used to help interpret what the data mean are called A) inferential statistics. B) descriptive statistics. C) heuristics. D) nonparametric statistics. Answer: A Rationale: Inferential statistics are used to infer or interpret what the data mean, such as making predictions or testing hypotheses, based on sample data. Descriptive statistics, on the other hand, are used to summarize and describe the data itself. 15) The purpose of descriptive statistics is to A) determine if the sample data accurately describe the population. B) simplify and organize large sets of data. C) help us decide whether population means are equal. D) All of the above Answer: B Rationale: Descriptive statistics are primarily used to simplify and organize large sets of data by summarizing their main characteristics, such as through measures of central tendency and variability. While they can provide insights into sample data, they do not directly address population parameters or hypotheses. 16) What type of data is generated by measures of annual income in dollars? 
A) nominal B) ordered C) interval D) ratio Answer: D Rationale: Annual income in dollars represents ratio data, as it possesses all the properties of interval data with the additional feature of having a true zero point, meaning that zero represents the absence of income. Ratio data allows for meaningful comparisons, addition, subtraction, multiplication, and division. 5.2 Organizing Data 1) Which of the following is NOT an important group of descriptive statistics? A) frequency distributions B) graphical representation of data C) summary statistics D) tests for mean differences Answer: D Rationale: Descriptive statistics include frequency distributions, graphical representation of data, and summary statistics, which are all important for organizing and summarizing data. However, tests for mean differences belong to inferential statistics, as they involve making inferences about population parameters based on sample data. 2) Statistical simplification for nominal or ordinal data is often done by using A) a t-test. B) frequency distributions. C) means. D) standard deviations. Answer: B Rationale: Nominal or ordinal data, which are categorical in nature, are often simplified using frequency distributions. Frequency distributions provide a summary of the number of occurrences of each category or rank within the dataset, making it easier to understand and analyze categorical data. 3) Which of the following is often computed with ordinal and nominal data? A) frequencies B) means C) t-tests D) random number tables Answer: A Rationale: Ordinal and nominal data are categorical in nature and are often summarized using frequencies, which count the occurrences of each category or rank within the dataset. Means, t-tests, and random number tables are typically associated with interval or ratio data and are less applicable to categorical data. 4) All of the following belong to the realm of descriptive statistics EXCEPT A) frequency counts and distributions. 
B) the null hypothesis. C) summary statistics. D) graphical representations of data. Answer: B Rationale: The null hypothesis belongs to the realm of inferential statistics rather than descriptive statistics. Descriptive statistics involve methods for summarizing and describing data, such as frequency counts and distributions, summary statistics, and graphical representations. The null hypothesis is a statement that is tested using inferential statistical methods to make inferences about population parameters based on sample data. 5) The variables of age, income, and number of visits to a hospital emergency room are all measured on ________ scales. A) ratio B) internal C) non-psychometric D) nominal Answer: A Rationale: Age, income, and number of visits to a hospital emergency room are all measured on ratio scales because each has equal intervals between values and a true zero point. For example, the difference between 20 years and 30 years is the same as the difference between 50 years and 60 years, zero income represents a true absence of income, and zero visits represents a true absence of visits. Because of the true zero, ratios are also meaningful: four emergency room visits is twice as many as two. 6) Categorizing participants on the basis of more than one variable at a time is called A) multivariate tabulation. B) algorithmic tabulation. C) cross-tabulation. D) multi-matrix tabulation. Answer: C Rationale: Cross-tabulation involves categorizing participants based on more than one variable simultaneously. It allows for the examination of relationships between variables by organizing data into a contingency table, where the intersection of rows and columns represents the joint distribution of the variables being studied. 
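The cross-tabulation described in the preceding rationale can be illustrated with a short Python sketch. The participant records, the variable names, and the helper functions `cross_tabulate` and `marginal_totals` are all hypothetical and invented for illustration; the sketch simply counts how many participants fall into each combination of two nominal categories.

```python
from collections import Counter

# Hypothetical participants: (sex, diagnosis) pairs invented for illustration.
participants = [
    ("female", "depression"),
    ("male", "anxiety"),
    ("female", "anxiety"),
    ("female", "depression"),
    ("male", "depression"),
    ("male", "anxiety"),
]


def cross_tabulate(pairs):
    """Count how many participants fall into each (row, column) cell."""
    return Counter(pairs)


def marginal_totals(cells, axis):
    """Sum cell counts over one variable (axis 0 = first variable, 1 = second)."""
    totals = Counter()
    for key, count in cells.items():
        totals[key[axis]] += count
    return totals


cells = cross_tabulate(participants)
row_totals = marginal_totals(cells, axis=0)  # totals for each sex category
col_totals = marginal_totals(cells, axis=1)  # totals for each diagnosis

for (sex, diagnosis), count in sorted(cells.items()):
    print(f"{sex:>6} x {diagnosis:<10} {count}")
print("row totals:", dict(row_totals))
print("column totals:", dict(col_totals))
```

Each cell count is one entry of the contingency table, and the row and column totals are the ordinary one-variable frequency counts for sex and diagnosis taken separately.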
7) Cross-tabulation is the method used to help elucidate the relationships between A) ordinal measures. B) ratio measures. C) interval measures. D) nominal measures. Answer: D Rationale: Cross-tabulation is particularly useful for elucidating relationships between variables measured on nominal scales. Nominal measures categorize data into distinct categories or groups without any inherent order or ranking. By cross-tabulating nominal variables, researchers can examine how the frequency of occurrences in one category of a variable varies with another category of another variable. 8) If a researcher wishes to categorize participants on the variables of sex and psychiatric diagnosis, he or she would arrange the data in a matrix called A) a graph. B) a cross-tabulation. C) a frequency distribution. D) a cross-linear matrix. Answer: B Rationale: When categorizing participants on the basis of two variables such as sex and psychiatric diagnosis, a cross-tabulation is used. A cross-tabulation organizes data into a matrix format, where the rows represent one variable (e.g., sex) and the columns represent another variable (e.g., psychiatric diagnosis), allowing for a clear examination of the relationship between the variables. 9) Cross-tabulation is used to A) categorize participants on the basis of two or more variables at one time. B) categorize participants on the basis of only two variables at one time. C) tabulate across different participants on one variable. D) tabulate across three variables for different participants. Answer: A Rationale: Cross-tabulation is used to categorize participants on the basis of two or more variables at one time. It allows for the examination of relationships between variables by organizing data into a contingency table, facilitating the analysis of patterns and associations between multiple factors simultaneously. 
10) Suppose a researcher wanted to classify college participants according to both where they live (dorm, apartment, at home) and type of high school they attend (public, Catholic, other private). The best way to do this would be using a(n) A) univariate count. B) univariate distribution. C) cross-tabulation. D) grouped frequency distribution. Answer: C Rationale: The best way to classify participants according to multiple variables such as where they live and the type of high school they attended is through cross-tabulation. Cross-tabulation allows for the simultaneous examination of relationships between two or more categorical variables, making it suitable for analyzing the interplay between where participants live and their high school type. 11) The row totals in a cross-tabulation represent A) univariate frequency distributions. B) multivariate frequency distributions. C) means of the variable. D) the modes of the variable. Answer: A Rationale: In a cross-tabulation, the row totals represent univariate frequency distributions for one of the variables being studied. Each row corresponds to a category of one variable, and the totals in each row indicate the frequency of occurrences for that category across the other variable(s) being analyzed. 12) Cross-tabulation can be used with A) ordinal measures. B) ratio measures. C) nominal measures. D) all of the above Answer: D Rationale: Cross-tabulation can be used with variables measured on ordinal, ratio, and nominal scales. It is particularly effective for nominal measures, but it can also be applied to ordinal and ratio variables to examine relationships between categories or groups. 13) In a cross-tabulation of religious affiliation and race, the frequency totals for the variable race represent A) a multivariate frequency distribution. B) a univariate frequency distribution. C) the mean of race variable. D) the mode of the race variable. 
Answer: B Rationale: In a cross-tabulation, the frequency totals for each variable represent univariate frequency distributions for that variable. Therefore, in a cross-tabulation of religious affiliation and race, the frequency totals for the variable race indicate the distribution of race categories across the entire sample, providing insight into the racial composition of the population being studied. 14) The simplest way to organize score data is with a A) cross-tabulation. B) t-test. C) multivariate count. D) frequency distribution. Answer: D Rationale: The simplest way to organize score data is with a frequency distribution. A frequency distribution lists the number of occurrences of each score or category within a dataset, providing a clear summary of the distribution of scores or categories. It is a fundamental tool in descriptive statistics for understanding the distributional characteristics of a variable. 15) A grouped frequency distribution is required with A) all variables. B) a small number of possible scores. C) continuous variables. D) multivariate studies. Answer: C Rationale: Grouped frequency distributions are particularly useful for continuous variables, where there's a large range of values. By grouping the data into intervals, it becomes more manageable and easier to interpret. 16) To summarize a large number of different scores it is best to use a A) cross-tabulation. B) multivariate frequency distribution. C) continuous frequency distribution. D) grouped frequency distribution. Answer: D Rationale: When dealing with a large number of different scores, a grouped frequency distribution is preferred as it condenses the data into intervals, providing a concise summary of the distribution. 17) In a study on bruxism (grinding of the teeth), participants recorded for six months the number of times per day they experienced facial pain. 
The best way for a researcher to organize the large number of scores on this variable would be to employ A) an assistant. B) a frequency table. C) a grouped frequency distribution. D) a Wilcoxon matched-pairs signed-rank test. Answer: C Rationale: Given the wide range of scores, a grouped frequency distribution would be most suitable for organizing the data effectively, making it easier to analyze and interpret. 18) The sale price for a house is an example of a(n) A) intermittent variable. B) continuous variable. C) multi-intermittent variable. D) contingent variable. Answer: B Rationale: Sale price for a house is a continuous variable because it can take on any value within a certain range (e.g., $100,000, $150,000, $200,000, etc.), without any gaps or interruptions. 19) Which of the following descriptive statistical procedures CANNOT be used with a continuous variable? A) a frequency distribution B) a grouped frequency distribution C) a median D) a mean Answer: A Rationale: A simple (ungrouped) frequency distribution lists every possible score with its frequency. A continuous variable can take so many different values that most scores would occur only once, so such a listing does not summarize the data. The scores must first be grouped into intervals, which produces a grouped frequency distribution. The median and the mean can both be computed for a continuous variable. 20) In order to organize a large number of scores based on a continuous variable, it is necessary to use A) a grouped frequency distribution. B) a frequency distribution. C) a graph. D) a cross-tabulation. Answer: A Rationale: Grouped frequency distributions are used to organize a large number of scores based on continuous variables into intervals, making the data more manageable and easier to understand. 21) In a study on bruxism (grinding of the teeth), participants recorded for six months the number of times per day they experienced facial pain. The data described above can be represented pictorially by a graph called A) a frequency parabola. B) a biaxial distribution. C) a frequency tetrahedron. D) a frequency polygon. 
Answer: D Rationale: A frequency polygon is a graph used to represent the distribution of a continuous variable, where the points are plotted at the midpoint of each interval and connected with straight lines. 22) The ordinate is another name for the A) abscissa. B) x-axis. C) z-axis. D) y-axis. Answer: D Rationale: The ordinate is the vertical axis of a graph, which represents the dependent variable. It is also commonly referred to as the y-axis. 23) Frequency or grouped frequency distributions can be represented graphically by A) frequency polygons. B) histograms. C) abscissas and ordinates. D) both A and B. Answer: D Rationale: Both frequency polygons and histograms are graphical representations of frequency or grouped frequency distributions, providing visual summaries of the data. 24) Another name for the horizontal axis of a graph is the A) y-axis. B) x-axis. C) ordinate. D) halidome. Answer: B Rationale: The horizontal axis of a graph is called the x-axis. It represents the independent variable in most cases and is perpendicular to the y-axis. 25) The vertical axis of a graph is also known as the A) x-axis. B) halidome. C) abscissa. D) y-axis. Answer: D Rationale: The vertical axis of a graph, often denoted as the y-axis, represents the dependent variable in the relationship being graphed. In mathematical terms, it typically represents the output or response variable. The y-axis is perpendicular to the horizontal axis (x-axis), forming a coordinate plane where data points are plotted. 26) In histograms or frequency polygons, the horizontal axis A) is also known as the ordinate. B) is also known as the abscess. C) represents the range of scores for the variable. D) represents the frequency of scores. Answer: C Rationale: The horizontal axis in histograms or frequency polygons typically represents the range of scores or values for the variable being measured. It is commonly referred to as the x-axis. 
The x-axis provides a scale along which the data points are plotted. 27) In histograms or frequency polygons, the vertical axis A) represents the range of scores for the variable. B) represents the frequency of the scores. C) is also known as the x-axis. D) is also known as the abscissa. Answer: B Rationale: In histograms or frequency polygons, the vertical axis (often referred to as the y-axis) represents the frequency of the scores or values. It displays how often each value or score occurs within the dataset. 28) The horizontal axis of a frequency polygon is also referred to as A) the abscissa. B) the ordinate. C) the intersect. D) the slope. Answer: A Rationale: The horizontal axis of a frequency polygon is known as the abscissa. It provides the scale for the independent variable, typically representing the values or categories being measured. 29) The vertical axis of a frequency polygon is also referred to as A) the slope. B) the ordinate. C) the intersect. D) the abscissa. Answer: B Rationale: The vertical axis of a frequency polygon is referred to as the ordinate. It represents the frequency or density of the data points plotted along the horizontal axis (abscissa). 30) The horizontal axis of a frequency polygon is also referred to as A) the y-axis. B) the x-axis. C) the ordinate. D) the kurtotic axis. Answer: B Rationale: The horizontal axis of a frequency polygon is commonly known as the x-axis. It represents the independent variable or categories being measured. 31) The vertical axis of a frequency polygon is also referred to as A) the x-axis. B) the abscissa. C) the y-axis. D) the slope. Answer: C Rationale: The vertical axis of a frequency polygon is typically referred to as the y-axis. It represents the dependent variable, such as frequency or density. 32) The two dimensions of a frequency polygon are represented by A) height and length. B) length and width. C) the amygdala and the ordinate. D) the ordinate and the abscissa. 
Answer: D Rationale: The two dimensions of a frequency polygon are represented by the ordinate (vertical axis) and the abscissa (horizontal axis), which provide the framework for plotting the data points. 33) In a histogram, the frequency of each score is represented by the height of a A) point above that score on the abscissa. B) point above that score on the ordinate. C) bar above that score on the ordinate. D) bar above that score on the abscissa. Answer: D Rationale: In a histogram, the frequency of each score is represented by the height of a bar above that score on the horizontal axis (abscissa). Each bar's height indicates how frequently a particular value occurs within the dataset. 34) In a frequency polygon, the frequency of a score is represented by the height of a A) bar above that score on the ordinate. B) point above that score on the abscissa. C) point above that score on the ordinate. D) bar above that score on the abscissa. Answer: B Rationale: In a frequency polygon, the frequency of a score is represented by the height of a point above that score on the horizontal axis (abscissa). Each point's height indicates the frequency or density of data points at that particular value or category. 35) One of the advantages of a frequency polygon is that A) with a small number of participants, it looks like a smooth curve. B) it primarily produces bell-shaped curves. C) two frequency distributions can be compared easily. D) it produces a more attractive representation of data. Answer: C Rationale: Frequency polygons allow for easy comparison of two frequency distributions because they can be plotted on the same graph, allowing visual comparison of the shapes and patterns of the distributions. 36) An advantage of a frequency polygon is that A) two frequency distributions can be compared on the same graph. B) only one frequency distribution can be plotted on one graph. C) no further statistical analysis is necessary. 
D) the data need no explanation or interpretation. Answer: A Rationale: Frequency polygons offer the advantage of being able to compare two frequency distributions on the same graph, aiding in visual analysis and comparison of data sets. 37) Frequency polygons with jagged edges usually represent A) incomplete data. B) small data sets. C) large data sets. D) bad artistry. Answer: B Rationale: Frequency polygons with jagged edges typically represent small data sets because with fewer data points, the plotted lines tend to have more abrupt changes in direction, resulting in a jagged appearance. 38) Distributions with bell-shaped, symmetric curves are referred to as A) skewed distributions. B) curvilinear distributions. C) normal distributions. D) Poisson distributions. Answer: C Rationale: Distributions with bell-shaped, symmetric curves are referred to as normal distributions, indicating that the data is evenly distributed around the mean in a symmetrical manner. 39) As group size increases A) frequency polygons appear more jagged. B) histograms will look like normal curves. C) frequency polygons appear more smooth. D) frequency polygons will look like normal curves. Answer: C Rationale: As group size increases, frequency polygons tend to appear more smooth because there are more data points, resulting in a smoother connection between plotted points. 40) Distributions with bell-shaped curves are also referred to as A) rectilinear distributions. B) skewed distributions. C) asymmetric distributions. D) normal or near normal distributions. Answer: D Rationale: Distributions with bell-shaped curves are commonly referred to as normal or near-normal distributions, indicating that they exhibit a symmetrical pattern around the mean. 41) In a distribution with a bell-shaped curve, A) most scores fall at the ends of the distribution. B) most scores fall at the top of the distribution. C) most scores fall in the middle of the distribution. 
D) all scores fall in the middle of the distribution. Answer: C Rationale: In a distribution with a bell-shaped curve, most scores fall in the middle of the distribution, near the mean, reflecting the symmetrical nature of the distribution. 42) In a symmetric bell-shaped distribution, A) most scores fall in the middle, with fewer scores in the tails. B) most scores fall in the top end, with few scores in the middle. C) most scores fall in the bottom end, with few scores in the middle. D) None of the above Answer: A Rationale: In a symmetric bell-shaped distribution, most scores are concentrated in the middle, near the mean, with fewer scores in the tails or extremes of the distribution, resulting in a symmetrical pattern. 43) A normal curve is A) positively skewed. B) negatively skewed. C) symmetric. D) bimodal. Answer: C Rationale: A normal curve is symmetric, meaning that it is evenly distributed around the mean without skewness in either direction. 44) As the group size increases, the frequency polygon A) tends to look negatively skewed. B) tends to look positively skewed. C) tends to look more like a smooth curve. D) tends to look more jagged in appearance. Answer: C Rationale: As group size increases, the frequency polygon tends to look more like a smooth curve because there are more data points, allowing for smoother connections between plotted points and reducing jaggedness. 45) In a skewed distribution, the scores A) primarily cluster in the center of the distribution. B) primarily cluster at both ends of the distribution. C) fall equally throughout the distribution. D) primarily cluster on one end of the distribution. Answer: D Rationale: In a skewed distribution, most of the scores tend to cluster towards one end of the distribution, either the left or the right side, leading to a tail on one side. This clustering creates an imbalance in the distribution, making option D, "primarily cluster on one end of the distribution," the correct choice. 
46) In a skewed distribution, the direction of the skew is indicated by A) the height of the curve. B) the section where the most scores lie. C) the tail of the curve. D) the point where the most scores lie. Answer: C Rationale: The direction of skewness in a distribution is indicated by the tail of the curve. If the tail extends towards the right side, it's a positive skew; if it extends towards the left side, it's a negative skew. Hence, option C, "the tail of the curve," correctly identifies the direction of skew. 47) A distribution in which most of the scores cluster near the bottom is a A) symmetric distribution. B) rectilinear distribution. C) negatively skewed distribution. D) positively skewed distribution. Answer: D Rationale: When most of the scores in a distribution cluster near the bottom, it means there is an extended tail towards the higher end of the scale, indicating a concentration of scores on the lower end. This scenario characterizes a positively skewed distribution, making option D the correct choice. 48) A distribution in which most of the scores cluster at the top or high end of the scale is a A) symmetric distribution. B) rectilinear distribution. C) positively skewed distribution. D) negatively skewed distribution. Answer: D Rationale: When most of the scores in a distribution cluster at the top or high end of the scale, it indicates an extended tail towards the lower end of the scale. This pattern defines a negatively skewed distribution, making option D the correct choice. 49) A difficult test in which most of the class did badly would form a A) symmetric distribution. B) negatively skewed distribution. C) positively skewed distribution. D) rectilinear distribution. Answer: C Rationale: If most of the class performs poorly on a difficult test, the scores will be concentrated towards the lower end of the scale, resulting in a positively skewed distribution where the tail extends towards the higher end of the scale. 
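The rule in the preceding rationales, that the direction of skew follows the tail rather than the cluster, can be checked numerically. The sketch below is one possible illustration, not a procedure taken from the source: it uses the moment coefficient of skewness (the mean of the cubed z-scores), which is positive when the tail extends toward high scores and negative when it extends toward low scores. The score lists, the function names, and the `tolerance` cutoff of 0.1 are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def skewness(scores):
    """Moment coefficient of skewness: the mean of the cubed z-scores."""
    m = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        return 0.0  # all scores identical, so there is no tail in either direction
    return sum(((x - m) / sd) ** 3 for x in scores) / len(scores)


def skew_direction(scores, tolerance=0.1):
    """Label the direction of skew; the tolerance is an arbitrary assumption."""
    g = skewness(scores)
    if g > tolerance:
        return "positively skewed"
    if g < -tolerance:
        return "negatively skewed"
    return "approximately symmetric"


# Hypothetical exam scores: a hard test (cluster low, tail toward high scores),
# its mirror image (cluster high, tail toward low scores), and a symmetric set.
hard_test = [20, 25, 25, 30, 30, 30, 35, 40, 55, 90]
easy_test = [100 - x for x in hard_test]
symmetric = [40, 50, 50, 60, 60, 60, 70, 70, 80]

for name, data in [("hard", hard_test), ("easy", easy_test), ("symmetric", symmetric)]:
    print(name, skew_direction(data))
```

The hard test, where most of the class did badly, is labeled positively skewed because its few high scores form the tail, which matches the answer to question 49.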
50) In a classroom for gifted students, we would expect the distribution of IQ scores to be A) rectilinear. B) identical. C) skewed. D) symmetric. Answer: C Rationale: A classroom for gifted students contains only students from the high end of the IQ scale, so the scores cannot spread out equally in both directions around the class average. The resulting distribution is skewed rather than symmetric. 51) In a large classroom, a distribution of the weights of the students would be expected to be A) positively skewed. B) negatively skewed. C) symmetric. D) rectilinear. Answer: C Rationale: In a large classroom, the distribution of students' weights is expected to be roughly symmetric: most students fall near the middle of the range, with progressively fewer students at the light and heavy extremes. 52) If a frequency polygon showed a distinctive bunching of scores near the bottom of the scale with few scores in the middle and upper parts of a scale, we would say that the data were A) normally distributed. B) positively skewed. C) negatively skewed. D) symmetric. Answer: B Rationale: A distinctive bunching of scores near the bottom of the scale with fewer scores in the middle and upper parts indicates a concentration of scores on the lower end, characteristic of a positively skewed distribution. 53) The horizontal spread of a distribution is known as its A) central tendency. B) variability. C) frequency distribution. D) symmetry. Answer: B Rationale: The horizontal spread of a distribution refers to the dispersion or variability of the data points along the x-axis, indicating how spread out the values are from each other. 54) A distribution's average location on the x-axis is known as its A) variability. B) symmetry. C) central tendency. D) frequency distribution. Answer: C Rationale: The average location of a distribution on the x-axis represents its central tendency, indicating where the center of the distribution lies in terms of the variable being measured. 
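The grouped frequency distributions covered in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the scores are whole numbers, the interval width of 5 is an arbitrary choice, and the data and the function name `grouped_frequency` are hypothetical. The printed midpoints are the positions where a frequency polygon would place its points, and the rows of asterisks act as a rough sideways histogram.

```python
from collections import Counter


def grouped_frequency(scores, width, start=0):
    """Count whole-number scores per interval [low, low + width - 1]."""
    counts = Counter((score - start) // width for score in scores)
    lowest, highest = min(counts), max(counts)
    table = []
    for k in range(lowest, highest + 1):
        low = start + k * width
        # Intervals with no scores are kept with a frequency of 0.
        table.append((low, low + width - 1, counts.get(k, 0)))
    return table


# Hypothetical daily counts of facial-pain episodes, invented for illustration.
daily_pain_counts = [0, 1, 1, 2, 3, 3, 4, 5, 5, 6, 7, 8, 9, 12, 14]

table = grouped_frequency(daily_pain_counts, width=5)
for low, high, freq in table:
    midpoint = (low + high) / 2
    print(f"{low:>2}-{high:<2} midpoint={midpoint:4.1f} {'*' * freq}")
```

With these scores most of the frequency sits in the lowest interval and only a few scores reach the highest one, the pattern that question 52 describes as positively skewed.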
5.3 Descriptive Statistics 1) Which of the following do descriptive statistics NOT accomplish? A) describe the data with one or two numbers B) evaluate the data to compare groups C) make it easier to compare groups D) provide a basis for later analyses Answer: B Rationale: Descriptive statistics are used to summarize and describe the main features of a dataset, such as central tendency, variability, and distribution. They are not primarily used for evaluating or comparing data between different groups. 2) Which of the following is NOT a measure of central tendency? A) mean B) mode C) range D) median Answer: C Rationale: Range is a measure of variability, not central tendency. It represents the difference between the highest and lowest values in a dataset. 3) Measures of central tendency describe A) the variability of scores. B) the typical or average score. C) the range of scores in the distribution. D) the most important scores in a distribution. Answer: B Rationale: Measures of central tendency describe the typical or average score in a distribution, providing a single value that represents the center of the data. 4) The mode is A) the largest score. B) the average score. C) the middle score. D) the most frequently occurring score. Answer: D Rationale: The mode is the value that appears most frequently in a dataset. It may or may not coincide with the largest or average score. 5) The three measures of central tendency used in describing psychological data are the mean, the median, and A) the mode. B) the variance. C) the centrum. D) the norm. Answer: A Rationale: The mean, median, and mode are the three measures of central tendency commonly used in psychology to describe the central or typical value of a dataset. 6) The most frequently occurring score in the distribution is A) the mean. B) the median. C) the meridian. D) the mode. Answer: D Rationale: The mode represents the most frequently occurring value in a dataset, making it a measure of central tendency. 
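As an illustration of the mode described in questions 4-6, here is a minimal sketch using Python's standard `statistics.multimode`. All three data sets are hypothetical and chosen only to show a single mode, a tie between two modes, and a mode computed on category labels.

```python
from statistics import multimode

# Hypothetical test scores with a single most frequent value.
scores = [10, 20, 30, 40, 50, 50]
# Hypothetical scores where two values tie for the highest frequency.
tied_scores = [3, 5, 5, 7, 7, 9]
# Hypothetical nominal data: category labels rather than numbers.
school_types = ["public", "private", "public", "charter", "public"]

print(multimode(scores))        # [50]
print(multimode(tied_scores))   # [5, 7]
print(multimode(school_types))  # ['public']
```

Because the mode only counts how often each value occurs, it works on labels as readily as on numbers, and it can return more than one value.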
7) A distribution can have more than one A) mode. B) mean. C) median. D) standard deviation. Answer: A Rationale: A distribution can have multiple modes if more than one value occurs with the highest frequency in the dataset. 8) A mode can be appropriately used with which type(s) of measurement? A) nominal B) ordinal C) ratio D) All of the above Answer: D Rationale: The mode can be used with all types of measurement scales, including nominal, ordinal, interval, and ratio scales. 9) Students receive the following scores on an exam: 80, 85, 95, 80, 90. The median would be A) 80 B) 95 C) 85 D) 88 Answer: C Rationale: To find the median, arrange the scores in ascending order: 80, 80, 85, 90, 95. The median is the middle value, which is 85. 10) A researcher is doing a study on type of high school attended and grades in college. To show the central tendency of the type of high school the researcher would use a A) mean. B) mode. C) median. D) standard deviation. Answer: B Rationale: Since the type of high school attended is a categorical variable, the appropriate measure of central tendency would be the mode, which represents the most frequently occurring category in the dataset. 11) Which of the following would researchers generally try to avoid using because it can be unstable? A) mean B) median C) mode D) variance Answer: C Rationale: Researchers generally try to avoid using the mode because it can be unstable, particularly when there are multiple modes or when the data is skewed. 12) The median is the A) most frequently occurring score. B) average score. C) middle score. D) highest score. Answer: C Rationale: The median is the middle score in a distribution when the scores are arranged in ascending order. It is not influenced by extreme values and provides a measure of central tendency that is less affected by outliers compared to the mean. 13) Which of the following measures of central tendency can have more than one value in a single sample? 
A) mean B) median C) mode D) None of the above Answer: C Rationale: The mode can have more than one value in a single sample if multiple values occur with the same highest frequency. 14) Which of the following measures of central tendency could be computed on a variable that is measured on a nominal scale of measurement? A) mean B) median C) mode D) All of the above Answer: C Rationale: The mode can be computed on a variable measured on a nominal scale because it simply represents the most frequently occurring value, regardless of the numerical values assigned to categories. 15) A distribution may have more than one A) mean. B) mode. C) median. D) average. Answer: B Rationale: A distribution may have more than one mode if there are multiple values that occur with the same highest frequency. 16) If a distribution has two modes, the distribution is said to be A) invalid. B) bivalent. C) bimodal. D) unequivocal. Answer: C Rationale: When a distribution has two modes, it is described as bimodal. 17) If a distribution has three modes, it is said to be A) unstable. B) unequal. C) trimodal. D) trifocal. Answer: C Rationale: A distribution with three modes is referred to as trimodal. 18) A disadvantage of relying on the mode is that it is A) unstable. B) difficult to compute. C) not affected by a change in one or two scores. D) not able to be used with all scales of measurement. Answer: A Rationale: The mode can be unstable, especially when there are multiple modes or when the data distribution is irregular. 19) The middle score in a distribution of scores arranged from lowest to highest is called A) the meridian. B) the median. C) the mode. D) the mean. Answer: B Rationale: The median is the middle score in a distribution when the scores are arranged in ascending order. 20) The 50th percentile of a distribution is referred to as the A) quintile. B) mode. C) median. D) mean. 
Answer: C Rationale: The 50th percentile of a distribution is the median, which divides the distribution into two equal parts. 21) In a study examining the effectiveness of a behavioral intervention to reduce cholesterol intake, researchers observe that there are three different scores that occur with the highest (and the same) frequency. The distribution of scores is said to be A) trimodal. B) tripolar. C) centrifugal. D) centripetal. Answer: A Rationale: Trimodal distribution refers to a distribution with three distinct peaks or modes. In this scenario, since there are three different scores occurring with the highest frequency, it suggests that the data has three modes or peaks, making it trimodal. 22) Participants receive the following scores on a test: 20, 50, 40, 30, 50, 10. The mode would be A) 40 B) 35 C) 50 D) 38 Answer: C Rationale: The mode is the score that occurs most frequently in a dataset. In this case, the score of 50 appears twice, which is more than any other score. Thus, the mode of the dataset is 50. 23) In a study on medication compliance in residents of a skilled nursing facility, the researcher wants to do a "median split." Where would the researcher find the median of the distribution of compliance scores? A) at the 30th percentile B) at the 50th percentile C) at the 100th percentile D) at the same point as the mode of the distribution Answer: B Rationale: The median split divides the data into two equal halves. Since the median is the midpoint of a distribution, it is located at the 50th percentile. 24) A researcher conducts a study comparing class grades and age in one classroom. While the majority of the class is 19-22 years old, there are three students in their 40s. In this case, the most reasonable measure of central tendency would be the A) mode. B) median. C) mean. D) standard deviation. Answer: B Rationale: The presence of outliers, such as the three students in their 40s, can heavily skew the mean. 
The median, which is less affected by extreme values, would provide a more accurate representation of the central tendency in this case. 25) When there are two possible values for the median, the median is A) the largest of the two. B) the smaller of the two. C) the average of the two. D) both of the two (termed bimedial). Answer: C Rationale: When there are two possible values for the median, the median is calculated as the average of those two values. This ensures that it lies exactly between them, representing the central position. 26) Half of the scores in a distribution fall below the A) mode. B) stanine. C) mean. D) median. Answer: D Rationale: The median is the value that separates the higher half from the lower half of a dataset. Therefore, half of the scores fall below the median. 27) The descriptive statistics that tend to be unstable A) can be used to describe all types of data. B) are the simplest descriptive measures. C) can be used to describe score data only. D) are the most complicated measures. Answer: B Rationale: The simplest descriptive measures, such as the mode and the range, depend on only a few scores, so a change in one or two scores can shift them substantially. This makes them unstable, especially with small samples. 28) Which of the following is NOT a descriptive statistic? A) correlated t-test B) mean C) standard deviation D) range Answer: A Rationale: A correlated t-test is an inferential statistic used to test whether the means of two related (paired) sets of scores differ significantly; it is not a descriptive statistic. Descriptive statistics summarize and describe features of a dataset, such as its central tendency and variability. 29) A researcher is doing a study on type of high school attended and grades in college. To show the central tendency of college grades the researcher would use a A) mean. B) mode. C) variance. D) standard deviation.
Answer: A Rationale: The mean is typically used to show the central tendency of a dataset when analyzing grades or continuous variables. It provides a measure of the average score in the dataset. 30) The median should NOT be used with which type of data? A) nominal B) ordinal C) score D) None of the above; the median can be used with all types of data Answer: A Rationale: The median should not be used with nominal data because nominal data consists of categories with no inherent order, and it doesn't make sense to find the midpoint between categories. However, the median can be used with ordinal, interval, and ratio data, where there is a meaningful order among the categories. 31) The mean is the ________ score. A) most frequently occurring B) middle C) highest D) average Answer: D Rationale: The mean is the measure of central tendency that represents the average value of a set of scores. It is calculated by summing up all the scores and dividing by the total number of scores. Therefore, option D, "average," is the correct choice. 32) An X with a line over it (read X bar) is the notation for the A) median. B) mean. C) mode. D) sum of all scores. Answer: B Rationale: The notation "X bar" represents the mean of a set of scores. It is calculated by summing up all the scores and dividing by the total number of scores. Therefore, option B, "mean," is the correct choice. 33) A mean can be appropriately used with which type(s) of data? A) nominal B) score C) ordered D) all of the above Answer: B Rationale: The mean is most appropriately used with score data, which represents numerical values on an interval or ratio scale. Nominal data (option A) are category labels, and ordered data (option C) convey rank without equal intervals between values, so an arithmetic average of either is not meaningful. 34) The most commonly used measure of central tendency is the A) median. B) mean. C) mode. D) meridian. Answer: B Rationale: The mean is the most commonly used measure of central tendency, especially with interval or ratio scale data.
It is the arithmetic average of a set of scores and is widely used in statistical analysis. 35) A median would give us a better indication of central tendency in which of the following cases? A) when there are a couple of extremely low scores B) when there are a couple of extremely high scores C) when the researcher is asking a question about people's preferences for either Madonna or Garth Brooks D) either A or B Answer: D Rationale: The median is less influenced by extreme scores than the mean, making it more appropriate when extreme values are present. Therefore, option D, "either A or B," is correct because both scenarios involve extreme scores. 36) Which measure of variability COULD remain unchanged if one score in the sample decreased by 10 points? A) range B) variance C) standard deviation D) All of these measures would always change if one score were changed. Answer: A Rationale: The range is the difference between the highest and lowest scores in a data set. If the score that decreases by 10 points is neither the highest nor the lowest score, and it does not fall below the lowest score, the range remains unchanged. The variance and standard deviation are computed from every score, so they would typically change. 37) What would happen to the mean if all of the scores were converted by subtracting 10 points from each score? A) The mean would be unchanged. B) The mean would increase by 10 points. C) The mean would decrease by 10 points. D) The mean would decrease by an amount equal to 10 points divided by the number of participants. Answer: C Rationale: Subtracting 10 points from each score lowers the sum of the scores by 10 times the number of scores. Since the mean is the sum of all scores divided by the number of scores, the mean decreases by exactly 10 points. 38) The arithmetic average of all the scores in a distribution is called the A) mean. B) median. C) mode. D) modem. Answer: A Rationale: The arithmetic average of a set of scores is called the mean. It is calculated by summing up all the scores and dividing by the total number of scores.
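Two of the claims above, that subtracting a constant shifts the mean by that constant (question 37) and that the median resists extreme scores better than the mean (question 35), can be verified with a short calculation. This is a sketch on hypothetical exam scores, using only Python's standard `statistics` module.

```python
from statistics import mean, median

scores = [70, 75, 80, 85, 90]         # hypothetical exam scores
shifted = [x - 10 for x in scores]    # subtract 10 from every score

print(mean(scores))    # 80
print(mean(shifted))   # 70, exactly 10 points lower

# Add one extremely low score and compare the two measures.
with_outlier = scores + [5]
print(mean(with_outlier))     # 67.5, pulled down by 12.5 points
print(median(with_outlier))   # 77.5, moved only 2.5 points from 80
```

The single extreme score drags the mean far more than the median, which is why the median is preferred when a few deviant scores are present.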
39) The mean is appropriately used only with A) large numbers of participants. B) nominal data. C) ordinal data. D) score data. Answer: D Rationale: The mean is most appropriately used with score data, which involves numerical values. It is not restricted to large numbers of participants and is not suitable for nominal or ordinal data. 40) The measure of central tendency that gives the best indication of a typical score in the presence of a few extreme scores is the A) median. B) mode. C) harmonic mean. D) mean. Answer: A Rationale: The median is less affected by extreme scores compared to the mean. Therefore, it provides a better indication of central tendency when extreme scores are present in the data set. 41) The ages of rock concert-goers probably exhibit ________ than do the ages of attendees of a county fair. A) less stability B) a wider range C) more variability D) less variability Answer: D Rationale: Rock concert-goers likely have a narrower age range compared to attendees of a county fair. Therefore, the variability in ages among rock concert-goers is expected to be lower, leading to less variability. 42) If there are some very deviant scores in the population, the best measure of central tendency is the A) mean. B) median. C) mode. D) variance. Answer: B Rationale: The median is less influenced by extreme values (outliers) compared to the mean. Therefore, if there are deviant scores in the population, the median is a more appropriate measure of central tendency as it provides a more robust estimate. 43) If we added 10 points to each score in a sample, the A) variance would not change. B) standard deviation would not change. C) range would not change. D) All of the above Answer: D Rationale: Adding a constant value to each score in a sample does not change the relative differences between the scores, hence the measures of variability such as variance, standard deviation, and range would not change. 
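The point of question 43, that adding a constant to every score leaves the measures of variability untouched, is easy to confirm. The sketch below assumes an arbitrary made-up sample and a small hypothetical helper `value_range`; `variance` and `stdev` are the sample (N-1) versions from Python's standard library.

```python
from math import isclose
from statistics import variance, stdev

def value_range(scores):
    """Distance from the lowest to the highest score."""
    return max(scores) - min(scores)

sample = [4, 8, 15, 16, 23, 42]       # hypothetical scores
plus_ten = [x + 10 for x in sample]   # add 10 points to each score

# Every score moves by the same amount, so the spread is identical.
print(value_range(sample) == value_range(plus_ten))       # True
print(isclose(variance(sample), variance(plus_ten)))      # True
print(isclose(stdev(sample), stdev(plus_ten)))            # True
```

Only the location of the distribution changes; the distances among the scores, and from each score to the mean, stay the same.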
44) What is the relationship between the different measures of variability? A) The range is a better measure than the variance. B) The variance is a better measure than the range. C) The standard deviation is always the best. D) The range is usually the best. Answer: B Rationale: Variance takes into account the squared deviations from the mean, providing a more comprehensive measure of variability compared to the simple range. Therefore, variance is considered a better measure of variability than the range. 45) Variability of scores is measured by all of the following EXCEPT the A) standard deviation. B) range. C) variation coefficient. D) variance. Answer: C Rationale: The coefficient of variation is a measure of relative variability, calculated by dividing the standard deviation by the mean. It is not listed as one of the typical measures of variability, unlike standard deviation, range, and variance. 46) The simplest measure of variability is the A) standard deviation. B) range. C) variation. D) variance. Answer: B Rationale: The range, which is the difference between the highest and lowest scores, is the simplest measure of variability because it only requires identifying the two extreme values in the data set. 47) The range A) is the distance from highest to lowest score. B) utilizes all the scores in quantifying variability. C) transforms the data into the same units of measure as the variance. D) is the distance from the median to the highest score. Answer: A Rationale: The range is simply the difference between the highest and lowest scores in a data set, making it the distance from the highest to the lowest score. 48) The range is A) very stable. B) stable when used with score data. C) stable when used with ordinal data. D) unstable. Answer: D Rationale: The range is highly influenced by extreme values and is not robust when dealing with outliers or skewed distributions, making it an unstable measure of variability. 
49) The simplest measure of variability is the A) range. B) variance. C) standard deviation. D) median. Answer: A Rationale: The range, being the difference between the highest and lowest values in a dataset, is the simplest measure of variability as it involves minimal computation. 50) The distance from the lowest to the highest score is called the A) apogee. B) variance. C) range. D) standard deviation. Answer: C Rationale: The range is defined as the difference between the highest and lowest scores in a dataset, making it the distance from the lowest to the highest score. 51) A major disadvantage of using the range as a measure of variability is that it is A) difficult to compute, because it is based on many scores. B) difficult to interpret. C) often unable to be computed. D) unstable, because it derives from only two scores. Answer: D Rationale: The range is calculated by subtracting the smallest score from the largest score. It is based solely on these two extreme values and doesn't consider the distribution of the other scores in between. Therefore, it can be highly sensitive to outliers and is considered unstable as it may not accurately reflect the variability of the entire dataset. 52) The average squared distance (deviation) of each score from the mean is called the A) standard deviation. B) sum of squares. C) variance. D) range. Answer: C Rationale: The variance is indeed the average squared deviation of each score from the mean. It provides a measure of how much the scores in a dataset vary from the mean. 53) The average deviation is the A) sum of the differences between the scores and mean (maintaining the sign). B) square root of the variance. C) sum of the absolute differences between the scores and mean divided by the number of scores. D) sum of the absolute differences between the scores and mean divided by the number of scores minus 1. 
Answer: C Rationale: The average deviation is calculated by taking the sum of the absolute differences between each score and the mean, then dividing by the total number of scores. This provides a measure of variability that considers the distance of each score from the mean, regardless of its sign. 54) In computing the variance, if we merely added up all the deviations from the mean without first squaring them, the average deviation from the mean would be A) 1.0, no matter how variable the scores were. B) 1.0, if the scores had a large variance. C) 0.0, if the scores had a small variance. D) 0.0, no matter how variable the scores were. Answer: D Rationale: If we didn't square the deviations from the mean before averaging them, positive and negative deviations would cancel each other out, resulting in an average deviation of 0.0, regardless of the variability of the scores. 55) The distinction between the average deviation and the variance is A) the average deviation is the square root of the variance. B) the average deviation is the average distance each score is from the mean, whereas the variance is the average squared distance each score is from the mean. C) the average deviation is the square of the variance. D) the variance is the square of the standard deviation, which is equal to .707 times the average deviation. Answer: B Rationale: The average deviation is the average absolute distance each score is from the mean, while the variance measures the average squared distance each score is from the mean. Therefore, they represent different aspects of variability within a dataset. 56) The measure of variability that is computed from all of the scores is the A) range. B) stanine. C) variance. D) percentile. Answer: C Rationale: The variance takes into account the variability of all scores in the dataset by computing the average squared distance of each score from the mean. 57) Which of the following is an accurate statement?
A) The variance is a better measure of variability than the range. B) The range is a better measure of variability than the variance. C) The range is a better measure of central tendency than the mode. D) The variance is a better measure of central tendency than the range. Answer: A Rationale: The variance provides a more comprehensive measure of variability as it considers all scores in the dataset, whereas the range is limited to only two extreme values. Therefore, the variance is generally considered a better measure of variability than the range. 58) The sum of squared deviations from the mean is called the A) variance. B) standard deviation. C) sum of squares. D) degrees of freedom. Answer: C Rationale: The sum of squares is the sum of the squared differences between each score and the mean, which is used to compute both the variance and the standard deviation. 59) To obtain the variance, the A) sum of squares is divided by the degrees of freedom. B) sum of squares is divided by the number of scores. C) standard deviation is divided by the degrees of freedom. D) standard deviation is divided by the number of scores. Answer: A Rationale: To obtain the variance, the sum of squares is divided by the degrees of freedom, which is typically the total number of scores minus 1. 60) Degrees of freedom are A) always equal to N-1. B) always equal to N. C) the number of scores not free to vary. D) the number of scores free to vary. Answer: D Rationale: Degrees of freedom represent the number of scores in a sample that are free to vary. When the variance of a single sample is computed, this equals the number of scores minus 1, because the mean is already fixed; other procedures impose different restrictions, so degrees of freedom are not always N-1. 61) SS is an abbreviation used in formulas to refer to the A) sum of standard deviations. B) sum of squares. C) standard score. D) squared score. Answer: B Rationale: SS stands for the sum of squares, which is a statistical term representing the sum of the squared deviations from the mean.
It's commonly used in various statistical computations, particularly in calculating variances and standard deviations. 62) If you were to compute the variance on a set of 12 numbers, how many degrees of freedom would you use? A) 11 B) 12 C) 24 D) 1 Answer: A Rationale: When computing the variance from a sample, the number of degrees of freedom is one less than the number of data points in the sample. Therefore, for a set of 12 numbers, you would use 11 degrees of freedom. 63) The term degrees of freedom refers to A) the range of appropriateness for any given scale or measure. B) the rights of research participants. C) the number of scores that are free to vary. D) what graduate students call their Ph.D. diplomas. Answer: C Rationale: Degrees of freedom refer to the number of values in a set that can be chosen independently. In simpler terms, it represents the number of scores that are free to vary without violating any constraints or conditions imposed on the data. 64) If you could form a set of any 12 numbers, how many degrees of freedom would you have in selecting them? A) 11 B) 12 C) 24 D) 1 Answer: B Rationale: In this scenario, you have complete freedom to choose any 12 numbers, so all 12 numbers are free to vary, meaning there are 12 degrees of freedom. 65) A researcher needs to pick six numbers so that the total adds up to 180. The number of degrees of freedom would be A) 6 B) 7 C) 4 D) 5 Answer: D Rationale: The required total of 180 imposes one restriction on the six numbers: once any five are chosen, the sixth is determined. That leaves five numbers free to vary. 66) A researcher needs to pick five people of differing IQ levels. Their IQs must add up to 550, the first must be 95 and the third must be 140. The number of degrees of freedom would be A) 5 B) 4 C) 3 D) 2 Answer: D Rationale: Five IQs must be chosen, but the first and third are fixed, leaving three unspecified. Because the total must equal 550, only two of those three are free to vary; the last is determined by the others, giving two degrees of freedom.
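The chain from sum of squares to degrees of freedom to variance in questions 58-62, and the restriction in question 65, can be traced step by step. This is a minimal sketch on hypothetical scores; the helper name `sum_of_squares` and the five freely chosen numbers are assumptions made for illustration.

```python
from math import isclose
from statistics import variance

def sum_of_squares(scores):
    """SS: sum of squared deviations from the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

scores = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical scores, mean = 5
m = sum(scores) / len(scores)

# Unsquared deviations cancel out (question 54).
print(sum(x - m for x in scores))     # 0.0

ss = sum_of_squares(scores)           # 32.0
df = len(scores) - 1                  # 7 degrees of freedom
sample_variance = ss / df             # SS divided by df
print(isclose(sample_variance, variance(scores)))  # True

# Question 65: six numbers must total 180. Pick any five freely...
free_choices = [10, 20, 30, 40, 50]
# ...and the sixth is forced, so only five were free to vary.
forced_sixth = 180 - sum(free_choices)
print(forced_sixth)                   # 30
```

The standard library's `variance` divides by N-1, which matches the "sum of squares divided by degrees of freedom" definition used in these questions.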
67) The more restrictions imposed on data, A) the more degrees of freedom there are. B) the more degrees of freedom are lost. C) the more inaccurate the measures become. D) Both A and C Answer: B Rationale: Imposing restrictions on data reduces the degrees of freedom, as fewer values are free to vary independently. Each additional restriction removes one degree of freedom. 68) If we wanted to use a measure of variability that was expressed in the same units as the mean, we would use the A) variance. B) standard deviation. C) average variation. D) median. Answer: B Rationale: The standard deviation is the measure of variability that is expressed in the same units as the mean. It is the square root of the variance, which converts the squared units of the variance back to the original units of the scores. 69) Unlike the mean, the variance is expressed in ________ units of the variable. A) original B) squared C) reduced D) divided Answer: B Rationale: The variance is expressed in squared units of the variable because it involves squaring the differences between each value and the mean before averaging them. 70) Which of the following is correct? A) The standard deviation is the square root of the variance. B) The variance is the square root of the standard deviation. C) The standard deviation is the square root of the average deviation. D) The variance is the square root of the average deviation. Answer: A Rationale: The standard deviation is indeed the square root of the variance. This relationship between variance and standard deviation is fundamental in statistics, with the standard deviation providing a measure of the spread of data around the mean. 71) A correlation coefficient describes A) how several variables combine to form a third variable. B) the relationship between two scores. C) the relationship between two variables. D) how two variables combine together to make a third variable.
Answer: C Rationale: A correlation coefficient specifically quantifies the strength and direction of the relationship between two variables. It does not involve the formation of a third variable or combinations of multiple variables. 72) Which type of descriptive statistic always involves at least two variables? A) measures of variability B) central tendency measures C) correlation coefficients D) Both A and C Answer: C Rationale: Correlation coefficients involve the relationship between two variables. Measures of variability and central tendency can involve single variables. 73) The association between two variables is best indexed by the A) standard deviation. B) variance. C) sum of squares. D) correlation coefficient. Answer: D Rationale: The correlation coefficient best indexes the association between two variables, indicating the strength and direction of their relationship. 74) The descriptive statistic that involves at least two variables is the A) correlation coefficient. B) mean. C) standard deviation. D) variance. Answer: A Rationale: The correlation coefficient involves the relationship between two variables, unlike mean, standard deviation, or variance, which describe characteristics of a single variable. 75) The Pearson product-moment correlation is used with A) score data. B) ordered data. C) nominal data. D) All of the above Answer: A Rationale: The Pearson product-moment correlation is suitable for analyzing relationships between variables measured on an interval or ratio scale, such as score data. 76) The most widely used correlation index is the A) Spearman rank-order correlation. B) Pearson product-moment correlation. C) point biserial correlation. D) correlated t-test. Answer: B Rationale: The Pearson product-moment correlation is widely used to measure the linear relationship between two continuous variables. 77) The product-moment correlation can range from A) -2.00 to +2.00. B) 0 to +1.00. C) -1.00 to +1.00. D) -1.00 to 0. 
Answer: C Rationale: The range of the product-moment correlation coefficient is from -1 to +1, where -1 indicates a perfect negative linear relationship, 0 indicates no linear relationship, and +1 indicates a perfect positive linear relationship. 78) The type of correlation best used with score data is the A) point biserial correlation. B) Spearman rank-order correlation. C) correlated t-test. D) Pearson product-moment correlation. Answer: D Rationale: The Pearson product-moment correlation is most appropriate for analyzing relationships between variables measured on a continuous scale, such as score data. 79) If we wished to use statistics to explore the possibility that early language acquisition is related to high Apgar scores at birth, we would most likely consult which descriptive statistic? A) mean B) variance C) a correlation of the two variables D) a correlated t-test Answer: C Rationale: To explore the relationship between early language acquisition and Apgar scores, a correlation coefficient would be most suitable as it measures the strength and direction of the relationship between two variables. 80) If you wanted to quantify the degree of relationship between academic achievement and creativity, which correlation coefficient would you want to use? A) Pearson product-moment correlation B) Spearman rank-order correlation C) Either would be appropriate. D) It would depend on how the variables were measured. Answer: D Rationale: The choice between Pearson product-moment correlation and Spearman rank-order correlation depends on whether the variables are measured on interval/ratio scales (Pearson) or ordinal scales (Spearman). Therefore, the appropriate correlation coefficient depends on the measurement scale of the variables involved. 81) A researcher wants to measure the relationship between age and score on this test. The most appropriate statistic would be the A) Spearman rank-order correlation. B) sign test. C) Pearson product-moment correlation. 
D) repeated measures ANOVA. Answer: C Rationale: The Pearson product-moment correlation coefficient is used to measure the strength and direction of a linear relationship between two continuous variables. In this case, age and test scores are both continuous variables, making Pearson correlation the most appropriate choice for assessing their relationship. 82) A researcher wants to measure the relationship between age and test taking enjoyment (high, extremely high, insanely high, etc.). The most appropriate statistic would be the A) chi square. B) sign test. C) Pearson product-moment correlation. D) Spearman rank-order correlation. Answer: D Rationale: When one or both variables are measured on an ordinal scale, the Spearman rank-order correlation is typically more appropriate. In this scenario, test taking enjoyment is measured in ordered categories, so the Spearman rank-order correlation is the suitable statistic. 83) If two variables are measured on a nominal scale, what type of correlation coefficient is appropriate? A) a Spearman rank-order correlation B) a Pearson product-moment correlation C) a point-biserial correlation D) None of the above Answer: D Rationale: When both variables are measured on a nominal scale, none of the listed coefficients is appropriate, because each requires at least one variable measured on an ordinal or score scale. 84) A product-moment correlation should be used only when the relationship between two variables is A) nonlinear. B) ongoing. C) correlated. D) linear. Answer: D Rationale: The Pearson product-moment correlation coefficient is specifically designed to measure linear relationships between two variables. If the relationship is nonlinear, other methods should be considered. 85) The Pearson product-moment correlation is an index of what type of relationship between two variables?
A) circular B) parabolic C) curvilinear D) linear Answer: D Rationale: The Pearson product-moment correlation coefficient is used to measure the strength and direction of a linear relationship between two continuous variables. It assesses how well the relationship between the variables can be described by a straight line. 86) A correlation of -1.00 represents A) no relationship between the two variables. B) a perfect negative relationship between the two variables. C) a weak negative relationship between the two variables. D) a very weak positive relationship between the two variables. Answer: B Rationale: A correlation coefficient of -1.00 indicates a perfect negative relationship between the two variables, meaning that as one variable increases, the other variable decreases perfectly in a linear fashion. 87) Which correlation represents the strongest relationship? A) +.37 B) +.68 C) -.02 D) -.73 Answer: D Rationale: The correlation coefficient of -.73 represents the strongest relationship among the options given. The closer the correlation coefficient is to -1 or +1, the stronger the relationship between the variables. In this case, -.73 indicates a strong negative linear relationship between the variables. 88) A perfect positive relationship between two variables is a correlation of ________. A) +1.0 B) -1.0 C) -2.0 D) 0.0 Answer: A Rationale: A correlation coefficient of +1.0 represents a perfect positive relationship between two variables, indicating that as one variable increases, the other variable also increases perfectly in a linear fashion. 89) A perfect negative relationship between two variables is a correlation of ________. A) +1.0 B) -1.0 C) -2.0 D) -0.0 Answer: B Rationale: A correlation coefficient of -1.0 represents a perfect negative relationship between two variables, indicating that as one variable increases, the other variable decreases perfectly in a linear fashion. 
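The values discussed in items 86 through 89 can be checked with a small computation. The following is a minimal sketch, assuming made-up scores (the lists `hours`, `rising`, and `falling` are hypothetical and not from the text) and a hand-written helper `pearson_r` that follows the usual deviation-score formula:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores, chosen so the points fall exactly on a line
hours = [1, 2, 3, 4, 5]
rising = [3, 5, 7, 9, 11]     # y = 2x + 1, perfect positive relationship
falling = [10, 8, 6, 4, 2]    # y = -2x + 12, perfect negative relationship

r_pos = pearson_r(hours, rising)
r_neg = pearson_r(hours, falling)

# Strength is judged by absolute value, so the sign is ignored when ranking
candidates = [0.37, 0.68, -0.02, -0.73]
strongest = max(candidates, key=abs)

print(r_pos, r_neg, strongest)
```

With these toy scores the two correlations come out at the ends of the range, and ranking the options from item 87 by absolute value picks out -.73.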
90) When two variables have no relationship, a correlation coefficient of ________ is to be expected. A) +1.0 B) -1.0 C) 0.0 D) -0.5 Answer: C Rationale: A correlation coefficient of 0.0 indicates no linear relationship between two variables. It means that changes in one variable are not associated with changes in the other variable. 91) Which of the following shows the strongest relationship: a correlation of 0.75 or -0.75? A) -0.75 B) +0.75 C) They are the same strength. D) They cannot be compared because they are in opposite directions. Answer: C Rationale: The strength of a correlation is determined by its absolute value. Both +0.75 and -0.75 have the same absolute value of 0.75, indicating the same strength of relationship, albeit in opposite directions. 92) If a perfect positive relationship between two variables exists, as one variable A) increases, the other variable will increase by a predictable amount. B) decreases, the other variable will decrease by a predictable amount. C) increases, the other variable will decrease by a predictable amount. D) Both A and B Answer: D Rationale: In a perfect positive relationship, as one variable increases, the other variable also increases predictably (Option A), and as one variable decreases, the other variable decreases predictably (Option B). 93) A correlation coefficient is an index of the degree of A) curvilinear relationship. B) linear relationship. C) scatter plot relationship. D) inter-subtest variability. Answer: B Rationale: The correlation coefficient measures the strength and direction of a linear relationship between two variables. 94) The graphic technique used to represent the relationship between two variables is the A) histogram. B) scatter plot. C) correlation polygon. D) frequency polygon. Answer: B Rationale: A scatter plot is a graph that displays the relationship between two variables by plotting points along two axes. 
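As a rough illustration of the scatter plot named in item 94, the sketch below draws a crude text version. The helper `text_scatter` and the scores are hypothetical, and the scores are assumed to be already scaled to whole-number grid positions from 0 to 4:

```python
def text_scatter(xs, ys, width=5, height=5):
    """Return a list of strings forming a crude scatter plot, y increasing upward."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Row 0 is the top of the plot, so flip the y coordinate
        grid[height - 1 - y][x] = "*"
    # Left border for the y-axis, bottom border for the x-axis
    return ["|" + "".join(row) for row in grid] + ["+" + "-" * width]

# Hypothetical scores already scaled to grid cells 0..4
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 2, 3, 4]

plot = text_scatter(xs, ys)
print("\n".join(plot))
```

Each participant contributes one point, placed by the score on one variable along the x-axis and the score on the other along the y-axis. Here the points run from the lower left to the upper right, the pattern of a positive relationship.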
95) A researcher wants to study the relationship between women's ages and number of births per year for women in the 10 to 55 age range. The relationship would be expected to be A) a linear relationship. B) nonexistent. C) nonlinear. D) a negative correlation. Answer: C Rationale: The relationship between women's ages and number of births per year is likely to be nonlinear: births are rare at the youngest ages, rise to a peak in early and middle adulthood, and fall off again toward age 55, so the pattern cannot be described by a single straight line. 96) A researcher found that short people did well in research methods and tall people did badly, with people of average height scoring in the middle. The relationship between height and grade in research methods is a A) positive correlation. B) negative correlation. C) nonlinear relationship. D) curvilinear relationship. Answer: B Rationale: The relationship described indicates that shorter height corresponds to higher grades and taller height corresponds to lower grades, with average height in between, indicating a negative correlation. 97) A correlation of +.89 represents A) a weak positive relationship between the variables. B) a strong positive relationship between the variables. C) a perfect relationship between the variables. D) a weak negative relationship between the variables. Answer: B Rationale: A correlation coefficient close to +1 indicates a strong positive relationship between the variables. 98) A correlation of 0.00 means that A) a mistake has been made in the computation. B) there is a perfect relationship between the two variables. C) there is no relationship between the two variables. D) the data points would fall on a straight line on a scatter plot. Answer: C Rationale: A correlation coefficient of 0 indicates no linear relationship between the variables. 99) The graphic technique used to represent the relationship between two variables is called A) a scatter plot. B) computer simulated correlation. C) marking. D) matrix algebra. Answer: A Rationale: A scatter plot is a graphical representation of the relationship between two variables.
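Items 95 and 98 together imply that a clearly patterned but nonlinear relationship can still produce a product-moment correlation of zero. The sketch below shows this with invented numbers (the `age` and `births` values are toy figures for illustration, not real birth data) and a hypothetical helper `pearson_r`:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation from deviation scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Toy values only: an inverted-U pattern that rises, peaks, and falls
age = [10, 20, 30, 40, 50]
births = [0, 3, 4, 3, 0]

r = pearson_r(age, births)
print(r)
```

The rising half and the falling half cancel in the cross-products, so the linear index is zero even though the two variables are strongly related. This is why a scatter plot is worth inspecting before relying on the coefficient.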
100) In a scatter plot, the dots cluster along a line from the lower left-hand corner to the upper right-hand corner. This shows A) a negative correlation. B) no relationship. C) a positive correlation. D) a nonlinear relationship. Answer: C Rationale: When dots cluster along a line from the lower left-hand corner to the upper right-hand corner, it indicates a positive correlation, where as one variable increases, the other variable tends to increase as well. 101) In a scatter plot, the dots cluster along a line from the top left-hand corner to the lower right-hand corner. This shows A) a negative correlation. B) no relationship. C) a positive correlation. D) a nonlinear relationship. Answer: A Rationale: A negative correlation is indicated by a downward sloping line in a scatter plot, where the dots cluster from the top left to the lower right-hand corner. This suggests that as one variable increases, the other variable decreases. 102) In constructing a scatter plot, standard x- and y-axes are A) not used. B) labeled with the names of the three variables of interest. C) reversed so that the ordinate becomes the abscissa and vice versa. D) labeled with the names of the two variables of interest. Answer: D Rationale: The standard x- and y-axes in a scatter plot are labeled with the names of the two variables of interest, with the independent variable typically on the x-axis and the dependent variable on the y-axis. 103) When using SPSS for Windows to compute a regression line, you should also request a ________ to see how well the data fits a straight line function. A) product-moment correlation B) scatter plot C) MANOVA D) median graph Answer: B Rationale: A scatter plot is essential to visualize the relationship between variables and assess how well the data fits a straight line function, in addition to computing a regression line. 104) Which technique would be most sensitive to (i.e., would help you identify) a nonlinear relationship? 
A) the Pearson product-moment correlation B) the Spearman rank-order correlation C) a scatter plot D) a t-test Answer: C Rationale: A scatter plot is the most sensitive technique to identify a nonlinear relationship because it visually displays the pattern of the data points, making it easier to detect deviations from linearity. 105) If a perfect relationship were represented on a scatter plot, the dots would A) fall in the upper portion of the plot. B) fall in a circular shape. C) form a straight line. D) be clustered near the x-axis. Answer: C Rationale: A perfect relationship would be represented by a straight line on a scatter plot, indicating a strong linear correlation between the variables. 106) The type of correlation best used with ordered data is the A) Spearman rank-order correlation. B) Pearson product-moment correlation. C) correlated t-test. D) point biserial correlation. Answer: A Rationale: The Spearman rank-order correlation is best used with ordered data because it assesses the strength and direction of association between variables when the data are in ranks. 107) The Spearman rank-order correlation is used with A) score data. B) ordered data. C) nominal data. D) All of the above Answer: B Rationale: The Spearman rank-order correlation is specifically used with ordered data, where the values of variables can be ranked but not necessarily measured on a continuous scale. 108) The appropriate correlation to use when you have nominal data is A) Phi. B) the Spearman rank-order correlation. C) the Pearson product-moment correlation. D) Any of the above would be appropriate. Answer: A Rationale: Phi is the appropriate correlation to use with nominal data because it measures the association between two categorical variables with two levels each. 109) Phi is the appropriate correlation to use with A) nominal data. B) score data. C) ordered data. D) any type of data. 
Answer: A Rationale: Phi is specifically designed to measure the association between two nominal variables, making it suitable for nominal data analysis. 110) For the data illustrated in the scatter plot below, what is the product-moment correlation? A) 0.00 B) +1.00 C) -1.00 D) You cannot tell from the scatter plot. Answer: A Rationale: The scatter plot indicates no discernible pattern or direction in the relationship between the variables, suggesting a correlation of 0.00, indicating no linear relationship. 111) If we saw on a scatter plot that the dots were arranged in a straight line that fell from the upper left-hand corner (top of the y-axis) to the bottom right-hand corner (end of the x-axis), we could conclude that the correlation was: A) negative. B) positive. C) zero. D) nonlinear. Answer: A Rationale: In a scatter plot where the dots fall from the upper left-hand corner to the bottom right-hand corner, it indicates a negative correlation. As one variable increases, the other variable decreases, resulting in a downward slope. 112) A Phi coefficient of -1.00 would indicate that: A) there is a perfect negative relationship between the variables. B) there is no relationship between the variables. C) one should have used the Spearman rank-order correlation. D) a computational error was made because negative Phi coefficients are impossible. Answer: D Rationale: The keyed answer treats Phi as a non-negative index, as it is when computed from chi-square (Phi equals the square root of chi-square divided by N), in which case it ranges from 0 (no relationship) to 1 (perfect relationship). On that convention, a value of -1.00 would imply a computational error. Phi computed as a Pearson correlation on two variables coded 0 and 1 can come out negative, but the sign then reflects only the arbitrary coding of the categories. 113) We can predict one variable from values of another variable by using information from: A) a sign test. B) a t-test. C) a correlation coefficient. D) an ANOVA. Answer: C Rationale: A correlation coefficient measures the strength and direction of the linear relationship between two variables.
By examining the correlation coefficient, we can infer how well one variable predicts another. 114) The prediction of the value of one variable from the value of another is called: A) variation. B) deviation. C) regression. D) standardization. Answer: C Rationale: Regression analysis is used to predict the value of one variable based on the value of another variable. It helps in understanding the relationship between variables and making predictions. 115) When the correlation between two variables is zero, the linear regression line would be: A) horizontal. B) vertical. C) diagonal. D) curved. Answer: A Rationale: When the correlation between two variables is zero, it means there is no linear relationship between them. In such cases, the linear regression line would be horizontal. 116) The internal consistency reliability index is also referred to as the: A) consistency distribution. B) standard deviation. C) standard score. D) coefficient alpha. Answer: D Rationale: Coefficient alpha, also known as Cronbach's alpha, is a measure of internal consistency reliability. It assesses how closely related a set of items are as a group. 117) ________ are often used to quantify test-retest and interrater reliability. A) Non-linear coefficients. B) Z-scores. C) Correlation coefficients. D) Multiple regressions. Answer: C Rationale: Correlation coefficients are commonly used to quantify the reliability of measurements, including test-retest reliability and interrater reliability. 118) ________ is an index of how intercorrelated the items in a measure are. A) Standard deviation. B) Coefficient alpha. C) Standard score. D) Consistency distribution. Answer: B Rationale: Coefficient alpha, also known as Cronbach's alpha, measures the intercorrelation among items in a scale or measure. It assesses how well the items in a measure are related to each other. 119) A standard score is useful in determining: A) how a particular person scored relative to the rest of the people. 
B) the standard deviation of a population. C) the average score of participants. D) the number of scores in a population. Answer: A Rationale: A standard score (z-score) indicates how many standard deviations a particular score is above or below the mean. It helps in understanding where an individual's score falls relative to the rest of the population. 120) The size of the ________ score indicates how far from the mean a particular person scored. A) nominal. B) standard. C) median. D) optimal. Answer: B Rationale: A standard score (z-score) indicates the distance of a particular score from the mean of the distribution, measured in standard deviation units. A larger standard score indicates a greater deviation from the mean. 121) A person's ________ indicates what percent of the sample scored below that person. A) standard rank B) level score C) percentile rank D) mean rank Answer: C Rationale: Percentile rank is a measure indicating the percentage of scores that fall below a given score in a distribution. It is commonly used in statistics and testing to compare individual scores relative to others in the same group or population. This measure helps in understanding where a person stands in comparison to others in terms of their performance or characteristics. 5.4 Statistical Inference 1) A person's ________ indicates what percent of the sample scored below that person. A) standard rank B) level score C) percentile rank D) mean rank Answer: C Rationale: Percentile rank indicates the percentage of scores in the sample that fall below a given score. It shows where an individual stands relative to the rest of the sample. 2) A researcher is interested in studying decisions of U.S. college seniors regarding graduate school. With that in mind, how would we classify all of the college seniors in the United States?
A) as a sample of college seniors B) as a sample of all human beings C) as the population of college seniors D) as the entire population of human beings Answer: C Rationale: In statistics, the term "population" refers to the entire group of individuals or items about which the researcher wants to draw conclusions. In this case, all college seniors in the United States constitute the population of interest for the researcher. 3) In statistics, the larger group of all the people of interest is the A) parameter. B) sample. C) population. D) reference group. Answer: C Rationale: The population in statistics refers to the larger group of individuals or items about which the researcher wants to draw conclusions. It encompasses all the people or elements of interest in a particular study. 4) In statistics, a subset of people drawn from the larger group of all the people of interest is a A) parameter. B) population. C) reference group. D) sample. Answer: D Rationale: A sample in statistics is a subset of individuals or items selected from a larger population. It is used to make inferences or generalizations about the population from which it is drawn. 5) If a researcher is interested in all research methods classes but only studies one class, he is studying a A) population. B) reference group. C) sample. D) parameter. Answer: C Rationale: By studying only one class of research methods, the researcher is examining a subset of the larger group of interest, which constitutes a sample. A sample is a representative subset of a population from which generalizations can be made. 6) A researcher wants to study all 2000 college graduates from the University of Hawaii. She gathers data from all the college graduates at the University of Hawaii in the year 2000. She has studied a A) reference group. B) sample. C) population. D) statistic. 
Answer: C Rationale: The researcher has studied the entire group of interest, which is the 2000 college graduates from the University of Hawaii in the year 2000. This constitutes the population of interest for the study. 7) The variation among different samples drawn from the same population is termed A) measurement error. B) population variation. C) sampling error. D) sampling variation. Answer: C Rationale: Sampling error refers to the variation observed among different samples drawn from the same population. It arises due to the randomness involved in selecting samples from a population and can affect the generalizability of study findings. 8) In order to interpret information about a population of interest, researchers must employ A) descriptive statistics. B) inferential statistics. C) summary statistics. D) population statistics. Answer: B Rationale: Inferential statistics are used to make inferences or predictions about a population based on data collected from a sample. It helps researchers draw conclusions and make generalizations about populations based on sample data. 9) The first step in interpreting research data is computing A) descriptive statistics. B) inferential statistics. C) a score transformation of the data. D) a linear transformation of the data. Answer: A Rationale: The first step in interpreting research data is to compute descriptive statistics. Descriptive statistics summarize and describe the basic features of the data, providing insights into its central tendency, variability, and distribution. This step is essential for understanding the characteristics of the data before proceeding to inferential analysis. 10) Which of the following allows researchers to draw conclusions about a population from a sample? A) descriptive statistics B) inferential statistics C) population statistics D) sample parameters Answer: B Rationale: Inferential statistics enable researchers to generalize findings from a sample to a larger population. 
Unlike descriptive statistics, which simply describe the characteristics of the sample, inferential statistics make inferences or predictions about the population based on the sample data. 11) Researchers use samples to A) reflect precisely population characteristics. B) reflect exactly population parameters. C) draw conclusions about the population. D) gather data from the entire group of people being studied. Answer: C Rationale: Researchers use samples to draw conclusions about the population. While samples may not precisely reflect all population characteristics, they provide valuable insights that can be generalized to the larger population. 12) Sampling error represents A) a mistake on the part of the researcher. B) the error among different samples due to errors by the researcher. C) the expected variation among different samples due to chance. D) a lack of validity in the study due to design problems. Answer: C Rationale: Sampling error refers to the expected variation among different samples due to chance factors rather than mistakes by the researcher. It is an inherent part of sampling processes and can affect the generalizability of study findings. 13) The null hypothesis states that A) there is no statistical difference between the population means. B) the population mean is different than the sample mean on a variable of interest. C) there is always a difference between samples because of sampling error. D) any difference between samples will be negligible. Answer: A Rationale: The null hypothesis (H0) assumes no statistical difference between population means or parameters. It is the default assumption in hypothesis testing until evidence suggests otherwise. 14) The statement that there is no statistically significant difference between two population variances exemplifies a(n) A) no-difference hypothesis. B) null hypothesis. C) no-difference theorem. D) alternative hypothesis. 
Answer: B Rationale: In this context, the null hypothesis (H0) asserts that there is no statistically significant difference between the two population variances. It serves as the baseline assumption until evidence suggests otherwise. 15) Null hypotheses are tested by using A) population statistics. B) population parameters. C) inferential statistics. D) descriptive statistics. Answer: C Rationale: Null hypotheses are tested using inferential statistics, which allow researchers to make inferences about populations based on sample data. Descriptive statistics summarize the characteristics of the sample but do not directly test hypotheses. 16) The decision point for rejecting the null hypothesis is A) the beta level. B) the alpha level. C) typically .005 or .0001. D) the null hypothesis level. Answer: B Rationale: The alpha level, often set at 0.05 or 0.01, represents the probability threshold below which the null hypothesis is rejected. It determines the critical region for hypothesis testing. 17) In a study on the effects of a new type of soft lighting on task efficiency using two groups of office workers as the participants, what is the null hypothesis? A) There will be no task efficiency difference between the groups in the two conditions. B) The effects of the lights on one group will be nullified by the effects on the other group. C) The new type of lighting will lead to greater task efficiency. D) The new type of lighting will lead to lower task efficiency. Answer: A Rationale: The null hypothesis in this scenario states that there will be no significant difference in task efficiency between the groups exposed to different lighting conditions. It serves as the baseline assumption to be tested against the alternative hypothesis. 18) A researcher is interested in studying anxiety levels in psychology majors compared to management majors. He thinks that there is a greater incidence of anxiety in psychology majors. 
The null hypothesis would be that the incidence of anxiety is A) inversely proportional to the number of students in each major. B) the same for both majors. C) greater among psychology majors. D) greater among management majors. Answer: B Rationale: In this context, the null hypothesis (H0) posits that there is no significant difference in the incidence of anxiety between psychology majors and management majors. This assumption is subject to testing against the alternative hypothesis. 19) A somewhat arbitrary cutoff point employed to decide whether the null hypothesis is false is called the A) alpha level. B) beta level. C) null point. D) point of no return. Answer: A Rationale: The alpha level represents the cutoff point used to determine whether the null hypothesis should be rejected. It is typically set at 0.05 or 0.01 and serves as the threshold for determining statistical significance. 20) Which of the following is correct? A) We cannot say with certainty that the conclusions drawn from a sample are always valid for the population. B) Inferences drawn from a random sample are always valid for the population. C) Inferences drawn from a population are always valid for the sample. D) Inferences drawn from parameters are always valid for the sample. Answer: A Rationale: Option A is correct because conclusions drawn from a sample may not always accurately reflect the characteristics of the entire population due to sampling variability and other factors. It highlights the uncertainty associated with generalizing from a sample to a population. 21) Type I error occurs when a researcher A) rejects the null hypothesis when it should be retained. B) does not reject the null hypothesis when it should be retained. C) makes a large mistake in data collection. D) is confronted with random error in the form of participant characteristics. 
Answer: A Rationale: Type I error happens when the null hypothesis is incorrectly rejected, indicating a significant effect when there isn't one. This can lead to false positive conclusions. 22) A researcher conducts a study on grades of students who live in apartments compared with those who live in dorms. In actuality, the grades are equivalent in both groups. After statistical analysis the researcher makes the decision to reject the null hypothesis. This decision is A) not possible to evaluate from the information given. B) correct. C) a Type I error. D) a Type II error. Answer: C Rationale: The decision to reject the null hypothesis when it's actually true (i.e., the grades are equivalent) constitutes a Type I error because it leads to a false positive conclusion. 23) A Type II error occurs when a researcher A) devises an incorrect null hypothesis. B) rejects the null hypothesis when it should be retained. C) does not reject the null hypothesis when it should be rejected. D) makes a large mistake in data collection. Answer: C Rationale: Type II error occurs when the null hypothesis is incorrectly retained, failing to detect a true effect that exists in the population. This leads to a false negative conclusion. 24) The alpha level is the expected probability of A) measurement error. B) a Type I error. C) a Type II error. D) sampling error. Answer: B Rationale: The alpha level (α) represents the significance level, which is the probability of committing a Type I error, i.e., rejecting the null hypothesis when it is actually true. 25) What will happen if the researcher increases the level of Type I error without making any other changes? A) The level of Type II error will increase. B) The level of Type II error will decrease. C) Alpha will decrease. D) There will be no change in either alpha or the level of Type II error. 
Answer: B Rationale: Increasing the level of Type I error (α) without any other changes would lead to a higher likelihood of rejecting the null hypothesis incorrectly, which reduces the chances of committing a Type II error. 26) If two populations are different on their population means and the inferential statistical test leads you to conclude that the populations are equivalent, you have A) made a Type I error. B) made a Type II error. C) drawn a correct inference. D) inflated your alpha. Answer: B Rationale: Concluding that two populations are equivalent when they are actually different constitutes a Type II error because it fails to detect a true difference. 27) Failing to reject the null hypothesis when in fact it is false results in a A) Type I error. B) high alpha level. C) low alpha level. D) Type II error. Answer: D Rationale: Failing to reject the null hypothesis when it's actually false leads to a Type II error because it means not detecting a true effect that exists in the population. 28) In a study on the effects of a new type of soft lighting on task efficiency using two groups of office workers as the participants, a researcher mistakenly concludes that the use of the new lighting does NOT improve task efficiency when, in fact, it does. What type of error has the researcher made? A) a Type I error B) a Type II error C) a beta error D) an alpha error Answer: B Rationale: The researcher's conclusion that there is no improvement in task efficiency when there actually is one constitutes a Type II error because it fails to detect a true effect. 29) In a study on the effects of a new type of soft lighting on task efficiency using two groups of office workers as the participants, a researcher mistakenly concludes that using the new lighting does improve task efficiency when, in fact, it does NOT. What type of error has the researcher committed? 
A) a Type I error B) a Type III error C) a Type II error D) a measurement error Answer: A Rationale: The researcher's conclusion that there is an improvement in task efficiency when there isn't one constitutes a Type I error because it falsely identifies a significant effect. 30) Rejecting the null hypothesis when in fact it is true results in a A) high alpha level. B) Type I error. C) Type II error. D) indigestion. Answer: B Rationale: A Type I error occurs when the null hypothesis is incorrectly rejected, leading to the conclusion that there is a significant effect or difference when there isn't one. This is commonly associated with a significance level (alpha) set too high, which increases the probability of falsely rejecting the null hypothesis. 31) Setting alpha to a low level to reduce the chance of rejecting the null hypothesis when it should be accepted A) is rarely done in research. B) increases the chance of Type I error. C) increases the chance of Type II error. D) increases the accuracy of decision making with little adverse effects. Answer: C Rationale: Setting alpha (the significance level) to a low level reduces the probability of Type I errors (rejecting the null hypothesis when it's true), but it increases the probability of Type II errors (failing to reject the null hypothesis when it's false). This trade-off is inherent in hypothesis testing. 32) A researcher conducts a study on grades of students who live in apartments compared with those who live in dorms. In actuality, the grades are equivalent in both groups. After statistical analysis, he makes the decision to not reject the null hypothesis. This decision is A) correct. B) a Type I error. C) a Type II error. D) not possible to evaluate from the information given. 
Answer: A Rationale: Since the researcher correctly decides not to reject the null hypothesis when there is no significant difference between the grades of students living in apartments and those living in dorms, the decision is correct. 33) A researcher does a study on knowledge of animals in cat lovers compared to dog lovers. In actuality, dog lovers have greater knowledge of animals. After statistical analysis, the researcher makes the decision to NOT reject the null hypothesis. This decision is A) correct. B) a Type I error. C) a Type II error. D) not possible to evaluate from the information given. Answer: C Rationale: In this scenario, the researcher fails to reject the null hypothesis (no difference in knowledge between cat lovers and dog lovers) when there actually is a difference (dog lovers have greater knowledge). This is a Type II error, as the researcher misses the true effect. 34) If alpha is set to .05, what will the level of Type II error be? A) .05 B) .95 C) 0 D) cannot say Answer: D Rationale: The level of Type II error depends on factors beyond just the significance level (alpha). It is affected by sample size, effect size, and power of the statistical test. Without additional information, it's not possible to determine the level of Type II error based solely on the significance level. 35) The probability of a Type I error is equal to A) less than 1%. B) the beta level. C) the alpha level. D) the degrees of freedom. Answer: C Rationale: The probability of a Type I error is equal to the significance level (alpha), which is typically set by the researcher. It represents the likelihood of rejecting the null hypothesis when it is actually true. 5.5 Inferential Statistics 1) If you wish to know whether the results of your experiment are likely to have been obtained by chance, you should use A) the mean. B) descriptive statistics. C) inferential statistics. D) random number tables. 
Answer: C Rationale: Inferential statistics are used to draw conclusions and make inferences about a population based on sample data, including assessing whether the observed differences or effects are likely due to chance. 2) What is the appropriate statistical procedure for evaluating the difference among two or more groups? A) ANOVA B) t-test C) standard deviation D) either A or B Answer: A Rationale: Analysis of Variance (ANOVA) is the appropriate statistical procedure for evaluating the differences among two or more groups. It assesses whether the means of different groups are significantly different from each other. 3) What is the appropriate procedure for evaluating the difference between two groups? A) ANOVA B) t-test C) standard deviation D) either A or B Answer: D Rationale: Both ANOVA and the t-test can be used to evaluate the difference between two groups. With exactly two groups they lead to the same conclusion, since the ANOVA F ratio equals the square of the pooled-variance t statistic; ANOVA is required only when more than two groups are compared. 4) The analysis of variance (ANOVA) actually compares A) the variance of all the groups. B) the standard error of all the groups. C) the means of all the groups. D) the standard deviation of all the groups. Answer: C Rationale: ANOVA compares the means of different groups to determine whether there are statistically significant differences among them. It does so by partitioning the total variance observed into variance between groups and variance within groups. 5) The term ________ refers to the sensitivity of a statistical procedure to provide a basis for correctly rejecting the null hypothesis. A) strength B) schism C) capacity D) power Answer: D Rationale: Power in statistics refers to the ability of a statistical test to detect an effect, should one exist. It's crucial in determining whether the null hypothesis can be correctly rejected when it's false. Essentially, power measures the likelihood that a statistical test will correctly reject a false null hypothesis.
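The relationships among alpha, Type I error, Type II error, and power described in the items above can be made concrete with a short calculation. The following is a minimal sketch, not a procedure taken from the text: it assumes a two-sided, two-sample z-test with equal group sizes and a known standard deviation, and the function name `z_test_power` and the example values (an effect size of 0.5 and 30 participants per group) are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Probability of rejecting the null hypothesis in a two-sided,
    two-sample z-test with equal group sizes (illustrative assumption).

    effect_size is the true difference between group means expressed in
    standard deviation units; 0.0 means the null hypothesis is true.
    """
    std_normal = NormalDist()
    # Critical value that leaves alpha / 2 in each tail.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the true effect.
    shift = effect_size * sqrt(n_per_group / 2)
    # Chance that the statistic falls in the upper or lower rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


if __name__ == "__main__":
    for alpha in (0.05, 0.01):
        type_i = z_test_power(0.0, 30, alpha)   # null true: rejection is a Type I error
        power = z_test_power(0.5, 30, alpha)    # null false: rejection is correct
        print(f"alpha={alpha}: P(Type I)={type_i:.3f}, "
              f"power={power:.3f}, P(Type II)={1 - power:.3f}")
```

Under these assumptions, the rejection probability when the effect size is zero equals alpha, and lowering alpha from .05 to .01 lowers power, which is the same as raising the probability of a Type II error. The Type II error rate also changes with the effect size and the group size passed in, so it cannot be read off from alpha alone.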
6) The easiest way to increase power is to A) increase sample size. B) decrease sample size. C) increase the number of statistical tests. D) increase individual differences. Answer: A Rationale: Increasing the sample size directly impacts the power of a statistical test. With a larger sample size, the test becomes more sensitive to detecting true effects, hence increasing its power. 7) Power is the ability to reduce A) statistical significance. B) Type II errors. C) Type I errors. D) practical significance. Answer: B Rationale: Power is the ability of a statistical test to correctly reject the null hypothesis when it is false. In other words, it's the probability of not making a Type II error, which occurs when a true effect exists but is not detected by the test. 8) Which of the following will NOT increase the power of a statistical test? A) better standardization of procedures B) more precise sampling C) decreasing the number of participants D) using more precise measures Answer: C Rationale: Decreasing the number of participants will not increase the power of a statistical test. In fact, reducing the sample size typically decreases power because there's less data available to detect true effects reliably. 9) The size of the sample needed to achieve a specified level of power can be computed on the basis of pilot data with a process called A) distribution analysis. B) sample analysis. C) power analysis. D) depth analysis. Answer: C Rationale: Power analysis involves determining the sample size needed to achieve a certain level of power in detecting a true effect. It's based on factors like effect size, significance level, and desired power level. 10) Which term refers to the sensitivity of a statistical procedure? A) statistical reliability B) validity or statistical validity C) statistical replication D) power or statistical power Answer: D Rationale: Power (or statistical power) refers to the sensitivity of a statistical procedure. 
It indicates the likelihood of correctly rejecting the null hypothesis when it is false, thus detecting a true effect. 11) Effect size refers to A) the strength of the independent variable manipulation. B) the strength of the dependent variable manipulation. C) the difference between group means expressed in standard deviation units. D) None of the above Answer: C Rationale: Effect size quantifies the magnitude of the difference between two groups or conditions, often expressed in standard deviation units. It provides a standardized measure of the strength of the relationship between variables. 12) As effect size increases, A) power will increase. B) power will decrease. C) power is unchanged. D) the sample size increases. Answer: A Rationale: As effect size increases, the difference between groups becomes more apparent. Consequently, the statistical test becomes more sensitive to detecting this difference, leading to an increase in power. 13) Why is power increased when the effect size increases? A) It is harder to detect large differences than small differences. B) It is easier to detect large differences than small differences. C) Power depends on the sample size, which is a function of effect size. D) Power refers to the level of Type I error, which increases with the effect size. Answer: B Rationale: Larger effect sizes represent more substantial differences between groups or conditions, making them easier to detect with statistical tests. Hence, as effect size increases, the power of a statistical test also increases. 5.6 Ethical Principles 1) What is a researcher's ethical obligation with respect to statistics?
A) to always select the appropriate statistical procedures B) to use statistical procedures that accurately reflect the data C) to use the most sophisticated statistical tests available D) All of the above Answer: B Rationale: The ethical obligation of a researcher regarding statistics is to ensure that the statistical procedures used accurately represent the data collected. This helps maintain the integrity of the research findings and ensures transparency in reporting results. Selecting appropriate statistical procedures that accurately reflect the data is essential for drawing valid conclusions from the research. 2) Cherry picking the data refers to A) selecting and reporting only those data that support your hypothesis. B) using only the most reliable data. C) fabricating data to support the hypothesis. D) using the wrong inferential statistic. Answer: A Rationale: Cherry picking the data involves selectively choosing and presenting only the data that align with one's hypothesis while disregarding conflicting data. This bias can lead to a distorted view of the evidence and is considered a form of scientific misconduct. 3) A researcher who cherry picks the data is A) being unethical. B) selecting and reporting those data that support the study's hypothesis. C) ignoring data that contradict the study's hypothesis. D) All of the above Answer: D Rationale: Cherry picking data is unethical because it involves deliberately selecting and reporting only supportive data, while ignoring contradictory evidence. Therefore, all the options (A, B, and C) are correct as they describe different aspects of unethical behavior associated with cherry picking. Test Bank for Research Methods: A Process of Inquiry Anthony M. Graziano, Michael L. Raulin 9780205900923, 9780205907694, 9780135705056