Chapter 9
Business Intelligence Systems
True/False Questions
1) Business intelligence (BI) systems have only four of the five components of an
information system: hardware, software, data, and people.
Answer: False
Rationale:
BI systems include all five components of an information system: hardware, software, data,
procedures, and people. The procedures component encompasses the processes and activities
involved in acquiring, analyzing, and disseminating business intelligence within an
organization.
2) The software component of a BI system is called a BI application.
Answer: True
Rationale:
The software component of a BI system consists of specialized applications designed to
extract, transform, analyze, and visualize data to support decision-making processes. These
applications are commonly referred to as BI applications.
3) The three primary activities in the BI process are to acquire data, perform analysis, and
publish results.
Answer: True
Rationale:
The BI process involves acquiring data from various sources, analyzing the data to extract
insights and patterns, and publishing the results in a format that is understandable and
actionable for decision-makers within the organization.
4) Data acquisition is the process of creating business intelligence.
Answer: False
Rationale:
Data acquisition is the process of gathering raw data from internal and external sources, such
as transactional databases, spreadsheets, and external data providers. Business intelligence is
created through the analysis and interpretation of this data to generate insights and support
decision-making.
5) The three fundamental categories of BI analysis are reporting, data mining, and
knowledge management.
Answer: True
Rationale:
The three fundamental categories of BI analysis are reporting, which involves presenting
summarized data in a structured format; data mining, which involves discovering patterns and
relationships in large datasets; and knowledge management, which involves capturing,
organizing, and sharing knowledge within an organization.
6) Push publishing requires the user to request BI results.
Answer: False
Rationale:
Push publishing involves automatically delivering BI results to users without requiring them
to request the information. This approach ensures that relevant insights are proactively
delivered to decision-makers, enhancing their ability to make timely and informed decisions.
7) Report servers are specialized Web servers.
Answer: True
Rationale:
Report servers are specialized servers that are responsible for generating, storing, and
delivering reports to users within an organization. These servers often operate as web servers,
allowing users to access reports through a web-based interface.
8) Operational data is structured for fast and reliable transaction processing.
Answer: True
Rationale:
Operational data is structured and optimized for efficient transaction processing, supporting
the day-to-day operations of an organization. This data is typically stored in relational
databases and is designed to ensure fast and reliable transaction processing.
9) Placing BI applications on operational servers can dramatically increase system
performance.
Answer: False
Rationale:
Placing BI applications on operational servers can potentially degrade system performance
due to the additional processing and resource requirements of BI activities. Separating BI
systems from operational systems helps to ensure that each system can operate optimally
without negatively impacting the performance of the other.
10) Data warehouses do not include data that are purchased from outside sources.
Answer: False
Rationale:
Data warehouses can include data from a variety of sources, including internal operational
systems, external sources, and purchased datasets. Incorporating data from external sources
allows organizations to enrich their analyses with additional insights and perspectives beyond
their internal data sources.
11) Problematic data are termed dirty data.
Answer: True
Rationale:
Dirty data refers to data that is inaccurate, incomplete, inconsistent, or outdated, making it
problematic for use in business intelligence (BI) processes. Cleaning and preprocessing dirty
data is essential to ensure the accuracy and reliability of BI outcomes.
12) A value of 999–999–9999 for a U.S. phone number is an example of dirty data for BI
purposes.
Answer: True
Rationale:
In the context of business intelligence, a value like 999–999–9999 for a U.S. phone number is
an example of dirty data: it has the shape of a phone number, but it is almost certainly a
placeholder entered in place of a real one. Dirty data can hinder the effectiveness of BI
analyses and reporting.
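A check for values like this can be sketched as follows. This is a minimal illustration, not a standard validation routine: the function name `is_suspect_phone`, the NNN-NNN-NNNN shape, and the "single repeated digit" rule are all assumptions made for the example.

```python
import re

# Assumed shape for illustration: NNN-NNN-NNNN, with a hyphen or an en dash.
PHONE_PATTERN = re.compile(r"^(\d{3})[-–](\d{3})[-–](\d{4})$")

def is_suspect_phone(value: str) -> bool:
    """Return True when the value should be reviewed as possible dirty data."""
    match = PHONE_PATTERN.match(value.strip())
    if match is None:
        return True  # does not have the assumed shape at all
    digits = "".join(match.groups())
    # Hypothetical rule: one digit repeated ten times is treated as a placeholder.
    return len(set(digits)) == 1

flagged = [v for v in ["999–999–9999", "206-555-0142"] if is_suspect_phone(v)]
```

A real cleansing step would likely combine several such rules; this one only catches the repeated-digit case named in the question.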
13) Granularity is a term that refers to the level of detail represented by the data.
Answer: True
Rationale:
Granularity in data refers to the level of detail or specificity at which data is captured, stored,
and analyzed. High granularity means detailed data with more specific attributes, while low
granularity means summarized or aggregated data with fewer attributes.
14) It is possible to capture the customers' clicking behavior using clickstream data.
Answer: True
Rationale:
Clickstream data refers to the record of a user's activity on the internet, including clicks,
navigation paths, and interactions with web pages or applications. Capturing and analyzing
clickstream data is common in BI to understand customer behavior and preferences.
15) It is better to have data with too coarse a granularity than data with too fine a granularity.
Answer: False
Rationale:
Neither extreme is ideal, but data that is too fine can be made coarser by summing and
combining, whereas data that is too coarse cannot be separated into its constituent parts.
Too coarse a granularity loses important details and insights, while too fine a granularity
increases storage and processing requirements. The granularity of data should be appropriate
to the specific BI requirements.
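The asymmetry between fine and coarse data can be seen in a small sketch. The order records and the `monthly_totals` helper below are hypothetical; the point is only that detailed rows can be rolled up into monthly totals, while the totals alone cannot be turned back into the individual orders.

```python
from collections import defaultdict

# Hypothetical fine-grained records: (order_date, region, amount).
orders = [
    ("2024-01-03", "West", 120.0),
    ("2024-01-17", "West", 80.0),
    ("2024-01-09", "East", 200.0),
    ("2024-02-02", "West", 50.0),
]

def monthly_totals(rows):
    """Roll daily order rows up to one total per (month, region)."""
    totals = defaultdict(float)
    for order_date, region, amount in rows:
        month = order_date[:7]  # "YYYY-MM" prefix of the ISO date
        totals[(month, region)] += amount
    return dict(totals)

coarse = monthly_totals(orders)
# `coarse` no longer records how many orders made up each total or on which days.
```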
16) Data mart is another term used for a data warehouse.
Answer: False
Rationale:
While both data marts and data warehouses are used for storing and managing data for BI
purposes, they serve different functions. A data warehouse typically integrates data from
various sources across an organization, while a data mart is a subset of a data warehouse that
focuses on specific business functions or departments.
17) A data mart is a data collection that addresses a particular functional area of the business.
Answer: True
Rationale:
A data mart is a subset of a data warehouse that is designed to serve the needs of a specific
business unit, department, or functional area. Data marts are often optimized for the particular
requirements and analysis tasks of the users within that area.
18) A data mart is a lot larger than a data warehouse.
Answer: False
Rationale:
Generally, a data warehouse is larger and more comprehensive than a data mart. A data
warehouse typically serves as a central repository for integrated organizational data, while
data marts are smaller subsets of the data warehouse tailored to specific user needs.
19) A reporting application is a BI application that inputs data from one or more sources and
applies reporting operations to that data to produce business intelligence.
Answer: True
Rationale:
Reporting applications are a type of BI application designed to retrieve, format, and present
data from various sources in the form of reports, dashboards, or visualizations. These
applications help users analyze and understand data to support decision-making processes.
20) The five basic reporting operations that produce business intelligence can all be
accomplished using SQL and basic HTML or a simple report writing tool.
Answer: True
Rationale:
SQL (Structured Query Language) can be used to perform the basic reporting operations of
filtering, grouping, calculating, sorting, and formatting data. Additionally, basic HTML or
report writing tools can be used to present the processed data in a user-friendly format for
analysis and decision-making.
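As a rough illustration of the five operations, the sketch below runs one SQL query through Python's built-in `sqlite3` module and then formats the result as text. The `sales` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Ajax", "West", 100.0),
        ("Ajax", "West", 250.0),
        ("Bloom", "West", 75.0),
        ("Carter", "East", 500.0),
    ],
)

rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total   -- calculating
    FROM sales
    WHERE region = 'West'                   -- filtering
    GROUP BY customer                       -- grouping
    ORDER BY total DESC                     -- sorting
    """
).fetchall()
conn.close()

# Formatting: render each row as a fixed-width text line.
report = [f"{customer:<10}{total:>10.2f}" for customer, total in rows]
```

Here SQL handles the first four operations and a plain string template stands in for the HTML or report-writer step.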
21) RFM analysis considers how recently (R) a customer ordered, how frequently (F) they
ordered, and how much margin (M) the company made on the orders.
Answer: False
Rationale:
RFM analysis considers how recently a customer ordered and how frequently they ordered, but
the "M" stands for money: how much the customer spent on those orders, not the margin the
company made on them.
22) OLAP provides the ability to sum, count, average, and perform other simple arithmetic
operations on groups of data.
Answer: True
Rationale:
OLAP (Online Analytical Processing) is designed to perform multidimensional analysis of
data. It allows users to perform various operations such as aggregation (sum, count, average),
drill-down, slice-and-dice, and more on data stored in multidimensional databases or data
warehouses.
23) The remarkable characteristic of OLAP reports is that they are dynamic.
Answer: True
Rationale:
One of the remarkable characteristics of OLAP reports is indeed their dynamic nature. OLAP
reports allow users to interactively explore and analyze data from different perspectives,
applying various operations and filters in real-time to gain insights into business
performance.
24) In an OLAP report, a measure is the data item of interest.
Answer: True
Rationale:
In OLAP terminology, a measure is indeed the data item of interest or the numerical value
being analyzed. Measures represent the quantitative data points such as sales revenue, profit
margin, or customer count that are subjected to analysis in OLAP reports.
25) Total sales, average sales, and average cost are examples of dimensions used in an OLAP
report.
Answer: False
Rationale:
In OLAP, dimensions represent the categorical data attributes by which measures are
analyzed. Total sales, average sales, and average cost are examples of measures, not
dimensions. Dimensions typically include attributes such as product, customer, time, and
geography.
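The distinction can be made concrete with a short sketch. The fact rows and the `summarize` helper are hypothetical; `product` and `region` play the role of dimensions, and `sales` is the measure that gets totaled and averaged.

```python
from collections import defaultdict

# Hypothetical fact rows: two dimensions (product, region) and one measure (sales).
facts = [
    {"product": "Bike", "region": "West", "sales": 300.0},
    {"product": "Bike", "region": "East", "sales": 100.0},
    {"product": "Helmet", "region": "West", "sales": 40.0},
    {"product": "Helmet", "region": "West", "sales": 60.0},
]

def summarize(rows, dimension, measure="sales"):
    """Total and average of the measure for each value of one dimension."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {
        key: {"total": sum(vals), "average": sum(vals) / len(vals)}
        for key, vals in groups.items()
    }

by_product = summarize(facts, "product")
by_region = summarize(facts, "region")
```

Switching the `dimension` argument regroups the same measure, which is a simplified version of what an OLAP report does when a user changes its layout.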
26) An OLAP cube and an OLAP report are the same thing.
Answer: True
Rationale:
An OLAP cube and an OLAP report are essentially the same thing in the sense that both are
used for multidimensional analysis of data. An OLAP cube refers to the multidimensional
data structure used to store and organize data for OLAP analysis, while an OLAP report
presents the analyzed data in a user-friendly format.
27) A drawback associated with OLAP reports is their inability to let users drill down into the
data.
Answer: False
Rationale:
One of the strengths of OLAP reports is their ability to allow users to drill down into the data,
enabling them to explore detailed information and analyze data at different levels of
granularity. OLAP reports support interactive navigation and exploration of data, including
drilling down to lower levels of detail.
28) Data mining is the application of statistical techniques to find patterns and relationships
among data for classification and prediction.
Answer: True
Rationale:
Data mining involves the use of various statistical and machine learning techniques to
analyze large datasets and discover patterns, trends, associations, and anomalies within the
data. These patterns are then used for classification, prediction, clustering, and other
analytical purposes.
29) Knowledge discovery in databases (KDD) is used as a synonym for data mining.
Answer: True
Rationale:
Knowledge discovery in databases (KDD) is often used interchangeably with data mining.
KDD refers to the process of extracting useful knowledge or insights from large volumes of
data, which includes data mining as one of its key techniques for discovering patterns and
relationships.
30) With unsupervised data mining, analysts do not create a model or hypothesis before
running the analysis.
Answer: True
Rationale:
In unsupervised data mining, analysts do not have predefined models or hypotheses about the
data patterns they are looking for. Instead, they allow the data mining algorithms to identify
hidden patterns, structures, or clusters in the data without prior guidance. Unsupervised
techniques include clustering, association rule mining, and anomaly detection.
31) Cluster analysis is used to identify groups of entities that have similar characteristics.
Answer: True
Rationale:
Cluster analysis, also known as clustering, is a data mining technique used to identify natural
groupings or clusters within a dataset based on similarities in the data attributes. It aims to
partition data points into clusters so that data points within the same cluster are more similar
to each other than to those in other clusters.
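One common clustering method is k-means, shown here in a one-dimensional form as an illustration only; the question does not name a specific algorithm. The spending figures, the starting centroids, and the `kmeans_1d` helper are assumptions made for the example.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Tiny k-means on single numbers: assign to nearest centroid, then re-average."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical monthly spend per customer.
spend = [10.0, 12.0, 11.0, 95.0, 100.0, 105.0]
centroids, clusters = kmeans_1d(spend, centroids=[0.0, 50.0])
```

No labels are supplied; the two groups emerge from the data, which is what makes the technique unsupervised.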

32) In supervised data mining, a model is developed after the analysis.
Answer: False
Rationale:
In supervised data mining, a model is developed before the analysis, not after. Analysts use
labeled training data to build predictive models that can then be applied to new, unlabeled
data for classification or prediction tasks. Supervised techniques include classification and
regression.
33) Regression analysis measures the effect of a set of variables on another variable.
Answer: True
Rationale:
Regression analysis is a statistical technique used to examine the relationship between one
dependent variable and one or more independent variables. It measures how changes in the
independent variables are associated with changes in the dependent variable, thus assessing
the effect of the independent variables on the dependent variable.
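For the simplest case of one independent variable, the fitted line can be computed by ordinary least squares, as in the sketch below. The `fit_line` helper and the advertising and sales figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (independent) and sales (dependent).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(ad_spend, sales)
```

The slope estimates how much the dependent variable changes for a one-unit change in the independent variable.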
34) Neural networks are a popular unsupervised data mining technique used to predict values
and make classifications.
Answer: False
Rationale:
Neural networks are actually a type of supervised data mining technique commonly used for
pattern recognition, classification, and prediction tasks. In supervised learning, neural
networks are trained on labeled data, where the input-output mappings are known, allowing
them to make predictions or classifications based on new input data.
35) In marketing transactions, the fact that customers who buy product X also buy product Y
creates a cross-selling opportunity.
Answer: True
Rationale:
Cross-selling refers to the practice of selling additional products or services to existing
customers based on their past purchasing behavior. When customers frequently purchase both
product X and product Y together, it indicates a correlation between the two products and
presents an opportunity for cross-selling.
36) In market-basket terminology, confidence is the probability that two items will be
purchased together.
Answer: False
Rationale:
In market-basket terminology, confidence measures the probability that a customer who buys
item X will also buy item Y. It represents the likelihood of item Y being purchased given that
item X was purchased. Confidence is calculated as the ratio of the number of transactions
containing both items X and Y to the number of transactions containing item X.
37) In market-basket terminology, a conditional probability estimate is called lift.
Answer: False
Rationale:
In market-basket terminology, a conditional probability estimate is called confidence, not
lift. Lift is a measure of the strength of association between two items: it indicates how
much more likely two items are to be purchased together than would be expected by chance.
Lift is calculated as the ratio of the observed support (frequency of co-occurrence) to the
expected support under independence (product of the individual item supports).
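The three market-basket measures can be worked through on a toy example. The transactions below are hypothetical, and the helper names `support`, `confidence`, and `lift` are chosen for this sketch.

```python
# Hypothetical transactions; each set is one customer's basket.
transactions = [
    {"mask", "fins"},
    {"mask", "fins", "tank"},
    {"mask"},
    {"fins"},
    {"tank"},
]

def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)

def confidence(x, y):
    """Estimated probability of buying y given that x was bought."""
    return support({x, y}) / support({x})

def lift(x, y):
    """Confidence divided by the base probability of buying y."""
    return confidence(x, y) / support({y})

conf_mask_fins = confidence("mask", "fins")
lift_mask_fins = lift("mask", "fins")
```

A lift above 1 suggests the two items are bought together more often than independence would predict; a lift near 1 suggests no association.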
38) A decision tree analysis is a supervised data mining technique.
Answer: True
Rationale:
A decision tree analysis is indeed a supervised data mining technique. Decision trees are
constructed by recursively partitioning the data into subsets based on the values of input
attributes, with the goal of creating a tree-like model of decisions. Each internal node
represents a test on an attribute, each branch corresponds to an outcome of the test, and each
leaf node represents a class label or a numerical value.
39) A common business application of decision trees is to classify loans by likelihood of
default.
Answer: True
Rationale:
Decision trees are commonly used in various business applications, including credit risk
assessment. In the context of banking and finance, decision trees can be applied to classify
loans into categories based on the likelihood of default or creditworthiness, using attributes
such as income, credit history, and loan amount as inputs to the decision-making process.
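Once built, a decision tree reads as a set of nested "If . . . then" rules. The sketch below writes such a tree out by hand; the function name, the attributes chosen, and every threshold are made up for illustration and are not learned from any data.

```python
def classify_loan(income: float, late_payments: int, loan_amount: float) -> str:
    """Hypothetical tree: thresholds are invented, not derived from real loans."""
    if late_payments > 2:
        return "high risk"
    if loan_amount > 0.5 * income:
        # Large loan relative to income: credit history decides the branch.
        if late_payments > 0:
            return "high risk"
        return "medium risk"
    return "low risk"

example = classify_loan(income=60000.0, late_payments=0, loan_amount=20000.0)
```

In practice a decision tree program would choose the attributes and split points from historical loan data rather than having them typed in.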
40) MapReduce is a technique for harnessing the power of thousands of computers working
in parallel.
Answer: True
Rationale:
MapReduce is a programming model and associated implementation framework used for
processing and generating large datasets in parallel across a distributed cluster of computers.
It allows for the efficient processing of vast amounts of data by dividing the workload into
smaller tasks (map phase) and then aggregating the results (reduce phase) in a distributed
manner.
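The two phases can be imitated on a single machine with a word count, the usual introductory example. This sketch runs sequentially and only mirrors the structure; the function names and the input chunks are hypothetical, and a real cluster would send each chunk to a different computer.

```python
from collections import defaultdict

def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}

chunks = ["big data big", "data mining", "big insights"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(shuffle(mapped))
```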
41) Hadoop is an open-source program supported by the Apache Foundation that implements
MapReduce on potentially thousands of computers.
Answer: True
Rationale:
Hadoop is an open-source framework developed by the Apache Software Foundation for
distributed storage and processing of large datasets using the MapReduce programming
model. It provides a scalable, fault-tolerant platform for handling big data analytics and is
designed to run on clusters of commodity hardware, potentially spanning thousands of
computers.
42) Knowledge management was done only after the advent of social media.
Answer: False
Rationale:
Knowledge management (KM) predates the advent of social media by several decades. KM
involves the process of capturing, organizing, storing, and sharing knowledge within an
organization to enhance decision-making, improve efficiency, and foster innovation. While
social media platforms have contributed to knowledge sharing, KM practices existed long
before the emergence of social media.
43) KM fosters innovation by encouraging the free flow of ideas.
Answer: True
Rationale:
One of the key objectives of knowledge management (KM) is to foster innovation within
organizations by facilitating the exchange of ideas, expertise, and best practices among
employees. By creating platforms and processes for sharing knowledge and collaborating on
projects, KM helps to break down silos, stimulate creativity, and promote continuous learning
and improvement.
44) Expert systems are rule-based systems that use "If . . . then" rules unlike those created by
decision tree analysis.
Answer: False
Rationale:
Expert systems and decision tree analysis both involve the use of rules for decision-making,
although they differ in their approaches and applications. Expert systems are rule-based AI
systems that emulate the decision-making processes of human experts using a knowledge
base of rules and heuristics. Decision tree analysis, on the other hand, constructs decision
trees based on data patterns to make predictions or classifications. While decision trees do
use "If . . . then" rules, they are derived from data rather than predefined expert knowledge.
45) In expert systems, the programs that process a set of rules are called expert systems
shells.
Answer: True
Rationale:
Expert systems shells are software tools or frameworks used to develop expert systems.
These shells provide the infrastructure and functionality for representing and processing sets
of rules, typically using "If...Then" logic to mimic the decision-making capabilities of human
experts. Expert systems shells allow developers to focus on encoding domain-specific
knowledge rather than building the underlying inference engine.
46) Static reports are BI documents that are fixed at the time of creation and do not change.
Answer: True
Rationale:
Static reports in business intelligence (BI) are documents or outputs that are generated based
on predefined criteria and do not change over time or in response to user interactions. They
present data and information that is fixed at the time of creation, providing a snapshot of a
particular set of metrics or insights.
47) Dynamic reports are BI documents that are updated at the time they are requested.
Answer: True
Rationale:
Dynamic reports in business intelligence (BI) are documents or outputs that are generated in
real-time or updated dynamically in response to user queries or requests. These reports often
retrieve data from underlying databases or data sources at the time they are accessed,
ensuring that the information presented is current and relevant to the user's needs.
48) A sales report that is current as of the time the user accessed it on a Web server is a static
report.
Answer: False
Rationale:
A sales report that is current as of the time the user accessed it on a Web server is a dynamic
report, not a static report. Dynamic reports are updated at the time they are requested,
ensuring that the data presented reflects the most recent information available in the
underlying data sources.
49) A BI server extends alert/RSS functionality to support user subscriptions.
Answer: True
Rationale:
Business intelligence (BI) servers often include functionality to support alerting and
subscription services for users. These features enable users to subscribe to specific reports,
dashboards, or data alerts, receiving notifications or updates when predefined conditions or
events occur. This functionality enhances user engagement and ensures that stakeholders stay
informed about relevant changes in the data or business environment.
50) All management data needed by any of the BI servers is stored in metadata.
Answer: True
Rationale:
Metadata in business intelligence (BI) systems serves as a repository of information about the
structure, content, and context of the data stored in the BI environment. It includes details
such as data definitions, data lineage, data quality metrics, and relationships between different
data elements. BI servers often rely on metadata to manage and organize the underlying data
assets, facilitating efficient querying, reporting, and analysis.
Multiple Choice Questions
1) ________ is defined as information containing patterns, relationships, and trends of
various forms of data.
A) Process mining
B) Business process management
C) Business intelligence
D) Spatial intelligence
Answer: C
Rationale:
Business intelligence (BI) involves analyzing data to uncover insights, patterns, relationships,
and trends that can help organizations make informed decisions. It encompasses techniques
and technologies for collecting, processing, and analyzing data from various sources to
support business decision-making processes.
2) Which of the following is true of source data for a BI system?
A) It refers to the organization's metadata.
B) It refers to data that the organization purchases from data vendors.
C) It refers to the level of detail represented by the data.
D) It refers to the hierarchical arrangement of criteria that predict a classification or a value.
Answer: B
Rationale:
Source data for a BI system typically refers to the raw data that the organization collects from
various internal and external sources. This data may include transactional data, customer
information, sales records, and more. In some cases, organizations may also purchase
additional data from external data vendors to augment their internal data sources.
3) Data ________ is the process of obtaining, cleaning, organizing, relating, and cataloging
source data.
A) entry
B) acquisition
C) mining
D) encryption
Answer: B
Rationale:
Data acquisition is the process of obtaining, cleaning, organizing, relating, and cataloging
source data from various internal and external sources. This step is essential for preparing
the data for further analysis and processing within the BI system.
4) Which of the following is a fundamental category of BI analysis?
A) automation
B) catalog
C) report servers
D) data mining
Answer: D
Rationale:
Data mining is a fundamental category of BI analysis that involves using statistical and
machine learning techniques to discover patterns, trends, and insights within large datasets. It
encompasses methods such as clustering, classification, regression, and association rule
mining.
5) Push publishing delivers business intelligence ________.
A) according to a schedule or as a result of an event or particular data condition
B) through reporting, data mining, and knowledge management
C) by obtaining, cleaning, organizing, relating, and cataloging source data
D) in response to requests from users
Answer: A
Rationale:
Push publishing delivers business intelligence according to a predefined schedule or in
response to specific events or conditions without requiring user requests. It proactively
delivers relevant insights, reports, or alerts to users based on predetermined criteria or
triggers.
6) ________ requires the user to request BI results.
A) Push publishing
B) Pull publishing
C) Desktop publishing
D) Accessible publishing
Answer: B
Rationale:
Pull publishing requires users to actively request business intelligence results or reports when
they need them. Unlike push publishing, which proactively delivers information, pull
publishing requires users to initiate the retrieval of BI content based on their specific needs or
queries.

7) Because of the various problems with operational data, large organizations choose to
extract operational data and store them into a(n) ________.
A) OLAP cube
B) neural network
C) data warehouse
D) Web server
Answer: C
Rationale:
Large organizations often choose to extract operational data from transactional systems and
store them in a data warehouse. A data warehouse is a centralized repository that integrates
data from multiple sources and provides a unified view of the organization's data for
reporting and analysis purposes.
8) ________ records the source, format, assumptions and constraints, and other facts about
the data.
A) Clickstream data
B) Dimensional data
C) Outsourced data
D) Metadata
Answer: D
Rationale:
Metadata refers to descriptive information about the data, including its source, format,
structure, and other relevant attributes. It provides context and documentation for the data,
helping users understand its meaning, lineage, and usage within the BI system.
9) Problematic operational data are termed ________.
A) bad data
B) rough data
C) dirty data
D) granular data
Answer: C
Rationale:
Problematic operational data that contain errors, inconsistencies, or inaccuracies are often
referred to as dirty data. Dirty data can arise from various sources, including data entry errors,
system glitches, and incomplete or outdated information.
10) ________ is a term that refers to the level of detail represented by the data.
A) Granularity
B) Intricacy
C) Elaboration
D) Complexity
Answer: A
Rationale:
Granularity in the context of data refers to the level of detail or specificity represented by the
data. It indicates how finely data is categorized, organized, or recorded, with higher
granularity indicating more detailed information and lower granularity indicating broader,
aggregated data. Granularity impacts the precision and specificity of analyses and reporting
conducted using the data.
11) Which of the following statements is true about operational data?
A) It is always better to have data with too coarse a granularity than with too fine a
granularity.
B) If the data granularity is too coarse, the data can be made finer by summing and
combining.
C) Purchased operational data often contains missing elements.
D) Problematic operational data are termed rough data.
Answer: C
Rationale:
Purchased operational data, especially from external sources, often contain missing elements
or incomplete records due to various factors such as data collection processes, data
transmission errors, or data formatting issues. Organizations need to address these missing
elements through data cleansing and preprocessing techniques before using the data for
analysis or decision-making.
12) Due to a phenomenon called the ________, the more attributes there are, the easier it is to
build a model that fits the sample data but that is worthless as a predictor.
A) attribute paradox
B) curse of dimensionality
C) uncertainty principle
D) economies of scale
Answer: B
Rationale:
The "curse of dimensionality" refers to the challenge of working with high-dimensional data,
where the number of attributes or features is large relative to the number of samples or
observations. In such cases, models can easily overfit the training data, resulting in poor
generalization to new data.
This phenomenon highlights the importance of feature selection, dimensionality reduction,
and regularization techniques in machine learning and data analysis.
13) A ________ takes data from the data manufacturers, cleans and processes the data, and
then stores it.
A) data mart
B) data mine
C) data warehouse
D) data model
Answer: C
Rationale:
A data warehouse is a centralized repository that integrates data from various sources,
cleanses and processes the data to ensure consistency and quality, and then stores it for
analysis and reporting purposes. Data warehouses are designed to support decision-making
processes by providing a unified view of the organization's data.
14) A data ________ is a data collection, smaller than the data warehouse, that addresses a
particular department or functional area of the business.
A) mart
B) mine
C) cube
D) model
Answer: A
Rationale:
A data mart is a subset of a data warehouse that focuses on specific subject areas,
departments, or functional areas within an organization. Data marts are designed to meet the
unique needs and requirements of specific user groups or business units, providing tailored
access to relevant data for analysis and decision-making.
15) Which of the following statements is true about data marts?
A) A data mart is like a distributor in a supply chain, while a data warehouse can be
compared to a retail store.
B) Data mart users possess the data management expertise that data warehouse employees
have.
C) Data marts address only a particular component or functional area of a business.
D) Data marts are larger than data warehouses.
Answer: C
Rationale:
Data marts are designed to address specific business needs or functional areas within an
organization, focusing on providing relevant data and analytical capabilities to support
decision-making within those areas. Unlike data warehouses, which serve as centralized
repositories for integrated data from multiple sources, data marts are smaller in scope and
tailored to the requirements of particular user groups or departments.
16) Which of the following statements is true about reporting applications?
A) Reporting applications deliver business intelligence to users as a result of an event or
particular data condition.
B) Reporting applications consist of five standard components: hardware, software, data,
procedures, and people.
C) Two important reporting applications are RFM analysis and OLAP.
D) Reporting applications produce business intelligence using highly sophisticated
operations.
Answer: C
Rationale:
Reporting applications are a category of business intelligence (BI) tools that focus on
generating predefined reports and visualizations from data to support decision-making
processes. Two common types of reporting applications are RFM (Recency, Frequency,
Monetary) analysis, which ranks customers based on their purchasing patterns, and OLAP
(Online Analytical Processing), which enables multidimensional analysis of data.
17) Which of the following is a basic operation used by reporting tools to produce
information from data?
A) coalescing
B) transposing
C) dispersing
D) calculating
Answer: D
Rationale:
Calculating is a basic operation used by reporting tools to produce information from data.
Reporting tools perform various calculations such as summing, averaging, counting, and
aggregating data to generate meaningful insights and metrics for decision-making purposes.
18) ________ analysis is a way of analyzing and ranking customers according to their
purchasing patterns.
A) TQM
B) CRM
C) Market-basket
D) RFM
Answer: D
Rationale:
RFM (Recency, Frequency, Monetary) analysis is a method used in marketing and customer
relationship management (CRM) to analyze and rank customers based on their purchasing
behavior. It considers factors such as how recently a customer made a purchase (Recency),
how frequently they make purchases (Frequency), and how much money they spend
(Monetary).
19) RFM analysis is used to analyze and rank customers according to their ________.
A) purchasing patterns
B) propensity to respond to a marketing stimulus
C) socio-economic status
D) motivation and needs
Answer: A
Rationale:
RFM analysis is primarily used to analyze and rank customers based on their purchasing
patterns. By considering factors such as how recently a customer made a purchase, how
frequently they make purchases, and how much money they spend, organizations can
segment their customer base and tailor marketing strategies to different customer segments.

20) U.S. Steel Corp. is a well-known steel manufacturing company. SAMCROW, one of the
customers of U.S. Steel Corp. holds an RFM score of 111. Which of the following
characteristics relates SAMCROW with its RFM score?
A) SAMCROW has ordered recently and orders frequently, but it orders the least expensive
goods.
B) SAMCROW has not ordered in some time, but when it did order in the past it ordered
frequently, and its orders were of the highest monetary value.
C) SAMCROW has not ordered for some time, it did not order frequently, and, when it did
order, it bought the least-expensive items.
D) SAMCROW has ordered recently and orders frequently, and it orders the most expensive
goods.
Answer: D
Rationale:
An RFM score of 111 places SAMCROW in the top group (score 1) on every dimension: it has
ordered recently, orders frequently, and orders the most expensive goods. Therefore, option D
best relates to SAMCROW's RFM score.
21) A sales team should attempt to up-sell more expensive products to a customer who has an
RFM score of ________.
A) 311
B) 555
C) 113
D) 545
Answer: C
Rationale:
In RFM analysis, the score is a three-digit number whose digits rate recency, frequency, and
monetary value, respectively, on a scale from 1 (the top 20 percent of customers) to 5 (the
bottom 20 percent). A customer with a score of 113 has ordered recently and orders frequently
but is only average in the monetary value of its orders. Because this customer is already an
active, regular buyer, the sales team should attempt to up-sell more expensive products to it.
22) Ajax is one of the customers of a well-known linen manufacturing company. Ajax has not
ordered linen in some time, but when it did order in the past it ordered frequently, and its
orders were of the highest monetary value. Under the given circumstances, Ajax's RFM score
is most likely ________.
A) 155
B) 511
C) 555
D) 151
Answer: B
Rationale:
Each digit of an RFM score ranges from 1 (best) to 5 (worst). Ajax ordered frequently and
placed orders of the highest monetary value, so its frequency and monetary scores are both 1.
Because Ajax has not ordered in some time, its recency score is 5, giving an RFM score of 511.
23) How should a sales team respond to a customer who has an RFM score of 545?
A) The sales team should contact this customer immediately.
B) The sales team should let go of this customer; the loss will be minimal.
C) The sales team should attempt to up-sell more expensive goods to this customer.
D) The sales team should spend more time with this customer.
Answer: B
Rationale:
An RFM score of 545 indicates that the customer has not ordered in some time (recency 5),
did not order frequently (frequency 4), and placed orders of low monetary value (monetary 5).
The sales team should let go of this customer because the loss will be minimal.

24) OLAP stands for ________.
A) online analytical processing
B) object-based lead analysis procedure
C) object-oriented analytical protocol
D) organizational lead analysis process
Answer: A
Rationale:
OLAP stands for Online Analytical Processing, which is a technology that enables users to
interactively analyze multidimensional data from multiple perspectives.
25) The viewer of an OLAP report can change its format. Which term implies this capability?
A) processing
B) analytical
C) dimension
D) online
Answer: D
Rationale:
The term "online" implies the capability for users to interactively change the format or view
of an OLAP report according to their analytical needs and preferences.
26) The remarkable characteristic of OLAP reports is that they are ________, as they are
online and the viewer of the report can change their format.
A) extensible
B) informal
C) specific
D) dynamic
Answer: D
Rationale:
OLAP reports are dynamic in nature, allowing users to dynamically manipulate and explore
data from different dimensions and perspectives, making them well-suited for interactive
analysis and decision-making.
27) An OLAP report has measures and dimensions. Which of the following is an example of
a dimension?
A) total sales
B) average sales
C) sales region
D) average cost
Answer: C
Rationale:
In OLAP reports, dimensions represent the characteristics or attributes by which data is
analyzed or categorized. "Sales region" is an example of a dimension, as it categorizes data
based on geographical regions.
28) Which of the following accurately defines a dimension in an OLAP report?
A) It is a characteristic of a measure.
B) It is the item that is processed in the OLAP report.
C) It is the data item of interest.
D) It is referred to as a decision tree.
Answer: A
Rationale:
A dimension in an OLAP report is a characteristic or attribute of a measure that provides
context for analysis. It categorizes data along specific criteria, allowing users to slice and dice
data for deeper insights.
29) Which of the following is an example of a measure in an OLAP report?
A) customer type
B) purchase date
C) sales region
D) average cost
Answer: D
Rationale:
A measure in an OLAP report represents a quantifiable metric or numerical value that is of
interest for analysis. "Average cost" is an example of a measure as it provides information
about the average cost of products sold.
30) An ________ and an OLAP report are the same thing.
A) OLAP measure
B) OLAP cube
C) OLAP dimension
D) OLAP array
Answer: B
Rationale:
An OLAP cube is a multidimensional data structure used for storing and analyzing data in
OLAP reports. It contains measures, dimensions, and hierarchies that enable users to perform
multidimensional analysis and exploration of data.
31) Which of the following observations about RFM and OLAP reports is true?
A) RFM is more generic than OLAP.
B) OLAP reports are more dynamic than RFM reports.
C) RFM reports have measures and dimensions.
D) RFM reports can drill down into the data.
Answer: B
Rationale:
OLAP (Online Analytical Processing) reports are typically more dynamic than RFM
(Recency, Frequency, Monetary) reports. OLAP reports allow users to interactively analyze
multidimensional data, change perspectives, and drill down into details, making them more
dynamic compared to RFM reports, which focus on specific customer behavior metrics
without the same level of interactive functionality.
32) ________ is the application of statistical techniques to find patterns and relationships
among data for classification and prediction.
A) Data optimization
B) Database normalization
C) Data mining
D) Data warehousing
Answer: C
Rationale:
Data mining involves the application of statistical techniques to discover patterns,
relationships, and insights from large datasets. It aims to extract useful information and
knowledge from data for classification, prediction, clustering, and other analytical purposes.
33) Which of the following terms is used as a synonym for data mining?
A) regression analysis
B) data warehousing
C) knowledge discovery in databases
D) parallel processing
Answer: C
Rationale:
Knowledge discovery in databases (KDD) is often used interchangeably with data mining.
Both terms refer to the process of discovering patterns, trends, and valuable insights from
large datasets.

34) Which of the following is true of unsupervised data mining?
A) Analysts do not create a model or hypothesis before running the analysis.
B) Neural networks are a popular unsupervised data mining application.
C) Unsupervised data mining requires tools such as regression analysis.
D) Unsupervised data mining requires analysts to fit data to suggested hypotheses.
Answer: A
Rationale:
Unsupervised data mining involves exploring data without a specific hypothesis or model in
mind. Analysts do not guide the analysis based on predefined outcomes but rather let
algorithms identify patterns and relationships within the data on their own.
35) With ________, statistical techniques can identify groups of entities that have similar
characteristics.
A) regression analysis
B) cluster analysis
C) expert systems
D) neural networks
Answer: B
Rationale:
Cluster analysis is a data mining technique used to identify natural groupings or clusters
within a dataset based on similarities between data points. It helps to uncover patterns and
structures in data without requiring predefined categories or labels.
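As a rough illustration of grouping entities with similar characteristics, the sketch below runs a one-dimensional k-means procedure on hypothetical annual spending figures. K-means is only one of several clustering techniques and is an assumption here; the function name, data, and starting centers are invented for the example.

```python
def k_means_1d(values, centers, rounds=10):
    """Group numbers around the nearest center, then move each center to its group's mean."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            # Assign the value to the closest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Recompute each center; keep the old one if its group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers

# Hypothetical annual spending per customer, with two arbitrary starting centers.
annual_spend = [10, 12, 14, 100, 110, 120]
groups, centers = k_means_1d(annual_spend, centers=[10, 120])
```

No category labels are supplied in advance; the two groups emerge from the data alone, which is what makes the technique unsupervised.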
36) With ________, data miners develop a model prior to the analysis and apply statistical
techniques to data to estimate parameters of the model.
A) cluster analysis
B) unsupervised data mining
C) supervised data mining
D) click streaming
Answer: C
Rationale:
Supervised data mining involves developing a predictive model based on known outcomes or
labeled data. Analysts guide the analysis by providing a training dataset with predefined
target variables, allowing algorithms to learn patterns and relationships for predictive
purposes.
37) Which of the following is an example of a supervised data-mining technique?
A) cluster analysis
B) market-basket analysis
C) regression analysis
D) click streaming
Answer: C
Rationale:
Regression analysis is a supervised data mining technique used to model the relationship
between a dependent variable and one or more independent variables. It predicts the value of
the dependent variable based on the values of the independent variables.
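A minimal sketch of this idea, assuming a single independent variable and ordinary least squares: the model (a straight line) is chosen before the analysis, and the data are used only to estimate its two parameters. The function name and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (independent) and units sold (dependent).
ad_spend = [1, 2, 3, 4]
units_sold = [3, 5, 7, 9]
slope, intercept = fit_line(ad_spend, units_sold)

# Predict the dependent variable for a new value of the independent variable.
predicted = slope * 5 + intercept
```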
38) Which of the following shows the products that customers tend to buy together?
A) regression analysis
B) market-basket analysis
C) neural networks
D) cluster analysis
Answer: B
Rationale:
Market-basket analysis is a data mining technique used to identify associations and patterns
in customer purchasing behavior. It analyzes transaction data to discover which products are
frequently purchased together, enabling retailers to implement cross-selling strategies.
39) In marketing transactions, the fact that customers who buy product X also buy product Y
creates a ________ opportunity. That is, "If they're buying X, sell them Y," or "If they're
buying Y, sell them X."
A) cross-selling
B) value-added selling
C) break-even
D) portfolio
Answer: A
Rationale:
Cross-selling is the practice of selling additional products or services to customers based on
their existing purchases. When customers buy one product, there is an opportunity to
recommend related or complementary products, creating a cross-selling opportunity.
40) In market-basket terminology, ________ describes the probability that two items will be
purchased together.
A) support
B) confidence
C) lift
D) dimension
Answer: A
Rationale:
In market-basket analysis, support refers to the frequency with which two items are
purchased together in transactions. It quantifies the probability of co-occurrence and is used
to identify associations between items in a dataset.

41) In market-basket terminology, the ratio of confidence to the base probability of buying an
item is called ________.
A) confidence
B) support
C) granularity
D) lift
Answer: D
Rationale:
Lift is a measure used in market-basket analysis to determine the strength of association
between items in a transaction. It indicates the ratio of the observed frequency of two items
being purchased together to the frequency expected if they were purchased independently.
Higher lift values indicate stronger associations.
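The three market-basket terms can be computed directly from transaction data. The sketch below assumes a tiny, made-up set of dive-shop transactions; the item names and function names are hypothetical.

```python
# Each set is one hypothetical transaction.
transactions = [
    {"mask", "fins"},
    {"mask", "fins", "tank"},
    {"mask"},
    {"tank"},
    {"fins", "tank"},
]

def support(items):
    """Probability that all of the given items appear in the same transaction."""
    wanted = set(items)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)

def confidence(given, item):
    """Probability of buying `item`, given that `given` was bought."""
    return support([given, item]) / support([given])

def lift(given, item):
    """Ratio of confidence to the base probability of buying `item`."""
    return confidence(given, item) / support([item])

mask_fins_support = support(["mask", "fins"])
mask_fins_confidence = confidence("mask", "fins")
mask_fins_lift = lift("mask", "fins")
```

A lift above 1 means that buying the first item raises the likelihood of buying the second above its base probability.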
42) ________ is a hierarchical arrangement of criteria that predict a classification or a value.
A) A value chain
B) A cluster analysis
C) A decision tree
D) A neural network
Answer: C
Rationale:
A decision tree is a predictive modeling tool that utilizes a hierarchical structure of decision
nodes to classify data or predict outcomes. Each node represents a criterion or attribute, and
branches represent possible values or decisions based on those criteria.
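The hierarchical arrangement can be pictured as nested criteria that are followed from the top down until a classification is reached. The sketch below hard-codes a small tree for illustration; in practice the criteria would be produced by mining data, and the attributes, thresholds, and labels here are hypothetical.

```python
# Each inner node tests one attribute against a threshold; leaves are classifications.
loan_tree = {
    "attribute": "percent_past_due",
    "threshold": 50,
    "below": {
        "attribute": "credit_score",
        "threshold": 570,
        "below": "high risk",
        "at_or_above": "low risk",
    },
    "at_or_above": "high risk",
}

def classify(node, record):
    """Walk from the root to a leaf, choosing a branch at each criterion."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["below"] if value < node["threshold"] else node["at_or_above"]
    return node

label = classify(loan_tree, {"percent_past_due": 10, "credit_score": 700})
```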
43) ________ is a technique for harnessing the power of thousands of computers working in
parallel.
A) IPP
B) MapReduce
C) Business Process
D) Reposition
Answer: B
Rationale:
MapReduce is a programming model and parallel processing technique used for processing
and generating large datasets across distributed clusters of computers. It enables efficient
processing of massive amounts of data by dividing tasks into smaller sub-tasks that can be
executed in parallel.
44) ________ is the process of creating value from intellectual capital and sharing that
knowledge with employees, managers, suppliers, customers, and others who need it.
A) Intellectual property protection
B) Knowledge management
C) Business process reengineering
D) Repository management
Answer: B
Rationale:
Knowledge management (KM) involves the systematic process of creating, organizing,
sharing, and utilizing knowledge assets within an organization to enhance decision-making,
problem-solving, innovation, and overall performance.
45) KM benefits organizations because it ________.
A) allows distributors to work within the company premises
B) enables suppliers to work according to the organizational policies
C) enables customers to provide bulk feedback
D) enables employees and partners to work smarter
Answer: D
Rationale:
Knowledge management enables organizations to harness and leverage the collective
expertise, insights, and experiences of employees and partners, leading to improved
decision-making, problem-solving, innovation, and overall productivity.
46) Which of the following is true about knowledge management (KM)?
A) KM restricts the free flow of ideas.
B) KM improves customer service by broadening the response time.
C) KM boosts revenues by getting products and services to market faster.
D) KM increases operational cost.
Answer: C
Rationale:
Knowledge management (KM) can enhance organizational performance by accelerating the
dissemination of knowledge, streamlining processes, fostering collaboration, and promoting
innovation, ultimately leading to faster time-to-market for products and services and
increased revenues.
47) ________ attempt to directly capture employee expertise.
A) Expert networks
B) Expert systems
C) Expert analysis
D) Expert decisions
Answer: B
Rationale:
Expert systems are AI-based information systems designed to emulate the decision-making
abilities of human experts in specific domains. They capture and encode expert knowledge in
the form of rules or heuristics to provide automated problem-solving and decision support.
48) Which of the following observations concerning expert systems is true?
A) The "If...then" rules used in these systems are created by mining data.
B) These systems have lived up to the high expectations set by their name.
C) These systems typically have fewer than a dozen rules.
D) These systems encode human knowledge in the form of "If...Then" rules.
Answer: D
Rationale:
Expert systems encode human expertise and knowledge in the form of "If...then" rules or
logical statements. These rules are derived from the understanding and experience of human
experts in a particular domain and are used to guide the system's decision-making process.
49) ________ are information systems that support the management and delivery of
documents including reports, Web pages, and other expressions of employee knowledge.
A) Knowledge Discovery in Databases (KDD)
B) Online Analytical Processing (OLAP) applications
C) Content Management Systems (CMS)
D) Data Transfer Protocols (DTP)
Answer: C
Rationale:
Content Management Systems (CMS) are software platforms used to create, manage, and
deliver digital content, including documents, reports, web pages, and multimedia assets. They
facilitate the organization, storage, retrieval, and sharing of knowledge resources within an
organization.
50) ________ is the application of social media and related applications for the management
and delivery of organizational knowledge resources.
A) BI server management
B) Hyper-social knowledge management
C) Knowledge management protocol
D) Expert system
Answer: B
Rationale:
Hyper-social knowledge management refers to the integration of social media technologies
and platforms into knowledge management processes. It involves leveraging social
collaboration tools, user-generated content, and online communities to enhance the creation,
sharing, and dissemination of organizational knowledge resources.
51) A sales report that is current, as of the time the user accessed it on a Web server, is an
example of a(n) ________.
A) static report
B) dynamic report
C) expert system
D) hybrid market report
Answer: B
Rationale:
A dynamic report is one that is updated or refreshed automatically to reflect the most current
data at the time of access. In this case, the sales report being current as of the time the user
accessed it on a Web server indicates that it dynamically adjusts its content based on real-time
or near-real-time data.
52) Which of the following statements is true about BI publishing alternatives?
A) Most dynamic reports are published as PDF documents.
B) For Web servers and SharePoint, the push option is mandatory.
C) BI servers extend alert/RSS functionality to support user subscriptions.
D) Publishing static BI content requires more skill, compared to publishing dynamic BI
content.
Answer: C
Rationale:
BI servers often extend their functionality to include features like alerting and RSS (Really
Simple Syndication) feeds, which allow users to subscribe to specific reports or data updates.
This functionality enhances user engagement and enables timely access to relevant
information.
53) The ________ is the most popular BI server today.
A) Oracle BI Server
B) Microsoft Windows Vista
C) Microsoft SQL Server Report Manager
D) Pentaho BI Server
Answer: C
Rationale:
Microsoft SQL Server Report Manager is a popular BI server used for creating, managing,
and distributing reports within organizations. It offers robust reporting capabilities and
integration with other Microsoft technologies, making it widely adopted in the BI industry.
54) BI servers use ________ to determine what results to send to which users and on which
schedule.
A) expert systems
B) metadata
C) RSS feeds
D) neural networks
Answer: B
Rationale:
BI servers rely on metadata, which contains information about the structure, content, and
context of the data, to determine how to process and deliver results to users. Metadata helps
BI servers understand data relationships, user access permissions, report formats, and
scheduling requirements, enabling efficient and personalized delivery of business intelligence
insights.

Essay Questions
1) Define business intelligence and BI systems.
Answer: Business intelligence (BI) systems are information systems that process operational
and other data to identify patterns, relationships, and trends that can be used to make
predictions. These patterns, relationships, trends, and predictions are referred to as business
intelligence. As information systems, BI systems have the five standard components:
hardware, software, data, procedures, and people. The software component of a BI system is
called a BI application.
2) Name and describe the three primary activities in the BI process.
Answer: The three primary activities in the BI process are: acquire data, perform analysis,
and publish results. Data acquisition is the process of obtaining, cleaning, organizing,
relating, and cataloging source data. BI analysis is the process of creating business
intelligence. The three fundamental categories of BI analysis are reporting, data mining, and
knowledge management. Publish results is the process of delivering business intelligence to
the knowledge workers who need it. Push publishing delivers business intelligence to users
without any request from the users; the BI results are delivered according to a schedule or as
a result of an event or particular data condition. Pull publishing requires the user to request BI
results. Publishing media include print as well as online content delivered via Web servers,
specialized Web servers known as report servers, and BI results that are sent via automation
to other programs.
3) Describe the functions of data warehouses.
Answer: Larger organizations typically create and staff a group of people who manage and
run a data warehouse, which is a facility for managing an organization's BI data. The
functions of a data warehouse are to:
• Obtain data
• Cleanse data
• Organize and relate data
• Catalog data
4) What are the common problems with using operational data?
Answer: The problems associated with operational data are:
• Problematic (dirty) data
• Missing values
• Inconsistent data
• Nonintegrated data
• Wrong granularity (too fine; not fine enough)
• Too much data (too many attributes; too many data points)
5) Describe the features of a data mart.
Answer: A data mart is a data collection, smaller than the data warehouse, that addresses a
particular component or functional area of the business. If the data warehouse is the
distributor in a supply chain, then a data mart is like a retail store in a supply chain. Users in
the data mart obtain data that pertain to a particular business function from the data
warehouse. Such users do not have the data management expertise that data warehouse
employees have, but they are knowledgeable analysts for a given business function.
6) What is a reporting application? Name five basic reporting operations.
Answer: A reporting application is a BI application that inputs data from one or more sources
and applies reporting operations to that data to produce business intelligence.
Reporting applications produce business intelligence using five basic operations:
• Sorting
• Filtering
• Grouping
• Calculating
• Formatting
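The five operations can be seen together in a short sketch. The order records, field names, and function name below are hypothetical and serve only to show each operation in turn.

```python
# Hypothetical order records; the field names are assumptions for illustration.
orders = [
    {"region": "East", "customer": "Ajax", "amount": 120.0},
    {"region": "West", "customer": "Bosco", "amount": 75.5},
    {"region": "East", "customer": "Carver", "amount": 310.0},
    {"region": "West", "customer": "Dunn", "amount": 18.0},
]

def region_report(rows, min_amount):
    # Filtering: keep only orders at or above the threshold.
    kept = [r for r in rows if r["amount"] >= min_amount]
    # Sorting: order the remaining rows by region name.
    kept.sort(key=lambda r: r["region"])
    # Grouping and calculating: total the amounts for each region.
    totals = {}
    for r in kept:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
    # Formatting: render each total as a fixed-width text line.
    return [f"{region:<6}{total:>10.2f}" for region, total in totals.items()]

report_lines = region_report(orders, min_amount=50.0)
```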
7) What is an RFM analysis?
Answer: RFM analysis, a technique readily implemented with basic reporting operations, is
used to analyze and rank customers according to their purchasing patterns. RFM considers
how recently (R) a customer has ordered, how frequently (F) a customer ordered, and how
much money (M) the customer has spent.
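One way to produce RFM scores is to rank customers on each dimension and assign 1 to the top 20 percent through 5 to the bottom 20 percent. The sketch below assumes that scheme; the customer names, figures, and function names are hypothetical.

```python
def quintile_scores(values, higher_is_better=True):
    """Rank customers on one dimension; return 1 (top 20%) through 5 (bottom 20%)."""
    ranked = sorted(values, key=values.get, reverse=higher_is_better)
    n = len(ranked)
    return {name: (position * 5) // n + 1 for position, name in enumerate(ranked)}

def rfm_scores(days_since_order, order_count, total_spent):
    # Fewer days since the last order is better, so recency ranks ascending.
    r = quintile_scores(days_since_order, higher_is_better=False)
    f = quintile_scores(order_count)
    m = quintile_scores(total_spent)
    return {name: f"{r[name]}{f[name]}{m[name]}" for name in days_since_order}

# Hypothetical customer data.
days_since_order = {"Ajax": 400, "Bosco": 3, "Carver": 30, "Dunn": 90, "Ewing": 200}
order_count = {"Ajax": 50, "Bosco": 40, "Carver": 12, "Dunn": 3, "Ewing": 6}
total_spent = {"Ajax": 9000, "Bosco": 2500, "Carver": 4000, "Dunn": 300, "Ewing": 1200}

scores = rfm_scores(days_since_order, order_count, total_spent)
```

In this made-up data, Ajax has not ordered for a long time but leads on frequency and spending, so its score is 511.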
8) What is OLAP? What are some of its features?
Answer: Online analytical processing (OLAP) is an important reporting application. It is
more generic than RFM. OLAP provides the ability to sum, count, average, and perform
other simple arithmetic operations on groups of data. The remarkable characteristic of OLAP
reports is that they are dynamic. The viewer of the report can change the report's format,
hence the term online. An OLAP report has measures and dimensions. A measure is the data
item of interest. It is the item that is to be summed or averaged or otherwise processed in the
OLAP report. Total sales, average sales, and average cost are examples of measures. A
dimension is a characteristic of a measure. Purchase date, customer type, customer location,
and sales region are all examples of dimensions. With an OLAP report, it is possible to drill
down into the data. This term means to further divide the data into more detail.
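The relationship between measures, dimensions, and drilling down can be sketched with a small grouping function. The sales records and names below are hypothetical; the measure is the amount, and the dimensions are region and customer type.

```python
from collections import defaultdict

# Hypothetical sales records.
sales = [
    {"region": "East", "customer_type": "Retail", "amount": 100},
    {"region": "East", "customer_type": "Wholesale", "amount": 250},
    {"region": "West", "customer_type": "Retail", "amount": 80},
    {"region": "West", "customer_type": "Retail", "amount": 40},
]

def total_by(rows, dimensions, measure="amount"):
    """Sum the measure for each combination of the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)

# The viewer picks the dimensions; adding one drills down into more detail.
by_region = total_by(sales, ["region"])
drilled_down = total_by(sales, ["region", "customer_type"])
```

Changing the list of dimensions changes the report's format without changing the underlying data, which is the sense in which such a report is dynamic.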
9) Differentiate between unsupervised and supervised data mining.
Answer: Data mining techniques fall into two broad categories: unsupervised and supervised.
With unsupervised data mining, analysts do not create a model or hypothesis before running
the analysis. Instead, they apply the data mining technique to the data and observe the results.
With this method, analysts create hypotheses after the analysis, in order to explain the
patterns found. With supervised data mining, data miners develop a model prior to the
analysis and apply statistical techniques to data to estimate parameters of the model.
10) What is the objective of performing a market-basket analysis?
Answer: A market-basket analysis is an unsupervised data mining technique for determining
sales patterns. Such an analysis shows the products that customers tend to buy together. In
marketing transactions, the fact that customers who buy product X also buy product Y creates
a cross-selling opportunity. That is, "If they're buying X, sell them Y," or "If they're buying Y,
sell them X."
11) What are MapReduce and Hadoop?
Answer: MapReduce is a technique for harnessing the power of thousands of computers
working in parallel. The basic idea is that the BigData collection is broken into pieces, and
hundreds or thousands of independent processors search these pieces for something of
interest. Hadoop is an open-source program supported by the Apache Foundation that
implements MapReduce on potentially thousands of computers. Hadoop could drive the
process of finding and counting the Google search terms, but Google uses its own proprietary
version of MapReduce to do so, instead.
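The map and reduce phases can be imitated in a few lines. The sketch below counts search terms and runs sequentially in a single process, standing in for the many parallel processors a real deployment would use; it does not use Hadoop, and the function names and sample data are hypothetical.

```python
from collections import defaultdict

def map_piece(piece):
    """Map phase: emit a (term, 1) pair for every term found in one piece of the data."""
    return [(term, 1) for term in piece.lower().split()]

def reduce_pairs(pairs):
    """Reduce phase: combine the pairs from all pieces into one count per term."""
    counts = defaultdict(int)
    for term, n in pairs:
        counts[term] += n
    return dict(counts)

# The collection is broken into pieces; each piece could go to a separate processor.
pieces = ["hadoop olap hadoop", "olap rfm", "Hadoop"]
mapped = [pair for piece in pieces for pair in map_piece(piece)]
term_counts = reduce_pairs(mapped)
```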
12) What is knowledge management (KM)?
Answer: Knowledge management (KM) is the process of creating value from intellectual
capital and sharing that knowledge with employees, managers, suppliers, customers, and
others who need that capital. The goal of knowledge management is to prevent the kinds of
problems just described.
13) What are the primary benefits of knowledge management?
Answer: The primary benefits of KM are:
1. KM fosters innovation by encouraging the free flow of ideas.
2. KM improves customer service by streamlining response time.
3. KM boosts revenues by getting products and services to market faster.
4. KM enhances employee retention rates by recognizing the value of employees' knowledge
and rewarding them for it.
5. KM streamlines operations and reduces costs by eliminating redundant or unnecessary
processes.
14) What are expert systems? What are their primary disadvantages?
Answer: Expert systems are rule-based systems that encode human knowledge in the form of
"If/then" rules similar to those created by decision tree analysis. However, decision trees'
"If/then" rules are created by mining data. The "If/then" rules in expert systems are created
by interviewing experts in a given business domain and codifying the rules stated by those
experts. Expert systems suffer from three major disadvantages. First, they are difficult and
expensive to develop. Second, expert systems are
difficult to maintain. Finally, they have been unable to live up to the high expectations set by
their name.
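A toy rule base shows what encoding knowledge as "If/then" rules looks like. The rules below are invented stand-ins for what an interviewed expert might state; the domain, thresholds, and names are hypothetical.

```python
# Each rule pairs an "if" condition with a "then" conclusion.
RULES = [
    (lambda facts: facts["days_overdue"] > 60, "refer account to collections"),
    (lambda facts: facts["days_overdue"] > 0, "send payment reminder"),
    (lambda facts: facts["orders_last_year"] >= 12, "offer volume discount"),
]

def advise(facts):
    """Return the conclusion of the first rule whose condition holds."""
    for condition, conclusion in RULES:
        if condition(facts):
            return conclusion
    return "no action"

advice = advise({"days_overdue": 90, "orders_last_year": 2})
```

Even this small example hints at the maintenance problem: every change in policy means finding and rewriting the affected rules by hand.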
15) Explain the difference between static and dynamic reports.
Answer: Static reports are BI documents that are fixed at the time of creation and do not
change. A printed sales analysis is an example of a static report. In the BI context, most static
reports are published as PDF documents.
Dynamic reports are BI documents that are updated at the time they are requested. A sales
report that is current as of the time the user accessed it on a Web server is a dynamic report.
In almost all cases, publishing a dynamic report requires the BI application to access a
database or other data source at the time the report is delivered to the user.
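The distinction can be sketched in code, assuming a list that stands in for a live database table. All names and figures are hypothetical.

```python
# Stand-in for a live sales table in a database.
sales_table = [120.0, 75.5]

def render_report(rows):
    return f"Total sales: {sum(rows):.2f}"

# Static report: rendered once and fixed at the time of creation.
static_report = render_report(sales_table)

def dynamic_report():
    # Dynamic report: reads the data source again each time it is requested.
    return render_report(sales_table)

# A new sale arrives after the static report was created.
sales_table.append(310.0)
current_report = dynamic_report()
```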
16) Describe the management functions of a business intelligence server.
Answer: Business intelligence servers provide two major functions: management and
delivery. The management function maintains metadata about the authorized allocation of BI
results to users. The BI server tracks what results are available, what users are authorized to
view those results, and the schedule upon which the results are provided to the authorized
users. It adjusts allocations as available results change and users come and go. BI servers
vary in complexity and functionality, and their management function varies as well.
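As a loose sketch of the management function, the metadata can be thought of as records that tie a user to a BI result and a delivery schedule. The structure, field names, and values below are assumptions and do not reflect any particular BI server product.

```python
# Hypothetical metadata: which user receives which result on which schedule.
subscriptions = [
    {"user": "alice", "result": "regional_sales", "schedule": "daily"},
    {"user": "bob", "result": "regional_sales", "schedule": "weekly"},
    {"user": "alice", "result": "rfm_scores", "schedule": "weekly"},
]

def deliveries_due(metadata, schedule):
    """List the (user, result) pairs the server should deliver on a given schedule."""
    return sorted((m["user"], m["result"]) for m in metadata if m["schedule"] == schedule)

def revoke(metadata, user):
    """Adjust allocations when a user leaves."""
    return [m for m in metadata if m["user"] != user]

weekly = deliveries_due(subscriptions, "weekly")
after_bob_leaves = deliveries_due(revoke(subscriptions, "bob"), "weekly")
```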

Test Bank for Using MIS
David M. Kroenke
9780133029673, 9780135191767, 9780134106786, 9780138132484, 9780136100751, 9780134606996
