Module 1 has covered quite a bit of ground on data analytics in the context of Business Intelligence. Its content covers four main topics: Data Warehouse Design & Star Schema Design, Performance Management & Balanced Scorecard, Data Quality Analysis, and Dashboard Design & Analysis. This article gives an overview of these four topics.
Data Warehouse Design & Star Schema Design
The module started off by introducing the notion of data warehouses and data marts, and the data design system known as Dimensional Modeling (Kimball et al. 2016). The essential protocol for dimensional modeling involves (1) defining the business process, (2) declaring the grain of the model (i.e., the events from which data are collected), (3) identifying its dimensions (i.e., its context), and (4) identifying its measurable facts.
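The four steps above can be sketched with a small star schema. This is only an illustration under assumed names: the retail sales process, the tables (dim_date, dim_product, dim_store, fact_sales), and the sample rows are hypothetical and do not come from the module. It uses Python's built-in sqlite3 module.

```python
import sqlite3

# Hypothetical retail example; all table and column names are illustrative.
# Step 1: business process = retail sales.
# Step 2: grain = one row per product per store per day.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Step 3: dimensions give each fact its context.
    CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, city TEXT, region TEXT);

    -- Step 4: the fact table holds measures plus a key to each dimension.
    CREATE TABLE fact_sales (
        date_key    INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        store_key   INTEGER REFERENCES dim_store(store_key),
        units_sold  INTEGER,
        revenue     REAL
    );
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, "2020-09-01", "2020-09"), (2, "2020-09-02", "2020-09")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Espresso", "Coffee"), (2, "Green tea", "Tea")])
conn.executemany("INSERT INTO dim_store VALUES (?, ?, ?)",
                 [(1, "Austin", "South"), (2, "Boston", "Northeast")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 1, 10, 30.0), (1, 2, 1, 4, 10.0),
                  (2, 1, 2, 7, 21.0), (2, 2, 2, 5, 12.5)])

# A typical star-schema query: join facts to a dimension, then aggregate.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(revenue_by_category)
```

Every analytical question in this layout is one join away from the fact table, which is the simplicity for business users that dimensional modeling aims for.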
An important characteristic of dimensional modeling is that it supports database designs in which normalization is largely avoided. As Kimball and Ross (2013) put it, “the price paid for greater analytic flexibility is often greater complexity. Although IT professionals may be impressed by elegant flexibility, business users may be just as easily turned off by complexity.”
I see a tradeoff here that seems analogous to the one found in code writing and web content development. Basically, two paradigms pull in opposite directions:
- Minimize information redundancy. This approach aims to protect content integrity, reduce maintenance time, and minimize storage of information, at the cost of increasing the complexity and the number of connections between items (whether they are data objects or modules of source code).
- Allow for redundant information, to make content more readable and user-friendly, improve access to content through certain bits of information (e.g., queries or functions in source code), and thin the network of joins between bits of information. This is found in both code development and database modeling.
In general, a model that sits halfway between these two theoretical extremes may be the preferred practice: it facilitates maintenance and updating of content while still providing a logical structure for efficient access and navigation of the data space.
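The tradeoff between the two paradigms can be made concrete with a small sketch. The product and category data below are hypothetical, and plain Python dictionaries stand in for database tables.

```python
# Hypothetical product data; names and values are illustrative only.

# Normalized: each category description is stored once and referenced by key.
categories = {1: {"name": "Coffee", "department": "Beverages"}}
products_normalized = [
    {"name": "Espresso", "category_id": 1},
    {"name": "Latte", "category_id": 1},
]

# Denormalized: category attributes are repeated on every product row.
products_denormalized = [
    {"name": "Espresso", "category": "Coffee", "department": "Beverages"},
    {"name": "Latte", "category": "Coffee", "department": "Beverages"},
]

def department_normalized(product):
    # Reading requires following a reference (the equivalent of a join).
    return categories[product["category_id"]]["department"]

def department_denormalized(product):
    # Reading is a direct lookup on the row itself.
    return product["department"]

# Updating is where the costs flip: one write versus one write per row.
categories[1]["department"] = "Drinks"
for row in products_denormalized:
    row["department"] = "Drinks"
```

The normalized form needs one update but an extra lookup on every read; the denormalized form reads directly but must keep every copy in sync.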
Data Quality Analysis
This section was the shortest in the module. Its importance for guaranteeing a reliable and efficient data warehouse is, however, unquestionable. Most of its content revolves around data profiling and data governance. In data profiling, the goal is to discover and understand the content, structure, and quantity of data, as part of managing and governing data. Examples of profiling techniques include visualizing the frequency of values, uncovering associations between data attributes, and accurately depicting the completeness and consistency of attribute values. In my search for resources on data quality management, I found Woodall et al. (2014) to be a good source for definitions and methods.
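As a minimal sketch of two of the profiling techniques mentioned above (value frequencies and completeness), the following uses only the Python standard library. The customer records and the helper names completeness and value_frequencies are hypothetical, chosen for illustration.

```python
from collections import Counter

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "country": "US", "email": "a@example.com"},
    {"id": 2, "country": "us", "email": None},
    {"id": 3, "country": "ES", "email": "c@example.com"},
    {"id": 4, "country": None, "email": "d@example.com"},
]

def completeness(rows, column):
    """Fraction of rows in which the column is not missing."""
    present = sum(1 for row in rows if row[column] is not None)
    return present / len(rows)

def value_frequencies(rows, column):
    """Count how often each non-missing value occurs."""
    return Counter(row[column] for row in rows if row[column] is not None)

for column in ("country", "email"):
    print(column, completeness(records, column), value_frequencies(records, column))
```

Even on this toy data, the frequency count for country exposes a consistency problem ("US" and "us" are counted as different values), which is the kind of finding profiling is meant to surface.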
Performance Management & Balanced Scorecard

The module also focused on particular methodologies for the management of business performance. Concretely, it considered the Balanced Scorecard (Kaplan and Norton, 1996; Schneiderman, 2006). The Balanced Scorecard includes goals and performance measures that respond to four perspectives: financial (e.g., sales, revenue), customer (e.g., customer satisfaction, customer purchasing preferences), internal business process (e.g., overall cycle time, product quality), and learning & growth (e.g., life cycle of a product, design of new products, employee training). The Balanced Scorecard has evolved into a second generation and a third generation to improve the selection of relevant measures (Lawrie and Cobbold, 2002). I found it interesting to note how the Balanced Scorecard concept, although popularized by Kaplan and Norton’s publications, dates back to the mid-1980s and originated in a professional business environment, only to be discovered, modeled, and adopted later in academic management thinking (Schneiderman, 2006).
I see potential in the Balanced Scorecard to define the dimensions of the objective space in multiobjective optimization problems. Provided that alternative strategies or several cases can be compared (e.g., different branches, years, regions, or simulated business models), a global measure of comparative performance (e.g., a Pareto front) could be added to the framework to assist managers in assessing and prioritizing objectives based on a population of cases.
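To illustrate the idea, here is a minimal sketch of a Pareto front computed over a handful of cases. The branch names and scores are invented, and it assumes each of the four perspectives has already been reduced to a single score where higher is better (a measure such as cycle time would need to be inverted first).

```python
# Hypothetical branch scores on the four perspectives, in the order
# (financial, customer, internal process, learning & growth); higher is better.
branches = {
    "North": (0.80, 0.60, 0.70, 0.50),
    "South": (0.70, 0.90, 0.60, 0.60),
    "East":  (0.60, 0.50, 0.60, 0.40),
    "West":  (0.80, 0.60, 0.70, 0.70),
}

def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(cases):
    """Names of the cases that no other case dominates."""
    return sorted(
        name for name, scores in cases.items()
        if not any(dominates(other, scores) for other in cases.values())
    )

print(pareto_front(branches))
```

Cases on the front represent different tradeoffs among the perspectives, none strictly better than another; cases off the front are outperformed on every perspective by at least one alternative.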
Dashboard Design & Analysis
The basic notion of a dashboard is a graphical environment that helps to visually describe key performance indicators at a glance. Few (2006) points out that, despite the earlier existence of suitable graphical user interface technologies, dashboards didn't become widespread until the late 1990s, thanks to the popularity of KPIs and the adoption of Robert S. Kaplan and David P. Norton's Balanced Scorecard (Kaplan and Norton, 1996). We have introduced the Balanced Scorecard above. The graphical analogy and complementarity between both tools are evident, but conceptual differences can be identified: while dashboards are more of a tactical tool to monitor performance, the Balanced Scorecard is a strategic framework in which quality indicators of performance are prescribed.
Multimedia (text, image) and multimodal information formats (chart, map, measures) provide a visual interface with capabilities to interact with dynamic data. Although not exclusive to dashboards nor caused by them, the miniaturization of multimedia devices (tablets, mobile phones, laptops) may affect the communicative capacity of dashboards, as graphical features in the dashboard need to be compressed, simplified, and minimized to prioritize information while avoiding saturation of information on the screen.
References
Few, S. (2006). Information Dashboard Design. O'Reilly. [URL]
Kaplan, R.S. and Norton, D.P. (1992). “The Balanced Scorecard – Measures That Drive Performance”. Harvard Business Review. January–February: 71–79.
Kaplan, R.S. and Norton, D.P. (1996). “Using the Balanced Scorecard as a Strategic Management System”. Harvard Business Review. January–February: 75–85.
Kimball, R. and Ross, M. (2013). The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling (3rd ed.). Wiley.
Lawrie, G. and Cobbold, I. (2002). Development of the 3rd Generation Balanced Scorecard: Evolution of the Balanced Scorecard into an effective strategic performance management tool. 2GC Working Paper.
Schneiderman, A.M. (2006). “Analog Devices: 1986-1992: The First Balanced Scorecard”. URL: https://www.schneiderman.com/Concepts/The_First_Balanced_Scorecard/BSC_INTRO_AND_CONTENTS.htm [Accessed: 9/20/2020]
Woodall, P., Oberhofer, M., and Borek, A. (2014). “A Classification of Data Quality Assessment and Improvement Methods”. International Journal of Information Quality, 3. DOI: https://doi.org/10.1504/IJIQ.2014.068656
I'm a little sad that the data quality section was not as long as I felt it should have been. Data quality costs companies trillions of dollars, and I'm sure that as we consume exponentially more data, the problem will keep getting worse. I would check out the source below for a good read on the cost of bad data within the US.
A visualization engineer or BI dashboard designer has to go through many cost/benefit grids to understand what information should be shown and what information can be removed from the scope while still letting management and leadership understand the problem and the solution being presented in the data. Overall, I agree that with the shrinkage of screens, the ability of visualization platforms to be scalable and adaptive to varying screen sizes will be a key driver of adoption and use, especially since a lot of individuals are working from home and also working from their phones.
Source: https://hbr.org/2016/09/bad-data-costs-the-u-s-3-trillion-per-year
Hi Fernando, great post! I enjoyed how you framed dashboards and the Balanced Scorecard as complementary yet conceptually different. I also agree with the assertion that the data quality section could play a bigger part in the class. I attempted to use the Ataccama online profiling tool on the spreadsheet for the assignment, but it took forever to get through the queue and it didn't finish before the assignment was due.