Saturday, March 12, 2016

Chapter 8- Accessing Organizational Information (Data Warehouse)

What is a Data Warehouse?
Ø  Defined in many different ways, but not rigorously
-          A decision support database that is maintained separately from the organization’s operational database.
-          A consistent database source that brings together information from multiple sources for decision support queries.
-          Supports information processing by providing a solid platform of consolidated, historical data for analysis.
History of Data Warehousing
Ø  In the 1990s, executives became less concerned with the day-to-day business operations and more concerned with overall business functions
Ø  The data warehouse provided the ability to support decision making without disrupting the day-to-day operations, because:
-          Operational information is mainly current – does not include the history for better decision making
-          Operational information often has quality issues
-          Without information history, it is difficult to tell how and why things change over time
Data warehouse fundamentals
Ø  Data warehouse – A logical collection of information – gathered from many different operational databases – that supports business analysis activities and decision-making tasks
Ø  The primary purpose of a data warehouse is to combine information from throughout an organization into a single repository for decision-making purposes – a data warehouse supports only analytical processing
Data warehouse model
Ø  Extraction, transformation and loading (ETL) – A process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse.
Ø  The data warehouse then sends subsets of the information to data marts.


Ø  Data mart – contains a subset of data warehouse information.
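
The ETL flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source systems (sales_db, web_db), their field names, and the helper functions are hypothetical and stand in for real operational databases.

```python
# Hypothetical operational sources with inconsistent field names and formats.
sales_db = [{"cust": "ann lee", "amount": "120.50", "region": "east"}]
web_db = [{"customer_name": "Bob Ray", "total": 80.0, "area": "WEST"}]


def extract():
    """Extract: pull raw rows from each source, tagged with where they came from."""
    return [("sales", row) for row in sales_db] + [("web", row) for row in web_db]


def transform(tagged_rows):
    """Transform: map every source onto one common set of field definitions."""
    common = []
    for source, row in tagged_rows:
        if source == "sales":
            name, amount, region = row["cust"], row["amount"], row["region"]
        else:
            name, amount, region = row["customer_name"], row["total"], row["area"]
        common.append({
            "customer": name.title(),   # one naming convention for customers
            "amount": float(amount),    # one numeric type for amounts
            "region": region.lower(),   # one spelling for regions
        })
    return common


def load(rows, warehouse):
    """Load: append the standardized rows to the warehouse."""
    warehouse.extend(rows)


def build_data_mart(warehouse, region):
    """A data mart holds only a subset of the warehouse (here, one region)."""
    return [row for row in warehouse if row["region"] == region]


warehouse = []
load(transform(extract()), warehouse)
east_mart = build_data_mart(warehouse, "east")
print(east_mart)
```

The warehouse ends up holding rows from both sources in one format, while the data mart keeps only the slice one department would need.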

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeGiXPOieQPzxeNEk0Vy_pc6KdzXHU9Vnwn5tk9ISoc_QMBbpreEUzePYHMb6NHp3_IfSIr9o0NCjMMIbWqULmPOO4PpfcnKXwd0AdUNR4lgWpzpXD2WfVB__8TjEA5rRV9-QCoAaCYgJw/s320/15_Data_Warehouse_Model.png


Multidimensional Analysis and Data Mining
Ø  A relational database contains information in a series of two-dimensional tables.
Ø  In a data warehouse and data mart, information is multidimensional: it contains layers of columns and rows
-          Dimension – A particular attribute of information



https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbVMnBBDIIfUAxtqf_iXt6cAXj-q-xVyWbPcu9jHJi8PXt6x16T2a5RtIfa4sF18NSZCuK8kKC-Tlv-awqAbacfjbcPuBozF-UzptNtxaQPAgufb0jTjbWV6d4w7vJY1VcDIAYaZOIbUXg/s320/example-cube.png

Ø  Cube – common term for the representation of multidimensional information

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNMMU4dzmTuue_ONVh4xbMm4VQNmqyYoTrxQZ1LmAluIwMoAQh_UUJdT6mA_mEKpb_3WUxOA4lv_a3c87NzJga4MduWnLXUnnlSJOmLN66_PKRJMxizG9GmAsl0SWMO_pbTAFM6a22-m-Q/s320/analysi3.gif

Ø  Once a cube of information is created, users can begin to slice and dice the cube to drill down into the information.
Ø  Users can analyze information in a number of different ways and with a number of different dimensions.
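
A rough sketch of slicing a cube and totalling along a dimension, using plain Python. The three dimensions (product, region, quarter), the sales figures, and the function names are made up for illustration; real tools do this at much larger scale.

```python
from collections import defaultdict

# Hypothetical cube stored as fact rows: (product, region, quarter, sales).
DIMENSIONS = ("product", "region", "quarter")
facts = [
    ("widget", "east", "Q1", 100),
    ("widget", "west", "Q1", 150),
    ("gadget", "east", "Q1", 200),
    ("widget", "east", "Q2", 120),
    ("gadget", "west", "Q2", 90),
]


def slice_cube(rows, dimension, value):
    """Slice: keep only the rows where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in rows if row[i] == value]


def total_by(rows, dimension):
    """Sum the sales measure for each value of the chosen dimension."""
    i = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for row in rows:
        totals[row[i]] += row[3]
    return dict(totals)


east = slice_cube(facts, "region", "east")
east_by_product = total_by(east, "product")
all_by_quarter = total_by(facts, "quarter")
print(east_by_product)
print(all_by_quarter)
```

The same fact rows answer different questions depending on which dimension is fixed and which one is totalled.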
Ø  Data mining – the process of analyzing data to extract information not offered by the raw data alone. Also known as “knowledge discovery”: computer-assisted tools and techniques for sifting through and analyzing vast data stores in order to find trends, patterns and correlations that can guide decision making and increase understanding
Ø  To perform data mining users need data-mining tools
-          Data-mining tool – uses a variety of techniques to find patterns and relationships in large volumes of information. E.g., retailers can use knowledge of these patterns to improve the placement of items in the layout of a mail-order catalog page or Web page.
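
As a minimal sketch of one pattern-finding technique, the code below counts which items are bought together most often. The orders are invented sample data, and simple pair counting is only one of many techniques a data-mining tool might use.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders: each order is the set of items bought together.
orders = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"lantern", "batteries"},
    {"tent", "sleeping bag", "batteries"},
]


def pair_counts(orders):
    """Count how many orders contain each pair of items."""
    counts = Counter()
    for order in orders:
        # Sorting gives every pair one canonical (alphabetical) order.
        for pair in combinations(sorted(order), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(orders)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

A retailer could use a result like this to place the most frequently paired items next to each other on a catalog page or Web page.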
Information Cleansing or Scrubbing
Ø  An organization must maintain high-quality data in the data warehouse
Ø  Information cleansing or scrubbing – A process that weeds out and fixes or discards inconsistent, incorrect or incomplete information
Ø  Occurs first during the ETL process and second on the information once it is in the data warehouse
Ø  Contact information in an operational system
Ø  Standardizing customer name from operational systems
Ø  Information cleansing activities
-          Missing Records or Attributes
-          Redundant Records
-          Missing Keys or Other Required Data
-          Erroneous Relationships or References
-          Inaccurate Data

Ø  Accurate and complete information
https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcDM4k4hH7B0AA4BRlWF7ANPWsFv2fsJv8QmzgkX8nCqrYEiz_PFnGQwXvOllc_3nVhCsW4CyYZzE5PNtJYr-a4l8Y12OISmM7Rnc_19uaTmkOst22Yns6B5UDApU5L5dMVfPLKuyFMKc7/s320/bi2.jpg
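
Several of the cleansing activities listed above can be sketched as simple checks in Python. The records, the field names, and the rule that an email must contain "@" are assumptions made for this example; checking erroneous relationships or references is left out to keep it short.

```python
# Hypothetical customer records pulled from operational systems.
raw_records = [
    {"id": 1, "name": "  ann LEE ", "email": "ann@example.com"},
    {"id": 1, "name": "Ann Lee", "email": "ann@example.com"},     # redundant record
    {"id": None, "name": "Bob Ray", "email": "bob@example.com"},  # missing key
    {"id": 3, "name": "Cy Doe", "email": ""},                     # missing attribute
    {"id": 4, "name": "Di Fox", "email": "di-at-example"},        # inaccurate data
]


def cleanse(records):
    """Return (clean, rejected); rejected holds (record, reason) pairs."""
    clean, rejected, seen_ids = [], [], set()
    for rec in records:
        email = rec["email"].strip()
        if rec["id"] is None:
            rejected.append((rec, "missing key"))
        elif rec["id"] in seen_ids:
            rejected.append((rec, "redundant record"))
        elif not email:
            rejected.append((rec, "missing attribute"))
        elif "@" not in email:
            rejected.append((rec, "inaccurate data"))
        else:
            seen_ids.add(rec["id"])
            # Standardize the customer name: collapse spaces, use title case.
            name = " ".join(rec["name"].split()).title()
            clean.append({"id": rec["id"], "name": name, "email": email})
    return clean, rejected


clean, rejected = cleanse(raw_records)
reasons = [reason for _, reason in rejected]
print(clean)
print(reasons)
```

This sketch discards bad records and reports why; a real cleansing process might instead try to fix them, for example by looking up the missing email.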

Business Intelligence
Ø  Business Intelligence – refers to applications and technologies that are used to gather, provide access to, and analyze data and information to support decision-making efforts
Ø  These systems illustrate business intelligence in the areas of customer profiling, customer support, market research, market segmentation, product profitability, statistical analysis, and inventory and distribution analysis, to name a few
Ø  E.g., Excel, Access
