CHAPTER 2: DATA WAREHOUSING

Objectives: After completing this chapter, you should be able to:
1. Understand the basic definitions and concepts of data warehouses
2. Understand data warehousing architectures
3. Describe the processes used in developing and managing data warehouses
4. Explain data warehousing operations
5. Explain the role of data warehouses in decision support
6. Explain data integration and the extraction, transformation, and load (ETL) processes
7. Describe real-time (active) data warehousing
8. Understand data warehouse administration and security issues

CHAPTER OVERVIEW

Data warehousing is at the foundation of most BI. This is the data warehousing chapter of the book. Later chapters will use it as they discuss DW …
A repository of current and historical data of potential interest to managers throughout an organization. Data are usually structured so that they are ready for analytical processing (for OLAP, data mining, reporting, querying, etc.). DW provides a single version of the truth.
4. (Note: Watson, 2005, refers to the term data warehousing as a discipline resulting from applications that provide decision and support capabilities)
C. Characteristics of data warehousing (Inmon, 2005) to ensure that the DW is tuned almost exclusively for data access:
1. Subject Oriented – data are organized by topics, such as sales, products, customers, etc. Best for providing a more comprehensive view of the organization; not only how a business is operating, but why.
2. Integrated – data from different sources are stored in a consistent format. Clarity is also obtained in units of measure, naming/labeling of attributes, etc. (The assumption is that the data warehouse is totally integrated.)
3. Time Variant – provides data at various points in time (daily, weekly, monthly, quarterly, annually; historic and current data) so as to analyze trends and deviations, compare and forecast outcomes, etc. Every data warehouse should have a time variable. (Example: LSU enrollment, retention, graduation data)
4. Nonvolatile – users cannot change the data once entered into the data warehouse. This ensures that the data warehouse is almost exclusively
The purpose of data warehousing is to combine all of a company's data and allow users to access the data directly, create reports, and obtain responses to
According to Berson and Dubov (2011), there are four typical categories of drivers that explain the need for data management: Business Development, Sales and Marketing; Customer Service; Risk, Privacy, Compliance and Control; and Operational
This data is collected and organized in order to process orders and maintain good customer service. The logical view of data allows a knowledge worker to arrange and access information based on the needs of the business, separately from the physical view of how information is arranged and stored. This lets an employee create detailed reports on, for example, customers and their order numbers and dates. For a company like Comcast, which has over 27 million customers, a system that keeps important data available for analysis is imperative. A data warehouse allows the company to gather data from several databases and then use it to determine, for example, how many units of voice products are sold, creating the business intelligence needed to make future decisions and remain
Google is able to use data warehousing to improve its business. A data warehouse is a logical collection of data that supports business analysis activities and decision-making tasks. Google can use a data warehouse to store information just as a database can, but in an aggregated form more suited to supporting decision-making tasks.
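To illustrate what "an aggregated form" can mean in practice, here is a minimal sketch that rolls individual transaction records up into monthly totals per product. The field names (`product`, `date`, `units`, `unit_price`) and the sample rows are hypothetical and chosen only for illustration; they do not describe any real company's schema.

```python
from collections import defaultdict

def aggregate_sales(transactions):
    """Roll up individual transactions into units and revenue per (product, month)."""
    summary = defaultdict(lambda: {"units": 0, "revenue": 0.0})
    for t in transactions:
        # Hypothetical convention: dates are ISO strings, so the first
        # seven characters give the "YYYY-MM" month bucket.
        key = (t["product"], t["date"][:7])
        summary[key]["units"] += t["units"]
        summary[key]["revenue"] += t["units"] * t["unit_price"]
    return dict(summary)

# Hypothetical operational records, one row per sale.
transactions = [
    {"product": "widget", "date": "2024-01-05", "units": 2, "unit_price": 10.0},
    {"product": "widget", "date": "2024-01-20", "units": 3, "unit_price": 10.0},
    {"product": "gadget", "date": "2024-02-01", "units": 1, "unit_price": 50.0},
]

summary = aggregate_sales(transactions)
```

An analyst would typically query the small `summary` structure rather than scanning every transaction, which is one reason aggregated data suits decision-making tasks.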
One of the main functions of any business is to use data to gain a strategic competitive advantage. The use of relational databases is a necessity for contemporary organizations; however, data warehousing has become a strategic priority because of the enormous amounts of data that must be analyzed and the varying sources from which the data come. Because the company gathers data through Web analytics and operational systems, we must design a solution overview that incorporates data warehousing. The executive team needs to be clear about what data warehousing can provide the company.
Active data warehousing, or ADW, is a data warehouse implementation that supports near-time or near-real-time decision making. It is characterized by event-driven actions triggered by a continuous stream of queries, generated by people or applications, against a broad, deep, granular set of enterprise data. Continental uses active data warehousing to keep track of the company's daily progress and performance. Continental's management team holds an operations meeting every morning to discuss how their
Extraction, Transformation, and Loading (ETL) processes are responsible for the operations that take place backstage in a data warehouse architecture. Broadly, the data is first extracted from the source data stores, which can be online transaction processing (OLTP) or legacy systems, files of any format, web pages, or other documents such as spreadsheets or text documents. In this step, only the data that has changed since the previous execution of the ETL process (newly inserted or updated) is extracted from the sources. Next, the extracted data is sent to the data staging area, where it is transformed and cleaned. Finally, the data is loaded into the central data warehouse and all its counterparts, e.g., data marts and views (Kabiri & Chiadmi, 2013, p. 1).
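The three steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the source is a list of dictionaries, the warehouse is a dictionary keyed by `order_id`, and change detection relies on a hypothetical `updated_at` timestamp on each source row. All function and field names are invented for the example.

```python
from datetime import datetime

def extract_changed(source_rows, last_run):
    """Extract only rows inserted or updated since the previous ETL run."""
    return [r for r in source_rows if r["updated_at"] > last_run]

def transform(rows):
    """Clean rows in the staging area: tidy names, convert amounts to numbers."""
    staged = []
    for r in rows:
        staged.append({
            "order_id": r["order_id"],
            "customer": r["customer"].strip().title(),
            "amount": round(float(r["amount"]), 2),
            "updated_at": r["updated_at"],
        })
    return staged

def load(warehouse, staged):
    """Load staged rows into the warehouse table, keyed by order_id."""
    for r in staged:
        warehouse[r["order_id"]] = r
    return warehouse

def run_etl(source_rows, warehouse, last_run):
    """Run extract, transform, and load in sequence."""
    return load(warehouse, transform(extract_changed(source_rows, last_run)))

# Hypothetical source data: order 1 predates the last run, order 2 is new.
source = [
    {"order_id": 1, "customer": " alice smith ", "amount": "19.990",
     "updated_at": datetime(2024, 1, 1)},
    {"order_id": 2, "customer": "BOB JONES", "amount": "5",
     "updated_at": datetime(2024, 1, 3)},
]
warehouse = run_etl(source, {}, last_run=datetime(2024, 1, 2))
```

Only order 2 reaches the warehouse here, because order 1 was last updated before `last_run`. A real warehouse would usually append a new versioned row rather than overwrite by key, so that history is preserved; the overwrite is a simplification of this sketch.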
Q3: This case study supports a specific data warehouse product. Locate a case study from another data warehousing software company and explain the data warehouse that was designed in that case study.
What information is accessible? The data warehouse can define what is offered through metadata, published information, and parameterized analytic applications. Is the data of high value? Data warehouse patrons assume reliability and value. The presentation area's data must be correctly organized and safe to consume. In terms of design, the presentation area should be planned for the comfort of its consumers. It must be planned based on the preferences expressed by the data warehouse diners, not the staging supervisors. Service is also critical in the data warehouse. Data must be delivered, as ordered, promptly and in a form that suits the business user or the reporting/delivery application designer. Lastly, cost is a factor for the data
A data warehouse is a large database organized for reporting. It preserves history, integrates data from multiple sources, and is typically not updated in real time. The key components of data warehousing are the operational source systems, the data staging area, the data presentation area, and the data access tools (HIMSS, 2009). The goal of the data warehouse platform is to improve decision-making for clinical, financial, and operational purposes.
This paper will present the return on investment (ROI) of data warehousing (DW). It first covers the history of data warehousing through its definition and timeline. Then, detailed information about return on investment will be discussed, followed by information about new hardware and software technology for data warehousing. Data warehousing is a new term in my department, where we use the Network Appliance (NetApps) Netfiler storage devices/units. The material I read was informative and helped me understand data warehousing better. Finally, the paper closes with a conclusion about the return on investment of data warehousing.
The data warehousing system will also allow the company to use a data model and server technology that speed up querying and reporting. Because querying and reporting are kept out of transaction processing time, the company can use a modeling technique that does not slow down or complicate the transaction processing system. The data warehouse will also allow the company to use a bit-mapped indexing system as its server technology in order to speed up query and report processing. Technologies for transaction recovery will also be employed to speed up transaction
All in all, because a data warehouse is subject-oriented, cohesive, and time-variant, it provides electronic centralization of all the data in one database. It provides integrated, company-wide information and helps separate operational and informational systems in an organization. Although it carries some disadvantages, such as lack of scalability and lack of data integrity, an organized use of data warehousing has many benefits that help the organization in the proper management of all kinds of
Data warehouse – focuses primarily on storing data used to generate the information required to make tactical or strategic decisions. (pg. 9)
A data warehouse consists of multiple databases that work together. In other words, a data warehouse integrates data from other databases, which provides a better understanding of the data. Its primary goal is not just to store data but to give the business, in this case a higher education institution, a means to make decisions that can influence its success. This is accomplished by the data warehouse providing architecture and tools that organize and understand the