Information Retrieval And Evaluating Its Usefulness

Study Report on

Information retrieval and evaluating its usefulness

Adarsh Murali Kashyap
800828747

Table of Contents

1. Introduction
2. ETL process
3. Creation of a warehouse using SQL statements
4. OLAP operations
5. Data Mining
5.1. Cluster Analysis
5.2. Association Rule Mining
5.3. Outcome of ETL, OLAP, Mining operations
6. Data Analytics and its usefulness for business
7. Usage of production logs to test and engineer an application’s performance
8. References
9. VB Code

1. Introduction

Information is very important to a business organization. Information helps in identifying opportunities, understanding the customers in …

Data can be cleaned in many ways, and many tools help with formatting it and removing unwanted parts. Here I demonstrate one method of extracting and cleaning data files using Visual Basic, and I evaluate its usefulness. This report covers topics I studied during my master's program, including a few concepts from data warehousing, data management and data analytics. These topics cover different kinds of data manipulation: extraction; transformation techniques; loading data using SQL queries (creating tables, inserting values and checking their normal forms); creating a data warehouse and evaluating its usefulness by measuring several factors; and applying data mining techniques to analyze the data, leading to an improved understanding of the business and of the importance of analytics on business data.

2. ETL process
ETL (extract, transform, load) is a process for preparing data and moving it into a database or warehouse through the following steps:

Step 1: Extraction - Extract data from data sources.
Step 2: Transformation
 - Data cleaning: remove errors, inconsistencies and redundancies.
 - Data transformation: transform data into warehouse format.
 - Data reduction: remove useless data, shrink data without loss of information.
Step 3: Loading - Load transformed data into database/warehouse.
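
The three steps above can be sketched in a few lines. This is a minimal illustration in Python rather than the Visual Basic used in this report, and it is not the report's actual code: the sample records, the tab-separated "title, year" layout, the `movies` table and all function names are assumptions made for the example.

```python
import sqlite3

# Hypothetical raw records in an assumed "title<TAB>year" layout.
# These are illustrative values, not data from the report.
RAW_LINES = [
    "Alpha\t1999",
    "  Beta \t2004",
    "Alpha\t1999",      # redundancy: exact duplicate
    "Gamma\tunknown",   # error: year is not a number
    "",                 # useless data: blank line
]


def extract(lines):
    """Step 1: pull raw records from the source, skipping blank lines."""
    return [line for line in lines if line.strip()]


def transform(records):
    """Step 2: clean, convert to the target format, and drop duplicates."""
    seen = set()
    rows = []
    for record in records:
        title, _, year = record.partition("\t")
        title, year = title.strip(), year.strip()
        if not title or not year.isdigit():
            continue  # data cleaning: discard malformed records
        row = (title, int(year))  # data transformation: typed tuple
        if row in seen:
            continue  # data reduction: remove redundant rows
        seen.add(row)
        rows.append(row)
    return rows


def load(rows, conn):
    """Step 3: load the transformed rows into a database table."""
    conn.execute("CREATE TABLE IF NOT EXISTS movies (title TEXT, year INTEGER)")
    conn.executemany("INSERT INTO movies (title, year) VALUES (?, ?)", rows)
    conn.commit()


def run_etl(lines):
    """Run extract, transform and load against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(lines)), conn)
    return conn
```

Calling `run_etl(RAW_LINES)` returns a connection whose `movies` table holds only the cleaned, de-duplicated rows. Keeping each step in its own function mirrors the step list and makes it possible to check the cleaning rules separately from the database load.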

I will be considering the “Movies.list” file from IMDB.
