Information Retrieval And Evaluating Its Usefulness

Study Report on Information Retrieval and Evaluating Its Usefulness
Adarsh Murali Kashyap
800828747

Table of Contents
1. Introduction
2. ETL process
3. Creation of a warehouse using SQL statements
4. OLAP operations
5. Data Mining
5.1. Cluster Analysis
5.2. Association Rule Mining
5.3. Outcome of ETL, OLAP, Mining operations
6. Data Analytics and its usefulness for business
7. Usage of production logs to test and engineer an application’s performance
8. References
9. VB Code

1. Introduction
Information is very important to a business organization. Information helps in identifying opportunities, understanding the customers in…
There are many ways of cleaning data, and many tools help with formatting and with removing unwanted parts of the data. Here I will demonstrate a method of extracting and cleaning data files using Visual Basic, and evaluate its usefulness. This report covers topics that I studied during my master's program, including a few concepts of data warehousing, data management and data analytics. These topics cover different ways of data manipulation: extraction and transformation techniques; loading of data using SQL queries (creation of tables, insertion of values and checking their normal forms); creation of a data warehouse and evaluation of its usefulness by measuring several factors; and application of data mining techniques to analyze data in a way that leads to an improved understanding of the business and of the importance of analytics on business data.

2. ETL process
ETL is a process of managing databases by performing the steps below:
Step 1: Extraction - Extract data from data sources.
Step 2: Transformation
• Data cleaning: remove errors, inconsistencies and redundancies.
• Data transformation: transform data into warehouse format.
• Data reduction: remove useless data, shrink data without loss of information.
Step 3: Loading - Load transformed data into database/warehouse.

I will be considering the “Movies.list” file from IMDB
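
The three ETL steps listed above can be sketched in a few lines of code. The report itself uses Visual Basic; the sketch below uses Python instead, purely for illustration. The sample lines, the simplified "title (year)<TAB>year" layout, the `movies` table and all function names are assumptions made for this example, not the real layout of the IMDB file or the author's implementation.

```python
import re
import sqlite3

# Hypothetical raw lines in an assumed, simplified "title (year)<TAB>year" layout.
# This is NOT the actual Movies.list format; it only illustrates the steps.
RAW_LINES = [
    "The Matrix (1999)\t1999",
    "  Heat (1995)\t1995  ",     # stray whitespace (inconsistency)
    "The Matrix (1999)\t1999",   # redundant duplicate
    "Broken line without year",  # error: cannot be parsed
    "",                          # empty line
]

LINE_PATTERN = re.compile(r"^(?P<title>.+?)\s+\((?P<year>\d{4})\)\t\d{4}$")


def extract(lines):
    """Step 1: read raw records from the source (here, an in-memory list)."""
    return list(lines)


def transform(lines):
    """Step 2: clean, transform and reduce the raw records."""
    seen = set()
    rows = []
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # data cleaning: drop lines that cannot be parsed
        # data transformation: typed (title, year) tuple in the target format;
        # data reduction: the repeated trailing year column is not kept
        row = (match.group("title"), int(match.group("year")))
        if row in seen:
            continue  # data cleaning: drop redundant duplicates
        seen.add(row)
        rows.append(row)
    return rows


def load(rows, connection):
    """Step 3: load the transformed rows into a database table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS movies (title TEXT, year INTEGER)"
    )
    connection.executemany(
        "INSERT INTO movies (title, year) VALUES (?, ?)", rows
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
clean_rows = transform(extract(RAW_LINES))
load(clean_rows, connection)
```

An in-memory SQLite database stands in for the warehouse so the sketch runs without any setup; a real load step would target whatever database the warehouse actually uses.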