Using Data Storage And Cleansing

1027 Words Nov 21st, 2014 5 Pages
Selecting and creating the data set: The data set is drawn from sources such as a data warehouse, transactional data or flat tables. This step is crucial because it forms the base for constructing models; the entire study may fail if any important attribute is missing. Once the best available data set is in place, the techniques of knowledge discovery and modelling are applied to it.
Pre-processing and cleansing: The data is made reliable during this stage, using mechanisms such as removing outliers and handling missing values.
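The two cleansing mechanisms named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the choice of median imputation, and the 1.5 x IQR rule for outliers are not taken from the text, and a real study would pick the rules that suit its data.

```python
from statistics import median, quantiles

def impute_missing(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def remove_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (an assumed, common rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical attribute with two missing readings and one extreme value.
raw = [12.0, None, 11.5, 13.0, 250.0, 12.5, None, 11.0]
imputed = impute_missing(raw)
cleaned = remove_outliers(imputed)
print(cleaned)
```

Imputing before filtering keeps the number of records stable until the outlier step; the reverse order is equally defensible when missing values are rare.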
Data transformation: Better data is generated and prepared for the data mining step. Methods include dimension reduction, feature extraction and selection, record sampling and so forth.
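As a rough sketch of two of the transformation methods mentioned, the Python below performs record sampling and a very simple form of attribute selection. The helper names, the variance threshold, and the sample records are hypothetical; dropping near-constant attributes is just one possible selection criterion.

```python
import random
from statistics import pvariance

def sample_records(records, n, seed=0):
    """Record sampling: draw n records without replacement, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(records, n)

def select_features(records, min_variance=1e-9):
    """Attribute selection: keep only attributes whose values actually vary."""
    names = list(records[0].keys())
    keep = [
        name for name in names
        if pvariance([r[name] for r in records]) > min_variance
    ]
    return [{name: r[name] for name in keep} for r in records]

# Hypothetical records; "region_code" is constant and carries no information.
records = [
    {"age": 23, "income": 31000, "region_code": 7},
    {"age": 45, "income": 52000, "region_code": 7},
    {"age": 36, "income": 47000, "region_code": 7},
    {"age": 52, "income": 61000, "region_code": 7},
]
reduced = select_features(records)
subset = sample_records(reduced, 2)
print(subset)
```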
Data mining: Choosing an appropriate data mining algorithm is the main task in this step. After deciding which kind of data mining to use (classification, regression, clustering, etc.), we need to consider two main goals: prediction and description.
Prediction is often referred to as supervised data mining and description is referred to as unsupervised data mining.
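The difference between the two goals can be illustrated with a small Python sketch. The nearest-neighbour rule and the two-cluster k-means below are stand-ins chosen for brevity, not algorithms prescribed by the text, and all names and data are hypothetical. The predictive function needs labelled examples; the descriptive one sees only the values.

```python
def nearest_neighbour_predict(train, query):
    """Prediction (supervised): return the label of the closest labelled value."""
    _, label = min(train, key=lambda pair: abs(pair[0] - query))
    return label

def two_means(values, iterations=10):
    """Description (unsupervised): group 1-D values into two clusters (k-means, k=2)."""
    centres = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            idx = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[idx].append(v)
        # Keep the old centre if a cluster ends up empty.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups

# Supervised: labels are supplied with the training data.
labelled = [(1.0, "low"), (1.5, "low"), (9.0, "high"), (8.5, "high")]
predicted = nearest_neighbour_predict(labelled, 8.0)

# Unsupervised: no labels, the structure is found in the values themselves.
clusters = two_means([1.0, 1.2, 0.8, 9.0, 9.5, 8.7])
print(predicted, clusters)
```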
In data mining, inductive learning techniques are used to construct a model from training data, on the assumption that the trained model can then be applied to future cases.
Select a data mining algorithm considering all these factors. The algorithm may need to be run several times until a satisfactory result is obtained.
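Running an algorithm repeatedly until the result is satisfactory might look like the loop below. This is a minimal sketch, assuming a toy threshold classifier, a hypothetical validation set, and an arbitrary target accuracy of 0.9; none of these come from the text.

```python
def accuracy(threshold, data):
    """Share of records where the rule 'value >= threshold' matches the label."""
    correct = sum((x >= threshold) == label for x, label in data)
    return correct / len(data)

def search_threshold(candidates, validation, target=0.9):
    """Try each candidate setting in turn; stop once the target is reached."""
    best = None
    for t in candidates:
        score = accuracy(t, validation)
        if best is None or score > best[1]:
            best = (t, score)
        if score >= target:
            break  # satisfactory result obtained, no further runs needed
    return best

# Hypothetical validation records: (value, true label).
validation = [(1.0, False), (2.0, False), (3.0, True), (4.0, True)]
best_threshold, best_score = search_threshold([0.5, 1.5, 2.5, 3.5], validation)
print(best_threshold, best_score)
```

Scoring each run on data held back from training is what makes the stopping decision meaningful; scoring on the training data alone would reward overfitting.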
Evaluation: With respect to the goals, we interpret and evaluate the…