CHAPTER 1 INTRODUCTION
________________________________________
1.1 Background
Knowledge Discovery and Data Mining are rapidly evolving areas of research at the intersection of multiple application areas and approaches. Today knowledge discovery is required in every field, whether or not it belongs to computing. Loss prediction, cost estimation, and identification of market moves are common application areas where knowledge discovery is essential. Knowledge discovery is not a single process; it is a combination of various session data operations applied in series to extract valuable information from a dataset. Data Mining is the core part of the Knowledge Discovery Process. Data Mining is the intelligent
The tools use different data mining techniques and algorithms. Examples of such data mining tools are SPSS, Clementine, SAS E-Miner, and Salford CART (commercial software), and YALE and WEKA (free software).
1.1.2 Main reason for growth of Data Mining Research
The amount of digital data has been exploding during the past decade, while the number of scientists, engineers and analysts available to analyze the data has been static. Bridging this gap requires solving fundamentally new research problems, which can be grouped into the following broad challenges: (a) developing algorithms and systems to mine large, massive and high-dimensional datasets, (b) developing algorithms and systems to mine new types of data, (c) developing algorithms, protocols and other infrastructure to mine distributed data, (d) improving the ease of use of data mining systems, and (e) developing appropriate privacy and security models for data mining. To respond to these challenges, we require applied, multidisciplinary and interdisciplinary research in data mining and knowledge discovery (Soman et al., 2008).
1.1.3 Introduction to Data Mining
Data Mining is considered an effective knowledge process for analyzing data values so that knowledge can be discovered from application data. These kinds of data mining operations are defined as low-level Data Mining. It is the process of discovering interesting knowledge and significant structures from large amounts of data stored in data warehouses or other information storage.
Data mining is another concept closely associated with large databases such as clinical data repositories and data warehouses. However, data mining, like several other IT concepts, means different things to different people. Health care application vendors may use the term data mining when referring to the user interface of the data warehouse or data repository. They may, for example, refer to the ability to drill down into data as data mining. Used more precisely, however, data mining refers to a sophisticated analysis tool that automatically discovers patterns among data in a data store. Data mining is an advanced form of decision support. Unlike passive query tools, the data mining analysis tool does not require the user to pose individual, specific questions to the database. Instead, this tool is programmed to look for and extract patterns, trends and rules. True data mining is currently used in the business community for marketing and predictive analysis (Stair & Reynolds, 2012). This analytical data mining is, however, not currently widespread in the health care community.
Data mining software allows users to analyze large databases to solve business decision problems. Data mining is, in some ways, an extension of statistics, with a few
Data mining uses computer-based technology to evaluate data in a database and identify different trends. Effective data mining helps researchers predict economic trends and pinpoint sales prospects. The data to be mined is stored in data warehouses, which are sophisticated customer databases that allow managers to combine data from several different organizational functions.
What is data mining? Data mining is the derivation of new information from massive amounts of data in databases (Sauter, 2014, p. 148). Chowdhurry argues that data mining is part of KDD. KDD, or knowledge discovery in databases, is a process that includes data mining. In addition to data mining, KDD includes data preparation, modeling and evaluation. KDD is at the heart of this research field, which is multidisciplinary and includes data visualization, machine learning, database technology, expert systems and statistics. Overall, the use of case-based reasoning and data mining tools within an information system would create a CBR system that solves new problems with adapted solutions and could be used in many industries such as education and healthcare (Chowdhurry,
Data mining is a class of database applications that looks for hidden patterns in a group of data that can be
Data mining is a very important component of today's big data landscape [22, 23]. Data mining is essential for everyone from large businesses to government organizations. It helps to identify trends and patterns and to make predictions by exploring, comparing, researching and analyzing data.
1) Data mining is a way for companies to develop business intelligence from their data to gain a better understanding of their customers and operations and to solve complex organizational problems.
Although data mining is still in its infancy, companies in a wide range of industries –
Today, with the ever-growing use of computers in the world, information is constantly moving from one place to another. What this information is, who it is about, and who is using it will be discussed in the following paper. The collection and interpretation of this information, and the determination of its use, have come to be known as data mining. The term data mining has been around only for a short time, but the actual collection of data has been happening for centuries. The following paragraph will give a brief description of this history of data collection.
Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.
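The idea of finding correlations among fields can be illustrated with a minimal sketch. The field names, the sample values and the 0.8 threshold below are assumptions made only for illustration; a real relational table would have many more fields and rows, and Pearson correlation is just one of many possible measures of a relationship.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant field has no linear relationship to report
    return cov / sqrt(var_x * var_y)


def strongest_pairs(table, threshold=0.8):
    """Return (field_a, field_b, r) for every pair of fields with |r| >= threshold."""
    result = []
    for a, b in combinations(sorted(table), 2):
        r = pearson(table[a], table[b])
        if abs(r) >= threshold:
            result.append((a, b, round(r, 3)))
    return result


# Hypothetical table stored column by column (one list per field).
records = {
    "ad_spend": [10, 20, 30, 40, 50],
    "revenue": [110, 205, 290, 410, 495],
    "returns": [5, 3, 6, 4, 5],
}

for field_a, field_b, r in strongest_pairs(records):
    print(field_a, field_b, r)
```

With this sample data only the ad_spend and revenue pair passes the threshold; the other two pairs are only weakly correlated and are filtered out.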
In its infancy, data mining was as limited as the hardware being used. Large amounts of data were difficult to analyze because the hardware simply could not handle it [1]. The term "data mining" first began appearing in the 1980s, largely within the research and computer science communities. In the 1990s it was considered a subset of a process called Knowledge Discovery in Databases, or KDD [1]. KDD analyzes data in the search for patterns that may not normally be recognized with the naked eye. Today however, data mining does not limit itself to databases,
Data mining is when a financial analyst gathers consumer information and looks for patterns that a business can exploit. A simplified data mining example is a restaurant manager who knows the local yearly convention schedule from experience. The manager can cross-reference that information with historical sales results to forecast such things as profit or labor demand. With this information, the manager can estimate an advertising budget or hire temporary staff to handle the anticipated workload. When medium to large-sized businesses use data mining, they uncover these same information points; however, revenue gains can range from millions to billions of dollars. There are several techniques that firms frequently employ to find gold in information.
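The restaurant example can be sketched in a few lines. Everything in the block below is hypothetical: the monthly history, the sales figures, and the assumption that one staff member can handle 8000 in monthly sales are invented for illustration, and a simple average stands in for whatever forecasting method a manager would actually use.

```python
from math import ceil
from statistics import mean

# Hypothetical history: (month, convention_in_town, monthly_sales).
history = [
    ("2022-03", True, 52000),
    ("2022-04", False, 31000),
    ("2022-10", True, 48000),
    ("2022-11", False, 29000),
    ("2023-03", True, 56000),
    ("2023-04", False, 33000),
]


def average_sales_by_event(rows):
    """Cross-reference the convention flag with sales: (avg with event, avg without)."""
    with_event = [sales for _, has_event, sales in rows if has_event]
    without_event = [sales for _, has_event, sales in rows if not has_event]
    return mean(with_event), mean(without_event)


def staff_needed(expected_sales, sales_per_staff=8000):
    """Staff required for the expected sales, rounded up to a whole person."""
    return ceil(expected_sales / sales_per_staff)


event_avg, normal_avg = average_sales_by_event(history)
extra_staff = staff_needed(event_avg) - staff_needed(normal_avg)
print("Temporary staff to hire for a convention month:", extra_staff)
```

The same cross-referencing step applies at larger scale; only the volume of data and the sophistication of the forecast change.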
Companies and organizations all over the world are adopting data mining and data warehousing to keep a competitive edge. Continually improving competitiveness and business processes is a key factor in expanding a business and in maintaining the most cost-effective operations in today's market. Every day these organizations store large amounts of data in order to increase revenue, reduce costs, understand customer behavior patterns, and predict possible future trends, such as seasonal ones. Data
Data mining allows companies to focus on the more important information in their data warehouses. Data mining can be broken down into two major categories: automated prediction of trends and behaviors, and automated discovery of previously unknown patterns. In the first category, data mining automates the process of finding predictive information in large databases. Questions that traditionally required exhaustive hands-on analysis can now be answered quickly and directly from the data. In the second category, data mining tools sweep through databases and identify previously hidden patterns in one step. This second category is where the major focus of research has been.
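The two categories can be contrasted with a small sketch. The techniques chosen here are assumptions made for illustration and are not named in the text: a least-squares trend line stands in for automated prediction, and counting item pairs that occur together stands in for automated pattern discovery. The sample figures and basket contents are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations


def fit_line(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ...

    Returns (slope, intercept)."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, mean_y - slope * mean_x


def predict_next(ys):
    """Category 1 (prediction): extrapolate the fitted trend one step ahead."""
    slope, intercept = fit_line(ys)
    return slope * len(ys) + intercept


def frequent_pairs(transactions, min_support=0.5):
    """Category 2 (discovery): item pairs present in at least min_support of baskets."""
    counts = Counter()
    for basket in transactions:
        counts.update(combinations(sorted(set(basket)), 2))
    total = len(transactions)
    return {pair: c / total for pair, c in counts.items() if c / total >= min_support}


# Hypothetical inputs.
monthly_sales = [100, 120, 140, 160]
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk"],
]

print("Forecast for next month:", predict_next(monthly_sales))
print("Frequent pairs:", frequent_pairs(baskets))
```

In the first function the analyst knows in advance what is being asked (next month's sales); in the second, no question is posed and the pairs emerge from the data, which is the distinction the paragraph above draws.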