CHAPTER 2
2. Background and Literature Review
The purpose of this chapter is to review the topics, areas and works related to the research presented here. We conduct a brief but comprehensive review of data mining and association rule mining approaches and techniques, followed by a focus on interestingness and quality, and on redundancy issues related to association rule mining. This review lays the groundwork for our research and the proposals made here.
2.1 Data mining
Data mining techniques are the result of a long process of research and product development in the area of databases. This evolution began when business data was first stored on computers, continued with improvements in data access and, more recently, produced technologies that allow users to navigate through their data in real time. Data mining is an approach that helps to extract important information from a large database. It is the process of sorting through huge amounts of data and picking out relevant information through the use of certain advanced algorithms. As more data is collected, with the amount of data doubling every year, data mining is becoming an increasingly important tool for converting this data into information. Data mining takes this evolutionary process beyond retrospective data access and navigation to prospective and proactive information delivery. Data mining is very useful and ready for application in the business
Data mining is another concept closely associated with large databases such as clinical data repositories and data warehouses. However, data mining, like several other IT concepts, means different things to different people. Health care application vendors may use the term data mining when referring to the user interface of the data warehouse or data repository. They may, for example, refer to the ability to drill down into data as data mining. Used more precisely, however, data mining refers to a sophisticated analysis tool that automatically discovers patterns among data in a data store. Data mining is an advanced form of decision support. Unlike passive query tools, the data mining analysis tool does not require the user to pose individual, specific questions to the database. Instead, this tool is programmed to look for and extract patterns, trends and rules. True data mining is currently used in the business community for marketing and predictive analysis (Stair & Reynolds, 2012). This analytical data mining is, however, not currently widespread in the health care community.
Data mining software allows users to analyze large databases to solve business decision problems. Data mining is, in some ways, an extension of statistics, with a few
As stated above, data mining is often used to solve business decision problems: “it provides ways to quantitatively measure what business users should already know qualitatively” (Linoff, 2004). A growing number of industries are using data mining to become more competitive in their market, primarily by focusing on customers: strengthening customer relationships and increasing customer acquisition.
With the increased and widespread use of technology, interest in data mining has grown rapidly. Companies now use data mining techniques to examine their databases, looking for trends, relationships and outcomes that can enhance their overall operations, and to discover new patterns that may allow them to serve their customers better. Data mining provides numerous benefits to businesses, government and society as well as to individuals. However, like many technologies, data mining also has negative effects, such as the invasion of privacy. This paper tries to explore the advantages as well as the disadvantages of data mining. In addition, the ethical and global issues regarding the use of data mining
In the data mining process, we can identify patterns in the data that are hard to find using normal analysis. Several mathematical and statistical algorithms are used in this approach to determine the probability of an event or scenario. In technical terms, the main aim of this process is to find correlations among the attributes. A great deal of research is being carried out in this field, creating considerable scope and many jobs in this area. Several data mining algorithms exist that can identify different features present in the data, which can support prediction and future analysis. The main study report would consist of these algorithms that could help us predict and some sample data that we
Data Mining is defined as extracting information from huge sets of data. In other words, we can say that data mining is the procedure of mining knowledge from data. There is a huge amount of data available in the Information Industry. This data is of no use until it is converted into useful information. It is necessary to analyse this huge amount of data and extract useful information from it. Extraction of information is not the only process we need to perform; data mining also involves other processes such as Data Cleaning, Data Integration, Data Transformation, Data Mining, Pattern Evaluation and Data Presentation [12].
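The processing steps listed above can be sketched as a small pipeline. This is only an illustrative sketch under our own assumptions: the two record sources, the function names and the `min_count` threshold are hypothetical and do not come from [12], and the "mining" step here is a simple item count rather than a full data mining algorithm.

```python
from collections import Counter

# Hypothetical raw records from two sources: (transaction id, item).
# None marks a missing value.
source_a = [("t1", "Bread"), ("t1", "milk "), ("t2", "bread"), ("t2", None)]
source_b = [("t3", "MILK"), ("t3", "bread"), ("t3", "butter")]


def clean(records):
    """Data cleaning: drop records with a missing item."""
    return [(tid, item) for tid, item in records if item is not None]


def integrate(*sources):
    """Data integration: merge several sources into one list."""
    return [record for source in sources for record in source]


def transform(records):
    """Data transformation: normalise item names, group them by transaction."""
    transactions = {}
    for tid, item in records:
        transactions.setdefault(tid, set()).add(item.strip().lower())
    return transactions


def mine(transactions):
    """Data mining: count in how many transactions each item occurs."""
    return Counter(item for items in transactions.values() for item in items)


def evaluate(counts, min_count):
    """Pattern evaluation: keep only items that occur often enough."""
    return {item: n for item, n in counts.items() if n >= min_count}


def present(patterns):
    """Data presentation: format the surviving patterns for a reader."""
    return [f"{item}: {n} transactions" for item, n in sorted(patterns.items())]


records = integrate(clean(source_a), clean(source_b))
patterns = evaluate(mine(transform(records)), min_count=2)
report = present(patterns)
```

Each function corresponds to one of the named steps, so any stage can be replaced by a more realistic implementation without changing the others.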
Association rule mining is one of the most important and most studied techniques in data mining. Many researchers have proposed ideas for finding frequent itemsets (pattern sets). Frequent patterns and closed frequent patterns are two techniques whose major objective is to reduce the set of extracted patterns to a smaller, more interesting subset. Frequent pattern mining often generates a large number of frequent patterns, which poses a big challenge for the visualization, comprehension and further analysis of the generated patterns. Mining a large number of useless frequent patterns requires more space and more time, and thus leads to high cost as well as ineffectiveness. This creates the need to find a small number
In 2001, when the use of data mining in marketing was a relatively new concept, Shaw, Subramaniam, Tan and Welge gave an insight into the management of large databases using data mining techniques. They introduced the concept of extracting useful information from large customer databases by identifying hidden patterns. They integrated data mining and marketing knowledge management to support marketing decisions.
Several algorithms have been proposed for top-k association rule mining. However, most of them do not use the standard definition of an association rule. For instance, KORD finds rules with a single item in the consequent, while the algorithm of You et al. mines association rules from a stream instead of a transaction database. To the best of our knowledge, only TopKRules discovers top-k association rules based on the standard definition of an association rule (with multiple items, in a transaction database). The main idea that characterizes this algorithm is that it defines the task of mining the
Association rule mining is one of the most important techniques in data mining. Data mining is used to extract the required information from a larger body of data. Association rules are mainly used in areas such as telecommunication networks, risk management, etc.
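To make the notion of an association rule concrete, the following minimal sketch enumerates rules X -> Y over a toy transaction database using the standard support and confidence measures. The transactions, thresholds and function names are assumptions made for illustration only, and the brute-force enumeration is not one of the algorithms reviewed in this chapter; it is practical only for very small item sets.

```python
from itertools import combinations

# Hypothetical transaction database: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def mine_rules(min_support, min_confidence):
    """Enumerate rules X -> Y by brute force over all candidate itemsets."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            sup = support(itemset)
            if sup < min_support:
                continue  # not a frequent itemset, so no rules from it
            # Split the frequent itemset into antecedent X and consequent Y.
            for k in range(1, size):
                for antecedent in combinations(sorted(itemset), k):
                    x = frozenset(antecedent)
                    # confidence(X -> Y) = support(X and Y) / support(X)
                    confidence = sup / support(x)
                    if confidence >= min_confidence:
                        rules.append((set(x), set(itemset - x), sup, confidence))
    return rules


rules = mine_rules(min_support=0.5, min_confidence=0.6)
for x, y, sup, conf in rules:
    print(f"{sorted(x)} -> {sorted(y)} (support={sup:.2f}, confidence={conf:.2f})")
```

Each rule is reported with its support (how often X and Y occur together) and its confidence (how often Y occurs in the transactions that contain X), the two measures on which interestingness is usually judged first.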
A predictive data mining model identifies patterns in historical information in order to make predictions about new, incoming data. This kind of predictive modelling is very useful for decision making in business. A descriptive model, on the other hand, describes the data efficiently by grouping it with data mining techniques such as clustering and association rules.
Data mining is the process of extracting knowledge from large data sets. It uses artificial intelligence methods to discover hidden relationships in the huge amounts of data that are collected. It has great potential to improve applications in many fields, such as healthcare systems, customer relationship management, financial banking, research analysis, bioinformatics, marketing analysis, education, manufacturing engineering, criminology and many more. Criminology is the study of crime, and a criminologist’s job typically includes analyzing data to determine why a crime was committed and, more importantly, to predict and prevent criminal behavior in the future. It has become an interesting field in which to apply data mining techniques because of its large datasets and the complexity of the relationships within the data. This paper will discuss some of the tools and techniques used in this field to find important information that will help and support police forces and reduce social nuisance.
Data mining and knowledge discovery is the name frequently used to refer to a highly interdisciplinary field, which consists of using methods from several research areas to extract knowledge from real-world datasets. There is a distinction between the terms data mining and knowledge discovery, which seems to have been introduced by [Fayyad et al. 1996]. The term data mining refers to the core step of a broader process, called knowledge discovery in databases. The architecture of a data mining system is shown in the following figure.
Data mining is the process of extracting useful knowledge from large databases or data warehouses. It can also be described as a set of mathematical functions and data manipulation techniques for extracting useful data from databases. In other words, data mining can be regarded as a knowledge discovery process. It turns a large collection of data into meaningful patterns and rules, based on queries that users express in a data mining query language. The meaningful patterns and rules are generated by analysing the database. Data mining makes use of several techniques, such as clustering, classification and association rule mining, to generate meaningful patterns from databases. The purpose of this report is to describe how data are prepared for data
data is also growing. This has resulted in large amounts of data being stored in databases, warehouses and other repositories. Data mining therefore comes into play to explore and analyse these databases and to extract interesting and previously unknown patterns and rules, a task known as association rule mining.