Data mining: Current issues and challenges
Abstract:- Data mining has attracted considerable interest in the information industry and among the public in recent years, owing to the wide availability of massive amounts of data and the pressing need to turn such data into useful information and knowledge. This information and knowledge can be used for applications ranging from market analysis and fraud detection to other investigations. The aim of this paper is to explore the role of data mining in information extraction and its importance in the current world. A prolific literature has been devoted to this research and substantial advances have been made, ranging from efficient algorithms for frequent itemset mining in transaction databases to various research frontiers such as associative classification, correlation mining and pattern-based clustering, as well as their wide applications. This paper also presents the current issues and challenges that data mining faces.
Key words: - Data mining, data, information extraction.
Introduction
Data mining is the discovery of patterns and regularities in large databases to guide decisions about future activities. Data mining tools are expected to derive the model with minimal input from the user. Data mining is the use of automated data analysis techniques to uncover previously undetected relationships among data items. It regularly determines the
Abstract: Data mining algorithms determine how the cases for a data mining model are analyzed. Data mining model algorithms provide the decision-making capabilities needed to classify, segment, associate and analyze data, processing the data mining columns that provide predictive, variance, or probability information about the case set. With an enormous amount of data stored in databases and data warehouses, it is increasingly important to develop powerful tools for analyzing such data and mining interesting knowledge from it. Data mining is the process of inferring knowledge from such large volumes of data. Data mining has three major components: clustering or classification, association rules, and sequence analysis.
As computer and database technologies develop rapidly, data accumulates at a speed unmatched by the human capacity for data processing [2]. Data mining, as a multidisciplinary joint effort of databases, machine learning and statistics, is leading the way in turning mountains of data into nuggets. Researchers and practitioners realize that, in order to use data mining tools effectively, data processing is essential to successful data mining.
Primitive: These are features which have an influence on the output and whose role cannot be assumed by the rest. [1]
Both data mining and data analysis are subsets of Business Intelligence, which also includes data management systems, data warehouses and online analytical processing (OLAP). To manage the mountains of information, the data is stored in a warehouse of information gathered from different sources, including corporate databases, summarized data from internal systems, and information from external
Data mining is a sub-branch of computer science that draws on statistics, algorithms and database systems to find patterns in large data sets. Generally, data mining is the process of examining data from different perspectives and summarizing it into meaningful information. Data mining techniques describe behaviour and future trends, allowing any individual to make better, knowledge-driven decisions. [1][2]
The data stored in a data warehouse will not provide any benefit to the organization until the hidden information is extracted from it. There are various ways to extract the data; however, data mining is the best method for obtaining meaningful trends and patterns from the data warehouse. Therefore, data mining can be defined as the method of extracting valid, comprehensible and previously unknown information from large data stores for use in the decision-making process (P. 1233).
Data mining is the process used to analyse large quantities of data and gather useful information from them. It extracts hidden information from large heterogeneous databases along many different dimensions and finally summarizes it into categories and relations of data. Clustering and classification are the two main techniques of data mining, followed by association rules, prediction, estimation and regression. Many fields apply data mining, such as games, business, surveillance, science and engineering.
Abstract- Data mining is one of the most widely used and interesting research areas. Association rule mining is one of the important research techniques in the data mining field. Many algorithms for mining association rules have been proposed on the basis of the Apriori algorithm and on improvements to its strategy, but most of these algorithms do not concentrate on the structure of the database. The proposed technique includes transposition of the database, with a further enhancement of this particular transposition technique. This approach reduces the total number of scans over the database and therefore the time needed to generate the association rules.
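The idea of working on a transposed database can be illustrated with a minimal sketch. It is not the technique proposed above, whose details are not given here; it only shows the general vertical layout, in which each item maps to the set of transaction identifiers that contain it, so that the support of a candidate itemset is obtained by intersecting sets rather than rescanning the transactions. The function names `transpose`, `frequent_itemsets` and `confidence`, and the toy data in the usage note, are assumptions made for illustration.

```python
from itertools import combinations


def transpose(transactions):
    """Turn a horizontal database (list of item sets) into a vertical one
    (item -> set of transaction ids). This is the only scan of the raw data."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search. Support is counted by intersecting
    transaction-id sets instead of rescanning the transactions."""
    vertical = transpose(transactions)
    current = {
        frozenset([item]): tids
        for item, tids in vertical.items()
        if len(tids) >= min_support
    }
    result = {}
    while current:
        result.update({itemset: len(tids) for itemset, tids in current.items()})
        candidates = {}
        # Join pairs of frequent k-itemsets that differ in exactly one item.
        for (a, tids_a), (b, tids_b) in combinations(current.items(), 2):
            union = a | b
            if len(union) == len(a) + 1 and union not in candidates:
                tids = tids_a & tids_b  # transactions containing every item
                if len(tids) >= min_support:
                    candidates[union] = tids
        current = candidates
    return result


def confidence(supports, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent, from stored supports."""
    both = frozenset(antecedent) | frozenset(consequent)
    return supports[both] / supports[frozenset(antecedent)]
```

For example, with the hypothetical transactions `[{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]` and `min_support=2`, the itemset `{bread, milk}` is reported with support 2, while `{milk, butter}` is discarded because only one transaction contains both items.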
The term DM was conceptualised as early as the 1990s as a means of addressing the problem of analysing the vast repositories of data that are available to mankind and are being added to continuously. DM is one of the oldest buzzwords, yet still one of the most interesting. It involves defining associations, patterns or frequent item sets through the analysis of a given data set. Furthermore, the discovered knowledge should be valid, novel, useful and understandable to the user. Many organizations underutilize their existing databases, not knowing that there is a lot of hidden information waiting to be discovered, i.e. interesting patterns or knowledge in these databases. DM disciplines revolve around statistics, artificial intelligence and pattern recognition. There are two main techniques in DM: reporting and DM techniques. Our study focuses on a semi-automatic DM technique for discovering meaningful relationships in a given data set. No hypothesis is required to mine the data (Jans 09). The technique uses exploratory analysis with no predetermined notions about what will constitute an "interesting" outcome (Kantardzi 02).
A data mining prediction model works by identifying patterns in historical information in order to make predictions for new incoming data sets. Such predictive modelling is very useful for decision making in business models. A descriptive model, on the other hand, describes the data in an efficient way by grouping it, using the clustering and association rule principles of data mining.
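The contrast between the two kinds of model can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: the predictive side is a one-nearest-neighbour rule over labelled historical values, and the descriptive side is a one-dimensional k-means with two groups. Both functions and the sample values are hypothetical and are not taken from the text above.

```python
def predict_nearest(history, x):
    """Predictive: give a new value x the label of the closest
    historical (value, label) pair."""
    _, label = min(history, key=lambda pair: abs(pair[0] - x))
    return label


def two_means(values, iterations=10):
    """Descriptive: split unlabelled 1-D values into two groups
    (k-means with k=2, centroids initialised at the min and max)."""
    low, high = min(values), max(values)
    near_low, near_high = list(values), []
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not near_high:  # all values identical: nothing to split
            break
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high
```

The predictive function needs labelled history, such as `[(20, "low"), (25, "low"), (90, "high"), (110, "high")]`, and answers a question about one new record. The descriptive function needs no labels and only summarises how the existing values fall into groups.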
Data Mining is defined as extracting information from huge sets of data. In other words, we can say that data mining is the procedure of mining knowledge from data. There is a huge amount of data available in the Information Industry. This data is of no use until it is converted into useful information. It is necessary to analyse this huge amount of data and extract useful information from it. Extraction of information is not the only process we need to perform; data mining also involves other processes such as Data Cleaning, Data Integration, Data Transformation, Data Mining, Pattern Evaluation and Data Presentation [12].
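The sequence of steps listed above can be sketched as a small pipeline. This is a toy illustration, not a reference implementation: the record layout (`region`, `amount`), the threshold of 100 and every function name are assumptions introduced for the example, and the "mining" step is reduced to counting co-occurring attribute values.

```python
from collections import Counter


def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]


def integrate(*sources):
    """Data integration: merge several sources into one collection."""
    return [record for source in sources for record in source]


def transform(records):
    """Data transformation: discretise the numeric amount into two bands."""
    return [
        {"region": r["region"], "band": "high" if r["amount"] >= 100 else "low"}
        for r in records
    ]


def mine(records):
    """Data mining: count how often each (region, band) pair occurs."""
    return Counter((r["region"], r["band"]) for r in records)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns seen at least min_count times."""
    return {p: c for p, c in patterns.items() if c >= min_count}


def present(patterns):
    """Data presentation: render the surviving patterns as readable lines."""
    return [f"{region}/{band}: {count}"
            for (region, band), count in sorted(patterns.items())]


def run_pipeline(sources, min_count):
    cleaned = [clean(source) for source in sources]
    merged = integrate(*cleaned)
    return present(evaluate(mine(transform(merged)), min_count))
```

Keeping each step as a separate function mirrors the list in the text, so any stage can be replaced, for instance with a real mining algorithm, without touching the others.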
Companies and organizations all over the world are adopting data mining and data warehousing in an effort to keep a competitive edge. Continually improving competitiveness and business processes is a key factor in expanding a business and in maintaining a high standard by the most cost-effective means in today’s market. Every day these organizations store large amounts of data in order to increase revenue, reduce costs, understand customer behavior patterns, and predict possible future trends, for example seasonal ones. Data
data is also growing. This has resulted in large amounts of data stored in databases, warehouses and other repositories. Data mining therefore comes into play to explore and analyse the databases and to extract interesting and previously unknown patterns and rules, a task known as association rule mining
Mining valuable patterns in different data streams has been a significant research area in data mining during the last decade. Several techniques have been proposed for mining patterns from text documents. However, how to discover patterns effectively remains an open issue in data mining research, including the text mining area. Most of the popular methods in text mining use a term-based methodology, which suffers from problems such as synonymy and polysemy. Some research on text mining suggests that the pattern-based or phrase-based approach performs better than the term-based approach, but there is no concrete evidence to prove this point. The
The overall goal of the data mining process is to extract information from data sets and transform it into an understandable structure such as patterns and knowledge for further use [3].
A data stream is a real-time, continuous, structured sequence of data items. Mining data streams is the process of extracting knowledge from continuous, rapid data records. Because the data arrives rapidly, mining it is a very difficult task. Stream mining algorithms typically need to be designed to work with a single pass over the data. Data streams pose a computational challenge for data mining because of the additional algorithmic constraints created by the large volume of data. In addition, the problem of temporal locality leads to a number of unique mining challenges in the data stream case. Data mining techniques, namely clustering, classification and frequent pattern mining, are applied to extract the knowledge