Data Mining

16277 Words Jul 22nd, 2010 66 Pages
Abstract. Data mining is concerned with analysing large volumes of (often unstructured) data to automatically discover interesting regularities or relationships, which in turn lead to a better understanding of the underlying processes. The field of temporal data mining is concerned with such analysis in the case of ordered data streams with temporal interdependencies. Over the last decade, many interesting techniques of temporal data mining have been proposed and shown to be useful in many applications. Since temporal data mining brings together techniques from different fields such as statistics, machine learning and databases, the literature is scattered among many different sources. In this article, we present an overview of techniques
Often temporal data mining methods must be capable of analysing data sets that are prohibitively large for conventional time series modelling techniques to handle efficiently. Moreover, the sequences may be nominal-valued or symbolic (rather than being real or complex-valued), rendering techniques such as autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) modelling inapplicable. Also, unlike in most applications of statistical methods, in data mining we have little or no control over the data gathering process, with data often being collected for some entirely different purpose. For example, customer transaction logs may be maintained from an auditing perspective and data mining would then be called upon to analyse the logs for estimating customer buying patterns.
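As a rough illustration of what analysing symbolic sequences can look like, the following Python sketch counts which item tends to immediately follow which other item across a set of transaction logs. The function name `frequent_bigrams`, the `min_support` threshold and the sample logs are all hypothetical and chosen only for this example; this is one simple pattern-counting idea under those assumptions, not a method taken from the article.

```python
from collections import Counter


def frequent_bigrams(sequences, min_support):
    """Return ordered pairs (a, b), with b immediately following a,
    that occur in at least `min_support` distinct sequences."""
    support = Counter()
    for seq in sequences:
        # A set ensures each sequence contributes at most once per pair.
        pairs = {(a, b) for a, b in zip(seq, seq[1:])}
        support.update(pairs)
    return {pair: n for pair, n in support.items() if n >= min_support}


# Hypothetical logs: one time-ordered list of purchased items per customer.
logs = [
    ["bread", "milk", "eggs"],
    ["bread", "milk", "butter"],
    ["milk", "eggs", "bread"],
]

patterns = frequent_bigrams(logs, min_support=2)
for (first, second), count in sorted(patterns.items()):
    print(f"{first} -> {second}: seen in {count} sequences")
```

Because the items are plain symbols, no numeric model such as ARMA is involved; the sketch only counts co-occurring orderings, which is why it still works on nominal-valued data.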
The second major difference (between temporal data mining and classical time series analysis) lies in the kind of information that we want to estimate or unearth from the data. The scope of temporal data mining extends beyond the standard forecast or control applications of time series analysis. Very often, in data mining applications, one does not even know which variables in the data are expected to exhibit any correlations or causal relationships.
Furthermore, the exact model parameters (e.g. coefficients of an ARMA model or the weights of a neural network) may be of
