Data Mining Research On Text Mining


Mining valuable patterns in data streams has been a significant research area in data mining over the last decade. Several data mining techniques have been proposed for mining patterns from text documents. However, how to discover those patterns effectively remains an open issue in data mining research, including text mining. Most popular text mining methods use a term-based methodology, which suffers from problems such as synonymy and polysemy. Some text mining research suggests that pattern-based or phrase-based approaches perform better than the term-based approach, but there is no concrete evidence to prove this point. The …

Several term-based approaches have been proposed in information retrieval to solve the problem of finding exact features in text documents, including Support Vector Machines (SVM), probabilistic models, and filtering models [4]. Term-based models, however, suffer from synonymy (multiple words that have the same meaning) and polysemy (a word with several meanings). As a result, it is sometimes uncertain what the user actually requires. Text mining is a method of discovering previously unknown, complicated, and potentially valuable information from text documents.
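The synonymy and polysemy problems described above can be illustrated with a minimal sketch. It assumes a plain bag-of-words representation scored with Jaccard overlap, which is only one simple stand-in for a term-based model and is not taken from the cited work. The function names and the toy sentences are hypothetical, invented for illustration.

```python
import re

def term_set(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Jaccard overlap between two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical toy sentences, invented for illustration only.
query = "buy a car"
doc_synonym = "purchase an automobile"        # same meaning, different terms
doc_polysemy = "a car of the freight train"   # shares "car" in another sense

synonym_score = jaccard(term_set(query), term_set(doc_synonym))
polysemy_score = jaccard(term_set(query), term_set(doc_polysemy))

# The relevant document shares no terms with the query, while the
# off-topic one does, so pure term matching ranks them the wrong way round.
print(synonym_score, polysemy_score)
```

In this sketch the document that means the same thing as the query scores zero because it shares no surface terms, while the document that uses "car" in a different sense scores higher. That is the kind of mismatch between term overlap and user intent that the paragraph above attributes to term-based models.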
Some research uses pattern mining techniques to overcome these issues related to phrase mining techniques. However, there are two significant issues when considering the pattern-based mining approach over the phrase-based approach: misinterpretation and low frequency [1 – Effective Pattern Discovery]. Misinterpretation in pattern mining means that the measures used are not appropriate for matching the discovered patterns to the user's needs. Frequency in pattern mining categorizes whether a given pattern is high-frequency or low-frequency. The major issue here is to decide how to use the discovered patterns that could be
