Applications for Big Data Analysis

In the last few years, the global marketplace has seen exponential growth in data volume. Every day, people create large unstructured datasets of different types, such as GPS coordinates, payment transactions, web data, e-mails, and smart meter values, which are collectively termed ``big data'' \cite{nasscom}. Deriving useful information from such data requires specific tools based on techniques such as data mining, statistics, artificial intelligence, neural networks, and other advanced analytics methods \cite{russom}. Big data analysis is widely used in insurance; in medicine, for disease prediction and improved health outcomes; in industry, for sales prediction and customer relationship optimization; and in transport \cite{oreilly, kinsey}. A wide range of paid and open-source tools and techniques is available for big data analytics: statistical analysis, online analytical processing (OLAP) tools \cite{dwh}, data warehouses (DWH) \cite{dwh}, distributed programming models (e.g., MapReduce \cite{mapreduce}), clouds \cite{cloudcomputing}, complex event processing \cite{cep}, etc. \cite{russom}. The objective of the proposed research is to evaluate different applications for big data analysis using store sales benchmarks, with a focus on performance, and to compare their applicability in this context. Two open source applications such as KNIME \cite{knime, rosaria, berthold} and WEKA \cite{weka, hallweka} and two open-source software packages: R language package