III. RELATED WORK
One line of work provides an approach for research efforts towards developing highly scalable and autonomic data management systems, together with the associated programming models, for processing Big Data. Such systems should address challenges related to data analysis algorithms, real-time processing and visualisation, context awareness, data management, performance and scalability, correlation and causality, and, to some extent, distributed storage [1]. Another provides a framework for evaluating big data initiatives [2]. A third summarizes the opportunities and challenges of big data. Recent technological advances and novel applications, such as sensors, cyber-physical systems, smart mobile devices, cloud systems, data analytics, and social networks, are making it possible to capture, process, and share huge amounts of data (referred to as big data), to extract useful knowledge such as patterns from this data, and to predict trends and events. Big data is making possible tasks that were previously impossible, such as preventing the spread of disease and crime, personalizing healthcare, quickly identifying business opportunities, managing emergencies, and protecting the homeland [3]. A further study covers sources of structured and unstructured big data. Unstructured data is everywhere. In fact, most individuals and organizations conduct their lives around unstructured data [4]. Successful decision-making will increasingly be driven by analytics-generated
Every day, we produce 2.5 quintillion bytes of data. Ninety percent of all data in the world was produced in the past two years. Data has been around forever; we have always gathered information. Paleolithic cavemen recorded their activities by carving them in stone or notching them in sticks. Egyptians used hieroglyphics to record significant events in history. The Library of Alexandria was home to half a million scrolls of the ancient world. Less than a hundred years ago, we used punch cards to record and store information. As technology continues to evolve, the amount of data we store continues to grow. We’ve come a long way since stone tablets, scrolls, and punch cards. It’s important to understand the concept of big data and the impact it has created. This paper will define the classifications of data, explain the challenges of big data, and describe how big data analytics is being used in today’s data-driven world.
Big Data is a broad term for data sets so large or complex that they are very difficult to process using traditional data processing applications. Challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and information privacy. In common usage, the term big data has largely come to refer simply to the use of predictive analytics. Big data also denotes a set of techniques and technologies that require new forms of integration to uncover large hidden value in datasets that are diverse, complex, and of a massive scale. When big data is effectively and efficiently captured, processed, and analyzed, companies
The emergence of big data has provided different avenues for organizations to use data to improve different aspects of their operations. Be it customer service, research and development, or market position, Big Data has the potential to be a significant driving force in all these areas. However, there is still a significant gap between the ability of Big Data to produce insightful analytical information based on real-time data and the ability of organizations to capture and utilize this readily available tool. This is, in part, because the systems and processes necessary to maximize the usefulness of Big Data are currently lacking in most organizations. This lack of a conducive habitat for Big Data is further magnified in new organizations without any knowledge of Big Data. Organizations that have little to no knowledge of Big Data need a thorough assessment of its benefits and of how it could improve the organization’s overall place in the market. Steps also need to be taken towards the design of frameworks that will enable the organization to better capture and utilize Big Data.
The amount of data in our world has been rapidly increasing, and analyzing these large data sets, or big data, has become crucial for businesses in increasing their success. Many businesses use big data to model their business structures, control processes, and run the business. The availability of this data leads to a more accurate analysis of the target market. More accurate analyses lead to more confident decision making, and better decisions mean greater operational efficiencies, cost reductions, and reduced risk. There are many ways in which big data can be successfully implemented in an organization. Big data allows businesses to segment their target market, creating more precisely tailored products and services. Big data is also used to conduct controlled experiments to make better management decisions. Finally, big data can unlock value by making the captured information transparent and usable at much higher frequency (Manyika, “Big data: The next frontier for innovation, competition, and productivity”).
Therefore, the following sections discuss the definition of big data, tools for analyzing big data, data mining, knowledge discovery, visualization and collaborative
The ‘Big Data era’ has arrived: multi-petabyte data warehouses, social media interactions, real-time sensory data feeds, geospatial information and other new data sources are presenting organisations with a range of challenges, but also significant opportunities. IDC believes that as CIOs start to adopt the new class of technologies required to process, discover and analyse these massive data sets that cannot be dealt with using traditional databases
Big Data is becoming more meaningful with ever more powerful data technologies, which enable us to derive insights from the data and help us make decisions. Big Data has also created new courses and professional fields, such as data science and the role of data scientist, which are aimed at analyzing the ever-growing volume of data. Some might think this exaggerated because data analysis is, after all, not a new invention. However, we might all agree that the progress of digitization, together with the generation of ever larger amounts of data, has totally changed the ways we deal with data.
Big Data is a term for very large amounts of formal and informal information that can be analyzed to find trends and patterns. The information can be about anything, but it needs to be processed in a way that will give it value and relevance. It can come in multiple formats and from different sources such as large databases, electronic records, social media, mobile phones, apps, wearable devices such as pedometers, and others. Different data sets are combined and contrasted in different ways to give perspectives and insights about a topic. It can be used in a seemingly endless number of ways, and people are discovering new ways to use it all the time, some of them entertaining. The largest areas of use include those relating to consumer behavior and choices, business procedures, healthcare, science and research, and law enforcement. People are also discovering its use in their personal lives for things like buying a home, dating, fitness routines and travel.
In this modern age of constant technological advancement and business innovation, a relatively new feature has unlocked more potential for businesses to improve: Big Data. Big Data can be defined in many different ways. It has become important primarily for business decision making.
In 2012, the concept of ‘Big Data’ became a widely debated issue, as we now live in an information- and Internet-based era in which up to 2.5 exabytes of data (1 exabyte = 1 billion GB) are created every day, and the number is doubling every 40 months (Brynjolfsson & McAfee, 2012). According to recent research from IBM (2012), 90 percent of the data in the world has been created in the last two years alone, and Internet activity in each second today will generate more data than all the data combined in the
“Why Big-Data Is a Big Deal”. Big-data is a term used to describe a massive volume of both structured data (information already managed by the organization in relational databases) and unstructured data (information that is unorganized and does not fall into a pre-determined model) that is so large it is difficult to process using traditional database and software techniques. In most companies, the volume of data is too big, it moves too fast, or it exceeds current processing capacity. Despite these problems, big-data has the potential to help companies improve operations and make faster, more intelligent decisions.
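The distinction between structured and unstructured data drawn above can be illustrated with a minimal Python sketch. The records, the free-text notes, and the function names below are hypothetical examples invented for illustration and do not come from any cited source. The sketch suggests why unstructured data is harder to process: the same total can be read directly from a schema in the structured case, but has to be recovered by pattern matching in the unstructured case.

```python
import re

# Structured data: every record follows the same predefined schema,
# much like rows in a relational table. (Hypothetical sample records.)
structured_orders = [
    {"order_id": 1, "customer": "Ana", "amount": 120.50},
    {"order_id": 2, "customer": "Ben", "amount": 75.00},
]

# Unstructured data: free text with no predefined model.
# (Hypothetical sample notes describing the same two orders.)
unstructured_notes = [
    "Ana called about order 1, she paid 120.50 and wants faster delivery.",
    "Ben emailed: the parcel arrived late, total was 75.00.",
]


def total_structured(orders):
    """Sum the amounts by reading a known field from each record."""
    return sum(order["amount"] for order in orders)


# Assumption for this sketch: amounts always appear with two decimals.
AMOUNT_PATTERN = re.compile(r"\d+\.\d{2}")


def total_unstructured(notes):
    """Sum the amounts by searching the free text for money-like numbers."""
    total = 0.0
    for note in notes:
        for match in AMOUNT_PATTERN.findall(note):
            total += float(match)
    return total


print(total_structured(structured_orders))
print(total_unstructured(unstructured_notes))
```

The structured query depends only on the schema, whereas the unstructured one depends on a guess about how amounts are written; a note that phrases the amount differently would be missed without any error being raised.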
Big data is an extensive collection of structured and unstructured data. The term also covers the modern technology applied to store, manage, and analyze data that cannot be stored, managed, or analyzed with commonly used software or tools. Since modern technologies have taken over so many of our daily tasks and businesses and organizations rely on the internet to operate, the production of data has increased significantly in past
Data storage and management technologies have recently begun to surge in popularity. Businesses want to learn the best ways to store, maintain, and capitalize on the copious amounts of data that their products, consumers, services, etc. generate. That said, organizing and measuring data has proven to be quite difficult despite present-day technological innovations. The term “Big data” has emerged, and Apache Hadoop, or Hadoop, uses a set of algorithms to process large data sets across clusters of computers (Kelly, 2014).
To address the question of which techniques have been used to manage this large amount of data, and how, I reviewed research papers and review articles in the field of Big Data. This paper provides a synthesis of the papers I found relevant to this field. This paper will focus on the following things:
Data itself is useless until it is mined and transformed into a valuable source of knowledge. Because it converts data into useful information, data mining has become a leading technique in many fields worldwide. “Data mining is based on complex algorithms that allow for the segmentation of data to identify patterns and trends, detect anomalies, and predict the probability of various situational outcomes.”[1] Many organizations, from healthcare to multimedia and more, are relying on data and developing through its use. At the same time, data warehousing has changed in pace and dimension along measures such as variety, volume, and velocity. Today, one can see the current trends of data mining in different fields such as social networks, healthcare, and business. As data mining gives those fields the opportunity to advance, "Big Data" is also opening up new doors within itself as new trends emerge.
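To make the anomaly-detection aspect of data mining mentioned in the quotation above more concrete, the following minimal Python sketch flags values that lie far from the mean of a data set. The z-score rule, the threshold of 2.0, the function name, and the sample readings are all assumptions chosen for illustration; they stand in for the "complex algorithms" of the quotation and are not the method of any cited source.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold.

    The z-score of a value is its distance from the mean, measured in
    population standard deviations. The default threshold of 2.0 is an
    illustrative assumption, not a recommended setting.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical daily readings in which one value clearly stands out.
readings = [10, 11, 9, 10, 12, 10, 95]
print(find_anomalies(readings))  # prints [95]
```

A single extreme value inflates both the mean and the standard deviation, so a rule this simple may miss anomalies in small samples; practical data mining systems usually rely on more robust statistics or on learned models.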