Data platforms are digital databases that house, manage, and organize data from a variety of sources. Data platforms began to form with the rise of "Big Data" in the early 2000s. Big Data refers to large volumes of structured and unstructured data. While paper methods were still acceptable for smaller data sets, large sets were becoming harder to handle. Thus, data platforms were created to store data digitally on servers rather than on paper. Data platforms needed several attributes in order to handle Big Data: they had to hold the large volumes of data now being generated, keep up with the speed at which data was being collected, and accommodate the various formats and structures of data.
The Connecticut Data Collaborative provides this data to its communities by hosting an open data portal. From this portal, one can obtain the raw data of specific data sets by downloading available files, or browse statistics through their online user interface. They offer a variety of features to assist a user in locating data of interest, such as location-based comparisons and searching for sources of data. The Boston Data Platform, instead of relying only on government sources, often gets its data from academic research projects, a majority of which come from either Harvard or Northeastern University. The data gathered from these projects are put into a third-party, open-source research data repository known as the Dataverse Project. The Boston Data Platform also has an open data portal from which the public can download raw data, and it additionally includes an interactive map that can be used to examine the demographics of areas of Boston.

Figure 2.8: (Left) Boston Area Research Initiative Logo, Retrieved From: https://www.northeastern.edu/csshresearch/bostonarearesearchinitiative/boston-data-portal/
Figure 2.9: (Right) Connecticut Data Collaborative Logo, Retrieved From: http://ctdata.org/

The Venice Open Data Project is a data platform that the WPI Venice Project Center has worked with in the past. Their team's job was to help make the Venice Open Data Project more available and usable for the Venetian people.
Data and information management is a huge growth area. But it is not just data management creating new job opportunities; gathering, analyzing, storing, and securing the data create them as well.
Meanwhile, data sharing is a large part of Boston's efforts to improve the quality of life for its residents. The City of Boston is eager to grow BAR from a data collection effort into a citywide performance management system accessible in real time from any device. Big data provides an unprecedented opportunity to deliver improved services: it is a chance to analyze large data sets, find the value in the data, and tell the story of the City of Boston in new ways.
Visionary executives are finding opportunities beyond existing and traditional data repositories, such as on-premises CRM and ERP systems. Today's data can include social media posts, customer journey information, Internet of Things (IoT) data, and more.
Today, data is a growing asset that many businesses have difficulty converting into a powerful strategic tool. Companies need help turning this data into valuable insight, which can diminish risk and enhance returns on investment. In short, companies are struggling to make sense of, and obtain value from, their big data.
In 2012, it was estimated that human beings were generating around 2.5 exabytes of data every day, and that number is likely even greater today (McAfee & Brynjolfsson, 2012). Twitter processes on average about 5,700 tweets per second (Twitter Inc, 2013). All of this data is stored in numerous formats, ranging from traditional database tables and spreadsheets to SMS text messages, PDF files, HTML web pages, and more. While the value in capturing and analyzing this data is clear, the solution is not. Traditional data warehouse technologies were not designed for this volume, velocity, and variety of data, which is collectively referred to as big data.
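The scale of these figures can be made concrete with a little arithmetic. The sketch below extrapolates from the tweet rate quoted above; the per-tweet payload size is an illustrative assumption, not a figure from the sources cited.

```python
# Rough scale estimate based on the average tweet rate quoted above.
TWEETS_PER_SECOND = 5_700          # (Twitter Inc, 2013)
SECONDS_PER_DAY = 24 * 60 * 60

tweets_per_day = TWEETS_PER_SECOND * SECONDS_PER_DAY
print(f"Tweets per day: {tweets_per_day:,}")          # roughly 492 million

# Assumed average payload per tweet (text plus metadata) -- an
# illustrative guess, not a published figure.
BYTES_PER_TWEET = 2_000
daily_bytes = tweets_per_day * BYTES_PER_TWEET
print(f"Approx. daily volume: {daily_bytes / 1e12:.2f} TB")
```

Even under this modest assumption, a single social media platform generates on the order of a terabyte per day, which begins to suggest why traditional warehouse technologies struggle.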
There will also be the possibility of choosing data maps in order to generate the pertinent studies from directed demands. The data must be hosted on the IADR website, which will support dissemination of the project and the data collected.
Definite answers regarding how much usable data is actually being generated by city-sized populations, and the best places to get it, could be the subject of a future study, but social media platforms may be the source most easily tapped for this type of data. Employees of local IT departments may have difficulty storing and processing data from social media platforms to use as a foundation for aligning social network analysis with their systems. There is a growing demand for people with the skill sets needed to deal with big data. Local governments (especially smaller ones) may have particular difficulty acquiring these resources, which makes finding a comprehensive solution for combining "big data", SNA, and GIS even more difficult, as data manipulators are in short supply. A survey supporting a Public CIO Special Report ("The Trouble with Big Data Talent," 2013) found that more than half of the government agencies polled were experiencing data-related hiring difficulties, meaning there is a shortage of individuals who possess an engineering background, can model data mathematically, and can help organizations effectively use data insights for decision making.
New data platforms have emerged from these basic needs; one of them is the data management platform (DMP). DMPs have quickly transformed the way companies gather and use information, and the industry is full of strong products that make data ecosystems more usable.
Hortonworks: Sentiment Graphing and Social Graphs (Marketing), Clickstream Analysis (Internet Marketing), Network Security, IT Compliance (HIPAA, Sarbanes-Oxley, etc.), Sensor Data ("Internet of Things"), Predictive Analytics and Proactive Maintenance, Location Data Analysis, Text Analysis (Legal Discovery, Insurance Underwriting, and Application Risk Screening), and Data Hub.
High-quality data are a precondition for using and analyzing big data reliably, and big data quality faces many challenges. The characteristics of big data are the three Vs (Variety, Velocity, and Volume), as explained in the "What is Big Data" section of the paper. Variety indicates that big data contains many different data types; this diversity divides data into unstructured and structured data, which demand much higher data processing capability. Velocity means that data is being generated at unbelievable speed and must be dealt with in an organized and timely manner. Volume refers to the tremendous amount of data being produced.
A Big Data system comprises a number of functional blocks that give the system the capability to acquire data from diverse sources, pre-process (e.g., cleanse and validate) this data, store the data, process and analyze the stored data, and finally present and visualize the summarized and aggregated results. The rest of this article describes various performance considerations for each of the components shown in Fig 1. Refer: Appendices
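The functional blocks listed above can be wired together into a miniature pipeline, as in the sketch below. The record format, validation rule, and in-memory "warehouse" are all illustrative assumptions chosen to keep the example self-contained, not components of any particular system described here.

```python
# Minimal sketch of the functional blocks of a Big Data system:
# acquire -> pre-process (cleanse/validate) -> store -> analyze -> present.

def acquire():
    # Stand-in for diverse sources (files, feeds, sensors).
    return [{"city": "Boston", "value": "42"},
            {"city": "Venice", "value": "17"},
            {"city": "", "value": "oops"}]       # a malformed record

def preprocess(records):
    # Cleansing and validating: drop records with a missing city or a
    # non-numeric value, and cast the value to an integer.
    return [{"city": r["city"], "value": int(r["value"])}
            for r in records
            if r["city"] and r["value"].isdigit()]

def store(records, warehouse):
    # Stand-in for a storage layer: an in-memory dict keyed by city.
    for r in records:
        warehouse.setdefault(r["city"], []).append(r["value"])
    return warehouse

def analyze(warehouse):
    # Aggregation: total value per city.
    return {city: sum(vals) for city, vals in warehouse.items()}

def present(summary):
    # Presentation: a sorted, human-readable summary.
    for city, total in sorted(summary.items()):
        print(f"{city}: {total}")

warehouse = store(preprocess(acquire()), {})
summary = analyze(warehouse)
present(summary)
```

Each function here is a placeholder for an entire subsystem in a real deployment, but the data flow between the blocks is the same one the paragraph above describes.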
Big data is not just about having large amounts of data; it also refers to the complexity of data sets fetched from multiple sources, where traditional data processing methods are not sufficient and advanced computing tools and technologies, such as distributed processing and cloud technology, are required. In general, big data excels in three areas: Volume, Variety, and Velocity. Volume and Variety mean dealing with a huge quantity of data, in all types of forms and from all types of sources. Velocity is about the additional capability offered by distributed processing, which drastically accelerates computing speed. But when it comes to logistics, big data is more than just the 3Vs: it goes beyond them and presents real-world use cases, revealing what is happening now and what could possibly happen in the future. Big data is often unstructured, so advanced mechanisms are needed to interpret the data. The major challenges are analyzing, capturing, curating, storing, searching, querying, updating, sharing, transferring, visualizing, and securing the data. As of 2012, the size of big data ranges from a few terabytes to dozens of petabytes.
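The distributed-processing idea mentioned above is commonly expressed as a map/reduce pattern: split the input into chunks, process each chunk independently (the "map" step, which a cluster runs in parallel across machines), then merge the partial results (the "reduce" step). The sketch below runs the two steps sequentially on a toy word-count task; the chunks and the task itself are illustrative stand-ins.

```python
# Map/reduce word count in miniature. In a real cluster each call to
# count_words would run on a separate machine over a separate data chunk;
# here the map step is shown sequentially for clarity.
from collections import Counter

def count_words(chunk):
    # Map step: count the words in one chunk of text.
    return Counter(chunk.split())

chunks = [
    "big data volume velocity variety",
    "big data requires distributed processing",
    "volume velocity variety define big data",
]

# Map: one partial count per chunk (parallelizable).
partial_counts = [count_words(c) for c in chunks]

# Reduce: merge the partial counts into one total.
totals = sum(partial_counts, Counter())
print(totals.most_common(3))
```

Because each chunk is processed independently, adding machines shortens the map step almost linearly, which is the acceleration the paragraph attributes to distributed processing.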
There is an increasing body of literature discussing big data; however, there is not yet a consensus on its definition. Keller, Koonin, and Shipp's "Big data and city living – what can it do for us?" offers some examples and uses of data, seen in the box below:
So what exactly is Big Data? This can be a point of contention even among those who work with big data all the time, so the description that follows is only one of many. Big Data is best characterised by the 3Vs: volume, variety, and velocity. Volume refers to the extreme amounts of data, variety to the varying types of that data, and velocity to the speed at which the data must be processed. Big Data does not actually refer to a specific quantity of data, as many might assume from the term, but it is often used when speaking about massive volumes of data in the range of exabytes (Rouse, 2014), most of which cannot be integrated easily.
Indeed, a smart city is an open city based on data, so public bodies need to embrace a culture of openness, making data freely accessible online (Campbell, 2011).