NoSQL databases
Study Report
Nikitha Edunuri
800817767
Table of Contents
Introduction
Abstract
Overview of NoSQL
Data Model
Document base model
Key-value Stores
Graph Stores
Wide Column stores
Criteria to be considered
Conclusion
References
INTRODUCTION
I have always been inspired by the idea of storing large amounts of structured and unstructured data. Therefore, I chose Data Management as my concentration.
I have finished the following courses as my concentration requirement:
• ITCS 6162 Knowledge Discovery in databases
• ITCS 6160 Database Systems
• ITCS 6100 Big data Analytics for Competitive Advantage
The course Big Data Analytics for Competitive Advantage shed light on the three Vs (Volume, Variety and Velocity) that characterize how data grows day by day, and on how such large volumes of data are handled. From Knowledge Discovery in Databases, I learned how useful data can be extracted from a large pile of unstructured data. In Database Systems, I learned how relational databases dominated for decades.
In this report, I discuss NoSQL databases and the criteria to consider when choosing the database that best supports your product. NoSQL databases encompass a variety of database technologies that were developed in response to the rise in data volume, unlike relational databases, which were built to handle structured data [1].
ABSTRACT
Until the last decade choosing a
To overcome these limitations, a new database model known as Not Only SQL (NoSQL) emerged with a set of new features. The main objective of NoSQL is not to discard SQL but to serve as an alternative data model where new features are needed [1] [2] [3]. NoSQL databases improve on the performance of relational databases through a set of new characteristics and advantages. In contrast to relational databases, NoSQL databases provide flexible, horizontal scalability and take advantage of new clusters. The rise of NoSQL provides cost-effective management of data in modern web applications. With these features, NoSQL suits applications that handle a large number of transactions and require low-latency access to huge datasets and service availability while
Connolly and Carolyn (2004) define a database as a structure or design that consists of the client’s data as well as metadata. It is also a persistent, logically coherent repository of inherently meaningful data that is relevant to some aspects of the real world. The database consists of data organized in a systematic way, and it allows easy retrieval of information, analysis, updating and output of data. That data can be in the form of graphics, scripts, reports, text, tables, and so on. Most computer applications are databases at their core. Many companies have a lot of data, and so they have big databases that can handle that large amount of data. This is where the database administrator comes into play, to ensure proper management of the database so that the organizational data is safe from intruders or data corruption (Jones, 2014). The database controls the data of the entire organization, and any tampering with the databases can culminate in the stoppage of business operations.
In Confais et al. \cite{Confais}, the authors evaluate, through performance analysis, three “off-the-shelf” object store solutions, namely Rados, Cassandra and InterPlanetary File
Tracing the concept of Big Data management from Relational Database Management Systems to current NoSQL databases, this paper surveys the Big Data challenges from the perspective of its characteristics (Volume, Variety and Velocity) and studies how each of these challenges is addressed by various NoSQL systems. NoSQL is not a single system that can solve every Big Data problem; it is an ecosystem of technologies in which different types of NoSQL databases are optimized to address various types of Big Data challenges by providing schema-less modeling and automatic
The term “No SQL” is understood more broadly as “Not Only SQL”. That is, NoSQL does not eliminate the SQL language entirely; rather, it focuses on also supporting other, SQL-like queries. A NoSQL database essentially does not impose a fixed model on the data. The leading advantage of a NoSQL database is that it removes the restrictions of the rigid structured model used in relational database systems. In the NoSQL approach, there is considerable flexibility in choosing a suitable data structure for the data to be handled. Some of the widely used NoSQL data models are key-value stores, column-family stores, document databases and graph databases. The fundamental concept behind the development of the key-value store data model is to create a data model that
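As a rough illustration of the four data models named above, the following minimal sketch represents the same small record in each of them using plain Python structures. The record, the key formats and the helper city_of are all hypothetical and chosen only for illustration; real NoSQL systems expose their own APIs and storage formats.

```python
import json

# Key-value store: an opaque value looked up by a single key.
key_value = {"user:42": '{"name": "Ada", "city": "Charlotte"}'}

# Document store: the value is a structured, queryable document.
document = {"_id": 42, "name": "Ada", "address": {"city": "Charlotte"}}

# Column-family (wide column) store: row key -> column family -> columns.
column_family = {
    "42": {
        "profile": {"name": "Ada"},
        "location": {"city": "Charlotte"},
    }
}

# Graph store: nodes plus explicit, typed relationships (edges).
nodes = {
    42: {"label": "User", "name": "Ada"},
    7: {"label": "City", "name": "Charlotte"},
}
edges = [(42, "LIVES_IN", 7)]


def city_of(user_id):
    """Follow the LIVES_IN edge from a user node to the city's name."""
    for source, relation, target in edges:
        if source == user_id and relation == "LIVES_IN":
            return nodes[target]["name"]
    return None


# The same fact ("Ada lives in Charlotte") read back from each model.
cities = [
    json.loads(key_value["user:42"])["city"],
    document["address"]["city"],
    column_family["42"]["location"]["city"],
    city_of(42),
]
print(cities)
```

In the key-value case the store cannot look inside the value, so the application has to parse it; the other three models expose progressively more structure to the database itself.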
According to a report from International Business Machines Corporation (IBM), 90% of the data in the world has been generated in the last two years. Frank J. Ohlhorst (2013) explains that the concept of collecting data for use in business is not new, but the scale of data collected recently is so large that it has been termed Big Data (p. 1). Company executives who choose to ignore Big Data are denying their companies an advantage over their competitors. Big Data analysis is fundamental to all fields of work; it provides insight into large amounts of data, answering questions and enabling discoveries that improve efficiency in all areas.
The need to store and evaluate data keeps growing in the world of information systems. From the days of flat files to very large database management systems that store petabytes of data in real time, the practice of building information from data continues to evolve. Today, the relational data model is ubiquitous and is used in information systems ranging from accounting systems and banks to retail businesses and scientific applications. It is important to understand the concepts involved in data modeling for a relational database management system in order to build an effective and efficient system.
Nowadays, there are two major kinds of database management systems used to deal with data. The first is the Relational Database Management System (RDBMS), the traditional relational database, which deals with structured data and has been popular for decades, since the 1970s. The second is the Not Only SQL (NoSQL) database, which deals with semi-structured and unstructured data; NoSQL databases have been gaining popularity with the development of the internet and social media since around 2009. NoSQL databases are intended to overcome the drawbacks of RDBMSs, such as fixed schemas, JOIN operations and scalability problems. In this paper, we review one graph database (Neo4j), graph databases being part of the emerging technology called NoSQL, and compare it with one traditional relational database (MySQL). MySQL has become almost synonymous with relational databases and has been in use for a long time. However, with the emergence of Big Data, there was clearly a need for more flexible databases. Graph-oriented applications such as Facebook's Graph Search illustrate how relationships need to be modeled more efficiently and expressively than conventional relational models allow, which is the use case that graph databases such as Neo4j target. In this paper, we compare MySQL and Neo4j based on features such as ACID, replication, availability and the language that is used in both of
Any database can be run on the Amazon platform, which is built to be as flexible as possible; MySQL, IBM DB2, Oracle, PostgreSQL and other databases are run in production on it. However, building and maintaining these database services takes a considerable amount of work from a team. In late 2009, we built a relational database service that aims to streamline the creation of relational databases; it supports MySQL and Oracle, so a database can be spun up quickly and consistently, with useful additional features. A managed relational database service offers flexible storage capacity, so the amount of stored data can easily be increased; rapid provisioning; high-availability options; and scalable compute, so that memory or CPU can be increased as your queries require. There are a couple of common patterns for setting up high-performance databases. Throughput can be increased by scaling up the physical resources available in the cloud, or by adding read replicas and ElastiCache. Availability can be increased with Multi-AZ (multiple Availability Zone) deployments. Reduce
MongoDB is one of numerous cross-platform document-oriented databases. Classified as a NoSQL database, MongoDB eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster. Released under a combination of the GNU Affero General Public License and the Apache License, MongoDB is free and open-source software.
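To make the idea of JSON-like documents with dynamic schemas more concrete, here is a minimal sketch that uses plain Python dictionaries as a stand-in for a document collection. It does not use the MongoDB driver or its API; the collection name products, its fields and the find helper are assumptions made purely for illustration.

```python
import json

# A hypothetical "products" collection, modeled as a plain list of dicts.
# Documents in the same collection need not share the same fields.
products = [
    {"_id": 1, "name": "Laptop", "price": 899.0,
     "specs": {"ram_gb": 16, "cpu": "i7"}},
    {"_id": 2, "name": "T-shirt", "price": 15.0,
     "sizes": ["S", "M", "L"]},
]


def find(collection, **criteria):
    """Return documents whose top-level fields equal all given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# A new field can be added to one document without altering the others,
# which is what a dynamic schema allows.
products[1]["color"] = "blue"

matches = find(products, name="T-shirt")
# Each document serializes naturally to JSON text.
serialized = json.dumps(matches[0], sort_keys=True)
print(serialized)
```

In a relational table, adding the color attribute would typically mean altering the table for every row; here only the one document that needs the field carries it.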
Describe in a few sentences a business or computational problem you would want to solve with a NoSQL database, and what makes NoSQL a better choice in this case?
Big data is not hype; it is the future. The big data industry continues to advance, and big data service providers are making it easier for companies to use big data to drive their businesses. Progressively, greater volumes and varieties of data will be incorporated into more business processes to support better decision making and greater insight. Moreover,
The basic idea in Hadoop is parallelization. Parallelization is easy to accomplish if a piece of work can be split into n units. Hence, the core focus of the MapReduce programming framework has been to solve the partitioning problem.
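The idea of splitting work into n independent units can be sketched in a few lines of Python. This is only an illustration of the map and reduce steps on a single machine, using a thread pool from the standard library; it is not Hadoop's API, and the function names partition, map_unit, reduce_counts and word_count are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n):
    """Split records into n roughly equal, independent units of work."""
    return [records[i::n] for i in range(n)]


def map_unit(unit):
    """Map step: count the words within a single unit."""
    counts = Counter()
    for line in unit:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-unit counts into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(lines, n=3):
    """Count words by partitioning, mapping in parallel, then reducing."""
    units = partition(lines, n)
    # Each unit is independent of the others, so the map step can run
    # in parallel without any coordination between workers.
    with ThreadPoolExecutor(max_workers=n) as pool:
        partials = list(pool.map(map_unit, units))
    return reduce_counts(partials)


lines = ["big data is big", "data is growing", "NoSQL handles big data"]
result = word_count(lines)
print(result["big"], result["data"])
```

Word counting partitions cleanly because each line can be processed without knowing about any other line; work that lacks this property is much harder to express in this style.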
A data warehouse is a type of database normally used by large companies to store large amounts of data and keep it easily accessible. Data warehouses are normally set up in one of three ways. The basic model takes data straight from its sources, such as operational systems and flat files. The staging model adds a staging area that receives the data from those systems and files before moving it to the data warehouse. The final type extends the staging model with data marts, small databases that take specific information from the data warehouse and sit between the data warehouse and the end users. Data warehouses are also useful because they make it easy to pull data through either queries or data mining. They are a useful tool when dealing with large amounts of data.
Modern RDBMS technologies cannot support unstructured data with optimal storage requirements. The design becomes complex and is therefore difficult for developers. Managing unstructured data is cumbersome with conventional RDBMS solutions (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). Moreover, an RDBMS turns out to be a costly solution for building agile web applications with modest data analysis requirements. NoSQL is emerging as an efficient alternative in this situation, one that bridges the gaps associated with RDBMS technology. The market growth can be attributed to innovative launches of NoSQL solutions and to collaborative efforts by NoSQL vendors and users. The efforts of companies to improve their market offerings are creating demand for NoSQL as back-end support (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). The emergence of agile software development is also creating demand for NoSQL (Big data in financial services industry: Market trends, challenges, and prospects 2013 - 2018). NoSQL databases give users many more ways to accept data in many different forms. NoSQL is as adaptable as SQL but offers many more uses that can apply to many organizations.