Modern Database Management
13th Edition
ISBN: 9780134773650
Author: Hoffer
Publisher: PEARSON

Question
Chapter 10, Problem 10.1RQ
Program Plan Intro

(a)

Definition of the term Hadoop.

Expert Solution

Explanation of Solution

Hadoop is a software framework for storing and processing large data sets on clusters of inexpensive commodity hardware. Storing and processing such data on a single machine is costly, because a machine with that much computing power and memory is very expensive. Hadoop instead combines the power of many cheap commodity machines into one system by storing and processing the data in a distributed fashion across the cluster.

Hadoop uses the popular MapReduce technique (explained in the next section) to achieve this.
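
The idea of splitting work across many small machines and combining the partial results can be illustrated with a minimal sketch. This is not Hadoop code: threads on one machine stand in for cluster nodes, and the names `partition` and `distributed_sum` are hypothetical, chosen only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, workers):
    """Split data into roughly equal chunks, one per (simulated) worker."""
    size = max(1, -(-len(data) // workers))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def distributed_sum(data, workers=4):
    """Sum each chunk on its own worker, then combine the partial sums."""
    chunks = partition(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(distributed_sum(list(range(101)), workers=4))
```

Each worker only ever sees its own chunk, which is the property that lets a real cluster use many modest machines instead of one large one.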

Program Plan Intro

(b)

Definition of the term MapReduce.

Expert Solution

Explanation of Solution

MapReduce is the processing technique used in Hadoop, whose implementation is written in Java. It combines two individual processing steps:

  1. Map: takes the input data and transforms it into another set of data made up of tuples (key/value pairs).
  2. Reduce: as the name suggests, reduces or combines the output of the map step into a smaller set of tuples.
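
The two steps above can be sketched with the classic word-count example. This is a single-process illustration in plain Python, not the Hadoop Java API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the grouping step in between is included only because the reduce step needs its input grouped by key.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into (key, value) tuples, here (word, 1)."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values that share a key, so reduce sees one key at a time."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single tuple."""
    return (key, sum(values))


def word_count(lines):
    """Run map over every line, group by key, then reduce each group."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

Six input words become six `(word, 1)` tuples after the map step, and the reduce step shrinks them to four tuples, one per distinct word.
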

Program Plan Intro

(c)

Definition of the term HDFS.

Expert Solution

Explanation of Solution

HDFS (Hadoop Distributed File System) is a distributed file system designed to run on commodity hardware. It is highly fault tolerant because it replicates each block of data across several machines.

HDFS follows a master-slave architecture, in which the NameNode acts as the master and the DataNodes act as slaves.

NameNode: It manages the file system namespace and clients' access to files, and controls operations such as renaming, opening, and closing files.

DataNode: It acts on instructions received from the NameNode, which include file I/O (read/write) and block creation, deletion, and replication.
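
The division of labor between the two roles can be sketched as a toy in-memory model. This is not the HDFS API: the class and method names, the 4-byte block size, the replication factor of 2, and the round-robin placement are all assumptions made for illustration. As a further simplification, the master here hands blocks to the storage nodes itself, whereas in a real cluster clients exchange data with the storage nodes directly.

```python
class DataNode:
    """Stores raw blocks and serves reads and writes for them."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> bytes

    def write_block(self, block_id, data):
        self.blocks[block_id] = data

    def read_block(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Keeps only metadata: which blocks make up a file and where they live."""

    def __init__(self, datanodes, block_size=4, replication=2):
        self.datanodes = datanodes
        self.block_size = block_size      # hypothetical tiny block size
        self.replication = replication    # hypothetical replication factor
        self.namespace = {}               # path -> [(block_id, [DataNode, ...])]
        self._next_block = 0

    def create(self, path, data):
        entries = []
        for start in range(0, len(data), self.block_size):
            block_id = self._next_block
            self._next_block += 1
            chunk = data[start:start + self.block_size]
            # Round-robin placement of each replica (an assumption).
            targets = [self.datanodes[(block_id + i) % len(self.datanodes)]
                       for i in range(self.replication)]
            for node in targets:
                node.write_block(block_id, chunk)
            entries.append((block_id, targets))
        self.namespace[path] = entries

    def rename(self, old_path, new_path):
        # A pure namespace operation: no block is moved or copied.
        self.namespace[new_path] = self.namespace.pop(old_path)

    def read(self, path, failed=()):
        # Read each block from the first replica whose node has not failed.
        out = b""
        for block_id, targets in self.namespace[path]:
            node = next(n for n in targets if n.name not in failed)
            out += node.read_block(block_id)
        return out
```

With three storage nodes and two replicas per block, a file stays readable when any single node is marked as failed, which is the fault tolerance the text describes.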

Program Plan Intro

(d)

Definition of the term NoSQL.

Expert Solution

Explanation of Solution

As the name suggests, NoSQL means non-relational. In a nutshell, a NoSQL database is designed for data that does not fit a tabular format or has no defined schema. So, along with providing mechanisms to store and retrieve structured (relational) data, a NoSQL database provides the same functionality for semi-structured and unstructured data.
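
To make the "no defined schema" point concrete, here is a minimal sketch of a document-style store in plain Python. It does not model any particular NoSQL product; the `DocumentStore` class and its `put`, `get`, and `find` methods are hypothetical names used only for this illustration.

```python
import json


class DocumentStore:
    """Toy key/document store: each document may have different fields."""

    def __init__(self):
        self._docs = {}  # key -> JSON string

    def put(self, key, document):
        # Serialize to JSON so nested, semi-structured data is stored as-is.
        self._docs[key] = json.dumps(document)

    def get(self, key):
        return json.loads(self._docs[key])

    def find(self, field, value):
        """Return keys of documents whose top-level field equals value."""
        return [k for k in self._docs if self.get(k).get(field) == value]


if __name__ == "__main__":
    store = DocumentStore()
    # Two records with different shapes; no table definition is required.
    store.put("u1", {"name": "Ada", "email": "ada@example.com"})
    store.put("u2", {"name": "Lin", "tags": ["admin"], "address": {"city": "Oslo"}})
    print(store.find("name", "Lin"))
```

In a relational table, the second record would need extra columns or extra tables for `tags` and `address`; here it is simply stored alongside the first.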

Program Plan Intro

(e)

Definition of the term Pig.

Expert Solution

Explanation of Solution

Pig is named for the animal that eats anything, reflecting its ability to work with many kinds of data. It is an abstraction layer on top of the MapReduce technique that lets users analyze data by expressing the analysis as a data flow.
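
A data flow of this kind is a chain of steps in which each step consumes the output of the previous one. The sketch below imitates such a chain (filter, then group, then aggregate) in plain Python. It is not Pig's own scripting language, and the sample records and the function names `filter_rows`, `group_by`, and `total_per_group` are made up for this illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input records: (customer, category, amount).
RECORDS = [
    ("alice", "books", 12.0),
    ("bob", "games", 30.0),
    ("carol", "books", 8.0),
    ("dave", "games", 5.0),
    ("erin", "books", 20.0),
]


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def group_by(rows, key_index):
    """Group rows by the field at key_index (groupby needs sorted input)."""
    ordered = sorted(rows, key=itemgetter(key_index))
    return {key: list(group)
            for key, group in groupby(ordered, key=itemgetter(key_index))}


def total_per_group(groups, value_index):
    """Sum the field at value_index within each group."""
    return {key: sum(row[value_index] for row in rows)
            for key, rows in groups.items()}


if __name__ == "__main__":
    large = filter_rows(RECORDS, lambda row: row[2] >= 10)   # step 1
    by_category = group_by(large, key_index=1)               # step 2
    print(total_per_group(by_category, value_index=2))       # step 3
```

The author of the flow describes what should happen to the data at each step; on a cluster, a layer like Pig is what turns such steps into MapReduce jobs.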
