Investigation into Deriving an Efficient Hybrid Model of a MapReduce + Parallel-Platform Data Warehouse Architecture

Shrujan Kotturi
skotturi@uncc.edu
College of Computing and Informatics
Department of Computer Science

Under the Supervision of
Dr. Yu Wang
yu.wang@uncc.edu
Professor, Computer Science

Shrujan Kotturi
University of North Carolina at Charlotte
North Carolina, USA
E-mail: skotturi@uncc.edu

Abstract—Parallel databases are the high-performance databases of the RDBMS world and can be used to set up data-intensive enterprise data warehouses, but they lack scalability. The MapReduce paradigm, by contrast, scales well but does not perform as well as parallel databases. This work investigates deriving a hybrid architectural model that takes the best of both worlds and supports high performance and scalability at the same time.
Keywords—Data Warehouse; Parallel databases; MapReduce; Scalability

I. INTRODUCTION

A parallel-platform data warehouse is one built on a parallel processing database such as Teradata or IBM Netezza. These systems support a Massively Parallel Processing (MPP) architecture for data read/write operations, unlike non-parallel databases such as Oracle, MySQL, and SQL Server, which perform sequential row-wise read/write operations without parallelism at the DBMS level. The MapReduce paradigm was popularized by Google, Inc.
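
As a rough illustration of the MapReduce paradigm named above, the following single-process Python sketch walks through the map, shuffle, and reduce phases for a simple warehouse-style aggregation. The record layout `(region, amount)` and all function names are assumptions made for illustration and do not come from this paper; a real MapReduce framework would distribute each phase across many nodes rather than run it in one process.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (key, value) pair per input record.

    Each record is assumed to be a hypothetical (region, amount) tuple.
    """
    for region, amount in records:
        yield region, amount


def shuffle_phase(pairs):
    """Group all values that share a key, as the framework would do
    between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def total_sales_by_region(records):
    """Chain the three phases for a simple sum-by-key aggregation."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    sample = [("east", 10), ("west", 5), ("east", 7)]
    print(total_sales_by_region(sample))
```

Because each record is mapped independently and each key is reduced independently, both phases can in principle be spread over additional machines, which is the scalability property the abstract attributes to MapReduce.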