A Quest on Hadoop
Keywords:
hadoop, bigdata, Map/Reduce, Hadoop Distributed File Systems, HDFS, Job tracker, task tracker
Abstract
Every day, quintillions of bytes of data are created. About 90% of this data, such as posts to social media sites, digital pictures, and videos, is unstructured. This data is BigData, and it must be formatted to make it suitable for data mining and subsequent analysis. Hadoop offers a firm platform in this regard: it is designed to handle a mixture of complex and structured data so that computationally intensive tasks can be performed. The paper also gives a brief overview of how BigData is stored and queried using Hadoop.
License
Copyright (c) 2013 COMPUSOFT: An International Journal of Advanced Computer Technology
This work is licensed under a Creative Commons Attribution 4.0 International License.