CLC number: TP311
On-line Access: 2016-01-05
Received: 2015-01-12
Revision Accepted: 2015-06-11
Crosschecked: 2015-12-25
Dipayan Dev, Ripon Patgiri. Dr. Hadoop: an infinite scalable metadata management for Hadoop—How the baby elephant becomes immortal[J]. Frontiers of Information Technology & Electronic Engineering, 2016, 17(1): 15-31.
@article{DevPatgiri2016,
title="Dr. Hadoop: an infinite scalable metadata management for Hadoop—How the baby elephant becomes immortal",
author="Dipayan Dev, Ripon Patgiri",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="17",
number="1",
pages="15-31",
year="2016",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1500015"
}
%0 Journal Article
%T Dr. Hadoop: an infinite scalable metadata management for Hadoop—How the baby elephant becomes immortal
%A Dipayan Dev
%A Ripon Patgiri
%J Frontiers of Information Technology & Electronic Engineering
%V 17
%N 1
%P 15-31
%@ 2095-9184
%D 2016
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1500015
TY - JOUR
T1 - Dr. Hadoop: an infinite scalable metadata management for Hadoop—How the baby elephant becomes immortal
A1 - Dipayan Dev
A1 - Ripon Patgiri
JO - Frontiers of Information Technology & Electronic Engineering
VL - 17
IS - 1
SP - 15
EP - 31
SN - 2095-9184
Y1 - 2016
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1500015
ER -
Abstract: In this exabyte-scale era, data increases at an exponential rate, which in turn generates a massive amount of metadata in the file system. Hadoop is the most widely used framework for dealing with big data. Because of this huge growth of metadata, however, the efficiency of Hadoop has been questioned numerous times by researchers. Therefore, it is essential to create an efficient and scalable metadata management scheme for Hadoop. Hash-based mapping and subtree partitioning are suitable techniques for distributed metadata management. Subtree partitioning does not uniformly distribute the workload among the metadata servers, and metadata needs to be migrated to keep the load roughly balanced. Hash-based mapping constrains metadata locality, although it uniformly distributes the load among the NameNodes, which are the metadata servers of Hadoop. In this paper, we present a circular metadata management mechanism named dynamic circular metadata splitting (DCMS). DCMS preserves metadata locality using consistent hashing and locality-preserving hashing, keeps replicated metadata for excellent reliability, and dynamically distributes metadata among the NameNodes to keep the load balanced. The NameNode is the centralized heart of Hadoop: it keeps the directory tree of all files, and its failure constitutes a single point of failure (SPOF). DCMS removes Hadoop's SPOF and provides efficient and scalable metadata management. The new framework is named 'Dr. Hadoop' after the names of the authors.
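To make the placement idea concrete, the following Java sketch shows how consistent hashing over a circular key space of NameNodes, combined with hashing a file's parent directory to keep directory-level locality, can decide which NameNode holds a file's metadata. This is a minimal illustration under stated assumptions, not the authors' DCMS implementation; the class and method names (MetadataRing, localityHash, etc.) are hypothetical.

    import java.util.SortedMap;
    import java.util.TreeMap;

    /**
     * Minimal sketch (not the DCMS implementation) of ring-based metadata
     * placement: keys are hashed onto a circle and assigned to the first
     * NameNode clockwise (consistent hashing), so adding or removing a
     * NameNode only remaps the keys of its own ring segment.
     */
    public class MetadataRing {
        // Ring position -> NameNode identifier (e.g., host:port).
        private final TreeMap<Long, String> ring = new TreeMap<>();

        /** Place a NameNode on the ring at the hash of its identifier. */
        public void addNameNode(String nameNodeId) {
            ring.put(localityHash(nameNodeId), nameNodeId);
        }

        /** Remove a NameNode; its keys fall through to the next node clockwise. */
        public void removeNameNode(String nameNodeId) {
            ring.remove(localityHash(nameNodeId));
        }

        /** Return the NameNode responsible for a file's metadata. */
        public String lookup(String filePath) {
            long key = localityHash(parentDirectory(filePath));
            SortedMap<Long, String> tail = ring.tailMap(key);
            // Wrap around the circle when the key hashes past the last node.
            return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
        }

        /**
         * Locality here comes from hashing the parent directory, so files in
         * the same directory map to the same NameNode. DCMS uses a dedicated
         * locality-preserving hash, which this simple polynomial hash only
         * approximates; a real deployment would also use a stronger hash.
         */
        private static long localityHash(String s) {
            long h = 1125899906842597L;             // arbitrary large prime seed
            for (int i = 0; i < s.length(); i++) {
                h = 31 * h + s.charAt(i);
            }
            return h & Long.MAX_VALUE;              // non-negative ring position
        }

        private static String parentDirectory(String path) {
            int idx = path.lastIndexOf('/');
            return idx <= 0 ? "/" : path.substring(0, idx);
        }
    }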
This paper aims to address the bottlenecks (limited scalability, single point of failure) of centralized metadata management in the HDFS of the Hadoop framework by proposing distributed, peer-to-peer metadata cluster management. The metadata is distributed across multiple metadata servers (NameNodes) through consistent hashing to achieve load balancing, and through locality-preserving hashing to achieve data locality. The proposed work, called Dr. Hadoop, is compared with vanilla Hadoop using two data sets from Yahoo and Microsoft, and shows considerable performance improvement. This work is definitely meaningful, as the centralized metadata management of HDFS has been an inherent scalability and reliability problem. The work addresses these problems and will be beneficial to the data processing community in the Internet domain in the era of big data.
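As a usage sketch of the hypothetical MetadataRing class above, the snippet below shows the load-balancing and locality behavior the summary refers to: files under the same directory resolve to the same NameNode, while distinct directories spread across the cluster. Replication for removing the SPOF (e.g., also writing each entry to the successor NameNode on the ring) is omitted here for brevity; the node addresses are made up for the example.

    // Usage sketch for the hypothetical MetadataRing class.
    public class MetadataRingDemo {
        public static void main(String[] args) {
            MetadataRing ring = new MetadataRing();
            ring.addNameNode("namenode-1:8020");
            ring.addNameNode("namenode-2:8020");
            ring.addNameNode("namenode-3:8020");

            System.out.println(ring.lookup("/user/yahoo/logs/part-0001"));
            System.out.println(ring.lookup("/user/yahoo/logs/part-0002")); // same node (same parent directory)
            System.out.println(ring.lookup("/user/msr/traces/day1.csv"));  // likely a different node
        }
    }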