ARTICLE
TITLE

Performing MapReduce on Data Centers with Hierarchical Structures

SUMMARY

Data centers are created as distributed information systems for massive data storage and processing. The structure of a data center determines how its servers, links, and switches are interconnected. Several hierarchical structures have been proposed to improve the topological performance of data centers. By using recursively defined topologies, these structures support general applications and services with high scalability and reliability. However, they ignore the details of specific applications that run on data centers, such as MapReduce, a well-known distributed data processing application. The communication and control mechanisms used to perform MapReduce on the traditional structure cannot be employed on the hierarchical structures. In this paper, we propose a methodology for performing MapReduce on data centers with hierarchical structures. Our methodology is based on the distributed hash table (DHT), an efficient approach to data retrieval in distributed systems. We exploit the advantages of DHT, including decentralization, fault tolerance, and scalability, to address the main problems that hierarchical data centers face in supporting MapReduce. Comprehensive evaluation demonstrates the feasibility and excellent performance of our methodology.
