题名

Improving the Scalability of Online Social Networks with Hypergraph-Based Data Placement

DOI

10.6138/JIT.2016.17.6.20160115b

作者

Jingya Zhou;Jianxi Fan

关键词

Online social networks ; Data placement ; Hypergraph partitioning ; Tree-based data center networks

期刊名称

網際網路技術學刊

卷期/出版年月

17卷6期(2016 / 11 / 01)

页次

1173 - 1185

内容语文

英文

中文摘要

Online social networks (OSNs) have become one of today's most popular internet services, and are growing at a phenomenal rate. With the huge amount of users, OSNs have to face the scalability problem of how to place users' data to thousands of distributed servers within a data center. Key-value stores use consistent hashing to fix the problem, and have been turned into a defacto standard. Nevertheless, random placement manner of hashing cannot preserve social locality, which leads to high intra-data center traffic as well as unpredictable response time. To preserve social locality or interaction locality, many existing works model the data placement problem as a graph partitioning problem. Although the partitioning problem is well-studied in these works, the social graph or interaction graph is formed based on ordinary pairwise graph that cannot fully reflect multi-participant interactions occurred in OSNs. Moreover, in a specific network topology of data center, servers communicate with one another upon different paths with varied distances, which is not considered in previous works. In this paper, we focus on the data placement with the aim of reducing intra-data center traffic as well as preserving load balance. We formulate the problem as a hypergraph partitioning problem together with a partition-to-server assignment problem. Specifically, we propose a hypergraph-based data placement (HDP) scheme that using round-robin hypergraph partitioning to maximally preserve both interaction locality and distance locality. Through extensive experiments with a large scale Facebook trace, we evaluate that HDP significantly reduces intra-data center traffic without deteriorating load balancing across servers.

主题分类 基礎與應用科學 > 資訊科學