Title

A Disk-Based Mining Algorithm for Frequent Pattern Discovery from Big Data in Distributed Computing Environments

DOI

10.6138/JIT.2016.17.6.20150603c

Authors

Kawuu W. Lin; Sheng-Hao Chung; Chun-Yuan Hsiao; Chun-Cheng Lin; Pei-Ling Chen

Keywords

Data mining; Frequent pattern mining; Clustering; Distributed computing

Journal

Journal of Internet Technology (網際網路技術學刊)

Volume and Issue / Publication Date

Vol. 17, No. 6 (2016/11/01)

Pages

1259 - 1268

Language

English

Abstract

In distributed computing environments, frequent pattern mining across multiple computing nodes can greatly improve mining efficiency. However, memory limitations may interrupt the kernel and the computing nodes when a frequent-pattern (FP) tree is built recursively or the FP-growth algorithm is run. In this paper, we propose disk-based FP-tree generation and node-based clustering mechanisms to solve the insufficient memory problem. Results from empirical evaluations show that the proposed method delivers excellent scalability.

Subject Classification: Basic and Applied Sciences > Information Science