Title |
A Disk-Based Mining Algorithm for Frequent Pattern Discovery from Big Data in Distributed Computing Environments |
DOI |
10.6138/JIT.2016.17.6.20150603c |
Authors |
Kawuu W. Lin;Sheng-Hao Chung;Chun-Yuan Hsiao;Chun-Cheng Lin;Pei-Ling Chen |
Keywords |
Data mining ; Frequent pattern mining ; Clustering ; Distributed computing |
Journal |
Journal of Internet Technology (網際網路技術學刊) |
Volume/Issue (Publication Date) |
Vol. 17, No. 6 (2016/11/01) |
Pages |
1259 - 1268 |
Language |
English |
Abstract |
In distributed computing environments, frequent pattern mining across multiple computing nodes can greatly improve mining efficiency. However, memory limitations may interrupt the kernel and the computing nodes when recursively building a frequent-pattern (FP) tree or running the FP-growth algorithm. In this paper, we propose disk-based FP-tree generation and node-based clustering mechanisms to solve the insufficient-memory problem. Empirical evaluations show that the proposed method delivers excellent scalability. |
Subject Classification |
Basic and Applied Sciences >
Information Science |