Title

自動鏈結分析演算法在社會網絡之開發與應用

Parallel Title

The Development and Application of an Automatic Link Analysis Algorithm for Social Networks

DOI

10.6382/JIM.200807.0157

Authors

江憲坤(Heien-Kun Chiang);陳鴻文(Hown-Wen Chen);楊境榮(Jing-Rong Yang)

Keywords

資料探勘 ; 鏈結分析 ; 弱鏈結 ; 關聯法則 ; Data mining ; link analysis ; weak link ; association rules

Journal

Journal of Information Management (資訊管理學報)

Volume and Issue / Publication Date

Vol. 15, No. 3 (2008/07/01)

Pages

157 - 180

Language

Traditional Chinese

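Chinese Abstract (English translation)

In data mining research on association analysis or sequential pattern analysis, even when multiple thresholds are set to filter large data sets, too many useless, low-confidence association rules are still found, or low-frequency data items that are in fact highly valuable may be missed. In addition, except for a few specific problems, previous link analysis studies have relied on experts to visually inspect data converted into visual form and to evaluate it subjectively in order to discover regularities in the data. For complex networks, this way of measuring and evaluating is often laborious, time-consuming, and ineffective. Some studies of social networks have also pointed out that low-frequency weak links play an important role in connecting different groups. This study therefore uses basic graph theory to propose an automatic link analysis algorithm that, without relying on threshold settings, finds the weak links and the key weak-link paths present in a network. The study then uses real Enron corporate e-mail data, together with the NetDraw network visualization tool, to test experimentally the feasibility and correctness of the proposed algorithm.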

English Abstract

Even when multiple thresholds are set to screen large data sets, many data mining studies based on association analysis or sequential pattern analysis still obtain many low-confidence association rules of little use, or miss low-frequency data items that are in fact highly valuable. In addition, except for some specific problems, previous link analysis research has mostly relied on experts' subjective visual inspection of data transformed into visual form in order to find regularities in the data. This kind of assessment is usually time-consuming and ineffective for complicated networks. Prior studies of social networks have revealed that low-frequency (weak) links play important roles in connecting different cliques in a social network. Therefore, drawing on basic graph theory, this study proposes an automatic link analysis algorithm that discovers the weak links and the key weak-link paths in a network without depending on thresholds. To assess the feasibility and accuracy of the proposed algorithm, empirical studies are conducted on the well-known Enron e-mail data set using the NetDraw network visualization tool, and the results are found to be positive.
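
The abstract does not spell out the algorithm itself, so the following is only an illustrative sketch of one threshold-free, graph-theoretic notion of a weak link, not the authors' method. It assumes that a "weak link" can be approximated by a bridge, that is, an edge whose removal disconnects two groups in an undirected graph, and finds all bridges with a standard depth-first search over discovery and low-link times. The function name `find_bridges` and the toy graph are hypothetical, and message frequencies are ignored.

```python
from collections import defaultdict


def find_bridges(edges):
    """Return the bridges of an undirected simple graph given as (u, v) pairs.

    Assumption for illustration: a bridge stands in for a "weak link"
    that connects otherwise separate groups. No threshold is needed.
    """
    graph = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops; sets also collapse duplicate edges
            graph[u].add(v)
            graph[v].add(u)

    disc = {}     # discovery time of each visited node
    low = {}      # earliest discovery time reachable from the node's subtree
    bridges = []
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        disc[node] = low[node] = counter
        counter += 1
        for nxt in graph[node]:
            if nxt == parent:
                continue  # do not walk straight back along the tree edge
            if nxt in disc:
                # back edge: the subtree can reach an earlier node
                low[node] = min(low[node], disc[nxt])
            else:
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                # nothing below nxt reaches node or above: the edge is a bridge
                if low[nxt] > disc[node]:
                    bridges.append(tuple(sorted((node, nxt))))

    for node in list(graph):
        if node not in disc:
            dfs(node, None)
    return sorted(bridges)


# Toy network: two triangles (tight groups) joined by the single edge c-d.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"),
             ("d", "e"), ("e", "f"), ("f", "d"),
             ("c", "d")]
print(find_bridges(toy_edges))
```

In the toy network only the edge between `c` and `d` is reported, because every other edge lies on a cycle. The recursive search is fine for small graphs; a large e-mail network would need an iterative version or a raised recursion limit.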

Subject Classification Basic and Applied Sciences > Information Science
Social Sciences > Management
Cited by
  1. 張鈞甯、吳怡瑾(2011)。維基百科瀏覽輔助介面─整合連結探勘與語意關聯分析。圖書資訊學研究,5(2),101-142。