


Research on an Ontology-based Web-Mining Technique for Supply-Chain Competitive Analysis




李俊宏(Chung-Hong Lee);張興亞(Hsing-Ya Chang)


Text mining ; Ontology ; Web mining ; Supply chain management ; Competitive analysis




Vol. 9, No. 3 (2007/09/01)


435 - 460




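The aim of this study is to develop an ontology-based knowledge-mining platform for industrial supply chains, and to propose a method that integrates text mining with XML document technology for use in supply-chain competitive analysis. To test the text-mining algorithm developed, the study mines a corpus of news related to industrial activity and uses RosettaNet as a key content source for implementing the system architecture and conducting a case study. Based on this mining algorithm, Web mining techniques are applied to extract relevant concepts and knowledge from a large volume of semi-structured and unstructured web documents, producing information that describes the documents (i.e., metadata) and a hierarchical knowledge classification directory. The original web pages are then converted into XML documents and stored in an XML document repository built on the ontology structure of this classification algorithm. The study applies text-mining techniques to the automatic construction and maintenance of the classification directory, and develops an XML document database system capable of automatic knowledge classification. The research method is mainly Web content mining: text-mining techniques are used to analyze the document content of WWW pages, and the approach is implemented with neural-network machine learning.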


This research proposes a novel approach to developing a platform for supply-chain competitive analysis built on an ontology-based web-mining technique. By integrating a text-mining approach with an XML document technique, the platform gives businesses a way to tackle difficulties in managing supply-chain-related knowledge. To test the developed web-mining algorithm, a corpus of industrial information collected from specific news web sites (e.g. CNA News), together with the RosettaNet standard framework, is employed as the major information source for system implementation and a case study. The web-mining algorithm is applied to extract concepts and knowledge from a large collection of semi-structured and unstructured HTML documents. The extracted concepts and knowledge are then used to produce metadata and an ontology that describe the contents of the original web documents. The original web documents can thus be transformed into XML documents and stored in an XML document database built on the ontology-based "knowledge template". The research applies a text-mining approach to automate the construction and maintenance of a concept hierarchy, in order to establish an XML document database based on the extracted metadata and ontology. Knowledge extraction relies mainly on a Web-content mining method: existing WWW pages are analyzed through a text-mining technique, implemented with a neural-network machine learning method, to generate a set of metadata describing their content and to produce an ontology for the XML document database.
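
The HTML-to-XML step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: simple term-frequency counting stands in for the paper's text-mining and neural-network method, and the element names (`document`, `metadata`, `keyword`, `content`), the stop-word list, and the sample page are all hypothetical choices made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on", "is", "its"}


class TextExtractor(HTMLParser):
    """Collects the visible text of an HTML page, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent terms, ignoring stop words.

    Assumption: raw term frequency is used as a stand-in for the
    concept extraction described in the abstract.
    """
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOP_WORDS and len(w) > 2]
    return [term for term, _ in Counter(words).most_common(top_n)]


def html_to_xml(html, top_n=3):
    """Wrap page text and keyword metadata in a simple XML record."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)

    doc = ET.Element("document")          # hypothetical record layout
    meta = ET.SubElement(doc, "metadata")
    for term in extract_keywords(text, top_n):
        ET.SubElement(meta, "keyword").text = term
    ET.SubElement(doc, "content").text = text
    return ET.tostring(doc, encoding="unicode")


# Made-up sample page, used only to exercise the functions above.
sample = ("<html><body><h1>Supplier news</h1>"
          "<p>The supplier ships chips. The supplier expands chips output.</p>"
          "</body></html>")
print(html_to_xml(sample, top_n=2))
```

In a fuller system the keyword list would be replaced by concepts drawn from the learned ontology, and the XML layout would follow the ontology-based "knowledge template" rather than this flat record.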

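Subject classification: Humanities > General Humanities
Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics
Social Sciences > General Social Sciences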