Title

程式抄襲源頭偵測之研究

Parallel Title

The Research of Source Code Detecting for Plagiarism Program

Authors

蔡明志(MING-JYH TSAI);陳思豪(SZU-HAO CHEN);曾宣瑜(HSUAN-YU TSENG);胡俊之(JYUN-JY HU)

Keywords

抄襲群組 ; 群組與源頭分析 ; plagiarism group ; group and source analysis

Journal

Fu Jen Management Review (輔仁管理評論)

Volume/Issue and Publication Date

Vol. 26, No. 3 (2019/09/01)

Pages

27 - 50

Language

Traditional Chinese

Chinese Abstract

過去研究多著重於程式的抄襲比對,僅有少數的研究對於抄襲源頭與抄襲群組進行尋找,但這些方法均不是針對學生作業抄襲的領域而設計。本研究使用程式抄襲與複製偵測的文獻為基礎,將相似的作業結合為群組,並以程式“重要片段”的概念;利用重要片段的參考性、重要片段的傳遞性以及重要片段位於群內相似性與群間差異性,計算抄襲群組中的源頭可能性;最後再透過權重訓練模式訓練抄襲群組的權重,提升真正源頭被偵測的可能性。實驗結果顯示:(1)抄襲分數計算從一至五個群組的樣本,均可具有良好的源頭偵測率。(2)使用權重訓練模式能有效提升真實源頭的權重分數,並且降低非源頭的誤判率。(3)重要片段的三階段分數計算能有效形成組內分數差異,使得真實源頭更容易被偵測。當抄襲群組與真實源頭被分析出來後,授課老師即可進一步的藉由抄襲群組比對學生間的同儕群組競合關係,以評估同學之間是否有抄襲或被抄襲的動機。

English Abstract

Previous studies have focused mainly on detecting plagiarism between programs through similarity comparison. Only a few have tried to identify plagiarism sources and plagiarism groups, and none of those methods was designed for plagiarism in student assignments. Building on the literature on program plagiarism and clone detection, this study combines similar assignments into groups and introduces the concept of "important fragments" of a program. The likelihood that each member of a plagiarism group is the source is calculated from three properties of important fragments: their referentiality, their transitivity, and their within-group similarity and between-group difference. Finally, a weight training model is used to train the weights for the plagiarism groups, which raises the likelihood that the true source is detected. The experimental results show that: (1) the plagiarism score calculation achieves a good source detection rate on samples containing one to five groups; (2) the weight training model effectively raises the weight score of the true source and lowers the misjudgment rate for non-sources; (3) the three-stage score calculation for important fragments effectively creates score differences within a group, making the true source easier to detect. Once the plagiarism groups and true sources have been identified, the instructor can use the groups to examine the peer competition and cooperation relationships among students and assess whether students had a motive to plagiarize or to be plagiarized.
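The grouping step mentioned in the abstract (combining similar assignments into groups) can be illustrated with a minimal sketch. The record does not give the study's similarity measure or grouping rule, so everything here is an assumption made for illustration: `difflib.SequenceMatcher` stands in for the similarity measure, the `0.8` threshold is arbitrary, and the hypothetical helper `plagiarism_groups` simply links every pair of submissions above the threshold and reports the resulting connected groups.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; NOT the measure used in the study."""
    return SequenceMatcher(None, a, b).ratio()


def plagiarism_groups(submissions: dict[str, str],
                      threshold: float = 0.8) -> list[set[str]]:
    """Group submissions whose pairwise similarity reaches the threshold.

    Hypothetical helper: pairs above the (assumed) threshold are merged
    with union-find, so groups are connected components of the
    "similar enough" graph. Singleton groups are dropped.
    """
    parent = {name: name for name in submissions}

    def find(x: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(submissions, 2):
        if similarity(submissions[a], submissions[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in submissions:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    demo = {
        "alice": "int main(){int a=1;int b=2;return a+b;}",
        "bob": "int main(){int x=1;int y=2;return x+y;}",
        "carol": "print('hello world')",
    }
    print(plagiarism_groups(demo))
```

Because groups are built as connected components, two submissions can land in the same group through a chain of intermediate copies even when they are not directly similar. Scoring which group member is the likely source is a separate step that this sketch does not attempt.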

Subject Classification: Social Sciences > Management
References
  1. 黃政傑,張嘉育(2010)。讓學生成功學習:適性課程與教學之理念與策略。課程與教學季刊,3(13),1-22。
  2. Baker, B. S.(1999).Parameterized diff.Proceedings of the 10th ACM-SIAM Symposium on Discrete Algorithms (SODA’99),USA:
  3. Baxter, I. D.,Yahin, A.,Moura, L.,Sant’Anna, M.,Bier, L.(1998).Clone Detection Using Abstract Syntax Trees.14th IEEE International Conference on Software Maintenance (ICSM'98)
  4. Belkhouche, B.,Nix, A.,Hassell, J.(2004).Plagiarism detection in software designs.Proceedings of the 42nd annual Southeast regional conference
  5. Roy, C. K.,Cordy, J. R.,Koschke, R.(2009).Comparison and Evaluation of Code Clone Detection Techniques and Tools: A Qualitative Approach.Science of Computer Programming,74(7),470-495.
  6. Ducasse, S.,Nierstrasz, O.,Rieger, M.(2006).On the Effectiveness of Clone Detection by String Matching.International Journal on Software Maintenance and Evolution: Research and Practice,18(1),37-58.
  7. Greenan, K., Method-Level Code Clone Detection on Transformed Abstract Syntax Trees using Sequence Matching Algorithms, Student Report, University of California - Santa Cruz, Winter 2005.
  8. Hirschberg, D. S.(1975).A linear space algorithm for computing maximal common subsequences.Communications of the ACM,18(6),341-343.
  9. Johnson, J. H.(1993).Identifying Redundancy in Source Code Using Fingerprints.Proceedings of the 1993 Conference of the Centre for Advanced Studies Conference (CASCON’93)
  10. Jadalla, A.,Elnagar, A.(2008).Pde4java: plagiarism detection engine for java source code: a clustering approach.International Journal of Business Intelligence and Data Mining,3(2),121-135.
  11. Ji, J. H.,Woo, G.,Park, S. H.,Cho, H. G.(2007).Evolution analysis of homogenous source code and its application to plagiarism detection.Frontiers in the Convergence of Bioscience and Information Technologies
  12. Joy, M.,Luck, M.(1999).Plagiarism in programming assignments.IEEE Transactions on Education,42(2),129-133.
  13. Kamiya, T.,Kusumoto, S.,Inoue, K.(2002).CCFinder: a multilinguistic token-based code clone detection system for large scale source code.IEEE Transactions on Software Engineering,28(7),654-670.
  14. Lewicki, R. J.,Bunker, B. B.(1996).Developing and Maintaining Trust in Work Relationships.Trust in Organizations: Frontiers of Theory and Research
  15. Wise, M. J.(1996).YAP3: improved detection of similarities in computer program and other texts.Proceedings of the twenty-seventh SIGCSE technical symposium on Computer science education
  16. Moussiades, L.,Vakali, A.(2005).PDetect: a clustering approach for detecting plagiarism in source code datasets.The Computer Journal,48(6),651-661.
  17. Mozgovoy, M.(2006).Desktop Tools for Offline Plagiarism Detection in Computer Programs.Informatics in Education,5,97-112.
  18. Mozgovoy, M.,Karakovskiy, S.,Klyuev, V.(2007).Fast and reliable plagiarism detection system.Frontiers In Education Conference - Global Engineering: Knowledge Without Borders, Opportunities Without Passports
  19. Parker, A.,Hamblen, J. O.(1989).Computer algorithms for plagiarism detection.IEEE Transactions on Education,32(2),94-99.
  20. Prechelt, L.,Malpohl, G.,Philippsen, M.(2002).Finding Plagiarisms among a Set of Programs with JPlag.Journal of Universal Computer Science,8(11),1016-1038.
  21. Koschke, R.,Falke, R.,Frenzel, P.(2006).Clone Detection Using Abstract Syntax Suffix Trees.Proceedings of the 13th Working Conference on Reverse Engineering (WCRE’06)
  22. Roy, C. K.,Cordy, J. R.(2007).Technical Report,Kingston:Queen’s University.
  23. Schleimer, S.,Wilkerson, D. S.,Aiken, A.(2003).Winnowing: local algorithms for document fingerprinting.Proceedings of the 2003 ACM SIGMOD international conference on Management of data, June 2003
  24. Tairas, R.,Gray, J.(2006).Phoenix-Based Clone Detection Using Suffix Trees.Proceedings of the 44th annual Southeast regional conference (ACM-SE’06)
  25. Ukkonen, E.(1995).On-line construction of suffix trees.Algorithmica,14(3),249-260.
  26. Whale, G.(1990).Identification of program similarity in large populations.The Computer Journal,33(2),140-146.
  27. Wise, M. J., String similarity via greedy string tiling and running Karp-Rabin matching. Retrieved October 20, 2010, from the World Wide Web: http://www.pam1.bcs.uwa.edu.au/~michaelw/ftp/doc/RKR_GST.ps.
  28. Yang, W.(1991).Identifying syntactic differences between two programs.Software: Practice and Experience,21(7),739-755.
  29. 張火燦,劉淑寧(2002)。從社會網絡理論探討員工知識分享。人力資源管理學報,2(2),101-113。
  30. 游景翔(2007)。國立台灣科技大學資訊工程系。