Title

以人工智慧讀取親權酌定裁判文本:自然語言與文字探勘之實踐

Parallel Title

Applying Natural Language Processing and Text Mining to Classifying Child Custody Cases and Predicting Outcomes

DOI

10.6199/NTULJ.202003_49(1).0004

Authors

黃詩淳(Sieh-Chuen Huang);邵軒磊(Hsuan-Lei Shao)

Keywords

法律資料分析 ; 親權酌定 ; 類神經網路 ; 文字探勘 ; 人機協作 ; legal analytics ; custody case ; artificial neural network ; text mining ; human-machine collaboration

Journal

臺大法學論叢 (National Taiwan University Law Journal)

Volume and Issue / Publication Date

Vol. 49, No. 1 (2020/03/01)

Pages

195 - 224

Language

Traditional Chinese

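Chinese Abstract

Recent research in artificial intelligence has enabled computers (machines) to partially simulate human thinking and assist human decision-making, and these techniques have been applied to the field of law. Earlier work focused on analyzing judges' reasoning and predicting case outcomes for the reference of lawyers and parties. To obtain highly accurate results, it had to rely on legal experts who first read the judgments or statutes, extracted the key factors, and coded them, after which the machine built a model and made predictions. This study tries a different approach: it feeds natural-language text (the original text of court judgments) directly into the machine and observes whether the machine can parse the judges' meaning and determine the outcome of a judgment. Specifically, the sample consists of selected passages from first-instance district court judgments over a three-year period whose outcome was "sole custody". After the machine reads in these texts, text-mining techniques such as automatic word segmentation produce a term matrix, and an artificial neural network, a machine-learning method, is then trained to "understand" the judges' tone and the direction of the decision (custody awarded to the father or to the mother). On this basis, the machine is asked to read other, unseen judgments and determine their outcomes. Its accuracy is about 77.25%, with an F1 score of 0.8674, which confirms that a machine can to some extent "read" judgment texts and classify them. Because machines compute far faster than humans, this result will let people find the judgments they need more quickly (for example, judgments in which the mother obtained sole custody), reducing manual searching, reading, and selection. The resulting "human-machine collaboration" can improve the efficiency and accuracy of human decision-making, which is also a near-term goal of legal data analytics.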

English Abstract

Many recent studies of artificial intelligence have enabled computers (machines) to simulate human thinking processes and assist human decision-making, and these techniques have been applied to the field of law. Most previous studies, however, have focused on analyzing judges' thinking processes and predicting case outcomes so that lawyers and parties can refer to them. To obtain highly accurate results when analyzing cases or statutes, these studies must rely on legal experts to extract or retrieve key legal factors and code them manually. The machine then builds models and predicts outcomes from the human-coded data. This study takes a different approach. Instead of using manually coded data, we directly input legal texts in the form of unstructured natural language (i.e., the original texts of court cases) into the machine and observe whether the machine can successfully "understand" the judges' semantics and classify the cases. We collected 448 child custody cases from 2012 through 2014 in which both parents were Taiwanese, both sought custody, and the Taiwanese district court granted one parent sole custody. The machine used word segmentation techniques to build the Document-Term Matrix. Next, we built an artificial neural network (ANN) model to classify the cases into two groups: father-sole-custody and mother-sole-custody. The model has a 77.25% overall accuracy and a 0.8674 average F1 score on the testing data set. This confirms that the machine can "read" the legal texts to some extent and classify them. Since the machine is much faster than humans, this result, if used in a legal data search system, will allow people to find information (for example, the cases where the mother receives sole custody) more efficiently, without having to rely on manual searching, reading, and selection of the most relevant cases.
This research will also contribute to "human-machine collaboration" in support of human decision-making, which is exactly the goal of legal data analytics in recent years.
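
As a rough illustration of two steps named in the abstract, the sketch below builds a document-term matrix from already-segmented texts and computes accuracy and F1 for a binary father/mother labelling. It is not the authors' implementation: the function names (`build_dtm`, `accuracy`, `f1_score`), the toy tokens, and the toy labels are all hypothetical, the word segmentation step is assumed to have been done beforehand, and the neural network itself is left out.

```python
from collections import Counter


def build_dtm(segmented_docs):
    """Build a document-term matrix (rows: documents, columns: terms).

    Assumes each document is already a list of tokens, i.e. word
    segmentation has been done by some other tool.
    """
    vocab = sorted({tok for doc in segmented_docs for tok in doc})
    column = {tok: j for j, tok in enumerate(vocab)}
    matrix = []
    for doc in segmented_docs:
        row = [0] * len(vocab)
        for tok, count in Counter(doc).items():
            row[column[tok]] = count
        matrix.append(row)
    return vocab, matrix


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy, made-up data for illustration only (not from the study).
docs = [
    ["mother", "primary", "caregiver"],
    ["father", "stable", "income"],
    ["mother", "caregiver", "caregiver"],
]
vocab, dtm = build_dtm(docs)

y_true = ["mother", "father", "mother", "mother"]
y_pred = ["mother", "mother", "mother", "father"]
acc = accuracy(y_true, y_pred)
f1_mother = f1_score(y_true, y_pred, positive="mother")
```

In a pipeline like the one described, the rows of `dtm` would serve as input features for the classifier, and the metrics would be computed on held-out cases only.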

Subject classification: Social Sciences > Law