Title |
應用中文句法權重於潛在語意分析技術於中文智能教學系統之對話計分研究 |
Parallel title |
Scoring Dialogue in Chinese Intelligent Tutoring System Based on Weighted Latent Semantic Analysis |
DOI |
10.6342/NTU.2011.02712 |
Author |
陳家毅 |
Keywords |
Scoring dialogue ; latent semantic analysis ; weighted vector space model ; Chinese syntactic structure tree ; weighting ; syntactic information |
Journal title |
Theses of the Department of Engineering Science and Ocean Engineering, National Taiwan University |
Volume/issue and publication date |
2011 |
Degree type |
Master's |
Advisor |
郭振華 |
Language of content |
English |
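Chinese abstract |
This thesis follows the generic intelligent tutoring system architecture to build a platform for studying computer-assisted instruction based on natural language dialogue. Latent semantic analysis (LSA) is a statistical model that uses concept matching to simulate human knowledge representation in higher-level cognition. Although the technique has been applied with notable success in many fields, its lack of word-order and syntactic-structure information often causes LSA to misjudge the similarity between a user's utterance and the expected answer in dialogue-based tutoring. This study draws syntactic information from the Sinica Treebank, the Chinese treebank published by Academia Sinica, and identifies a suitable weighting function that adjusts the vector-space value of each word in the user's utterance, compensating for the word-order and syntactic information that LSA's vector space model lacks. Three forms of weighting function were tested and examined, using 264 question-answer records from 20 participants. The statistics show that, of the three functions tested, 2 to the power of n (2^n) achieves the better relative accuracy and precision, where n is the height of the head word in the Chinese syntactic structure tree. |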
English abstract |
In this study we followed the generic framework of intelligent tutoring systems (ITS) and constructed an ITS platform to investigate computer-assisted instruction with natural language dialogue. Unlike literal matching, latent semantic analysis (LSA) matches concepts and is used primarily to model higher-level cognition. As a statistical model of human language knowledge representation, LSA has been highly successful in many different areas. However, its neglect of word order and syntactic information is its primary limitation. This work proposes adding syntactic information to address that limitation: the weight of each vector element is adjusted according to the syntactic role of the corresponding word in its sentence. The Sinica Treebank is adopted as the basis for determining the weights. Three weighting functions were proposed and tested, with positive results. Among them, the weighting function 2 to the power of n (2^n), where n is the height of the head word in the syntactic tree, provided the highest relative precision and accuracy. The results also show that the weighting function is effective for traditional vector space models, not only for LSA. |
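The 2^n weighting described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it uses a plain term vector rather than an LSA space, the function names `weighted_vector` and `cosine` are hypothetical, and the heights n are hand-assigned illustrative values, since this record does not say how n is read off a Sinica Treebank parse.

```python
import math
from collections import defaultdict


def weighted_vector(tokens_with_height):
    """Build a sparse term vector in which each occurrence counts 2**n.

    tokens_with_height: iterable of (word, n) pairs. Here n stands in for
    the head-word height in the syntactic tree (an assumed, hand-assigned
    non-negative integer in this sketch).
    """
    vec = defaultdict(float)
    for word, n in tokens_with_height:
        vec[word] += 2.0 ** n
    return dict(vec)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Illustrative data only: English words with made-up heights.
expected = [("force", 2), ("equals", 1), ("mass", 0), ("acceleration", 0)]
student = [("force", 2), ("is", 1), ("big", 0)]

# Baseline: setting every height to 0 gives each word weight 2**0 == 1.
flat_expected = [(w, 0) for w, _ in expected]
flat_student = [(w, 0) for w, _ in student]

weighted_score = cosine(weighted_vector(expected), weighted_vector(student))
flat_score = cosine(weighted_vector(flat_expected), weighted_vector(flat_student))
```

In this toy case the two sentences share only the word given the largest height, so the weighted score comes out higher than the unweighted one. Whether that shift improves agreement with human scoring is what the thesis evaluates; the sketch makes no claim about it.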
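Subject classification |
Basic and Applied Sciences > Marine Science ; College of Engineering > Department of Engineering Science and Ocean Engineering ; Engineering > General Engineering |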