Title

利用政府開放資料探討影響台北市房價之主要房屋特性及周邊設施影響因子

Parallel Title

EXPLORING THE EFFECT OF ENVIRONMENTAL FACTORS AND RESIDENTIAL CHARACTERISTICS ON HOUSING PRICES IN TAIPEI CITY USING OPEN DATA

DOI

10.6338/JDA.201910_14(5).0001

Authors

劉富容(Fu-Jung Liu);游璿達(Hsuan-Ta Yu);黃孝雲(Hsiao-Yun Huang);劉正夫(Jeng-Fu Liu)

Keywords

實價登錄 ; 公開資料 ; 房價預估 ; 隨機森林模型 ; 環境因子 ; Actual Selling Price Registration ; housing price valuation ; random forest ; environmental factors

Journal

Journal of Data Analysis

Volume, Issue / Publication Date

Vol. 14, No. 5 (2019/10/01)

Pages

1 - 26

Language

Traditional Chinese

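Chinese Abstract (English translation)

In recent years, because the housing price-to-income ratio has remained persistently high, housing price issues have drawn intense attention from both the public and the government. With the government's implementation of the actual selling price registration system for real estate transactions and the successive release of open data, a large volume of real estate transaction data has been disclosed. This provides an unprecedented amount of information for understanding a key question: which housing characteristics and surrounding facilities chiefly affect housing prices. However, these data come from databases scattered across many sources and are characterized by large volume, complex data types, diverse coverage, and pervasive extreme values, so investigating this question poses a significant challenge. This study adopts the random forest method, which has performed very well in many applications, as the model for examining the factors that affect housing prices. Empirical analysis shows that, in housing price estimation, the random forest model outperforms the compared methods on every metric used, which demonstrates the advantage of random forests for this type of housing price analysis. Using the interpretability of the random forest model, the study selects the 15 most important factors affecting housing prices from 45 broadly collected independent variables and describes how these factors relate to housing prices. The study is also the first to find that the "remarks" field of the actual price registration data is the second most important variable, after the administrative district.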

English Abstract

In recent years, the high house price-to-income ratio has made housing prices an important issue for both the government and the public. The Taiwan government adopted a series of related measures, including the actual price registration policy for real estate transactions. Together with the open data trend, this marks the first time that such a large amount of real estate information has been publicly available in Taiwan, which provides a good opportunity to explore the effect of environmental factors and residential characteristics on housing prices. However, data drawn from various databases have complicated properties, which poses a major challenge for researchers during data analysis. In this paper, a state-of-the-art model, random forest, was employed to explore the main factors of housing prices. According to our experimental results, random forest performed best among the compared models in predicting housing prices, which suggests that it is well suited to analyzing housing prices. In addition, according to the associated variable importance measure, 15 important factors were identified among the 45 broadly collected predictor variables, and the relationships between housing prices and these 15 factors were also addressed. Moreover, this paper is the first study to reveal that the importance of the remark field in the actual price registration database is surpassed only by that of the region field.
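The variable-ranking idea summarized in the abstract can be illustrated with a toy example. The following is a minimal sketch only, not the authors' code or data: it builds a small random forest (bootstrap samples plus a random feature subset at each split) in pure Python on synthetic records, then ranks predictors by permutation importance measured on a held-out set rather than on out-of-bag samples. The feature names `district`, `floor_area`, and `noise`, the price formula, and all parameter values are assumptions chosen for illustration.

```python
import random


def best_split(rows, ys, features):
    """Return (sse, feature, threshold) with the lowest squared error, or None."""
    best = None
    for f in features:
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t)
    return best


def build_tree(rows, ys, n_try, depth, rng):
    """Grow a regression tree; each node considers n_try randomly chosen features."""
    if depth == 0 or len(rows) < 10:
        return sum(ys) / len(ys)  # leaf: mean response
    split = best_split(rows, ys, rng.sample(range(len(rows[0])), n_try))
    if split is None:  # all candidate features are constant in this node
        return sum(ys) / len(ys)
    _, f, t = split
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in li], [ys[i] for i in li], n_try, depth - 1, rng),
            build_tree([rows[i] for i in ri], [ys[i] for i in ri], n_try, depth - 1, rng))


def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, ys, n_trees, n_try, depth, rng):
    """Fit each tree on a bootstrap sample of the training data."""
    forest, n = [], len(rows)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx], [ys[i] for i in idx],
                                 n_try, depth, rng))
    return forest


def mse(forest, rows, ys):
    preds = [sum(predict_tree(t, r) for t in forest) / len(forest) for r in rows]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def permutation_importance(forest, rows, ys, rng):
    """Increase in held-out MSE after shuffling each feature column in turn."""
    base = mse(forest, rows, ys)
    scores = []
    for f in range(len(rows[0])):
        col = [r[f] for r in rows]
        rng.shuffle(col)
        shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, col)]
        scores.append(mse(forest, shuffled, ys) - base)
    return scores


# Synthetic data (hypothetical): price depends strongly on district,
# weakly on floor area, and not at all on the noise column.
rng = random.Random(0)
names = ["district", "floor_area", "noise"]
rows = [[rng.randint(0, 3), rng.randint(20, 60), rng.randint(0, 9)] for _ in range(300)]
ys = [50 * d + a + rng.gauss(0, 5) for d, a, _ in rows]
train_x, test_x, train_y, test_y = rows[:200], rows[200:], ys[:200], ys[200:]

forest = fit_forest(train_x, train_y, n_trees=25, n_try=2, depth=5, rng=rng)
scores = permutation_importance(forest, test_x, test_y, rng)
ranking = sorted(zip(names, scores), key=lambda p: p[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.1f}")
```

Selecting the top-ranked entries of `ranking` mirrors, in miniature, the step of keeping the most important predictors out of a larger candidate set. In practice a tested library implementation would replace this toy forest.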

Subject Classification

Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics
Social Sciences > Management