Estimation of missing prices in real-estate market agent-based simulations with machine learning and dimensionality reduction methods
The opacity of real-estate markets poses challenges for their agent-based simulation. While some real-estate websites publish the prices of a large number of houses, the prices of the remaining houses are not available. Estimating these missing prices is necessary for simulating price evolution from a complete initial set of houses. This estimation could also be useful for other purposes, such as appraising houses, letting buyers know which offered prices are the best (i.e., the lowest compared with the appraisals) and recommending an initial price to buyers. This work proposes combining dimensionality reduction methods with machine learning techniques to estimate the prices. In particular, it analyzes nonnegative matrix factorization, recursive feature elimination and feature selection with a variance threshold as dimensionality reduction methods. It compares linear regression, support vector regression, k-nearest neighbors and a multilayer perceptron neural network as machine learning techniques. A tenfold cross-validation was applied to compare the estimations and errors and to assess the improvement over a basic estimator commonly used at the beginning of simulations. The developed software and the dataset are freely available from a research data repository for the sake of reproducibility and to support other researchers.
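As a rough illustration of the workflow described above, the following minimal sketch chains one dimensionality reduction step (feature selection with a variance threshold) with one learner (k-nearest neighbors) and scores it with tenfold cross-validation against a simple baseline. It is not the authors' released code. The data are synthetic, the column names (area, rooms, city_flag) and the helper functions (variance_threshold, knn_predict, mean_baseline, tenfold_mae) are hypothetical, and the baseline is assumed here to be the mean of the known prices, since the abstract does not specify the basic estimator.

```python
import random
import statistics


def variance_threshold(rows, threshold=0.0):
    """Return indices of feature columns whose variance exceeds the threshold."""
    columns = list(zip(*rows))
    return [j for j, col in enumerate(columns)
            if statistics.pvariance(col) > threshold]


def knn_predict(train_x, train_y, query, k=5):
    """Estimate a price as the mean price of the k nearest training houses."""
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    return statistics.fmean([y for _, y in by_distance[:k]])


def mean_baseline(train_x, train_y, query):
    """Assumed basic estimator: the mean of all known prices."""
    return statistics.fmean(train_y)


def tenfold_mae(xs, ys, predict, folds=10, seed=0):
    """Mean absolute error of `predict` under k-fold cross-validation."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    errors = []
    for f in range(folds):
        test_idx = set(order[f::folds])  # every folds-th house is held out
        train_x = [xs[i] for i in order if i not in test_idx]
        train_y = [ys[i] for i in order if i not in test_idx]
        for i in test_idx:
            errors.append(abs(predict(train_x, train_y, xs[i]) - ys[i]))
    return statistics.fmean(errors)


# Synthetic houses (an assumption for illustration, not the Teruel dataset).
rng = random.Random(42)
houses, prices = [], []
for _ in range(200):
    area = rng.uniform(40, 150)  # square metres
    rooms = rng.randint(1, 5)
    city_flag = 1.0              # constant column, carries no information
    houses.append([area, float(rooms), city_flag])
    prices.append(1500 * area + 5000 * rooms + rng.gauss(0, 5000))

# Dimensionality reduction: drop zero-variance features.
kept = variance_threshold(houses)
reduced = [[row[j] for j in kept] for row in houses]

# Compare the learner with the baseline under tenfold cross-validation.
baseline_mae = tenfold_mae(reduced, prices, mean_baseline)
knn_mae = tenfold_mae(reduced, prices, knn_predict)
print(f"kept feature columns: {kept}")
print(f"baseline MAE: {baseline_mae:.0f}")
print(f"k-NN MAE:     {knn_mae:.0f}")
```

The same cross-validation loop could wrap any of the other estimators or reduction methods named in the abstract. Only the variance threshold and k-nearest neighbors are shown here because they can be written compactly without external libraries.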
Keywords: Agent-based simulation, Machine learning, Real-estate market, Simulation setup
This work has been supported by the program “Estancias de movilidad en el extranjero José Castillejo para jóvenes doctores” funded by the Spanish Ministry of Education, Culture and Sport with reference CAS17/00005. This work also acknowledges the research project “Diseño de actividades de aprendizaje colaborativas con Big Data” with reference PIIDUZ_16_120 funded by the University of Zaragoza. We acknowledge the research project “Construcción de un framework para agilizar el desarrollo de aplicaciones móviles en el ámbito de la salud” funded by the University of Zaragoza and Foundation Ibercaja with grant reference JIUZ-2017-TEC-03. We also acknowledge support from “Universidad de Zaragoza,” “Fundación Bancaria Ibercaja” and “Fundación CAI” in the “Programa Ibercaja-CAI de Estancias de Investigación” with reference IT1/18. This work was partially supported by the Spanish research grant MTM2015-65433-P (MINECO/FEDER), Gobierno de Aragón and Fondo Social Europeo. Furthermore, we acknowledge the “Fondo Social Europeo” and the “Departamento de Tecnología y Universidad del Gobierno de Aragón” for their joint support with grant number Ref-T81.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest regarding this work.