This paper reviews real estate price estimation in France, a market that has received little attention. We compare seven popular machine learning techniques using an approach that quantifies the relevance of location features in real estate price estimation at both coarse (city) and fine (geocoded address) levels of granularity. We take advantage of a newly available open dataset provided by the French government that contains five years of historical real estate transaction data. At the coarse level, we find important differences in the models’ predictive power between cities with medium and high standards of living (precision differences beyond 70% in some cases). At the fine level, we use geocoding to add precise geographical location features to the inputs of the machine learning algorithms, which yields important improvements in the models’ forecasting power relative to models trained without these features (improvements beyond 50% for some forecasting error measures). Our results also reveal that neural networks and random forests particularly outperform other methods when geocoding features are not accounted for, while random forest, AdaBoost and gradient boosting perform well when geocoding features are considered. These results can be of particular interest for identifying opportunities in the real estate market through price prediction. They can also serve as a basis for price assessment in revenue management for durable and non-replenishable products such as real estate.
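The comparison between models trained with and without location features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the transactions are synthetic, the coordinates lie in a made-up unit square rather than coming from a real geocoder, the price formula is invented, and a plain k-nearest-neighbours regressor stands in for the techniques named above. It only shows how a relative improvement in one error measure (MAPE) could be computed, not the paper's actual pipeline or results.

```python
import math
import random


def make_transactions(n, seed):
    """Synthetic sales: price depends on surface and on distance to a made-up centre."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        surface = rng.uniform(20.0, 120.0)        # living area in m^2 (assumed range)
        lat = rng.uniform(0.0, 1.0)               # stand-ins for geocoded coordinates
        lon = rng.uniform(0.0, 1.0)
        dist = math.hypot(lat - 0.5, lon - 0.5)   # distance to the hypothetical centre
        price_per_m2 = 8000.0 - 6000.0 * dist     # invented rule: pricier near the centre
        price = surface * price_per_m2 * rng.uniform(0.95, 1.05)
        rows.append({"surface": surface, "lat": lat, "lon": lon, "price": price})
    return rows


def knn_predict(train, query, features, scales, k=5):
    """Average the prices of the k training rows closest to the query."""
    def distance(row):
        return math.sqrt(sum(((row[f] - query[f]) / scales[f]) ** 2 for f in features))
    nearest = sorted(train, key=distance)[:k]
    return sum(r["price"] for r in nearest) / len(nearest)


def mape(train, test, features, scales):
    """Mean absolute percentage error of the k-NN predictions on the test rows."""
    errors = [
        abs(knn_predict(train, q, features, scales) - q["price"]) / q["price"]
        for q in test
    ]
    return 100.0 * sum(errors) / len(errors)


# Rough feature scales so that no single feature dominates the distance (assumed).
SCALES = {"surface": 50.0, "lat": 1.0, "lon": 1.0}

train = make_transactions(1000, seed=0)
test = make_transactions(150, seed=1)

mape_without = mape(train, test, ["surface"], SCALES)
mape_with = mape(train, test, ["surface", "lat", "lon"], SCALES)
improvement = 100.0 * (mape_without - mape_with) / mape_without

print(f"MAPE without location features: {mape_without:.1f}%")
print(f"MAPE with location features:    {mape_with:.1f}%")
print(f"Relative improvement:           {improvement:.1f}%")
```

Because the synthetic price is driven largely by location, the model that sees the coordinates should show the lower error here; on real transaction data the size of the gap would depend on the city, the features and the model.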
The link to the dataset “Demands of land values” is: https://www.data.gouv.fr/fr/datasets/5c4ae55a634f4117716d5656/.
About this article
Cite this article
Tchuente, D., Nyawa, S. Real estate price estimation in French cities using geocoding and machine learning. Ann Oper Res (2021). https://doi.org/10.1007/s10479-021-03932-5
- Real estate market
- Automated valuation models
- French cities
- Machine learning
- Artificial intelligence