Supervised item response models for informative prediction

Idé, Tsuyoshi; Dhurandhar, Amit

doi:10.1007/s10115-016-0976-2

Regular Paper
Published: 02 August 2016

Volume 51, pages 235–257, (2017)

Knowledge and Information Systems

Tsuyoshi Idé¹ &
Amit Dhurandhar¹

Abstract

Supporting human decision-making is a major goal of data mining. The more critical the decision, the more interpretable the predictive model must be. This paper proposes a new framework for building a fully interpretable predictive model for questionnaire data while maintaining reasonable prediction accuracy with regard to the final outcome. Such a model has applications in project risk assessment, healthcare, social studies, and, presumably, any real-world application that relies on questionnaire data for informative and accurate prediction. Our framework is inspired by models in item response theory (IRT), which were originally developed in psychometrics with applications to standardized academic tests. We extend these models, which are essentially unsupervised, to the supervised setting. For model estimation, we introduce a new iterative algorithm that combines Gauss–Hermite quadrature with an expectation–maximization algorithm. The learned probabilistic model is linked to the metric learning framework for informative and accurate prediction. The model is validated on three real-world data sets: two are from information technology project failure prediction, and the third is an international social survey on people’s happiness. To the best of our knowledge, this is the first work that leverages the IRT framework to provide informative and accurate prediction on ordinal questionnaire data.
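
The abstract names Gauss–Hermite quadrature as one ingredient of the estimation algorithm. As a rough illustration of that ingredient only, the sketch below approximates an expectation over a standard normal latent trait, the kind of integral that typically appears in the E-step of IRT-style models. It is not the paper's algorithm: the two-parameter logistic item and the names `response_prob`, `marginal_prob`, `a`, and `b` are assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def gauss_hermite_expectation(f, n_nodes=21):
    """Approximate E[f(theta)] for theta ~ N(0, 1) by Gauss-Hermite quadrature."""
    # hermgauss returns nodes and weights for the weight function exp(-x^2).
    x, w = hermgauss(n_nodes)
    # Substituting theta = sqrt(2) * x maps exp(-x^2) onto the N(0, 1) density,
    # leaving a normalizing factor of 1 / sqrt(pi).
    return np.sum(w * f(np.sqrt(2.0) * x)) / np.sqrt(np.pi)


def response_prob(theta, a, b):
    """Hypothetical two-parameter logistic item: P(y = 1 | theta).

    a (discrimination) and b (difficulty) are illustrative assumptions,
    not parameters taken from the paper.
    """
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))


def marginal_prob(a, b, n_nodes=21):
    """Marginal P(y = 1), integrating the latent trait out numerically."""
    return gauss_hermite_expectation(lambda t: response_prob(t, a, b), n_nodes)


if __name__ == "__main__":
    # Sanity check: the second moment of a standard normal is 1.
    print(gauss_hermite_expectation(lambda t: t ** 2))
    # A symmetric item (b = 0) should give a marginal probability of one half.
    print(marginal_prob(a=1.0, b=0.0))
```

In an EM loop, a quadrature rule of this kind would replace the intractable integral over the latent trait with a weighted sum over a fixed set of nodes, so each iteration reduces to finite sums; how the paper sets this up for ordinal responses and supervision is not recoverable from the abstract.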

Author information

Authors and Affiliations

IBM Research, T. J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY, 10592, USA
Tsuyoshi Idé & Amit Dhurandhar

Corresponding author

Correspondence to Tsuyoshi Idé.

About this article

Cite this article

Idé, T., Dhurandhar, A. Supervised item response models for informative prediction. Knowl Inf Syst 51, 235–257 (2017). https://doi.org/10.1007/s10115-016-0976-2

Received: 10 January 2016
Revised: 28 April 2016
Accepted: 23 July 2016
Published: 02 August 2016
Issue Date: April 2017
DOI: https://doi.org/10.1007/s10115-016-0976-2
