Abstract
This study presents a methodology for determining the risk scores of individuals from a given financial risk preference survey. To this end, we use a regression-based iterative algorithm to determine the weights of the survey questions in the scoring process. Next, we build classification models that classify individuals into risk-averse and risk-seeking categories using only a subset of the survey questions. We illustrate the methodology on a sample survey with 656 respondents. We find that demographic (indirect) questions can be almost as successful as risk-related (direct) questions in predicting the risk preference classes of respondents. Using a decision-tree-based classification model, we discuss how one can generate actionable business rules from the findings.
Notes
1. See, for example, http://www.paragonwealth.com/risk_tolerance.php.
2. Also referred to as classification trees.
3. Science high school: specially designated high schools that heavily implement a math- and science-oriented curriculum.
Acknowledgements
The authors thank Sabancı University (SU) alumni Levent Bora, Kıvanc Kılınç, Onur Özcan, Feyyaz Etiz for their work on earlier phases of the study, and students Serpil Çetin and Nazlı Ceylan Ersöz for collecting the data for the case study. The authors also thank SU students Gizem Gürdeniz, Havva Gözde Ekşioğlu and Dicle Ceylan for their assistance. This chapter is dedicated to the memory of Mr. Turgut Uzer, a leading industrial engineer in Turkey, who passed away in February 2011. Mr. Turgut Uzer inspired the authors greatly with his vision, unmatched know-how, and dedication to the advancement of decision sciences.
Appendices
Appendix A: Selected Survey Questions
The following are selected direct (risk-related) questions from the case-study survey; they constitute the corresponding direct (risk-related) attributes.
Q34
Over the long term, typically, investments which are more volatile (i.e., that tend to fluctuate more in value) have greater potential for return (Stocks, for example, have high volatility; whereas government bonds have low volatility). Given this trade-off, what would be the level of volatility you would prefer for your investment?
- a. Less than 3%
- b. 3% to 5%
- c. 5% to 7%
- d. 7% to 13%
- e. More than 13%
Q32
What is your most important investment priority?
- a. I aim to protect my capital; I cannot stand losing money.
- b. I am OK with small growth; I cannot take much risk.
- c. I aim for an investment that delivers the market return rate.
- d. I want higher than market return; I am OK with volatility.
- e. Return is the most important for me. I am ready to take high risk for high return.
Q23
Compared to others, how do you rate your willingness to take risk?
- a. Very low
- b. Low
- c. Average
- d. High
- e. Very high
Q42
What is your most preferred investment strategy?
- a. I want my investments to be secure. I also need my investments to provide me with modest income now, or to fund a large expense within the next few years.
- b. I want my investments to grow and I am less concerned about income. I am comfortable with moderate market fluctuations.
- c. I am more interested in having my investments grow over the long-term. I am comfortable with short-term return volatility.
- d. I want long-term aggressive growth and I am willing to accept significant short-term market fluctuations.
Appendix B: ScoringAlgorithm
The following is the mathematical presentation of the developed scoring algorithm:
Sets
- \(\mathcal{I}\): set of respondents (observations, rows) in the sample; \(i=1,\ldots,I\)
- \(\mathcal{J}\): set of attributes (questions, columns); \(j=1,\ldots,J\)
- \(\mathcal{V}\): set of ordinal values for each attribute; \(v=1,\ldots,V\). For the presented case study, \(\mathcal{V}=(a,b,c,d,e)\), where \(a\le b\le c\le d\le e\)
Inputs
- \(\mathbf{O}={[o_{ij}]}_{I\times J}\): matrix of ordinal values of all attributes for all respondents
- \(m_{j}\): number of possible ordinal values for attribute j; \(m_{j}\le 5\) in this study
Internal Variables
- \(\mathbf{A}={[a_{ij}]}_{I\times J}\): matrix of numerical (nominal) values of all attributes for all respondents
- \(y_{i}\): temporary adjusted risk score for respondent i, to be used in regression
Parameters
- \(E\): threshold on absolute percentage error (falling below this value will terminate the algorithm)
- \(\alpha\): threshold for type-1 error (probability of rejecting a hypothesis when the hypothesis is in fact true)
- \(M\): a very large number
- \(\mathbf{B}\): transformation matrix for converting the ordinal input value matrix \(\mathbf{O}\) into the numerical (nominal) value matrix \(\mathbf{A}\)
Outputs
- \(z_{j}\): whether attribute j is to be included in computing the risk score; \(z_{j}\in\{0,1\}\)
- \(w_{j}\): weight for attribute j; \(w_{j}\ge 0\)
- \(\beta_{0j}\): intercept value for attribute j
- \(\beta_{1j}\): slope value for attribute j
- \(\varGamma_{j}\): sign multiplier for attribute j; \(\varGamma_{j}\in\{-1,1\}\)
- \(x_{i}\): risk score for respondent i
Functions
\(f(v,n):(\mathcal{V},\{2,\ldots,V\})\to[0,3]\): mapping function for an attribute with n possible values, which transforms the ordinal value v collected for that attribute into a nominal value \(b_{v,n-1}\).
where, for V=5,
\(\mathit{regression}(\mathbf{y},\mathbf{a}')\)
solve regression model \(\mathbf{y}={\beta}_{0}+{\beta }_{1}\mathbf{a}'+\varepsilon\) for vectors \(\mathbf{y}\) and \(\mathbf{a}'\)
return \((p,\beta_{0},\beta_{1})\), where p is the p-value for the regression model
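For a single regressor, the quantities returned by \(\mathit{regression}\) have closed forms. The following is a sketch that assumes ordinary least squares and a two-sided t-test on the slope; the text does not state which estimator or test was used, and the symbols \(\bar{y}\), \(\bar{a}'\), \(s^{2}\), \(t\) and \(T_{I-2}\) (a Student t variable with \(I-2\) degrees of freedom) are introduced here only for illustration.

\[
\beta_{1}=\frac{\sum_{i}(a'_{i}-\bar{a}')(y_{i}-\bar{y})}{\sum_{i}(a'_{i}-\bar{a}')^{2}},\qquad
\beta_{0}=\bar{y}-\beta_{1}\bar{a}'
\]
\[
s^{2}=\frac{\sum_{i}(y_{i}-\beta_{0}-\beta_{1}a'_{i})^{2}}{I-2},\qquad
t=\frac{\beta_{1}}{\sqrt{s^{2}/\sum_{i}(a'_{i}-\bar{a}')^{2}}},\qquad
p=2\,\Pr(T_{I-2}\ge |t|)
\]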
\(\mathit{preprocess}()\)
// transform ordinal attribute values to nominal values
\(a_{ij}=f(o_{ij},m_{j});\ \forall (i,j)\in\mathcal{I}\times\mathcal{J}\)
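As an illustration of the pre-processing step, the sketch below maps ordinal answers to numerical values in [0, 3]. The entries of \(\mathbf{B}\) are not reproduced in this text, so the equally spaced mapping used here is an assumption made purely for illustration; the Python names `ORDINALS`, `f` and `preprocess` are likewise hypothetical.

```python
from typing import List

ORDINALS = "abcde"  # ordinal answer labels used in the case study


def f(v: str, n: int) -> float:
    """Map ordinal value v of an attribute with n possible values to [0, 3].

    ASSUMPTION: equally spaced values stand in for the entries b_{v,n-1}
    of the transformation matrix B, which is not available here.
    """
    if not 2 <= n <= len(ORDINALS):
        raise ValueError("n must be between 2 and 5")
    rank = ORDINALS.index(v)  # 0 for 'a', 1 for 'b', ...
    if rank >= n:
        raise ValueError(f"value {v!r} is not valid for an attribute with {n} values")
    return 3.0 * rank / (n - 1)


def preprocess(O: List[List[str]], m: List[int]) -> List[List[float]]:
    """Compute a_ij = f(o_ij, m_j) for every respondent i and attribute j."""
    return [[f(o_ij, m[j]) for j, o_ij in enumerate(row)] for row in O]
```

For example, under this assumed mapping, `preprocess([["a", "e", "d"]], [5, 5, 4])` gives the lowest answer a value of 0 and the highest answer of each attribute a value of 3, regardless of how many options the attribute has.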
Iteration-Related Notation
- \(k\): iteration count
- \(N\): number of attributes included in risk score computations at a given iteration
- \(W\): sum of weights for attributes
- \(\varepsilon_{k}\): absolute error at a given iteration k
- \(e_{k}\): absolute percentage error at a given iteration k
- \(\overline{e}_{k}\): average absolute percentage error at a given iteration k
\(\mathit{ScoringAlgorithm}(\mathbf{O}, m_{j})\)
BEGIN
    // perform pre-processing to transform ordinal data to nominal data
    \(\mathit{preprocess}()\)
    // initialization:
    // initially, all attributes are included in scoring,
    // with unit weight of 1 and sign multiplier of 1.
    // all of the regression intercepts are 0.
    \(z_{j}=1,\ w_{j}=1,\ \varGamma_{j}=1,\ \beta_{0j}=0;\ \forall j\in\mathcal{J}\)
    \(N=\sum_{j} z_{j}\)
    // begin with iteration count of 1
    \(k=1\)
\(\mathit{Begin\_Iteration}\)
    // standardize the weights, so that their sum W will equal N
    \(W=\sum_{j} w_{j} z_{j}\)
    \(w_{j}\leftarrow (N w_{j})/W;\ \forall j\in\mathcal{J}\)
    // compute the average of the intercepts
    \(\overline{\beta}_{0\cdot}=(\sum_{j}\beta_{0j} z_{j})/N\)
    // compute/update the risk scores at iteration k,
    // which are composed of the average intercept value
    // and the sum of weighted values for attributes
    \(x_{ik}=\overline{\beta}_{0\cdot}+\sum_{j}\varGamma_{j} w_{j} a_{ij};\ \forall i\in\mathcal{I}\)
    // compute total absolute error
    \(\varepsilon_{k}=\sum_{i} |x_{ik}-x_{i,k-1}|\)
    // correction for the initial error values
    if \(k=1\) then
        \(\varepsilon_{0}=\varepsilon_{1}\)
    // termination condition
    if \(\varepsilon_{k}=0\) then
        go to \(\mathit{Iterations\_Completed}\)
    // compute absolute percentage error,
    // and then its average over the last two iterations
    \(\overline{x}_{\cdot k}=\sum_{i} x_{ik}/I\)
    \(e_{k}=100\,\varepsilon_{k}/\overline{x}_{\cdot k}\)
    \(\overline{e}_{k}=(e_{k}+e_{k-1})/2\)
    // if the stopping criterion is satisfied, terminate the algorithm
    if \(\overline{e}_{k}<E\) then
        go to \(\mathit{Iterations\_Completed}\)
    // otherwise, continue with the regression modeling for each attribute j,
    // and then go to next iteration
    \(\forall j\in\mathcal{J}\)
        // if the attribute is included in the risk score calculation
        if \(z_{j}=1\) then
            // first remove the attribute value from the incumbent score
            // to eliminate its effect
            \(y_{i}=x_{ik}-a_{ij};\ \forall i\in\mathcal{I}\)
            // then define the vectors for the regression model of that attribute
            \(\mathbf{y}=(y_{i})\); \(\mathbf{a}'=(\varGamma_{j} a_{\cdot j})\)
            \((p,\beta_{0},\beta_{1})=\mathit{regression}(\mathbf{y},\mathbf{a}')\)
            // if the regression yields a high p value
            // that is greater than the type-1 error threshold \(\alpha\),
            // this means that attribute j does not contribute significantly
            // to the risk scores
            if \(p>\alpha\) then
                // and the attribute should not be included in risk calculations
                \(z_{j}=0\)
            else
                // else it will be included (will just keep its default value)
                \(z_{j}=1\)
                // and weight for the attribute will be the slope value
                // obtained from the regression
                \(w_{j}=\beta_{1}\)
                // the sign of the slope is important;
                // if it is negative, this should be noted
                if \(\beta_{1}<0\) then
                    // record the sign change in the sign multiplier
                    \(\varGamma_{j}=-1\)
                else
                    \(\varGamma_{j}=1\)
    // advance the iteration count and begin the next iteration
    \(k\leftarrow k+1\)
    go to \(\mathit{Begin\_Iteration}\)
\(\mathit{Iterations\_Completed}\)
    \(x_{i}=x_{ik}\)
    return \(x_{i}, z_{j}, w_{j}, \varGamma_{j}, \beta_{0j}\)
END
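The pseudocode above can be sketched in Python as follows. This is an illustrative transliteration under stated assumptions, not the authors' implementation: the p-value uses a normal approximation to the t distribution; \(N\) is recomputed at every iteration; excluded attributes (\(z_{j}=0\)) are masked out of the score; the intercept \(\beta_{0j}\) is stored whenever an attribute is kept; the scores before the first iteration are taken as 0; and a `max_iter` guard is added. The function names and default parameter values are hypothetical.

```python
import math
from statistics import NormalDist, mean


def regression(y, a):
    """OLS fit of y = b0 + b1 * a; returns (p, b0, b1).

    ASSUMPTION: p is a two-sided slope p-value from a normal approximation.
    """
    n = len(y)
    if n < 3:
        raise ValueError("need at least 3 observations")
    a_bar, y_bar = mean(a), mean(y)
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    if s_aa == 0:
        return 1.0, y_bar, 0.0  # constant regressor: no detectable slope
    b1 = sum((ai - a_bar) * (yi - y_bar) for ai, yi in zip(a, y)) / s_aa
    b0 = y_bar - b1 * a_bar
    sse = sum((yi - b0 - b1 * ai) ** 2 for ai, yi in zip(a, y))
    se = math.sqrt(sse / (n - 2) / s_aa)
    if se == 0:
        return (0.0 if b1 else 1.0), b0, b1  # perfect fit
    return 2.0 * (1.0 - NormalDist().cdf(abs(b1) / se)), b0, b1


def scoring_algorithm(A, E=1.0, alpha=0.05, max_iter=100):
    """A is the I x J matrix of numerical attribute values (after preprocess)."""
    I, J = len(A), len(A[0])
    z, w, gamma, beta0 = [1] * J, [1.0] * J, [1] * J, [0.0] * J
    x_prev = [0.0] * I  # ASSUMPTION: scores before iteration 1 are 0
    e_prev = None
    x = x_prev
    for _ in range(max_iter):  # ASSUMPTION: iteration cap, not in the pseudocode
        N = sum(z)  # ASSUMPTION: N is recomputed every iteration
        W = sum(w[j] * z[j] for j in range(J))
        if N == 0 or W == 0:
            break  # guard: nothing left to standardize
        w = [N * wj / W for wj in w]
        beta0_bar = sum(beta0[j] * z[j] for j in range(J)) / N
        # ASSUMPTION: attributes with z_j = 0 do not contribute to the score
        x = [beta0_bar + sum(gamma[j] * w[j] * z[j] * row[j] for j in range(J))
             for row in A]
        eps = sum(abs(xi - xp) for xi, xp in zip(x, x_prev))
        x_bar = sum(x) / I
        if eps == 0 or x_bar == 0:
            break  # converged (or guard against division by zero)
        e = 100.0 * eps / x_bar
        e_bar = (e + (e if e_prev is None else e_prev)) / 2.0
        if e_bar < E:
            break
        for j in range(J):
            if z[j] == 1:
                y = [x[i] - A[i][j] for i in range(I)]
                a_prime = [gamma[j] * A[i][j] for i in range(I)]
                p, b0, b1 = regression(y, a_prime)
                if p > alpha:
                    z[j] = 0
                else:
                    w[j], beta0[j] = b1, b0  # ASSUMPTION: intercept stored too
                    gamma[j] = -1 if b1 < 0 else 1
        x_prev, e_prev = x, e
    return x, z, w, gamma, beta0
```

On a toy matrix whose third column is constant, for instance `[[0, 0, 1], [1, 1, 1], [2, 2, 1], [3, 3, 1]]`, the sketch drops the constant attribute because its regression cannot detect a slope and keeps the other two.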
Copyright information
© 2012 Springer-Verlag London
About this chapter
Cite this chapter
Ertek, G., Kaya, M., Kefeli, C., Onur, Ö., Uzer, K. (2012). Scoring and Predicting Risk Preferences. In: Cao, L., Yu, P. (eds) Behavior Computing. Springer, London. https://doi.org/10.1007/978-1-4471-2969-1_9
DOI: https://doi.org/10.1007/978-1-4471-2969-1_9
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2968-4
Online ISBN: 978-1-4471-2969-1
eBook Packages: Computer Science, Computer Science (R0)