Abstract
Employee turnover is one of the major issues that every company faces. The loss is especially great when the departing employee has advanced skills in his or her field. To identify the most dominant reasons for employee attrition, we determine features and apply machine learning algorithms to features that have been processed and reduced beforehand. We propose a new model in which particular attributes of employee turnover are selected and adjusted accordingly. In the first phase of our reduction method, the Sequential Backward Selection (SBS) algorithm reduces the features from a larger number to a relatively small, significant number. In the second phase, Chi2 and Random Forest importance are used together to determine the important features common to both algorithms, which can be considered the foremost features that lead to employee turnover. Our two-step feature selection technique confirms that mainly three features are responsible for an employee’s departure. These selected minimal features are then tested with state-of-the-art machine learning algorithms: Decision Tree, Random Forest, Support Vector Machine, Multi-layer Perceptron (MLP), K-Nearest Neighbor (kNN), and Gaussian Naïve Bayes. Lastly, the test results are visualized in a 3D representation to learn which features are precisely involved in employee turnover.
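The two-phase reduction described above can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the synthetic data, the kNN estimator used inside SBS, the choice of five features after the first phase, and the cutoff k = 3 for each ranking are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector, chi2
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for an HR data set (assumption: 9 features, 3 informative).
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=3, n_redundant=0, random_state=0
)
# chi2 requires non-negative inputs, so scale every feature into [0, 1].
X = MinMaxScaler().fit_transform(X)

# Phase 1: sequential backward selection down to 5 features (assumed target).
sbs = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=5),
    n_features_to_select=5,
    direction="backward",
    cv=3,
)
sbs.fit(X, y)
phase1 = np.flatnonzero(sbs.get_support())  # original column indices kept

# Phase 2: rank the surviving features with chi2 and random forest importance.
X1 = X[:, phase1]
chi2_scores, _ = chi2(X1, y)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X1, y)

k = 3  # assumed number of top features taken from each ranking
top_chi2 = set(phase1[np.argsort(chi2_scores)[-k:]])
top_forest = set(phase1[np.argsort(forest.feature_importances_)[-k:]])

# Features that both rankings place in their top k.
common = sorted(int(i) for i in top_chi2 & top_forest)
print("After SBS:", phase1.tolist())
print("Common to chi2 and random forest:", common)
```

Intersecting the two rankings keeps only the features that both a univariate statistic and a tree-based importance measure agree on, so the final set can be smaller than k when the rankings disagree.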
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Alam, M.M., Mohiuddin, K., Islam, M.K., Hassan, M., Hoque, M.AU., Allayear, S.M. (2019). A Machine Learning Approach to Analyze and Reduce Features to a Significant Number for Employee’s Turn Over Prediction Model. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Computing. SAI 2018. Advances in Intelligent Systems and Computing, vol 857. Springer, Cham. https://doi.org/10.1007/978-3-030-01177-2_11
DOI: https://doi.org/10.1007/978-3-030-01177-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01176-5
Online ISBN: 978-3-030-01177-2
eBook Packages: Intelligent Technologies and Robotics (R0)