Abstract
Datasets with a large number of explanatory variables require different analytical approaches than traditional datasets with only a few. Identifying the effective explanatory variables, particularly in complex, large-scale data, offers an opportunity to increase efficiency and reduce costs. In such data, a variable selection technique can be used to specify a subset of explanatory variables that are significantly more valuable to analyze, especially in survival data analysis. This paper presents a heuristic variable selection method based on ranking classification for analyzing large-scale survival data. The method reduces redundant information and facilitates practical decision-making by evaluating variable efficiency (the correlation between a variable and survival time). A numerical simulation experiment is developed to investigate the performance of the proposed method and to validate it.
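The idea of ranking variables by their correlation with survival time can be illustrated with a minimal sketch. This is not the authors' heuristic: it is a plain correlation filter under simplifying assumptions (no censoring, Pearson correlation against log survival time, simulated data in which only two variables matter). The function name `rank_variables`, the parameter `top_k`, and the simulated coefficients are hypothetical and chosen only for illustration.

```python
import numpy as np

def rank_variables(X, time, top_k):
    """Rank the columns of X by |Pearson correlation| with `time`.

    Assumes every column of X has non-zero variance and that all
    survival times are observed (censoring is ignored in this sketch).
    """
    Xc = X - X.mean(axis=0)            # center each variable
    tc = time - time.mean()            # center the response
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (tc ** 2).sum())
    corr = (Xc * tc[:, None]).sum(axis=0) / denom
    order = np.argsort(-np.abs(corr))  # strongest association first
    return order[:top_k], corr

# Simulated data (hypothetical): 20 candidate variables, of which only
# variables 0 and 1 influence the log survival time.
rng = np.random.default_rng(0)
n, p = 500, 20
X = rng.normal(size=(n, p))
log_time = 1.0 * X[:, 0] - 0.8 * X[:, 1] + 0.3 * rng.normal(size=n)
time = np.exp(log_time)

selected, corr = rank_variables(X, np.log(time), top_k=2)
print("selected variables:", sorted(selected.tolist()))
```

With real survival data, censored observations would need explicit handling (for example through a censoring-aware association measure), which this sketch deliberately omits.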
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Fard, N., Sadeghzadeh, K. (2015). Heuristic Ranking Classification Method for Complex Large-Scale Survival Data. In: Le Thi, H., Pham Dinh, T., Nguyen, N. (eds) Modelling, Computation and Optimization in Information Systems and Management Sciences. Advances in Intelligent Systems and Computing, vol 360. Springer, Cham. https://doi.org/10.1007/978-3-319-18167-7_5
DOI: https://doi.org/10.1007/978-3-319-18167-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18166-0
Online ISBN: 978-3-319-18167-7
eBook Packages: Engineering, Engineering (R0)