Abstract
As a distributed computing framework, MapReduce partially overcomes the computation and storage limitations of centralized systems. For matrix computation, however, distributed data storage conflicts with tightly coupled computation. To resolve this conflict, new approaches to matrix transposition and multiplication with MapReduce are proposed. Applying a new model based on these parallel matrix computing methods removes the computational bottleneck of the logistic regression algorithm. Experimental results show that the new computing model achieves nearly linear speedup.
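The abstract does not give the chapter's actual algorithms, so the following is only a minimal single-process sketch of how matrix transposition and multiplication can be phrased as map and reduce steps, and how a logistic regression gradient could then be built from them. All function names (`map_products`, `reduce_sum`, `mr_matmul`, `mr_transpose`, `logistic_gradient`) are hypothetical, and the cell-keyed scheme shown here is a common textbook formulation assumed for illustration, not necessarily the authors' method.

```python
import math
from collections import defaultdict


def map_products(A, B):
    """Map step (assumed scheme): emit ((i, k), a_ij * b_jk) for every pair."""
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            for k, b in enumerate(B[j]):
                yield (i, k), a * b


def reduce_sum(pairs):
    """Reduce step: sum all partial products that share an output cell key."""
    sums = defaultdict(float)
    for key, value in pairs:
        sums[key] += value
    return sums


def mr_matmul(A, B):
    """Compute C = A * B by chaining the map and reduce steps above."""
    cells = reduce_sum(map_products(A, B))
    return [[cells[(i, k)] for k in range(len(B[0]))] for i in range(len(A))]


def mr_transpose(A):
    """Transpose as a map-only job: re-key each entry (i, j) to (j, i)."""
    cells = {(j, i): v for i, row in enumerate(A) for j, v in enumerate(row)}
    return [[cells[(j, i)] for i in range(len(A))] for j in range(len(A[0]))]


def logistic_gradient(X, y, w):
    """Log-likelihood gradient X^T (y - sigmoid(X w)); y and w are column vectors."""
    z = mr_matmul(X, w)
    residual = [[yi[0] - 1.0 / (1.0 + math.exp(-zi[0]))] for yi, zi in zip(y, z)]
    return mr_matmul(mr_transpose(X), residual)


if __name__ == "__main__":
    print(mr_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    print(logistic_gradient([[1, 0], [0, 1]], [[1], [0]], [[0], [0]]))
```

In a real MapReduce deployment the emitted key-value pairs would be shuffled across nodes by key before reduction; here the dictionary in `reduce_sum` stands in for that shuffle.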
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, Z., Liu, M. (2011). Logistic Regression Parameter Estimation Based on Parallel Matrix Computation. In: Zhou, Q. (eds) Theoretical and Mathematical Foundations of Computer Science. ICTMF 2011. Communications in Computer and Information Science, vol 164. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24999-0_38
DOI: https://doi.org/10.1007/978-3-642-24999-0_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24998-3
Online ISBN: 978-3-642-24999-0
eBook Packages: Computer Science (R0)