Effort Estimation Based on Collaborative Filtering

  • Naoki Ohsugi
  • Masateru Tsunoda
  • Akito Monden
  • Ken-ichi Matsumoto
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3009)


Effort estimation methods are among the important tools project managers use to control the human resources of ongoing or future software projects. The estimations require historical project data, including process and product metrics that characterize past projects. In practice, a problem in using these methods is that the historical project data frequently contain substantial missing values. In this paper, we propose an effort estimation method based on Collaborative Filtering to solve this problem. Collaborative Filtering was developed by information retrieval researchers as an estimation technique for defective data, i.e. data having substantial missing values. The proposed method first evaluates the similarity between a target (ongoing) project and each past project, using a vector-based similarity computation. Then it predicts the effort of the target project as the weighted sum of the efforts of similar past projects. We conducted an experimental case study to evaluate the estimation performance of the proposed method. The proposed method showed better performance than the conventional regression method when the data had substantial missing values.
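The two steps described in the abstract (similarity between the target project and each past project, then a weighted sum of past efforts) can be illustrated with a minimal sketch. The abstract does not give the exact equations, so the details here are assumptions: cosine similarity computed only over metrics observed in both projects, a hypothetical neighborhood size `k`, and a similarity-weighted average as the weighted sum. The function names and the sample data are illustrative only and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity over the metrics observed (not None) in both projects.

    Assumption: missing values are encoded as None and simply skipped.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 0.0
    dot = sum(x * y for x, y in shared)
    norm_a = math.sqrt(sum(x * x for x, _ in shared))
    norm_b = math.sqrt(sum(y * y for _, y in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def estimate_effort(target, past_projects, k=2):
    """Estimate effort as the similarity-weighted average of the k most similar
    past projects.

    past_projects is a list of (metrics, effort) pairs. Returns None when no
    past project has a positive similarity to the target.
    """
    scored = [(cosine_similarity(target, metrics), effort)
              for metrics, effort in past_projects]
    # Keep only positively similar projects, most similar first.
    neighbors = sorted((p for p in scored if p[0] > 0),
                       key=lambda p: p[0], reverse=True)[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None
    return sum(sim * effort for sim, effort in neighbors) / total


# Hypothetical data: three past projects with missing metrics (None).
past = [
    ([10, None, 3], 100.0),
    ([20, 5, None], 200.0),
    ([1, 9, 9], 50.0),
]
target = [10, 2, 3]
estimate = estimate_effort(target, past, k=2)
```

Because the similarity is computed only on the metrics both projects share, no past project has to be discarded for having missing values, which is the property the abstract contrasts with conventional regression. In this sketch the metrics are used as raw values; in practice they would likely need to be normalized to comparable ranges first.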


Keywords: Neighborhood Size; Collaborative Filter; Mean Absolute Error; Effort Estimation; Listwise Deletion
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Naoki Ohsugi (1)
  • Masateru Tsunoda (1)
  • Akito Monden (1)
  • Ken-ichi Matsumoto (1)
  1. Graduate School of Information Science, Nara Institute of Science and Technology, Kansai Science City, Japan
