Auto-detection of Safety Issues in Baby Products

  • Conference paper
Recent Trends and Future Technology in Applied Intelligence (IEA/AIE 2018)

Abstract

Every year, thousands of people suffer injuries related to consumer products. Research indicates that online customer reviews can be processed automatically to identify product safety issues. Early identification of safety issues can lead to earlier recalls, and thus fewer injuries and deaths. A dataset of product reviews from Amazon.com was compiled, along with SaferProducts.gov complaints and recall descriptions from the Consumer Product Safety Commission (CPSC) and the European Commission Rapid Alert system. A system was built to clean the collected text and to extract relevant features. Dimensionality reduction was performed by computing feature relevance with a Random Forest and discarding features with low information gain. Various classifiers were analyzed, including Logistic Regression, SVMs, Naïve Bayes, Random Forests, and an Ensemble classifier. Experimentation with various feature and classifier combinations resulted in a logistic regression model with 66% precision in the top 50 reviews surfaced. This classifier outperforms all benchmarks set by related literature and consumer product safety professionals.
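The headline figure in the abstract, precision in the top 50 reviews surfaced, can be illustrated with a minimal sketch. The function name `precision_at_k` and the toy scores and labels below are assumptions made for illustration; they are not the authors' code or data. The sketch ranks reviews by classifier score and reports the fraction of the top k that are labeled as safety issues.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of the k highest-scored items whose label is 1.

    Hypothetical helper for illustration only; if k exceeds the number of
    items, all items are used.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank item indices by classifier score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(labels[i] for i in top) / len(top)


# Toy data (made up): 100 reviews with strictly decreasing scores,
# of which the 33 highest-scored are labeled as true safety issues.
scores = [1.0 - i / 100 for i in range(100)]
labels = [1 if i < 33 else 0 for i in range(100)]

result = precision_at_k(scores, labels, 50)
print(f"precision@50 = {result:.2f}")
```

With this toy data, 33 of the 50 top-ranked reviews are true positives, so the metric is 33/50 = 0.66, the same value the abstract reports for the logistic regression model.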

The original version of this chapter was revised: Table 4 and Fig. 2 were corrected. The erratum to this chapter is available at https://doi.org/10.1007/978-3-319-92058-0_87

Change history

  • 29 August 2018

    An erratum has been published.

Acknowledgments

The authors would like to thank Dr. Olga Vechtomova (University of Waterloo, Canada) for her guidance. The authors appreciate the insights on product safety agencies provided by Christine Simpson (former Health Canada Product Safety Officer), Dennis Blasius (CPSC, Director of Field Investigations Division), and Michelle Mach and Renee Morelli-Linen (both CPSC Internet Investigative Analysts). The authors would also like to thank Dr. Alan Abrahams for his advice and for providing his smoke word list. Finally, the authors would like to thank Dr. Julian McAuley for providing his corpus of Amazon reviews.

Author information

Corresponding author

Correspondence to Graham Bleaney.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Bleaney, G., Kuzyk, M., Man, J., Mayanloo, H., Tizhoosh, H.R. (2018). Auto-detection of Safety Issues in Baby Products. In: Mouhoub, M., Sadaoui, S., Ait Mohamed, O., Ali, M. (eds) Recent Trends and Future Technology in Applied Intelligence. IEA/AIE 2018. Lecture Notes in Computer Science, vol 10868. Springer, Cham. https://doi.org/10.1007/978-3-319-92058-0_49

  • DOI: https://doi.org/10.1007/978-3-319-92058-0_49

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92057-3

  • Online ISBN: 978-3-319-92058-0

  • eBook Packages: Computer Science, Computer Science (R0)
