Task-related Item-Name Discovery Using Text and Image Data from the Internet

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 751)

Abstract

A huge amount of data on the Internet can be used to develop machine learning for a robot or an AI agent. Utilizing this unorganized data, however, usually requires a pre-collected database, which is time-consuming and expensive to build. This paper proposes a framework for collecting the names of items required to perform a task, using text and image data available on the Internet without relying on any dictionary or pre-made database. We demonstrate a method that uses text data acquired from Google Search to estimate term frequency-inverse document frequency (TF-IDF) values for task-word-relation verification, then uses image classification to identify words that are likely to be item-names. We compare the results of measuring words’ item-name likelihood under various image classification settings. Finally, we demonstrate that our framework can discover more than 45% of the desired item-names on three example tasks.
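The TF-IDF step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the textbook formulation (relative term frequency times the log of inverse document frequency) and treats all text collected for one task as a single document. The abstract does not give the paper's exact weighting, so the function name `tf_idf`, the toy corpus, and the whitespace tokenization are hypothetical choices made only for illustration.

```python
import math
from collections import Counter


def tf_idf(docs_by_task):
    """Score each word per task with textbook TF-IDF.

    docs_by_task maps a task name to the list of tokens collected for it.
    This is an assumed formulation, not necessarily the paper's.
    """
    n_tasks = len(docs_by_task)

    # Document frequency: in how many tasks' text does each word appear?
    df = Counter()
    for tokens in docs_by_task.values():
        df.update(set(tokens))

    scores = {}
    for task, tokens in docs_by_task.items():
        counts = Counter(tokens)
        total = len(tokens)
        # Relative term frequency times log inverse document frequency.
        scores[task] = {
            word: (count / total) * math.log(n_tasks / df[word])
            for word, count in counts.items()
        }
    return scores


# Toy stand-in for text that would be gathered from search results.
corpus = {
    "make coffee": "pour water into the kettle then add coffee to the cup".split(),
    "wash dishes": "rinse the plate with water then dry the plate".split(),
    "plant a tree": "dig a hole then put the tree into the hole".split(),
}
scores = tf_idf(corpus)
ranked = sorted(scores["make coffee"].items(), key=lambda kv: kv[1], reverse=True)
```

In this toy corpus, a word that appears in every task's text (such as "the") scores zero, while a word unique to one task (such as "coffee") scores highest for that task. That is the property that makes TF-IDF a plausible signal for task-word relation.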

Author information

Corresponding author

Correspondence to Putti Thaipumi.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Thaipumi, P., Hasegawa, O. (2019). Task-related Item-Name Discovery Using Text and Image Data from the Internet. In: Kim, JH., et al. Robot Intelligence Technology and Applications 5. RiTA 2017. Advances in Intelligent Systems and Computing, vol 751. Springer, Cham. https://doi.org/10.1007/978-3-319-78452-6_6
