An Automatic Extraction of Academia-Industry Collaborative Research and Development Documents on the Web

Kurakawa, Kei; Sun, Yuan; Yamashita, Nagayoshi; Baba, Yasumasa

doi:10.1007/978-3-319-01264-3_18

Conference paper
First Online: 01 January 2013

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

This research focuses on a method for automatically extracting Japanese documents that describe University-Industry (U-I) relations from the Web. The proposed method consists of a preprocessing step for Japanese texts and a classification step with an SVM. The feature selection process is tuned specifically for U-I relations documents. A U-I document extraction experiment was conducted, and the features found to be relevant for this task are discussed.
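
The two-step pipeline described in the abstract (preprocess text into features, then classify with an SVM) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the documents have already been split into tokens (for Japanese, a morphological analyzer such as MeCab would do this), uses tf-idf weighting as an assumed feature representation that the abstract does not specify, and trains a linear SVM without a bias term by plain subgradient descent on the hinge loss. The toy English tokens and all function names (`fit_idf`, `vectorize`, `train_linear_svm`, `predict`) are hypothetical.

```python
import math
from collections import Counter

# Toy, pre-tokenized documents (hypothetical data). Real input would be
# Japanese web pages segmented by a morphological analyzer.
TRAIN_DOCS = [
    ["university", "company", "joint", "research", "agreement"],
    ["university", "industry", "collaboration", "development", "company"],
    ["weather", "forecast", "rain", "tomorrow"],
    ["recipe", "cooking", "dinner", "company"],
]
TRAIN_LABELS = [1, 1, -1, -1]  # +1 = U-I document, -1 = other


def fit_idf(docs):
    """Smoothed inverse document frequency for every training token."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}


def vectorize(doc, idf):
    """L2-normalized sparse tf-idf vector; unseen tokens are ignored."""
    tf = Counter(tok for tok in doc if tok in idf)
    vec = {tok: count * idf[tok] for tok, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}


def score(w, x):
    """Dot product between a sparse weight dict and a sparse vector."""
    return sum(w.get(tok, 0.0) * v for tok, v in x.items())


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=100):
    """Subgradient descent on the L2-regularized hinge loss (no bias term)."""
    w = {}
    for _ in range(epochs):
        for x, label in zip(X, y):
            violated = label * score(w, x) < 1.0  # inside the margin?
            for tok in w:  # shrink step from the L2 regularizer
                w[tok] *= 1.0 - eta * lam
            if violated:  # hinge-loss subgradient step
                for tok, v in x.items():
                    w[tok] = w.get(tok, 0.0) + eta * label * v
    return w


def predict(w, idf, doc):
    """Return +1 (U-I document) or -1 (other) for a token list."""
    return 1 if score(w, vectorize(doc, idf)) > 0 else -1


idf = fit_idf(TRAIN_DOCS)
weights = train_linear_svm([vectorize(d, idf) for d in TRAIN_DOCS], TRAIN_LABELS)
# Tokens sorted from most U-I-like to least, a crude view of feature relevance.
ranked = sorted(weights, key=weights.get, reverse=True)
```

With this toy data, `predict(weights, idf, ["university", "joint", "research"])` returns `1`, because every known token in that document appears only in positive training examples. Inspecting `ranked` gives a rough, illustrative analogue of examining which features matter for the task; a real study would use a proper feature selection criterion and held-out evaluation.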

References

Aizawa A (2003) An information-theoretic perspective of tf-idf measures. Inf Process Manag 39(1):45–65. doi:10.1016/S0306-4573(02)00021-3
Kudo T, Matsumoto Y (2002) Japanese dependency analysis using cascaded chunking. In: Roth D, van den Bosch A (eds) CoNLL-2002, Taipei, pp 63–69
Kudo T, Yamamoto K, Matsumoto Y (2004) Applying conditional random fields to Japanese morphological analysis. In: 2004 conference on empirical methods in natural language processing (EMNLP-2004), Barcelona, pp 230–237. http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html
Leydesdorff L, Meyer M (2003) The triple helix of university-industry-government relations. Scientometrics 58(2):191–203
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Yang Y, Pedersen JO (1997) A comparative study on feature selection in text categorization. In: Fisher DH (ed) Proceedings of ICML-97, 14th international conference on machine learning, Nashville. Morgan Kaufmann Publishers, San Francisco, pp 412–420

Author information

Authors and Affiliations

National Institute of Informatics, Tokyo, 101-8430, Japan
Kei Kurakawa & Yuan Sun
Japan Society for the Promotion of Science, Tokyo, 102-8472, Japan
Nagayoshi Yamashita
The Institute of Statistical Mathematics, Tokyo, 190-8562, Japan
Yasumasa Baba

Corresponding author

Correspondence to Kei Kurakawa.

Editor information

Editors and Affiliations

Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Wolfgang Gaul
Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Andreas Geyer-Schulz
The Institute of Statistical Mathematics, Tokyo, Japan
Yasumasa Baba
Graduate School of Management and Information Systems, Tama University, Tokyo, Japan
Akinori Okada

Cite this paper

Kurakawa, K., Sun, Y., Yamashita, N., Baba, Y. (2014). An Automatic Extraction of Academia-Industry Collaborative Research and Development Documents on the Web. In: Gaul, W., Geyer-Schulz, A., Baba, Y., Okada, A. (eds) German-Japanese Interchange of Data Analysis Results. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01264-3_18

DOI: https://doi.org/10.1007/978-3-319-01264-3_18
Published: 09 October 2013
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01263-6
Online ISBN: 978-3-319-01264-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
