Simple Document-by-Document Search Tool “Fuwatto Search” Using Web API
In this paper, we propose a new search method, Fuwatto Search, that allows users to retrieve documents in a document-by-document manner via a Web API: the user supplies a whole document as the query instead of a handful of keywords. We present an implementation of the proposed method, Fuwatto CiNii Search, which targets CiNii Articles, one of the largest academic article databases in Japan. In an experimental evaluation using newspaper articles as queries, Fuwatto CiNii Search achieved a precision at 10 of 0.25 and a mean average precision of 0.17.
Keywords: document retrieval, Web API, effectiveness, CiNii Articles
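For readers unfamiliar with the two effectiveness measures quoted in the abstract, the following is a minimal sketch of how precision at 10 and mean average precision are conventionally computed. It uses the standard textbook definitions; the function names and the toy document IDs are illustrative assumptions, and the paper's actual evaluation procedure (for example, how relevance was judged) is not described here.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int = 10) -> float:
    """Fraction of the top-k results that are relevant.

    Divides by k even when fewer than k results are returned,
    which is the usual convention for P@k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average of the precision values at each rank where a relevant document appears.

    Relevant documents that are never retrieved contribute zero.
    Assumes the ranked list contains no duplicate document IDs.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]) -> float:
    """Mean of average precision over all (ranked list, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Toy data (hypothetical document IDs, not taken from the paper).
ranked_example = ["d3", "d1", "d7", "d2"]
relevant_example = {"d1", "d2", "d9"}
```

Under these definitions, a precision at 10 of 0.25 means that, averaged over the query documents, 2.5 of the top 10 retrieved articles were judged relevant.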