On Domain Independence of Author Identification

Shirai, Masato; Miura, Takao

doi:10.1007/978-3-642-23878-9_2

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6936)

Abstract

Latent Dirichlet Allocation (LDA) is a probabilistic framework in which each topic is a probability distribution over words and each document is a distribution over topics. By merging all the documents written by each author into a single collection, it is possible to identify authors. Here we show that, within the LDA framework, author identification is fully reliable independent of document domain, by learning from incomplete and massive documents.
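
The pooling step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: whitespace tokenization, the toy corpus, and the names `pool_by_author` and `bag_of_words` are all assumptions made for illustration. The sketch only builds one pseudo-document per author; fitting LDA on those pseudo-documents is left out.

```python
from collections import Counter, defaultdict


def pool_by_author(docs):
    """Merge every (author, text) pair into one token list per author."""
    pooled = defaultdict(list)
    for author, text in docs:
        # Assumption: lowercase and split on whitespace; the paper's
        # actual preprocessing is not described in the abstract.
        pooled[author].extend(text.lower().split())
    return dict(pooled)


def bag_of_words(pooled):
    """Count word occurrences in each author's pooled collection."""
    return {author: Counter(tokens) for author, tokens in pooled.items()}


# Hypothetical toy corpus: author_a writes in two different domains.
docs = [
    ("author_a", "the market fell sharply today"),
    ("author_b", "topic models describe word usage"),
    ("author_a", "the team won the final match"),
]

pooled = pool_by_author(docs)
counts = bag_of_words(pooled)
```

Each entry of `counts` is a per-author word-count vector, which is the kind of input a standard LDA implementation would typically accept as one document.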

Author information

Authors and Affiliations

Dept. of Elect. & Elect. Engr., HOSEI University, 3-7-2 KajinoCho, Koganei, Tokyo, 184-8584, Japan
Masato Shirai & Takao Miura

Editor information

Editors and Affiliations

School of Electrical and Electronic Engineering, University of Manchester, Sackville Street Building, M60 1QD, Manchester, UK
Hujun Yin
School of Computing Sciences, University of East Anglia, NR4 7TJ, Norwich, UK
Wenjia Wang
University of East Anglia, NR4 7TJ, Norwich, UK
Victor Rayward-Smith

About this paper

Cite this paper

Shirai, M., Miura, T. (2011). On Domain Independence of Author Identification. In: Yin, H., Wang, W., Rayward-Smith, V. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2011. IDEAL 2011. Lecture Notes in Computer Science, vol 6936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23878-9_2

DOI: https://doi.org/10.1007/978-3-642-23878-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23877-2
Online ISBN: 978-3-642-23878-9
eBook Packages: Computer Science, Computer Science (R0)
