Parametric Representation of Paragraphs and Their Classification
Automatic paragraph classification is an important task in information retrieval and digital publication. This work presents a novel approach that represents each paragraph of a document by a set of parameters extracted from it, and proposes a methodology based on a multi layer perceptron for designing an automatic paragraph classifier. The proposed framework has been tested on large industrial data and was found to perform better than a conventional rule based approach.
Keywords: Text Classification, Information Retrieval, Machine Intelligence, Multi Layer Perceptron
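As a rough illustration of the idea summarized in the abstract, the following minimal Python sketch maps a paragraph to a fixed-length parameter vector and trains a small multi layer perceptron on it. Everything concrete here is an assumption made for illustration and is not taken from the paper: the five features in `paragraph_features`, the two toy classes (heading vs. body text), the `TinyMLP` class, its layer sizes, and the learning rate are all hypothetical.

```python
import math
import random

def paragraph_features(text):
    """Map a paragraph to 5 illustrative parameters (assumed, not the paper's)."""
    words = text.split()
    if not words:
        return [0.0, 0.0, 0.0, 0.0, 0.0]
    n_words = len(words)
    mean_word_len = sum(len(w) for w in words) / n_words
    digit_frac = sum(c.isdigit() for c in text) / len(text)
    cap_frac = sum(w[0].isupper() for w in words) / n_words
    ends_with_stop = 1.0 if text.rstrip().endswith((".", "?", "!")) else 0.0
    # Scale the unbounded features so all inputs fall roughly in [0, 1].
    return [min(n_words / 100.0, 1.0), mean_word_len / 10.0,
            digit_frac, cap_frac, ends_with_stop]

class TinyMLP:
    """One tanh hidden layer and a softmax output, trained by plain SGD."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = [sum(w * hj for w, hj in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        m = max(z)  # subtract the max for a numerically stable softmax
        e = [math.exp(v - m) for v in z]
        s = sum(e)
        return h, [v / s for v in e]

    def train_step(self, x, label, lr=0.1):
        """One SGD step on the cross-entropy loss; returns the loss before the update."""
        h, p = self.forward(x)
        dz = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
        # Backpropagate to the hidden layer before the output weights change.
        dh = [sum(dz[k] * self.w2[k][j] for k in range(len(dz))) * (1.0 - h[j] ** 2)
              for j in range(len(h))]
        for k, g in enumerate(dz):
            self.w2[k] = [w - lr * g * hj for w, hj in zip(self.w2[k], h)]
            self.b2[k] -= lr * g
        for j, g in enumerate(dh):
            self.w1[j] = [w - lr * g * xi for w, xi in zip(self.w1[j], x)]
            self.b1[j] -= lr * g
        return -math.log(p[label])

    def predict(self, x):
        _, p = self.forward(x)
        return max(range(len(p)), key=p.__getitem__)

# Toy data (hypothetical labels): 0 = heading, 1 = body text.
toy = [
    ("1 Introduction", 0),
    ("3.2 Experimental Setup", 0),
    ("The proposed framework was tested on industrial data.", 1),
    ("We extract a set of parameters from each paragraph and feed them to the network.", 1),
]

model = TinyMLP(n_in=5, n_hidden=4, n_out=2)
epoch_losses = []
for _ in range(200):
    total = 0.0
    for text, label in toy:
        total += model.train_step(paragraph_features(text), label)
    epoch_losses.append(total / len(toy))

print("mean loss, first epoch:", round(epoch_losses[0], 4))
print("mean loss, last epoch:", round(epoch_losses[-1], 4))
```

The feature extractor and the classifier are kept separate so that either can be swapped out independently. In practice a library implementation of the perceptron and a much richer parameter set would likely replace both parts.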