Current research on stylistic features (SF) focuses mainly on two aspects: lexical elements and syntactic structures. Lexical elements supply the content of a sentence, while syntactic structures provide its framework. Combining the two and exploiting the advantages of both remains a challenging problem. In this paper, we propose a Principal Stylistic Features Analysis (PSFA) method that combines both kinds of features and then mines the relations between them. Statistical analysis of these relations reveals many interesting linguistic phenomena. Using PSFA, we extract a set of representative features that cover different aspects of style. To verify the performance of the selected features, we conduct classification experiments. The results show that the features selected by PSFA yield significantly higher classification accuracy than other advanced methods.


Keywords: Style; Lexical and syntactic features; Feature dimension reduction



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Tsinghua University, Beijing, China
