Scale Development Using Twitter Data: Applying Contemporary Natural Language Processing Methods in IS Research

  • Chapter
Analytics and Data Science

Part of the book series: Annals of Information Systems (AOIS)

Abstract

The availability of big data sources and developments in computational linguistics present an opportunity for IS researchers to pursue new areas of inquiry and to tackle existing challenges with new methods. In this paper, a novel way of developing measurement scales using big data (i.e., tweets) and associated methods (i.e., natural language processing) is proposed and tested. The development of a new scale, the technology hassles and delights scale (THDS), is used to demonstrate how a syntax-aware filtering process can identify relevant information in a large corpus of tweets and thereby improve the content validity of a scale. A comparison of themes generated from an analysis of 146 million tweets with themes generated from semi-structured interviews shows a reasonable overlap. Further, the potential for identifying even more relevant themes within subsets of the tweet dataset is uncovered.
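
The abstract does not describe the filtering procedure in detail, so the following is only an illustrative sketch of what syntax-aware filtering can look like, not the authors' pipeline. It assumes a hypothetical toy lexicon (TOY_LEXICON) in place of a real part-of-speech tagger, a hypothetical list of technology nouns (TECH_NOUNS), and two made-up tag patterns. The point is that a tweet is kept because of the grammatical role a technology word plays, not merely because the word appears.

```python
import re

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
# Tags: D = determiner, N = noun, V = verb, A = adjective, O = pronoun.
TOY_LEXICON = {
    "my": "D", "the": "D", "this": "D",
    "phone": "N", "laptop": "N", "wifi": "N", "app": "N",
    "is": "V", "keeps": "V", "crashed": "V", "crashing": "V", "love": "V",
    "slow": "A", "great": "A",
    "i": "O",
}

# Hypothetical set of technology nouns of interest.
TECH_NOUNS = {"phone", "laptop", "wifi", "app"}


def tag(tweet):
    """Lowercase, tokenize on letters/apostrophes, and look up a tag per token.

    Tokens missing from the toy lexicon receive the tag "?".
    """
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return [(tok, TOY_LEXICON.get(tok, "?")) for tok in tokens]


def is_relevant(tweet):
    """Return True when a technology noun appears in one of two tag patterns."""
    tagged = tag(tweet)
    for i, (tok, pos) in enumerate(tagged):
        if tok not in TECH_NOUNS or pos != "N":
            continue
        # Pattern 1: technology noun directly followed by a verb ("phone keeps ...").
        if i + 1 < len(tagged) and tagged[i + 1][1] == "V":
            return True
        # Pattern 2: verb + determiner + technology noun ("love my laptop").
        if i >= 2 and tagged[i - 1][1] == "D" and tagged[i - 2][1] == "V":
            return True
    return False


def filter_tweets(tweets):
    """Keep only the tweets that match a syntactic pattern."""
    return [t for t in tweets if is_relevant(t)]


if __name__ == "__main__":
    sample = [
        "My phone keeps crashing",
        "I love my laptop",
        "Old phone booth photo from 1985",
    ]
    for kept in filter_tweets(sample):
        print(kept)
```

A plain keyword filter would keep all three sample tweets because each contains a technology word. The pattern-based filter drops the third, where "phone" is neither the subject of a verb nor the object of one. In practice a tagger trained on conversational text would replace the toy lexicon.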

Acknowledgements

The authors would like to thank the anonymous reviewers from the SIG DSA 2015 Business Analytics Congress for comments that helped streamline and enhance this version of the paper. In addition, the authors would like to thank Brendan O’Connor of the Computer Science Department at UMass Amherst, co-creator of TweetNLP, for his guidance and support during the development of an earlier manuscript of this paper.

Author information

Corresponding author

Correspondence to Traci J. Hess.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Agogo, D., Hess, T.J. (2018). Scale Development Using Twitter Data: Applying Contemporary Natural Language Processing Methods in IS Research. In: Deokar, A., Gupta, A., Iyer, L., Jones, M. (eds) Analytics and Data Science. Annals of Information Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-58097-5_12
