Abstract
The availability of big data sources and developments in computational linguistics present an opportunity for IS researchers to pursue new areas of inquiry and to tackle existing challenges with new methods. This paper proposes and tests a novel way of developing measurement scales using big data (i.e., tweets) and associated methods (i.e., natural language processing). The development of a new scale, the technology hassles and delights scale (THDS), is used to demonstrate how a syntax-aware filtering process can identify relevant information from a large corpus of tweets to improve the content validity of a scale. A comparison of themes generated from analyzing 146 million tweets with themes generated from semi-structured interviews shows a reasonable overlap. Further, the potential for identifying even more relevant themes from within subsets of the tweet dataset is uncovered.
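To make the idea of syntax-aware filtering concrete, the following is a minimal sketch in Python and not the chapter's actual pipeline. It assumes a toy, hand-written lexicon in place of a trained part-of-speech tagger for tweets, and two hypothetical tag patterns meant to capture first-person statements about a technology. The names TOY_LEXICON, RELEVANT_PATTERNS, tag_tweet, is_relevant, and filter_tweets are all illustrative assumptions.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only).
# A real pipeline would rely on a trained part-of-speech tagger for tweets.
# Technology nouns get their own "TECH" tag so patterns can target them.
TOY_LEXICON = {
    "i": "PRON", "my": "PRON", "we": "PRON",
    "hate": "VERB", "love": "VERB", "crashed": "VERB", "froze": "VERB", "is": "VERB",
    "laptop": "TECH", "phone": "TECH", "wifi": "TECH", "printer": "TECH",
    "the": "DET", "new": "ADJ", "again": "ADV",
}

# Hypothetical tag patterns for first-person statements about a technology:
#   1. pronoun, verb, optional modifiers, technology noun ("I hate my laptop")
#   2. pronoun, optional adjectives, technology noun, verb ("my wifi crashed")
RELEVANT_PATTERNS = [
    re.compile(r"\bPRON VERB(?: (?:DET|ADJ|PRON))* TECH\b"),
    re.compile(r"\bPRON(?: ADJ)* TECH VERB\b"),
]


def tag_tweet(tweet):
    """Lowercase and tokenize a tweet, then look up a coarse tag per token."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    # Words missing from the toy lexicon receive the placeholder tag "X".
    return [(token, TOY_LEXICON.get(token, "X")) for token in tokens]


def is_relevant(tweet):
    """Return True if the tweet's tag sequence matches any relevant pattern."""
    tag_sequence = " ".join(tag for _, tag in tag_tweet(tweet))
    return any(pattern.search(tag_sequence) for pattern in RELEVANT_PATTERNS)


def filter_tweets(tweets):
    """Keep only tweets whose syntax suggests a personal technology experience."""
    return [tweet for tweet in tweets if is_relevant(tweet)]


if __name__ == "__main__":
    sample = [
        "I hate my laptop!",
        "My wifi crashed again",
        "The new phone is out",
        "I love sunsets",
    ]
    for kept in filter_tweets(sample):
        print(kept)
```

In this sketch the first two sample tweets are kept because they pair a first-person pronoun with a verb and a technology noun in a recognizable order, while the last two are dropped. Matching on tag sequences rather than on keywords alone is what makes the filter syntax-aware: "The new phone is out" mentions a technology but is not a statement of personal experience.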
Acknowledgements
The authors would like to thank the anonymous reviewers from the SIG DSA 2015 Business Analytics Congress for comments that helped streamline and enhance this version of the paper. In addition, the authors would like to thank Brendan O’Connor, Computer Science Department, UMass Amherst, and co-creator of TweetNLP, for his guidance during the development of an earlier manuscript of this paper.
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Agogo, D., Hess, T.J. (2018). Scale Development Using Twitter Data: Applying Contemporary Natural Language Processing Methods in IS Research. In: Deokar, A., Gupta, A., Iyer, L., Jones, M. (eds) Analytics and Data Science. Annals of Information Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-58097-5_12
DOI: https://doi.org/10.1007/978-3-319-58097-5_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58096-8
Online ISBN: 978-3-319-58097-5
eBook Packages: Business and Management (R0)