Scale Development Using Twitter Data: Applying Contemporary Natural Language Processing Methods in IS Research

Agogo, David; Hess, Traci J.

doi:10.1007/978-3-319-58097-5_12

Part of the book series: Annals of Information Systems (AOIS)

Abstract

The availability of big data sources and developments in computational linguistics present an opportunity for IS researchers to pursue new areas of inquiry and to tackle existing challenges with new methods. This paper proposes and tests a novel way of developing measurement scales using big data (i.e., tweets) and associated methods (i.e., natural language processing). The development of a new scale, the technology hassles and delights scale (THDS), is used to demonstrate how a syntax-aware filtering process can identify relevant information in a large corpus of tweets and thereby improve the content validity of a scale. A comparison of themes generated from an analysis of 146 million tweets with themes generated from semi-structured interviews shows a reasonable overlap. Further, the analysis uncovers the potential for identifying even more relevant themes within subsets of the tweet dataset.
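
As a rough illustration of what a syntax-aware filter might look like, the following Python sketch keeps a tweet only when a technology noun appears close to a verb, so that bare keyword matches are dropped. It is a minimal sketch under stated assumptions and not the authors' procedure: the toy lexicon `TOY_TAGS`, the noun list `TECH_NOUNS`, the three-token window, and the function names are all hypothetical, and a real study would use a trained part-of-speech tagger for tweets rather than a lookup table.

```python
import re

# Hypothetical mini-lexicon standing in for a real part-of-speech tagger.
TOY_TAGS = {
    "i": "PRON", "my": "PRON",
    "hate": "VERB", "love": "VERB", "crashed": "VERB", "froze": "VERB",
    "laptop": "NOUN", "phone": "NOUN", "wifi": "NOUN", "printer": "NOUN",
    "again": "ADV", "the": "DET", "new": "ADJ", "slow": "ADJ",
}

# Hypothetical list of technology nouns of interest.
TECH_NOUNS = {"laptop", "phone", "wifi", "printer"}


def tag(tweet):
    """Lowercase and tokenize a tweet, then look up a coarse tag per token."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return [(tok, TOY_TAGS.get(tok, "X")) for tok in tokens]


def is_relevant(tweet, window=3):
    """Return True when a technology noun has a verb within `window` tokens."""
    tagged = tag(tweet)
    for i, (tok, pos) in enumerate(tagged):
        if tok in TECH_NOUNS and pos == "NOUN":
            nearby = tagged[max(0, i - window): i + window + 1]
            if any(p == "VERB" for _, p in nearby):
                return True
    return False


def filter_tweets(tweets):
    """Keep only tweets that pass the syntax-aware check."""
    return [t for t in tweets if is_relevant(t)]


if __name__ == "__main__":
    sample = [
        "My laptop crashed again",
        "I love the new phone",
        "laptop stickers for sale",
        "I hate mondays",
    ]
    for kept in filter_tweets(sample):
        print(kept)
```

A plain keyword filter would also keep "laptop stickers for sale"; the tag-based check drops it because no verb accompanies the technology noun.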


Acknowledgements

The authors would like to thank the anonymous reviewers from the SIG DSA 2015 Business Analytics Congress for comments that helped streamline and enhance this version of the paper. In addition, the authors would like to thank Brendan O’Connor of the Computer Science Department, UMass Amherst, co-creator of TweetNLP, for his guidance and support during the development of an earlier version of this manuscript.

Author information

Authors and Affiliations

Operations and Information Management Department, Isenberg School of Management, University of Massachusetts Amherst, 121 President’s Drive, Amherst, MA, 01003, USA
David Agogo & Traci J. Hess

Corresponding author

Correspondence to Traci J. Hess.

Editor information

Editors and Affiliations

Robert J. Manning School of Business, University of Massachusetts Lowell, Lowell, Massachusetts, USA
Amit V. Deokar
Raymond J. Harbert College of Business, Auburn University, Auburn, Alabama, USA
Ashish Gupta
Walker College of Business, Appalachian State University, Boone, North Carolina, USA
Lakshmi S. Iyer
College of Business, University of North Texas, Denton, Texas, USA
Mary C. Jones

About this chapter

Cite this chapter

Agogo, D., Hess, T.J. (2018). Scale Development Using Twitter Data: Applying Contemporary Natural Language Processing Methods in IS Research. In: Deokar, A., Gupta, A., Iyer, L., Jones, M. (eds) Analytics and Data Science. Annals of Information Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-58097-5_12

DOI: https://doi.org/10.1007/978-3-319-58097-5_12
Published: 07 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58096-8
Online ISBN: 978-3-319-58097-5
eBook Packages: Business and Management, Business and Management (R0)
