Abstract
The increasing popularity of social media has encouraged health consumers to share, explore, and validate health and wellness information on social networks, which now provide a rich repository of Patient-Generated Wellness Data (PGWD). While data-driven healthcare has attracted considerable attention from academia and industry for its potential to improve care delivery through personalization, limited research has been done on harvesting and utilizing the PGWD available on social networks. This chapter focuses on the wellness profiling of users: we demonstrate algorithms that harvest social media to extract wellness information about individuals and to construct latent user profiles. In particular, we study the wellness profiles of users in the context of diabetes, with extensions to obesity and depression.
Notes
- 4. This is a real tweet from the dataset.
- 5. Note that the numbers in Table 8.2 do not add up to 11,217 because our dataset is multi-label: some messages discuss more than one PWE.
- 6. In this text, we use the terms wellness feature (e.g., blood glucose, hypertension) and wellness event (e.g., onset of asthma attack, hyperglycemia) interchangeably.
- 7. We followed a bootstrapping approach similar to [68] to ensure the coverage and diversity of the patterns used; all extracted patterns were manually verified to ensure accuracy.
- 8. In our dataset, there are three uncommon diabetes types: gestational diabetes, LADA (type 1.5 diabetes), and diabetes insipidus.
- 9. Due to user privacy concerns, some words or sentences may differ from the original version.
- 10. We did not consider the first week of May and the last week of October because the data for those weeks was only partially crawled.
References
UN General Assembly. Political declaration of the high-level meeting of the general assembly on the prevention and control of noncommunicable diseases. New York: UN; 2011.
Abbar S, Mejova Y, Weber I. You tweet what you eat: studying food consumption through twitter. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems: ACM; 2015. p. 3197–206.
Akbari M, Chua T-S. Leveraging behavioral factorization and prior knowledge for community discovery and profiling. In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining: ACM; 2017. p. 71–9.
Akbari M, Hu X, Nie L, Chua T-S. From tweets to wellness: wellness event detection from twitter streams. In: AAAI; 2016.
Akbari M, Hu X, Nie L, Chua T-S. Towards organizing health knowledge on community-based health services. EURASIP J Bioinforma Syst Biol. 2016;2016(1):18.
Akbari M, Hu X, Wang F, Chua T-S. Wellness representation of users in social media: towards joint modelling of heterogeneity and temporality. IEEE Trans Knowl Data Eng. 2017.
Akbari M, Nie L, Chua T-S. amm: towards adaptive ranking of multi-modal documents. Int J Multimed Inf Retr. 2015;4(4):233–45.
Akbari M, Relia K, Elghafari A, Chunara R. From the user to the medium: neural profiling across web communities. In: ICWSM; 2018.
Aronson AR. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proc AMIA Symp. 2001:17–21.
American Diabetes Association. Standards of medical care in diabetes. Diabetes Care. 2014;37(Suppl 1):S14–80.
Attai DJ, Cowher MS, Al-Hamadani M, Schoger JM, Staley AC, Landercasper J. Twitter social media is an effective tool for breast cancer patient education and support: patient-reported outcomes by survey. J Med Internet Res. 2015;17(7):e188.
Blei DM, Ng AY, Jordan MI. Latent dirichlet allocation. J Mach Learn Res. 2003;3:993–1022.
Carlson A, Gaffney S, Vasile F. Learning a named entity tagger from gazetteers with the partial perceptron. In: AAAI Spring Symposium: Learning by Reading and Learning to Read; 2009. p. 7–13.
Che Z, Purushotham S, Khemani R, Liu Y. Distilling knowledge from deep networks with applications to healthcare domain. arXiv preprint arXiv:1512.03542. 2015.
Chen X, Pan W, Kwok JT, Carbonell JG. Accelerated gradient method for multi-task sparse learning problem. In: ICDM ‘09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining; IEEE Computer Society, Washington, DC; 2009.
Chen Y, Zhao J, Hu X, Zhang X, Li Z, Chua T-S. From interest to function: Location estimation in social media. In: AAAI; 2013.
Cohen S, Wills TA. Stress, social support, and the buffering hypothesis. Psychol Bull. 1985;98(2):310.
De Choudhury M. You’re happy, I’m happy: diffusion of mood expression on twitter. In: Proceedings of HCI Korea: Hanbit Media; 2014. p. 169–79.
De Choudhury M, Gamon M, Counts S, Horvitz E. Predicting depression via social media. In: ICWSM; 2013.
De Choudhury M, Morris MR, White RW. Seeking and sharing health information online: comparing search engines and social media. In: SIGCHI; 2014.
Farseev A, Chua T-S. Tweet can be fit: Integrating data from wearable sensors and multiple social networks for wellness profile learning. ACM Trans Inf Syst. 2017;35(4):42.
Field AE, Coakley EH, Must A, Spadano JL, Laird N, Dietz WH, Rimm E, Colditz GA. Impact of overweight on the risk of developing common chronic diseases during a 10-year period. Arch Intern Med. 2001;161(13):1581–6.
Gupta N, Singh S. Collectively embedding multi-relational data for predicting user preferences. arXiv preprint arXiv:1504.06165. 2015.
Gupta S, Manning CD. SPIED: Stanford pattern-based information extraction and diagnostics. In: Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces. Baltimore, MD: ACL; 2014. p. 38–44.
He X, Cai D, Niyogi P. Laplacian score for feature selection. In: NIPS; 2005.
Hu FB. Globalization of diabetes: the role of diet, lifestyle, and genes. Diabetes Care. 2011;34(6):1249–57.
Hu X, Liu H. Text analytics in social media. In: Mining text data. Springer; 2012.
Hu X, Tang J, Liu H. Online social spammer detection. In: AAAI; 2014.
Huang T, Elghafari A, Relia K, Chunara R. High-resolution temporal representations of alcohol and tobacco behaviors from social media data. In: Proceedings of the ACM on human-computer interaction, 1(CSCW); 2017.
Jalali A, Sanghavi S, Ruan C, Ravikumar PK. A dirty model for multi-task learning. In: NIPS; 2010.
Jalali L, Jain R. Bringing deep causality to multimedia data streams. In: Proceedings of the 23rd ACM international conference on Multimedia: ACM; 2015. p. 221–30.
Jin X, Zhuang F, Pan SJ, Du C, Luo P, He Q. Heterogeneous multi-task semantic feature learning for classification. In: CIKM; 2015.
Kim S, Xing EP. Statistical estimation of correlated genome associations to a quantitative trait network. PLoS Genet. 2009;5(8):e1000587.
Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems. Computer. 2009;(8):30–7.
Kuh D, Shlomo YB. A life course approach to chronic disease epidemiology. Number 2. Oxford: Oxford University Press; 2004.
Kumar A, Daume III H. Learning task grouping and overlap in multi-task learning. In: ICML; 2012.
Lee DD, Seung HS. Algorithms for non-negative matrix factorization. In: NIPS; 2001.
Lee K, Agrawal A, Choudhary A. Real-time disease surveillance using twitter data: demonstration on flu and cancer. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining: ACM; 2013. p. 1474–7.
Li J, Ritter A, Cardie C, Hovy E. Major life event extraction from twitter based on congratulations/condolences speech acts. In: EMNLP; 2014.
Li Z, Yang Y, Liu J, Zhou X, Lu H. Unsupervised feature selection using nonnegative spectral analysis. In: AAAI; 2012.
Lim SS, Vos T, Flaxman AD, et al. A comparative risk assessment of burden of disease and injury attributable to 67 risk factors and risk factor clusters in 21 regions, 1990–2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet. 2013;380(9859):2224–60.
Lin H, Jia J, Guo Q, Xue Y, Li Q, Huang J, Cai L, Feng L. User-level psychological stress detection from social media using deep neural network. In: ACM MM; 2014.
Lin H, Jia J, Qiu J, Zhang Y, Shen G, Xie L, Tang J, Feng L, Chua T-S. Detecting stress based on social interactions in social networks. IEEE Trans Knowl Data Eng. 2017;29(9):1820–33.
Liu C, Wang F, Hu J, Xiong H. Temporal phenotyping from longitudinal electronic health records: a graph based framework. In: KDD; 2015.
Liu J, Weitzman ER, Chunara R. Assessing behavioral stages from social media data. In: CSCW: proceedings of the Conference on Computer-Supported Cooperative Work. Conference on Computer-Supported Cooperative Work; 2017.
Mintz M, Bills S, Snow R, Jurafsky D. Distant supervision for relation extraction without labeled data. In: ACL; 2009.
Nesterov Y. Introductory lectures on convex optimization: a basic course. New York: Springer; 2004.
Nie F, Huang H, Cai X, Ding CH. Efficient and robust feature selection via joint l2/1-norms minimization. In: NIPS; 2010.
Nie L, Akbari M, Li T, Chua T-S. A joint local-global approach for medical terminology assignment. In: SIGIR; 2014.
Nori N, Kashima H, Yamashita K, Ikai H, Imanaka Y. Simultaneous modeling of multiple diseases for mortality prediction in acute hospital care. In: KDD; 2015.
Obozinski G, Taskar B, Jordan MI. Joint covariate selection and joint subspace selection for multiple classification problems. Stat Comput. 2010;20(2):231–52.
World Health Organization. Global database on body mass index. Global Database on Body Mass Index; 2011.
Pan Y, Yao T, Mei T, Li H, Ngo C-W, Rui Y. Click-through-based cross-view learning for image search. In: SIGIR; 2014.
Park K, Weber I, Cha M, Lee C. Persistent sharing of fitness app status on twitter. In: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing: ACM; 2016. p. 184–94.
Pastors JG, Warshaw H, Daly A, Franz M, Kulkarni K. The evidence for the effectiveness of medical nutrition therapy in diabetes management. Diabetes Care. 2002;25(3):608–13.
Pennebaker JW, Francis ME, Booth RJ. Linguistic inquiry and word count: LIWC 2001. Mahwah, NJ: Lawrence Erlbaum Associates; 2001.
Powers DM. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. J Mach Learn Technol. 2011;2(1):37–63.
Relia K, Akbari M, Duncan D, Chunara R. Socio-spatial self-organizing maps: using social media to assess relevant geographies for exposure to social processes. Proceedings of the ACM on Human-Computer Interaction—CSCW; 2018.
Ritter A, Etzioni O, Clark S, et al. Open domain event extraction from twitter. In: KDD; 2012.
Robinson PN. Deep phenotyping for precision medicine. Hum Mutat. 2012;33(5):777–80.
Ruvolo P, Eaton E. Online multi-task learning via sparse dictionary optimization. In: AAAI; 2014.
Shelley KJ. Developing the American time use survey activity classification system. Monthly Lab Rev. 2005;128:3.
Shen G, Jia J, Nie L, Feng F, Zhang C, Hu T, Chua T-S, Zhu W. Depression detection via harvesting social media: A multimodal dictionary learning solution. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17); 2017. p. 3838–44.
Sohn S, Clark C, Halgrim SR, Murphy SP, Chute CG, Liu H. MedXN: an open source medication extraction and normalization tool for clinical text. J Am Med Inform Assoc. 2014;21(5):858–65.
Song X, Nie L, Zhang L, Akbari M, Chua T-S. Multiple social network learning and its application in volunteerism tendency prediction. In: SIGIR; 2015.
Sun Z, Wang F, Hu J. Linkage: an approach for comprehensive risk prediction for care management. In: KDD; 2015.
Teodoro R, Naaman M. Fitter with twitter: Understanding personal health and fitness activity in social media. In: ICWSM; 2013.
Thelen M, Riloff E. A bootstrapping method for learning semantic lexicons using extraction pattern contexts. In: EMNLP; 2002.
Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc. 1996;58(1):267–88.
Wang C, Raina R, Fong D, Zhou D, Han J, Badros G. Learning relevance from heterogeneous social network and its application in online targeting. In: SIGIR; 2011.
Wang F, Lee N, Hu J, Sun J, Ebadollahi S, Laine AF. A framework for mining signatures from event sequences and its applications in healthcare data. IEEE Trans Pattern Anal Mach Intell. 2013;35(2):272–85.
Wing C, Yang H. FitYou: integrating health profiles to real-time contextual suggestion. In: SIGIR; 2014.
Xu L, Huang A, Chen J, Chen E. Exploiting task-feature co-clusters in multi-task learning. In: AAAI; 2015.
Xu T, Sun J, Bi J. Longitudinal lasso: Jointly learning features and temporal contingency for outcome prediction. In: CIKM; 2015.
Zhao Z, Liu H. Spectral feature selection for supervised and unsupervised learning. In: ICML; 2007.
Zhou D, Chen L, He Y. An unsupervised framework of exploring events on twitter: Filtering, extraction and categorization. In: AAAI; 2015.
Zhou J, Wang F, Hu J, Ye J. From micro to macro: data driven phenotyping by densification of longitudinal electronic medical records. In: KDD; 2014.
Zhou L, Melton GB, Parsons S, Hripcsak G. A temporal constraint structure for extracting temporal information from clinical narrative. J Biomed Inform. 2006;39(4):424–39.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Akbari, M., Hu, X., Chua, TS. (2019). Learning Wellness Profiles of Users on Social Networks: The Case of Diabetes. In: Bian, J., Guo, Y., He, Z., Hu, X. (eds) Social Web and Health Research. Springer, Cham. https://doi.org/10.1007/978-3-030-14714-3_8
DOI: https://doi.org/10.1007/978-3-030-14714-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14713-6
Online ISBN: 978-3-030-14714-3
eBook Packages: Biomedical and Life Sciences (R0)