Journal of General Internal Medicine

, Volume 33, Issue 4, pp 533–538 | Cite as

Comparing Amazon’s Mechanical Turk Platform to Conventional Data Collection Methods in the Health and Medical Research Literature

  • Karoline Mortensen
  • Taylor L. Hughes
Review Paper



The goal of this article is to conduct an assessment of the peer-reviewed primary literature with study objectives to analyze’s Mechanical Turk (MTurk) as a research tool in a health services research and medical context.


Searches of Google Scholar and PubMed databases were conducted in February 2017. We screened article titles and abstracts to identify relevant articles that compare data from MTurk samples in a health and medical context to another sample, expert opinion, or other gold standard. Full-text manuscript reviews were conducted for the 35 articles that met the study criteria.


The vast majority of the studies supported the use of MTurk for a variety of academic purposes.


The literature overwhelmingly concludes that MTurk is an efficient, reliable, cost-effective tool for generating sample responses that are largely comparable to those collected via more conventional means. Caveats include survey responses may not be generalizable to the US population.


Amazon Mechanical Turk MTurk Alternate data sources Health and medical research 


Compliance with Ethical Standards

Conflict of Interest

The authors declare no conflicts of interest.


  1. 1.
    Redmiles EM, Kross S, Pradhan A, Mazurek ML. How well do my results generalize? Comparing security and privacy survey results from MTurk and web panels to the US; 2017. Technical Report of the Computer Science Department at the University of Maryland.
  2. 2.
    Paolacci G, Chandler J, Ipeirotis P. Running experiments on Amazon Mechanical Turk. Judgment and decision making. 2010;5(5):411–419. Scholar
  3. 3.
    Chandler J, Shapiro DN. Conducting clinical research using crowdsourced convenience samples. Annual review of clinical psychology. 2016;12:53–81. Scholar
  4. 4.
    Pittman M, Sheehan K. Amazon’s Mechanical Turk a digital sweatshop? Transparency and accountability in crowdsourced online research. Journal of media ethics. 2016;31(4):260–262. Scholar
  5. 5.
    Hitlin P. Research in the crowdsourcing Age, a case study.; 2016.
  6. 6.
    Stewart N, Harris AJL, Bartels DM, Newell BR, Paolacci G, Chandler J. The average laboratory samples a population of 7,300 Amazon Mechanical Turk workers. Judgment and decision making. 2015;10(5):479–491. Scholar
  7. 7.
    Behrend TS, Sharek DJ, Meade AW, Wiebe EN. The viability of crowdsourcing for survey research. Behavioral research methods. 2011;43(3):800–813. Scholar
  8. 8.
    Berinsky AJ, Huber GA, Lenz GS. Evaluating online labor markets for experimental tesearch:’s Mechanical Turk. Political analysis. 2012;20(3):351–368. Scholar
  9. 9.
    Buhrmester M, Kwang T, Gosling SD. Amazon’s Mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspectives in psychological science. 2011;6(1):3–5. Scholar
  10. 10.
    Woods AT, Velasco C, Levitan CA, Wan X, Spence C. Conducting perception research over the internet: a tutorial review. PeerJ. 2015;3:e1058. Scholar
  11. 11.
    Sheehan KB. Crowdsourcing research: Data collection with Amazon’s Mechanical Turk. Commun Monogr. 2017;0(0):1–17.
  12. 12.
    Shapiro DN, Chandler J, Mueller PA. Using Mechanical Turk to study clinical populations. Clinical psychological science. 2013;1(2):213–220. Scholar
  13. 13.
    Casler K, Bickel L, Hackett E. Separate but equal? A comparison of participants and data gathered via Amazon’s MTurk, social media, and face-to-face behavioral testing. Computers in human behavior. 2013;29(6):2156–2160. Scholar
  14. 14.
    Horton JJ, Rand DG, Zeckhauser RJ. The online laboratory: Conducting experiments in a real labor market. Experimental economics. 2011;14(3):399–425. Scholar
  15. 15.
    Mason W, Suri S. Conducting behavioral research on Amazon’s Mechanical Turk. Behavioral research methods. 2012;44(1):1–23. Scholar
  16. 16.
    Ranard BL, Ha YP, Meisel ZF, et al. Crowdsourcing—harnessing the masses to advance health and medicine, a systematic review. Journal of general internal medicine. 2014;29(1):187–203. Scholar
  17. 17.
    Constitution of the World Health Organization. 1946.
  18. 18.
    Aghdasi N, Bly R, White LW, Hannaford B, Moe K, Lendvay TS. Crowd-sourced assessment of surgical skills in cricothyrotomy procedure. Journal of surgical research. 2015;196(2):302–306. Scholar
  19. 19.
    Arch JJ, Carr AL. Using Mechanical Turk for research on cancer survivors. Psychooncology. 2016;
  20. 20.
    Arditte KA, Cek D, Shaw AM, Timpano KR. The importance of assessing clinical phenomena in Mechanical Turk research. Psychological assessment. 2016;28(6):684–691. Scholar
  21. 21.
    Bardos J, Friedenthal J, Spiegelman J, Williams Z. Cloud based surveys to assess patient perceptions of health care: 1000 respondents in 3 days for US $300. JMIR research protocols. 2016;5(3):e166. Scholar
  22. 22.
    Boynton MH, Richman LS. An online daily diary study of alcohol use using Amazon’s Mechanical Turk. Drug and alcohol review. 2014;33(4):456–461. Scholar
  23. 23.
    Brady CJ, Villanti AC, Pearson JL, Kirchner TR, Gupta OP, Shah CP. Rapid grading of fundus photographs for diabetic retinopathy using crowdsourcing. Journal of medical internet research. 2014;16(10):e233. Scholar
  24. 24.
    Briones EM, Benham G. An examination of the equivalency of self-report measures obtained from crowdsourced versus undergraduate student samples. Behavioral research methods. 2016.
  25. 25.
    Brown AW, Allison DB. Using crowdsourcing to evaluate published scientific literature: Methods and example. PLoS One. 2014;9(7):e100647. Scholar
  26. 26.
    Chen C, White L, Kowalewski T, et al. Crowd-Sourced Assessment of Technical Skills: A novel method to evaluate surgical performance. Journal of surgical research. 2014;187(1):65–71. Scholar
  27. 27.
    Deal SB, Lendvay TS, Haque MI, et al. Crowd-sourced assessment of technical skills: an opportunity for improvement in the assessment of laparoscopic surgical skills. American journal of surgery. 2016;211(2):398–404. Scholar
  28. 28.
    Gardner RM, Brown DL, Boice R. Using Amazon’s Mechanical Turk website to measure accuracy of body size estimation and body dissatisfaction. Body image. 2012;9(4):532–534. Scholar
  29. 29.
    Good BM, Nanis M, Wu C, Su AI. Microtask crowdsourcing for disease mention annotation in PubMed abstracts. Pacific symposium on biocomputing. 2015:282–293.
  30. 30.
    Harber P, Leroy G. Assessing work–asthma interaction with Amazon Mechanical Turk. Journal of occupational medicine. 2015;57(4):381–385. Scholar
  31. 31.
    Harris JK, Mart A, Moreland-Russell S, Caburnay CA. Diabetes topics associated with engagement on Twitter. Preventing chronic disease. 2015;12:E62. Scholar
  32. 32.
    Hipp JA, Manteiga A, Burgess A, Stylianou A, Pless R. Webcams, crowdsourcing, and enhanced crosswalks: Developing a novel method to analyze active transportation. Frontiers in public health. 2016;4:1–9. Scholar
  33. 33.
    Holst D, Kowalewski TM, White LW, et al. Crowd-Sourced Assessment of Technical Skills (C-SATS): Differentiating animate surgical skill through the wisdom of crowds. Journal of endourology. 2015;29(10):1183–8. Scholar
  34. 34.
    Khare R, Burger JD, Aberdeen JS, et al. Scaling drug indication curation through crowdsourcing. Database. 2015;2015:bav016. Scholar
  35. 35.
    Kim HS, Hodgins DC. Reliability and validity of data obtained from alcohol, cannabis, and gambling populations on Amazon’s Mechanical Turk. Psychology of addictive behaviors. 2017;31(1):85–94. Scholar
  36. 36.
    Kuang J, Argo L, Stoddard G, Bray BE, Zeng-Treitler Q. Assessing pictograph recognition: A comparison of crowdsourcing and traditional survey approaches. Journal of medical internet research. 2015;17(12):e281. Scholar
  37. 37.
    Lee AY, Lee CS, Keane PA, Tufail A. Use of Mechanical Turk as a MapReduce framework for macular OCT segmentation. Journal of ophthalmology. 2016.
  38. 38.
    Lloyd JC, Yen T, Pietrobon R, et al. Estimating utility values for vesicoureteral reflux in the general public using an online tool. Journal of pediatric urology. 2014;10(6):1026–1031. Scholar
  39. 39.
    MacLean DL, Heer J. Identifying medical terms in patient-authored text: a crowdsourcing-based approach. Journal of the american medical informatics association. 2013;20(6):1120–1127. Scholar
  40. 40.
    Mitry D, Peto T, Hayat S, et al. Crowdsourcing as a screening tool to detect clinical features of glaucomatous optic neuropathy from digital photography. PLoS One. 2015;10(2):1–8. Scholar
  41. 41.
    Mitry D, Zutis K, Dhillon B, et al. The accuracy and reliability of crowdsource annotations of digital retinal images. Translational vision science & technology. 2016;5(5):6. Scholar
  42. 42.
    Mortensen JM, Musen MA, Noy NF. Crowdsourcing the verification of relationships in biomedical ontologies. AMIA Annual symposium proceedings. 2013;2013:1020–1029.PubMedPubMedCentralGoogle Scholar
  43. 43.
    Powers MK, Boonjindasup A, Pinsky M, et al. Crowdsourcing assessment of surgeon dissection of renal artery and vein during robotic partial nephrectomy: A novel approach for quantitative assessment of surgical performance. Journal of endourology. 2016;30(4):447–452. Scholar
  44. 44.
    Santiago-Rivas M, Schnur JB, Jandorf L. Sun protection belief clusters: Analysis of Amazon Mechanical Turk data. Journal of cancer education. 2016;31(4):673–678.
  45. 45.
    Schleider JL, Weisz JR. Using Mechanical Turk to study family processes and youth mental health: A test of feasibility. Journal of child and family studies. 2015;24(11):3235–3246. Scholar
  46. 46.
    Shao W, Guan W, Clark MA, et al. Variations in recruitment yield, costs, speed, and participant diversity across internet platforms in a global study examining the efficacy of an HIV/AIDS and HIV testing animated and live-action video. Digital culture & education. 2015;7(1):40–86.Google Scholar
  47. 47.
    Turner AM, Kirchhoff K, Capurro D. Using crowdsourcing technology for testing multilingual public health promotion materials. Journal of medical internet research. 2012;14(3):e79. Scholar
  48. 48.
    White LW, Kowalewski TM, Dockter RL, Comstock B, Hannaford B, Lendvay TS. Crowd-Sourced Assessment of Technical Skill: A valid method for discriminating basic robotic surgery skills. Journal of endourology. 2015;29(11):1295–1301. Scholar
  49. 49.
    Wu C, Scott Hultman C, Diegidio P, et al. What do our patients truly want? Conjoint analysis of an aesthetic plastic surgery practice using internet crowdsourcing. Aesthet Surg J. 2017;37(1):105–118.
  50. 50.
    Wymbs BT, Dawson AE. Screening Amazon’s Mechanical Turk for adults with ADHD. J Atten Disord. 2015:1–10.
  51. 51.
    Yu B, Willis M, Sun P, Wang J. Crowdsourcing participatory evaluation of medical pictograms using Amazon Mechanical Turk. Journal of medical internet research. 2013;15(6):e108. Scholar

Copyright information

© Society of General Internal Medicine 2017

Authors and Affiliations

  1. 1.Department of Health Sector Management and Policy School of Business Administration University of MiamiCoral GablesUSA
  2. 2.Duke UniversityDurhamUSA

Personalised recommendations