Accurate and reliable population health data are critical to public health and enable evidence-based planning, policy-making, and program evaluation. Public health agencies rely on local data to identify and monitor the burden of disease in a population over time and to answer important health policy questions. The New York City Health and Nutrition Examination Survey (NYC HANES), a local, representative health examination survey, was conducted twice in NYC, once in 2004 and again in 2013–14 [1, 2]. Data from NYC HANES have served as a strong complement to existing population health surveillance data, especially in the context of New York City’s urban environment. In this special issue, we describe a range of findings from NYC HANES 2013–14 that collectively illustrate the unique contribution of examination surveys to population health surveillance. Yet few public health agencies implement examination surveys, despite their important contributions, in part because such surveys can be expensive and difficult to conduct. We explore the value and challenges of surveys like NYC HANES, and we also describe emerging population health surveillance approaches that might provide complementary information to improve public health.

NYC HANES was modeled on the National Health and Nutrition Examination Survey (NHANES), using a similar sampling design, instruments, protocols, and testing laboratories. Results from NYC HANES 2004 showed for the first time how many New Yorkers suffered from diagnosed and undiagnosed chronic conditions like diabetes, high cholesterol, and depression, and whether their conditions were well controlled. NYC HANES 2004 findings [3] helped identify and support public health policies to improve New Yorkers’ health, such as laws to restrict the use of artificial trans-fat in restaurants and to reduce exposure to secondhand smoke, the creation of a diabetes A1c Registry to improve diabetes diagnosis and control, and an educational campaign to inform the public of high levels of mercury in certain fish. Ten years later, in 2013–14, the NYC Health Department and researchers at the CUNY School of Public Health (now at NYU School of Medicine) conducted a second NYC HANES to collect information about the health of New Yorkers, to assess health changes since 2004, and to support evaluation of health policies implemented over the past decade.

Unique strengths of a study like NYC HANES include the ability to quantify and characterize the burden of undiagnosed disease by combining self-reported survey data with objective measurements from laboratory testing or physical examination. Also, because NYC HANES remains consistent with the national NHANES design, comparisons can be made between NYC, a diverse, urban area, and the country as a whole to view local health changes in the context of national trends. For example, a 2016 study compared secondhand smoke exposure in NYC to the US overall using objectively measured blood cotinine levels and found that more non-smokers in NYC were exposed to secondhand smoke than were exposed nationally (37% vs. 24%) and that non-smoking residents in NYC experienced a substantial decrease in secondhand smoke exposure since 2004 (from 57% to 37%) [4]. Possible explanations for greater exposure in NYC included more multi-unit housing, greater population density, and pedestrian exposure.

Many of the articles in this issue demonstrate these strengths, including a brief report that describes increases in obesity and racial inequities from 2004 to 2013–14 using findings from measured height and weight, as well as a brief report on local changes in diagnosed and undiagnosed diabetes and treatment and control using results from laboratory testing. The paper on depression highlights that surveys like NYC HANES can capture mental health conditions and associated health behaviors, data not typically captured well in non-survey sources. Several papers in this issue compare NYC results to the US overall, highlighting issues unique to urban areas and providing context for changes resulting from municipal policy and programmatic initiatives. For example, one paper in this issue demonstrates that decreases in blood mercury levels in NYC adults from 2004 to 2013–14 were greater than the national decrease, most likely reflecting the impact of intense local educational campaigns about high mercury concentrations in certain types of fish, which led to decreased consumption of those species.

NYC HANES and other local surveys provide important local health information to inform municipal policies and program planning. In particular, health behaviors, health functioning, and small geographic area estimates resonate deeply with local policy-makers. Yet conducting an extensive in-person examination study such as NYC HANES is challenging. This is likely why only a handful of jurisdictions have been able to conduct local health examination surveys; these include NYC, Chicago [5], and Wisconsin [6]. Population-based health examination surveys are not only costly and logistically difficult, but residents in NYC and elsewhere increasingly express distrust when asked to participate. Declining response rates to both in-person and telephone surveys require researchers to spend more energy and resources to reach and recruit participants [7, 8]. Thus, an important question is whether such studies will continue to be a worthwhile investment, especially in the context of emerging surveillance sources. New technologies and data systems have prompted public health practitioners to examine novel ways of collecting and analyzing health data for surveillance [9], while maintaining well-established surveillance standards [10].

Electronic health records (EHRs) are an example of an emerging source of population health information. The widespread use of EHRs is encouraging and suggests that data captured in EHRs may eventually be representative of communities. As of 2015, 87% of office-based physicians were using an EHR [11]. One goal in conducting NYC HANES 2013–14 was to assess the potential for using EHR data for population health surveillance by validating a new EHR-based system, the NYC Macroscope, against NYC HANES findings. NYC HANES 2013–14 estimates were compared with EHR-derived estimates from the same period, with mixed results. Among those who had seen a provider for primary care in the past year, EHR-derived prevalence estimates for diabetes, hypertension, smoking, and obesity were similar to NYC HANES estimates, but EHR-derived estimates for depression and influenza vaccination were considerably lower than survey estimates [12, 13]. Other jurisdictions have used EHR data to look at a range of conditions, including chlamydia, Lou Gehrig’s disease, asthma, and kidney disease [14, 15]. Massachusetts has validated estimates for diabetes, asthma, smoking, hypertension, and obesity from the EHR-based MDPHnet against the national Behavioral Risk Factor Surveillance System (BRFSS) [14]. In Colorado, the Colorado Health Observation Regional Data Service (CHORDS) is being used for public health monitoring and evaluation [16]. More work is needed to understand how EHR-based health estimates can be used and to ensure that they are reliable and accurate.

While EHRs offer innovative, timely, and potentially cost-effective approaches to population health surveillance, to date, information from these sources is typically available only for select groups and cannot be generalized to entire populations. There are a few exceptions, such as Kaiser Permanente and the Department of Veterans Affairs, whose EHR networks have broad, representative coverage of well-defined populations. With even more widespread use of EHRs and improved documentation and standardization promoted through meaningful use [17], future population coverage and data quality may improve sufficiently to provide reliable and representative health estimates for other populations and for many more conditions. Yet EHRs are unlikely to replace survey research, as EHRs exclude those not in care and seldom contain information on health behaviors, such as alcohol use, exercise, and diet.

Health apps, mobile health technologies (mHealth), and social media are also opening new possibilities for health research and disease monitoring and management. Mobile health applications include smoking cessation, mental health counseling, chronic disease monitoring, and reminders to take prescription medication [18, 19, 20]. Technology is also driving innovations in how data can be analyzed. Private companies have been using artificial intelligence for health data analysis and predictive modeling [21, 22]. Public-private partnerships can encourage the use of advanced technology and have resulted in innovative approaches to capturing public health data. For example, the Centers for Disease Control and Prevention has recently partnered with Watson Health to explore ways to better manage data and respond to emergencies [23], and the NYC Health Department worked with Columbia University in using social media restaurant reviews to detect food poisoning outbreaks [24]. As such partnerships evolve, generating representative data that can inform public health agencies and policy-makers will be critical.

Innovative technologies can also be employed to strengthen how existing data are used for population health surveillance. One example is linking multiple data sources such as EHR, survey, and administrative data (e.g., housing, hospital, birth, and death records), enabling researchers to answer broader questions about causation, progression of risk factors and diseases, and the effect of environmental factors on health. New York City [25], San Diego [26], and other cities have compiled neighborhood-level data on crime, education, employment, housing, hospitalization, parks, transportation, and measures of social cohesion to provide a larger picture of the health of communities and the individuals living within those communities. Combining neighborhood and health survey data allows us to ask broader social questions, such as how crime and access to parks affect exercise and obesity, and which housing characteristics are associated with increased secondhand smoke exposure and asthma.

In summary, population health surveys like NYC HANES are an important component of public health surveillance because they are representative, and they capture health measures that cannot be easily self-reported, such as undiagnosed disease and exposure to environmental toxins. While such surveys can be expensive and labor-intensive as currently designed, methodologic advances in sampling and more cost-effective approaches to collect and analyze biospecimens may improve feasibility over time. In addition, with more reliable and complete complementary data sources, it may be beneficial to conduct smaller-scale examination surveys that focus on objective measures for targeted conditions. Emerging approaches such as EHR-based surveillance and better data linkage technology also allow researchers to more easily supplement traditional survey data. Strong population health surveillance systems rely on triangulation of multiple data sources. With time, more jurisdictions can potentially obtain valuable information by combining representative surveys with some of these newer approaches. For the foreseeable future, population-based surveys like NYC HANES can continue to play an important role in population health surveillance.