Using GIS technology to identify areas of tuberculosis transmission and incidence
Currently in the U.S. it is recommended that tuberculosis screening and treatment programs be targeted at high-risk populations. While a strategy of targeted testing and treatment of persons most likely to develop tuberculosis is attractive, it is uncertain how best to accomplish this goal. In this study we seek to identify geographical areas where on-going tuberculosis transmission is occurring by linking Geographic Information Systems (GIS) technology with molecular surveillance.
This cross-sectional analysis was performed on data collected on persons newly diagnosed with culture positive tuberculosis at the Tarrant County Health Department (TCHD) between January 1, 1993 and December 31, 2000. Clinical isolates were molecularly characterized using IS6110-based RFLP analysis and spoligotyping methods to identify patients infected with the same strain. Residential addresses at the time of diagnosis of tuberculosis were geocoded and mapped according to strain characterization. Generalized estimating equations (GEE) analysis models were used to identify risk factors involved in clustering.
Evaluation of the spatial distribution of cases within zip-code boundaries identified distinct areas of geographical distribution of same strain disease. We identified these geographical areas as having increased likelihood of on-going transmission. Based on this evidence we plan to perform geographically based screening and treatment programs.
Using GIS analysis combined with molecular epidemiological surveillance may be an effective method for identifying instances of local transmission. These methods can be used to enhance targeted screening and control efforts, with the goal of interruption of disease transmission and ultimately incidence reduction.
Keywords: Tuberculosis; Geographic Information System; Inverse Distance Weighting; Homeless Shelter; Tuberculosis Transmission
List of abbreviations
GEE: Generalized Estimating Equations
GIS: Geographic Information Systems
HIV: Human Immunodeficiency Virus
IDW: Inverse Distance Weighting
NCTOG: North Central Texas Council of Governments
RFLP: Restriction Fragment Length Polymorphism
TCHD: Tarrant County Health Department
TDH: Texas Department of Health
The application of molecular analysis to identify specific Mycobacterium tuberculosis (TB) strains, in combination with traditional surveillance, has yielded insights into tuberculosis transmission. These insights, together with a downward trend in tuberculosis in the United States, have led the Centers for Disease Control and Prevention to re-evaluate the TB elimination strategy and to recommend that testing be targeted at specific high-risk populations [2, 3]. The Institute of Medicine (IOM) also recommended the development of more effective methods for identifying persons with recently acquired infections as an important component of new strategies to limit the spread of tuberculosis. While a strategy of targeted testing and treatment of persons most likely to develop tuberculosis is attractive, it is uncertain how best to accomplish this goal.
Persons with molecularly clustered tuberculosis isolates are assumed to be in the same chain of recent tuberculosis transmission [5, 6]. Limited studies have been conducted to evaluate whether these clusters occur in predefined geographical areas [7, 8, 9, 10, 11]. If so, then geographically based screening and treatment could be an effective method for TB control programs to identify high risk populations. In this study we seek to determine if we can identify geographical areas with on-going tuberculosis transmission by linking Geographic Information Systems (GIS) technology with ongoing molecular surveillance.
This cross-sectional analysis was performed on data collected on all persons newly diagnosed with culture positive tuberculosis at the Tarrant County Health Department (TCHD) between January 1, 1993 and December 31, 2000. The TCHD serves the western portion of the Fort Worth-Dallas metropolitan area, which has a population of approximately 1.5 million. The Fort Worth-Dallas metropolitan area is the ninth largest in the U.S. This study is part of the recent collaborative project sponsored by the Centers for Disease Control and Prevention National Tuberculosis Genotyping and Surveillance Network for studying the molecular epidemiology of tuberculosis. All data and materials, including isolates, isolate genotypes, demographic factors, and addresses, were collected prospectively. Moreover, one of the stated objectives of the National Tuberculosis Genotyping and Surveillance Network was to characterize places involved in potential TB transmission.
All positive isolates obtained from persons residing in Tarrant County were sent to the Texas Department of Health (TDH) for DNA fingerprinting. Only persons whose M. tuberculosis strains were typed by the Texas Department of Health Mycobacteriology Laboratory were analyzed. IS6110-based RFLP analysis and spoligotyping of clinical isolates were used to identify patients infected with the same strain using published methods [16, 17]. IS6110-based RFLP analysis is a powerful tool for discerning one strain of M. tuberculosis from another when there are more than 6 copies of IS6110; however, a secondary typing method is needed to help differentiate strains with 6 or fewer IS6110 copies. For this project, isolates were considered to be clonally related (i.e., genotypically clustered) if they had identical IS6110 patterns containing seven or more bands, or if they had identical IS6110 patterns containing six or fewer bands and identical spoligotypes. A geographic cluster was defined as two or more patients with molecularly related TB strains living in Tarrant County, TX. The proportion of cases due to ongoing transmission was estimated allowing one source case per cluster (i.e., the n-1 method).
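The clustering rule and the n-1 estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual software: the record layout (`patient_id`, `is6110_pattern` as a tuple of band positions, `spoligotype`), the function names, and the six example isolates are all hypothetical.

```python
from collections import defaultdict

def genotype_key(isolate):
    """Build the key used to decide whether two isolates match.

    Seven or more IS6110 bands: match on the RFLP pattern alone.
    Six or fewer bands: the spoligotype must match as well.
    """
    pattern = isolate["is6110_pattern"]  # hypothetical: tuple of band positions
    if len(pattern) >= 7:
        return ("rflp", pattern)
    return ("rflp+spoligo", pattern, isolate["spoligotype"])

def cluster_isolates(isolates):
    """Group patients by genotype key; keep groups of two or more patients."""
    groups = defaultdict(list)
    for iso in isolates:
        groups[genotype_key(iso)].append(iso["patient_id"])
    return [ids for ids in groups.values() if len(ids) >= 2]

def n_minus_one_proportion(clusters, total_cases):
    """n-1 method: subtract one assumed source case from every cluster."""
    clustered_cases = sum(len(c) for c in clusters)
    return (clustered_cases - len(clusters)) / total_cases

# Hypothetical isolates, for illustration only.
high_copy = (1, 2, 3, 4, 5, 6, 7, 8)
low_copy = (1, 2, 3)
isolates = [
    {"patient_id": "A", "is6110_pattern": high_copy, "spoligotype": "S1"},
    {"patient_id": "B", "is6110_pattern": high_copy, "spoligotype": "S2"},
    {"patient_id": "C", "is6110_pattern": low_copy, "spoligotype": "S1"},
    {"patient_id": "D", "is6110_pattern": low_copy, "spoligotype": "S1"},
    {"patient_id": "E", "is6110_pattern": low_copy, "spoligotype": "S9"},
    {"patient_id": "F", "is6110_pattern": (4, 9), "spoligotype": "S1"},
]
clusters = cluster_isolates(isolates)
recent_transmission = n_minus_one_proportion(clusters, len(isolates))
```

In this toy data, A and B cluster on the high-copy pattern despite different spoligotypes, while E is separated from C and D because low-copy patterns also require a spoligotype match.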
Any patient who did not have both spoligotyping and RFLP analysis of IS6110 performed on their M. tuberculosis isolate, and/or did not live within Tarrant County at the time of collection was excluded from the geographical analysis. Each eligible patient participated in a standard interview as part of their routine initial medical evaluation. Interview data collected included current and past employment, housing, alcohol and illicit drug use, incarceration history, sexual orientation, and psychiatric history. Persons paid daily for work were considered sporadically employed; others were employed or unemployed. Homelessness was defined as being without a permanent address for more than 3 days since 1991. If persons had a history of homelessness, paid rent by the day, or lived with a non-spousal roommate without paying rent, they were considered unstably housed. Alcoholism was defined by admission of daily consumption of three or more ounces of an alcoholic beverage; a history of alcohol-related conditions including cirrhosis, hepatitis, alcohol withdrawal seizures; or incarceration for alcohol use. Illicit drug use was defined by admission of use or documentation of being under the influence of an illicit drug. Persons were classified as having a history of incarceration if they had spent more than 24 hours in any criminal justice facility since 1991. Patients born in the U.S. or one of its territories were considered American-born; all others were considered foreign-born. All patients received HIV testing and counseling as part of standard clinical practice at the time of diagnosis. HIV status was determined from these tests.
Residential addresses at the time of diagnosis of tuberculosis, including zip code, were geocoded using ArcView 4.0 Geographic Information System software (ESRI, Redlands, CA). After automatic and interactive geocoding, 94% of the cases were correctly matched. The numbers of cases were then aggregated by zip code and, for each zip code, the average of the total populations reported in the 1990 and 2000 US Censuses was used to calculate incidence. Population information was retrieved from the US Census Bureau and the North Central Texas Council of Governments (NCTOG). The US Census Bureau website provides census data aggregated to certain boundaries (e.g., block groups, blocks, census tracts, zip codes, counties, and states). The NCTOG, a collection of local governments in the Dallas-Fort Worth area, provides demographic and GIS data for the region. The demographic data it provides are extracted directly from the US Census. Zip-code level boundaries were established for incidence comparison purposes using zip code tabulation areas (ZCTAs).
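The incidence calculation described above can be sketched as follows. The function names and the example numbers are hypothetical, and the `years` parameter is an assumption: the text does not state whether rates were annualized over the eight-year study period, so the default of 1 leaves the rate un-annualized.

```python
def mean_population(pop_1990, pop_2000):
    """Average of the two census counts, used as the denominator population."""
    return (pop_1990 + pop_2000) / 2

def incidence_per_100k(cases, pop_1990, pop_2000, years=1):
    """Cases per 100,000 population, optionally divided over several years."""
    person_years = mean_population(pop_1990, pop_2000) * years
    if person_years <= 0:
        raise ValueError("population and years must be positive")
    return cases / person_years * 100_000

# Hypothetical zip code: 40 cases over 8 years, 18,000 and 22,000 residents.
example_rate = incidence_per_100k(40, 18_000, 22_000, years=8)
```

With these made-up inputs the mean population is 20,000, giving 160,000 person-years and an annualized rate of 25 cases per 100,000.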
The three-dimensional analysis was performed using Inverse Distance Weighting (IDW). Interpolation is the estimation of values for points in an area not actually sampled. There are many different types of interpolation, with IDW being the simplest interpolation method. A neighborhood about the interpolated point is identified and a weighted average is taken of the observation values within this neighborhood. The weights are a decreasing function of distance. The simplest weighting function is inverse power: $w(d) = 1/d^{p}$ with $p > 0$. For $p = 1$, the interpolated function is "cone-like" in the vicinity of the data points. The resulting "cone" shows the clustering of data around the center point of a geographical area.
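As a rough illustration of the interpolation just described, the sketch below computes an inverse-distance-weighted estimate at a single point. It is a generic textbook form of IDW, not the GIS package's implementation; the function name `idw`, the optional `radius` neighborhood, and the sample coordinates are assumptions made for the example.

```python
import math

def idw(x, y, samples, power=1.0, radius=None):
    """Inverse-distance-weighted estimate at (x, y).

    samples: iterable of (sx, sy, value) observations.
    power:   exponent p in the weight w(d) = 1 / d**p, with p > 0.
    radius:  optional neighborhood; samples farther away are ignored.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a data point: return the observation
        if radius is not None and d > radius:
            continue
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("no samples within the search radius")
    return weighted_sum / weight_total

# Two hypothetical observations on a line, 2 units apart.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw(1.0, 0.0, samples)       # equal weights
near_first = idw(0.5, 0.0, samples)     # pulled toward the first sample
```

At the midpoint both samples get equal weight, so the estimate is their mean; moving toward the first sample pulls the estimate toward its value. Larger `power` values make that pull more local.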
Statistical analysis was performed using SAS V.8 statistical software (SAS Institute, Cary, NC). Patients with genotypically clustered strains and patients with unique (non-clustered) strains were compared on each categorical risk variable, using the odds ratio as a measure of association. Risk categories included patient demographics and tuberculosis risk factors such as homelessness, HIV infection, incarceration, and foreign birth. Because members of the clustered cases are assumed to be related, generalized estimating equations (GEE) analysis was performed to determine factors associated with infection with genotypically and geographically clustered strains of M. tuberculosis, and to derive maximum likelihood odds ratios and 95% confidence intervals for all variables. Age was the only continuous variable. Age was analyzed by comparing the means between groups. A 95% confidence interval for the mean age difference was calculated using the normal approximation, and an independent-samples two-tailed Student's t-test was used to assess the statistical significance of the mean age difference. The institutional review board of the University of North Texas Health Science Center at Fort Worth approved this investigation.
Table: Selected factors associated with genotypic clustering. *Within clustering; CI = confidence interval; OR = odds ratio. Row labels: living in zip code 1; living in zip code 2; living in zip code 3.
Three hundred and twenty-one (65.7%) patients were born in the United States. Of those, 235 (73.2%) had clinical isolates that matched the isolate from at least one other person living in Tarrant County. One hundred and sixty-seven patients were born outside of the United States. Of those, 57 (34.1%) had clinical isolates that matched the isolate from at least one other person living in Tarrant County. U.S.-born individuals were significantly more likely to be genotypically clustered than their foreign-born counterparts [OR = 5.3, 95% CI 3.5, 7.9]. The birth country of foreign-born patients varied. Of those born outside of the U.S., 77 (46.1%) were born in Latin America, 47 (28.1%) in Southeast Asia, 14 (8.4%) in Sub-Saharan Africa, 12 (7.2%) in Pacific Asia, 11 (6.6%) in South Asia, and 6 (3.6%) in Europe.
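As a worked check on the counts above, a crude (unadjusted) odds ratio with a Wald 95% confidence interval can be computed from the 2x2 table of birthplace by clustering status. This is only a sketch: the study derived its estimates with GEE models in SAS, so the simple formula below is a stand-in, and the function name is hypothetical. With these counts the crude estimate rounds to the reported 5.3 (3.5, 7.9).

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed, clustered        b: exposed, not clustered
    c: unexposed, clustered      d: unexposed, not clustered
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts taken from the paragraph above.
us_born_clustered, us_born_total = 235, 321
foreign_clustered, foreign_total = 57, 167

crude_or, ci_low, ci_high = odds_ratio_wald(
    us_born_clustered,
    us_born_total - us_born_clustered,   # 86 U.S.-born with unique strains
    foreign_clustered,
    foreign_total - foreign_clustered,   # 110 foreign-born with unique strains
)
```

Agreement between a crude and a model-based estimate is not guaranteed in general; GEE additionally accounts for correlation among members of the same cluster.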
GIS analysis demonstrated that the areas with the highest incidence also have the highest proportion of persons with genotypically clustered isolates. A strong preponderance of clustering occurred in the urban center of Tarrant County. The highest proportion of persons with molecularly clustered TB isolates (80.4% clustered) occurred in the zip code with the highest incidence (zip code 1). Similarly, zip code 2, on the southeast border of zip code 1, recorded the second highest proportion of persons with molecularly clustered TB isolates, with 76.6% of all reported cases clustered. Cases reported in zip code 1 were more than six times as likely [OR = 6.2, 95% CI = 2.4, 16.1] as cases in any other zip code to have isolates that matched at least one other person living in Tarrant County.
In zip code 3, we observed a morbidity that was more than triple the county average (22.3 cases per 100,000). Unlike other high morbidity areas, zip code 3 had a strong preponderance of unique strain distribution. In this zip code, 17 out of 26 (65.4%) patients had isolates that did not match any other patient in Tarrant County. Cases reported in this zip code were 70% less likely [OR = 0.3, 95% CI = 0.1, 0.7] to have a clustered strain, suggesting that the high rates of tuberculosis did not result from local on-going transmission.
The number of cases in the United States is at its lowest point in history, with 15,075 cases reported in 2002. The role of treatment of latent tuberculosis infection (LTBI) in tuberculosis elimination is of increasing importance. The IOM recommended developing improved methods for identifying persons with recently acquired infections as an important component of strategic tuberculosis elimination in the United States. This study uncovered geographical links to on-going tuberculosis transmission, enhancing traditional public health surveillance. By combining molecular strain characterization with GIS analysis, we found that the risk of on-going transmission was geographically focal (p = 0.003), with significant clustering of cases occurring in 3 of 59 zip codes. This demonstrated that the current methods of surveillance of contacts of persons with tuberculosis were not completely effective in interrupting disease transmission in these zip code areas.
These findings are similar to those reported in Los Angeles, where locations, specifically homeless shelters, were identified as important sites of tuberculosis transmission. Similarly, in Houston, locations, specifically bars, were as important as persons in uncovering epidemiological and genotypic links in outbreak investigations. The authors of both of these studies suggested that measures to reduce tuberculosis transmission should be based on locations as well as personal contacts [33, 34].
We found that 55% of our patients were clustered and that 47% of cases were attributable to ongoing community transmission. This differs from a study conducted in a high incidence area of South Africa, where 72% of cases were clustered and 58% were attributable to ongoing community transmission. Our lower percentages of clustering and attributable on-going transmission may be related to a much lower reported overall morbidity or to differences in programmatic interventions, such as contact investigation, targeted screening efforts, or directly observed therapy (DOT) completion rates.
Although the majority of the tuberculosis morbidity within the developed world is strongly influenced by imported tuberculosis from high prevalence countries [35, 36], the rates at which these individuals transmit disease to the general population remain low. We found that foreign-born cases were significantly more likely to have a unique strain [OR = 6.4, 95% CI = 4.1, 9.8], indicating that immigrants were less likely to be a source of ongoing transmission of TB in Tarrant County. In a San Francisco based study, investigators identified only two instances of a foreign-born individual transmitting the disease to the native population. Similarly, only 1.8% of transmission from infectious Somali immigrants was to the native population in the Netherlands over the period from 1992 to 1999. Historically, tracking these foreign-born populations to assess transmission has been difficult. GIS provides another approach for evaluating this issue. As this study illustrates, identifying geographical areas of increased incidence with a high percentage of unique strains may improve local surveillance methods to locate hard-to-reach foreign-born populations before transmission occurs.
There are some limitations to this research approach. The study is based on secondary data, which include variables collected over a cross-sectional period of time. Although each case is an incident case at the time of diagnosis, under this cross-sectional design, exposure and disease outcomes are assessed simultaneously. In addition, patients with tuberculosis may have moved shortly before their diagnosis. However, this should not cause systematic error (bias) or result in an association of clustering with specific locations, because these events would be expected to produce a random misclassification. Also, persons exposed within certain zip codes may go on to reside elsewhere and later develop the disease, which would result in an underestimate of the morbidity in the zip code of exposure and may be reflected in the calculated associations. Finally, genotyping results were not available for a proportion of TB cases in this study. Some unique isolates might have clustered if some of the missing isolates had been available or if other cases with the same strain moved or are located outside the study area. We therefore believe that estimates of the degree of clustering and the size of clusters are conservative.
When using this approach TB control programs must select the appropriate geographical boundary to examine transmission in their area. For example, using zip codes may be too large a boundary in very populated metropolitan areas. Census block groups may provide greater resolution in determining localized transmission.
Nor are the molecular techniques used without limitation. Patients are clustered according to their isolates having the same genotype. While IS6110 RFLP is recognized as the most discriminatory method for genotyping M. tuberculosis isolates, the discriminatory ability of the technique decreases when there are fewer than 6 IS6110 insertions in the genome. In this case, spoligotyping was used for further strain discrimination. However, it is still possible that some isolates classified as being the same strain based on identical genotypes may represent distantly related, but distinct, strains. Moreover, demonstration that particular patients have the same strain supports, but does not irrefutably prove, direct transmission between these patients as opposed to another source of infection. Conversely, strains continue to evolve, and the resulting genotypic differences over time can result in assigning isolates from cases of direct transmission to distinct strain lineages. Given that a small minority of the isolates had fewer than 6 IS6110 bands (18.2%) or differed by the presence or absence of one band in an otherwise conserved pattern (3.7%), we believe that estimates of the degree of clustering and the size of clusters are conservative.
Using GIS analysis combined with molecular epidemiological surveillance can be an effective method for identifying tuberculosis transmission not identified by standard contact tracing methods. These methods can be applied in countries where contact tracing is routinely performed. They can enhance targeted screening and control efforts, with the goal of interruption of disease transmission and ultimately incidence reduction. This study demonstrates that, using existing health data, GIS can identify previously undetected TB transmission. These results were used to design new targeted screening efforts. Studies of these efforts are ongoing to determine whether identifying focal areas for targeted screening has utility in reducing TB transmission.
This work was supported in part by the Centers for Disease Control and Prevention, National Tuberculosis Genotyping and Surveillance Network Cooperative Agreement U52/CCU600497-18, and Tuberculosis Epidemiologic Studies Consortium 200-2001-00084.
We are grateful to Drs. Wendy Cronin and Marco Marruffo for their thoughtful review of the manuscript, and to Curtis Denton and Deanna Sanchez for their assistance in creating maps.
- 2. From the Centers for Disease Control and Prevention: Tuberculosis morbidity among U.S.-born and Foreign-born populations – United States 2000. MMWR Morb Mortal Wkly Rep. 2002, 51 (5): 101-4.
- 3. Centers for Disease Control and Prevention: Advisory Council for the Elimination of Tuberculosis (ACET): Tuberculosis elimination revisited: obstacles, opportunities, and a renewed commitment. MMWR Recomm Rep. 1999, 48 (RR-9): 1-13.
- 4. Institute of Medicine: Ending Neglect: the elimination of tuberculosis in the United States. 2000, Washington, D.C.: National Academy Press
- 5. Small PM, Hopewell PC, Singh SP, Paz A, Parsonnet J, Ruston DC, Schecter GF, Daley CL, Schoolnik GK: The epidemiology of tuberculosis in San Francisco: a population-based study using conventional and molecular methods. N Engl J Med. 1994, 330: 1703-9. 10.1056/NEJM199406163302402.
- 10. Bishai WR, Graham NM, Harrington S, Pope DS, Hooper N, Astemborski J, Sheely L, Vlahov D, Glass GE, Chaisson RE: Molecular and geographic patterns of tuberculosis transmission after 15 years of directly observed therapy. JAMA. 1998, 280: 1679-84. 10.1001/jama.280.19.1679.
- 11. Verver S, Warren RM, Munch Z, Vynnycky E, van Helden PD, Richardson M, van der Spuy GD, Enarson DA, Borgdorff MW, Behr MA, Beyers N: Transmission of tuberculosis in a high incidence urban community in South Africa. Int J Epidemiol. 2004, 33 (2): 351-7. 10.1093/ije/dyh021.
- 12. U.S. Census Bureau 2000: Metropolitan area population estimates. [http://www.census.gov/population/estimates/metrocity/ma99.04.txt]
- 13. U.S. Census Bureau 2000: Metropolitan area population size and percent change. [http://www.census.gov/population/estimates/metro-city/ma99.02.txt]
- 16. Kamerbeek JLS, Kolk A, van Agterveld M, van Soolingen D, Kuijper S, Bunschoten A, Molhuizen H, Shaw R, Goyal M, van Embden JD: Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. J Clin Microbiol. 1997, 35: 907-914.
- 17. Kremer K, van Soolingen D, Frothingham R, Haas WH, Hermans PW, Martin C, Palittapongarnpim P, Plikaytis BB, Riley LW, Yakrus MA, Musser JM, van Embden JD: Comparison of methods based on different molecular epidemiological markers for typing of Mycobacterium tuberculosis complex strains: interlaboratory study of discriminatory power and reproducibility. J Clin Microbiol. 1999, 37: 2607-2618.
- 19. U.S. Census Bureau. [http://www.census.gov]
- 20. North Central Texas Council of Governments. [http://www.nctcog.org]
- 21. US Census Bureau. Population data by zip code tabulation areas (ZCTAs). [http://www.census.gov/geo/ZCTA/zctafaq.html]
- 22. Fisher NI, Lewis T, Embleton BJ: Statistical Analysis of Spherical Data. 1999, Cambridge, U.K.: Cambridge University Press, 329-
- 24. Centers for Disease Control and Prevention: Reported Tuberculosis in the United States, 2002. 2003, Atlanta, GA: CDC
- 29. Yaganehdoost A, Graviss EA, Ross MW, Adams GJ, Ramaswamy S, Wanger A, Frothingham R, Soini H, Musser JM: Complex transmission dynamics of clonally related virulent Mycobacterium tuberculosis associated with barhopping by predominantly human immunodeficiency virus-positive gay men. J Infect Dis. 1999, 180 (4): 1245-51. 10.1086/314991.
- 37. Chin DP, DeRiemer K, Small PM, de Leon AP, Steinhart R, Schecter GF, Daley CL, Moss AR, Paz EA, Jasmer RM, Agasino CB, Hopewell PC: Differences in contributing factors to tuberculosis incidence in U.S.-born and foreign-born persons. Am J Respir Crit Care Med. 1998, 158 (6): 1797-803.
- 38. Lillebaek T, Andersen AB, Bauer J, Dirksen A, Glismann S, de Haas P, Kok-Jensen A: Risk of M. tuberculosis transmission in a low-incidence country due to immigration from high-incidence areas. J Clin Microbiol. 2001, 39 (3): 855-861. 10.1128/JCM.39.3.855-861.2001.
- 40. Burgess G, Moonan PK, Weis SE: National Association of County and City Health Officials: Model Practice Database. [http://archive.naccho.org/modelPractices/Result.asp?PracticeID=108]
This article is published under license to BioMed Central Ltd. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.