Introduction

In Japan, the incidence of gastric cancer is expected to continue its current downward trend because younger generations have a lower incidence of Helicobacter pylori infection [1]. In this study, therefore, we aimed to estimate, from epidemiologic and statistical perspectives, how long gastric cancer screening will remain necessary. For clarity, the analysis is restricted to population-based screening.

In Japan, based on the “Japanese guidelines for gastric cancer screening 2014 edition” edited by the National Cancer Center [2], the Ministry of Health, Labor, and Welfare recommends radiographic screening and endoscopy as population-based screening [3]. Endoscopic screening was added to the recommendation only recently, in 2016. In principle, population-based screening should be introduced and conducted after weighing the benefits in terms of mortality reduction against the harms of screening [4, 5]. Although there is much disagreement over whether screening that falls short of this standard is justifiable, few would object to screening that meets it. The challenge is how to compare the benefit, i.e., the size of the mortality reduction, with the potential harms of screening. The most common harms include false-negative test results, false-positive test results, overdiagnosis, and adverse reactions to screening and diagnostic examination procedures. These harms are not easily compared with the size of the mortality reduction because they are fundamentally different in nature. To compare the benefits and harms of screening, the Japanese guidelines for cancer screening 2014 edition use the number needed to screen (NNS), which represents the size of the mortality reduction, as the benefit indicator and the recall rate as the harm indicator, as do the Japanese guidelines for breast cancer screening [6]. NNS is the estimated number of people who must participate in a screening program to prevent one death over a defined time interval; a smaller NNS therefore implies a larger benefit.
The recall rate, in turn, is expressed as the number of people who must undergo diagnostic examination procedures to prevent one death over a defined time interval, referred to in this article as the number needed to recall (NNR); a larger NNR implies larger harms, i.e., inconvenience to more people. In the above-mentioned Guidelines, thresholds of 1000 and 100 are set as tentative criteria for NNS and NNR, respectively. To judge how long gastric cancer screening should continue, we used these criteria for the following reasons: they are employed in widely used Guidelines; they allow quantitative analysis; and no alternative proven criteria are available. In short, we calculate NNS and NNR, compare them with the thresholds of 1000 and 100, respectively, and use the results as part of the basis for deciding whether continuing or discontinuing gastric cancer screening programs is justifiable.
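The relationship between the two indicators and their thresholds can be sketched as follows. This is a minimal illustration, not the Guidelines' calculation: it assumes that NNS is the reciprocal of the absolute reduction in cumulative mortality over the chosen interval, that NNR equals NNS multiplied by the recall rate, and that a value strictly below its threshold counts as meeting the criterion. The function names and the input values are hypothetical.

```python
NNS_THRESHOLD = 1000  # tentative criterion cited from the Guidelines
NNR_THRESHOLD = 100   # tentative criterion cited from the Guidelines


def number_needed_to_screen(baseline_mortality: float, relative_risk: float) -> float:
    """People screened per death prevented (assumed: 1 / absolute risk reduction).

    baseline_mortality is the cumulative probability of gastric cancer death
    without screening over the chosen interval (hypothetical input).
    """
    absolute_reduction = baseline_mortality * (1.0 - relative_risk)
    if absolute_reduction <= 0:
        return float("inf")  # no mortality reduction, so no finite NNS
    return 1.0 / absolute_reduction


def number_needed_to_recall(nns: float, recall_rate: float) -> float:
    """People recalled per death prevented (assumed: NNS x recall rate)."""
    return nns * recall_rate


def meets_criteria(nns: float, nnr: float) -> bool:
    """Assumed reading: both indicators must fall below their thresholds."""
    return nns < NNS_THRESHOLD and nnr < NNR_THRESHOLD


# Illustrative values only; they are not taken from the study's tables.
nns = number_needed_to_screen(baseline_mortality=0.004, relative_risk=0.6)
nnr = number_needed_to_recall(nns, recall_rate=0.05)
print(round(nns), round(nnr, 2), meets_criteria(nns, nnr))
```

Under these assumptions, a recall rate of 10% makes the two thresholds coincide (NNR reaches 100 exactly when NNS reaches 1000), whereas at a recall rate of 5% the NNS criterion is the stricter of the two.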

To maximize the effect of population-based screening, a higher participation rate is necessary. Nevertheless, the participation rate is as low as 40% in Japan [7], and the government has set a goal of 50% in the Third term Basic Plan to Promote Cancer Control Programs in Japan [8]. Because the number of lives saved (NLS) varies with the participation rate, the NLS at participation rates of 50% and 100%, compared with that at the present rate (40%), is also used as a benefit indicator in this study.

Methods

NNS, NNR, and NLS are estimated by sex and age group. Their estimation requires data on gastric cancer mortality, the effect of screening on mortality reduction, and the recall rate. Projections of future gastric cancer deaths by sex and age group in Japan are available from the National Cancer Center [9]. These projections are provided for 7 age groups (0–14, 15–44, 45–54, 55–64, 65–74, 75 years or older, and all ages); of these, we used the groups 45–54, 55–64, 65–74, and 75 years or older, defined by age at the time of screening. In addition to the number of deaths, the estimation of mortality rates requires estimates of the future population, which should be calculated with the same method and numbers as those used for the number of deaths; we therefore used the method described in reference [10]. However, because no projection of the future Japanese population for 2015 and beyond has been publicly disclosed, we calculated the ratio of the Japanese population to the total population in Japan by sex and 5-year age group and multiplied it by the total population estimates (based on the medium estimates of births and deaths) for 2020, 2025, 2030, and 2035 to obtain estimates of the future Japanese population by sex and 5-year age group. The data on the total population of Japan are published by The National Institute of Population and Social Security Research [11]. The gastric cancer mortality rates for 2020, 2025, 2030, and 2035 are estimated from the projected numbers of deaths for 2020–2024, 2025–2029, 2030–2034, and 2035–2039, respectively. Mortality trends are shown using observed values until 2015 [12] and estimates for 2020–2035.
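The population adjustment and rate calculation described above can be sketched as follows. All numbers are placeholders rather than the published projections, and two details are assumptions of this sketch: the ratio is taken from a single reference year for which both populations are known, and the projected deaths for a 5-year period are averaged per year before being divided by the population. The function names are hypothetical.

```python
def estimate_japanese_population(japanese_pop_ref: float,
                                 total_pop_ref: float,
                                 total_pop_projected: float) -> float:
    """Scale a projected total population by the Japanese-to-total ratio.

    The ratio comes from a reference year (assumption) and is applied
    within one sex and 5-year age group.
    """
    ratio = japanese_pop_ref / total_pop_ref
    return ratio * total_pop_projected


def mortality_rate_per_100k(deaths_in_period: float,
                            population: float,
                            years_in_period: int = 5) -> float:
    """Annual mortality rate per 100,000 from deaths projected for a period.

    Averaging the period's deaths per year is an assumption of this sketch.
    """
    annual_deaths = deaths_in_period / years_in_period
    return annual_deaths / population * 100_000


# Placeholder inputs for one sex and age group (not real data).
population_2020 = estimate_japanese_population(
    japanese_pop_ref=990_000,
    total_pop_ref=1_000_000,
    total_pop_projected=900_000,
)
rate_2020 = mortality_rate_per_100k(deaths_in_period=2_000,
                                    population=population_2020)
print(f"{population_2020:.0f} people, {rate_2020:.1f} deaths per 100,000 per year")
```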

To estimate NNS, the above-mentioned Guidelines used relative risks (RR) of gastric cancer mortality, taken from several studies, to express the effectiveness of radiography and endoscopy [13,14,15]. In this study, several relative risk values associated with screening are used to estimate future NNSs and NNRs under different scenarios. For reference, Table 1 lists the relative risk values used in the Guidelines. These values ranged from 0.1 to 1.07, including values that were either too large or too small to be informative; we therefore selected 5 values (0.5, 0.6, 0.7, 0.8, and 0.9) for the scenarios in this study. A recent Korean study reported an RR of 0.53 (95% CI 0.51–0.56) for endoscopy screening, which does not contradict our scenarios [16].

Table 1 Relative risk used to estimate number needed to screen in the Japanese guidelines for gastric cancer

The recall rates cited in the above-mentioned Guidelines are radiography data derived from the 2011 annual report of The Japanese Society of Gastrointestinal Cancer Screening [17] and endoscopy data collected in Niigata City and reported in 2012 [18] (Table 2). The reported ranges of recall rates for radiography and endoscopy were 4.1–12.2% and 2.9–11.6%, respectively. In this study, we used recall rates of 5% and 10% as scenarios.
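The ten scenarios implied by the five relative risks and two recall rates can be enumerated as below. To illustrate how the scenarios interact with the thresholds of 1000 and 100, the sketch also computes the lowest baseline mortality at which both thresholds would still be met. It assumes NNS = 1 / (baseline mortality × (1 − RR)) and NNR = NNS × recall rate; these formulas and all names are assumptions of the sketch and are not taken from the Guidelines.

```python
from itertools import product

RELATIVE_RISKS = (0.5, 0.6, 0.7, 0.8, 0.9)  # scenario values from the text
RECALL_RATES = (0.05, 0.10)                 # scenario values from the text
NNS_THRESHOLD = 1000
NNR_THRESHOLD = 100


def break_even_mortality(relative_risk: float, recall_rate: float) -> float:
    """Baseline cumulative mortality at which both thresholds are just reached.

    Derived from the assumed formulas:
      NNS < 1000  <=>  m0 > 1 / (1000 * (1 - RR))
      NNR < 100   <=>  m0 > recall_rate / (100 * (1 - RR))
    """
    m_for_nns = 1.0 / (NNS_THRESHOLD * (1.0 - relative_risk))
    m_for_nnr = recall_rate / (NNR_THRESHOLD * (1.0 - relative_risk))
    return max(m_for_nns, m_for_nnr)


scenarios = list(product(RELATIVE_RISKS, RECALL_RATES))
for rr, recall in scenarios:
    m0 = break_even_mortality(rr, recall)
    print(f"RR={rr:.1f}  recall={recall:.0%}  "
          f"break-even mortality={m0 * 100_000:.0f} per 100,000")
```

Under these assumptions the required baseline mortality rises steeply as RR approaches 1, which is consistent with weaker screening effects qualifying only the groups with the highest mortality.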

Table 2 Recall rate used to estimate number needed to recall in the Japanese guidelines for gastric cancer

For estimating NLS, the hypothetical number of gastric cancer deaths without screening, $\hat{D}_0$, is estimated as follows:

$$\hat{D}_0 = \frac{D_{\text{obs}}}{1 - P_{\text{obs}}\left(1 - \text{RR}\right)},$$

where $D_{\text{obs}}$ is the observed number of deaths and $P_{\text{obs}}$ is the observed participation rate of screening. $\text{NLS}_t$ is estimated as a function of the target participation rate $P_t$:

$$\widehat{\text{NLS}}_t = \hat{D}_0\left(1 - P_t\left(1 - \text{RR}\right)\right).$$

The observed participation rate is set at 40%, and the target participation rates are set at 50% and 100%. For the future projections, $P_{\text{obs}}$ is assumed to equal the present participation rate, i.e., 40%.
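The two expressions above can be evaluated as in the following sketch. Read literally, the second expression gives the expected number of deaths at the target participation rate, so the sketch also reports its difference from the observed deaths; treating that difference as the lives saved relative to the present 40% participation is an assumption of this sketch. The function names and the death count are hypothetical.

```python
def deaths_without_screening(d_obs: float, p_obs: float, rr: float) -> float:
    """First expression: hypothetical deaths if nobody were screened."""
    return d_obs / (1.0 - p_obs * (1.0 - rr))


def deaths_at_participation(d0: float, p_target: float, rr: float) -> float:
    """Second expression as written: deaths expected at participation p_target."""
    return d0 * (1.0 - p_target * (1.0 - rr))


def lives_saved_vs_present(d_obs: float, p_obs: float,
                           p_target: float, rr: float) -> float:
    """Assumed reading: observed deaths minus deaths at the target rate."""
    d0 = deaths_without_screening(d_obs, p_obs, rr)
    return d_obs - deaths_at_participation(d0, p_target, rr)


# Placeholder death count for one sex and age group (not real data).
D_OBS, P_OBS, RR = 1_000.0, 0.40, 0.5
for p_target in (0.50, 1.00):
    saved = lives_saved_vs_present(D_OBS, P_OBS, p_target, RR)
    print(f"target participation {p_target:.0%}: {saved:.1f} fewer deaths")
```

A quick consistency check is that setting the target rate equal to the observed rate reproduces the observed deaths, so the difference is zero.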

Results

Figures 1 and 2 show past trends and future projections of gastric cancer mortality by age group. Downward trends are evident for both men and women in every age group aged 45 years or older.

Fig. 1 Observed and projected trends of age-specific gastric cancer mortality in Japan for males

Fig. 2 Observed and projected trends of age-specific gastric cancer mortality in Japan for females

Tables 3 and 4 show the estimates of NNS and NNR. As expected, a higher relative risk (i.e., a smaller effect) and/or a lower mortality rate yields a higher NNS. The benefits of screening exceeded the harms more prominently in men than in women, in older than in younger age groups, and at present than in the future. Both the NNS and NNR criteria were fulfilled, that is, both benefits and harms were within acceptable limits to justify screening, for the following age groups (years): men ≥ 55 and women ≥ 65 when the relative risk (RR) of screening is set at 0.5 or 0.6; men ≥ 65 and women ≥ 75 when RR is set at 0.7 or 0.8; and men ≥ 75 only when RR is set at 0.9.

Table 3 Number needed to screen, number needed to recall, and number of lives saved by gastric cancer screening based on future prediction of gastric cancer mortality
Table 4 Number needed to screen, number needed to recall, and number of lives saved by gastric cancer screening based on future prediction of gastric cancer mortality

NLS, which is a function of RR, mortality, and participation rate, is substantial for those aged 65 years or older when the participation rate reaches the national goal of 50%, whereas it is small whenever any two of the following conditions hold together: female sex, RR ≥ 0.8, and age 54 years or younger.

Discussion

In this study, we investigated the appropriate target population for gastric cancer screening and how long screening should continue, based on future projections of gastric cancer mortality and from the standpoint of balancing the benefits and harms of screening. The results show that, until 2035, screening programs with larger mortality reduction effects (relative risk 0.5 and 0.6) are beneficial for men aged 55 years or older and women aged 65 years or older. Under the conditions and scenarios selected in this study, neither men nor women in the 45–54 age group met the criteria for benefits and harms, even in 2010 and 2015.

This study provides evidence for decisions based on benefits and harms through numerical criteria using NNS, NNR, and NLS. Balancing estimates of benefits and harms in this way is a standard method for evaluating whether to introduce and continue population-based screening [5, 19, 20]. Although more comprehensive balance sheets have been proposed [21, 22], the typical indicators are mortality reduction for benefit and false-positive results, overdiagnosis, and adverse reactions to screening and diagnostic examination procedures for harm [19, 20, 23]. The NNS and NNR used in this study are indicators of mortality reduction and false-positive results, transformed for intuitive interpretation. Overdiagnosis could not be examined because of the lack of reports on overdiagnosis in gastric cancer screening [2]. Because the severity of adverse reactions is difficult to compare numerically with the benefit of screening, NNS and NNR were used to balance benefits and harms in this study. As for the thresholds, no consensus has been reached, either because of the uncertainty and variability in the evidence used to make these estimates [20] or because the choice is a matter of individual judgement [19]. In this study, we used thresholds of 1000 for NNS and 100 for NNR based on the Japanese guidelines for cancer screening 2014 edition [2]. These thresholds are meaningful in Japan because the guideline recommendation and the subsequent government decision were based on them. Even if such thresholds are not used, the combinations of NNS and NNR for the various scenarios in Tables 3 and 4 will help in evaluating whether to continue gastric cancer screening.

This study has several limitations. The NNSs, NNRs, and NLSs addressed here are limited to those estimated for men and women in the age groups 45–54, 55–64, 65–74, and 75 years or older, projected for 2020, 2025, 2030, and 2035, because of the limited availability of relevant data. Accurate data on the effect size of screening on mortality, the recall rate, and the participation rate are not available in Japan, whereas detailed and accurate data on mortality rates and their projections are available. However, although gastric cancer screening was recommended for those aged 40 years or older until 2015 and has been recommended for those aged 50 years or older since 2016, the projections are available only for the age groups 45–54, 55–64, 65–74, and 75 years or older. Although NNSs, NNRs, and NLSs outside these scenarios cannot be estimated directly, they can be inferred by interpolating the values of mortality rate, relative risk, and recall rate within the scenarios. Owing to the simple relationships among these values, it can be inferred that gastric cancer screening would not be recommended for men and women aged 50 years under the thresholds of 1000 for NNS and 100 for NNR in any of the scenarios (Tables 3, 4). In practice, of course, other benefits and harms of screening should also be considered, such as less invasive treatment owing to early detection as a benefit and adverse reactions to screening and diagnostic examinations as harms.

Given the thresholds of 1000 for NNS and 100 for NNR as the criteria for benefits and harms, respectively, these estimates imply that the downward trend in mortality may have less impact on NNS and NNR than sex, age, and screening effect, at least until 2035. Recall rates are closely related to prevalence, sensitivity, specificity, and screening effect; it is therefore important to manage the accuracy of screening so that recall rates stay within a reasonable range. Furthermore, because NLS depends heavily on the participation rate, raising the participation rate as high as possible is of the utmost importance.