Introduction

The introduction of the “3Rs” principle (reduction, refinement, and replacement) by Russell and Burch (1959) was instrumental in the development of alternative methods to animal experimentation. Since then, alternative methods have been developed and validated. One of the goals is to replace animal testing for toxicological hazard assessment and, ultimately, for risk assessment. The so-called Tox21 strategy is shifting toxicological assessments away from traditional animal studies to target-specific, mechanism-based, biological observations largely obtained using in vitro assays (Tice et al. 2013). The overall toxicity of a compound in an in vivo organism is unlikely to be accurately reflected in a single stand-alone replacement model; rather, a battery of tests is required, such as those in the strategy for skin sensitization recently adopted by regulators (Bauch et al. 2011). Knowledge of physiological and toxicological pathways has allowed the development of adverse outcome pathways (AOPs), which describe the biological key events leading to an adverse outcome in vivo (Vinken 2013).

One adverse outcome of concern is interference of chemicals with sex hormone synthesis, regulation, and function, potentially disturbing reproduction and fetal development (WHO 2012). In 2012, the OECD issued a guidance document on evaluating chemicals for endocrine disruption (OECD 2012). Since 1998, in the United States, the Environmental Protection Agency (EPA) has required a battery of in vitro and in vivo tests [developed from the Endocrine Disruptor Screening Program (EDSP)] for potential endocrine-disrupting chemicals (EDCs) (EPA 1998). The in vitro screening tests recommended by the EPA include an estrogen [yeast estrogen screening (YES)] or androgen [yeast androgen screening (YAS)] transcriptional activation assay (EPA 2009a; Kolle et al. 2010; OECD 2009a) and the steroidogenesis assay in the human-derived cell line H295R (Kolle et al. 2010; OECD 2011). The former assays evaluate the effect of the compound on human steroid hormone receptors, while the latter assesses any interference of a compound with the steroidogenesis pathway by measuring steroid hormone concentrations. Based on the known similarity of the steroid receptors in rats and humans (Chang et al. 1988; Sun et al. 2014), as well as on the common biochemical pathway of steroidogenesis in mammals, these in vitro data should also be predictive for the rat (Ankley and Gray 2013; Sun et al. 2008, 2014). The in vivo studies include one-generation [OECD test guideline (TG) 415 (OECD 1983)] and two-generation [OECD TG 416 (OECD 2001)] studies, the Hershberger [OECD 441 (OECD 2009b)] and uterotrophic [OECD 440 (OECD 2007)] assays, and the in vivo pubertal assay. In addition, the in vitro assays provide information on the possible mechanisms of action of the endocrine activity observed in the in vivo tests (OECD 2009a). In this study, literature data for the YES/YAS and steroidogenesis in vitro assays from Kolle et al. (2012) were used to detect potential endocrine disruption.

While in vitro testing can provide an efficient way to identify potential hazards of chemicals, nominal in vitro assay concentrations may misrepresent potential in vivo effects (Wetmore et al. 2012) and do not provide dose–response data which can be used for a risk assessment. Therefore, an in vitro-to-in vivo extrapolation (IVIVE) that translates in vitro concentration–effect curves into in vivo dose–response curves, the so-called “reverse dosimetry approach”, is needed (Wetmore et al. 2012; Louisse et al. 2017; Paini et al. 2017). To investigate whether and how in vitro toxicity data can be used and extrapolated using reverse dosimetry to in vivo toxicity, we used data from the endocrine disruption assays in an in vitro–in silico-based concept. Accordingly, the lowest concentration that caused an effect in the toxicity assays in the absence of cytotoxicity or cross reactivity was defined as the in vitro point of departure (PoD). With the assumption of comparable concentration–response ratios of the addressed endocrine effects in the applied in vitro systems and the in vivo environment, this “lowest observed effect concentration (LOEC)” was then extrapolated to an in vivo oral dose using an eight-compartment PBTK model for the rat (including the potential target organs, namely, the adrenals and ovaries/testes). Thus, it is possible to use in vitro LOEC values to determine in vivo PoDs, which are required for risk assessment. This strategy has been applied to a number of toxicological endpoints (Punt et al. 2011) including developmental toxicity (Li et al. 2017; Louisse et al. 2010, 2015; Strikwold et al. 2013, 2017; Verwej et al. 2006); genotoxicity (Paini et al. 2010), acute (and repeated dose) toxicity and hepatotoxicity (Gubbels-van Hal et al. 2005); nephrotoxicity (Abdullah et al. 2016), neurotoxicity (DeJongh et al. 1999a, b; Forsby and Blaauboer 2007); and, more recently, endocrine disruption, which focused on (anti)estrogenicity (Zhang et al. 2018).

PBTK models can be used to acquire more kinetic information across species and for IVIVE (Paini et al. 2017). However, they are often focused on the modelling of a single chemical and adapted as specifically as possible for that compound. Therefore, we evaluated and applied a simple, transparent, and non-commercial PBTK model that could also be easily adapted to other toxicological endpoints. A set of 10 compounds was used in the evaluation; these were selected from a published dataset on YES/YAS and steroidogenesis studies (Kolle et al. 2010, 2012). In addition to in vitro hepatic metabolism data, in vivo rat endocrine disruption toxicity data were available for all compounds to compare in vitro–in silico-derived LOELs with experimental data in rats.

Materials and methods

Test compounds

Test compounds were selected from a panel of compounds previously tested in steroidogenesis and YES/YAS in vitro assays to detect the potential for endocrine disruption (Kolle et al. 2012). From all compounds tested by Kolle et al. (2012), 10 compounds (Table 1) were selected based on (1) the internal in vitro database, (2) the availability of input parameters for PBTK modeling, and (3) the availability of in vivo data (lowest observed effect levels, LOELs) in the rat for the evaluation of the in vitro–in silico-based dose–response description to facilitate risk assessments.

Table 1 Physicochemical parameters and chemical structures of evaluated compounds

In vitro: LOEC

In the YES/YAS assays, each compound was analyzed for estrogen or androgen receptor-dependent reporter enzyme activity (agonistic and/or antagonistic). In the steroidogenesis assay, the effects of each compound on hormone synthesis of H295R human adrenocortical carcinoma cells were determined by measuring estradiol and testosterone levels (Kolle et al. 2012). In a conservative approach, the LOEC was set as the lowest concentration from the available in vitro data at which an effect was observed without cytotoxicity or cross reactivity. If the lowest effect concentration in vitro was similar for different test principles, as for BPA and GEN, the most sensitive lowest effect level was taken from the available in vivo tests following a worst-case assumption. If the highest concentration tested was without effect in the absence of cytotoxicity, the LOEC was defined as above the highest concentration tested.
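The selection rule described above can be sketched as a small function. This is a minimal illustration, not the authors' implementation: the record layout, the function name `select_loec`, and the example concentrations are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# One tested concentration: (concentration in µM, effect observed,
# confounded by cytotoxicity or cross reactivity). Hypothetical layout.
Record = Tuple[float, bool, bool]


def select_loec(records: Sequence[Record]) -> Tuple[Optional[float], bool]:
    """Return (concentration, is_exact).

    If an unconfounded effect was seen, the value is the lowest such
    concentration and is_exact is True. Otherwise the value is the highest
    unconfounded concentration tested and the LOEC lies above it.
    """
    unconfounded = [(c, effect) for c, effect, confounded in records
                    if not confounded]
    if not unconfounded:
        return None, False
    effective = [c for c, effect in unconfounded if effect]
    if effective:
        return min(effective), True
    return max(c for c, _ in unconfounded), False


# Hypothetical example: effect from 10 µM, cytotoxicity at 100 µM.
print(select_loec([(1.0, False, False), (10.0, True, False), (100.0, True, True)]))
```

Returning a flag alongside the value keeps "LOEC above the highest concentration tested" distinguishable from a measured LOEC in later steps.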

PBTK modeling

An eight-compartment model set up for male and female rats was applied to describe kinetics and distribution of the test compounds. The developed model included the target tissues/organs for endocrine disruption evaluated in vitro: adrenals and ovaries/testes. The principle of the model is shown in Fig. 1. Differential equations were used to describe a time-dependent mass balance for each defined compartment of the organism. The equations of the model are shown in Table 2.

Fig. 1 Principle of the applied eight-compartmental physiologically based toxicokinetic (PBTK) model

Table 2 Equations of the eight compartments used in the physiologically based toxicokinetic (PBTK) model

In the model, the chemicals enter the bloodstream as a first-order process directly via the liver after uptake in the gastrointestinal tract. The distribution was based on diffusion. Hepatic metabolic clearance was integrated into the applied PBTK model and reflected the overall clearance of the test compound from the organism. The unbound fraction of the test compound was taken into account for the quantification of metabolic clearance. For the parent compound, biliary and renal clearance are not considered in the applied PBTK model, based on the assumption that clearance via these pathways mainly consists of excretion of metabolites and is therefore already covered by hepatic metabolism as its prerequisite. The model is focused on the kinetics of the test compound, which reflects a postulated, parent compound-linked effect. Since, in the applied in vitro systems, the identity of the metabolites may be unknown and the amounts negligible (OECD 2011; Routledge and Sumpter 1996), this approach is considered appropriate within the applied concept of in vitro-to-in vivo extrapolation.

For the calculations, the oral dose was set to 50 mg/kg bw. This was an arbitrarily set dose to calculate corresponding Cmax values in plasma. Since all input parameters in the PBTK model are linear with dose, for each compound, a calculated constant Cmax/dose ratio can be used for reverse dosimetry by back-calculating the oral doses from the lowest effect levels in vitro (set to Cmax values) by the rule of proportion. The differential equations were solved using the software Berkeley Madonna™, version 8.3.18 (developed by Macey et al. 2009). All the compounds were analyzed in one task using the built-in ‘batch-run’ function. Microsoft Excel 2013 (Microsoft®) was used to import and analyze the data.

The target output parameters were the maximal concentrations (Cmax) in plasma (mean value for males and females) and in the target tissues (ovaries/testes and adrenal glands). To evaluate the PBTK predictions, the results obtained for Cmax in plasma were compared with measured Cmax from rats, extracted from the literature. As described above, in the applied PBTK model, the Cmax/dose ratio is constant and Cmax values for a given dose, from rat studies, can be calculated in a linear approach from the results obtained at the modeled dose of 50 mg/kg bw by the rule of proportion.
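The linear scaling used to compare modeled and measured Cmax values can be sketched as follows. This is an illustrative sketch only; the function name and the example numbers are hypothetical, and the 50 mg/kg bw reference dose is the modeled dose stated above.

```python
REFERENCE_DOSE = 50.0  # mg/kg bw, the modeled oral dose stated in the text


def cmax_at_dose(cmax_at_reference: float, dose: float,
                 reference_dose: float = REFERENCE_DOSE) -> float:
    """Scale a modeled plasma Cmax to another oral dose by the rule of
    proportion, assuming a constant Cmax/dose ratio."""
    if reference_dose <= 0 or dose < 0:
        raise ValueError("reference_dose must be > 0 and dose must be >= 0")
    return cmax_at_reference * dose / reference_dose


# Hypothetical example: modeled Cmax of 20 µM at 50 mg/kg bw,
# literature study dosed at 5 mg/kg bw.
print(cmax_at_dose(20.0, 5.0))
```

The scaling is only as good as the linearity assumption; it would not hold if clearance or absorption were saturable at the study dose.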

Physiological input parameters

The physiological parameters, including body weight, organ volumes, cardiac output, and blood flows, were taken from the literature (Brown et al. 1997; Davies and Morris 1993) and from in-house data for male and female Wistar rats (Crl:Han, Charles River, Sulzfeld, Germany). Details are listed in Table 3.

Table 3 Physiological parameters used in the applied PBTK model

Physicochemical input parameters

The physicochemical input parameters of the test compounds consist of the octanol/water partition coefficient (log Kow) and the molecular weight (Table 1). Log Kow data were predicted using the ALOGPS 2.1 software. The tissue partition coefficients, used to describe and model the distribution between blood and defined tissues, were calculated from the physicochemical parameters using the following equations as given by DeJongh et al. (1997):

$$P=\frac{{(0.081*{\text{Ko}}{{\text{w}}^{0.44}}+0.919)}}{{(0.004*{\text{Ko}}{{\text{w}}^{0.44}}+0.996)}} - 0.19$$
(1)
$$P=\frac{{(0.8*{\text{Ko}}{{\text{w}}^{0.7}}+0.2)}}{{(0.004*{\text{Ko}}{{\text{w}}^{0.7}}+0.996)}} - 0.02$$
(2)
$$P=\frac{{(0.056*{\text{Ko}}{{\text{w}}^{0.29}}+0.944)}}{{(0.004*{\text{Ko}}{{\text{w}}^{0.29}}+0.996)}} - 0.55.$$
(3)

Equation (1) was used for liver, kidneys, adrenals, ovaries/testes, and richly perfused tissues, Eq. (2) for fat and Eq. (3) for poorly perfused tissue.
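Equations (1) to (3) share one functional form and differ only in their coefficients, which the following sketch makes explicit. It assumes that Kow in the equations is the non-logarithmic partition coefficient (10 raised to log Kow); the dictionary keys and the function name are hypothetical labels.

```python
# (numerator slope, numerator constant, denominator slope,
#  denominator constant, exponent, offset) as printed in Eqs. (1)-(3)
COEFFICIENTS = {
    "eq1_organs_and_richly_perfused": (0.081, 0.919, 0.004, 0.996, 0.44, 0.19),
    "eq2_fat": (0.8, 0.2, 0.004, 0.996, 0.7, 0.02),
    "eq3_poorly_perfused": (0.056, 0.944, 0.004, 0.996, 0.29, 0.55),
}


def partition_coefficient(log_kow: float, tissue: str) -> float:
    """Tissue partition coefficient P from log Kow."""
    a, b, c, d, exponent, offset = COEFFICIENTS[tissue]
    kow = 10.0 ** log_kow  # assumption: the equations use Kow, not log Kow
    term = kow ** exponent
    return (a * term + b) / (c * term + d) - offset


# Hypothetical compound with log Kow = 3
for tissue in COEFFICIENTS:
    print(tissue, round(partition_coefficient(3.0, tissue), 2))
```

As expected for a lipophilic compound, the fat coefficient from Eq. (2) comes out far higher than the values from Eqs. (1) and (3).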

Kinetic and metabolic input parameters

Intestinal absorption was predicted using a QSAR model based on Caco-2 data described by the following equation given by Hou et al. (2004):

$${\text{Log}}\;{\text{Papp}}= - 4.28 - 0.011*{\text{PSA}},$$
(4)

where Papp is the apparent permeability coefficient and PSA is the polar surface area. PSA values were obtained from the ChemAxon public database (2016). The results of this calculation, in cm/s, were thereafter converted to Papp in units of 10^−6 cm/s.
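Equation (4) and the subsequent unit conversion amount to the short calculation below. The sketch assumes that Log denotes the base-10 logarithm and that PSA is given in Å²; the function name is hypothetical.

```python
def papp_from_psa(psa: float) -> float:
    """Apparent permeability in units of 10^-6 cm/s, following Eq. (4)."""
    log_papp = -4.28 - 0.011 * psa  # log10 of Papp in cm/s (assumed base 10)
    papp_cm_per_s = 10.0 ** log_papp
    return papp_cm_per_s * 1e6      # express in 10^-6 cm/s


# Hypothetical PSA values: permeability falls as polar surface area rises.
for psa in (0.0, 50.0, 100.0):
    print(psa, round(papp_from_psa(psa), 2))
```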

Metabolic clearance was based on hepatic clearance reported in the literature or determined in S9 subcellular fractions of livers from Wistar rats at Cyprotex, Alderley Park, UK (see Table 4). Hence, the intrinsic clearance (CLint) in this model was based on hepatocytes, microsomal, or S9 subcellular hepatic fractions. Since some of the chemicals were expected to be directly conjugated, the microsomal and liver S9 incubations used contained cofactors for glucuronidation (UDPGA) and sulfation (PAPS), as well as NADPH for oxidation reactions. CLint data were normalized for hepatocytes as clearance per 10^6 cells, and for S9 and microsomal fractions as clearance per mg of protein. To estimate the in vivo hepatic clearance, the CLint values were scaled up using the factors of 135 × 10^6 cells/g liver for hepatocytes (Houston 1994) and 91.3 and 50 mg protein/g liver for liver S9 fraction and microsomes, respectively (BASF internal data).
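The scale-up from in vitro CLint to a whole-liver value can be sketched as below, using the scaling factors given above. The input units (µL/min), the output units (L/h), the liver weight argument, and the example values are assumptions made for illustration; the source does not state them here.

```python
SCALING_FACTORS = {
    "hepatocytes": 135.0,  # 10^6 cells per g liver (Houston 1994)
    "s9": 91.3,            # mg protein per g liver
    "microsomes": 50.0,    # mg protein per g liver
}


def scale_clint(clint_ul_per_min: float, system: str,
                liver_weight_g: float) -> float:
    """Scale CLint (µL/min per 10^6 cells or per mg protein; assumed units)
    to a whole-liver intrinsic clearance in L/h."""
    per_g_liver = clint_ul_per_min * SCALING_FACTORS[system]  # µL/min/g liver
    whole_liver = per_g_liver * liver_weight_g                # µL/min
    return whole_liver * 60.0 / 1e6                           # L/h


# Hypothetical example: 10 µL/min/mg microsomal protein, 10 g liver.
print(scale_clint(10.0, "microsomes", 10.0))
```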

Table 4 Kinetic and metabolic input parameters for PBTK modeling

The fraction unbound to protein in the plasma (fup) was experimentally determined by rapid equilibrium dialysis (RED). Briefly, each test compound was incubated in duplicate with rat plasma at a final concentration of 5 µM (1% DMSO) in a volume of 300 µL in a donor well of a RED plate. DPBS (500 µL) was added to the receiver well of the plate. For the dialysis, the plate was sealed and incubated under shaking (300 rpm) at 37 °C with 5% CO2 for 6 h (Thermo Scientific 2012). A sample of 200 µL from each well was collected and frozen at −40 °C until analysis. Warfarin (WAR), which is known for its high protein binding, was used as a reference compound (Waters et al. 2008; van Liempd et al. 2011; Zhang et al. 2012). The samples were analyzed using HPLC-MS/MS at Pharmacelsus, Saarbrücken, Germany. Applied methods are described in supplementary material 1.

The fup data were used to calculate the hepatic clearance in the model using the following equation described by Houston (1994):

$${\text{C}}{{\text{L}}_{\text{H}}}=\frac{{{Q_{\text{L}}}*{\text{f}}{{\text{u}}_{\text{P}}}*{\text{CL}}}}{{{Q_{\text{L}}}+{\text{f}}{{\text{u}}_{\text{P}}}*{\text{CL}}}};\quad \frac{{{\text{dA}}{{\text{M}}_{{\text{int}}}}}}{{{\text{d}}t}}={\text{C}}{{\text{L}}_{\text{H}}}*{\text{CVL,}}$$
(5)

where CLH is the hepatic clearance, CL is the intrinsic clearance scaled up, QL is the liver blood flow, AMint is the arterial blood metabolic intrinsic rate, and CVL is the concentration in the venous blood leaving the liver.
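Equation (5) can be written out directly. In this minimal sketch the function and argument names are hypothetical, and all clearances and flows are assumed to share the same units.

```python
def hepatic_clearance(q_l: float, fup: float, cl_int: float) -> float:
    """Hepatic clearance CLH from Eq. (5).

    q_l: liver blood flow; fup: fraction unbound in plasma;
    cl_int: scaled-up intrinsic clearance (same units as q_l).
    """
    if q_l <= 0:
        raise ValueError("liver blood flow must be positive")
    return q_l * fup * cl_int / (q_l + fup * cl_int)


def metabolic_rate(cl_h: float, cvl: float) -> float:
    """Rate of metabolism dAMint/dt = CLH * CVL from Eq. (5)."""
    return cl_h * cvl


# Hypothetical values: the clearance can never exceed the liver blood flow.
print(hepatic_clearance(q_l=1.0, fup=0.5, cl_int=2.0))
print(hepatic_clearance(q_l=1.0, fup=1.0, cl_int=1e9))
```

The two printed cases illustrate the limits of the expression: for low fup × CL the clearance is roughly proportional to it, while for very high values it approaches QL.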

In vivo LOEL dose

The LOEL doses were extracted from literature data for the respective endpoints of endocrine disruption evaluated in the in vitro tests. The lowest dose levels were extracted based on (1) data availability and confirmed adequate studies which followed recommended guidelines; (2) oral route of administration, by gavage rather than feeding; and (3) the appropriate assay for determined in vitro LOEC.

The uterotrophic [OECD guideline 440 (OECD 2007)] or Hershberger [OECD guideline 441 (OECD 2009b)] tests were used to assess interference with estrogen or androgen receptors, respectively. The in vivo pubertal assay [OECD guidance document 150 (OECD 2018a); EPA guidelines (EPA 2009b, c)] or one- and two-generation studies [OECD guidelines 416 and 443 (OECD 2001, 2018b)] were chosen to evaluate the interference with steroid hormone synthesis, for which the main endpoints are hormone levels, vaginal opening (for females), and preputial separation (for males) observed in juvenile animals or in the offspring exposed in utero (Hayes et al. 2010). When the LOEC corresponded to a concentration from more than one assessed in vitro system, the LOEL was taken from any of the available in vivo tests described above. This approach is based on the assumption that LOECs in vitro take into consideration data assessments derived from the defined respective endpoints, such as steroid–receptor interaction and/or interaction with steroidogenesis.

Reverse dosimetry: IVIVE

Results from PBTK modelling and LOECs from in vitro experiments were used for IVIVE. In a linear manner, dose levels were calculated for the respective LOECs based on the estimated plasma Cmax versus dose relationship obtained from PBTK modelling. The calculations are summarized in Fig. 2. For this analysis, the non-protein-bound, free fraction of a test compound at the time point of the Cmax in plasma was considered to correlate with the induced endocrine effects. In addition, extrapolated dose levels in plasma were compared with in vivo LOELs to evaluate the accuracy of the applied PBTK model.

Fig. 2 In vitro-to-in vivo extrapolation (IVIVE) based on the in vitro LOEC concentration and Cmax in plasma
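The reverse dosimetry step can be sketched as a single proportion. The sketch assumes that the LOEC is equated with the free (unbound) plasma Cmax, that both are in the same concentration units, and that the Cmax/dose ratio is constant; the function name and the example numbers are hypothetical.

```python
def reverse_dosimetry(loec: float, cmax_total_at_reference: float, fup: float,
                      reference_dose: float = 50.0) -> float:
    """Oral dose (mg/kg bw) at which the free plasma Cmax equals the LOEC.

    cmax_total_at_reference: modeled total plasma Cmax at reference_dose.
    fup: fraction unbound in plasma (0-1).
    """
    free_cmax = fup * cmax_total_at_reference
    if free_cmax <= 0:
        raise ValueError("free Cmax at the reference dose must be positive")
    return loec * reference_dose / free_cmax


# Hypothetical example: LOEC 1 µM, total Cmax 100 µM at 50 mg/kg bw, fup 0.1.
print(reverse_dosimetry(loec=1.0, cmax_total_at_reference=100.0, fup=0.1))
```

Because fup enters the denominator, a small error in a very low unbound fraction translates into a proportionally large change in the extrapolated dose.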

Sensitivity analysis

The sensitivity analysis of the model was performed for all compounds based on the description of Evans and Andersen (2000). In this approach, the sensitivity coefficient (SC) is defined by the initial maximum concentration (C) after prediction for plasma or tissue, the initial parameter of the model (P), the maximum concentration after increasing the parameter value by 5% (C′), and the changed parameter (P′) as shown below:

$${\text{SC}}=\frac{{{C^\prime } - C}}{{{P^\prime } - P}}*~\frac{P}{C}.$$
(6)

The resulting sensitivity coefficients were analyzed using Microsoft Excel 2013 (Microsoft®). The input parameter was considered to significantly affect the model output when the SC absolute value was higher than 0.5 (Rietjens et al. 2011). The output parameter assessed was the Cmax in plasma.
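Equation (6) and the 0.5 threshold can be sketched as follows. The `model` argument stands in for a full PBTK run returning Cmax for one parameter value; the toy models in the example are hypothetical and serve only to show the behaviour of the coefficient.

```python
from typing import Callable


def sensitivity_coefficient(model: Callable[[float], float], p: float,
                            increase: float = 0.05) -> float:
    """Normalized sensitivity coefficient SC from Eq. (6) for a 5% increase
    of one input parameter."""
    c = model(p)
    p_new = p * (1.0 + increase)
    c_new = model(p_new)
    return (c_new - c) / (p_new - p) * p / c


def is_influential(sc: float, threshold: float = 0.5) -> bool:
    """An input parameter counts as influential when |SC| > threshold."""
    return abs(sc) > threshold


# Toy models (hypothetical): output proportional to, or independent of, p.
print(sensitivity_coefficient(lambda p: 3.0 * p, 2.0))
print(sensitivity_coefficient(lambda p: 7.0, 2.0))
```

A coefficient near 1 means the output changes in proportion to the parameter, while a coefficient near 0 means the parameter barely matters for that output.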

Results

Determination of in vitro LOEC and in vivo LOEL

For the selected compounds, the results of each in vitro assay, including the LOECs derived from them, are shown in Table 5. APAP and CAF did not show any effects in these assays, and consequently, their LOECs for IVIVE were set to the highest concentrations tested without cytotoxicity (> 100 µM for both compounds). The LOECs were taken from YES/YAS for four compounds (EE, FLU, MTT, and TRE) and from the steroidogenesis assay for two compounds (FEN and KET). The lowest concentrations at which an effect was observed were equal among the assay systems for two compounds (BPA and GEN).

Table 5 Determination of in vitro lowest observed effect concentration (LOEC) of tested compounds and comparison between in vitro–in vivo extrapolated (IVIVE) lowest effect levels based on PBTK modeling and in vivo-derived lowest observed effect levels (LOEL)

In vivo LOELs for the defined endpoints with the respective assay are shown in Table 5. For APAP and CAF, no endocrine effects were described in the literature and it was not possible to attribute any LOELs for the chosen endpoints. The LOELs were taken from uterotrophic/Hershberger assays for six compounds (BPA, EE, FLU, GEN, MTT, and TRE). Data from the pubertal assay were taken for two compounds (FEN and KET).

PBTK model

Input parameters

Data for permeability and clearance for each compound are summarized in Table 4, together with their respective sources. Experimental data for fup, which were used for PBTK modelling, and the corresponding available literature data are also presented in Table 4. For BPA, FEN, GEN, KET, and TRE, literature values are slightly higher than the experimentally determined fup. The largest difference was noted for KET, which is known for its high binding capacity. However, the experimentally determined fup for the reference compound WAR (0.4 ± 0.2%, mean ± standard deviation of 3 experiments) corroborates the literature data (< 1%, Waters et al. 2008; van Liempd et al. 2011; Zhang et al. 2012) and shows the reproducibility of the performed experiments.

Output parameters: predicted plasma Cmax

The predicted plasma Cmax using PBTK modelling were compared to in vivo data for six compounds for which data were available (Table 6). The PBTK model predicted the plasma Cmax for 67% of the test compounds (4/6—APAP, CAF, EE, and KET) in the same order of magnitude as that in vivo. BPA and GEN modeled concentrations were one order of magnitude higher than literature in vivo values. Further details are summarized in Table 6.

Table 6 Estimated maximum plasma concentrations (Cmax) compared to in vivo-based Cmax values

In vitro-to-in vivo extrapolation (IVIVE)

For 8 compounds, the IVIVE approach predicted doses that could be compared with the in vivo LOEL from the literature (Table 5). For the evaluation, compounds with values that were within tenfold of the in vivo LOEL were considered to be correctly predicted. The correctly predicted compounds were BPA, FEN, GEN, and KET. Two compounds were over-predicted (MTT and TRE) and two compounds were under-predicted (EE and FLU). There was no trend among the under- or over-predicted compounds in terms of physicochemical properties or clearance pathways. Using this decision criterion, 50% of the test compounds (4/8) were correctly predicted. If the compounds that were negative in the in vitro assays and for which there is no literature evidence of in vivo effects on the assessed endpoints (APAP and CAF) are counted as correct negatives, the current IVIVE model correctly predicted 60% (6/10) of the compounds using this in vitro–in silico-based risk assessment approach.
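The tenfold decision criterion can be sketched as a simple ratio check. The function names and the example value pairs are hypothetical and are not taken from Table 5.

```python
from typing import Iterable, Tuple


def within_fold(predicted: float, observed: float, fold: float = 10.0) -> bool:
    """True if the predicted LOEL lies within `fold` of the observed LOEL,
    in either direction."""
    if predicted <= 0 or observed <= 0:
        raise ValueError("LOEL values must be positive")
    ratio = predicted / observed
    return 1.0 / fold <= ratio <= fold


def fraction_within(pairs: Iterable[Tuple[float, float]],
                    fold: float = 10.0) -> float:
    """Fraction of (predicted, observed) pairs meeting the criterion."""
    results = [within_fold(p, o, fold) for p, o in pairs]
    return sum(results) / len(results)


# Hypothetical (predicted, observed) LOELs in mg/kg bw
example = [(5.0, 20.0), (1.0, 200.0), (300.0, 40.0), (0.02, 3.0)]
print(fraction_within(example))
```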

Sensitivity analysis

A sensitivity analysis of the model was performed for all compounds (supplementary data, Table S1). It demonstrated that, in general, the intestinal permeability and lipophilicity input parameters strongly influenced the output parameter, Cmax in plasma. SC absolute values were higher than 0.5 for log Kow for BPA, EE, FEN, FLU, GEN, KET, MTT, and TRE, and for intestinal absorption (quantified via Papp) for APAP, BPA, EE, FEN, GEN, KET, MTT, and TRE.

Discussion

In the current case study, we followed basic principles of the concept of animal-free risk assessment and used, as a starting point, available in vitro data for endpoints of endocrine disruption. Although the species of interest for animal-free risk assessment is the human, especially when tests in human in vitro systems are taken into consideration, we applied the principle of in vitro-to-in vivo extrapolation to rats. We did this to be able to assess the obtained in vitro–in silico-based results against available literature data from corresponding standardized in vivo tests in the rat. For this purpose, we applied PBTK modeling for an in vitro–in silico-based risk assessment for a set of 10 compounds. A first assumption was that the concentration–response ratios of the addressed endocrine effects in the applied in vitro systems were comparable to those in the in vivo environment. This assumption is a general issue in in vitro toxicology and drug discovery (Smith et al. 2010; Lu et al. 2011) and is a prerequisite for IVIVE, since the concentration–effect ratio is given for the test substance concentration in the buffer/medium and the readout for in vitro testing and is related directly to plasma concentration–effect ratios in vivo. As shown for endocrine effects of 17β-estradiol (E2) and BPA, results of YES assays yielded a better in vitro–in vivo correlation than the MCF-7/BOS proliferation assay or the U2OS ER-CALUX assay (Zhang et al. 2018). The LOEC in vitro was extrapolated to an oral dose by reverse dosimetry applying a PBTK model for the rat. This corresponds to a predicted LOEL in vivo, which was compared with the measured LOELs in in vivo studies for endocrine disruption to gain experience on the predictivity of such a concept and thereby knowledge on its potential future applicability.

As a simple approach, it was assumed that for each compound, the LOEC in the in vitro tests indicates the most sensitive endpoint and is a relevant parameter for risk assessment. The lowest LOEC values were from YES or YAS assays for EE, FLU, MTT, and TRE; therefore, the corresponding in vivo endpoints used for comparison were the uterotrophic or Hershberger assay, respectively. For FEN and KET, the LOEC values were from the steroidogenesis assay. Consequently, predicted LOEL values were compared with the LOELs from the in vivo pubertal assay or one- and two-generation studies. APAP and CAF did not show clear effects in the in vitro assays up to the highest concentrations tested. Therefore, the highest concentration was translated into a corresponding in vivo dose that should be considered not to induce any endocrine effect.

For the applied IVIVE approach, it was assumed that a concentration (not area under the curve)-related endocrine effect was due to the parent compound. This is justified by in vitro tests that take receptor binding and/or enzyme interactions, both established as typical concentration-driven processes, as relevant modes of action into account. In addition, the nominal compound concentration is assumed to be responsible for effects in tests which have limited metabolic functions (Coecke et al. 2006). However, the understanding of exposure in in vitro testing becomes critical in the applied concept, since the nominal concentration may not be the effective free concentration if the compound binds to the plastic culture vessel or medium components, or evaporates (Groothuis et al. 2013). The assumption that the parent compound causes the toxic effect also implies that its metabolites do not contribute to the endocrine effects. The relevance of metabolic activation (Dekant 2009) is widely recognized and ways to implement this in in vitro systems are described generally (Landsiedel et al. 2011), but these principles are not yet universally applied.

For the IVIVE, the maximum plasma concentration (Cmax) after a simulated single oral dose was set equal to the LOEC in vitro. This is a simplified and straightforward approach for compounds with short half-lives. However, this approach has a number of caveats. First, it may fail for compounds that accumulate, resulting in significantly higher steady-state concentrations after multiple dosing than after single dosing. Second, it is assumed that the test compounds do not induce or inhibit metabolizing enzymes in the liver, which would also significantly change compound kinetics after multiple dosing compared to single dosing. In addition, inter-individual differences in kinetics that may be addressed in Monte Carlo models are not taken into consideration in the presented basic IVIVE approach.

For PBTK modeling, Cmax in plasma was defined as the dose metric related to the lowest nominal effect concentrations in the respective in vitro assay. With the assumption that medium in the in vitro test system mirrors blood in a living organism, both providing nutrients to the cells and distributing the test substances within the system to the cells, this compartment was chosen to bridge the nominal in vitro concentrations to in vivo. The targets of the endocrine effects (ovaries, testes and/or adrenals) were added to the PBTK model as a future option, but should then be linked to in vitro effect concentrations of test substances in the tested cells or directly at the site of action. Based on available literature data that describe the observed effects as a function of nominal concentrations, this approach may be followed by the application of in vitro dosimetry concepts as described by Groothuis et al. (2013), but this was not addressed within the current case study. Since, for the investigated potential endocrine disruptors, the investigated key events of receptor binding and enzyme interactions are concentration-mediated rather than AUC-mediated effects, the maximum free concentration of the test compound in plasma was applied for extrapolation. This approach reflects a worst-case scenario, since Cmax as dose metric results in the lowest possible estimated LOEL.

For in vitro-to-in vivo extrapolation, plasma protein binding was taken into consideration and the free concentration of the test compound in plasma was calculated. It was assumed that the free fraction of compound affects the toxicological activity (Smith et al. 2010). This implies that the compound: (1) does not induce irreversible inactivation of the target, e.g., by covalent binding; (2) does not act via multiple mechanisms and by activation of target-mediated events; and (3) has an equal action/potency in the in vitro assays, as it does in vivo. Rapid equilibrium dialysis (RED) was chosen for the determination of the unbound fraction of the compound in plasma, because in our experiments, this method was robust in respect of recovery, replicability, and obtained results of the positive control Warfarin (Zhang et al. 2012).

Further assumptions for PBTK modeling were that hepatic clearance drives the overall clearance of the test compound in the organism and that other clearance mechanisms can be neglected. This is true for many compounds, for which extrahepatic metabolism is minor compared to hepatic metabolism (Gundert-Remy et al. 2014; Oesch et al. 2018) and the metabolites, not the parent compound, are excreted via urine and/or bile. Consequently, in the overall clearance process, metabolism is the rate-limiting step of elimination. This assumption is likely to be valid for most of the modeled compounds in our analysis, since the major metabolites of most of the test compounds are oxidized products and/or direct glucuronic acid and/or sulfate conjugates, which are excreted in the feces or urine (see Table 1 for metabolic and excretion pathways). The exceptions were EE and GEN, for which intestinal first-pass metabolism contributes extensively to the overall metabolism in the rat, such that conjugation in the intestine significantly reduces their bioavailability (Hirai et al. 1981; Schwenk et al. 1982; Sfakianos et al. 1997). Since the current IVIVE concept did not take extrahepatic metabolism of EE or GEN into account and assumes direct uptake of the compounds from the GI tract, the model could consequently underestimate the extrapolated LOEL, although this was not the case for EE and GEN (Table 5). Therefore, the total clearance of the modeled compounds in this evaluation may be well reflected by their hepatic clearance. In this strategy, intrinsic clearance was linearly correlated to substrate concentrations and, for higher plasma concentrations, this approach may overestimate hepatic clearance. Therefore, a potential higher-tier model could apply Michaelis–Menten parameters to describe the kinetics of the metabolism of the test compound, if available, or determine them in appropriate experiments.

Another assumption of the PBTK modeling is that distribution is based on diffusion and the resulting steady-state concentrations are described by blood/tissue partition coefficients that are derived from physicochemical parameters of the test compounds, as described by Jones and Rowland-Yeo (2013). This means that active transport was not addressed for modeling uptake and distribution of the test compound between defined compartments. Hence, this diffusion-based modeling is correct only if active transport processes are negligible. Among the modeled test compounds, APAP, EE, KET, BPA, and GEN are described in the literature as substrates or inhibitors of active transporters (Manov et al. 2006; Zamek-Gliszczynski et al. 2011; Englund et al. 2014; Mazur et al. 2012; Ge et al. 2017); however, the quantitative contribution of active transport to the overall kinetics of these compounds is difficult to judge and should be addressed in future work.

Sensitivity analysis demonstrated that, in general, the permeability (based on Papp) and log Kow input parameters strongly influenced the output parameter, Cmax in plasma. First, the SC absolute value was higher than 0.5 for log Kow for 8 of 10 substances (BPA, EE, FEN, FLU, GEN, KET, MTT, and TRE). Likewise, the SC absolute value was higher than 0.5 for Papp, also for 8 of 10 substances (APAP, BPA, EE, FEN, GEN, KET, MTT, and TRE). SC absolute values were lower than 0.5 for all other investigated input parameters. In general, hepatic clearance of the compounds had a minor influence on Cmax. This means that, with respect to the kinetics of the test compounds, absorption and partitioning of the compounds are the main drivers of Cmax. Therefore, special attention should be given to these input parameters when the current dose metric is applied. It should be mentioned here that changing the dose metric, e.g., to AUC or average plasma concentration, which are more dependent on clearance than Cmax, will also change the sensitivity to the input parameters.

Plasma kinetics data from in vivo experiments in rats were available for 6 of the 10 test compounds and were used to compare with predicted Cmax values (see Table 6). The modeled Cmax values for APAP, CAF, EE, and KET are in general accordance with the measured in vivo data and differ by less than threefold. By contrast, predicted Cmax values for BPA and GEN clearly overestimate the in vivo Cmax by more than tenfold. Thus, modeled plasma concentrations could be considered valid (within threefold of the in vivo value) for 4 out of 6 compounds, i.e., 67% were correctly predicted. These results are similar to those of a study in which a six-compartment rat PBTK model was used for a set of active ingredients of plant protection products (50% were correctly predicted) (Fabian et al. 2015).

Using reverse dosimetry to predict the in vivo LOEL for endocrine disruption, 6 of 10 LOELs were predicted within the correct order of magnitude. The predicted LOELs differed by more than tenfold from the described in vivo value for 4 compounds, 2 of which were over-predicted and 2 under-predicted. Interestingly, for EE, although the LOEL estimation was more than tenfold higher than the in vivo value, its Cmax was well predicted. This implies that the in vitro result may not reflect a relevant value. In contrast to EE, the LOEL estimations for BPA and GEN were within tenfold of the in vivo value, whereas the predictions of their Cmax values were more than tenfold higher than in vivo values. This observation is interesting, since a correct prediction of an LOEL based on PBTK modelling is critical with respect to the defined assumptions of the applied strategy. The LOELs for FLU, MTT, and TRE were not well predicted, with more than one order of magnitude difference between the modeled and observed in vivo values (in vivo Cmax values were unavailable for these three compounds). Reasons for this lack of correlation could be limitations of the applied models, which do not consider, e.g., deviation from the assumed linear kinetics of hepatic clearance, extrahepatic metabolism, enterohepatic recirculation, renal clearance, and active transport of the compound (as described above).

Endocrine disruption served as an example to correlate concentrations causing in vitro effects with in vivo effect doses using IVIVE reverse dosimetry. This IVIVE concept is also applicable to other toxicological effects. To make risk assessments based on in vitro and in silico methods widely applicable and acceptable, those cases or test substances for which the IVIVE predictions (or in vitro models) do not correlate with in vivo data need to be reliably identified, and the inadequacies of the models need to be clarified. This can then be used to amend and improve the methods accordingly.