Key Points

Current body surface area-based dosing leads to highly variable fludarabine exposures.

All pharmacokinetic parameters were related to body weight, adequately characterized by standard allometric scaling.

Renal clearance (expressed as estimated glomerular filtration rate) corresponded to approximately 65% of total clearance for adults.

1 Introduction

Allogeneic hematopoietic cell transplantation (HCT) is a potentially curative treatment for a variety of malignant and benign hematological disorders. The preparative or conditioning regimen prior to HCT consists of a combination of cytotoxic agents, chemotherapy and serotherapy (antibodies against the host immune system), administered to ablate the bone marrow and the immune system [1].

Fludarabine has been evaluated in various studies as a replacement for cyclophosphamide, in combination with busulfan, aiming to decrease non-relapse mortality while maintaining the immunosuppressive and anti-cancer efficacy of cyclophosphamide [2]. Following confirmation of this hypothesis [3], fludarabine is currently used in various conditioning regimens, ranging from non-myeloablative regimens [4], to reduced intensity and myeloablative regimens.

Fludarabine is dosed based on body surface area (BSA) and administered intravenously as a monophosphate prodrug (F-ara-AMP). It is rapidly and completely converted to the circulating metabolite F-ara-A, which is distributed intracellularly. Subsequently, F-ara-A is phosphorylated intracellularly to the active metabolite fludarabine triphosphate (F-ara-ATP), which is incorporated into DNA and RNA, thereby inhibiting DNA/RNA synthesis. This leads to apoptosis in both chronic lymphocytic leukemia cells [5] and (with different susceptibility) in cell types targeted in the HCT setting [6, 7].

As F-ara-ATP is the active metabolite, intuitively it is the form of interest for pharmacokinetic (PK) studies. However, during conditioning, only a limited number of target cells can be acquired, especially when antithymocyte globulin (ATG) is administered prior to fludarabine in the conditioning, complicating accurate quantification of intracellular F-ara-ATP. Therefore, ex vivo quantification of F-ara-ATP accumulation in pretreatment samples has been proposed in HCT recipients [8], although a relationship between this accumulation and outcome was not found [4]. Given these complexities and an apparent correlation between F-ara-A concentration and F-ara-ATP formation in cells [9], the freely circulating F-ara-A has been used primarily for PK analyses.

During the phase I trial for fludarabine, triphasic first-order kinetics were found for F-ara-A. The main known route of elimination is through the kidney, and indeed a correlation between clearance from the central compartment and estimated glomerular filtration rate (eGFR) has been found [10].

As this study was performed in adults with chronic lymphocytic leukemia, several HCT-specific PK studies were performed to further explore fludarabine PK in this setting [9, 11, 12, 13, 14]. However, to date, none have led to a harmonized dosing regimen for both children and adults that takes both renal function and body size into account. Previous study results could not be extrapolated to the general population due to the use of non-population-based methods [12, 13], limited sample size [11, 14], or the inclusion of only pediatric data [9]. As a result, most centers still use BSA-based dosing, although a PK rationale for this is lacking.

Therefore, this study aims to build a population PK model, using a very large heterogeneous dataset of both pediatric and adult HCT recipients. As such, this study can provide a rational base for optimal and harmonized dosing regimens for patients of all ages in this setting, while taking renal function into account.

2 Methods

2.1 Patients

A retrospective PK analysis was performed with data from patients who received myeloablative conditioning before HCT, between May 2010 and January 2017, at the University Medical Centre Utrecht (UMCU), and of whom PK samples were available. No restrictions were applied for comorbidities, age, and indication for HCT. Patients were included after written informed consent was acquired. Ethical approval by the institutional Medical Ethics Committee of the UMCU was obtained under protocol number 11/063.

2.2 Procedures

The conditioning regimen consisted of 4 days of chemotherapy (administered from day − 5 to day − 2 relative to HCT) consisting of a 1 h infusion of fludarabine phosphate directly followed by a 3 h infusion of busulfan (Busilvex; Pierre Fabre, Paris, France). A 1 h infusion of clofarabine preceded fludarabine infusion in children with malignancies. Rabbit ATG was added in the unrelated donor HCT setting: 4-h infusions on 4 consecutive days from day − 9 (10 mg/kg < 30 kg, 7.5 mg/kg > 30 kg) for children, and four 12-h infusions from day −12 (6 mg/kg) for adults.

Patients received either a cumulative dose of 160 mg/m2 of fludarabine phosphate or 40 mg/m2 fludarabine phosphate combined with 120 mg/m2 clofarabine. Intravenous busulfan was targeted to a myeloablative cumulative 4-day exposure of 90 mg h/L or 30 mg h/L for Fanconi anemia patients (expressed as area under the curve for all doses [AUCT0−∞]). For patients receiving ATG, clemastine, paracetamol, and 2 mg/kg prednisolone (with a maximum of 100 mg) were administered intravenously prior to ATG infusion.

2.3 Pharmacokinetic (PK) Samples and Analyses

Concentrations of the circulating metabolite of fludarabine (F-ara-A, hereafter referred to as fludarabine) were analyzed in PK samples taken for routine busulfan therapeutic drug monitoring (TDM) according to local protocol. Quantification of fludarabine concentrations was performed using a liquid chromatography mass spectrometry method validated according to US FDA and European Medicines Agency (EMA) guidelines as described previously, with a lower limit of quantification of 0.001 mg/L [15]. In the TDM protocol, plasma samples were drawn on the first or second day of conditioning. If considered necessary for TDM purposes, samples were also drawn on the following days. Additional samples were taken on the final day of conditioning. In general, plasma samples were taken at 4, 5, 6, and 7 h after the end of fludarabine infusion. For a subset of patients, additional samples were collected from 7 to 24 h post-infusion. From January 2016 onwards, additional samples were collected between the end of fludarabine infusion and the start of busulfan infusion, 15–45 min after the end of fludarabine infusion.

A population approach based on non-linear mixed-effects modeling was applied [16], using the software package NONMEM version 7.3.0 (Icon, Hanover, MD, USA). Pirana version 2.9.5 and R version 3.3.3 were used for workflow management, and data handling and visualization, respectively [17, 18]. The first-order conditional estimation option with interaction between random and residual error components (FOCE-I), as implemented in NONMEM, was used as the estimation method.

2.4 Pharmacokinetic (PK) Model-Building Procedure

For the structural model, one-, two- and three-compartment models with first-order kinetics were tested.

Interindividual variability (IIV) was assumed to follow a log-normal distribution and was therefore implemented into the model as follows:

$$P_{i} = P_{\text{pop}} \times e^{{(\eta_{i} )}}$$
(1)

where \(P_{i}\) depicts the individual or post hoc value of the parameter for the ith individual, \(P_{\text{pop}}\) depicts the population mean for the parameter, and \(\eta_{i}\) depicts the empirical Bayes estimate of IIV for the ith individual, sampled from a normal distribution with a mean of zero and a variance of \(\omega^{2}\).
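As a minimal sketch of Eq. 1 (an illustration, not the NONMEM implementation used in this study), the log-normal IIV model can be written as follows. The function names, the population value, and the value of ω are hypothetical and serve only to show the mechanics.

```python
import math
import random


def individual_parameter(p_pop, eta):
    """Eq. 1: P_i = P_pop * exp(eta_i)."""
    return p_pop * math.exp(eta)


def sample_individual_parameters(p_pop, omega, n, seed=1):
    """Draw n individual values with eta_i ~ N(0, omega^2)."""
    rng = random.Random(seed)
    return [individual_parameter(p_pop, rng.gauss(0.0, omega)) for _ in range(n)]


# Hypothetical values, for illustration only (not estimates from this study).
values = sample_individual_parameters(p_pop=10.0, omega=0.3, n=5)
```

Because the random effect enters through the exponential, every sampled individual value stays positive, which is the practical reason for choosing a log-normal distribution for clearances and volumes.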

Interoccasion variability (IOV) was implemented similarly, with each dose and subsequent sampling defined as a separate occasion. This variability was evaluated for all parameters to diagnose potential time-dependent trends and to allow for random unaccounted variability between dosing moments.

Residual error was evaluated as a proportional or additive error, or as a combination of both (Eq. 2).

$$P_{\text{obs}} = P_{i} \times \left( {1 + \varepsilon_{\text{proportional}} } \right) + \varepsilon_{\text{additive}}$$
(2)

where Pobs is the observed value, εproportional is the proportional error component, and εadditive is the additive error component. Residual error components are sampled from a normal distribution with a mean of zero and a variance of \(\sigma^{2}\).
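The combined error model of Eq. 2 can be sketched as below. This is an illustration only; the function names, the prediction, and the error magnitudes are hypothetical.

```python
import random


def observed_value(p_i, eps_prop, eps_add):
    """Eq. 2: P_obs = P_i * (1 + eps_proportional) + eps_additive."""
    return p_i * (1.0 + eps_prop) + eps_add


def simulate_observation(p_i, sd_prop, sd_add, rng):
    """Sample both error components from zero-mean normal distributions."""
    return observed_value(p_i, rng.gauss(0.0, sd_prop), rng.gauss(0.0, sd_add))


rng = random.Random(42)
# Hypothetical prediction and error magnitudes, for illustration only.
simulated = simulate_observation(p_i=0.5, sd_prop=0.1, sd_add=0.01, rng=rng)
```

Setting `sd_add` to zero gives the purely proportional model, and setting `sd_prop` to zero gives the purely additive model, so the three candidate error structures are special cases of one function.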

2.5 Covariate Model

Following development of the structural and stochastic PK model, potential predictors (covariates) for variability in PK parameters were evaluated. Assessed covariates included patient-related (body size, i.e. actual body weight [BW], fat-free mass [FFM], BSA, and other, i.e. age, renal function) and treatment-related (serotherapy, additional co-conditioning agents) factors. FFM was calculated using the equation developed by McCune et al. (Eq. 3) [19], and BSA was calculated according to the method developed by Du Bois et al. [20].

$${\text{FFM}} = 9270 \times \left( {\frac{\text{BW}}{{\frac{\text{BW}}{{{\text{HT}}^{2} }} \times S_{\text{sex}} + C_{\text{sex}} }}} \right)$$
(3)

where HT corresponds to height in meters, BW corresponds to actual BW in kg, and Ssex (m2/kg) and Csex (dimensionless) are sex-dependent constants. Ssex takes values of 216 and 244, and Csex takes values of 6680 and 8780, for males and females, respectively.
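A minimal sketch of the FFM calculation in Eq. 3, using the constants given above. The function name and the example patient are hypothetical.

```python
def fat_free_mass(bw_kg, height_m, sex):
    """Eq. 3: FFM = 9270 * BW / (BMI * S_sex + C_sex), with BMI = BW / HT^2."""
    constants = {"male": (216.0, 6680.0), "female": (244.0, 8780.0)}
    if sex not in constants:
        raise ValueError("sex must be 'male' or 'female'")
    s_sex, c_sex = constants[sex]
    bmi = bw_kg / height_m ** 2
    return 9270.0 * bw_kg / (bmi * s_sex + c_sex)


# Hypothetical patient, for illustration only: 70 kg, 1.75 m, male.
ffm_example = fat_free_mass(70.0, 1.75, "male")
```

The result is in kg and is always smaller than the actual BW supplied.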

Continuous covariates were evaluated using both a linear function and a power function (Eqs. 4 and 5):

$$P_{i} = P_{\text{pop}} \times \left( {\frac{{{\text{Cov}}_{i} }}{{{\text{Cov}}_{\text{typical}} }}} \right) \times l$$
(4)

$$P_{i} = P_{\text{pop}} \times \left( {\frac{{{\text{Cov}}_{i} }}{{{\text{Cov}}_{\text{typical}} }}} \right)^{p}$$
(5)

where Covi is the covariate value for the ith individual, and Covtypical is the typical or median value for the covariate in the population. The estimated parameters are l and p for the linear and power function, respectively.

Binary categorical covariates were tested by using Eq. 6:

$$P_{i} = P_{\text{pop}} \times (1 + P_{\text{cov}} )$$
(6)

where Pcov is the estimated proportional factor with which the parameter changes for a specific covariate value.

To implement body size descriptors on PK parameters, standard allometric scaling was initially applied using Eq. 5, with p fixed at 0.75 (BW/FFM) or 1 (BSA) for clearances, and 1 for distribution volumes (BW/FFM/BSA). Alternative body size measures (FFM, BSA) were tested as a replacement for BW. Empirical estimation of the exponents was tested for the optimal body size descriptor and was preferred only if this resulted in a relevant improvement of the model fit and the estimated exponents were markedly different from the theoretical values.
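The fixed-exponent allometric scaling described above amounts to the power function with BW as covariate. The sketch below assumes a 70 kg reference weight, as used later in Eq. 7. The function names and the typical parameter values in the usage note are hypothetical.

```python
def allometric_scale(p_typical, bw_kg, exponent, bw_ref_kg=70.0):
    """Power function with body weight as the covariate."""
    return p_typical * (bw_kg / bw_ref_kg) ** exponent


def scale_clearance(cl_typical, bw_kg):
    """Clearances scale with BW to the power 0.75."""
    return allometric_scale(cl_typical, bw_kg, exponent=0.75)


def scale_volume(v_typical, bw_kg):
    """Distribution volumes scale linearly with BW."""
    return allometric_scale(v_typical, bw_kg, exponent=1.0)
```

For example, with a hypothetical typical clearance of 10 L/h at 70 kg, `scale_clearance(10.0, 35.0)` returns more than half of the adult value, so clearance per kg is higher in a lighter patient than in a heavier one, whereas volume per kg stays constant.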

Renal function was evaluated as a covariate, since fludarabine is predominantly eliminated renally [10, 21]. As creatinine levels were not measured daily, the mean of the available individual creatinine values between day − 7 and day 0 prior to infusion was used. Subsequently, renal function (as eGFR) was calculated using the Cockcroft–Gault equation, which takes age into account [22]. eGFR for patients below the age of 17 years for women and 14 years for men was calculated using the Schwartz equation [23]. To prevent physiologically implausible high eGFR values, these were capped to a maximum. Maximum eGFR was set at 140 mL/min/1.73 m2, but was assumed to increase to this value from birth until 1.5 years of age, starting at 35 mL/min/1.73 m2 (25% of maximum value), and to decline by 8 mL/min/1.73 m2 per decade after the age of 30 years, as suggested earlier [24].
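The age-dependent cap on eGFR can be sketched as follows. The text gives only the start value, the plateau, and the rate of decline, so the linear rise between birth and 1.5 years and the continuous decline of 0.8 mL/min/1.73 m2 per year after age 30 are assumptions of this sketch. The function names are hypothetical.

```python
def egfr_cap(age_years):
    """Age-dependent maximum eGFR in mL/min/1.73 m2.

    Assumptions (not stated in the text): linear maturation between birth and
    1.5 years, and a continuous linear decline of 0.8 per year after age 30.
    """
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years < 1.5:
        return 35.0 + (140.0 - 35.0) * age_years / 1.5
    if age_years <= 30.0:
        return 140.0
    return 140.0 - 0.8 * (age_years - 30.0)


def capped_egfr(egfr, age_years):
    """Limit a calculated eGFR to the age-dependent maximum."""
    return min(egfr, egfr_cap(age_years))
```

Under these assumptions, a calculated eGFR of 200 mL/min/1.73 m2 in a 25-year-old is reduced to 140, while a value of 80 is left unchanged.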

2.6 PK Model Evaluation

The structural and covariate model with corresponding estimates had to be scientifically and biologically plausible. To investigate parameter–covariate relationships, covariates were plotted versus empirical Bayes estimates of IIV. Trends in these plots indicated potential relationships.

The addition of a parameter had to result in a significant improvement in model fit. This was evaluated using the objective function value (OFV), equal to minus twice the log-likelihood; the difference in OFV between nested models is assumed to follow a chi-square distribution. For nested models, an OFV change (ΔOFV) of − 3.84 corresponds to a p value of 0.05 for the addition of one parameter (i.e. 1 degree of freedom). Covariates were evaluated for significance using forward inclusion and backward elimination [12]. A significance level of p < 0.005 (− 7.9 ΔOFV) was used for the forward inclusion, and p < 0.001 (− 10.8 ΔOFV) for backward elimination. In addition, inclusion of a covariate had to result in a substantial decline in unexplained IIV [25].
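The correspondence between ΔOFV and p values for one added parameter can be checked with a short calculation. This sketch uses the closed form of the chi-square survival function for 1 degree of freedom, so it needs only the standard library. The function name is hypothetical.

```python
import math


def lrt_p_value_1df(delta_ofv):
    """p value for a change in OFV when one parameter is added (1 df).

    For 1 degree of freedom the chi-square survival function reduces to
    erfc(sqrt(x / 2)), so no external statistics library is needed.
    """
    drop = -delta_ofv  # an improvement in fit is a negative delta-OFV
    if drop <= 0:
        return 1.0
    return math.erfc(math.sqrt(drop / 2.0))


# Thresholds quoted in the text for p = 0.05, 0.005, and 0.001.
for delta in (-3.84, -7.9, -10.8):
    print(delta, round(lrt_p_value_1df(delta), 4))
```

The closed form applies to a single added parameter only; comparisons involving more degrees of freedom need a general chi-square routine.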

A visual inspection of model performance was performed through standard goodness-of-fit plots. Examples of these plots are observed concentrations plotted versus individual and population-predicted concentrations, and conditional weighted residuals (CWRES) versus time and observed concentrations [26]. Particular emphasis was given to goodness-of-fit plots stratified for age, to assess potential age-related misspecification.

Furthermore, relative standard errors (RSEs, as estimated from the $COVARIANCE step) of all parameters, and shrinkage of random and residual error components, were assessed [12]. Values below 30% for shrinkage (IIV and residual error) and RSE (all parameters) were deemed acceptable. Finally, the condition number was calculated after each addition of a parameter, to check for over-parameterization, where a value below 1000 was accepted [27].
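Shrinkage of a random effect is commonly computed as one minus the ratio of the standard deviation of the empirical Bayes estimates to the estimated ω. The sketch below follows that common definition and uses the population standard deviation, which is an assumption, as the exact formula applied in this study is not given in the text. The function names are hypothetical.

```python
import statistics


def eta_shrinkage(etas, omega):
    """SD-based shrinkage: 1 - SD(empirical Bayes etas) / omega."""
    return 1.0 - statistics.pstdev(etas) / omega


def shrinkage_acceptable(etas, omega, limit=0.30):
    """Apply the 30% acceptance criterion described in the text."""
    return eta_shrinkage(etas, omega) < limit
```

A shrinkage near zero means the individual estimates spread as widely as the estimated population variability; a value near one means they have collapsed toward the population mean, which makes diagnostics based on empirical Bayes estimates less informative.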

Several evaluation techniques were performed, all in accordance with EMA and FDA guidelines for population PK analyses [28, 29]. A non-parametric bootstrap evaluation (1000 samples) was performed to assess parameter precision. In addition, the normalized prediction distribution error (NPDE) was evaluated. Discrepancies between the final model and 1000 simulations of the model were evaluated, taking into account the correlation between observations in the same individual and the predictive distribution [30].

To assess the simulation properties of the final model, prediction-corrected visual predictive checks (VPCs) were created, comparing model predictions with the observed concentrations. The prediction correction accounts for variability in dosing [31]. In this analysis, the observed concentration data, and its median and 95% confidence intervals (CIs), were compared with the 95% CIs of the predicted mean, 2.5th and 97.5th percentile, derived from 1000 model simulations.

3 Results

3.1 Patients and Samples

A total of 258 patients with a median age of 18 years were included in this study. Of these patients, 197 received a cumulative fludarabine dose of 160 mg/m2, and 61 received a cumulative dose of 40 mg/m2 in combination with clofarabine. Of the obtained 2605 samples, none were below the lower limit of quantification. The concentration-time data were divided over 596 administered doses, of which 116 (19%) contained peak samples (< 3 h) and 117 (20%) contained trough samples (> 8 h). Detailed patient characteristics are shown in Table 1.

Table 1 Patient characteristics

3.2 Structural and Stochastic Model

BW was a priori included as a covariate using a power function (Eq. 5) on all (intercompartmental) clearances and volumes of distribution during model development. The exponents for BW on volume of distribution and clearance were fixed to 1.0 and 0.75, respectively, prior to covariate analyses.

A three-compartment model best described the data. In addition, both the VPCs and goodness-of-fit plots showed substantial misspecification for the two-compartment model, which was absent in the three-compartment model. The three-compartment model was parameterized in terms of volume of distribution of the central (V1), peripheral (V2), and second peripheral (V3) compartment, and clearance from the central compartment as well as intercompartmental clearance between V1 and V2 (Q2) and V1 and V3 (Q3).
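To make the structure of this parameterization concrete, the following sketch integrates a mammillary three-compartment model with a zero-order infusion into the central compartment, using a fixed-step fourth-order Runge-Kutta scheme. It is an illustration only: the study used NONMEM, and all parameter values in the usage note are hypothetical rather than the estimates in Table 2.

```python
def simulate_three_compartment(dose, t_inf, cl, v1, v2, v3, q2, q3,
                               t_end=24.0, dt=0.01):
    """Return (times, central concentrations, final state).

    The state is [A1, A2, A3, eliminated amount]. The infusion rate is held
    constant within each integration step, so t_inf should be a multiple of dt.
    """

    def derivs(a, rate_in):
        c1, c2, c3 = a[0] / v1, a[1] / v2, a[2] / v3
        elim = cl * c1            # elimination from the central compartment
        to_p2 = q2 * (c1 - c2)    # net flow to the first peripheral compartment
        to_p3 = q3 * (c1 - c3)    # net flow to the second peripheral compartment
        return [rate_in - elim - to_p2 - to_p3, to_p2, to_p3, elim]

    state = [0.0, 0.0, 0.0, 0.0]
    times, conc = [0.0], [0.0]
    for i in range(int(round(t_end / dt))):
        t = i * dt
        rate_in = dose / t_inf if (i + 0.5) * dt < t_inf else 0.0
        k1 = derivs(state, rate_in)
        k2 = derivs([s + dt / 2 * k for s, k in zip(state, k1)], rate_in)
        k3 = derivs([s + dt / 2 * k for s, k in zip(state, k2)], rate_in)
        k4 = derivs([s + dt * k for s, k in zip(state, k3)], rate_in)
        state = [s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
        times.append(t + dt)
        conc.append(state[0] / v1)
    return times, conc, state
```

Called with hypothetical values such as `simulate_three_compartment(dose=70.0, t_inf=1.0, cl=10.0, v1=20.0, v2=30.0, v3=60.0, q2=8.0, q3=2.0)`, the central concentration rises during the 1 h infusion and declines afterwards, and the drug amounts in the three compartments plus the eliminated amount add up to the dose.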

IIV was added on V1 as well as clearance from the central compartment, and inclusion of IOV was also significant for both these parameters. Inclusion of IOV and IIV on Q2, Q3, V2, and V3 led to improved model fit; however, this model was highly over-parameterized (condition number > 1000) and unstable (sensitive to initial estimates). Upon visual inspection of the random effects and estimation of the covariance matrices, it was shown that the random effects on volumes (V1, V2, V3) and clearances (CL, Q2, Q3) were highly correlated. Therefore, single random effects (both IIV and IOV) were estimated for V1, V2, and V3, and for CL, Q2, and Q3, respectively. This approximation was adequate to describe the observed variability and provided stable and reproducible parameter estimates (Table 2).

Table 2 Final population pharmacokinetic parameter estimates

3.3 Covariate Model

Figure 1a and b depict the variability in observed fludarabine concentrations over time (Fig. 1a), and total exposure (observed AUCT0−∞) (Fig. 1b), stratified by BSA-adjusted daily dose (10 and 40 mg/m2). In both the low (40 mg/m2) and high (160 mg/m2) cumulative dose groups, concentrations over time after dose were highly variable, resulting in a large range for AUCT0−∞ (2.7–12, 10–66 mg h/L for 10 and 40 mg/m2, respectively). With a median AUCT0−∞ of 21 mg h/L (range 5.7–42) and 26 mg h/L (range 13–65) at a daily dose of 40 mg/m2 for children and adults, respectively, these values are slightly higher than those reported by Ivaturi et al. for children (median 18 mg h/L) and by Long-Boyle et al. for adults (median 20 mg h/L) at the same cumulative dose of 160 mg/m2.

Fig. 1

Exposure variability and covariates predicting variability. a Fludarabine plasma concentrations versus time after last dose on a logarithmic scale. Each line corresponds to a single dose, stratified by dose. b Histogram (grey area) and density plot (black solid line) of the observed AUCT0−∞. AUCT0−∞ of patients receiving a low dose (40 mg/m2) were normalized to 160 mg/m2. c Boxplots of observed AUCT0−∞ per weight quartile. d Boxplots of observed AUCT0−∞ stratified for renal function. HCT hematopoietic cell transplantation, eGFR estimated glomerular filtration rate

Figure 1c depicts the exposures at different weight quartiles. Low weights correlate to low exposures and high weights correlate to high exposures, indicating that BSA is not a sufficient body-size descriptor for fludarabine clearance. Figure 1d depicts exposures at subgroups stratified for eGFR values according to the classification of the National Kidney Foundation [32]: healthy renal function (eGFR > 90 mL/min/1.73 m2), mild (eGFR 60–90 mL/min/1.73 m2; n = 37) and moderate (eGFR 30–60 mL/min/1.73 m2; n = 11) renal impairment [6]. Healthy renal function was further subdivided into above (n = 129) and below (n = 81) the median (eGFR 120 mL/min/1.73 m2). These data suggest that eGFR and fludarabine clearance are correlated.

Therefore, non-renal clearance was differentiated from renal clearance by adding eGFR with an estimated slope. BW and IIV were implemented on the total clearance [33], as illustrated in Eq. 7:

$${\text{Cl}}_{i} = ({\text{Cl}}_{{70{\text{kg}} - {\text{non}} - {\text{renal}}}} + {\text{eGFR}} \times {\text{Slope}}_{\text{pop}} ) \times \left( {\frac{\text{BW}}{{70{\text{kg}}}}} \right)^{0.75} \times {\text{e}}^{\text{IIV}}$$
(7)

where IIV represents the IIV for clearance, Q2 and Q3; eGFR is based on individual creatinine levels and used in this equation as l/h/kg; and Slopepop is a unitless estimate, representing the fraction of eGFR accounting for renal clearance of fludarabine.
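Eq. 7 can be sketched as the function below. The slope of 0.78 used in the usage note is the estimate reported in this study; the non-renal clearance and eGFR values are hypothetical placeholders, and the sketch assumes that eGFR has already been converted to the same flow units as clearance. The function names are hypothetical.

```python
import math


def individual_clearance(cl_nonrenal_70kg, egfr, slope, bw_kg, eta=0.0):
    """Eq. 7: (CL_non-renal + eGFR * slope) * (BW / 70)^0.75 * exp(eta)."""
    typical = cl_nonrenal_70kg + egfr * slope
    return typical * (bw_kg / 70.0) ** 0.75 * math.exp(eta)


def renal_fraction(cl_nonrenal_70kg, egfr, slope):
    """Share of total clearance attributed to the eGFR-dependent term."""
    renal = egfr * slope
    return renal / (cl_nonrenal_70kg + renal)
```

For instance, `individual_clearance(3.0, 7.2, 0.78, 70.0)` combines a hypothetical non-renal clearance of 3.0 with a hypothetical eGFR of 7.2 in the same units. Because eGFR enters additively, halving eGFR lowers total clearance by less than half, which is the behaviour that allows the renal and non-renal components to be separated.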

The addition of eGFR resulted in a statistically significant improvement in fit (∆OFV −172, 1 degree of freedom, p < 0.001) and IIV on clearance reduced from 34 to 23%. The Slopepop was estimated at 0.78 (RSE 11%), indicating that there is limited renal resorption of fludarabine.

The use of the alternative body size descriptors FFM and BSA did not improve the model fit (ΔOFV of + 60 and + 68, respectively). Estimation of the allometric exponents for volume or clearance resulted in values very close to 1.0 and 0.75, respectively, and did not result in a relevant improvement in model fit. Therefore, the fixed exponents were kept in the model. After inclusion of BW and eGFR, no trends were visible in the plots of empirical Bayes estimates of IIVs versus eGFR and BW.

No other covariates, such as coadministration of clofarabine or rabbit ATG, could be identified. Importantly, no effect of age could be identified on any of the PK parameters.

3.4 Model Evaluation

The final estimates and the results of the bootstrap analysis are shown in Table 2. The median, 2.5th and 97.5th percentiles of bootstrap estimates are in line with those of the original data.

Age- and renal function-stratified goodness-of-fit plots (Fig. 2) generally demonstrated accurate population and individual predictions. Population predictions for children with a renal function below 60 mL/min/1.73 m2 deviated from the observations, although this group consisted of only four patients. No trends were observed for CWRES versus time after dose (Fig. 3a), predicted concentrations (Fig. 3b), or covariates of renal function (Fig. 3c) and weight (Fig. 3d). The NPDE analysis showed a normal distribution, and no trends were observed in the NPDEs versus time or predictions (data not shown). The VPC showed that the median and 95% CI of the observed data were in line with those from the simulation-based predictions from the model for all strata (Fig. 4), except for the children with moderate renal impairment (< 60 mL/min/1.73 m2). In this group, the median of the observations was slightly higher than predicted, indicating an over-prediction of clearance. All four children in this group were aged < 0.5 years, indicating that maturation in very young children is possibly not well accounted for in the model.

Fig. 2

Goodness-of-fit plots for the final pharmacokinetic model stratified by age. a, b Depict the population and individual predictions, respectively, versus observed concentrations, stratified for age (</≥20 years) and renal function (eGFR > 90, 90–60, < 60 mL/min/1.73 m2). Black open circles represent the observations and the solid grey line is a local regression fit of these values. Dashed lines depict the line of unity. eGFR estimated glomerular filtration rate

Fig. 3

Conditional weighted residuals versus time after dose, population predictions, and covariates. a–d Depict the CWRES versus time after dose, the population predictions, renal function (eGFR) and actual body weight, respectively. Black open circles represent the CWRES values and the solid grey line is a local regression fit of these values. Dashed lines depict the zero-line. CWRES conditional weighted residuals, eGFR estimated glomerular filtration rate

Fig. 4

Stratified prediction-corrected visual predictive check. Black lines depict the observed median (solid) and 2.5% and 97.5% percentile (dashed) concentrations. Dark- and light-grey areas represent 95% prediction intervals of the simulated mean and the 2.5 and 97.5% percentiles, respectively. Round dots represent observations. Asterisks highlight observed percentiles outside the prediction area. Increased median concentrations after 8 h are caused by the prediction correction. Subjects for whom samples for these bins were available had higher actual body weights

We performed a sensitivity analysis to investigate potential bias caused by the imbalanced sampling design. This analysis showed that bias in clearance was negligible (data not shown).

4 Discussion

To our knowledge, this is the first PK model developed on the basis of a large and diverse dataset (n = 258) including both adult and pediatric patients. In addition, this study covered the vast majority of HCT indications: acute leukemia, lymphoma, plasma cell disorders, myelodysplastic syndrome, and a variety of benign disorders (autoimmune diseases, immune deficiencies, bone marrow failure, and metabolic diseases). This allowed for a unique platform to quantify fludarabine PK for all HCT populations.

Fludarabine PK was best described using a three-compartment model, which is in line with the data of the phase I study [10]. In contrast, fludarabine plasma PK in two other population PK studies was described using a two-compartment model [9, 14]; however, these analyses had smaller sample sizes (n = 54/n = 133) and included only children. In addition, no formal testing of three-compartmental kinetics was mentioned.

Allometric scaling of all parameters using BW was found to best account for differences in body size, and, after inclusion of eGFR, no body size-independent effect of age could be identified. Other studies did not compare body-size descriptors but rather implemented either allometric scaling to BW [9] or BSA-adjusted PK [11, 14] a priori. Given the evidence supporting allometric theory over BSA-adjusted PK [34, 35], allometry-based adjustment is preferred. In addition, we found that fludarabine BSA-based dosing led to major under- and over-prediction of exposures at high and low BW, respectively.

Similar to Ivaturi et al., eGFR was included using a body size-adjusted method. This method has the advantage of reflecting solely renal function, while absolute eGFR strongly correlates to body size. Using the BW-adjusted, rather than BSA-adjusted [9], eGFR allowed separation of a renal and non-renal fraction of clearance. In addition, the estimated slope now represents the fraction of eGFR accounting for renal clearance. This slope was estimated at 0.78, implying that renal resorption of fludarabine occurs; however, this is based on the assumption that eGFR accurately reflects actual glomerular filtration rate. Furthermore, the use of the Schwartz and Cockcroft–Gault equations, as well as the age-dependent capping of eGFR, may have an impact on the relationship between actual glomerular filtration rate and eGFR. As seen in the VPC, the maturation of renal function might not be properly accounted for in the current model; careful monitoring might therefore be advised for very young children.

Interestingly, the fludarabine label [36] only indicates a dose reduction (up to 50%) when eGFR is below 70 mL/min/1.73 m2. Although this 50% dose reduction is supported by our findings, an eGFR below 120 mL/min/1.73 m2 has already been associated with a substantial decrease in fludarabine clearance (> 25%) and concomitant high exposures. Furthermore, the decrease in clearance is a gradual process. Therefore, a dosing algorithm that takes eGFR into account, similar to the clinically applied algorithm for carboplatin, may be more appropriate for fludarabine. Such an algorithm could be an equation directly derived from the model-predicted clearance.
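As a sketch of what such an equation could look like (an illustration under stated assumptions, not a validated dosing rule): for a drug with first-order kinetics, exposure equals dose divided by clearance, so the dose aimed at a given exposure follows from the typical clearance of Eq. 7 with the random effect set to zero. The symbol \(\text{AUC}_{\text{target}}\) is hypothetical, since no target exposure is defined in this study, and the conversion between the administered fludarabine phosphate dose and the circulating F-ara-A amount is ignored here.

$${\text{Dose}}_{i} = {\text{AUC}}_{\text{target}} \times \left( {{\text{Cl}}_{{70{\text{kg}} - {\text{non}} - {\text{renal}}}} + {\text{eGFR}} \times {\text{Slope}}_{\text{pop}} } \right) \times \left( {\frac{\text{BW}}{{70{\text{kg}}}}} \right)^{0.75}$$

This has the same structure as the Calvert formula used for carboplatin, in which the dose equals the target AUC multiplied by the sum of the glomerular filtration rate and a constant non-renal term.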

The applicability of such an algorithm depends on the diversity of the underlying dataset, which was sufficient in our study regarding age and indication, but less so regarding co-conditioning agents (busulfan, ± clofarabine). This has the disadvantage of not being able to quantify possible PK interactions with a variety of conditioning agents. Given the predominant renal elimination, such interactions are unlikely and were indeed found to be absent with clofarabine in these data. Dosing algorithms based on this PK model may therefore translate well to other conditioning regimens, although the target exposure might differ. The advantage of this homogeneously treated cohort is the possibility of finding an optimal PK exposure for fludarabine in the widely applied busulfan/fludarabine regimen.

5 Conclusion

Given the observed high variability in exposure, the current BSA-based dosing regimen, without taking eGFR into account, may not be appropriate. The current analysis provides a rational base for a harmonized optimal dosing regimen for all age groups in HCT.