Introduction

Calibration curves illustrate the relationship between the detected response variable and the concentration of a reference standard that is presumed to be representative of the analyte of interest in a test sample. They are used to estimate the unknown concentration of the analyte of interest in a test sample by dose interpolation. Calibration curves are prepared by spiking the target analyte into a matrix that has been judged to be representative of the test sample matrix. The instrument read-out values for unknown samples and quality controls (QCs) are subsequently used to interpolate their concentrations from the calibration (or standard) curve. Three factors should be given consideration for optimal fitting of non-linear calibration curves. These include fitting the mean concentration response relationship, use of an appropriate weighting to account for the known heteroscedasticity (non-constant response-error relationship) in non-linear dose response curves, and a suitable curve fitting algorithm to estimate the curve fit parameters.

The accuracy of sample quantitation depends on the robustness and reproducibility of the assay calibration curve, which is in turn dependent upon the performance of the reference material and other assay components. Performance characteristics of ligand binding assay (LBA) components, which include but are not limited to solid or immobilized surfaces such as microtiter plates and the capture and detection antibodies, should be thoroughly evaluated in the method development phase, and appropriate plans should be put in place to monitor lot-to-lot reagent consistency. The general requirements for the design of calibration curves, the acceptance criteria for individual calibrators, and the guidelines for the selection of an appropriate regression model have been defined in regulatory guidance documents and lead publications by subject matter experts (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17). Compliance with these guidelines and adherence to the published requirements would enhance reproducibility of a calibration curve across runs and across studies. Other aspects of calibration curves, including editing specifications and preparation guidelines, have not been established or adequately addressed. It is ultimately the responsibility of each bioanalytical laboratory to define the criteria for the design, preparation, acceptance, and editing of LBA calibration curves in their standard operating procedures (SOPs). This publication aims to present a collective view from members of the LBA community to fill in gaps by providing recommendations and best practices for the preparation of calibration curves as well as for the treatment of calibrator data points. Although the content of this publication may be applicable to subsets of biomarker assays, its focus remains on calibration curves for quantitative pharmacokinetic (PK) LBAs. All other assay types are outside the scope of this paper.

Calibration Curves in Quantitative Analysis

Non-Linear Nature of Ligand Binding Assay Calibration Curves

There are key differences between calibration curves in LBAs and in chromatographic assays. In LBAs, the instrument response may be directly or inversely related to the analyte concentration depending on the non-competitive or competitive format of the assay. Irrespective of the format, the use of a semi-log scale translates the curve into a sigmoidal “S-shaped” relationship between the response and the concentration of the analyte. This is in contrast to chromatographic assays, where response is typically a linear function of the concentration, and the two are proportional over most of the calibration curve range. For chromatographic methods, loss of linearity is an indication that the assay has reached its limits of detection. LBAs rely upon the interaction of the analyte with a binding agent such as an antibody or a receptor component; this is in contrast to traditional chromatographic assays in which detection of the target analyte is independent of its binding to a macromolecule. The dynamic equilibrium nature of protein-protein interaction leads to a non-linear response in LBAs. Furthermore, since performance of LBAs heavily depends upon the performance of their constituent biological reagents, these assays typically manifest greater variability. The non-linear nature of an LBA curve limits the concentration-response correlation at the upper and lower ends of the curve, resulting in plateaus and therefore an S-shaped curve. Quantitation from the asymptotes (upper and lower plateaus) of the calibration curve would result in poor precision and accuracy. These characteristics ultimately narrow the validated quantitative range of LBAs and render the selection of an appropriate non-linear data fitting algorithm all the more important.

Performance and Validation Requirements for Non-Linear Regression Software

There is a wide range of commercial software available to perform non-linear regression for LBAs. Most instrument manufacturers provide a non-linear regression software that is compatible with the equipment. Depending on which better meets their needs and requirements, laboratories may alternatively choose to install a stand-alone software or use their laboratory information management system (LIMS) for regression purposes. There are performance requirements for the software. The software used for LBA calibration curves should have the capability to

  • Perform four and five parameter logistic (4 PL and 5 PL) regressions

  • Allow for application of various weighting factors

  • Calculate %bias

  • Determine %CV

  • Plot concentration versus %bias for each model with various weighting factors, together with the response curve

  • Allow for editing of the curve and repeat regression after editing

  • Be compatible with standard computer equipment, infrastructure, networks, and data processing procedures

  • Be compatible with standard or custom system interfaces

  • Allow for data acquisition, analysis, and reporting

  • Allow for data upload to LIMS

  • Include an audit trail feature

  • Include an edit lock feature

  • Allow for creation of custom immunoassay templates to incorporate the acceptance criteria of the validated method

Software validation is the responsibility of the end user. Additional recommendations and the general requirement for software validation have been provided in Appendix 2.

Calibration Curve Minimum Requirements

Comparison of Requirements from Various Regulatory Agencies

The minimum requirements for calibration standard curves have been established in a number of bioanalytical guidance documents or in the bioanalytical subsections of regulatory documents. At the same time, several bioanalytical guidance documents from around the world are still in the draft stage. Guidance documents are generally aligned with regard to the requirements and performance expectations of the LBA curves. Table I summarizes the calibration curve requirements from the US Food and Drug Administration (FDA), European Medicines Agency (EMA), the Japanese Ministry of Health, Labor and Welfare (MHLW), and the Brazilian Sanitary Surveillance Agency (ANVISA) (9,10,11,12,13,14). The FDA and EMA guidance documents are the lead regulatory documents for the vast majority of bioanalytical laboratories; individual groups should assess their regulatory requirements based on the agency they intend to submit to.

Table I Comparison of Regulatory Agency Requirements for Ligand Binding Assay Calibration Curves

Preparation of Calibrator Standards

Calibrator standards are generated by adding known concentrations of the reference standard into a qualified batch of matrix identical to or consistent with the study matrix. The concentration-response relationship of these calibrator standards establishes the calibration curve of the assay. In studies involving co-administration of multiple drugs, one calibration curve is required per analyte present in the study sample. Preparation of calibrators must be independent of assay QCs (6,11) to prevent the spread or magnification of potential spiking errors. This means that calibrators and QCs may not be prepared from the same intermediate stock of the reference standard. Calibrators may be prepared by serially diluting a primary or intermediate stock of the reference material. It is not required to spike calibrators individually at each level although such practice would add another level of control and allow for monitoring of the spiking accuracy. Irrespective of the composition of the intermediate stocks of the reference standard, calibrators should contain a minimum of 95% study matrix.

Surrogate Matrix

The expectation for LBAs is that where possible, every effort is made to prepare the standard calibrators in a biological matrix which matches the study matrix with respect to species, composition, and matrix pre-processing (4,5,11). For example, if study samples are unfiltered serum, then the calibrator matrix pool must also be prepared from unfiltered serum from the same species. A point to consider is that often the study population has a disease condition, whereas the calibrator matrix is from healthy subjects. Preparation of calibrators in depleted or in surrogate matrix may be justified provided that no other strategy to quantify the analyte exists; for example, when the study uses a matrix that is rare or difficult to obtain, when a therapeutic has an endogenous counterpart, or in the case of biomarker assays. In these situations, the bioanalytical method should be validated using study matrix QCs and study matrix selectivity samples evaluated against a calibration curve prepared in the surrogate matrix (4). Additional tests to demonstrate comparability between the dilution curves of the surrogate versus study matrices are also recommended. A commonly accepted method is to test the equivalence of the lower and upper asymptotes, growth rates, and in the case of 5 PL curves, the asymmetry factors. Implementation of parallelism tests has been presented by Yang et al. and Sondag et al. (18,19). Acceptance criteria for these equivalence tests have not been published, although efforts in that direction are ongoing and have been presented at scientific meetings (20). Figure 1 provides an example of two parallel calibration curves, one prepared in human plasma and the other in a buffer surrogate matrix.

Fig. 1

Representation of parallel plasma and surrogate matrix calibration curves. Instrument response in optical density (OD) is plotted on the y-axis against calibrator concentration in nanograms per milliliter on the x-axis

Use of MRD-Diluted Matrix

Calibrators may be prepared in 100% or minimum required dilution (MRD) diluted matrix. Examples of methods using MRD diluted matrix include but are not limited to methods for rare matrices or those performed on automated platforms where use of 100% matrix could become problematic due to limited availability or due to matrix viscosity. The decision whether calibrators are prepared in 100% or MRD diluted matrix should be based on the assay performance during method development. Specifically, appropriate assessments should be conducted to ensure that calibrator and QC recoveries are within the expected range (e.g., ± 20%) of the nominal value. When prepared in 100% matrix, calibrators will require the same MRD dilution as that applied to assay QCs and study samples. The back-calculated concentrations of unknown samples should be reported as the concentration in 100% matrix.

Qualified Matrix Pool

The selection process for a qualified matrix pool (QMP) for bioanalytical applications is critical. It is recommended to qualify and store sufficient volumes of the matrix pool to last through method development, validation, and at least the first in-study bioanalysis. It is also recommended that the QMP be stored under the same conditions as those set for assay QCs and study samples. For example, if samples are stored at less than or equal to − 65°C, the QMP should also be stored in that temperature range. During method development, individual matrix samples or individual matrix pools may be screened, selected, and consolidated to generate a QMP. Individual commercial pooled lots of the matrix may also be applied. QMP screening should include evaluation of assay signal generated by the unfortified as well as the analyte-fortified matrix samples. For screening purposes, individual samples may, for example, be spiked at the LLOQ level. Matrix samples with high background or suboptimal spiked recoveries should be excluded from the QMP. Example acceptance for matrix samples may be that a minimum of 80% of all spiked individual matrix samples meet analytical recovery acceptance criterion of 80 to 120% of the nominal concentration based on a calibration curve prepared with the QMP. The QMP should be representative of the study population and prepared by pooling only the individual matrix samples which meet the targeted acceptance criteria.
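The example screening criterion above can be expressed as a short calculation. The following is a minimal sketch only: the function names and lot data are hypothetical, the 80 to 120% recovery window and the 80% pass fraction are the example values quoted in the text, and evaluation of unfortified background signal is not covered.

```python
def percent_recovery(measured, nominal):
    """Analytical recovery as a percentage of the nominal spiked concentration."""
    return 100.0 * measured / nominal


def screen_matrix_lots(measured_by_lot, nominal, low=80.0, high=120.0, min_fraction=0.80):
    """Return (lots within the recovery window, whether enough lots passed)."""
    passing = [
        lot for lot, measured in measured_by_lot.items()
        if low <= percent_recovery(measured, nominal) <= high
    ]
    enough = len(passing) / len(measured_by_lot) >= min_fraction
    return passing, enough


# Hypothetical back-calculated concentrations (ng/mL) for ten individual lots
# spiked at a nominal 1.0 ng/mL (for example, the LLOQ level).
lots = {"lot01": 0.95, "lot02": 1.10, "lot03": 0.70, "lot04": 1.05, "lot05": 0.88,
        "lot06": 1.18, "lot07": 0.92, "lot08": 1.30, "lot09": 1.01, "lot10": 0.85}
passing_lots, criterion_met = screen_matrix_lots(lots, nominal=1.0)
```

In this illustration eight of ten lots fall within the window, so the example criterion is met and only those eight lots would be candidates for pooling.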

Fresh Versus Frozen Calibrators

LBA calibrators may be freshly prepared or frozen. Some laboratories use freshly prepared calibrators in all phases including method development, pre-study validation, and subsequent bioanalysis. Fresh calibrators may be prepared from aliquots of an original or an intermediate reference standard stock. If intermediate reference standard aliquots are used, their stability must be established prior to issuance of the validation report. The use of freshly prepared calibrators to evaluate frozen QCs during pre-study method validation serves to establish preliminary stability of the QCs. Once preliminary QC stability has been established, some laboratories use that information to prepare and store frozen aliquots of individual calibrator levels. This approach is equally acceptable provided that the LLOQ and ULOQ levels are included in the stability testing and the stability testing window covers the calibrator storage period. Pre-qualified frozen calibrators are intended to reduce run-to-run variability and increase efficiency. Frozen calibrators should be prepared in single-use aliquots, and subjecting calibrators to freeze-thaw should be avoided.

Calibration Curve Design

Regulatory agencies have provided guidelines for the design of the calibration curve. The following is a short list of such guidelines:

  • Calibration curves should include a minimum of six non-zero concentrations including LLOQ and ULOQ which meet the acceptance criteria.

  • The simplest regression model should be used.

  • Weighting should be justified if applied.

  • LLOQ and ULOQ levels should not coincide with the low, medium, or high QC.

  • Calibration curve range should be appropriate for the expected concentration range of study samples. This means that assay should be capable of generating reportable sample concentrations as per PK requirements and for as many samples as possible. An appropriate dilution prior to sample analysis may be applied.

There are additional good practices and suggestions not addressed by regulatory agencies which may be applied. These include

  • A semi-log scale is recommended during data analysis to view the data and to facilitate evaluation of the assay performance.

  • A standard practice for the minimum number of calibrators is 1 + the number of unknown parameters in the model; however, this does not account for the assay variability and leaves zero freedom if there is only one replicate per concentration point. As such, a minimum of six calibrator points are recommended for a 4 PL curve.

  • Even spacing of calibrators, e.g., on a logarithmic scale.

  • Minimum spacing requirement between the zero calibrator (if applicable) and the LLOQ to help prevent loss of LLOQ due to run-to-run variability, to be established during method development and defined in the validated method.

  • Maximal achievable ULOQ/LLOQ signal ratio is recommended to ensure robustness, and it is to be assessed during method development.

  • Inclusion of anchor points may be beneficial to the curve fit as determined during method development (see Appendix 3).

  • Inclusion of zero as an anchor point may be beneficial to the curve fit as determined during method development (see Appendix 5).

Quantitative Range

The ULOQ and LLOQ, which represent the upper and lower limits of the quantitative assay range, respectively, must be validated as part of pre-study validation. To validate LLOQ and ULOQ calibrators, it is not sufficient to merely include calibrators at those levels; rather, validation samples (QCs) prepared at LLOQ and ULOQ levels must also be included in accuracy and precision runs during pre-study validation. The ULOQ calibrator must meet the relative error (RE) and coefficient of variation (CV) acceptance criteria of ± 20% and ≤ 20%, respectively, and the LLOQ calibrator must meet the RE criterion of ± 25% and the CV criterion of ≤ 25%, as set forth in the FDA guidance, before they are qualified for inclusion in the calibration curve. Appendix 1 provides an example of a step-by-step approach to the selection and qualification of the standard curve range and the assay quantitative range. Once validated, LLOQ and ULOQ level calibrators become an integral part of the curve and must be included in every run. There should be one calibration curve for each analyte in the study and one in each analytical run. The calibration range must be appropriate for and correspond to the anticipated concentration range of study samples (4). Generally, an extended quantitative range is helpful to cover a broader range of sample concentrations. A narrow quantitative range limits the analytical capability of the assay, resulting in unnecessary repeat analysis at additional dilutions to bring the sample concentration within range. A zero calibrator, defined as a matrix sample without the analyte (see Appendix 5 for additional information), is not required but may be beneficial. The current recommendation is that LBA calibrators be analyzed in duplicate, although variability trending, such as high CVs, may necessitate triplicate testing. Use of singlicates would only be justified with demonstrated robustness and high precision of the raw responses over the quantification range of the method.
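The RE and CV criteria quoted above can be illustrated with a short calculation on the back-calculated concentrations of a calibrator's replicates. This is a sketch under stated assumptions: the function names are hypothetical, the CV is computed from the sample standard deviation, and the limits are the ± 20%/≤ 20% (± 25%/≤ 25% at the LLOQ) values given in the text.

```python
import statistics


def percent_re(measured_mean, nominal):
    """Relative error (%): (measured - nominal) / nominal * 100."""
    return 100.0 * (measured_mean - nominal) / nominal


def percent_cv(replicates):
    """Coefficient of variation (%): sample standard deviation / mean * 100."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


def calibrator_passes(replicates, nominal, is_lloq=False):
    """Apply the RE and CV limits: 20% in general, 25% at the LLOQ."""
    limit = 25.0 if is_lloq else 20.0
    re = percent_re(statistics.mean(replicates), nominal)
    cv = percent_cv(replicates)
    return abs(re) <= limit and cv <= limit
```

For example, hypothetical duplicates of 0.38 and 0.40 ng/mL at a nominal 0.5 ng/mL give an RE of about −22%, which would be acceptable at the LLOQ but not at any other level.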

Anchor Points

Anchor points have been discussed in the 2012 EMA bioanalytical guidance and have been recommended throughout industry for their role in fitting non-linear regression models (1,4,5,7). Anchor points are defined as calibrators above and below the quantitative range of the assay that are not subject to the same performance requirements as the curve points. The inclusion of anchor points and their usefulness are not universally accepted; yet, it is recommended that they be evaluated as part of method development and their impact on improvement of the overall regression be assessed. Determination of whether anchor points improve the curve fit should be based on a proposed mathematical algorithm or a proposed weighting factor, and it should be determined on a case-by-case basis. Anchor points may be especially helpful in enhancing the curve fit not only in the cases of overly extended or abbreviated calibration curves but also in calibration models where weighting is employed. Lower anchor points which are placed below the LLOQ of the assay have in some cases enhanced the curve fit and helped the LLOQ of the assay meet its acceptance criteria. An example is provided in Appendix 3.

Spacing

Calibrator spacing and the ULOQ/LLOQ signal ratio have not been addressed by regulatory guidance. Even spacing of calibrators, e.g., on a logarithmic scale of the power of 2, is generally recommended and may be beneficial for the assay performance (1,7). Additional calibrators may be added to better define the inflection points of the curve provided that these points are included in the assessment of the regression model. Inclusion of more closely spaced calibrators in the proximity of the assay LLOQ may be beneficial to minimize loss of sensitivity in the case of LLOQ failure. Inclusion of a zero calibrator is not an agency requirement or standard practice; however, should a laboratory choose to include zero as part of the curve fit, it is suggested that an adequate spacing be allowed between the LLOQ and the zero calibrator to safeguard the LLOQ from failure. As an example, the EMA 2012 guidance states that the signal of the LLOQ should be at least five times the signal of a blank sample. LBA laboratories should determine appropriate spacing for each method on the basis of the assay performance. The ULOQ/LLOQ concentration ratio is influenced by assay format, platform, and reagent characteristics. Laboratories should aim to maximize ULOQ/LLOQ ratio and may set a minimum target ratio (e.g., 10/1).
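As an illustration of even spacing on a logarithmic scale of the power of 2, the sketch below generates a twofold dilution series from a chosen ULOQ and reports the resulting ULOQ/LLOQ concentration ratio. The function names and the 100 ng/mL starting concentration are hypothetical, and anchor points and the zero calibrator are not modelled.

```python
def twofold_series(uloq, n_levels):
    """Concentrations from the ULOQ downward, each half of the previous level."""
    return [uloq / (2 ** i) for i in range(n_levels)]


def range_ratio(levels):
    """ULOQ/LLOQ concentration ratio for a list of calibrator levels."""
    return max(levels) / min(levels)


# Hypothetical curve: eight levels starting at a 100 ng/mL ULOQ.
levels = twofold_series(uloq=100.0, n_levels=8)
ratio = range_ratio(levels)
```

Eight twofold levels span a 128-fold concentration range, which would exceed the example minimum target ratio of 10/1 mentioned above.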

Selection of Regression Model

Selection of a non-linear regression model requires multiple iterations to achieve the best fit for the LBA calibration curve. Current regulatory guidance recommends that the simplest model which results in an adequate fit be used. The authors recommend that the regression model be selected first before any potential weighting is evaluated. It is also important to assess weighting to mitigate unequal variability of the response at different concentrations (2,21). Additionally, a critical parameter to be factored in is the quality of reportable results, which supersedes the consideration for the quality of the model fit and which could be assessed through an accuracy profile (22). An accuracy profile aids in the visualization of various model fits. It must be emphasized that the use of linear functions to approximate sigmoidal curves, and of log-log transformation of the data to make an inherently non-linear relationship approximately linear, has been discouraged in current literature (1,23).

Regression Model

Common non-linear models for calibration curves (24) are the 4 PL and 5 PL which can be achieved by a number of automated software (21). While other non-linear models can be used, their application should be carefully justified as special cases where 4 PL and 5 PL are not appropriate.

The most common parameterization for the 4 PL is

$$ {y}_j=a+\frac{d-a}{1+{\left(\frac{x_j}{c}\right)}^b} $$

where y_j is the response at concentration x_j, a is the upper asymptote, d is the lower asymptote, c is the concentration at the inflection point of the curve, and b is the growth factor. One characteristic of this model is the symmetry around the inflection point, which corresponds to one half of the distance between d and a. While this model may be appropriate, LBA data often yield an asymmetrical calibration curve that requires the use of a 5 PL fit (25). The general parameterization for the 5 PL is

$$ {y}_j=a+\frac{d-a}{{\left(1+{\left(\frac{x_j}{c}\right)}^b\right)}^g} $$

where g is the asymmetry factor. When g = 1, the 5 PL function is exactly equivalent to a 4 PL function. Figure 2 illustrates the difference between 4 PL and 5 PL curve fits.
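The two parameterizations above, and the back-calculation of concentration from response, can be sketched as follows. The inverse is obtained here by algebraically rearranging the 5 PL equation; the function names and parameter values are hypothetical and are not taken from any particular software package.

```python
def five_pl(x, a, b, c, d, g=1.0):
    """Response at concentration x; g = 1 reduces the model to the 4 PL."""
    return a + (d - a) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y, a, b, c, d, g=1.0):
    """Back-calculated concentration for a response y strictly between d and a."""
    ratio = (d - a) / (y - a)
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


# Hypothetical parameters: upper asymptote a = 2.5 OD, lower asymptote d = 0.05 OD.
params = dict(a=2.5, b=1.2, c=10.0, d=0.05)
response = five_pl(4.0, **params)                  # 4 PL response at 4 ng/mL
recovered = inverse_five_pl(response, **params)    # returns approximately 4.0
```

With g = 1 the response at x = c lies exactly halfway between d and a, which is the symmetry property noted above. Responses at or beyond the asymptotes have no valid inverse, which is consistent with restricting quantitation to the validated range.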

Fig. 2

Four parameter logistic (4 PL, left) and five parameter logistic (5 PL, right) curves. In these graphs, instrument response in optical density (OD) is plotted on the y-axis against calibrator concentration in nanograms per milliliter on the x-axis in log scale

Weighting

Another common challenge of LBA curve fitting is unequal variability of the response for different calibrator concentrations (2,21). This phenomenon is known as heteroscedasticity and is addressed by weighting each calibrator response in inverse proportion to its variability across the concentration range. Failure to use proper weighting will result in greater bias and imprecision of interpolated values near the LLOQ and ULOQ. Appropriate use of weighting mitigates the unequal variance in replicate response measurements. Most software possesses the required functionality to perform weighted regression. A selection of commonly used weights such as 1/y and 1/y² are usually built into the software, and the choice of the best weighting function is made by assessing the “response-error relation” (26). The 1/y weight enhances the points at the bottom portion of the curve, and 1/y² enhances both the top and the bottom ends. There are other weighting factors and weighting methodologies which are equally acceptable and may better serve the needs of individual assays. While the asymmetry of the curve and heteroscedasticity are important considerations for the curve fit, their inclusion in the model without proper pre-assessment may lead to fitting an overly complex model (27). Inclusion of parameters in the calibration curve which reflect the natural variability of the assay may lead to back-calculation errors and increased bias in the reported results.
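The effect of the 1/y and 1/y² weights on the least-squares objective can be illustrated numerically. This is a minimal sketch, assuming that the weights are computed from the fitted response (some software uses the observed response instead); the function names and data are hypothetical.

```python
def weighted_terms(observed, fitted, power=0):
    """Per-calibrator squared residuals weighted by 1 / fitted**power.

    power = 0 is unweighted, power = 1 is 1/y, power = 2 is 1/y^2.
    """
    return [(obs - fit) ** 2 / fit ** power for obs, fit in zip(observed, fitted)]


def weighted_sse(observed, fitted, power=0):
    """Weighted sum of squared residuals minimized by the regression."""
    return sum(weighted_terms(observed, fitted, power))


# Hypothetical mean responses (OD) and model predictions at four calibrator levels.
observed = [0.11, 0.52, 1.45, 2.30]
fitted = [0.10, 0.50, 1.50, 2.40]
unweighted = weighted_terms(observed, fitted, power=0)
inverse_y_squared = weighted_terms(observed, fitted, power=2)
```

In this example the highest calibrator contributes most to the unweighted objective, whereas under 1/y² weighting the lowest calibrator contributes most, so the fit is pulled toward the low end of the curve.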

Accuracy Profile

Different methodologies have been employed to statistically assess the quality of the model fit; these include R² and root mean square error. However, such parameters are not appropriate in all cases as they are designed to minimize the error of the response instead of the error of the reportable result (28). Additionally, Anscombe’s quartet (29) shows that highly different data sets can yield a similar quality of fit, so a similar fit statistic does not guarantee similar inverse predictions.

The accuracy profile is an alternative graphical method that is based on the future reportable results (22). It connects the β-expectation tolerance interval of the relative difference between the assay reportable results and that of the true value at each concentration level. The β-expectation tolerance interval is defined as the interval within which a certain proportion (β%) of the future results are expected to fall. Because this tool easily allows for visualization of the bias, precision, and LOQ (limit of quantitation, where the accuracy profile crosses the acceptance limits), it provides a simple method to compare models and choose the one which leads to the highest accuracy for the reportable results. Figure 3 presents two different calibration curves with their associated accuracy profiles based on 15% relative error (RE; defined as the ratio of the difference between experimental and theoretical values over theoretical value) acceptance limit. 15% RE acceptance limit is only recommended as a guide.

Fig. 3

Comparison of model fits: weighting versus no-weighting. Calibration curves of an LBA and the resulting accuracy profile (a top, b bottom). a A weighted 4 PL model. b A non-weighted 5 PL model. Instrument response in optical density (OD) is plotted on the y-axis against calibrator concentration in nanograms per milliliter on the x-axis. The dotted line connects the lower and upper β-expectation tolerance interval, and the plain lines are the acceptance limits. The two models fit similarly (R² of 0.9949137 for the 4 PL and 0.9949457 for the 5 PL), but the 4 PL provides better inverse predictions at low concentrations. The desired feature for residual analysis is random distribution with no bias at either low or high concentration levels. Additionally, attention needs to be paid to low or high levels that could over-quantify or under-quantify

Recommendations for Editing a Calibration Curve

A minimum of 75% of calibrators, including those at LLOQ and ULOQ levels, must pass the LBA acceptance criteria: back-calculated concentrations within ± 20% (± 25% for LLOQ) of the stated nominal concentrations and CVs ≤ 20% (≤ 25% for LLOQ) (FDA 2013; EMA 2012) (9,11) (also see Table I). CVs of back-calculated concentrations, and not those of the instrument response, should be reported for each calibrator. Calibrators should be first excluded on the basis of precision. After each exclusion, the curve should be re-regressed and re-evaluated. Next, calibrators should be excluded one at a time in the order of bias (RE) starting with the highest bias. Additional exclusions should be performed only if needed. Masking is defined as removal of a calibrator point from the standard curve regression while the calibrator remains available in the system. The terms masking and exclusion are used interchangeably in the curve editing discussions. LBA calibrators are typically run in duplicate, that is, in two wells of the microtiter plate. Acceptance of a calibrator which is run in duplicate should be based on two passing wells if the curve is fit to the mean of standard replicates. In such cases, it is not recommended to exclude one well and use the result from the remaining well alone. In some platforms such as Singulex, calibrators may be run in triplicate; in such cases, at least two of three wells must pass in order to accept a calibrator. In the event all replicates of both the LLOQ and ULOQ calibrators fail, the validation run fails, and the possible sources of the failure should be investigated (EMA 2012) (9). A calibration curve may only be edited due to an assignable cause such as a documented spiking or pipetting error or through application of a priori statistical criteria. Editing a calibration curve must be conducted independent of QC assessment; this means that calibrators should not be excluded to facilitate QC passing, unless calibrator-related acceptance criteria were not met. The example provided in Appendix 4 demonstrates how masking a single out-of-specification calibrator improved the CV and REs of a number of other calibrators and brought them to the acceptance range. Each laboratory must define the guidelines for calibrator masking and editing in their SOPs. The general guidelines for masking and exclusion of calibrators are listed below:

  • Calibrators must first be masked if they fail CV.

  • Subsequently, calibrators should be excluded one at a time in the order of bias.

  • No two consecutive calibrators may be masked, but two or more consecutive anchor points may be masked.

  • The number of masked duplicate calibrators must be ≤ 25% of total assay (duplicate) calibrators.

  • Following editing a calibration curve, a minimum of six valid points must remain (17).

  • If either LLOQ or ULOQ calibrators are masked, the assay limit is shifted to the next highest or the next lowest valid calibrator, respectively.

  • After masking, the low and high QCs should remain bracketed by valid calibrators; otherwise, the assay fails.

  • Exclusion of a calibrator should not lead to a change in the regression model already established for the validated assay.

  • Consistent need for editing warrants reevaluation of the calibration curve range and the anchor points.
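Three of the counting rules in the list above (no two consecutive masked calibrators, no more than 25% masked, and at least six valid points remaining) lend themselves to a simple automated check. The sketch below is illustrative only: the function name is hypothetical, anchor points are assumed to be excluded from the input, and the remaining rules (order of exclusion, QC bracketing, unchanged regression model) are not evaluated.

```python
def masking_allowed(masked_flags, min_valid=6, max_masked_fraction=0.25):
    """Check a proposed set of masked calibrators against the counting rules.

    masked_flags: one boolean per non-anchor calibrator level, ordered by
    concentration; True means the level is masked.
    """
    n_total = len(masked_flags)
    n_masked = sum(masked_flags)
    # Any adjacent pair of masked levels violates the "no two consecutive" rule.
    consecutive = any(a and b for a, b in zip(masked_flags, masked_flags[1:]))
    return (
        not consecutive
        and n_masked <= max_masked_fraction * n_total
        and n_total - n_masked >= min_valid
    )
```

For an eight-point curve, masking one or two non-adjacent levels would pass this check, whereas masking any level of a six-point curve would fail because fewer than six valid points would remain.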

Reporting of Sample Analysis Results

During sample analysis, the first step is always evaluation of the calibrators to assess whether an acceptable curve has been generated. Acceptability is based on ± 20% RE and ≤ 20% CV criteria for individual calibrators (± 25% RE and ≤ 25% CV for LLOQ). A minimum of 75% of all non-zero calibrators must meet the above criteria. Comparability of calibrator performance to historical data is another factor to keep under watch; abnormally high background signal or low overall response may be warning signs and cause for re-evaluation of the assay performance. Additional assessment of the calibration curve should be performed after any and all appropriate curve editing. Only after an acceptable curve fit has been obtained should the QC samples be judged against their acceptance criteria. Assay QCs must meet their acceptance criteria as outlined in the EMA 2012 Bioanalytical Method Validation guidance before a run can be accepted. An assay is deemed to have passed only if both calibrators and QCs meet their respective acceptance criteria.

Once an assay run has passed, each individual sample can be evaluated against the calibration curve. If an unknown sample result has an acceptable %CV, and the mean concentration falls within the validated range of the method, that result may be reported. If the mean concentration of the sample is outside the quantitative range of the assay, the sample should be reanalyzed at an appropriate dilution to obtain an in-range result. In cases where one replicate is within the validated range and the other replicate is either below the LLOQ or above the ULOQ, the mean result should be reported, provided that the mean is within the validated range and the %CV is acceptable. Sample concentration results should be reported for 100% matrix taking into account the assay MRD and any other applied dilution factors.
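The reporting logic in this paragraph can be sketched as a single decision function. Several details are assumptions made for illustration: the function name, the 20% limit on sample %CV (the text refers only to an acceptable %CV), and the single total_dilution factor, taken here as the product of the MRD and any additional dilution that is not already reflected in the calibrator nominal values.

```python
import statistics


def reportable_result(replicate_concs, lloq, uloq, total_dilution=1.0, max_cv=20.0):
    """Return the concentration in 100% matrix, or None if reanalysis is needed.

    replicate_concs are the interpolated replicate concentrations, expressed on
    the same scale as lloq and uloq (an assumption of this sketch).
    """
    mean = statistics.mean(replicate_concs)
    cv = 100.0 * statistics.stdev(replicate_concs) / mean
    if cv > max_cv or not (lloq <= mean <= uloq):
        return None
    return mean * total_dilution
```

Note that the decision is made on the mean, so a sample with one replicate slightly below the LLOQ is still reported when the mean is in range and the %CV is acceptable, as described above.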

As stated in the previous section, if the LLOQ calibrator level fails and must be masked, the quantitative range of the assay is truncated to the next higher calibrator. In such a case, the low QC must still remain bracketed by acceptable calibrators; otherwise, the assay fails. Similarly, if the ULOQ fails, the upper end of the curve is shifted down to the next acceptable calibrator, and here again, the HQC must remain bracketed by acceptable calibrators, or the assay fails. Should the measured value of a sample fall outside the quantitative range of the assay, that is, below the LLOQ or above the ULOQ, the sample must be re-analyzed at an adjusted dilution to bring its measured value within range.
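The truncation and bracketing logic can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, the calibrator list contains only the non-zero, non-anchor levels in ascending order, and "bracketed" is taken to include a QC that coincides with an end calibrator. The sketch covers only the range and bracketing check and does not repeat the 75% calibrator acceptance rule.

```python
def effective_range(levels, passed):
    """Return (LLOQ, ULOQ) after masking failed calibrators, or None.

    levels: nominal calibrator concentrations in ascending order
    passed: booleans of the same length, True where the level is acceptable
    """
    acceptable = [conc for conc, ok in zip(levels, passed) if ok]
    if not acceptable:
        return None
    # The range runs from the lowest to the highest acceptable calibrator
    return (acceptable[0], acceptable[-1])

def qcs_remain_bracketed(levels, passed, low_qc, high_qc):
    """True if the low and high QCs still lie within the truncated range."""
    rng = effective_range(levels, passed)
    if rng is None:
        return False
    lloq, uloq = rng
    return lloq <= low_qc and high_qc <= uloq

levels = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
# The LLOQ calibrator (1.0) failed; the range truncates to 2.0-50.0
print(effective_range(levels, [False, True, True, True, True, True]))
```

With a low QC at 3.0 the run in this example would still be bracketed; if the 2.0 calibrator also failed, the range would start at 5.0 and the same low QC would no longer be bracketed, so the assay would fail.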

In-Study Monitoring of Assay Calibration Curve Performance

LBA curves are potentially susceptible to calibration drift over time. Calibration curve performance drift is defined as a shift in calibration of the assay due to changes in the reactivity or the binding properties of the reference standard, assay reagents, and other assay components. This shift may change the slope or other properties of the standard curve and ultimately lead to over-reporting or under-reporting of sample concentrations. Additional changes in the assay upper and lower limits of quantitation or in sample dilutional linearity patterns may occur. Factors that may lead to calibration curve performance drift include but are not limited to (a) a new batch of matrix pool, (b) changes in critical reagent characteristics (such as purity, specificity, and binding affinity of capture or detection antibodies), (c) changes in the performance of non-critical reagents containing proteinaceous or lipidaceous additives or carriers, and (d) modifications to the formulation of the reference material.

Monitoring assay calibration curve performance should begin early in method development and continue through pre-study validation and into bioanalysis. Additionally, it is critical to track calibration drift over the span of multiple clinical studies. There is currently no consensus or established methodology for monitoring assay calibration curve performance. Recommendations for monitoring drift include:

  1. Track accuracy and precision performance of calibrators and QCs based on acceptance criteria established during pre-study validation.

  2. Plot QC concentrations and the assay zero calibrator (blank) signal over time. A graphical QC chart may also be constructed to aid in the assessment of trending (8).

  3. Cross-evaluate the concentrations of old and new lots of QCs and calibrators and track the % difference. Criteria for acceptable % difference between old and new lots must be established during pre-study method validation and specified in the test method.

  4. Periodically test a pre-selected panel of control or study samples which have established values and stability. A drift in the measured concentrations of such samples may indicate a drift in the assay.
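Two of the monitoring recommendations above lend themselves to a simple calculation: the % difference between old and new lots, and trending of QC results over time. The sketch below is illustrative only. The function names are hypothetical, and the control limits of mean ± 2 SD derived from a set of baseline runs are an assumption borrowed from common control-chart practice, not a criterion stated in this paper; each laboratory would define its own limits.

```python
from statistics import mean, stdev

def percent_difference(old_value, new_value):
    """% difference of a new-lot result relative to the old-lot result."""
    return 100.0 * (new_value - old_value) / old_value

def flag_qc_trend(baseline, recent, n_sigma=2.0):
    """Return indices of recent QC results outside the baseline limits.

    baseline: QC concentrations from earlier accepted runs (e.g., validation)
    recent: QC concentrations from later runs, in chronological order
    n_sigma: width of the limits in SD units (assumed value)
    """
    center = mean(baseline)
    spread = stdev(baseline)
    low, high = center - n_sigma * spread, center + n_sigma * spread
    return [i for i, value in enumerate(recent) if not low <= value <= high]

# Hypothetical mid-QC results: baseline runs, then later in-study runs
baseline = [10.0, 10.2, 9.8, 10.1, 9.9]
recent = [10.1, 10.3, 10.6, 10.9]
print(percent_difference(100.0, 108.0))
print(flag_qc_trend(baseline, recent))
```

A run of consecutive flagged results moving in one direction, as in the last two hypothetical values, is the kind of pattern that would prompt investigation of reagent lots or reference material.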

Discussion/Conclusion

High-quality, reliable ligand binding assays designed to determine drug concentrations in support of pharmacokinetic studies play a critical role in the drug development process. In typical LBA methods, a calibration curve is employed to interpolate unknown sample and assay quality control concentrations. A non-linear signal to concentration relationship is expected for the majority of LBAs. It is therefore recommended to apply a multi-parametric mathematical fit, typically 4 PL or 5 PL, to a minimum of six calibrator data points within the quantifiable range. Additional calibrators, including a zero or other anchor points, may be considered to improve the quality of the curve fit. The simplest appropriate regression model, with weighting applied where warranted, should be selected based on analysis of the accuracy profile. Software packages that allow for relevant regression analyses come as part of laboratory information management systems, as part of the instrument data analysis package, or as stand-alone packages. The quality of the calibration curve and ultimately of the assay is highly dependent upon the calibrator preparation process, the type of matrix used, and the calibrator sample storage conditions. Agency guidelines issued by the FDA, EMA, MHLW, and ANVISA address calibration curve parameters and their performance requirements to varying extents. Many of these requirements are well aligned, while some differences exist. One important topic discussed in detail in the present publication is related to the specifications and appropriate rules for editing a calibration curve. The authors aimed to provide editing recommendations which are scientifically sound and in line with industry practices. This paper contains a collection of useful examples designed to elucidate the proposed approaches to regression model selection, calibration curve design, and data analysis. The recommendations in this manuscript should be viewed as examples of best practice. Other approaches may also be acceptable with the demonstration of suitability and scientific rationale. The primary goal of the paper is to help readers develop high-quality PK assays and enable optimal and fit-for-purpose evaluation of drug concentrations in non-clinical and clinical investigations.