
1 Introduction

Mentioning certified or standardized laboratory requirements to other scientists is often met with apprehension, a skeptical expression, and concerned questions about taking on additional bureaucracy and time-consuming paperwork, as well as introducing what is perceived as unnecessary limits on creative and innovative scientific freedom. We want to dispel the misconception that implementing quality management in research practice is only marginally beneficial or too burdensome and costly to justify. Moreover, any treatment of quality costs is misleading and incomplete if it considers only the costs of “doing something” – in this case, implementing quality management. This view, a form of “omission bias,” ignores the fact that “not doing something,” i.e., not implementing quality management, also comes with costs, some obvious and many others hidden or absent from general estimations.

We describe two examples of implementing quality management systems (QMSs) in preclinical experimental (animal) research environments – one in Europe, the German Mouse Clinic, which has established an ISO 9001-certified QMS, and one in the United States, the University of Kentucky (UK), which has established Good Laboratory Practice (GLP)-compliant infrastructure. In doing so, we hope to make a convincing case for taking a long-term approach to promoting quality, in which the costs are comparatively minor and the benefits exceed the initial “activation energy” and commitments needed for implementation. Finally, we present a summary of the benefits of having an effective QMS, which may be useful in guiding discussions with funders or administrators to promote interest and investment in a QMS that ultimately supports shared, mutually beneficial outcomes.

2 German Mouse Clinic: ISO 9001

2.1 Our Mission

The German Mouse Clinic (GMC) is part of the Helmholtz Zentrum München (HMGU) and located in Munich, Germany. Understanding gene function in general, and in particular the etiology of and contributing factors to the onset of genetic diseases, is the driving force of the GMC. Established in 2001 as a high-throughput phenotyping platform for the scientific community, we have set up different phenotyping pipelines covering various organ systems and disease areas (Gailus-Durner et al. 2005, 2009; Fuchs et al. 2017; www.mouseclinic.de).

Standardized phenotyping covers the areas of behavior, bone and cartilage development, neurology, clinical chemistry, eye development, immunology, allergy, steroid metabolism, energy metabolism, lung function, vision and pain perception, molecular phenotyping, cardiovascular analyses, and pathology. In a comprehensive primary screen, we can analyze 700+ parameters per mouse and collect 400+ additional metadata (Maier et al. 2015). Our collaboration partners have to provide a cohort of age-matched mutant animals of both sexes and the corresponding wildtype littermates. Expectations for high-quality and scientifically valid outcomes are very high, given the substantial financial and time resources invested collectively. Therefore, study design, performance of experiments, materials and equipment, as well as training of personnel have to be well coordinated, harmonized, and kept up to date.

2.2 Our Team and Main Stakeholders

Our GMC team consists of scientists with expertise in a specific disease area (e.g., energy metabolism or pathology), technicians performing the phenotypic analyses, animal caretakers, computer scientists, a management team, and the director. In total, we have approximately 50 team members, with a third of the staff working within the animal facility barrier, which represents an additional challenge to maintaining the efficient communication crucial for successful output and teamwork.

Collaboration partners who send us mouse models for phenotypic analysis are scientists or clinicians from groups at the HMGU and from many different academic institutions, universities, hospitals, and industry in Germany, the rest of Europe, and other countries and continents such as the United States, Australia, and Asia. Since the beginning of the GMC, we have analyzed mice from 170 collaboration partners/laboratories in 20 different countries.

The GMC is a partner in a number of consortia such as the International Mouse Phenotyping Consortium (IMPC, a consortium of so-called mouse clinics all over the world, http://www.mousephenotype.org, Brown and Moore 2012; Brown et al. 2018) and INFRAFRONTIER (https://www.infrafrontier.eu; Raess et al. 2016). Together with the European mouse clinics, the GMC developed standardized phenotyping protocols (standard operating procedures (SOPs)) in the European EUMODIC program (Hrabe de Angelis et al. 2015). These SOPs have been further developed for use in the IMPC (https://www.mousephenotype.org/impress). Starting at the European level (EUMODIC) and expanding to the international level (IMPC), we had to harmonize equipment, animal handling, and laboratory documentation to make phenotyping data from different facilities around the globe compatible.

2.3 Needs Concerning Quality Management and Why ISO 9001

Systemic, large-scale mouse phenotyping poses logistical, experimental, and analytical challenges that must be met to ensure high-quality phenotyping data. The mutant mice we receive for analysis are generated by different technologies (e.g., gene editing, knockout, knock-in) and on various genetic backgrounds. As a unique feature, we import age-matched cohorts of mice from other animal facilities for the phenotyping screen.

Therefore, several processes need to be quality-controlled in many different areas, including project management (request procedure, reporting), legal requirements (collaboration agreement, animal welfare, and gene technology regulations), scientific processes (scientific question, study design, choice of phenotyping pipeline, data analysis, reproducibility), capacity management (many parallel projects), and knowledge and information management (sustain and transfer expertise, internal and external communication). In order to cover this complex management situation, we decided to implement a QMS within the GMC.

Being a partner in an international consortium allows benchmarking with centers in similar environments. Our rationale for using the ISO 9001 standard was therefore based on the following: (1) two other benchmark institutions with a similar scope had already adopted ISO 9001-based QMSs, and (2) our products, consisting of research data, inferences, and publications, are not regulated in the way that, e.g., safety, toxicity, or pharmacokinetic studies and clinical trial data used for a new drug application to the regulatory authorities are. Therefore, a process-based system seemed most suitable for our needs.

An ISO 9001-based QMS is very general and serves as a framework to increase quality in many aspects. In the GMC, we have implemented measures not only to improve our processes but also to directly improve instruments and increase the quality of our research results. To this end, the implementation of quality management went hand in hand with investments in information technology (IT) structure and software development.

2.4 Challenges

Building a QMS and going through the ISO 9001 certification process was a project that required significant personnel effort and time (2 years in our case). Although the effort should not be underestimated, it should be viewed in the context of the resulting benefits (see Sect. 2.6). To answer the question of what resources are required and how one can get started, we describe hereafter how the process was carried out at the GMC as an example of implementing an ISO 9001 QMS in a German Research Center setting, covering project management and organization, the IT infrastructure, as well as social aspects.

Project Management and Organization

After an introductory quality management training by an expert consultant for all GMC staff, we formed a project management team consisting of a quality manager (lead), the two heads of the GMC, the head of the IT group, and an affiliated project manager. An initial gap analysis provided useful data about our status quo (e.g., inventory of existing documentation). In the beginning, the main tasks were (1) development of a project plan with timelines and milestones, (2) defining “quality” in our biomedical research activities, and (3) determination of the scope of the QMS.

The GMC’s current strategy and future plans were reconsidered by establishing a quality policy (the highest possible standard of research quality) and corresponding common quality objectives (specific, measurable, achievable, realistic, time-based (SMART)). The revised ISO 9001:2015 standard challenged us to comprehensively describe our context. The external context of the GMC includes political, economic, social, and technological factors such as animal welfare regulations, funding, health and translational research for society, and working with state-of-the-art technologies. Our internal context is represented by our technical expertise (over 15 years of experience in mouse phenotyping), knowledge and technology transfer (workshops), application and improvement of the 3Rs (https://www.nc3rs.org.uk/), and adherence to the rules of good scientific practice (GSP) and the ARRIVE guidelines (Karp et al. 2015).

Implementation of the process approach was addressed by defining the GMC’s key processes in a process model (Maier et al. 2015) and specifying related key performance indicators (KPIs) (e.g., number of publications, trainings, reported errors, measures from internal audits). Process descriptions were put in place, including responsibilities, interfaces, critical elements, as well as risks and opportunities (risk-based thinking), to ensure that processes are clearly understood by everyone, particularly new employees.

It is often claimed that ISO 9001 documentation management increases bureaucracy (Alič 2013). We therefore did not create documents just for the sake of ISO 9001. Instead, we revised existing phenotyping protocols (SOPs) by adding essential topics (such as data quality control (QC)) on the one hand and implemented missing quality-related procedure instructions on the other (e.g., regarding error management, data management, and calibration). All documents were transferred into a user-friendly standard format and made easily retrievable (keyword search) using commercial wiki software. By implementing these processes, a transparent documentation management system was successfully put into effect.

Continuous improvement of our systemic mouse phenotyping processes is ensured by using established key elements of quality management such as error management including corrective and preventive actions (CAPA), an audit system, annual management reviews, and highly organized capacity and resource management structures.

IT Infrastructure and Software Tools

Although our custom-built data management solution “MausDB” (Maier et al. 2008) has supported science and logistics for many years, the decision to implement a QMS triggered a critical review of our entire IT infrastructure. As a result, we decided to adapt our IT structure according to our broad process model (Maier et al. 2015) to deliver more reproducible data.

To this end, MausDB has been restructured into process-specific modules. MausDB 2.0 is a state-of-the-art laboratory information management system (LIMS) for automated data collection and data analysis (Maier et al. 2015). Standardized R scripts for data visualization and statistics are custom-developed for every phenotyping test and routinely applied. Numerous QC steps are built into the LIMS, including validation of data completeness and data ranges (e.g., min/max). Additional modules cover planning of capacities and resources as well as animal welfare monitoring and a project database tool for the improvement of project management and project status tracking.
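As an illustration of the QC steps described above, the sketch below checks an exported data set for completeness and for min/max ranges. The GMC implements such checks within its LIMS (with R-based tooling); the Python code, column names, and limits shown here are illustrative assumptions only.

```python
import pandas as pd

# Hypothetical per-test QC limits; the GMC's actual reference ranges live in the LIMS.
QC_LIMITS = {
    "body_weight_g": (5.0, 80.0),
    "glucose_mg_dl": (40.0, 400.0),
}

REQUIRED_COLUMNS = ["mouse_id", "test_date", "experimenter", "body_weight_g", "glucose_mg_dl"]


def qc_check(df: pd.DataFrame) -> list[str]:
    """Return a list of QC findings for one exported phenotyping data set."""
    findings = []
    # Completeness: every expected column present, no missing values.
    for col in REQUIRED_COLUMNS:
        if col not in df.columns:
            findings.append(f"missing column: {col}")
        elif df[col].isna().any():
            findings.append(f"{df[col].isna().sum()} missing value(s) in {col}")
    # Range validation: values must fall within the configured min/max limits.
    for col, (lo, hi) in QC_LIMITS.items():
        if col in df.columns:
            out_of_range = df[(df[col] < lo) | (df[col] > hi)]
            if not out_of_range.empty:
                findings.append(f"{len(out_of_range)} out-of-range value(s) in {col}")
    return findings


if __name__ == "__main__":
    data = pd.DataFrame({
        "mouse_id": ["M1", "M2"],
        "test_date": ["2023-05-02", "2023-05-02"],
        "experimenter": ["tech_a", "tech_a"],
        "body_weight_g": [23.4, 250.0],   # second value deliberately out of range
        "glucose_mg_dl": [110.0, None],   # second value deliberately missing
    })
    for finding in qc_check(data):
        print("QC:", finding)
```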

On the data management level, a series of SOPs regulates reproducible data handling and organization. Comprehensive data monitoring allows detection of data range shifts over time, possibly triggered by changes in methods or machinery.
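One simple way to picture such monitoring is to compare a recent time window against the historical baseline and flag large shifts. This is a sketch under assumed thresholds and data layout, not the GMC's actual monitoring rule.

```python
import pandas as pd


def detect_range_shift(values: pd.Series, dates: pd.Series,
                       window_days: int = 90, threshold_sd: float = 2.0) -> bool:
    """Flag a shift when the mean of the recent window deviates from the
    historical baseline mean by more than `threshold_sd` baseline SDs."""
    df = pd.DataFrame({"value": values, "date": pd.to_datetime(dates)}).sort_values("date")
    cutoff = df["date"].max() - pd.Timedelta(days=window_days)
    baseline = df.loc[df["date"] < cutoff, "value"]
    recent = df.loc[df["date"] >= cutoff, "value"]
    if len(baseline) < 30 or len(recent) < 10:
        return False  # not enough data for a meaningful comparison
    return abs(recent.mean() - baseline.mean()) > threshold_sd * baseline.std()
```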

On the infrastructural level, a well-defined software development process built on the Scrum methodology ensures proper IT requirements management. Thus, a continuous improvement process can be applied to our IT tools. Script-based automation of frequent tasks encompasses daily backups of data as well as software build procedures. An “IT emergency SOP” has been developed to ensure well-planned IT crisis management (e.g., in case of server failure) and provides checklists and instructions for troubleshooting.
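The script-based backup automation mentioned above can be pictured with a minimal sketch that writes a dated, compressed archive; the paths and packaging choices are our assumptions, not the GMC's actual tooling.

```python
import tarfile
from datetime import date
from pathlib import Path


def daily_backup(source: Path, backup_dir: Path) -> Path:
    """Create a dated, compressed archive of `source` under `backup_dir`."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    archive = backup_dir / f"{source.name}_{date.today().isoformat()}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)
    return archive


# Example (hypothetical paths), scheduled nightly via a job scheduler such as cron:
# daily_backup(Path("/data/lims_exports"), Path("/backup/lims"))
```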

The IT-related challenges described above fell primarily into two categories: resources and change management. Self-evidently, implementation of all IT improvements required years of effort. However, the resulting overall process is much more efficient and less error prone. Active change management was essential to convince IT and non-IT staff that changes were necessary, although they would affect daily work routines. In the end, employees have come to realize that these processes save time and produce higher data quality.

Social Aspects

In a preclinical research environment, the members of a research group traditionally have a high level of freedom in planning their work and executing scientific projects. Often, a single scientist conducts one detailed research project, plans the next steps from day to day, and communicates progress to the group in regular meetings. A research environment thus encourages a self-responsible, independent working structure, leaves room for innovative trial and error, and supports both a creative and a competitive mindset.

When planning to implement a QMS in a preclinical research environment, you want to preserve the positive aspects of this open-mindedness and combine them with more regulated processes. As the initiator, you might find yourself in a position where you see both the opportunities and the possible restrictions, such as limitations on innovative and unrestricted science. We decided to limit the certification to the standard screening pipelines in the beginning and not to force every research project into the ISO framework. We nevertheless encountered the expected resistance to the implementation, since people feared losing freedom and control over their work structure, as well as disruption through unnecessary additional bureaucracy. In any case, this is a complex psychological situation. The implementation of a QMS therefore takes time, understanding, empathy, and measures of change and expectation management. A good deal of stamina, patience, and commitment is indispensable.

2.5 Costs

Quality control and management of preclinical animal research is a topic of increasing importance since low reproducibility rates (Begley and Ellis 2012) have called the knowledge generated by basic research into question. Furthermore, low reproducibility rates have caused immense delays and increased costs in therapeutic drug development (Freedman et al. 2015). Not implementing recommended solutions such as rigorous study designs, statistics consultation, randomization and blinding of samples to reduce bias, data sharing, or transparent reporting (Kilkenny et al. 2010; Landis et al. 2012; Freedman et al. 2015, 2017) in preclinical research will keep these costs high. A well-developed QMS could support the practical implementation of these standards by providing structure and ensuring follow-through.

However, there are many concerns about the financial expense of implementing quality in research practice by establishing and maintaining a QMS. Initial minimum financial costs include gaining knowledge about the chosen standard, in our case through trainings on ISO 9001-based quality management and documentation organized for the whole staff by an external consultant. Designating at least one person to coordinate the implementation of the QMS is essential, which raises the issue of salary costs. We hired a quality manager for 2 years (1 FTE) and in parallel trained a project manager from our team in the standard and as an auditor to take over after the first certification (0.5 FTE).

In addition, the costs for the certification body need to be included in cost calculations. Different certification bodies perform the ISO 9001 certification at varying costs. The first 3-year period comprises audit fees for the initial certification and two annual surveillance audits. This is followed by a recertification in the fourth year. As an example, our certification body costs were as follows: ~6,500€ for the initial certification, ~3,000€ for an annual surveillance audit, and ~5,000€ for the recertification.

Since we implemented new IT solutions, we had additional financial costs. For transparent documentation management, we acquired licenses for supporting commercial wiki software (~2,200€ per year). All other software acquisitions were not directly related to the ISO implementation but solely to data analysis or software development. The same is true for a permanent statistician position (1 FTE) to support study design and data analysis.

Costs for implementing quality management in research practice are often a deterrent, as the advantage of saving this money is more obvious than the disadvantages of not spending it. However, “not investing” in quality leads to “silent” costs. An example is nonconformity management: if errors and the corresponding measures are not properly documented, detection, reduction, and avoidance of recurring errors through detailed analysis are hardly possible, and the positive effects of increased efficiency and reduced failure costs are forfeited. The financial gain from an effective QMS can hardly be calculated in a research environment; however, documented, reviewed, and continuously improved processes ensure identification of inefficiencies, optimized resource management, avoidance of duplicated work, and improved management information, thereby reducing general operating costs.

2.6 Payoffs/Benefits

Why should an institution decide to improve quality in research practice by investing in an ISO 9001 QMS? We want to list a number of fundamental arguments and provide practical examples that might open a different perspective.

Building a QMS in the GMC demanded the greatest effort in the first 2 years before the certification (in 2014). However, with increasing maturation of the QMS, the process ran more efficiently due to enhanced quality awareness and a general cultural change, both of which led to increased quality of output (verified by monitoring the KPIs). People started to appreciate the environment of having a QMS, and continual improvement became a habit. Over time, the benefits associated with using a QMS will offset the effort it took to build it in the first place. Some of the most striking benefits of having an established QMS are listed hereafter.

Management Reviews

These are important controlling steps, as they give an annual overview of the actual state of the processes, including all KPIs, the nature of errors and the corresponding actions, open decisions that were supposed to be closed during the year, and specific actions that are pending. In this respect, this kind of review differs from the usual reporting to funding authorities: it serves solely the quality status and reinforces focus on strategic, quality-related goals that have been identified as priorities. This is particularly useful since concentrating on important issues (e.g., increased QC issues in specific tests or applying a risk-based approach) is something that is often postponed in favor of other tasks requiring frequent or immediate attention. Here you deal with concrete numbers and can react and adapt milestones if specific problems have not been adequately addressed. Surprisingly, this kind of review enabled us to react quickly to new developments. Since digital assembly of the KPIs is in place, the numbers can also easily be reported during the year, and fact-based decisions can be made. Thus, contrary to the belief that a QMS causes a bureaucratic burden, the QMS actually facilitates agile project management.

Audit System

Internal audits are an often underestimated element of a QMS. By performing internal audits (e.g., independent phenotyping protocol reviews, complex process audits, or audits addressing current quality problems such as reduction of bias), we ensure standardization, define measures for improvement, and address the prevention of undesired effects. Internal system audits as well as third-party audits ensure the integrity and effectiveness of the QMS.

Box 1 First Third-Party Audit

BEFORE: Being part of a third-party audit was initially mentally and emotionally demanding: just before the first certification audit, personnel (afraid of the visit by the audit team) kept calling to report minor issues and ask what to do.

AFTER: Now, members of the group are used to internal scientific method auditing and have realized that we do not run the QMS solely for the certification body but for our own benefit. Today, while presenting the systemic phenotyping methods in a third-party audit, people feel accomplished, enthusiastic, and self-confident.

Training Concept

Comprehensive training of personnel is time-consuming and associated with extensive documentation. However, training ensures that knowledge is established and maintained. New employees complete an intensive induction training covering the rules of good scientific practice (GSP), the 3Rs, awareness of working in an animal research environment, QM issues, and legal regulations. Regular QM trainings build and maintain awareness of quality issues. The “not documented, not done” principle is well accepted now and supports the transparency of personnel competence assessment. In addition, we are currently building an eLearning training program to reduce the logistical effort of rescheduling missed trainings.

Traceability

All processes were critically assessed for traceability. On the physical level, temporal and spatial tracking of mice and samples (blood, tissue) is essential; we implemented barcoding in our LIMS to register all samples. On the data level, we aim to maintain full traceability of data and metadata. This means we link file-based raw data to our LIMS and capture all metadata that may influence the actual data, e.g., experimenter, equipment, timestamp, and device settings. On the process level, all sub-process state transitions (“waiting,” “done,” “cancelled”) are logged. This enables us to monitor dozens of ongoing projects at any time with custom-built tools to identify and manage impediments.
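The three levels of traceability described above can be sketched as a measurement record plus a state-change log; the field names and identifiers below are hypothetical and only illustrate the principle.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MeasurementRecord:
    """Illustrative record linking raw data to the metadata that may influence it."""
    sample_barcode: str      # physical traceability: barcoded sample registered in the LIMS
    mouse_id: str
    raw_data_file: str       # file-based raw data linked to the LIMS entry
    experimenter: str
    equipment_id: str
    device_settings: dict
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


# Process-level traceability: every sub-process state change is appended to a log.
STATES = {"waiting", "done", "cancelled"}


def log_state_change(project_id: str, sub_process: str, new_state: str, log: list) -> None:
    if new_state not in STATES:
        raise ValueError(f"unknown state: {new_state}")
    log.append((datetime.now(timezone.utc), project_id, sub_process, new_state))


audit_log: list = []
log_state_change("GMC-2024-017", "clinical_chemistry", "done", audit_log)
```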

Box 2 Traceability Versus Personalized Data Storage

BEFORE: Some 10 years ago, we had to ask a collaboration partner for re-genotyping because of identity problems within a cohort of mice. Tail biopsies were sent, but the electronic list correlating the biopsy numbers with the corresponding mouse IDs was saved on a personal device unavailable to the team. Therefore, the results could not be matched, and the data analysis was delayed by more than 4 weeks.

AFTER: Samples now carry a barcode label with the mouse ID and any lists are saved in a central project folder.

Reproducibility

With respect to quality, reproducibility of results is of paramount importance. In addition to SOPs, which regulate how phenotyping procedures are physically carried out, we put considerable effort into making data analysis and visualization reproducible. To this end, we seamlessly integrated R (R Core Team 2013), a free statistical computing environment and programming language, with our LIMS MausDB. Upon user request in MausDB, R scripts perform customized, test-specific statistical analyses as well as data visualizations. This tool restricts user interaction to the mere selection of a data set and the respective R script, ensuring that the same data will always reproduce the same statistical results and the same plots.
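The principle of restricting user interaction to choosing a data set and a registered analysis script can be sketched as follows. The GMC's actual integration is built into MausDB and invokes R directly; this Python wrapper, the script registry, and the file paths are illustrative assumptions (running it requires Rscript to be installed).

```python
import subprocess
from pathlib import Path

# Registry of approved, version-controlled analysis scripts (names are hypothetical).
REGISTERED_SCRIPTS = {
    "grip_strength": Path("scripts/grip_strength_v1.2.R"),
    "clinical_chemistry": Path("scripts/clin_chem_v2.0.R"),
}


def run_analysis(test_name: str, dataset_csv: Path, output_dir: Path) -> None:
    """Run the registered R script for a test on an exported data set.
    The user chooses only the test and the data set; everything else is fixed,
    so the same input always yields the same statistics and plots."""
    script = REGISTERED_SCRIPTS[test_name]   # no ad hoc script paths are accepted
    output_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(["Rscript", str(script), str(dataset_csv), str(output_dir)], check=True)


# run_analysis("grip_strength", Path("export/cohort_123.csv"), Path("results/cohort_123"))
```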

Box 3 Taking Responsibility in Writing Up Publications

BEFORE: We always provided our collaboration partners with the raw data so that they could perform additional analyses. In the past, while drafting manuscripts, we did not verify in detail whether we could reproduce their statistical analyses and figures.

AFTER: With the implementation of the QMS, we have formalized how a manuscript is processed. This process now includes a step in which our in-house statistician reproduces all analyses and figures using our data as well as additional data from the collaboration partner.

2.7 Lessons Learned/Outlook

Certification to ISO 9001 is not a requirement in nonregulated preclinical biomedical research and also does not define scientific standards, but it represents a reasonable strategy to improve data quality.

GMC’s QMS: A Success Story?

Our ISO 9001:2015-based QMS helps us generate and maintain transparent and traceable data records within a broad spectrum of standardized phenotyping processes with low variability and increases collaboration partners’ trust in the analysis, interpretation, and reporting of research data. This structured approach also supports compliance with manifold regulations and promotes awareness and risk-based thinking for the institutional context, as well as meeting the requirements of funders, personnel, the scientific community, and the public. However, to specifically address the quality of data output, we see the need to broaden the perspective and reach out to other parties who perform quality assessments in preclinical research.

Networking

Although certification is rarely found in preclinical research, participating in a network of institutions in similar scientific research areas that perform, e.g., annual internal ISO 9001 audits on a mutual basis is an opportunity to address common scientific quality problems and is therefore a future goal. Positive examples are the Austrian biobanks (BBMRI.at) and a French network of technological research platforms (IQuaRe; https://www.ibisa.net), which have built ISO 9001 cross-audit programs.

Limits of Automation

We have learned that beyond a certain level of complexity, further automation requires disproportionately higher effort and is therefore limited. At the GMC, automation of data analysis and visualization works well for projects adhering to our standardized workflow. Beyond that, customization of projects adds complexity that is not compatible with full automation. In such projects, custom data analysis still has to be performed manually.

Innovation

At the GMC, information technology has supported operative processes since 2001. In that sense, “digitalization” is not just a buzzword for us, but a continuous process that aims for measurable and sustainable improvement of our work. IT solutions implemented so far mainly cover standardized processes.

“Machine learning” is another heavily used catchphrase. In the case of the auditory brainstem response test, we are currently using our vast data set to develop methods for automated detection of auditory thresholds, including deep learning with neural networks. We are confident that this will provide a more reproducible method, independent of human influence. Of course, human experts will always review and QC the results. Nevertheless, setting the scene with a QMS paves the way for future investment in modern IT technologies and digitalization.

Finally, we have learned that you need to allow flexibility and consider not including all processes and/or details in the ISO 9001 QMS. Indeed, an ISO 9001 QMS does not require every process to be incorporated, and this misconception may be one of the reasons many principal investigators (PIs) resist its introduction in academic settings. Real innovation is a truly inefficient and non-directed process (Tenner 2018) that needs to reside in a protected area. As soon as innovative research projects generate new technologies or techniques, we gradually incorporate these into our processes and apply quality management measures step by step. It is therefore also important not to let the system “take over,” getting lost in micromanagement or becoming obsessed with automation through new IT solutions. It is all about balancing the needs for quality and scientific freedom and keeping the expectations of all involved parties within a reasonable frame. To this end, it is necessary to allow time during the implementation process and to understand that benefits become apparent only after a longer period.

Although implementing a QMS might be more difficult in an academic university setting (many PIs, high diversity of activities, rapid turnover of personnel) than in a mouse clinic performing highly standardized tests and procedures, the ISO 9001 standard provides a framework for introducing more quality-relevant aspects into preclinical research and helps enormously with the team mindset. An ISO 9001-based QMS supports quality in manifold key process types as well as in supporting, analysis, and improvement processes such as training, communication, documentation, auditing, and error management.

Nevertheless, whether the data output is of good quality can ultimately be judged only by scientific peers and the respective community.

3 University of Kentucky Good Research Practice (GRP) Resource Center

The Good Research Practice (GRP) Resource Center at the University of Kentucky (UK) is a research support unit under the office of the UK Vice-President for Research. In the context of experimental life sciences, it is particularly important to note that UK is also an academic healthcare center. The UK Albert B. Chandler Medical Center is located on the health sciences campus and comprises six colleges of biomedical sciences (dentistry, health sciences, medicine, nursing, pharmacy, and public health) as well as the clinical facilities associated with UK HealthCare. These include the UK Chandler Hospital, Kentucky Children’s Hospital, UK Good Samaritan Hospital, Markey Cancer Center, Gill Heart Institute, Kentucky Neuroscience Institute, and Kentucky Clinic, which collectively support research, education, and healthcare. The Chandler Medical Center is 1 of only 22 academic medical centers in the United States that house 3 nationally recognized, federally funded centers: a National Cancer Institute-designated cancer center, an Alzheimer’s Disease Center funded by the National Institute on Aging, and the Kentucky Center for Clinical and Translational Science funded as part of the NIH’s Clinical and Translational Science Award Consortium.

UK is a public, land grant university established in 1865 located on a 784 acre urban campus in Lexington, Kentucky. Sixteen colleges and diverse professional schools are available including Colleges of Agriculture, Dentistry, Health Sciences, Medicine, Pharmacy, and Public Health, to name a few. Major graduate research centers offering high-quality multidisciplinary graduate training include the Graduate Centers for Nutritional Sciences and Gerontology. UK researchers have developed highly productive collaborations across diverse disciplines, in multidisciplinary and interdisciplinary research. Research and academic activities at UK span all 16 colleges, 76 multidisciplinary research centers, and 31 core research facilities. UK has over 80 national rankings for academic and research excellence and is 1 of 108 private and public universities in the country to be classified as a research university with very high research activity by the Carnegie Foundation for the Advancement of Teaching.

3.1 Our Mission

The mission of the GRP Resource Center is to assist academic scientists by providing tools to support research processes that promote data quality, integrity, and reproducibility. Our center comprises faculty and staff with combined experience as researchers and in regulatory roles. Our experience includes conducting both (1) research in full compliance with US FDA Good Laboratory Practice (GLP) regulations, involving defined roles in management, as study directors, and in quality assurance (QA), and (2) non-GLP research, which is not required to be conducted according to GLP regulations but is carried out using many of the requirements outlined therein (e.g., maintaining written methods and records of method changes, rigorous data documentation and traceability), although typically without QA oversight. Using experience and training with the laboratory and quality management requirements set forth by the GLP regulations, our center supports laboratories pursuing either mandated requirements (i.e., GLP-compliant research) or voluntary strengthening of current practice in nonregulated research. With the recent alignment of stakeholder interest in enhancing the value of preclinical research, discussed further below, we support researchers in meeting these expectations as well. This is achieved through individual consultation, group training, and other resource sharing to provide templates for building a practical and effective quality management system.

3.2 Our Stakeholders’ Interests and Concerns

Most visibly, primary stakeholders in research include those with a direct economic investment, which may take the form of both private and government funding mechanisms. In many governmental structures, this investment extends to the general public. The public contributes both direct financial investment in funding research and, more indirectly, the economic burden of healthcare measures that manage unfavorable health outcomes where effective therapies are delayed or do not exist. Additionally, many individuals have a personal investment, hoping for a better understanding of diseases and the development of therapies for illnesses that may affect them or their family members. Other stakeholders include publishers, which rely on transparent and accurate reporting of research processes and results as well as a thorough discussion of possible limitations when interpreting the significance of findings.

Public opinion of scientific research has declined due to increasing evidence of a lack of reproducibility (Begley and Ioannidis 2015; Pusztai et al. 2013). Studies conducted by Bayer, which reviewed new drug targets reported in published results, found that 65% of in-house experimental data did not match the published results (Mullard 2011). Similarly, a 10-year retrospective analysis of landmark preclinical studies indicates that the percentage of irreproducible scientific findings may reach as high as 89% (Begley and Ellis 2012). In addition, studies indicate a recent surge in publication retractions: approximately 30 retraction notices appeared annually in the early 2000s, while in 2011 the Web of Science indexed over 400 retractions (a 13-fold increase), despite the total number of papers published rising by only 44% (Van Noorden 2011).

Although stories of scientific misconduct, particularly fraud, are often the most memorable and garner the most attention, studies indicate that other experimental factors are bigger contributors to irreproducibility (Gunn 2014; Freedman et al. 2015; Collins and Tabak 2014; Nath et al. 2006). Key contributors to irreproducibility include biological reagents/reference materials, study design, analysis, reporting, researcher and/or publishing bias, and institutional incentives and career pressures that compromise quality (Freedman et al. 2015; Begley and Ellis 2012; Begley et al. 2015; Ioannidis 2005; Baker 2016a, b; Ioannidis et al. 2007). These factors engage the scientific community at multiple levels, highlighting the need for a cultural shift and prioritization of quality; this includes the primary scientist but also calls into action funding agencies/institutes, publishers, and institutional policy makers, who each have responsibilities in directing the research process and setting benchmarks for success. In the United States, a major stakeholder in research quality, the National Institutes of Health (NIH), has recognized the combination and interaction of factors that contribute to irreproducibility in preclinical research and acknowledged that there is a community responsibility to the reproducibility mission. Several NIH institutes and centers are testing measures to better train researchers and better evaluate grant applications to enhance data reproducibility (Begley and Ellis 2012; Collins and Tabak 2014), and major publishers have committed to taking steps to increase the transparency of published results (McNutt 2014; Announcement 2013).

3.3 How to Address Data Irreproducibility

With the abovementioned contributors to irreproducibility in mind, several research processes can be identified that would benefit from implementing a quality management system. At the forefront are standardized and comprehensive documentation procedures that allow adequate study reconstruction. In research environments with frequent personnel turnover, which is particularly the case in academic settings, access to traceable records is a cornerstone for isolating specific sources of irreproducibility as well as for accurate and transparent reporting of experimental methods and results. For example, documentation of key reagent batch/lot information may be critical in troubleshooting discrepancies in bioanalytical results. A secure, indexed archiving system that protects the integrity of materials is also necessary to facilitate expedient access to study materials over time. Standardization of experimental design elements is also needed, with agreement among and within laboratories on critical aspects, which include, but are not limited to, key chemical/biological resources, personnel training, equipment testing, statistical methods, and reporting standards (Nath et al. 2006; Kilkenny et al. 2010; Landis et al. 2012). Standardization and best practice guidelines for particular technologies and fields of study continue to be developed, debated, and refined (Taussig et al. 2018; Almeida et al. 2016); however, preclinical research would also benefit from agreement on quality management expectations common to many research applications.

3.4 Why Build a GLP-Compliant Quality Management System in Academia

Our decision to become GLP-compliant resulted from requests by industry partners who were seeking continuity from early discovery work through to GLP-compliant safety studies. This circumstance arose out of a combination of experience and capabilities unique to our team in magnetic resonance imaging-targeted intracranial drug delivery in large animal models.

The GLP regulations describe the minimum requirements for conducting nonclinical studies that support or are intended to support research or marketing permits for products regulated by the FDA and the Environmental Protection Agency (Pesticide Programs: Good Laboratory Practice Standards 1983; Nonclinical Laboratory Studies: Good Laboratory Practice Regulations 1978). The GLP regulations were codified in 1978 following evidence of fraudulent and careless research at major toxicology laboratories, namely Industrial Bio-Test Laboratories and Searle. In one example of a poorly executed study, the methods used to generate test article mixtures had not been validated, leading to nonhomogeneous mixtures and resulting in uncontrolled dosing. In other cases, study records were poorly maintained and reports appeared falsified: animals that were documented as deceased were later reported alive in study reports. Investigations culminated in the conclusion that the toxicology data could not be considered valid for critical decision-making by the agency, putting public health at tremendous risk (Baldeshwiler 2003; Bressler 1977).

With the intent to assure the quality, integrity, and reproducibility of data, GLP regulations direct conditions under which studies are initiated, planned, performed, monitored, and reported. To name select general examples, meeting GLP regulatory requirements involves:

  • Characterization of key reagents (e.g., test/control articles)

  • Traceable, accurate, and archived study records

  • Established personnel organization

  • Training of all personnel

  • A written study protocol and other written methods for facility processes with record of any changes

  • An independent monitoring entity, the quality assurance unit (QAU), which performs inspections and audits to assure that facility and regulatory requirements are met and that reports accurately reflect the raw data

These elements, although not all-inclusive, are key components that address the quality management needs discussed above. Since achieving data quality is also a priority in nonregulated research, which forms a basis for drug development, adapting GLP elements and applying them as voluntary quality practices is one option for meeting quality needs. At the University of Kentucky, the GRP Resource Center has taken a lead role in developing a model to guide individual researchers and facility directors who are seeking to strengthen current practices but are not conducting studies where GLP compliance is required. Our focus to date has involved initial and ongoing assessment of centralized, shared-use research core facilities and instruction of research trainees. Our consultation services are also available by researcher request for both voluntary quality management consultation and GLP needs assessment, as facility needs dictate. Select GLP elements applicable to nonregulated preclinical laboratory research and to achieving quality are listed in Table 1.

Table 1 Total CoQ as represented by cost of failure + cost of achieving

3.5 Challenges

Our discussion focuses on the suitability and limitations of GLP for meeting quality achievement needs and for implementation in preclinical research. Implementing an effective QMS, particularly one outlined in the GLP regulations, comes with specific needs, so increased resource allocation should be expected, including time, personnel, and the associated fiscal investment. For example, few preclinical laboratories have the resources necessary to characterize test/control articles to GLP standards. Additionally, GLP compliance presents a greater challenge in the university setting, where nonregulated operational units may interact with the GLP infrastructure. Complex administrative relationships may also make it difficult to define the GLP organizational structure. Furthermore, personnel concerns include high turnover, inherent to the training environment of academia, as well as obtaining support for independent personnel to carry out QA functions, particularly where GLP studies are not performed on a continual basis. However, compliance challenges can be lessened when executive administrators value quality and can offer support to address these needs. These unique GLP challenges and options for supporting a GLP program are discussed further in the literature (Hancock 2002; Adamo et al. 2012, 2014). Understandably, studies that require GLP compliance do not allow substantial flexibility in quality management components. Nonetheless, using GLP elements as a template for nonregulated preclinical studies, where GLP is not mandated, is a viable option and may substantially lower variation and irreproducibility by focusing on the top contributors to poor quality (Freedman et al. 2015). It is important to note that although these self-imposed standards are non-GLP, they still add value. Thinking in terms of a quality spectrum rather than all-or-nothing may be a more productive model for addressing reproducibility issues in a resource-limited setting. For example, without support for an independent QAU, a periodic peer-review process among laboratory staff can be implemented to verify that critical record keeping, for example, legibility and traceability, is present in study records (Baker 2016b; Adamo et al. 2014).

3.6 Costs

Although definitions and models vary, the cost of quality (CoQ) has been described as the sum of the costs associated with failure and those associated with achieving the desired quality (Wood 2007). The treatment of CoQ in the clinical laboratory is a helpful template for evaluating the nonclinical laboratory with respect to both quality failure and quality achievement costs (Berte 2012a; Wood 2007; Carlson et al. 2012; Feigenbaum 1991). Models subdivide quality costs into categories relevant to assessing research laboratory costs: prevention, appraisal, internal failure, and external failure. For example, prevention costs include quality management planning, process validation, training, preventive maintenance, and process improvement measures, while appraisal costs are those associated with verifying that the desired quality is being achieved, such as periodic proficiency testing, instrument calibration, quality control, and internal inspections/audits. Both prevention and appraisal costs are part of the cost of achieving quality, while internal and external costs are associated with failure occurring before or after delivery of a product/service, respectively. For example, obtaining a contaminated sample that must be collected again is an internal cost, while recalling released results is an external cost (Berte 2012a).
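Stated compactly in our own notation, the model just described decomposes the total cost of quality as

\[
\mathrm{CoQ} \;=\; \underbrace{C_{\text{prevention}} + C_{\text{appraisal}}}_{\text{cost of achieving quality}} \;+\; \underbrace{C_{\text{internal failure}} + C_{\text{external failure}}}_{\text{cost of failure}}.
\]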

In calculating quality failure costs in preclinical research, using reproducibility as an indicator of variation/quality and a modest estimate of a 50% irreproducibility rate, this cost is estimated to exceed 28 billion dollars in the United States annually (Freedman et al. 2015). Of note, a lack of agreement on the definition of reproducibility was noted in these analyses; the term covers both variation in results and the ability to carry out replication using the available methods, and it does not imply that irreproducible studies were of zero value (Freedman et al. 2015; Begley and Ellis 2012; Mullard 2011; Ioannidis et al. 2009). Additional failure costs are publication retractions and the time/resources consumed investigating confounding results and repeating experiments, resources that are consequently unavailable for pursuing biological questions and extending findings (Baker 2016b), ultimately contributing to delayed development of effective therapies (Freedman and Mullane 2017). Cost of quality models may also include “intangible” costs associated with quality failure, such as lost opportunities (Schiffauerova and Thomson 2006). As described above, one such cost is the loss of stakeholder confidence in biomedical research and, conceivably, opportunities lost because of unreliable results. Quality deficiencies may also cause undue damage to workplace morale, for example, if deficiencies result in blame being placed on personnel when a lack of quality management processes, or ineffective processes, are actually at fault (Berte 2012a). In implementing measures to achieve quality, intangible costs may also be incurred if quality achievement processes are perceived as resulting from distrust in personnel competency.

Estimates from clinical laboratories indicate that approximately 35% of total operating costs are associated with CoQ, with quality failure and quality achievement accounting for 25% and 10% of the total, respectively (Menichino 1992). Although estimates specific to preclinical research are lacking, in this laboratory example the cost of failure exceeds the cost of achieving. This makes investment in processes that promote quality (prevention, appraisal) particularly valuable, since these typically require less investment than quality failures and, if implemented effectively, reduce the total CoQ by reducing the cost of failure. Furthermore, the costs of maintaining an effective quality management system would likely decrease over time as corrective and preventive actions are implemented (Berte 2012a, b). We estimate that the initial implementation costs to meet GLP requirements totaled $623,000 USD over a 3-year period. These costs of achieving quality can be divided broadly into three categories: (1) internal staff/administrative costs related to establishing written methods for standardizing laboratory and other facility operations (52% of the costs); (2) appraisal costs, which include gap analyses/external consulting and training of internal QA associates (47% of the costs); and (3) equipment maintenance and testing (1% of the costs). Of note, these categories were difficult to partition strictly among appraisal, prevention of quality failure, and equipment. For example, standardization reagents were used both to prevent failure, when used for equipment maintenance activities, and for appraisal, when used to test equipment. Personnel effort (accounted for as internal staff/administrative costs) also crosses categories, since it includes both time spent standardizing equipment testing criteria (to prevent failure) and time spent evaluating conformance to those criteria (appraisal). Thus, there may be some flexibility in the estimated contribution of each category to the overall cost.
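For orientation only, applying the stated percentages to the reported total gives roughly the following split (our rounding; exact dollar figures were not reported separately):

\[
\begin{aligned}
\text{internal staff/administration} &\approx 0.52 \times \$623{,}000 \approx \$324{,}000\\
\text{appraisal (consulting, QA training)} &\approx 0.47 \times \$623{,}000 \approx \$293{,}000\\
\text{equipment maintenance/testing} &\approx 0.01 \times \$623{,}000 \approx \$6{,}000
\end{aligned}
\]

In the clinical laboratory benchmark cited above, failure accordingly accounts for about 25%/35%, i.e., roughly 71%, of the total CoQ.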

Although implementation costs were high, they were met with gained opportunities. Research funding from industry contracts during and after the establishment of our GLP-compliant infrastructure increased by 109%, from $3.5 million USD in total support in the 6 years preceding its existence to $7.3 million USD in total support over the following 6-year period. While difficult to quantify directly, our facility’s collaborative opportunities were also broadened through increased industry recognition, owing to our experience using a QM framework familiar to the pharmaceutical/device industry. As would be predicted from other models (Berte 2012a, b), we experienced a sharp decline in costs after having established our QM framework. For example, our total annual maintenance cost represents, on average, only about 9% of the total implementation cost. Maintenance costs were reduced across all three categories, though most dramatically in costs related to appraisal/external consulting and establishing a QAU, which now represent approximately 1% of the initial appraisal costs. Of note, our initial appraisal costs, which were directed toward full compliance with GLP regulations, involved identifying and working with specialized external consultants with regulatory expertise. In research settings where quality needs do not engage GLP regulations, other options may be available, including utilizing internal personnel for appraisal activities.

Costs incurred due to quality failures are difficult to quantify or even estimate; however, the further downstream in the research process a quality failure is exposed, the greater the cost (Campanella 1999). For example, if a quality failure (represented in preclinical research by variation or irreproducibility in the results) is realized after publication, the cost of failure is highest, requiring all of the research costs to be borne again (Berte 2012a). Additional intangible costs are potentially incurred if the reported results were used as the scientific premise for further research, an area that has been described clearly by the NIH (Collins and Tabak 2014). Therefore, allocating resources to support the costs of attaining quality early in the research process (prevention and appraisal) is a worthwhile investment that, over time, reduces the CoQ by reducing the cost of failure. Resources spent on services/reagents to perform preventive maintenance and testing of equipment are examples of such quality attainment costs. As outlined in Table 1, the GLP regulations describe several requirements that support achieving quality and that are included in a consideration of CoQ (Nonclinical Laboratory Studies: Good Laboratory Practice Regulations 1978; Berte 2012a).

3.7 Payoffs/Benefits

In short, the major benefits of investing in quality in the research laboratory include recovering the resources otherwise lost to variation or reproducibility failures, as discussed above. These involve both the costs incurred in repeating the research process in the case of irreproducibility and the intangible costs: greater certainty in extending research findings based on a solid scientific premise, regained opportunities, and restored stakeholder confidence. Prioritizing quality in preclinical research supports a more transparent research process with greater efficiency in developing therapies and furthering the understanding of health and disease. As described above, complying with research standards such as the GLP regulations is costly but may offer a competitive advantage by facilitating mutually beneficial industry partnerships earlier and more readily. Bridging this gap in translational research is of particular relevance in the academic healthcare center environment, which has the unique opportunity to utilize expertise and innovative technology platforms specific to academia (Stewart et al. 2016; Hayden 2014; Tuffery 2015; Cuatrecasas 2006) within a framework that can support continuity from researcher-initiated nonclinical studies through clinical trials (Adamo et al. 2014; Slusher et al. 2013; Yokley et al. 2017).

3.8 Lessons Learned/Outlook

Applying CoQ models to current estimates of the financial costs associated with irreproducible research makes a convincing case for reevaluating existing research practices. Exposure to regulatory guidelines and more general training in quality management in the academic research environment has been well received as a supplement to the research trainee curriculum, potentially broadening career opportunities. Following training workshops, we have received positive feedback, particularly regarding the benefit to research trainees, and some faculty have requested further individual consultation to identify quality needs and better implement voluntary quality practices in their laboratories. Researchers have also shown reluctance to implement a QMS when only limited resources can be allocated to such efforts. It is not uncommon for senior researchers to initially seek to implement only the minimum requirements needed to meet funding agency/journal expectations; however, we are optimistic that the benefits will be realized with time. With this in mind, in our consultation process we value incremental progress in implementing quality management processes and welcome feedback and customized, researcher-initiated solutions to quality management needs. This is the approach we have taken in our ongoing consultation with UK research core facilities.

In our experience, it is clear that support from higher administration is necessary to guide these collective efforts. Our center received early support from the institutional administration (Dean’s Office, College of Medicine) to supplement initial implementation costs. Additionally, ongoing administrative support from the Vice-President for Research has been indispensable in outlining quality indicators, initiating meaningful cultural change, and supporting the infrastructure necessary for continued achievement and assessment of quality. We believe our GLP experience strongly supports recent reproducibility initiatives from major funding agencies (e.g., NIH) and thus the shared interest of university leadership and researchers seeking to make meaningful contributions to the existing body of knowledge through their research efforts.

4 Conclusions: Investing in Quality

In this chapter, we have described our experience with implementing a QMS in research practice, using ISO 9001 and GLP as examples, and how the numerous benefits outweigh the initial effort. Existing concerns and evidence of quality failures necessitate collaborative measures to improve the value of preclinical research. The costs associated with quality failures are considerable and often compound along the research process, leading to repeatedly incurred research costs and/or the dissemination of misinformation and of experiments based on unreliable results. In contrast, investment in an effective QMS provides considerable benefits and returns the costs many times over – recovering failure costs and expanding opportunities for the scientific community (see Table 2). In summary, although complete agreement on and rapid adoption of quality standards across the scientific community seem hard to achieve, using the general quality principles and processes that are already available is not only a viable option but a cornerstone of promoting quality and reproducibility, effectively reducing the cost of quality.

Table 2 Major benefits of implementing quality management