adLIMS: a customized open source software that allows bridging clinical and basic molecular research studies
Many biological laboratories that deal with genomic samples face the problem of sample tracking, both for pure laboratory management and for efficiency. Our laboratory exploits PCR techniques and Next Generation Sequencing (NGS) methods to perform high-throughput integration site monitoring in different clinical trials and scientific projects. Because of the large number of samples that we process every year, which result in hundreds of millions of sequencing reads, we need to standardize data management and tracking with a scalable and flexible system that offers web-based interfaces; such a system is usually called a Laboratory Information Management System (LIMS).
We started by collecting end-users' requirements, composed of desired functionalities of the system and Graphical User Interfaces (GUI), and then we evaluated available tools that could address our requirements, spanning from pure LIMS to Content Management Systems (CMS) up to enterprise information systems. Our analysis identified ADempiere ERP, an open source Enterprise Resource Planning (ERP) system written in Java (J2EE), as the best software; it also natively implements some highly desirable technological advances, such as high usability and a modularity that grants high use-case flexibility and software scalability for custom solutions.
We extended and customized ADempiere ERP to fulfil LIMS requirements and developed adLIMS. It has been validated by our end-users, who verified functionalities and GUIs through test cases for PCR samples and pre-sequencing data, and it is currently in use in our laboratories. adLIMS implements authorization and authentication policies, allowing multi-user management and role definitions that grant specific permissions, operations and data views to each user. For example, adLIMS allows creating sample sheets from stored data using available exporting operations. This simplicity and process standardization may avoid manual errors and information backtracking, features that are not granted using track recording on files or spreadsheets.
adLIMS aims to combine sample tracking and data reporting features with higher accessibility and usability of GUIs, thus allowing time to be saved on doing repetitive laboratory tasks, and reducing errors with respect to manual data collection methods. Moreover, adLIMS implements automated data entry, exploiting sample data multiplexing and parallel/transactional processing. adLIMS is natively extensible to cope with laboratory automation through platform-dependent API interfaces, and could be extended to genomic facilities due to the ERP functionalities.
Keywords: LIMS, Open Source Software, Information Systems, ADempiere ERP, Sample Tracking
In many biological laboratories, sample tracking is an outstanding issue and often represents a bottleneck for the correct handling and interpretation of experimental data. This issue becomes particularly critical when automation and high-throughput technologies are introduced into laboratory practice. Our laboratory performs high-throughput characterization of vector-genomic integration sites in the context of gene therapy applications based on the delivery of therapeutic genes by viral vectors that stably integrate into the genome of targeted cells, as well as gene therapy preclinical models and insertional mutagenesis research projects [1, 2, 3, 4, 5, 6, 7, 8, 9]. Vector integration sites are retrieved and mapped in the genome through a combination of Polymerase Chain Reaction (PCR)-based techniques, next generation sequencing (NGS) and bioinformatics analyses. We process and analyze around 2000 samples/year, resulting in hundreds of millions of sequencing reads. Although adopting robotic automation for sample manipulation in our laboratory has provided many advantages in terms of manual error reduction and data production scalability, drawbacks related to sample information volume and tracking are still present. These reasons prompted us to develop a Laboratory Information Management System (LIMS) for sample tracking on a scalable and flexible infrastructure with an easily accessible, web-based interface. A LIMS is a type of information system implemented as a software utility specifically designed to improve data acquisition and sample monitoring along laboratory workflows and to support sample reporting. An information system is a combination of information technologies developed to ensure the efficiency and monitoring of business processes.
Enterprise Resource Planning (ERP) solutions are extended information systems that integrate the standard information system features with accounting and administrative operations for performance monitoring through dashboards and data mining tools.
In this work we describe our LIMS, developed on an existing open source ERP framework that natively implements all required technological functionalities, such as software customization and parameterization. After a brief introduction to the ERP framework and the motivation for this specific choice, we describe in detail our implementation with custom use cases and scenarios derived from our laboratory requirements and experience.
We followed a typical software engineering approach, "waterfall", to design our solution. We first collected end-users' requirements, composed of desired functionalities of the system and graphical user interfaces (GUI). Then we evaluated available tools that could address our requirements, spanning from pure LIMS to Content Management Systems (CMS) up to enterprise information systems. In the next step we selected our candidate solution and designed its configuration to best fulfil our requirements.
The analysis of requirements, also called requirements engineering, is the process of acquiring software expectations from users/clients in terms of functionalities and interactions, which are then translated into software requirements.
User's Rules in "Create, Read, Update and Delete" format (CRUD). The table is a summary of the role permissions and policies in adLIMS in terms of data access. We use the CRUD syntax: create (C), read (R), update (U), delete (D).
The SampleManager role can access and edit metadata related to all clinical and preclinical aspects such as anonymized patients' IDs, DNA (concentrations, volume, and so on) and cell types (markers, lineage, and so on). A user with the SampleManager role sees sample metadata only and cannot read information produced by downstream sample processing.
The WetManager role is able to control all aspects related to experimental procedures and workflows on the samples. A user assigned to the WetManager role will not be able to modify input metadata from the SampleManager role.
We added an administrator role (or SuperUser) that is able to control and edit every component of the system, customizing all the functional and graphical levels related to the model-view-controller paradigm. However, this role is not authorized to modify input data. At the functional level, the administrator has the privileges to (1) create, delete, and suspend accounts and (2) define rules for accessing the database (tables, views, fields, etc.). At the graphical level, the administrator can create and edit windows and layouts, control how data are visualized and accessed, and define how data are exported from and imported into the information system.
As additional requirements, the system has to be web-based and supported by reliable technologies with a backup system, thus enabling data maintenance and recovery in case of failures (e.g. electrical supply problems). The centralized nature of the system requires standard hardware performance, such as supporting the simultaneous interaction of dozens of users while keeping response times below 3 seconds.
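The role policies described above can be illustrated with a small permission check. This is only a sketch: the original permission table is not reproduced here, so the data areas (`sample_metadata`, `experiment`, `accounts`, `layout`) and the CRUD letters assigned to each role are assumptions inferred from the prose, and the code does not use any ADempiere API.

```python
# Hypothetical policy table: role -> data area -> granted CRUD letters.
# The concrete letters are assumptions derived from the role descriptions.
POLICY = {
    "SampleManager": {"sample_metadata": "CRUD", "experiment": ""},
    "WetManager": {"sample_metadata": "R", "experiment": "CRUD"},
    "Administrator": {
        "sample_metadata": "R",  # may view, but not modify, input data
        "experiment": "R",
        "accounts": "CRUD",
        "layout": "CRUD",
    },
}

VALID_ACTIONS = ("C", "R", "U", "D")


def is_allowed(role: str, area: str, action: str) -> bool:
    """Return True if the role holds the given CRUD action on the data area."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Unknown roles or areas fall back to an empty permission string.
    return action in POLICY.get(role, {}).get(area, "")
```

With this table, `is_allowed("WetManager", "sample_metadata", "U")` is false, which mirrors the rule that a WetManager cannot modify metadata entered by a SampleManager.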
Evaluation of available LIMS and alternative solutions
We evaluated available LIMS solutions, from commercial to open source ones. Commercial or stand-alone LIMSs [18, 19] are often very expensive and/or lack the flexibility and scalability needed to manage different types of sample data, procedures and analyses specifically designed for each research project. We also explored open source software designed for biological laboratories, such as Bika LIMS, LabKey or Galaxy, and content management systems (CMS) like Plone and Drupal, with a view to customizing them and exploiting built-in functionalities such as user management, workflow management and configuration. Unfortunately, none of them fully satisfied our requirements because most of the features that characterize a LIMS (such as export and reporting) were not implemented. In this context, we analyzed Enterprise Resource Planning (ERP) solutions, and ADempiere ERP was the best available software. ADempiere ERP has been developed under the GPL license in Java J2EE with the Model-View-Controller design pattern and database-driven logic. ADempiere ERP implements all required features and presents some highly desirable technological advances (see Additional file 2 for the list of desirable features with a comparative analysis among Bika LIMS, LabKey and ADempiere), such as high usability (web and mobile interfaces) and a modularity that grants high use-case flexibility (plug and play approach) and software scalability for custom solutions (adaptable to all use cases). The application server is JBoss and the supported databases are Oracle and PostgreSQL. Web interfaces are built with ZK. Moreover, it natively supports multiple languages, accounting procedures and dashboards for process monitoring thanks to the ERP functionalities, which can be easily adapted to high-standard industrial and commercial contexts.
The last process in the "waterfall" approach is the system design. In order to translate the requirements into both functionalities and GUI within ADempiere ERP, we developed an extension of the core database by applying the required operations and user policies, resulting in a scalable system that natively supports a Java fat client and web interfaces. For each end-user interaction and functionality, we designed a custom view of the workflow (Figure 2) with dedicated GUIs based on use cases (Figure 1). Since ADempiere ERP is database-driven, the design of the database related to the LIMS extension is a key aspect that drives and manages both the workflow instance and the GUIs. Our database extension is compliant with the core-system table design, which required the addition of ten pre-defined fields (see Additional file 3). This operation is required because ADempiere ERP relies on internal tables to directly create forms (windows) and GUIs. The "application dictionary" is one of the most powerful aspects of ADempiere ERP and acts as the engine of the database-driven model. All metadata needed to build data forms, windows and GUI are contained in the application dictionary, which operates at the application layer and generates windows, tabs, menus, forms, connections between nested elements, and so on. The application dictionary allows dynamic and flexible changes in GUI and data forms by changing its table content, which drastically reduces the need for programming. As a direct consequence, GUI changes or customizations can be configured directly in the application dictionary without requiring software compiling or re-building. For example, to create a new window with proper title, menu bar, tool bar, and status bar, ADempiere ERP automatically adds elements in the application dictionary and generates all required fields starting from the database table.
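The idea of a metadata-driven GUI can be sketched in a few lines. The example below is a simplified illustration and not ADempiere's actual application dictionary: the `FieldDef` structure, the `DICTIONARY` content and the `build_window` function are hypothetical names, and in a database-driven system the metadata would live in tables rather than in source code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FieldDef:
    column: str             # database column name
    label: str              # text shown next to the input
    widget: str             # e.g. "text" or "number"
    displayed: bool = True  # whether the field appears in the window


# Hypothetical dictionary content describing one custom table.
DICTIONARY: Dict[str, List[FieldDef]] = {
    "sample": [
        FieldDef("name", "Sample name", "text"),
        FieldDef("dna_concentration", "DNA concentration", "number"),
        FieldDef("internal_note", "Internal note", "text", displayed=False),
    ],
}


def build_window(table: str) -> dict:
    """Derive a window description purely from the dictionary metadata."""
    visible = [f for f in DICTIONARY[table] if f.displayed]
    return {
        "title": table.capitalize(),
        "fields": [(f.label, f.widget) for f in visible],
    }
```

Setting `displayed` to true on the third entry changes the generated window without touching `build_window`, which is the property the text attributes to dictionary-driven configuration.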
Based on the general workflow modeled in the analysis of requirements (Figure 2), we designed and implemented the LIMS database as an extension of the core ADempiere ERP database (Additional file 4), reported as an Entity-Relationship model in which each basic entity is associated with a custom table ("project", "subject", "DNA", "vector", "sample", and so on). We implemented the model in PostgreSQL and used BLOBs (Binary Large Objects) to manage external files as attachments (such as images, pdf files, and so on) that users can upload into any entry. We then created custom GUIs related to the previously described workflow for the management of all data tables (Figure 2). Each window and the related data are accessed by users according to their role (SampleManager, WetManager and administrator role) and authorization policies. It is always possible to modify existing roles and names and to add new roles according to new requirements or specifications. In our laboratory practice we routinely collect LAM-PCR data, gel images and sequencing quality reports and we store these data in adLIMS. The LAM-PCR workflow has been automated by implementing dedicated tables in the system ("experiment", "lam_pcr_linear", "lam_pcr_1st_exp", "lam_pcr_2nd_exp") with corresponding GUI input forms. Custom database triggers support the multiplexing of samples during the experimental operations required in different steps of the process (see Additional file 3 for trigger details). Similar automated procedures have been developed to support the generation of sequencing pools for high-throughput NGS platforms by defining custom tables ("fusion", "pool", "pool_details") and associated GUI. To avoid hard drive bottlenecks, raw NGS files are included in the system as links (absolute path with host server) and not as attachments.
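As a rough illustration of how a database trigger can automate data entry across workflow steps, the sketch below uses SQLite (through Python's standard `sqlite3` module) instead of PostgreSQL so that it runs without a server. The trigger logic is an assumption made for illustration, since the actual triggers are described only in Additional file 3: the `experiment_sample` table, all column names and the `assign_samples` helper are hypothetical, while "experiment" and "lam_pcr_linear" are table names taken from the text.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE experiment (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE experiment_sample (
    experiment_id INTEGER NOT NULL REFERENCES experiment(id),
    sample_id INTEGER NOT NULL REFERENCES sample(id)
);
CREATE TABLE lam_pcr_linear (
    id INTEGER PRIMARY KEY,
    experiment_id INTEGER NOT NULL REFERENCES experiment(id),
    sample_id INTEGER NOT NULL REFERENCES sample(id)
);
-- Assumed behaviour: assigning a sample to an experiment automatically
-- creates the matching linear-PCR row, so nobody types it by hand.
CREATE TRIGGER add_linear_pcr AFTER INSERT ON experiment_sample
BEGIN
    INSERT INTO lam_pcr_linear (experiment_id, sample_id)
    VALUES (NEW.experiment_id, NEW.sample_id);
END;
"""


def assign_samples(conn, experiment_name, sample_names):
    """Create an experiment and assign newly registered samples to it."""
    cur = conn.execute("INSERT INTO experiment (name) VALUES (?)", (experiment_name,))
    experiment_id = cur.lastrowid
    for name in sample_names:
        sample_id = conn.execute(
            "INSERT INTO sample (name) VALUES (?)", (name,)
        ).lastrowid
        conn.execute(
            "INSERT INTO experiment_sample (experiment_id, sample_id) VALUES (?, ?)",
            (experiment_id, sample_id),
        )
    return experiment_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
exp_id = assign_samples(conn, "EXP-001", ["S1", "S2", "S3"])
rows = conn.execute(
    "SELECT sample_id FROM lam_pcr_linear WHERE experiment_id = ? ORDER BY sample_id",
    (exp_id,),
).fetchall()
```

After the three assignments, `rows` holds one linear-PCR entry per sample, all created by the trigger rather than by the calling code.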
Results and Discussion
In our laboratory practice, adLIMS has simplified data entry in sample workflows and sample reporting and made it more traceable than traditional data storage methods. As an example, adLIMS allows users to create a sample sheet by selecting sample information in a few clicks, whereas without adLIMS this activity may require hours and may introduce typing errors.
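A sample sheet export of this kind amounts to filtering stored records and writing the selected columns to a delimited file. The following sketch shows the general idea with Python's standard `csv` module; the record fields (`sample`, `project`, `pool`, `barcode`), the example values and the `sample_sheet` function are hypothetical and do not reflect the adLIMS schema or its export format.

```python
import csv
import io

# Hypothetical stored records; the values are made up for illustration.
SAMPLES = [
    {"sample": "S1", "project": "P1", "pool": "POOL-A", "barcode": "ACGT"},
    {"sample": "S2", "project": "P2", "pool": "POOL-B", "barcode": "GGCA"},
    {"sample": "S3", "project": "P1", "pool": "POOL-A", "barcode": "TTAG"},
]


def sample_sheet(records, pool, columns=("sample", "project", "barcode")):
    """Return a CSV sample sheet for every record that belongs to the pool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        if record["pool"] == pool:
            writer.writerow(record)  # keys outside `columns` are ignored
    return buffer.getvalue()


sheet = sample_sheet(SAMPLES, "POOL-A")
```

Because the sheet is generated from stored records, no value is retyped by hand, which is where the reduction in typing errors comes from.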
adLIMS can be adapted and extended to any laboratory working with biological samples with limited efforts, which are mainly related to the database design and customization, thus defining the business and view logics (workflows and GUI) and creating the data structures (PostgreSQL or Oracle database schema and tables) that are automatically converted into interfaces and data views by the ADempiere ERP engine.
adLIMS currently does not provide direct interfaces with laboratory instruments but integrates local server connections (storage mount access, network printers, and so on). Future extensions of the system will include API integration of laboratory platforms for sequencing, PCR product quantification and liquid handlers. On the other hand, client side interfaces (such as barcode readers and scanners) can be directly integrated into adLIMS, enhancing sample tracking and data acquisition.
The potential of adLIMS for genomic companies and facilities lies in the ready-to-use ERP features, which would only require customization.
Currently, in many laboratories the procedures for data tracking and storage of sample information are based on spreadsheets without real information management or standardization. This kind of data management results in inefficiencies and redundancies, potentially generating many errors (typically typos) that are hard to backtrack or resolve. The use of a LIMS allows bypassing spreadsheets or local file management, supporting automation of all standard procedures for sample tracking. With adLIMS, our laboratory has addressed the critical issues of sample tracking, data standardization and automation arising from the NGS and robotics revolution. This successful application of process engineering and monitoring allowed our laboratory to increase efficiency and reduce manual errors, thus laying concrete foundations to sustain business scalability and potentially to approach pharmaco-vigilance monitoring of gene therapy patients in a highly standardized fashion compliant with regulatory requirements. Moreover, adLIMS can natively be extended to incorporate ERP solutions, such as CRM, supply chain management, billing and accounting, integrated features that are critical for many genomics facilities.
Availability and requirements
The source code, user guide and appliance of adLIMS are freely available at the project homepage http://sourceforge.net/projects/adlims. We also provide a live demo for users who want to evaluate adLIMS without installation. Release notes and other information will also be updated on the project homepage.
Project name: adLIMS
Project homepage: http://sourceforge.net/projects/adlims
Author: Giulio Spinozzi
Operating system: Platform independent
Supported web browsers: Chrome, Firefox 3.5 +, Safari 4+
Programming language: Java EE
License: GNU GPLv2
Any restrictions to use by non-academics: None
This work was supported by Telethon Grant D1 to EM. We would like to thank Giorgio Cafasso for his support with ADempiere ERP and for his kind availability. We thank Luca Biasco for suggestions on the LAM-PCR workflow set-up.
Publication of this article has been funded by Telethon.
This article has been published as part of BMC Bioinformatics Volume 16 Supplement 9, 2015: Proceedings of the Italian Society of Bioinformatics (BITS): Annual Meeting 2014: Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/16/S9.
- 1. Ranzani M, Annunziato S, Calabria A, Brasca S, Benedicenti F, Gallina P, Naldini L, Montini E: Lentiviral vector-based insertional mutagenesis identifies genes involved in the resistance to targeted anticancer therapies. Mol Ther. 2014, 22 (12): 2056-2068. 10.1038/mt.2014.174.
- 6. Deichmann A, Hacein-Bey-Abina S, Schmidt M, Garrigue A, Brugman MH, Hu J, Glimm H, Gyapay G, Prum B, Fraser CC, et al: Vector integration is nonrandom and clustered and influences the fate of lymphopoiesis in SCID-X1 gene therapy. J Clin Invest. 2007, 117 (8): 2225-2232. 10.1172/JCI31659.
- 15. Systems Engineering Fundamentals. 2001, Defense Acquisition University.
- 16. Booch G, Jacobson I: The Unified Modeling Language for Object-Oriented Development. 1996.
- 17. Krasner G, Pope S: A description of the model-view-controller user interface paradigm in the smalltalk-80 system. 1988.
- 18. Clarity LIMS. [http://www.genologics.com/claritylims]
- 19. MiniLIMS. [http://bioteam.net/minilims]
- 20. bika: Open Source laboratory information management systems. [http://www.bikalabs.com]
- 21. LabKey. [https://labkey.com]
- 23. Plone CMS: Open Source Content Management. [http://plone.org]
- 24. Drupal - Open Source CMS. [https://drupal.org]
- 26. Wirfs-Brock R, Wilkerson B: Object-oriented design: a responsibility-driven approach. OOPSLA. 1989.
- 27. ZK. [http://www.zkoss.org]
- 28. JasperReport. [http://www.jaspersoft.com]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.