Scalable Big Data Analytics for Protein Bioinformatics

Efficient Computational Solutions for Protein Structures

  • Dariusz Mrozek

Part of the Computational Biology book series (COBO, volume 28)

Table of contents

  1. Front Matter
  2. Background

  3. Cloud Services for Scalable Computations

  4. Big Data Analytics in Protein Bioinformatics

  5. Multi-threaded Solutions for Protein Bioinformatics

  6. Back Matter

About this book


This book focuses on proteins and their structures. It describes scalable solutions for protein structure similarity searching, carried out at the main representation levels of protein structures, and for the prediction of 3D protein structures. Emphasis is placed on techniques that accelerate similarity searches and protein structure modeling.

The book is divided into four parts. The first part provides background information on proteins and their representation levels, including a formal model of a 3D protein structure used in computational processes, and a brief overview of the technologies used in the solutions presented in the book. The second part discusses cloud services used to develop scalable and reliable cloud applications for 3D protein structure similarity searching and protein structure prediction. The third part shows how scalable Big Data computational frameworks, such as Hadoop and Spark, are used for massive 3D protein structure alignments and for the identification of intrinsically disordered regions in protein structures. The fourth part focuses on GPU-accelerated searching for 3D protein structure similarities, and on the use of multithreading and relational databases for efficient approximate searching on protein secondary structures.

The book introduces advanced techniques and computational architectures that benefit from recent achievements in computing and parallelism. Recent developments in computer science allow algorithms previously considered too time-consuming to be used efficiently in bioinformatics and the life sciences. Given its depth of coverage, the book will be of interest to researchers and software developers working in structural bioinformatics and biomedical databases.


  • Amino Acid Sequence
  • Bioinformatics
  • Cloud Computing
  • GPU, CUDA
  • Multi-Agent Systems
  • Multithreaded Processing
  • Parallel Processing
  • Protein Structure
  • Proteins

Authors and affiliations

  1. Silesian University of Technology, Gliwice, Poland
