Command-Line Tools in Linux for Handling Large Data Files

  • Deepti Mishra
  • Garima Khandelwal


Linux is a freely available Unix-like operating system that is mostly used through a command-line interface. It is frequently used for processing large data files of various types, and most of the software used in genomics, proteomics, and bioinformatics is developed to run on it. This chapter explains the hierarchical structure of the Linux operating system, along with its file types and the commands used for file and process handling. It also describes Vi/Vim, one of the most common text editors in Linux. The Vi editor has multiple modes of operation, most of which are explained in detail. Vi is used for editing files or writing code in a programming language. In addition, multiple command-line tools are available for editing files directly from the terminal. One of these tools, Awk, which is itself an interpreted programming language generally used for data and text processing, is discussed with examples of manipulating common data files as well as sequence data files.


Unix, Linux, Operating system, Vi, Awk, Text processing



We would like to thank our teacher, Dr. Asheesh Shanker, for giving us the opportunity to contribute towards this book and our friend Dr. Vasantika Singh for her helpful suggestions.


Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Deepti Mishra (1)
  • Garima Khandelwal (2)
  1. Institute of Plant Molecular Biology, Biology Centre of the Academy of Sciences, České Budějovice, Czech Republic
  2. Cancer Research UK Manchester Institute, The University of Manchester, Manchester, UK
