Command-Line Tools in Linux for Handling Large Data Files
Linux is a freely available, Unix-like operating system that is used mostly through a command-line interface. It is frequently used for processing large data files of various types, and most of the software used in genomics, proteomics, and bioinformatics is developed to run on it. This chapter explains the hierarchical structure of the Linux operating system, along with its file types and the commands used for file and process handling. Vi/Vim, one of the most common text editors used in Linux, is also described. The Vi editor has multiple modes of operation, and most of them are explained in detail. Vi is used for editing files or writing code in a programming language. Additionally, multiple command-line tools are available for editing files directly from the terminal. One of these tools, Awk, which is itself an interpreted programming language generally used for data and text processing, is discussed with examples of manipulating common data files as well as sequence data files.
Keywords: Unix, Linux, Operating system, Vi, Awk, Text processing
We would like to thank our teacher, Dr. Asheesh Shanker, for giving us the opportunity to contribute towards this book and our friend Dr. Vasantika Singh for her helpful suggestions.