Advanced R, pp 141-157

Data Munging with data.table

  • Matt Wiley
  • Joshua F. Wiley


We already introduced the data.table package (Dowle, Srinivasan, Short, and Lianoglou, 2015). That package is at the heart of this chapter, which covers the basics of accessing, editing, and manipulating data under the broad term data management. Although not glamorous, data management is a critical first step toward data visualization or analysis. Furthermore, data management often accounts for the majority of the time spent on an analysis project. For example, running a linear model in R takes one line of code once the data is clean and in the expected format. Data management is challenging because raw data comes in all types, shapes, and formats, and missing data is common. You may also have to combine or merge separate data sources. In this chapter, we go beyond the basic use of data.table to more-complex data management tasks.


Keywords: Data Table; Regular Expression; Refuse Treatment; Approximate String Match; Fuzzy Match

Copyright information

© Matt Wiley and Joshua F. Wiley 2016

Authors and Affiliations

  • Matt Wiley (1)
  • Joshua F. Wiley (1)
  1. Elkhart Group Ltd. & Victoria College, Columbia City, USA
