A data.table can be subset with base R's subset(), which has an S3 method for data.table objects. An alternative is the recode approach mentioned on StackOverflow, which joins DT against a lookup list:

lookup <- list(v1 = 1:3, v2 = letters[5:7])
DT[lookup, on = names(lookup), nomatch = NULL]

This returns the rows of DT, with all of their columns, that match the lookup list. Setting nomatch = NULL (or nomatch = 0) drops the lookup entries that have no match in DT; with the default nomatch = NA they would come back as rows filled with NA.
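To make the effect of nomatch concrete, here is a minimal sketch on made-up data. Only the names DT, v1, v2 and lookup come from the snippet above; the val column and the values are assumptions chosen for illustration.

```r
library(data.table)

# Hypothetical example data; only DT, v1 and v2 are names from the post.
DT <- data.table(
  v1  = c(1L, 2L, 3L, 4L),
  v2  = c("e", "f", "g", "e"),
  val = c(10, 20, 30, 40)
)

# The third pair, (3, "x"), has no matching row in DT.
lookup <- list(v1 = 1:3, v2 = c("e", "f", "x"))

# Default nomatch = NA: the unmatched lookup pair comes back with val = NA.
with_na <- DT[lookup, on = names(lookup)]

# nomatch = NULL: the unmatched pair is dropped from the result.
matched <- DT[lookup, on = names(lookup), nomatch = NULL]
```

Here with_na has three rows and matched has two. The result follows the order of the lookup entries rather than the row order of DT.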

Another way to subset dynamically is shown below, although it is somewhat slower than the recode approach:

DT[DT[, all(mapply(`%in%`, .SD, lookup)), by = 1:nrow(DT), .SDcols=names(lookup)]$V1]
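One thing worth checking before swapping one approach for the other: as far as I can tell, the two do not always select the same rows. A list passed as i is treated as a table with one row per position, so the join looks up the pairs (1, "e"), (2, "f"), (3, "g"), while the %in% version tests each column independently. The sketch below uses made-up data (the val column and the values are assumptions) and also shows CJ() as one possible way to join on every combination of lookup values.

```r
library(data.table)

# Hypothetical data; the row (1, "f") is the one that separates the approaches.
DT <- data.table(
  v1  = c(1L, 2L, 1L, 4L),
  v2  = c("e", "f", "f", "e"),
  val = c(10, 20, 30, 40)
)
lookup <- list(v1 = 1:3, v2 = letters[5:7])

# Row-wise membership test, as in the one-liner above.
keep <- DT[, all(mapply(`%in%`, .SD, lookup)),
           by = 1:nrow(DT), .SDcols = names(lookup)]$V1
by_membership <- DT[keep]

# Join on the list itself: only the three positional pairs are looked up,
# so the row (1, "f") is not returned.
by_join <- DT[lookup, on = names(lookup), nomatch = NULL]

# Join on every combination of lookup values, built with CJ().
by_cross_join <- DT[do.call(CJ, lookup), on = names(lookup), nomatch = NULL]
```

On this data by_membership and by_cross_join contain the same three rows (in a different order), while by_join contains only two. If the lookup values are meant as independent filters per column, the cross join is the closer equivalent of the %in% version.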

A good read on data.table joins, and very relevant to this case, is the blog post by Scott Lyden.
