Updated: 2021-09-17
Some tips for data.table
that I came across while googling and that might be useful.
To print more rows than the default, use either:
options(datatable.print.topn = 70)
print(DT, topn = 70)
Using options()
applies the change globally. Another way to make printing nicer is to limit the printed width of character columns:
options(datatable.prettyprint.char = 80L)
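The difference between the one-off and the global setting can be sketched as follows. The table DT and its columns are made up for illustration, and the values 10 and 20 are arbitrary.

```r
library(data.table)

# Hypothetical table: 500 rows and one long character column
DT <- data.table(id = 1:500, note = strrep("a", 120))

# One-off: show the first and last 10 rows for this call only
print(DT, topn = 10)

# Session-wide: set both options, keeping the old values so they can be restored
old <- options(datatable.print.topn = 10, datatable.prettyprint.char = 20L)
out <- capture.output(print(DT))  # captured text of a print that uses the new options
options(old)                      # put the previous settings back
```

Saving the return value of options() is a convenient way to undo a global change once it is no longer needed.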
NA
Use nafill()
to replace NA
with a specified value: for example, nafill(dt, fill = 99)
replaces every NA
in dt with 99.
To fill with the previous or next value of the vector instead, specify type = "locf"
(last observation carried forward) or type = "nocb"
(next observation carried backward).
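A small sketch of the three fill types on a made-up numeric vector x. The setnafill() call at the end is the by-reference variant for a data.table; the table dt here is hypothetical.

```r
library(data.table)

x <- c(NA, 1, NA, NA, 4, NA)

fill_const <- nafill(x, type = "const", fill = 99)  # 99  1 99 99  4 99
fill_locf  <- nafill(x, type = "locf")              # NA  1  1  1  4  4
fill_nocb  <- nafill(x, type = "nocb")              #  1  1  4  4  4 NA

# setnafill() updates the columns of a data.table in place
dt <- data.table(a = c(1, NA, 3), b = c(NA, 5, NA))
setnafill(dt, type = "const", fill = 99)
```

Note that locf leaves a leading NA and nocb leaves a trailing NA, because there is no value to carry from.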
Joining with roll and foverlaps
For a detailed example you can read here. Basically, when you specify DT[dt, on = .(key_col), roll = TRUE]
or
roll = Inf,
each row of dt without an exact match is joined to the closest preceding key value in DT, that is, where dt >= DT.
Specifying
roll = -Inf
does the opposite and joins to the closest following value. roll = "nearest"
rolls both ways to the nearest value. You can also
give a finite number to limit how far a value rolls, e.g. roll = 2
carries a DT row forward only to dt values up to
key_col + 2.
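The four roll settings can be compared side by side on a toy example. Both tables and the label column are made up for illustration; only key_col comes from the text above.

```r
library(data.table)

# Hypothetical lookup table (DT) and query table (dt)
DT <- data.table(key_col = c(1, 5, 10), label = c("a", "b", "c"))
dt <- data.table(key_col = c(2, 6, 9, 11))

r_fwd  <- DT[dt, on = .(key_col), roll = TRUE]       # label: a, b, b, c
r_back <- DT[dt, on = .(key_col), roll = -Inf]       # label: b, c, c, NA
r_near <- DT[dt, on = .(key_col), roll = "nearest"]  # label: a, b, c, c
r_lim  <- DT[dt, on = .(key_col), roll = 2]          # label: a, b, NA, c
```

With roll = 2, the query value 9 gets NA because the closest preceding key (5) is more than 2 away. With roll = -Inf, the value 11 gets NA because no key follows it.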
Using foverlaps(DT, dt, type = "any")
joins DT and dt wherever their value ranges overlap.
When using foverlaps()
you need two key columns that mark the start and end of each range, e.g. key = c("col1", "col2").
In the example below, both range columns are key columns:
foverlaps(dt4, dt3, type = "any")
##    min_y max_y x dt4_y dt4_y_end
## 1:     0    10 c   5.7        10
## 2:    10    15 c   5.7        10
## 3:    10    15 a  11.9        13
## 4:    15    20 d  18.0        22
## 5:    20    30 d  18.0        22
## 6:    20    30 b  21.4        25
Specifying type = "within"
instead joins only the rows of the first table whose range lies entirely within a range of the second.
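The definitions of dt3 and dt4 are not shown, so the sketch below rebuilds them from the printed output. The values and the keying are a reconstruction and may differ from the original tables.

```r
library(data.table)

# Hypothetical reconstruction of dt3 and dt4 from the printed result
dt3 <- data.table(min_y = c(0, 10, 15, 20), max_y = c(10, 15, 20, 30))
dt4 <- data.table(x = c("c", "a", "d", "b"),
                  dt4_y = c(5.7, 11.9, 18.0, 21.4),
                  dt4_y_end = c(10, 13, 22, 25))

# Key each table on its own start and end columns
setkey(dt3, min_y, max_y)
setkey(dt4, dt4_y, dt4_y_end)

res_any    <- foverlaps(dt4, dt3, type = "any")     # any overlap counts
res_within <- foverlaps(dt4, dt3, type = "within")  # dt4 range must sit inside a dt3 range
```

If the reconstruction is right, res_any should match the six rows printed above. In res_within, the row for d (18 to 22) has NA in min_y and max_y, since that range straddles two dt3 ranges and fits inside neither.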
Create empty data.table
Sometimes I need to create an empty data.table
based on existing
column names. There are lots of other ways to do this, but the easiest is:
cols <- names(DT)
dt <- setnames(data.table(matrix(nrow = 0, ncol = length(cols))), cols)
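A runnable sketch of the approach above, using a made-up source table. It also shows one of the other methods: a zero-row subset, DT[0], which keeps the column types as well as the names.

```r
library(data.table)

# Hypothetical source table
DT <- data.table(id = 1:3, name = c("a", "b", "c"))

# Approach from the text: zero rows, same names, every column is logical
cols <- names(DT)
dt <- setnames(data.table(matrix(nrow = 0, ncol = length(cols))), cols)

# Alternative: a zero-row subset keeps both names and column types
dt_typed <- DT[0]
```

The matrix approach is enough when only the names matter. DT[0] may be the safer choice if rows will later be appended with rbind(), since the column types already match.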