r - Aggregate info by quarters in data.table, setting a new name to column used in by -
i have data.table result aggregating bigger one:
data.table(period = c('2018.01', '2018.02'), sales = c(8850, 7950), qty = c(650, 650)) period sales qty 1: 2018.01 8850 650 2: 2018.02 7950 650
what need achieve , can't way it, aggregate info quarters, result be:
data.table(period = c('2018.01', '2018.02', '2018q1', '2018'), sales = c(8850, 7950, 16800, 16800), qty = c(650, 650, 1300, 1300)) period sales qty 1: 2018.01 8850 650 2: 2018.02 7950 650 3: 2018q1 16800 1300 4: 2018 16800 1300
i've tried with: dt = rbind(dt, dt[, lapply(.sd, sum), = .(period), .sdcols = c('sales', 'qty')])
but duplicated columns:
period ums men 1: 2018.01 8850 650 2: 2018.02 7950 650 3: 2018.01 8850 650 4: 2018.02 7950 650
besides, i'd need rename period cell q1 (q2, q3, q4) quarters , year total. how done?
edit
although accepted answer correct, i've reworked didn't need add columns nor install new libraries:
dt = data.table(period = c('2018.01', '2018.02'), sales = c(8850, 7950), qty = c(650, 650)) dt$period = as.double(str_replace(dt$period, "\\.", "")) ints = setinterval(2018) dt = dt[, lapply(.sd, sum), = .(period = cut(period, breaks = ints$i, labels = ints$q)), .sdcols = c('sales', 'qty')] dt = rbind(dt, dt[period %in% ints$q, lapply(.sd, sum), = .(period = '2018'), .sdcols = c('sales', 'qty')], fill = t) dt$period = paste(substr(dt$period, 1, 4), ".", right(dt$period, 2), sep = "") dt = rbind(dt, dt)
i needed create auxiliary function:
setinterval = function (year) { y = year * 100 return (list( = c(y, y + 3, y + 6, y + 9, y + 12), q = paste(year, '.', c('q1', 'q2', 'q3', 'q4'), sep = '') )) }
dt <- data.table(period = c('2018.01', '2018.02'), sales = c(8850, 7950), qty = c(650, 650)) library(zoo) dt$period_yq <- as.character(as.yearqtr(paste(dt$period, "01", sep="."), "%y.%m.%d")) dt$period_y <- strtrim(dt$period, 4) dt1 <- dt[,.sd,.sdcols=c(1:3)] dt2 <- dt[,lapply(.sd,sum), by="period_yq", .sdcols = c('sales', 'qty')] colnames(dt2) <- c('period','sales', 'qty') dt3 <- dt[,lapply(.sd,sum), by="period_y", .sdcols = c('sales', 'qty')] colnames(dt3) <- c('period','sales', 'qty') rbind(dt1,dt2,dt3)
hope help!
Comments
Post a Comment