r - Summing over constant calendar week interval -
i trying aggregate weekly data monthly data, looks this:
ups week ap 1111112016 1 385.22 1111112016 2 221.63 1111112016 3 317.47
there 132 different upcs , weeks indicated 1 - 52. however, vary across different upcs. in total have 4,027 rows. aggregate on 4 week interval until next upc category reached. have tried code:
z = aggregate(x$ap, by=list(x$upc, cut(x$week, breaks=13, lables = t)), fun = sum) colnames(z) = c("upc", "month", "ap") z = z[order(z$upc),]
i following output:
upc month ap 1 1111112016 (0.951,4.77] 1098.03 88 1111112016 (4.77,8.54] 1180.03 187 1111112016 (8.54,12.3] 491.18 303 1111112016 (12.3,16.1] 896.31
there several problems here: 1) month value wrong. have numerical value. (1 - 12) 2) first 2 aggregates correct, after sums seem correct , not.
here brief example of how data looks like:
dput(head(x)) structure(list(upc = c(1111112016, 1111112016, 1111112016, 1111112016, 1111112016, 1111112016), week = c(1, 2, 3, 4, 5, 6), ap = c(385.22, 221.63, 317.47, 173.71, 269.55, 311.48)), .names = c("upc", "week", "ap"), row.names = c(na, 6l), class = "data.frame")
would work (where data dataframe):
require(data.table) "ap"), row.names = c(na, 6l), class = "data.frame") setdt(data) result <- data[, .(ap=sum(ap, na.rm = t)), = .(upc, month = (floor(week/ 4.34) + 1))] result <- result[order(upc)]
and result be:
upc month ap 1: 1111112016 1 1098.03 2: 1111112016 2 581.03
Comments
Post a Comment