data munging - Building Sentences from a dataframe in R
I'm trying to generate sentences from a data frame. Below is the data frame:
    # code
    mycode <- c("aaabbb", "aaabbb", "aaaccc", "aaabbd")
    mycode <- sample(mycode, 20, replace = TRUE)

    # date
    mydate <- c("2016-10-17", "2016-10-18", "2016-10-19", "2016-10-20")
    mydate <- sample(mydate, 20, replace = TRUE)

    # resort
    myresort <- c("gb", "ie", "gr", "dk")
    myresort <- sample(myresort, 20, replace = TRUE)

    # number of holidaymakers
    holidaymakers <- sample(1000, 20, replace = TRUE)

    mydf <- data.frame(mycode, mydate, myresort, holidaymakers)
So if we take mycode as an example, I want to create a sentence like "For code mycode, the biggest destinations are myresorts, where the top days of visiting were mydate, with a total of holidaymakers".
Assume there are multiple lines per code. I want a single sentence per code, for example, instead of having one sentence per mydate and myresort, like:
"For code aaabbb, the biggest destinations are gb,gr,dk,ie where the top days of visiting were 2016-10-17,2016-10-18,2016-10-19 with a total of 650"
The 650 is the sum of holidaymakers across all countries and days per mycode.
Any help?
Thank you for your time.
You could try:
    library(dplyr)

    res <- mydf %>%
      group_by(mycode) %>%
      # one row per code: distinct dates, distinct resorts, total holidaymakers
      summarise(d = toString(unique(mydate)),
                r = toString(unique(myresort)),
                h = sum(holidaymakers)) %>%
      # assemble the sentence from the summarised columns
      mutate(s = paste("for code", mycode, "the biggest destinations are", r,
                       "where top days of visiting were", d, "with total of", h))
which gives:
    > res$s
    #[1] "for code aaabbb the biggest destinations are gb, gr, ie, dk where
    #     top days of visiting were 2016-10-17, 2016-10-18,
    #     2016-10-20, 2016-10-19 with total of 6577"
    #[2] "for code aaabbd the biggest destinations are ie where
    #     top days of visiting were 2016-10-17, 2016-10-18
    #     with total of 1925"
    #[3] "for code aaaccc the biggest destinations are ie, gr, dk where
    #     top days of visiting were 2016-10-20, 2016-10-17,
    #     2016-10-19, 2016-10-18 with total of 2878"
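If you would rather not depend on dplyr, roughly the same grouping can be sketched in base R with tapply(). This is only an illustration, not the approach used above: the `toy` data frame and the `collapse_unique` helper are made-up names, and the values are fixed (not sampled) so the result is reproducible.

```r
# Small fixed example (hypothetical values, not the sampled mydf)
toy <- data.frame(
  mycode        = c("aaabbb", "aaabbb", "aaabbb", "aaabbb", "aaaccc", "aaaccc"),
  mydate        = c("2016-10-17", "2016-10-18", "2016-10-17", "2016-10-19",
                    "2016-10-20", "2016-10-17"),
  myresort      = c("gb", "ie", "gr", "dk", "ie", "gr"),
  holidaymakers = c(100, 50, 200, 10, 70, 30),
  stringsAsFactors = FALSE
)

# Hypothetical helper: distinct values joined by ", "
collapse_unique <- function(x) toString(unique(x))

# One value per code; all three results share the same (sorted) code order
d <- tapply(toy$mydate, toy$mycode, collapse_unique)
r <- tapply(toy$myresort, toy$mycode, collapse_unique)
h <- tapply(toy$holidaymakers, toy$mycode, sum)

# Build one sentence per code, using the same wording as the dplyr version
s <- paste("for code", names(h), "the biggest destinations are", r,
           "where top days of visiting were", d, "with total of", h)
print(s)
```

Here `names(h)` supplies the codes, since tapply() labels its result by group.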
Note: Since you didn't provide any guidance on how you intend to calculate the "top visiting days", I included all days. You can edit the above to fit your actual situation.
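As one possible reading of "top days", you could rank the days by total holidaymakers within each code and keep only the busiest few. The sketch below assumes that definition, which is a guess and not something stated in the question. The `toy` data frame, the `top_n_days` cutoff of 2 and the `top_days` name are made up for illustration, and it needs dplyr 1.0.0 or later for `slice_head()` and the `.groups` argument.

```r
library(dplyr)

# Small fixed example (hypothetical values, not the sampled mydf)
toy <- data.frame(
  mycode        = c("aaabbb", "aaabbb", "aaabbb", "aaabbb", "aaaccc", "aaaccc"),
  mydate        = c("2016-10-17", "2016-10-18", "2016-10-17", "2016-10-19",
                    "2016-10-20", "2016-10-17"),
  myresort      = c("gb", "ie", "gr", "dk", "ie", "gr"),
  holidaymakers = c(100, 50, 200, 10, 70, 30),
  stringsAsFactors = FALSE
)

top_n_days <- 2  # assumed cutoff; adjust to taste

# Total holidaymakers per code and day, then keep the busiest days per code
top_days <- toy %>%
  group_by(mycode, mydate) %>%
  summarise(day_total = sum(holidaymakers), .groups = "drop") %>%
  arrange(mycode, desc(day_total)) %>%
  group_by(mycode) %>%
  slice_head(n = top_n_days) %>%
  summarise(d = toString(mydate))

# Resorts and overall total per code, joined with the top days
res <- toy %>%
  group_by(mycode) %>%
  summarise(r = toString(unique(myresort)), h = sum(holidaymakers)) %>%
  inner_join(top_days, by = "mycode") %>%
  mutate(s = paste("for code", mycode, "the biggest destinations are", r,
                   "where top days of visiting were", d, "with total of", h))

print(res$s)
```

The total `h` still sums every row for the code, not just the top days. If you want the total restricted to the top days, sum `day_total` after `slice_head()` instead.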