Applying a function with multiple arguments over multiple paired variables in R
I have a function that I'm using to clean data, and it works correctly.
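library(stringr)  # provides str_detect() and str_extract()
my_fun <- function(x, y) {
  # If x contains a number, use that number; otherwise keep y (as numeric)
  ifelse(str_detect(x, "-*\\d+\\.*\\d*"),
         as.numeric(str_extract(x, "-*\\d+\\.*\\d*")),
         as.numeric(y))
}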
It takes numbers that have been entered in the wrong column and reassigns them to the correct column. It is used as follows to clean the y variable:
df$y <- my_fun(df$x, df$y)
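To make the behaviour of my_fun concrete, here is a small sketch on toy vectors. The values of x and y below are made up for illustration and are not from my real data; the function body is the same logic as above.

```r
library(stringr)

my_fun <- function(x, y) {
  ifelse(str_detect(x, "-*\\d+\\.*\\d*"),
         as.numeric(str_extract(x, "-*\\d+\\.*\\d*")),
         as.numeric(y))
}

# Hypothetical toy data: a number has leaked into x in positions 1 and 3
x <- c("something 10.5", "no number here", "this -4.5")
y <- c("1", "2", "3")

cleaned <- my_fun(x, y)
cleaned
# 10.5  2.0 -4.5
```

Where x holds a number it replaces y; where it does not, y is kept and converted to numeric.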
I have many columns/variables (more than 10) paired in the same format:
x_vars <- c("x_1", "x_2", "x_3", "x_4", "x_5", "x_6")
y_vars <- c("y_1", "y_2", "y_3", "y_4", "y_5", "y_6")
My question is: is there a way to apply this function across all the variables in my data set that need to be cleaned in the same way? I can do this in other instances, where the data cleaning function has only 1 argument, using lapply, but I'm struggling in this case.
I have tried mapply, but it did not work, which might be because I'm still quite a novice in R. Any advice is appreciated.
We can use mapply/Map. We need to extract the columns based on the column names by passing 'x_vars' and 'y_vars' as arguments to Map, apply my_fun on the extracted vectors, and assign the result to 'y_vars' in the original dataset:
df[y_vars] <- Map(function(x, y) my_fun(df[, x], df[, y]), x_vars, y_vars)
Or it can be written as
df[y_vars] <- Map(my_fun, df[x_vars], df[y_vars])
Note: Here, I am assuming that all the elements in 'x_vars' and 'y_vars' are columns in the original dataset. I would expect Map to be faster and more efficient than reshaping to long format and then converting back to wide.
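As a minimal sketch of the assumption above, the column names can be checked before calling Map. The small data frame here is hypothetical and only has two pairs of columns; it also shows that Map gives the same result as mapply with SIMPLIFY = FALSE, since Map is a thin wrapper around mapply.

```r
library(stringr)

my_fun <- function(x, y) {
  ifelse(str_detect(x, "-*\\d+\\.*\\d*"),
         as.numeric(str_extract(x, "-*\\d+\\.*\\d*")),
         as.numeric(y))
}

# Hypothetical toy data with two x/y pairs
x_vars <- c("x_1", "x_2")
y_vars <- c("y_1", "y_2")
df <- data.frame(x_1 = c("a 1.5", "b"),
                 x_2 = c("c", "d -2"),
                 y_1 = c("3", "4"),
                 y_2 = c("5", "6"),
                 stringsAsFactors = FALSE)

# Fail early if any of the named columns is missing from df
stopifnot(all(c(x_vars, y_vars) %in% names(df)))

res_map    <- Map(my_fun, df[x_vars], df[y_vars])
res_mapply <- mapply(my_fun, df[x_vars], df[y_vars], SIMPLIFY = FALSE)
same <- identical(res_map, res_mapply)

# The list elements are assigned to the y columns by position
df[y_vars] <- res_map
```

After this, df$y_1 is c(1.5, 4) and df$y_2 is c(5, -2), while the x columns are left untouched.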
To provide a different approach, we can use melt from data.table:
library(data.table)
# Reshape to long format: value1 holds the x columns, value2 the y columns
dm <- melt(setDT(df), measure = list(x_vars, y_vars))[, value3 := my_fun(value1, value2), variable]
Then, again, we need to dcast back to 'wide' format. So, this requires more steps and is not as easy:
setnames(dcast(dm, rowid(variable) ~ variable, value.var = c("value1", "value3"))[, variable := NULL][], c(x_vars, y_vars))[]
Data
set.seed(24)
df <- as.data.frame(matrix(sample(c(1:5, "something 10.5", "this -4.5", "what -5.2 value?"), 12*10, replace = TRUE), ncol = 12, dimnames = list(NULL, c(x_vars, y_vars))), stringsAsFactors = FALSE)