Finding the range and averaging the corresponding elements in R
I have different ranges of numbers (or coordinates) in one dataset, and I want to find the suitable ranges of numbers and then take the average of the corresponding scores.
Let's say my dataset is:
    coordinate score
    1000       1.1
    1001       1.2
    1002       1.1
    1003       1.4
    1006       1.8
    1007       1.9
    1010       0.5
    1011       1.1
    1012       1.0
I need to find the proper boundaries (the points where the coordinates are not consecutive) and calculate the mean score for each particular range.
My desired result:
    start end  mean-score
    1000  1003 1.2
    1006  1007 1.85
    1010  1012 0.86
Try this (assuming df is your data set):
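    library(data.table)
    # Convert df to a data.table by reference and number each run of consecutive coordinates
    setDT(df)[, indx := .GRP, by = list(cumsum(c(1, diff(coordinate)) - 1))]
    # Summarise each run: first coordinate, last coordinate, rounded mean score
    df[, list(start = coordinate[1],
              end = coordinate[.N],
              mean_score = round(mean(score), 2)), by = indx]
    #    indx start  end mean_score
    # 1:    1  1000 1003       1.20
    # 2:    2  1006 1007       1.85
    # 3:    3  1010 1012       0.87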
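To see why this grouping key separates the ranges, it can help to print the intermediate values. The sketch below is only an illustration in base R: the data.frame construction is a reconstruction of the sample data from the question, and the names gaps and key are made up for this example.

```r
# Sample data from the question, rebuilt by hand (an assumption about how df looks)
df <- data.frame(
  coordinate = c(1000, 1001, 1002, 1003, 1006, 1007, 1010, 1011, 1012),
  score      = c(1.1, 1.2, 1.1, 1.4, 1.8, 1.9, 0.5, 1.1, 1.0)
)

# Step from the previous coordinate; the leading 1 is a placeholder for the first row
gaps <- c(1, diff(df$coordinate))

# Consecutive rows contribute 0 and a jump contributes (gap - 1) > 0,
# so the running sum only changes where a new range starts
key <- cumsum(gaps - 1)

print(gaps)  # 1 1 1 1 3 1 3 1 1
print(key)   # 0 0 0 0 2 2 4 4 4
```

The key values themselves (0, 2, 4) are not meaningful; they only need to differ between ranges, which is why the answers renumber them with .GRP, dense_rank or factor. This relies on the coordinates being sorted in increasing order.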
Or using dplyr:
    library(dplyr)
    df %>%
      mutate(indx = dense_rank(cumsum(c(1, diff(coordinate)) - 1))) %>%
      group_by(indx) %>%
      summarise(start = first(coordinate),
                end = last(coordinate),
                mean_score = round(mean(score), 2))
    # Source: local data frame [3 x 4]
    #
    #   indx start  end mean_score
    # 1    1  1000 1003       1.20
    # 2    2  1006 1007       1.85
    # 3    3  1010 1012       0.87
Here are some alternative base R solutions (much less efficient):
    df$indx <- as.numeric(factor(cumsum(c(1, diff(df$coordinate)) - 1)))
    cbind(aggregate(coordinate ~ indx, df, function(x) c(start = head(x, 1), end = tail(x, 1))),
          aggregate(score ~ indx, df, function(x) mean_score = round(mean(x), 2)))
    #   indx coordinate.start coordinate.end indx score
    # 1    1             1000           1003    1  1.20
    # 2    2             1006           1007    2  1.85
    # 3    3             1010           1012    3  0.87
Or:
    cbind(do.call(rbind, (with(df, tapply(coordinate, indx,
                                          function(x) c(start = head(x, 1), end = tail(x, 1)))))),
          with(df, tapply(score, indx, function(x) mean_score = round(mean(x), 2))))
    #   start  end
    # 1  1000 1003 1.20
    # 2  1006 1007 1.85
    # 3  1010 1012 0.87
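If you want the base R approach in one reusable piece, something along these lines might work. This is a sketch, not part of the original answer: summarise_runs is a hypothetical helper name, the grouping uses a slightly different test (diff != 1) than the cumsum trick above, and the input is assumed to be non-empty.

```r
# Hypothetical helper (not from the original answer): find runs of consecutive
# values in `coordinate` and average the matching `score` within each run.
summarise_runs <- function(coordinate, score) {
  # TRUE wherever a new run starts, i.e. the step from the previous value is not 1
  new_run <- c(TRUE, diff(coordinate) != 1)
  run_id  <- cumsum(new_run)

  data.frame(
    start      = as.vector(tapply(coordinate, run_id, min)),
    end        = as.vector(tapply(coordinate, run_id, max)),
    mean_score = round(as.vector(tapply(score, run_id, mean)), 2)
  )
}

# Sample data from the question, rebuilt by hand
df <- data.frame(
  coordinate = c(1000, 1001, 1002, 1003, 1006, 1007, 1010, 1011, 1012),
  score      = c(1.1, 1.2, 1.1, 1.4, 1.8, 1.9, 0.5, 1.1, 1.0)
)

res <- summarise_runs(df$coordinate, df$score)
print(res)
```

On the sample data this should give the same three ranges as the solutions above. Note that the mean of the last range is 2.6 / 3 = 0.8667, which rounds to 0.87 rather than the 0.86 shown in the question.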