data.frame - R - Subset dataframe to include only subjects with more than 1 record
I'd like to subset a data frame to include all records for subjects that have more than one record, and exclude subjects with only one record.

Take the following data frame:

    mydata <- data.frame(subject_id = factor(c(1,2,3,4,4,5,5,6,6,7,8,9,9,9,10)),
                         variable = rnorm(15))

The code below gives me the subjects with more than one record, using duplicated():

    duplicates <- mydata[duplicated(mydata$subject_id),]$subject_id

But I want to retain in the subset all records for each subject with more than one record, so I tried:

    mydata[mydata$subject_id==as.factor(duplicates),]

which does not return the result I'm expecting.

Any ideas?
A simple alternative is to use dplyr:

    library(dplyr)
    dfr <- data.frame(a=sample(1:2,10,rep=TRUE), b=sample(1:5,10, rep=TRUE))
    dfr <- group_by(dfr, b)
    dfr
    # Source: local data frame [10 x 2]
    # Groups: b
    #
    #    a b
    # 1  2 4
    # 2  2 2
    # 3  2 5
    # 4  2 1
    # 5  1 2
    # 6  1 3
    # 7  2 1
    # 8  2 4
    # 9  1 4
    # 10 2 4

    filter(dfr, n() > 1)
    # Source: local data frame [8 x 2]
    # Groups: b
    #
    #   a b
    # 1 2 4
    # 2 2 2
    # 3 2 1
    # 4 1 2
    # 5 2 1
    # 6 2 4
    # 7 1 4
    # 8 2 4
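For what it's worth, the attempt in the question most likely fails because `==` compares the two vectors element by element (recycling the shorter one) instead of testing membership. Here is a minimal base R sketch on the question's own data, assuming `%in%` is an acceptable replacement. The `set.seed()` call and the names `repeated` and `repeated2` are my additions for illustration, not part of the original post.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

mydata <- data.frame(subject_id = factor(c(1,2,3,4,4,5,5,6,6,7,8,9,9,9,10)),
                     variable = rnorm(15))

# IDs that occur more than once (duplicated() flags the 2nd, 3rd, ... occurrence)
duplicates <- unique(mydata$subject_id[duplicated(mydata$subject_id)])

# %in% tests each row's ID for membership in `duplicates`,
# so the first occurrence of every repeated subject is kept too
repeated <- mydata[mydata$subject_id %in% duplicates, ]

# Alternative without the intermediate vector: count rows per subject with ave()
group_size <- ave(seq_along(mydata$subject_id), mydata$subject_id, FUN = length)
repeated2 <- mydata[group_size > 1, ]
```

Both `repeated` and `repeated2` should contain only the rows for subjects 4, 5, 6 and 9. With dplyr, the approach shown above would translate to `filter(group_by(mydata, subject_id), n() > 1)`.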
 
  