data.frame - R - Subset dataframe to include only subjects with more than 1 record

I'd like to subset a dataframe to include all records for subjects that have >1 record, and exclude subjects with only 1 record.

Let's take the following dataframe:

mydata <- data.frame(subject_id = factor(c(1,2,3,4,4,5,5,6,6,7,8,9,9,9,10)), variable = rnorm(15))

The code below gives me the subjects with >1 record using duplicated():

duplicates <- mydata[duplicated(mydata$subject_id),]$subject_id

But I want to retain in my subset all records for each subject with >1 record, so I tried:

mydata[mydata$subject_id==as.factor(duplicates),]

which does not return the result I'm expecting.

Any ideas?
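
As far as I can tell, the attempt with == goes wrong because == compares the two vectors element by element and recycles the shorter one (here 5 duplicate IDs against 15 rows), rather than testing whether each row's ID is among the duplicates. A minimal base R sketch using %in% instead, assuming the same mydata as above (set.seed() and the name repeated are my additions, only there to make the example reproducible and readable):

```r
set.seed(1)  # assumption: not in the original, added for reproducibility

mydata <- data.frame(
  subject_id = factor(c(1, 2, 3, 4, 4, 5, 5, 6, 6, 7, 8, 9, 9, 9, 10)),
  variable   = rnorm(15)
)

# IDs that occur more than once (unique() drops the repeated 9)
duplicates <- unique(mydata$subject_id[duplicated(mydata$subject_id)])

# %in% tests membership row by row, so every record of a repeated subject is kept
repeated <- mydata[mydata$subject_id %in% duplicates, ]

# optional: drop the factor levels of the subjects that were removed
repeated <- droplevels(repeated)
```

With this data, repeated should contain only the rows for subjects 4, 5, 6 and 9.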

A simple alternative is to use dplyr:

library(dplyr)

dfr <- data.frame(a = sample(1:2, 10, replace = TRUE), b = sample(1:5, 10, replace = TRUE))
dfr <- group_by(dfr, b)
dfr
# Source: local data frame [10 x 2]
# Groups: b
#
#    a b
# 1  2 4
# 2  2 2
# 3  2 5
# 4  2 1
# 5  1 2
# 6  1 3
# 7  2 1
# 8  2 4
# 9  1 4
# 10 2 4

# keep only the groups of b that have more than one row
filter(dfr, n() > 1)
# Source: local data frame [8 x 2]
# Groups: b
#
#   a b
# 1 2 4
# 2 2 2
# 3 2 1
# 4 1 2
# 5 2 1
# 6 2 4
# 7 1 4
# 8 2 4
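
Applied to the data from the question, the same group-then-filter idea might look like the sketch below. It assumes dplyr is installed; set.seed() and the name result are my additions, not part of the original answer.

```r
library(dplyr)

set.seed(1)  # assumption: added only so the example is reproducible

mydata <- data.frame(
  subject_id = factor(c(1, 2, 3, 4, 4, 5, 5, 6, 6, 7, 8, 9, 9, 9, 10)),
  variable   = rnorm(15)
)

result <- mydata %>%
  group_by(subject_id) %>%  # one group per subject
  filter(n() > 1) %>%       # keep groups with more than one record
  ungroup()                 # return an ungrouped data frame
```

Here result should hold the records for subjects 4, 5, 6 and 9 only. The ungroup() call is optional; it just avoids carrying the grouping into later steps.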

r data.frame duplicates subset
