python - Pandas - Create new dataframes based on ranked values in select columns
I have a dataframe with some columns containing numerical data and others containing text. It looks like this:
    age  weight  blood sugar  study grouping  gender  notes
    29   195     126          b               female  notes of kind
    34   180     140          b               male    different set of notes
    48   220     111          c               male    blah blah
    55   189     109          c               male    more notes
I want to create sub-divisions of the dataframe based on rankings of the numerical columns. For example, if I need the 2 oldest patients, the new dataframe would look like this:
    age  weight  blood sugar  study grouping  gender  notes
    48   220     111          c               male    blah blah
    55   189     109          c               male    more notes
The rank function looks useful. I figure I could run:
    df2 = df.rank(axis=0)
and then find a way to use the index of df2 to pull rows from df into new dataframes. Something along the lines of:
    cutoff = df2[df2 > 10]  # then drop rows with NaN values in the columns of interest
This feels a bit clunky, though. I'm hoping there's a more straightforward way to say:
"Pandas, I want a new dataframe with the 15 oldest people in it. Great! Now I want a new dataframe with the 20 youngest people, etc."
One alternative is to sort the dataframe by age:
    df = df.sort_values('age')
Then the age of the n-th youngest person is df['age'].values[n - 1], and the age of the n-th oldest person is df['age'].values[-n].
Therefore, to view the 15 oldest people (plus anyone who shares the age of the 15th oldest), you can do:
    df[df['age'] >= df['age'].values[-15]]
Alternatively, if you want to limit the number of rows returned (e.g. you don't mind that there may be 20 people sharing the oldest age of, say, 55), use the head and tail methods on the sorted dataframe...
    df_age = df.sort_values('age', ascending=False)
...then df_age.head(15) to view the 15 oldest people, or df_age.tail(20) to view the 20 youngest people.
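For reference, here is a minimal sketch of the approaches above, run against the four sample rows from the question with n = 2 instead of 15. The variable names (oldest, youngest, cutoff and so on) are only illustrative, and the nlargest/nsmallest calls are an addition of mine rather than part of the answer above: they should give the same rows without an explicit sort.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "age": [29, 34, 48, 55],
    "weight": [195, 180, 220, 189],
    "blood sugar": [126, 140, 111, 109],
    "study grouping": ["b", "b", "c", "c"],
    "gender": ["female", "male", "male", "male"],
    "notes": ["notes of kind", "different set of notes", "blah blah", "more notes"],
})

n = 2  # how many patients to keep (illustrative value)

# Sort once, oldest first, then slice from either end.
df_age = df.sort_values("age", ascending=False)
oldest = df_age.head(n)      # the n oldest rows
youngest = df_age.tail(n)    # the n youngest rows, still in descending age order

# Shortcuts that skip the explicit sort.
oldest_alt = df.nlargest(n, "age")
youngest_alt = df.nsmallest(n, "age")

# Threshold version: keeps every row tied with the n-th oldest age,
# so it can return more than n rows when ages repeat.
cutoff = df["age"].sort_values().values[-n]
oldest_with_ties = df[df["age"] >= cutoff]

print(oldest)
print(oldest_with_ties)
```

Note that head/tail and nlargest/nsmallest always return exactly n rows (when the frame has at least n), while the threshold version keeps ties, so which one you want depends on how you'd like repeated ages handled.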
python pandas dataframes