python - Transform slow pandas iterrows into apply -



python - Transform slow pandas iterrows into apply -

i have next dataframe:

value count recl_2007 recl_2008 recl_2009 a_a a_b a_c b_a b_b \ 0 189 149.5872 503 503 500 0 0 0 0 0 1 209 1939.6160 503 503 503 0 0 0 0 0 2 499 617.4784 503 500 503 0 0 0 0 0 3 585 73.0688 503 503 503 0 0 0 0 0 4 611 133.9072 503 500 503 0 0 0 0 0 5 645 278.7904 503 503 503 0 0 0 0 0 6 659 138.2976 500 503 503 0 0 0 0 0 7 719 769.5744 503 503 502 0 0 0 0 0 b_c c_a c_b c_c 0 0 0 0 0 1 0 0 0 0 2 0 0 0 0 3 0 0 0 0 4 0 0 0 0 5 0 0 0 0 6 0 0 0 0 7 0 0 0 0

also, values in columns:recl_2007, recl_2008 , recl_2009 correspond variables a, b , c follows:

= 500, b=502, c=503.

i want fill columns a_a...c_c using values in count column such recl_2007 value gives first part of column name , recl_2009 gives sec part of column name.

i.e, if recl_2007 == 503 , recl_2009 == 500, column c_a , value should updated whatever in count column of row.

currently iterating through pandas dataframe using iterrows:

for index, row in df.iterrows(): init = opp_lu[row[name_init]] # first part of column name finl = opp_lu[row[name_finl]] # sec part of column name col_name = init+'_'+finl df.loc[index,col_name] = row['count']

this slow, not sure how translate using apply. hints?

there 2 ways that.

you can utilize apply function, need work here(just simplify work).

a dictionary help build naming

d={'500':'a','502':'b','503':'c'}

function naming

name= lambda x: "{0}_{1}".format(d[str(int(x['recl_2007']))],d[str(int(x['recl_2009']))])

then, go through items , re-create count item name similar.

df["c_a"] = df.apply(lambda x: x['count'] if name(x)=='c_a' else 0, axis=1)

the other solution, simpler filter info have, re-create count item

df.loc[(df['recl_2007']==503) & (df['recl_2009']==503), 'c_c']= df['count']

the code like, quick example, need work on other scenarios.

data= """value,count,recl_2007,recl_2008,recl_2009\n189,149.5872,503,503,500\n209,939.6160,503,503,503\n499,617.4784,503,500,503\n585,73.0688,503,503,503\n611,133.9072,503,500,503\n645,278.7904,503,503,503\n659,138.2976,500,503,503\n719,769.5744,503,503,502""" import pandas pd io import stringio df= pd.read_csv(stringio(data.decode('utf-8')),sep=',' ) #first approach: d={'500':'a','502':'b','503':'c'} name= lambda x: "{0}_{1}".format(d[str(int(x['recl_2007']))],d[str(int(x['recl_2009']))]) df['c_c']=[0]*len(df.value) df["c_a"] = df.apply(lambda x: x['count'] if name(x)=='c_a' else 0, axis=1) #second approach: df.loc[(df['recl_2007']==503) & (df['recl_2009']==503), 'c_c']= df['count'] print df

output:

value count recl_2007 recl_2008 recl_2009 c_c c_a 0 189 149.5872 503 503 500 0.0000 149.5872 1 209 939.6160 503 503 503 939.6160 0.0000 2 499 617.4784 503 500 503 617.4784 0.0000 3 585 73.0688 503 503 503 73.0688 0.0000 4 611 133.9072 503 500 503 133.9072 0.0000 5 645 278.7904 503 503 503 278.7904 0.0000 6 659 138.2976 500 503 503 0.0000 0.0000 7 719 769.5744 503 503 502 0.0000 0.0000

python pandas

Comments

Popular posts from this blog

formatting - SAS SQL Datepart function returning odd values -

c++ - Apple Mach-O Linker Error(Duplicate Symbols For Architecture armv7) -

php - Yii 2: Unable to find a class into the extension 'yii2-admin' -