python - Prepend values to a pandas DataFrame based on an index level of another DataFrame
Below I have 2 dataframes. The first dataframe (d1) has a 'date' index, and the 2nd dataframe (d2) has a 'date' and 'name' index. You'll notice that d1 starts at 2014-04-30 and d2 starts at 2014-01-31.
d1:
            value
date
2014-04-30      1
2014-05-31      2
2014-06-30      3
2014-07-31      4
2014-08-31      5
2014-09-30      6
2014-10-31      7
d2:
                 value
date       name
2014-01-31 n1        5
2014-02-30 n1        6
2014-03-30 n1        7
2014-04-30 n1        8
2014-05-31 n2        9
2014-06-30 n2        3
2014-07-31 n2        4
2014-08-31 n2        5
2014-09-30 n2        6
2014-10-31 n2        7
What I want to do is prepend to d1 the dates from d2 that come before d1's first date, and use the first value of d1 to populate the value column of the prepended rows.
The result should look like this:
            value
date
2014-01-31      1
2014-02-30      1
2014-03-30      1
2014-04-30      1
2014-05-31      2
2014-06-30      3
2014-07-31      4
2014-08-31      5
2014-09-30      6
2014-10-31      7
What is the most efficient or easiest way to do this using pandas?
This is a direct formulation of the problem, and it is quite fast already:
In [126]: def direct(d1, d2):
   .....:     # dates from the 'date' level of d2's MultiIndex
   .....:     dates2 = d2.index.get_level_values('date')
   .....:     dates1 = d1.index
   .....:     # prepend the d2 dates earlier than d1's first date, then backfill
   .....:     return d1.reindex(dates2[dates2 < min(dates1)].append(dates1), method='bfill')
   .....:

In [127]: direct(d1, d2)
Out[127]:
            value
date
2014-01-31      1
2014-02-28      1
2014-03-30      1
2014-04-30      1
2014-05-31      2
2014-06-30      3
2014-07-31      4
2014-08-31      5
2014-09-30      6
2014-10-31      7

In [128]: %timeit direct(d1, d2)
1000 loops, best of 3: 362 µs per loop
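For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the two frames and runs the same reindex-with-backfill idea. The construction of d1 and d2 is my own assumption about how the data was created, and I use 2014-02-28 (as in the output above) because February 30 is not a valid date.

```python
import pandas as pd

# Assumed reconstruction of d1: a single 'date' index.
d1 = pd.DataFrame(
    {"value": [1, 2, 3, 4, 5, 6, 7]},
    index=pd.to_datetime(
        ["2014-04-30", "2014-05-31", "2014-06-30", "2014-07-31",
         "2014-08-31", "2014-09-30", "2014-10-31"]
    ).rename("date"),
)

# Assumed reconstruction of d2: a ('date', 'name') MultiIndex.
d2_dates = pd.to_datetime(
    ["2014-01-31", "2014-02-28", "2014-03-30", "2014-04-30", "2014-05-31",
     "2014-06-30", "2014-07-31", "2014-08-31", "2014-09-30", "2014-10-31"]
)
d2_names = ["n1"] * 4 + ["n2"] * 6
d2 = pd.DataFrame(
    {"value": [5, 6, 7, 8, 9, 3, 4, 5, 6, 7]},
    index=pd.MultiIndex.from_arrays([d2_dates, d2_names], names=["date", "name"]),
)


def direct(d1, d2):
    """Prepend d2's earlier dates to d1 and backfill them with d1's first value."""
    dates2 = d2.index.get_level_values("date")
    dates1 = d1.index
    earlier = dates2[dates2 < dates1.min()]
    # method='bfill' requires d1's index to be sorted (monotonic).
    return d1.reindex(earlier.append(dates1), method="bfill")


result = direct(d1, d2)
print(result)
```

Note that the backfill only works if d1's index is sorted, and that a date appearing under several names in d2 would be prepended once per occurrence.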
If you are willing to sacrifice some readability for performance, you can compare the dates by their internal representation (integers are faster) and do the "backfilling" manually:
In [129]: def fast(d1, d2):
   .....:     dates2 = d2.index.get_level_values('date')
   .....:     dates1 = d1.index
   .....:     # compare the raw int64 timestamps instead of Timestamp objects
   .....:     new_dates = dates2[dates2.asi8 < min(dates1.asi8)]
   .....:     new_index = new_dates.append(dates1)
   .....:     # repeat d1's first row once per prepended date, then stack d1 below
   .....:     new_values = np.concatenate((np.repeat(d1.values[:1], len(new_dates), axis=0), d1.values))
   .....:     return pd.DataFrame(new_values, index=new_index, columns=d1.columns, copy=False)
   .....:

In [130]: %timeit fast(d1, d2)
1000 loops, best of 3: 213 µs per loop
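As a sanity check, the sketch below runs the manual version on small frames and compares it with a plain reindex backfill. The frames are shortened, assumed stand-ins for d1 and d2, and the comparison is only an illustration. It also assumes d1 is sorted by date, so that its first row holds the earliest value, and that both indexes store their timestamps in the same unit, so the integer comparison is meaningful.

```python
import numpy as np
import pandas as pd

# Shortened, assumed stand-ins for the frames in the question.
d1 = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=pd.to_datetime(["2014-04-30", "2014-05-31", "2014-06-30"]).rename("date"),
)
d2 = pd.DataFrame(
    {"value": [5, 6, 7, 8, 9]},
    index=pd.MultiIndex.from_arrays(
        [
            pd.to_datetime(
                ["2014-01-31", "2014-02-28", "2014-03-30", "2014-04-30", "2014-05-31"]
            ),
            ["n1", "n1", "n1", "n1", "n2"],
        ],
        names=["date", "name"],
    ),
)


def fast(d1, d2):
    """Manual backfill: repeat d1's first row for every earlier date in d2."""
    dates2 = d2.index.get_level_values("date")
    dates1 = d1.index
    # asi8 exposes the timestamps as int64, which is cheaper to compare.
    new_dates = dates2[dates2.asi8 < dates1.asi8.min()]
    new_index = new_dates.append(dates1)
    first_row = np.repeat(d1.values[:1], len(new_dates), axis=0)
    new_values = np.concatenate((first_row, d1.values))
    return pd.DataFrame(new_values, index=new_index, columns=d1.columns, copy=False)


manual = fast(d1, d2)
# Reference result from the reindex-based formulation.
reference = d1.reindex(manual.index, method="bfill")
same = manual["value"].tolist() == reference["value"].tolist()
print(manual)
print("matches reindex/bfill:", same)
```

The speedup comes from skipping the generic reindex machinery, at the cost of relying on those ordering assumptions.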
python numpy pandas