python - Assigning result of pandas groupby -
i have following dataframe:
date, industry, symbol, roc 25-02-2015, health, abc, 200 25-02-2015, health, xyz, 150 25-02-2015, mining, tyr, 45 25-02-2015, mining, ujk, 70 26-02-2015, health, abc, 60 26-02-2015, health, xyz, 310 26-02-2015, mining, tyr, 65 26-02-2015, mining, ujk, 23
i need determine average 'roc', max 'roc', min 'roc' how many symbols exist each date+industry. in other words need groupby date , industry, , determine various averages, max/min etc.
so far doing following, working seems slow , inefficient:
sector_df = primary_df.groupby(['date', 'industry'], sort=true).mean() tmp_max_df = primary_df.groupby(['date', 'industry'], sort=true).max() tmp_min_df = primary_df.groupby(['date', 'industry'], sort=true).min() tmp_count_df = primary_df.groupby(['date', 'industry'], sort=true).count() sector_df['max_roc'] = tmp_max_df['roc'] sector_df['min_roc'] = tmp_min_df['roc'] sector_df['count'] = tmp_count_df['roc'] sector_df.reset_index(inplace=true) sector_df.set_index(['date', 'industry'], inplace=true)
the above code works, resulting in dataframe indexed date+industry, showing me min/max 'roc' each date+industry, how many symbols existed each date+industry.
i doing complete groupby multiple times (to determine mean, max, min, count of 'roc'). slow because it's doing same thing on , over.
is there way group once. perform mean, max etc on object , assign result sector_df?
you want perform aggregate using agg
:
in [72]: df.groupby(['date','industry']).agg([pd.series.mean, pd.series.max, pd.series.min, pd.series.count]) out[72]: roc mean max min count date industry 2015-02-25 health 175.0 200 150 2 mining 57.5 70 45 2 2015-02-26 health 185.0 310 60 2 mining 44.0 65 23 2
this allows pass iterable (a list in case) of functions perform.
edit
to access individual results need pass tuple each axis:
in [78]: gp.loc[('2015-02-25','health'),('roc','mean')] out[78]: 175.0
where gp = df.groupby(['date','industry']).agg([pd.series.mean, pd.series.max, pd.series.min, pd.series.count])
Comments
Post a Comment