Pivot A Pandas Dataframe And Get The Non-axis Columns As A Series
I have a data set pulled from a database using pandas.io.sql.read_frame which looks like this Period Category Projected Actual Previous 0 2013-01 A 1
Solution 1:
I believe you're presenting us with an XY problem, as a resulting dataset whose cells contain Series has no practical applicability.
Maybe you're looking for a groupby object instead of pivot?
>>> df.groupby(["Category", 'Period']).get_group(('A', '2013-01'))
    Period Category   Projected      Actual   Previous
0  2013-01        A  1214432.94  3175516.32  3001149.5
>>> df.groupby(["Category", 'Period']).get_group(('A', '2013-01'))[['Projected', 'Actual', 'Previous']].sum()
Projected    1214432.94
Actual       3175516.32
Previous     3001149.50
dtype: float64
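If the goal is to look up the three values for any (Category, Period) pair, a minimal sketch along the same lines is to sum once over all groups and then select a pair with .loc, which returns a Series. The sample rows below are hypothetical apart from the first one, which is taken from the question.

```python
import pandas as pd

# Hypothetical sample data; only the first row's values come from the question.
df = pd.DataFrame({
    "Period": ["2013-01", "2013-01", "2013-02"],
    "Category": ["A", "B", "A"],
    "Projected": [1214432.94, 100.0, 200.0],
    "Actual": [3175516.32, 110.0, 210.0],
    "Previous": [3001149.50, 120.0, 220.0],
})

value_cols = ["Projected", "Actual", "Previous"]

# One row per (Category, Period) pair, summing the value columns.
totals = df.groupby(["Category", "Period"])[value_cols].sum()

# Selecting a single pair returns its three values as a Series.
cell = totals.loc[("A", "2013-01")]
print(cell)
```

This keeps the data in an ordinary numeric frame, so further arithmetic on it still works.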
Solution 2:
I believe @alko is on the right track suggesting a groupby at the beginning followed by a sum. If your goal is then to have an iterable in each cell, you can use zip to create a column of tuples. How about this:
import pandas as pd
import numpy as np
from itertools import product
np.random.seed(1)
periods = range(0,3)
categories = list('ABC')
rows = list(product(periods, categories)) * 2
n = len(rows)
df = pd.DataFrame({'Projected': np.random.randn(n),
                   'Actual': np.random.randn(n),
                   'Previous': np.random.randn(n)},
                  index=pd.MultiIndex.from_tuples(rows))
df.index.names = ['Period', 'Category']
summed = df.groupby(level=['Period', 'Category']).sum()
# list() is needed on Python 3, where zip returns an iterator
summed['tuple'] = list(zip(*[summed[c] for c in ['Projected', 'Actual', 'Previous']]))
result = summed['tuple'].unstack('Period')
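This gives a frame indexed by Category with one column per Period, where each cell holds a (Projected, Actual, Previous) tuple.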
And just for completeness, you can go back the other way, though it's a bit of a pain:
andback = result.stack().apply(lambda t: pd.Series({'Projected': t[0],
                                                    'Actual': t[1],
                                                    'Previous': t[2]}))
This gives back a frame indexed by (Category, Period) with Projected, Actual and Previous columns.
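An alternative for the return trip, sketched here on a small hypothetical stand-in for result, is to build the frame directly from the list of tuples instead of creating one Series per row. The values and the two-by-two shape below are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for `result`: (Projected, Actual, Previous) tuples,
# indexed by Category with one column per Period.
result = pd.DataFrame(
    {0: [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
     1: [(7.0, 8.0, 9.0), (10.0, 11.0, 12.0)]},
    index=pd.Index(["A", "B"], name="Category"),
)
result.columns.name = "Period"

# Series of tuples indexed by (Category, Period).
stacked = result.stack()

# Build the three columns from the list of tuples in one step.
andback = pd.DataFrame(
    stacked.tolist(),
    index=stacked.index,
    columns=["Projected", "Actual", "Previous"],
)
print(andback)
```

This avoids constructing a Series for every row, which tends to matter on larger frames.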
And to help someone out in the comments, here's how I'd add subtotals and grand totals:
def add_subtotal(g):
    # Append a (category, 'subtotal') row holding this group's column sums.
    category = g.index.get_level_values('Category')[0]
    g.loc[(category, 'subtotal'), :] = g.sum()
    return g

# apply rather than transform: the function returns one more row than it receives.
with_subtotals = andback.groupby(level='Category', group_keys=False).apply(add_subtotal)
with_subtotals.loc[('Grand', 'Total'), :] = with_subtotals\
    .loc[with_subtotals.index.get_level_values('Period') == 'subtotal', :]\
    .sum()
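Which gives the same frame with a (category, 'subtotal') row added to each Category and a final ('Grand', 'Total') row.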
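The same subtotal idea can also be sketched without enlarging a frame through .loc, by collecting each group and its subtotal row and concatenating them. The small andback frame and the labelled_total helper below are hypothetical, used only to keep the example self-contained.

```python
import pandas as pd

# Hypothetical stand-in for `andback`, indexed by (Category, Period).
idx = pd.MultiIndex.from_product([["A", "B"], [0, 1]],
                                 names=["Category", "Period"])
andback = pd.DataFrame(
    {"Projected": [1.0, 2.0, 3.0, 4.0],
     "Actual": [10.0, 20.0, 30.0, 40.0],
     "Previous": [100.0, 200.0, 300.0, 400.0]},
    index=idx,
)

def labelled_total(frame, label):
    # Hypothetical helper: one-row frame of column sums under the given label.
    row = frame.sum().to_frame().T
    row.index = pd.MultiIndex.from_tuples([label], names=frame.index.names)
    return row

pieces = []
for category, group in andback.groupby(level="Category"):
    # Each group is followed by its own subtotal row.
    pieces.extend([group, labelled_total(group, (category, "subtotal"))])

# The grand total is computed from the original rows, not from the subtotals.
pieces.append(labelled_total(andback, ("Grand", "Total")))
with_subtotals = pd.concat(pieces)
print(with_subtotals)
```

Computing the grand total from the original rows avoids having to filter the subtotal rows back out afterwards.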