Pandas Groupby Quantile Values
I tried to calculate specific quantile values from a data frame, as shown in the code below. There was no problem when calculating them on separate lines. When attempting to run last 2
Solution 1:
I prefer defining named functions with def:
def q1(x):
    return x.quantile(0.25)

def q3(x):
    return x.quantile(0.75)

f = {'number': ['median', 'std', q1, q3]}
df1 = df.groupby('x').agg(f)
df1
Out[1643]:
  number
  median           std     q1     q3
x
0  52500  17969.882211  40000  61250
1  43000  16337.584481  35750  55000
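For anyone who wants to run Solution 1 end to end, here is a minimal self-contained sketch. The question's data frame isn't shown, so the `df` below is made-up sample data (an assumption), and the numbers it prints will not match the output above.

```python
import pandas as pd

# Hypothetical sample data; the original question's DataFrame is not shown.
df = pd.DataFrame({
    "x": [0, 0, 0, 0, 1, 1, 1, 1],
    "number": [10, 20, 30, 40, 100, 200, 300, 400],
})


def q1(s):
    """Return the first quartile (25th percentile) of a group's values."""
    return s.quantile(0.25)


def q3(s):
    """Return the third quartile (75th percentile) of a group's values."""
    return s.quantile(0.75)


# pandas uses each function's __name__ ("q1", "q3") as the column label.
result = df.groupby("x").agg({"number": ["median", "std", q1, q3]})
print(result)
```

The result has a two-level column index, so a single statistic can be picked out with something like `result[("number", "q1")]`.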
Solution 2:
@WeNYoBen's answer is great. It has one limitation, though: you need to write a new function for every quantile. That gets unpythonic quickly if the number of quantiles becomes large. A better approach is to use a function that creates the quantile function and renames it appropriately.
def rename(newname):
    # Decorator that overrides a function's __name__.
    def decorator(f):
        f.__name__ = newname
        return f
    return decorator

def q_at(y):
    # Build a quantile function for level y, named e.g. 'q0.25'.
    @rename(f'q{y:0.2f}')
    def q(x):
        return x.quantile(y)
    return q

f = {'number': ['median', 'std', q_at(0.25), q_at(0.75)]}
df1 = df.groupby('x').agg(f)
df1
Out[]:
  number
  median           std  q0.25  q0.75
x
0  52500  17969.882211  40000  61250
1  43000  16337.584481  35750  55000
The rename decorator gives each returned function its own __name__, so that the pandas agg function can deal with the reuse of the quantile function (otherwise all quantile results end up in columns that are named q).
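To show how this scales to many quantiles, here is a small runnable sketch of the same factory idea. It is a variant rather than the code above: it sets `__name__` directly instead of going through a decorator, and both the sample `df` and the `quantiles` list are assumptions made up for illustration.

```python
import pandas as pd


def q_at(y):
    """Build a quantile function for level y, named like 'q0.25'."""
    def q(s):
        return s.quantile(y)
    # Give each generated function a distinct name so agg can label its column.
    q.__name__ = f"q{y:0.2f}"
    return q


# Hypothetical sample data; the original question's DataFrame is not shown.
df = pd.DataFrame({
    "x": [0, 0, 0, 0, 1, 1, 1, 1],
    "number": [10, 20, 30, 40, 100, 200, 300, 400],
})

# Hypothetical list of quantile levels; one column is produced per level.
quantiles = [0.1, 0.25, 0.5, 0.75, 0.9]
result = df.groupby("x").agg({"number": [q_at(y) for y in quantiles]})
print(result)
```

Because `y` is passed in as an argument to `q_at`, each generated function keeps its own quantile level, so building them in a list comprehension is safe.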
Solution 3:
There's a nice way if you want to give names to aggregated columns:
df.groupby('x').agg(
    q1_foo=pd.NamedAgg('number', q1),
    q2_foo=pd.NamedAgg('number', q2)
)
where q1 and q2 are functions.
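A runnable sketch of the named-aggregation approach, assuming pandas 0.25 or later (where `pd.NamedAgg` is available). The sample `df`, the helper functions `q1` and `q3`, and the output names `q1_foo` and `q3_foo` are made up for illustration.

```python
import pandas as pd


def q1(s):
    """Return the first quartile (25th percentile) of a group's values."""
    return s.quantile(0.25)


def q3(s):
    """Return the third quartile (75th percentile) of a group's values."""
    return s.quantile(0.75)


# Hypothetical sample data; the original question's DataFrame is not shown.
df = pd.DataFrame({
    "x": [0, 0, 0, 0, 1, 1, 1, 1],
    "number": [10, 20, 30, 40, 100, 200, 300, 400],
})

# Each keyword becomes an output column; NamedAgg pairs a source column
# with the function to apply to it.
result = df.groupby("x").agg(
    q1_foo=pd.NamedAgg(column="number", aggfunc=q1),
    q3_foo=pd.NamedAgg(column="number", aggfunc=q3),
)
print(result)
```

With this form the columns come out flat (`q1_foo`, `q3_foo`) rather than as a two-level index, and the function names no longer matter for labelling.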