Python pivot table/group by: I need to know the top 3 groups
import pandas as pd

df = pd.DataFrame({
    'customer': [1, 2, 1, 3, 1, 2, 3],
    'group_code': ['111', '111', '222', '111', '111', '111', '333'],
    'ind_code': ['A', 'B', 'AA', 'A', 'AAA', 'C', 'BBB'],
    'amount': [100, 200, 140, 400, 225, 125, 600],
    'card': ['XXX', 'YYY', 'YYY', 'XXX', 'XXX', 'YYY', 'XXX'],
})
With the above data frame, I want the output described below.
For each card number, I want the following record:
Card number, % of amount spent on group code 1, % of amount spent on group code 2, ... and so on for each group code
% of amount spent on a group = (amount spent on that group with the card / total amount spent on the card) * 100
Also, for the larger picture, I want to know the top 5 groups for each card by amount spent.
These are basically two queries. It would be great if anyone could help.
Note: the code above is only meant to show what my data frame looks like.
python pivot-table pandas-groupby
asked Nov 22 at 6:13 by Aysha
edited Nov 22 at 7:28 by rdj7
1 Answer
Regarding the first query: first we get the total amount spent on each card, selecting the amount column before summing so that only that column is aggregated:
card_totals_dict = df.groupby('card')['amount'].sum().to_dict()
card_totals_dict
Output:
{'XXX': 1325, 'YYY': 465}
Then we calculate each group's percentage of its card's total:
group_percentage = df.groupby(['card', 'group_code'])['amount'].sum().reset_index()
group_percentage['percentage'] = group_percentage['amount'] * 100 / group_percentage['card'].map(card_totals_dict)
group_percentage
Output:
  card group_code  amount  percentage
0  XXX        111     725     54.7170
1  XXX        333     600     45.2830
2  YYY        111     325     69.8925
3  YYY        222     140     30.1075
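The question asks for one row per card with one percentage column per group code. A minimal sketch of that wide layout, assuming the same sample data frame; the names wide and pct_wide are only illustrative and are not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    "customer": [1, 2, 1, 3, 1, 2, 3],
    "group_code": ["111", "111", "222", "111", "111", "111", "333"],
    "ind_code": ["A", "B", "AA", "A", "AAA", "C", "BBB"],
    "amount": [100, 200, 140, 400, 225, 125, 600],
    "card": ["XXX", "YYY", "YYY", "XXX", "XXX", "YYY", "XXX"],
})

# Sum the amount per (card, group_code); combinations with no spend become 0.
wide = df.pivot_table(index="card", columns="group_code", values="amount",
                      aggfunc="sum", fill_value=0)

# Divide each row by its own total so every card's row adds up to 100.
pct_wide = wide.div(wide.sum(axis=1), axis=0) * 100
print(pct_wide.round(4))
```

Group codes a card never used show up as 0 here rather than being absent, which may or may not be what you want for reporting.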
Regarding the second query, it sounds very similar to this question, so I would say:
df.groupby(['card', 'group_code'])['amount'].sum().groupby(level=0, group_keys=False).nlargest(5)
Using nlargest(1) instead returns:
card  group_code
XXX   111           725
YYY   111           325
Name: amount, dtype: int64
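If a flat data frame is easier to work with than a MultiIndex series, the same top-N selection can be sketched with a sort followed by head per card. This is an alternative, not the approach above; the helper name top_groups is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "customer": [1, 2, 1, 3, 1, 2, 3],
    "group_code": ["111", "111", "222", "111", "111", "111", "333"],
    "ind_code": ["A", "B", "AA", "A", "AAA", "C", "BBB"],
    "amount": [100, 200, 140, 400, 225, 125, 600],
    "card": ["XXX", "YYY", "YYY", "XXX", "XXX", "YYY", "XXX"],
})

def top_groups(frame, n):
    """Return the n group codes with the highest total amount for each card."""
    # Total amount per (card, group_code), kept as ordinary columns.
    totals = frame.groupby(["card", "group_code"], as_index=False)["amount"].sum()
    # Within each card, put the largest totals first.
    ranked = totals.sort_values(["card", "amount"], ascending=[True, False])
    # head(n) keeps the first n rows of every card in the sorted order.
    return ranked.groupby("card").head(n).reset_index(drop=True)

top1 = top_groups(df, 1)
top5 = top_groups(df, 5)
print(top5)
```

With this sample data each card has only two group codes, so asking for the top 5 simply returns all of them.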
answered Nov 22 at 6:41 by andersource
edited Nov 22 at 6:48