Custom mapping of categorical to numeric values
I have object-type columns that hold categorical values, for example 15-16 Years, 17-23 Years, and so on. I converted them to category and then used cat.codes. However, the codes start from 0 for the first group (0-4 Years), and I want the coding to start from 1, i.e. 0-4 -> 1, 5-12 -> 2, and @@ -> NaN.
The suggested solution of using a dictionary mapping still has issues. The following is an MCVE:
import pandas as pd
data = ['0-4 Years', '5-12 Years','13-18 Years', '19-21 Years','22-25 Years','26-29 Years','30-35 Years',
'36-41 Years','42-45 Years','46-49 Years','50-55 Years', '56-63 Years']
df = pd.DataFrame(data,columns=['Age'],dtype=object)
df['Age']=df['Age'].astype('category')
cats = dict(enumerate(df['Age'].cat.categories, 2))
df['Age']=df['Age'].cat.codes.map(cats).astype('category')
df['Age']
Here is the output. As you can see, if I start the enumeration at any value other than 0, some values become NaN. Secondly, the column is not numerically coded either:
df['Age']
0 NaN
1 36-41 Years
2 NaN
3 NaN
4 0-4 Years
5 13-18 Years
6 19-21 Years
7 22-25 Years
8 26-29 Years
9 30-35 Years
10 42-45 Years
11 46-49 Years
Name: Age, dtype: category
Categories (9, object): [0-4 Years, 13-18 Years, 19-21 Years, 22-25 Years, ..., 30-35 Years, 36-41 Years, 42-45 Years, 46-49 Years]
How can I fix this?
python mapping
edited Nov 22 at 10:21
asked Nov 22 at 4:03
aus_fas
1 Answer
You can create your own dictionary that maps codes to categories with:
cats = dict(enumerate(df['Age'].cat.categories, 1))
And use this dictionary to map the codes in the dataframe:
df['Age'].cat.codes.map(cats).astype('category')
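If the goal is for the column to hold the numbers themselves (1 for '0-4 Years', 2 for '5-12 Years', and NaN for anything unrecognised such as '@@'), one possible approach is sketched below. It is not part of the answer above: the explicit `age_order` list, the shortened sample data, and the `Age_code` column name are assumptions made for illustration. The sketch relies on categorical codes being -1 for values that are not in the category list.

```python
import numpy as np
import pandas as pd

# Assumed natural order of the age groups. Left to itself, pandas sorts the
# labels as strings, which places '5-12 Years' after '46-49 Years'.
age_order = ['0-4 Years', '5-12 Years', '13-18 Years', '19-21 Years',
             '22-25 Years', '26-29 Years', '30-35 Years', '36-41 Years',
             '42-45 Years', '46-49 Years', '50-55 Years', '56-63 Years']

# Shortened sample data, including one unrecognised value.
df = pd.DataFrame({'Age': ['0-4 Years', '5-12 Years', '@@', '56-63 Years']})

# Values missing from age_order (here '@@') become NaN in the categorical.
age_cat = pd.Categorical(df['Age'], categories=age_order, ordered=True)

# Codes are 0-based, with -1 marking a missing value.
# Replace -1 with NaN first, then shift so that the coding starts at 1.
codes = pd.Series(age_cat.codes, index=df.index, dtype='float64')
df['Age_code'] = codes.replace(-1, np.nan) + 1

print(df)
```

Because NaN cannot be stored in an integer column, `Age_code` ends up as floats (1.0, 2.0, NaN, 12.0 for the sample above).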
But what do you want to store in the column, the codes themselves?
– b-fg
Nov 22 at 5:48
Also, when I change the start of enumerate to any value higher than 1, df['Age'] starts to have NaN for categories where a mapping was available. The code up to cats is fine, as I can see the categories in the dictionary, but the second line seems to have an issue.
– aus_fas
Nov 22 at 6:08
The first line of code only creates a dictionary, so it is not very useful on its own. That's why there is a second line, where you use the dictionary to map it to your dataframe.
– b-fg
Nov 22 at 6:32
Maybe if you had provided a Minimal, Complete, and Verifiable Example, this would not be an issue. So I encourage you to edit your question with more content that other people can use to provide a better answer.
– b-fg
Nov 22 at 6:43
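The NaN values described in the comments can be traced to a mismatch between keys and codes: `cat.codes` always runs from 0 to n-1, while `dict(enumerate(categories, start))` has keys from `start` upward, so the lowest codes find no key and the remaining codes pick up a different label. A minimal sketch of this, together with a dictionary inverted to go from label to number, follows. The three-label sample series and the names `shifted`, `label_to_num`, and `numeric` are assumptions for illustration.

```python
import pandas as pd

s = pd.Series(['0-4 Years', '13-18 Years', '19-21 Years'], dtype='category')

# Codes are positions in s.cat.categories and always start at 0.
print(s.cat.codes.tolist())          # [0, 1, 2]

# Keys start at 2, so codes 0 and 1 have no entry and map to NaN,
# while code 2 is mapped to the first label instead of the third.
shifted = dict(enumerate(s.cat.categories, 2))
print(s.cat.codes.map(shifted).tolist())

# Inverting the direction (label -> number) avoids the mismatch:
# the labels themselves are looked up, so any start value works.
label_to_num = {label: num for num, label in enumerate(s.cat.categories, 1)}
numeric = s.astype(object).map(label_to_num)
print(numeric.tolist())              # [1, 2, 3]
```

With the label-to-number dictionary, a value that has no entry (such as '@@') simply comes out of `map` as NaN.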
answered Nov 22 at 4:16
b-fg