Reading from an Excel file with pandas in the desired type
I am reading an Excel file with pandas. It contains 2 columns:
df
EID List of Tuples
1 [('Physics', 90)]
2 [('Physics', 80), ('Math', 70)]
3 [('Physics', 60, ('Math', 25))]
When I check df['List of Tuples'].iat[0]
it gives me u"[('Physics', 90)]"
I am getting this as unicode and not as a list of tuples.
When I run df['List of Tuples'].iat[0].decode('iso-8859-1').encode('utf-8')
I get a string: "[('Physics', 90)]"
I want to read or convert it as a list of tuples, [('Physics', 90)], instead of a string or unicode. In short, I want to get rid of the double quotes around each entry and read the cells as [('Physics', 90)], [('Physics', 80), ('Math', 70)], and so on.
python pandas
1
The formatting is all over the place and I can't fix from my phone, but if you're hoping to use dataframes for lists of tuples in single cells, you're opening a door to a lot of pain. That's not how pandas works. You should shoot for a structure that has scalars in cells or, IMO, drop pandas if you can't do that.
– roganjosh
Nov 21 at 19:56
edited Nov 21 at 21:29
Ken Dekalb
asked Nov 21 at 19:52
amanda smith
1 Answer
accepted
You might find it useful to parse these into Python objects using the ast module, whose literal_eval function can convert string representations of literals back into Python objects. Try something like the following (I can't replicate it exactly because I don't have your data):
import ast
df['transformed_tuples'] = df['List of Tuples'].apply(ast.literal_eval)
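To see what ast.literal_eval does to cells like the ones in the question, here is a minimal sketch that needs no Excel file. It assumes Python 3, and parse_cell is a hypothetical helper name, not part of pandas or ast. The guard for non-string values is also an assumption, added because empty Excel cells usually arrive as NaN rather than as text.

```python
import ast


def parse_cell(value):
    """Turn a cell such as "[('Physics', 90)]" into a real list of tuples."""
    if not isinstance(value, str):
        # Assumption: non-string cells (e.g. NaN for an empty cell) are passed through.
        return value
    # literal_eval only evaluates Python literals; it does not run arbitrary code.
    return ast.literal_eval(value)


# The three cell values from the question, as the strings pandas returns.
cells = [
    "[('Physics', 90)]",
    "[('Physics', 80), ('Math', 70)]",
    "[('Physics', 60, ('Math', 25))]",
]

parsed = [parse_cell(cell) for cell in cells]
print(parsed[0])           # a list holding one tuple
print(type(parsed[1][1]))  # each element is a tuple, not a string
```

The same helper could be handed to the column with df['List of Tuples'].apply(parse_cell). Note that the third row parses as written: one tuple of three elements whose last element is a nested tuple, so if that row is a typo in the spreadsheet it has to be fixed at the source.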
To avoid this arising in the first place, consider the file format you read from and write to. For example, pickle will retain the original Python objects (I'm assuming this data came from a pandas DataFrame that was saved to Excel).
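A rough sketch of the pickle round trip, assuming the data starts out as a DataFrame that already holds real lists of tuples. The file name scores.pkl and the sample rows are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Assumed starting point: cells that hold actual lists of tuples, not strings.
df = pd.DataFrame({
    "EID": [1, 2],
    "List of Tuples": [[("Physics", 90)], [("Physics", 80), ("Math", 70)]],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.pkl")  # hypothetical file name
    df.to_pickle(path)                      # stores the Python objects themselves
    restored = pd.read_pickle(path)

first = restored["List of Tuples"].iat[0]
print(type(first), first)  # still a list of tuples after the round trip
```

Only load pickle files that come from a source you trust, since unpickling can execute code.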
You might also consider a tabular schema that doesn't have this irregular data type in it, which would probably prove more stable and effective in the long run.
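One possible shape for such a schema is one row per (EID, subject, score), so every cell holds a scalar. This is only a sketch: the column names Subject and Score are assumptions, and it uses just the first two rows from the question because the third row is not a clean list of pairs.

```python
import pandas as pd

# Parsed data keyed by EID (first two rows of the question's table).
records = {
    1: [("Physics", 90)],
    2: [("Physics", 80), ("Math", 70)],
}

# Flatten to one row per (EID, subject, score).
rows = [
    {"EID": eid, "Subject": subject, "Score": score}
    for eid, pairs in records.items()
    for subject, score in pairs
]

tidy = pd.DataFrame(rows, columns=["EID", "Subject", "Score"])
print(tidy)
```

A table in this form survives a trip through Excel or CSV unchanged, and the usual pandas operations such as filtering and groupby work on it directly.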
edited Nov 21 at 20:02
answered Nov 21 at 19:57
Sven Harris