Pandas: compare list objects in Series
In my dataframe a column is made up of lists, for example:
import pandas as pd
df = pd.DataFrame({'A':[[1,2],[2,4],[3,1]]})
I need to find the location of the list [1,2] in this dataframe. I tried:
df.loc[df['A'] == [1,2]]
and
df.loc[df['A'] == [[1,2]]]
but both attempts failed. The comparison seems very simple, yet it just doesn't work. Am I missing something here?
python pandas
asked Nov 1 '18 at 13:53 by Shiang Hoo; edited Nov 1 '18 at 14:25 by Seanny123
The only thing you're "missing" is that data frames aren't really great for storing lists. Any reason you don't want two separate columns?
– BallpointBen
Nov 1 '18 at 21:18
@BallpointBen Thanks for your attention, I've posted a new question to explain the whole question. stackoverflow.com/questions/53115592/…
– Shiang Hoo
Nov 2 '18 at 9:11
@Luuklag This may be a duplicate, but I don't believe it's a duplicate of the target you suggest. That one seems to be trying to filter based on whether multiple columns are equal to particular values. This one is trying to check if the list is equal to a single column's value, which has a very different answer.
– jpmc26
Nov 13 '18 at 22:19
Feel free to suggest a more appropriate target.
– Luuklag
Nov 13 '18 at 23:05
@Luuklag, I posted the two questions because I don't think they are the same. As jpmc described, they are connected but also very different. This post is actually a variant of that one: I tried a naive approach to solve that one, and this question grew out of that attempt. But this one still has its own distinct value. Can you please remove the duplicate target?
– Shiang Hoo
Nov 19 '18 at 2:56
5 Answers
Do not store lists in cells; it creates a lot of problems for pandas. If you do need an object column, use tuples:
df.A.map(tuple).isin([(1,2)])
Out[293]:
0 True
1 False
2 False
Name: A, dtype: bool
#df[df.A.map(tuple).isin([(1,2)])]
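The tuple approach above can be sketched end to end as follows. This is a minimal illustration using the question's sample data; the variable names (as_tuples, mask, locations) are made up for the example. It assumes every cell is a list of hashable items, since tuple elements must be hashable for isin to work.

```python
import pandas as pd

df = pd.DataFrame({'A': [[1, 2], [2, 4], [3, 1]]})

# Convert each list to a hashable tuple so isin can compare whole values.
as_tuples = df['A'].map(tuple)
mask = as_tuples.isin([(1, 2)])

# The boolean mask gives both the matching rows and their index labels.
matches = df[mask]
locations = df.index[mask].tolist()
print(matches)
print(locations)
```

If the column can be stored as tuples from the start, the map(tuple) step is not needed on every lookup.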
answered Nov 1 '18 at 13:56 by W-B; edited Nov 25 '18 at 19:42 by anothermh
You can use apply and compare as:
df['A'].apply(lambda x: x==[1,2])
0 True
1 False
2 False
Name: A, dtype: bool
print(df[df['A'].apply(lambda x: x==[1,2])])
A
0 [1, 2]
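A small, self-contained sketch of the same idea, wrapped in a helper so the target list is a parameter. The helper name rows_equal_to and the extra [2, 1] row are hypothetical, added only to show that Python's list equality is order-sensitive, so [2, 1] does not match [1, 2].

```python
import pandas as pd

def rows_equal_to(series, target):
    """Return a boolean Series marking cells whose list equals `target`."""
    # Plain Python list equality, evaluated once per cell.
    return series.apply(lambda cell: cell == target)

df = pd.DataFrame({'A': [[1, 2], [2, 4], [3, 1], [2, 1]]})
mask = rows_equal_to(df['A'], [1, 2])

print(df[mask])                  # matching rows
print(df.index[mask].tolist())   # their index labels
```

Because the comparison runs in Python for every row, this is simple and tolerant of lists with different lengths, but it is not vectorised.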
answered Nov 1 '18 at 13:56 by Sandeep Kadapa
With NumPy arrays:
df.assign(B=(np.array(df.A.tolist()) == [1, 2]).all(1))
A B
0 [1, 2] True
1 [2, 4] False
2 [3, 1] False
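The one-liner above relies on every list having the same length, because only then does np.array(df.A.tolist()) produce a regular 2-D array. If the lengths can differ, one possible workaround is to compare only the rows whose length matches the target. The sketch below is an assumption-laden illustration, not part of the original answer; match_list is a hypothetical helper name, and the [1, 2, 3] row is added to show the ragged case.

```python
import numpy as np
import pandas as pd

def match_list(series, target):
    """Mark cells equal to `target`, tolerating lists of other lengths."""
    out = np.zeros(len(series), dtype=bool)
    # Only cells with the right length can possibly match.
    same_len = series.map(len).to_numpy() == len(target)
    if same_len.any():
        # These rows all share one length, so they stack into a 2-D array.
        block = np.array(series[same_len].tolist())
        out[same_len] = (block == np.asarray(target)).all(axis=1)
    return pd.Series(out, index=series.index)

df = pd.DataFrame({'A': [[1, 2], [2, 4], [3, 1], [1, 2, 3]]})
df = df.assign(B=match_list(df['A'], [1, 2]))
print(df)
```

The length filter costs one extra pass over the column, so for data that is known to be rectangular the original one-liner is the simpler choice.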
answered Nov 1 '18 at 14:34 by piRSquared
This should be the accepted solution! [Or, if possible, just expanding the series of lists to 2 series.]
– jpp
Nov 1 '18 at 14:43
Won't this run into issues if the lists are differently sized? Though perhaps that's outside the scope of this example.
– ALollz
Nov 1 '18 at 15:40
@ALollz yes and yes
– piRSquared
Nov 1 '18 at 15:53
Nice! My only concern is that this solution converts the data type twice. If my dataframe is very big, will this conversion cost more time?
– Shiang Hoo
Nov 2 '18 at 2:41
Using NumPy:
df.A.apply(lambda x: (np.array(x) == np.array([1,2])).all())
0 True
1 False
2 False
answered Nov 1 '18 at 14:32 by Vaishali; edited Nov 21 '18 at 0:28
Or:
df['A'].apply(([1,2]).__eq__)
Then:
df[df['A'].apply(([1,2]).__eq__)]
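One caveat with passing the bound method ([1,2]).__eq__ directly: when a cell is not a list (a tuple, say), list.__eq__ returns the NotImplemented sentinel rather than False, so the result is no longer a clean boolean mask. A hedged alternative, assuming cells may be of mixed types, is to go through operator.eq, which applies Python's normal == fallback. The mixed list/tuple column below is invented purely to show the difference.

```python
from functools import partial
import operator

import pandas as pd

df = pd.DataFrame({'A': [[1, 2], (1, 2), [3, 1]]})
target = [1, 2]

# Calling the bound method directly skips Python's == fallback logic.
raw = [target.__eq__(cell) for cell in df['A']]
print(raw[1] is NotImplemented)   # the tuple cell yields the sentinel

# operator.eq(target, cell) is the same as `target == cell`.
mask = df['A'].apply(partial(operator.eq, target))
print(df[mask])
```

When every cell is guaranteed to be a list, the bound-method form behaves the same as the lambda-based comparison.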
answered Nov 9 '18 at 4:18 by U9-Forward