How to replace a float value with NaN in pandas?

I'm aware of the replace function in pandas: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.replace.html

But I've done this simple test and it is not working as expected when I try to replace a float value:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(50, 4), columns=list('ABCD'))
print(df.head(n=1))

          A         B        C         D
0  1.437202  1.919894 -1.40674 -0.316737

df = df.replace(1.437202, np.nan)
print(df.head(n=1))

          A         B        C         D
0  1.437202  1.919894 -1.40674 -0.316737

As you can see, the value at position [0, 0] is unchanged. Any idea what could be causing this?

asked Nov 22 at 8:35

ralvarez


python pandas replace nan

2 Answers

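The problem is floating-point precision: the value shown by print (1.437202) is rounded for display, so it is not exactly equal to the value stored in the DataFrame and replace finds no match. Use numpy.isclose together with mask instead: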

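import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randn(50, 4), columns=list('ABCD'))
print(df.head(n=1))

          A         B         C         D
0 -1.085631  0.997345  0.282978 -1.506295

# Set every cell that is approximately equal to 0.997345 to NaN
df = df.mask(np.isclose(df.values, 0.997345))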

Or use numpy.where:

arr = np.where(np.isclose(df.values, 0.997345), np.nan, df.values)
df = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(df.head(n=1))

          A   B         C         D
0 -1.085631 NaN  0.282978 -1.506295
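
As a minimal sketch of why the exact match fails, the snippet below compares the stored value with the six-decimal value that print displays. The variable names (stored, shown, unchanged, masked) and the explicit tolerance atol=1e-6 are assumptions chosen for illustration; the default tolerances of numpy.isclose also work for this value.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randn(50, 4), columns=list('ABCD'))

stored = df.iloc[0, 1]   # full-precision value held in the frame
shown = 0.997345         # the rounded value that print() displays

# The two differ beyond the sixth decimal, so an exact comparison fails.
print(repr(stored))
print(stored == shown)

# replace() compares exactly, so no cell is changed.
unchanged = df.replace(shown, np.nan)

# isclose() with an absolute tolerance matches the displayed value.
mask = np.isclose(df.to_numpy(), shown, rtol=0, atol=1e-6)
masked = df.mask(mask)
```

Setting rtol=0 makes the comparison depend only on the absolute tolerance, which is easier to reason about when the target was copied from printed output with a known number of decimals.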

EDIT: If the DataFrame also contains non-numeric columns, you can select only the numeric columns with select_dtypes and apply the mask to that subset:

np.random.seed(123)
df = pd.DataFrame(np.random.randn(50, 4), columns=list('ABCD')).assign(E='a')

cols = df.select_dtypes(np.number).columns
df[cols] = df[cols].mask(np.isclose(df[cols].values, 0.997345))
print(df.head(n=1))

          A   B         C         D  E
0 -1.085631 NaN  0.282978 -1.506295  a

edited Nov 22 at 9:50

answered Nov 22 at 8:38

jezrael


Good options indeed, but both fail if the columns are not all numeric. If I set a cell to a string with df.iloc[[0], [0]] = 'random_string' and then apply either method to the whole dataset, I get an error: TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
– ralvarez
Nov 22 at 9:43

Maybe I should have explained that before. The example I gave had only numeric values, but I'm looking for a method that works with features of any type.
– ralvarez
Nov 22 at 9:49

@ralvarez - I added a general solution: filter only the numeric columns and apply the solution to them.
– jezrael
Nov 22 at 9:53
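
For the case raised in the comments, where a single column mixes strings and floats, select_dtypes would skip that column entirely because its dtype is object. One possible workaround, sketched here as an assumption rather than as part of the answer above, is to build the mask from a numeric copy created with pd.to_numeric(errors='coerce'). The names target, numeric and result are hypothetical.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randn(50, 4), columns=list('ABCD')).assign(E='a')

# Make column A hold a mix of one string and many floats.
df['A'] = df['A'].astype(object)
df.loc[0, 'A'] = 'random_string'

target = 0.997345

# Coerce each column to numbers; anything non-numeric becomes NaN,
# and isclose() never reports NaN as a match.
numeric = df.apply(pd.to_numeric, errors='coerce')
mask = np.isclose(numeric.to_numpy(dtype=float), target)

# Apply the mask to the original frame, so strings are left untouched.
result = df.mask(mask)
print(result.head(n=1))
```

The numeric copy is used only to decide which cells match; the original values, including the strings, are kept wherever the mask is False.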

Just another trick, for when you already know the specific index:

>>> print(df.head(n=1))
          A         B         C         D
0 -0.042839  1.701118  0.064779  1.513046

>>> df['A'][0] = np.nan

>>> print(df.head(n=1))
    A         B         C         D
0 NaN  1.701118  0.064779  1.513046
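
Note that df['A'][0] = ... is chained indexing, which the pandas documentation advises against: it can raise SettingWithCopyWarning, and with Copy-on-Write enabled it is not expected to modify df at all. A minimal sketch of the same idea with a single indexing call, assuming the default integer row labels used in this thread:

```python
import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame(np.random.randn(50, 4), columns=list('ABCD'))

# Label-based: row label 0, column label 'A'.
df.loc[0, 'A'] = np.nan

# Position-based: first row, second column, regardless of labels.
df.iloc[0, 1] = np.nan

print(df.head(n=1))
```

Use .loc when you know the row and column labels and .iloc when you know the integer positions; for a single scalar cell, .at and .iat are the equivalent accessors.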

answered Nov 22 at 9:52

pygo

