populating pandas columns based on values in other columns

My dataframe contains these columns

ID   Address1   Address1-State   Address1-City  Address2  Address2-State  Address2-City   Address    State      City

 1    6th street   MN              Mpls

 2                                               15th St      MI           Flint

 3                 MA              Boston          Essex St   NY            New York

 4    7 street SE  MN              Mpls           8th St      IL             Chicago

Now I want to populate Address field in such a way that if Address1 is blank,populate Address2 and the state city fields of Address2

In the above case final dataframe would be like this

 ID     Address   State    City

  1      6th street   MN    Mpls

  2      15th St      MI    Flint

  3      Essex St     NY     New York

  4     7 street SE  MN     Mpls

Currently,I am doing this

def fill_add(address1,address2):

if address1!='':

    address=address1



elif address1=='' and address2!='':

    address=address2

elif address1=='' and address2=='':

     address=''



return address





def fill_add_apply(df):

df['Address']=df.apply(lambda row:fill_add(row['Address1'],row['Address2']),axis=1)

Do I have to do the same for all the other columns?Is there a better way?

Just to clarify, in ID=3 Address,State,City should be "Essex St NY New York" because Address 1 is blank so it should choose Address2 and the city and state of Address2.
In short, if Address1 is blank it should choose Address2,Address2-State and Address2-City,even if Address1-State and Address1-City are not blank.

edited Nov 23 at 4:15

asked Nov 23 at 3:34

amanda smith

244

add a comment |

My dataframe contains these columns

ID   Address1   Address1-State   Address1-City  Address2  Address2-State  Address2-City   Address    State      City

 1    6th street   MN              Mpls

 2                                               15th St      MI           Flint

 3                 MA              Boston          Essex St   NY            New York

 4    7 street SE  MN              Mpls           8th St      IL             Chicago

Now I want to populate Address field in such a way that if Address1 is blank,populate Address2 and the state city fields of Address2

In the above case final dataframe would be like this

 ID     Address   State    City

  1      6th street   MN    Mpls

  2      15th St      MI    Flint

  3      Essex St     NY     New York

  4     7 street SE  MN     Mpls

Currently,I am doing this

def fill_add(address1,address2):

if address1!='':

    address=address1



elif address1=='' and address2!='':

    address=address2

elif address1=='' and address2=='':

     address=''



return address





def fill_add_apply(df):

df['Address']=df.apply(lambda row:fill_add(row['Address1'],row['Address2']),axis=1)

Do I have to do the same for all the other columns?Is there a better way?

edited Nov 23 at 4:15

asked Nov 23 at 3:34

amanda smith

244

add a comment |

My dataframe contains these columns

ID   Address1   Address1-State   Address1-City  Address2  Address2-State  Address2-City   Address    State      City

 1    6th street   MN              Mpls

 2                                               15th St      MI           Flint

 3                 MA              Boston          Essex St   NY            New York

 4    7 street SE  MN              Mpls           8th St      IL             Chicago

Now I want to populate Address field in such a way that if Address1 is blank,populate Address2 and the state city fields of Address2

In the above case final dataframe would be like this

 ID     Address   State    City

  1      6th street   MN    Mpls

  2      15th St      MI    Flint

  3      Essex St     NY     New York

  4     7 street SE  MN     Mpls

Currently,I am doing this

def fill_add(address1,address2):

if address1!='':

    address=address1



elif address1=='' and address2!='':

    address=address2

elif address1=='' and address2=='':

     address=''



return address





def fill_add_apply(df):

df['Address']=df.apply(lambda row:fill_add(row['Address1'],row['Address2']),axis=1)

Do I have to do the same for all the other columns?Is there a better way?

edited Nov 23 at 4:15

asked Nov 23 at 3:34

amanda smith

244

My dataframe contains these columns

ID   Address1   Address1-State   Address1-City  Address2  Address2-State  Address2-City   Address    State      City

 1    6th street   MN              Mpls

 2                                               15th St      MI           Flint

 3                 MA              Boston          Essex St   NY            New York

 4    7 street SE  MN              Mpls           8th St      IL             Chicago

Now I want to populate Address field in such a way that if Address1 is blank,populate Address2 and the state city fields of Address2

In the above case final dataframe would be like this

 ID     Address   State    City

  1      6th street   MN    Mpls

  2      15th St      MI    Flint

  3      Essex St     NY     New York

  4     7 street SE  MN     Mpls

Currently,I am doing this

def fill_add(address1,address2):

if address1!='':

    address=address1



elif address1=='' and address2!='':

    address=address2

elif address1=='' and address2=='':

     address=''



return address





def fill_add_apply(df):

df['Address']=df.apply(lambda row:fill_add(row['Address1'],row['Address2']),axis=1)

Do I have to do the same for all the other columns?Is there a better way?

python pandas

edited Nov 23 at 4:15

asked Nov 23 at 3:34

amanda smith

244

edited Nov 23 at 4:15

asked Nov 23 at 3:34

amanda smith

244

edited Nov 23 at 4:15

asked Nov 23 at 3:34

amanda smith

244

asked Nov 23 at 3:34

amanda smith

244

asked Nov 23 at 3:34

amanda smith

244

add a comment |

3 Answers
3

active

oldest

votes

1st modify your column , then using groupby +first

df=df.replace('',np.nan)#prepare for first 



df.columns=df.columns.str.replace('d+','')

df.columns=df.columns.str.split('-').str[-1]

newdf=df.groupby(level=0,axis=1).first()

newdf.loc[df.iloc[:,1].isnull(),:]=df.groupby(level=0,axis=1).last()

newdf

Out[40]: 

       Address      City  ID State

0  6th street       Mpls   1    MN

1      15th St     Flint   2    MI

2    Essexb St  New York   3    NY

3  7 street SE      Mpls   4    MN

edited Nov 23 at 4:10

answered Nov 23 at 4:03

W-B

100k73163

Thanks! But in ID=3 it should be "Essex St NY New York" because Address 1 is blank so it should choose Address2 and the city and state of Address2.
– amanda smith
Nov 23 at 4:07

@amandasmith check the update, also if this is what you need , would you like accept it ? check mark at the left
– W-B
Nov 23 at 4:10

add a comment |

(Given you do not have any duplicate index)

Select indices you want to fill with Adress1:

Address1_index = df.loc[!df.Address1.empty() and !df.Address1-State.empty() and !df.Address1-City.empty()].index

then put Address1 data in your desired columns:

df.loc[Adress1_index, ["Adress", "State", "City"]] = df.loc[Adress1_index, ["Adress1", "Adress1-State", "Adress1-City"]]

Now select indices you want to fill with address2:

Address2_index = df.loc[df.Adress1.empty() or df.Adress1-State.empty() or df.Adress1-City.empty()].index

then fill these also:

df.loc[Adress2_index, ["Adress", "State", "City"]] = df.loc[Adress2_index, ["Adress2", "Adress2-State", "Adress2-City"]]

Remove columns that you do not want:

df.drop(["Address1", "Adress1-State", "Adress1-City", "Address2", "Adress2-State", "Adress2-City"], axis = 1, inplace = True)

edited Nov 23 at 4:24

answered Nov 23 at 4:19

Joseph Choi

788

add a comment |

import numpy as np



df=df.replace('',np.nan)



addr_1=['ID','Address1','Address1-State','Address1-City']

addr_2=['ID','Address2','Address2-State','Address2-City']



new_df=pd.DataFrame(df[addr_1].values.copy(),columns=['ID','Address','State','City'])



new_df.loc[new_df['Address'].isnull(),:]=df.loc[df['Address1'].isnull(),addr_2].values



#print(new_df)

    ID  Address     State   City

0   1   6th street  MN      Mpls

1   2   15th St     MI      Flint

2   3   Essex St    NY      New York

3   4   7 street SE MN      Mpls

edited Nov 23 at 5:16

answered Nov 23 at 5:10

pyd

1,8241924

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53440366%2fpopulating-pandas-columns-based-on-values-in-other-columns%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

1st modify your column , then using groupby +first

df=df.replace('',np.nan)#prepare for first 



df.columns=df.columns.str.replace('d+','')

df.columns=df.columns.str.split('-').str[-1]

newdf=df.groupby(level=0,axis=1).first()

newdf.loc[df.iloc[:,1].isnull(),:]=df.groupby(level=0,axis=1).last()

newdf

Out[40]: 

       Address      City  ID State

0  6th street       Mpls   1    MN

1      15th St     Flint   2    MI

2    Essexb St  New York   3    NY

3  7 street SE      Mpls   4    MN

edited Nov 23 at 4:10

answered Nov 23 at 4:03

W-B

100k73163

Thanks! But in ID=3 it should be "Essex St NY New York" because Address 1 is blank so it should choose Address2 and the city and state of Address2.
– amanda smith
Nov 23 at 4:07

@amandasmith check the update, also if this is what you need , would you like accept it ? check mark at the left
– W-B
Nov 23 at 4:10

add a comment |

1st modify your column , then using groupby +first

df=df.replace('',np.nan)#prepare for first 



df.columns=df.columns.str.replace('d+','')

df.columns=df.columns.str.split('-').str[-1]

newdf=df.groupby(level=0,axis=1).first()

newdf.loc[df.iloc[:,1].isnull(),:]=df.groupby(level=0,axis=1).last()

newdf

Out[40]: 

       Address      City  ID State

0  6th street       Mpls   1    MN

1      15th St     Flint   2    MI

2    Essexb St  New York   3    NY

3  7 street SE      Mpls   4    MN

edited Nov 23 at 4:10

answered Nov 23 at 4:03

W-B

100k73163

Thanks! But in ID=3 it should be "Essex St NY New York" because Address 1 is blank so it should choose Address2 and the city and state of Address2.
– amanda smith
Nov 23 at 4:07

@amandasmith check the update, also if this is what you need , would you like accept it ? check mark at the left
– W-B
Nov 23 at 4:10

add a comment |

1st modify your column , then using groupby +first

df=df.replace('',np.nan)#prepare for first 



df.columns=df.columns.str.replace('d+','')

df.columns=df.columns.str.split('-').str[-1]

newdf=df.groupby(level=0,axis=1).first()

newdf.loc[df.iloc[:,1].isnull(),:]=df.groupby(level=0,axis=1).last()

newdf

Out[40]: 

       Address      City  ID State

0  6th street       Mpls   1    MN

1      15th St     Flint   2    MI

2    Essexb St  New York   3    NY

3  7 street SE      Mpls   4    MN

edited Nov 23 at 4:10

answered Nov 23 at 4:03

W-B

100k73163

1st modify your column , then using groupby +first

df=df.replace('',np.nan)#prepare for first 



df.columns=df.columns.str.replace('d+','')

df.columns=df.columns.str.split('-').str[-1]

newdf=df.groupby(level=0,axis=1).first()

newdf.loc[df.iloc[:,1].isnull(),:]=df.groupby(level=0,axis=1).last()

newdf

Out[40]: 

       Address      City  ID State

0  6th street       Mpls   1    MN

1      15th St     Flint   2    MI

2    Essexb St  New York   3    NY

3  7 street SE      Mpls   4    MN

edited Nov 23 at 4:10

answered Nov 23 at 4:03

W-B

100k73163

edited Nov 23 at 4:10

answered Nov 23 at 4:03

W-B

100k73163

answered Nov 23 at 4:03

W-B

100k73163

answered Nov 23 at 4:03

W-B

100k73163

Thanks! But in ID=3 it should be "Essex St NY New York" because Address 1 is blank so it should choose Address2 and the city and state of Address2.
– amanda smith
Nov 23 at 4:07

@amandasmith check the update, also if this is what you need , would you like accept it ? check mark at the left
– W-B
Nov 23 at 4:10

add a comment |

Thanks! But in ID=3 it should be "Essex St NY New York" because Address 1 is blank so it should choose Address2 and the city and state of Address2.
– amanda smith
Nov 23 at 4:07

@amandasmith check the update, also if this is what you need , would you like accept it ? check mark at the left
– W-B
Nov 23 at 4:10

Thanks! But in ID=3 it should be "Essex St NY New York" because Address 1 is blank so it should choose Address2 and the city and state of Address2.
– amanda smith
Nov 23 at 4:07

@amandasmith check the update, also if this is what you need , would you like accept it ? check mark at the left
– W-B
Nov 23 at 4:10

add a comment |

(Given you do not have any duplicate index)

Select indices you want to fill with Adress1:

Address1_index = df.loc[!df.Address1.empty() and !df.Address1-State.empty() and !df.Address1-City.empty()].index

then put Address1 data in your desired columns:

df.loc[Adress1_index, ["Adress", "State", "City"]] = df.loc[Adress1_index, ["Adress1", "Adress1-State", "Adress1-City"]]

Now select indices you want to fill with address2:

Address2_index = df.loc[df.Adress1.empty() or df.Adress1-State.empty() or df.Adress1-City.empty()].index

then fill these also:

df.loc[Adress2_index, ["Adress", "State", "City"]] = df.loc[Adress2_index, ["Adress2", "Adress2-State", "Adress2-City"]]

Remove columns that you do not want:

df.drop(["Address1", "Adress1-State", "Adress1-City", "Address2", "Adress2-State", "Adress2-City"], axis = 1, inplace = True)

edited Nov 23 at 4:24

answered Nov 23 at 4:19

Joseph Choi

788

add a comment |

(Given you do not have any duplicate index)

Select indices you want to fill with Adress1:

Address1_index = df.loc[!df.Address1.empty() and !df.Address1-State.empty() and !df.Address1-City.empty()].index

then put Address1 data in your desired columns:

df.loc[Adress1_index, ["Adress", "State", "City"]] = df.loc[Adress1_index, ["Adress1", "Adress1-State", "Adress1-City"]]

Now select indices you want to fill with address2:

Address2_index = df.loc[df.Adress1.empty() or df.Adress1-State.empty() or df.Adress1-City.empty()].index

then fill these also:

df.loc[Adress2_index, ["Adress", "State", "City"]] = df.loc[Adress2_index, ["Adress2", "Adress2-State", "Adress2-City"]]

Remove columns that you do not want:

df.drop(["Address1", "Adress1-State", "Adress1-City", "Address2", "Adress2-State", "Adress2-City"], axis = 1, inplace = True)

edited Nov 23 at 4:24

answered Nov 23 at 4:19

Joseph Choi

788

add a comment |

(Given you do not have any duplicate index)

Select indices you want to fill with Adress1:

Address1_index = df.loc[!df.Address1.empty() and !df.Address1-State.empty() and !df.Address1-City.empty()].index

then put Address1 data in your desired columns:

df.loc[Adress1_index, ["Adress", "State", "City"]] = df.loc[Adress1_index, ["Adress1", "Adress1-State", "Adress1-City"]]

Now select indices you want to fill with address2:

Address2_index = df.loc[df.Adress1.empty() or df.Adress1-State.empty() or df.Adress1-City.empty()].index

then fill these also:

df.loc[Adress2_index, ["Adress", "State", "City"]] = df.loc[Adress2_index, ["Adress2", "Adress2-State", "Adress2-City"]]

Remove columns that you do not want:

df.drop(["Address1", "Adress1-State", "Adress1-City", "Address2", "Adress2-State", "Adress2-City"], axis = 1, inplace = True)

edited Nov 23 at 4:24

answered Nov 23 at 4:19

Joseph Choi

788

(Given you do not have any duplicate index)

Select indices you want to fill with Adress1:

Address1_index = df.loc[!df.Address1.empty() and !df.Address1-State.empty() and !df.Address1-City.empty()].index

then put Address1 data in your desired columns:

df.loc[Adress1_index, ["Adress", "State", "City"]] = df.loc[Adress1_index, ["Adress1", "Adress1-State", "Adress1-City"]]

Now select indices you want to fill with address2:

Address2_index = df.loc[df.Adress1.empty() or df.Adress1-State.empty() or df.Adress1-City.empty()].index

then fill these also:

df.loc[Adress2_index, ["Adress", "State", "City"]] = df.loc[Adress2_index, ["Adress2", "Adress2-State", "Adress2-City"]]

Remove columns that you do not want:

df.drop(["Address1", "Adress1-State", "Adress1-City", "Address2", "Adress2-State", "Adress2-City"], axis = 1, inplace = True)

edited Nov 23 at 4:24

answered Nov 23 at 4:19

Joseph Choi

788

edited Nov 23 at 4:24

answered Nov 23 at 4:19

Joseph Choi

788

answered Nov 23 at 4:19

Joseph Choi

788

answered Nov 23 at 4:19

Joseph Choi

788

add a comment |

import numpy as np



df=df.replace('',np.nan)



addr_1=['ID','Address1','Address1-State','Address1-City']

addr_2=['ID','Address2','Address2-State','Address2-City']



new_df=pd.DataFrame(df[addr_1].values.copy(),columns=['ID','Address','State','City'])



new_df.loc[new_df['Address'].isnull(),:]=df.loc[df['Address1'].isnull(),addr_2].values



#print(new_df)

    ID  Address     State   City

0   1   6th street  MN      Mpls

1   2   15th St     MI      Flint

2   3   Essex St    NY      New York

3   4   7 street SE MN      Mpls

edited Nov 23 at 5:16

answered Nov 23 at 5:10

pyd

1,8241924

add a comment |

import numpy as np



df=df.replace('',np.nan)



addr_1=['ID','Address1','Address1-State','Address1-City']

addr_2=['ID','Address2','Address2-State','Address2-City']



new_df=pd.DataFrame(df[addr_1].values.copy(),columns=['ID','Address','State','City'])



new_df.loc[new_df['Address'].isnull(),:]=df.loc[df['Address1'].isnull(),addr_2].values



#print(new_df)

    ID  Address     State   City

0   1   6th street  MN      Mpls

1   2   15th St     MI      Flint

2   3   Essex St    NY      New York

3   4   7 street SE MN      Mpls

edited Nov 23 at 5:16

answered Nov 23 at 5:10

pyd

1,8241924

add a comment |

import numpy as np



df=df.replace('',np.nan)



addr_1=['ID','Address1','Address1-State','Address1-City']

addr_2=['ID','Address2','Address2-State','Address2-City']



new_df=pd.DataFrame(df[addr_1].values.copy(),columns=['ID','Address','State','City'])



new_df.loc[new_df['Address'].isnull(),:]=df.loc[df['Address1'].isnull(),addr_2].values



#print(new_df)

    ID  Address     State   City

0   1   6th street  MN      Mpls

1   2   15th St     MI      Flint

2   3   Essex St    NY      New York

3   4   7 street SE MN      Mpls

edited Nov 23 at 5:16

answered Nov 23 at 5:10

pyd

1,8241924

import numpy as np



df=df.replace('',np.nan)



addr_1=['ID','Address1','Address1-State','Address1-City']

addr_2=['ID','Address2','Address2-State','Address2-City']



new_df=pd.DataFrame(df[addr_1].values.copy(),columns=['ID','Address','State','City'])



new_df.loc[new_df['Address'].isnull(),:]=df.loc[df['Address1'].isnull(),addr_2].values



#print(new_df)

    ID  Address     State   City

0   1   6th street  MN      Mpls

1   2   15th St     MI      Flint

2   3   Essex St    NY      New York

3   4   7 street SE MN      Mpls

edited Nov 23 at 5:16

answered Nov 23 at 5:10

pyd

1,8241924

edited Nov 23 at 5:16

answered Nov 23 at 5:10

pyd

1,8241924

answered Nov 23 at 5:10

pyd

1,8241924

answered Nov 23 at 5:10

pyd

1,8241924

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Btukfyl