Pandas pd.read_csv does not work for simple sep=','

Good afternoon, everybody.

I know this is quite an easy question, but I simply do not understand why it does not work the way I expected.

The task is as follows:

I have a file data.csv presented in this format:

id,"feature_1","feature_2","feature_3"

00100429,"PROTO","Proprietary","Phone"

00100429,"PROTO","Proprietary","Phone"

The goal is to import this data using pandas. I know that pandas read_csv uses a comma separator by default, so I just imported it as follows:

data = pd.read_csv('data.csv')

And the result I got is the one I presented at the beginning, with no change at all: a single column that contains everything.

I tried many other separators using regex, and the only one that made some sort of improvement was:

data = pd.read_csv('data.csv',sep=",",engine='python')

On the one hand it finally separated all the columns; on the other hand, the way the data is presented is not convenient to use. In particular:

"id         ""feature_1""   ""feature_2""   ""feature_3"""

"00100429   ""PROTO""       ""Proprietary"" ""Phone"""

Therefore, I think there must be a mistake somewhere, because the data seems to be fine.

So the question is: how do I import a csv file with separated columns and no triple quote symbols?

Thank you.

asked Nov 24 '18 at 7:01

Kakalukia

133

I think the real file has a different format from the one you show under "I have a file data.csv presented in this format:", because your sample data works fine with sep=','. Can you create a better data sample that reproduces your bad output?

– jezrael
Nov 24 '18 at 7:16
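
One way to produce such a sample is to print the first few raw lines of the file with repr(), which makes otherwise hidden quotes and line endings visible. A minimal sketch: the helper name peek_raw_lines and the demo file contents are made up for illustration, and UTF-8 encoding is an assumption.

```python
import os
import tempfile
from itertools import islice


def peek_raw_lines(path, n=3):
    """Return the first n physical lines exactly as stored, line endings included."""
    # newline="" turns off newline translation, so "\r\n" stays visible
    with open(path, "r", newline="", encoding="utf-8") as handle:
        return list(islice(handle, n))


# Demo on a throwaway file (hypothetical contents, not the asker's real data)
sample = '"id,""feature_1"""\n"00100429,""PROTO"""\n'
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write(sample)
    demo_path = tmp.name

raw_lines = peek_raw_lines(demo_path)
for line in raw_lines:
    print(repr(line))  # repr() exposes stray quotes, tabs and line endings

os.remove(demo_path)
```

Pasting the repr() output into the question shows exactly what the parser receives, rather than what a spreadsheet program chooses to display.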

Your problem is here: sep=",". Simply use sep="," and don't put ``

– pygo
Nov 24 '18 at 8:04

Using data = pd.read_csv("sample.csv", sep=",",engine='python') gives me the same output as yours because of that ``.

– pygo
Nov 24 '18 at 8:07


python pandas csv


3 Answers

Here's my quick solution for your problem -

import numpy as np

import pandas as pd



### Read the file, treating the header as the first data row, then strip all the double quotes

df = pd.read_csv('file.csv', sep=',', header=None).apply(lambda x: x.str.replace('"', ''))

df



    0              1           2       3

0   id      feature_1   feature_2   feature_3

1   00100429    PROTO   Proprietary Phone

2   00100429    PROTO   Proprietary Phone



### Putting column names back and dropping the first row.

df.columns = df.iloc[0]

df.drop(index=0, inplace=True)

df



## You can reset the index 

        id  feature_1   feature_2   feature_3

1   00100429    PROTO   Proprietary Phone

2   00100429    PROTO   Proprietary Phone



### Converting `id` column datatype back to `int` (change according to your needs)



df.id = df.id.astype(int)  # np.int was removed in NumPy 1.24; the builtin int is equivalent

np.result_type(df.id)



dtype('int64')
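
If the leading zeros in id matter (00100429 rather than 100429), an integer conversion will drop them. A small sketch of the alternative, assuming a well-formed file like the sample in the question; io.StringIO stands in for the real file here.

```python
import io

import pandas as pd

csv_text = (
    'id,"feature_1","feature_2","feature_3"\n'
    '00100429,"PROTO","Proprietary","Phone"\n'
)

# Default type inference parses id as an integer, so the leading zeros are lost
inferred = pd.read_csv(io.StringIO(csv_text))

# Forcing the column to str keeps the identifier exactly as written
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"id": str})

print(inferred["id"].iloc[0])
print(as_text["id"].iloc[0])
```

Whether to keep id as text or as a number depends on how it is used later; identifiers are usually safer as strings.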

answered Nov 24 '18 at 8:09

dataLeo

5861419

Thank you for your help. I tried this solution and it worked perfectly. In fact, I tried to open this dataset with Excel and it did not show any problems (that's why I thought the problem was with the code). However, when I opened it using Python's open('file.csv','r'), I found that the lines looked like this: '"tac,""vendor"",""platform"",""type"""\n'. That clearly shows why I had such an issue reading it with pandas. Thanks again for the help.

– Kakalukia
Nov 25 '18 at 9:51
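
Given raw lines of that shape, where each physical line is itself one quoted CSV field, another option is to parse in two passes instead of stripping quotes afterwards. This is a sketch under that assumption only: the sample text reuses the question's column names rather than the real file, and io.StringIO stands in for the opened file.

```python
import csv
import io

import pandas as pd

raw_text = (
    '"id,""feature_1"",""feature_2"",""feature_3"""\n'
    '"00100429,""PROTO"",""Proprietary"",""Phone"""\n'
    '"00100429,""PROTO"",""Proprietary"",""Phone"""\n'
)

# Pass 1: csv.reader sees one quoted field per line and un-escapes the doubled quotes
unwrapped = [row[0] for row in csv.reader(io.StringIO(raw_text)) if row]

# Pass 2: the unwrapped lines are ordinary comma-separated, quoted CSV
df = pd.read_csv(io.StringIO("\n".join(unwrapped)), dtype={"id": str})
print(df)
```

Compared with replacing every quote character, this keeps any commas or quotes that legitimately belong inside a value.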

@kakalukia good to hear that it helped. Also, if it's a small dataset that Excel can handle, you can simply split the one column into distinct columns there and import it into Python afterwards. That way things will be much simpler. Good going, and you can also upvote this answer :)

– dataLeo
Nov 25 '18 at 10:02


Here's just an alternative way to dataLeo's answer -

import pandas as pd

import numpy as np

Reading the file into a dataframe, then removing all the double quotes from the row values

df = pd.read_csv("file.csv", sep=",").apply(lambda x: x.str.replace('"', ''))

df



    "id"   "feature_1"  "feature_2" "feature_3"

0   00100429    PROTO   Proprietary Phone

1   00100429    PROTO   Proprietary Phone

Removing all the double quotes from the column names

df.columns = df.columns.str.replace('"', '')

df



      id    feature_1   feature_2   feature_3

0   00100429    PROTO   Proprietary Phone

1   00100429    PROTO   Proprietary Phone

Converting `id` column datatype back to `int` (change according to your needs)

df.id = df.id.astype('int')

np.result_type(df.id)



dtype('int32')

edited Nov 24 '18 at 8:36

dataLeo

5861419

answered Nov 24 '18 at 8:25

Shadab Hussain

117


It should work with sep without any issue unless there is something really wrong with the CSV file you have. Simulating your data example, it works fine for me:

As per your data sample, you don't need an escape character for comma-delimited values.

>>> import pandas as pd

>>> data = pd.read_csv("sample.csv", sep=",")

>>> data

       id feature_1    feature_2 feature_3

0  100429     PROTO  Proprietary     Phone

1  100429     PROTO  Proprietary     Phone

>>> pd.__version__

'0.23.3'

There is a problem here, as I noticed: sep=","

Alternatively, try:

Here skipinitialspace=True "deals with the spaces after the comma-delimiter".

quotechar='"' : string (length 1). The character used to denote the start and end of a quoted item. Quoted items can include the delimiter and it will be ignored.

So in that case it is worth trying:

>>> data1 = pd.read_csv("sample.csv", skipinitialspace = True, quotechar = '"')

>>> data1

       id feature_1    feature_2 feature_3

0  100429     PROTO  Proprietary     Phone

1  100429     PROTO  Proprietary     Phone
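
To make the effect of these two options concrete, here is a small sketch on made-up data with a space after each comma. That spacing is an assumption for illustration; the question's sample does not show it.

```python
import io

import pandas as pd

# Hypothetical variant of the sample: a space follows every comma
spaced_text = (
    'id, "feature_1", "feature_2"\n'
    '00100429, "PROTO", "Proprietary"\n'
)

# skipinitialspace=True drops the space after each delimiter, so the quote
# character is the first thing in the field and is treated as quoting
spaced_df = pd.read_csv(
    io.StringIO(spaced_text),
    skipinitialspace=True,
    quotechar='"',
    dtype={"id": str},
)
print(spaced_df)
```

Note that quotechar='"' is already the default, so skipinitialspace is the option doing the work in this sketch.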

Note from the pandas docs:

Separators longer than 1 character and different from '\s+' will be
interpreted as regular expressions, will force use of the python
parsing engine and will ignore quotes in the data.
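
That note is easy to check on a tiny in-memory sample. A sketch, with a made-up regex separator r",\s*" chosen only because it is longer than one character:

```python
import io

import pandas as pd

csv_text = 'id,"feature_1"\n00100429,"PROTO"\n'

# Single-character separator: quotes are treated as quoting and removed
plain = pd.read_csv(io.StringIO(csv_text), sep=",")

# Multi-character separator: handled as a regex by the python engine,
# which splits each line without interpreting the quote characters
regex = pd.read_csv(io.StringIO(csv_text), sep=r",\s*", engine="python")

print(list(plain.columns))
print(list(regex.columns))
```

If the quotes survive into the column names and values after switching to a regex separator, this behaviour is the likely reason.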

edited Nov 24 '18 at 8:52

answered Nov 24 '18 at 8:01

pygo

2,4281619



