Making a bar chart to represent the number of occurrences in a Pandas Series
I was wondering if anyone could help me with how to make a bar chart to show the frequencies of values in a Pandas Series.
I start with a Pandas DataFrame of shape (2000, 7), and from there I extract the last column. The column is shape (2000,).
The entries in the Series that I mentioned vary from 0 to 17, each with different frequencies, and I tried to plot them using a bar chart but faced some difficulties. Here is my code:
# First, I counted the number of occurrences.
count = np.zeros(max(data_val))
for i in range(count.shape[0]):
for j in range(data_val.shape[0]):
if (i == data_val[j]):
count[i] = count[i] + 1
'''
This gives us
count = array([192., 105., ... 19.])
'''
temp = np.arange(0, 18, 1) # Array for the x-axis.
plt.bar(temp, count)
I am getting an error on the last line of code, saying that the objects cannot be broadcast to a single shape.
What I ultimately want is a bar chart where each bar corresponds to an integer value from 0 to 17, and the height of each bar (i.e. the y-axis) represents the frequencies.
Thank you.
UPDATE
I decided to post the fixed code using the suggestions that people were kind enough to give below, just in case anybody facing similar issues will be able to see my revised code in the future.
data = pd.read_csv("./data/train.csv") # Original data is a (2000, 7) DataFrame
# data contains 6 feature columns and 1 target column.
# Separate the design matrix from the target labels.
X = data.iloc[:, :-1]
y = data['target']
'''
The next line of code uses pandas.Series.value_counts() on y in order to count
the number of occurrences for each label, and then proceeds to sort these according to
index (i.e. label).
You can also use pandas.DataFrame.sort_values() instead if you're interested in sorting
according to the number of frequencies rather than labels.
'''
y.value_counts().sort_index().plot.bar(x='Target Value', y='Number of Occurrences')

There was no need to use for loops if we use the methods that are built into the Pandas library.
The specific methods that were mentioned in the answers are pandas.Series.values_count(), pandas.DataFrame.sort_index(), and pandas.DataFrame.plot.bar().
python pandas bar-chart
add a comment |
I was wondering if anyone could help me with how to make a bar chart to show the frequencies of values in a Pandas Series.
I start with a Pandas DataFrame of shape (2000, 7), and from there I extract the last column. The column is shape (2000,).
The entries in the Series that I mentioned vary from 0 to 17, each with different frequencies, and I tried to plot them using a bar chart but faced some difficulties. Here is my code:
# First, I counted the number of occurrences.
count = np.zeros(max(data_val))
for i in range(count.shape[0]):
for j in range(data_val.shape[0]):
if (i == data_val[j]):
count[i] = count[i] + 1
'''
This gives us
count = array([192., 105., ... 19.])
'''
temp = np.arange(0, 18, 1) # Array for the x-axis.
plt.bar(temp, count)
I am getting an error on the last line of code, saying that the objects cannot be broadcast to a single shape.
What I ultimately want is a bar chart where each bar corresponds to an integer value from 0 to 17, and the height of each bar (i.e. the y-axis) represents the frequencies.
Thank you.
UPDATE
I decided to post the fixed code using the suggestions that people were kind enough to give below, just in case anybody facing similar issues will be able to see my revised code in the future.
data = pd.read_csv("./data/train.csv") # Original data is a (2000, 7) DataFrame
# data contains 6 feature columns and 1 target column.
# Separate the design matrix from the target labels.
X = data.iloc[:, :-1]
y = data['target']
'''
The next line of code uses pandas.Series.value_counts() on y in order to count
the number of occurrences for each label, and then proceeds to sort these according to
index (i.e. label).
You can also use pandas.DataFrame.sort_values() instead if you're interested in sorting
according to the number of frequencies rather than labels.
'''
y.value_counts().sort_index().plot.bar(x='Target Value', y='Number of Occurrences')

There was no need to use for loops if we use the methods that are built into the Pandas library.
The specific methods that were mentioned in the answers are pandas.Series.values_count(), pandas.DataFrame.sort_index(), and pandas.DataFrame.plot.bar().
python pandas bar-chart
add a comment |
I was wondering if anyone could help me with how to make a bar chart to show the frequencies of values in a Pandas Series.
I start with a Pandas DataFrame of shape (2000, 7), and from there I extract the last column. The column is shape (2000,).
The entries in the Series that I mentioned vary from 0 to 17, each with different frequencies, and I tried to plot them using a bar chart but faced some difficulties. Here is my code:
# First, I counted the number of occurrences.
count = np.zeros(max(data_val))
for i in range(count.shape[0]):
for j in range(data_val.shape[0]):
if (i == data_val[j]):
count[i] = count[i] + 1
'''
This gives us
count = array([192., 105., ... 19.])
'''
temp = np.arange(0, 18, 1) # Array for the x-axis.
plt.bar(temp, count)
I am getting an error on the last line of code, saying that the objects cannot be broadcast to a single shape.
What I ultimately want is a bar chart where each bar corresponds to an integer value from 0 to 17, and the height of each bar (i.e. the y-axis) represents the frequencies.
Thank you.
UPDATE
I decided to post the fixed code using the suggestions that people were kind enough to give below, just in case anybody facing similar issues will be able to see my revised code in the future.
data = pd.read_csv("./data/train.csv") # Original data is a (2000, 7) DataFrame
# data contains 6 feature columns and 1 target column.
# Separate the design matrix from the target labels.
X = data.iloc[:, :-1]
y = data['target']
'''
The next line of code uses pandas.Series.value_counts() on y in order to count
the number of occurrences for each label, and then proceeds to sort these according to
index (i.e. label).
You can also use pandas.DataFrame.sort_values() instead if you're interested in sorting
according to the number of frequencies rather than labels.
'''
y.value_counts().sort_index().plot.bar(x='Target Value', y='Number of Occurrences')

There was no need to use for loops if we use the methods that are built into the Pandas library.
The specific methods that were mentioned in the answers are pandas.Series.values_count(), pandas.DataFrame.sort_index(), and pandas.DataFrame.plot.bar().
python pandas bar-chart
I was wondering if anyone could help me with how to make a bar chart to show the frequencies of values in a Pandas Series.
I start with a Pandas DataFrame of shape (2000, 7), and from there I extract the last column. The column is shape (2000,).
The entries in the Series that I mentioned vary from 0 to 17, each with different frequencies, and I tried to plot them using a bar chart but faced some difficulties. Here is my code:
# First, I counted the number of occurrences.
count = np.zeros(max(data_val))
for i in range(count.shape[0]):
for j in range(data_val.shape[0]):
if (i == data_val[j]):
count[i] = count[i] + 1
'''
This gives us
count = array([192., 105., ... 19.])
'''
temp = np.arange(0, 18, 1) # Array for the x-axis.
plt.bar(temp, count)
I am getting an error on the last line of code, saying that the objects cannot be broadcast to a single shape.
What I ultimately want is a bar chart where each bar corresponds to an integer value from 0 to 17, and the height of each bar (i.e. the y-axis) represents the frequencies.
Thank you.
UPDATE
I decided to post the fixed code using the suggestions that people were kind enough to give below, just in case anybody facing similar issues will be able to see my revised code in the future.
data = pd.read_csv("./data/train.csv") # Original data is a (2000, 7) DataFrame
# data contains 6 feature columns and 1 target column.
# Separate the design matrix from the target labels.
X = data.iloc[:, :-1]
y = data['target']
'''
The next line of code uses pandas.Series.value_counts() on y in order to count
the number of occurrences for each label, and then proceeds to sort these according to
index (i.e. label).
You can also use pandas.DataFrame.sort_values() instead if you're interested in sorting
according to the number of frequencies rather than labels.
'''
y.value_counts().sort_index().plot.bar(x='Target Value', y='Number of Occurrences')

There was no need to use for loops if we use the methods that are built into the Pandas library.
The specific methods that were mentioned in the answers are pandas.Series.values_count(), pandas.DataFrame.sort_index(), and pandas.DataFrame.plot.bar().
python pandas bar-chart
python pandas bar-chart
edited Nov 25 '18 at 9:17
Seankala
asked Nov 25 '18 at 6:41
SeankalaSeankala
3511213
3511213
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
I believe you need value_counts with Series.plot.bar:
df = pd.DataFrame({
'a':[4,5,4,5,5,4],
'b':[7,8,9,4,2,3],
'c':[1,3,5,7,1,0],
'd':[1,1,6,1,6,5],
})
print (df)
a b c d
0 4 7 1 1
1 5 8 3 1
2 4 9 5 6
3 5 4 7 1
4 5 2 1 6
5 4 3 0 5
df['d'].value_counts(sort=False).plot.bar()

If possible some value missing and need set it to 0 add reindex:
df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0).plot.bar()

Detail:
print (df['d'].value_counts(sort=False))
1 3
5 1
6 2
Name: d, dtype: int64
print (df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0))
0 0
1 3
2 0
3 0
4 0
5 1
6 2
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
Name: d, dtype: int64
1
Your answer helped me out a lot! I wasn't aware that Pandas had these methods and was using loops unnecessarily. I also happened to usepandas.DataFrame.sort_index()to get the result I wanted as well.
– Seankala
Nov 25 '18 at 9:00
add a comment |
Here's an approach using Seaborn
import numpy as np
import pandas as pd
import seaborn as sns
s = pd.Series(np.random.choice(17, 10))
s
# 0 10
# 1 13
# 2 12
# 3 0
# 4 0
# 5 5
# 6 13
# 7 9
# 8 11
# 9 0
# dtype: int64
val, cnt = np.unique(s, return_counts=True)
val, cnt
# (array([ 0, 5, 9, 10, 11, 12, 13]), array([3, 1, 1, 1, 1, 1, 2]))
sns.barplot(val, cnt)

Thanks for the answer! I've actually never heard ofSeabornbefore, but will take a look at it in the future. Thanks again.
– Seankala
Nov 25 '18 at 9:18
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53465262%2fmaking-a-bar-chart-to-represent-the-number-of-occurrences-in-a-pandas-series%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
I believe you need value_counts with Series.plot.bar:
df = pd.DataFrame({
'a':[4,5,4,5,5,4],
'b':[7,8,9,4,2,3],
'c':[1,3,5,7,1,0],
'd':[1,1,6,1,6,5],
})
print (df)
a b c d
0 4 7 1 1
1 5 8 3 1
2 4 9 5 6
3 5 4 7 1
4 5 2 1 6
5 4 3 0 5
df['d'].value_counts(sort=False).plot.bar()

If possible some value missing and need set it to 0 add reindex:
df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0).plot.bar()

Detail:
print (df['d'].value_counts(sort=False))
1 3
5 1
6 2
Name: d, dtype: int64
print (df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0))
0 0
1 3
2 0
3 0
4 0
5 1
6 2
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
Name: d, dtype: int64
1
Your answer helped me out a lot! I wasn't aware that Pandas had these methods and was using loops unnecessarily. I also happened to usepandas.DataFrame.sort_index()to get the result I wanted as well.
– Seankala
Nov 25 '18 at 9:00
add a comment |
I believe you need value_counts with Series.plot.bar:
df = pd.DataFrame({
'a':[4,5,4,5,5,4],
'b':[7,8,9,4,2,3],
'c':[1,3,5,7,1,0],
'd':[1,1,6,1,6,5],
})
print (df)
a b c d
0 4 7 1 1
1 5 8 3 1
2 4 9 5 6
3 5 4 7 1
4 5 2 1 6
5 4 3 0 5
df['d'].value_counts(sort=False).plot.bar()

If possible some value missing and need set it to 0 add reindex:
df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0).plot.bar()

Detail:
print (df['d'].value_counts(sort=False))
1 3
5 1
6 2
Name: d, dtype: int64
print (df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0))
0 0
1 3
2 0
3 0
4 0
5 1
6 2
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
Name: d, dtype: int64
1
Your answer helped me out a lot! I wasn't aware that Pandas had these methods and was using loops unnecessarily. I also happened to usepandas.DataFrame.sort_index()to get the result I wanted as well.
– Seankala
Nov 25 '18 at 9:00
add a comment |
I believe you need value_counts with Series.plot.bar:
df = pd.DataFrame({
'a':[4,5,4,5,5,4],
'b':[7,8,9,4,2,3],
'c':[1,3,5,7,1,0],
'd':[1,1,6,1,6,5],
})
print (df)
a b c d
0 4 7 1 1
1 5 8 3 1
2 4 9 5 6
3 5 4 7 1
4 5 2 1 6
5 4 3 0 5
df['d'].value_counts(sort=False).plot.bar()

If possible some value missing and need set it to 0 add reindex:
df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0).plot.bar()

Detail:
print (df['d'].value_counts(sort=False))
1 3
5 1
6 2
Name: d, dtype: int64
print (df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0))
0 0
1 3
2 0
3 0
4 0
5 1
6 2
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
Name: d, dtype: int64
I believe you need value_counts with Series.plot.bar:
df = pd.DataFrame({
'a':[4,5,4,5,5,4],
'b':[7,8,9,4,2,3],
'c':[1,3,5,7,1,0],
'd':[1,1,6,1,6,5],
})
print (df)
a b c d
0 4 7 1 1
1 5 8 3 1
2 4 9 5 6
3 5 4 7 1
4 5 2 1 6
5 4 3 0 5
df['d'].value_counts(sort=False).plot.bar()

If possible some value missing and need set it to 0 add reindex:
df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0).plot.bar()

Detail:
print (df['d'].value_counts(sort=False))
1 3
5 1
6 2
Name: d, dtype: int64
print (df['d'].value_counts(sort=False).reindex(np.arange(18), fill_value=0))
0 0
1 3
2 0
3 0
4 0
5 1
6 2
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
Name: d, dtype: int64
edited Nov 25 '18 at 6:55
answered Nov 25 '18 at 6:44
jezraeljezrael
329k23270349
329k23270349
1
Your answer helped me out a lot! I wasn't aware that Pandas had these methods and was using loops unnecessarily. I also happened to usepandas.DataFrame.sort_index()to get the result I wanted as well.
– Seankala
Nov 25 '18 at 9:00
add a comment |
1
Your answer helped me out a lot! I wasn't aware that Pandas had these methods and was using loops unnecessarily. I also happened to usepandas.DataFrame.sort_index()to get the result I wanted as well.
– Seankala
Nov 25 '18 at 9:00
1
1
Your answer helped me out a lot! I wasn't aware that Pandas had these methods and was using loops unnecessarily. I also happened to use
pandas.DataFrame.sort_index() to get the result I wanted as well.– Seankala
Nov 25 '18 at 9:00
Your answer helped me out a lot! I wasn't aware that Pandas had these methods and was using loops unnecessarily. I also happened to use
pandas.DataFrame.sort_index() to get the result I wanted as well.– Seankala
Nov 25 '18 at 9:00
add a comment |
Here's an approach using Seaborn
import numpy as np
import pandas as pd
import seaborn as sns
s = pd.Series(np.random.choice(17, 10))
s
# 0 10
# 1 13
# 2 12
# 3 0
# 4 0
# 5 5
# 6 13
# 7 9
# 8 11
# 9 0
# dtype: int64
val, cnt = np.unique(s, return_counts=True)
val, cnt
# (array([ 0, 5, 9, 10, 11, 12, 13]), array([3, 1, 1, 1, 1, 1, 2]))
sns.barplot(val, cnt)

Thanks for the answer! I've actually never heard ofSeabornbefore, but will take a look at it in the future. Thanks again.
– Seankala
Nov 25 '18 at 9:18
add a comment |
Here's an approach using Seaborn
import numpy as np
import pandas as pd
import seaborn as sns
s = pd.Series(np.random.choice(17, 10))
s
# 0 10
# 1 13
# 2 12
# 3 0
# 4 0
# 5 5
# 6 13
# 7 9
# 8 11
# 9 0
# dtype: int64
val, cnt = np.unique(s, return_counts=True)
val, cnt
# (array([ 0, 5, 9, 10, 11, 12, 13]), array([3, 1, 1, 1, 1, 1, 2]))
sns.barplot(val, cnt)

Thanks for the answer! I've actually never heard ofSeabornbefore, but will take a look at it in the future. Thanks again.
– Seankala
Nov 25 '18 at 9:18
add a comment |
Here's an approach using Seaborn
import numpy as np
import pandas as pd
import seaborn as sns
s = pd.Series(np.random.choice(17, 10))
s
# 0 10
# 1 13
# 2 12
# 3 0
# 4 0
# 5 5
# 6 13
# 7 9
# 8 11
# 9 0
# dtype: int64
val, cnt = np.unique(s, return_counts=True)
val, cnt
# (array([ 0, 5, 9, 10, 11, 12, 13]), array([3, 1, 1, 1, 1, 1, 2]))
sns.barplot(val, cnt)

Here's an approach using Seaborn
import numpy as np
import pandas as pd
import seaborn as sns
s = pd.Series(np.random.choice(17, 10))
s
# 0 10
# 1 13
# 2 12
# 3 0
# 4 0
# 5 5
# 6 13
# 7 9
# 8 11
# 9 0
# dtype: int64
val, cnt = np.unique(s, return_counts=True)
val, cnt
# (array([ 0, 5, 9, 10, 11, 12, 13]), array([3, 1, 1, 1, 1, 1, 2]))
sns.barplot(val, cnt)

answered Nov 25 '18 at 6:51
dataLeodataLeo
6031419
6031419
Thanks for the answer! I've actually never heard ofSeabornbefore, but will take a look at it in the future. Thanks again.
– Seankala
Nov 25 '18 at 9:18
add a comment |
Thanks for the answer! I've actually never heard ofSeabornbefore, but will take a look at it in the future. Thanks again.
– Seankala
Nov 25 '18 at 9:18
Thanks for the answer! I've actually never heard of
Seaborn before, but will take a look at it in the future. Thanks again.– Seankala
Nov 25 '18 at 9:18
Thanks for the answer! I've actually never heard of
Seaborn before, but will take a look at it in the future. Thanks again.– Seankala
Nov 25 '18 at 9:18
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53465262%2fmaking-a-bar-chart-to-represent-the-number-of-occurrences-in-a-pandas-series%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown