Advanced histogram usage in Python with numpy
I need to do this quickly across large amounts of data, so ideally I want a fast, numpy-based approach. I would normally just write a loop, but plain Python is too slow for that. Here is the problem:
I would like to sum the values of one array according to the bins of another array. For example, I have two arrays:
weights = [100, 130, 112, 150]
ages = [1, 14, 15, 25]
I want to sum the weights according to the ages, binned as 0-9, 10-19, 20-29, so I'd get [100, 130+112, 150] -> [100, 242, 150] as my end result.
My current understanding of numpy's histograms is that I can only sum the array that I am binning, meaning that I could only get the sum of the ages if I bin ages.
I would also like to understand how to do this well, since in the future I will likely need operations other than sums (such as averaging rather than a pure sum). Thank you for your help.
python numpy
edited Nov 28 '18 at 9:42 by Saugat Bhattarai
asked Nov 28 '18 at 9:35 by Plezos
1 Answer
This can be done pretty simply with a list comprehension and some numpy logical functions, and it won't be limited only to summation.
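import numpy as np
ages = np.array([1, 14, 15, 25])  # an array, so the comparisons below are elementwise
weights = np.array([100, 130, 112, 150])  # easier indexing with a np.array
bin_left_marks = np.arange(0, 40, 10)  # bin edges: 0, 10, 20, 30
my_func = np.sum  # any aggregation can be swapped in here, e.g. np.mean
# for each bin [left, next_left), mask the ages in that bin and aggregate the matching weights
my_binned_aggregation = [my_func(weights[(bin_left_marks[i] <= ages) & (ages < bin_left_marks[i + 1])]) for i in range(len(bin_left_marks) - 1)]
# the three binned sums are 100, 242 and 150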
Basically, for each bin, find the indexes of the ages list that match that bin, and aggregate the weights list accordingly.
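The per-bin steps described above can also be written out as a small helper. This is only a sketch: the name `binned_aggregate` and its parameters are made up for illustration, and the bins are assumed to be half-open intervals `[lo, hi)` as in the one-liner.

```python
import numpy as np

def binned_aggregate(values, keys, bin_edges, func=np.sum):
    """Apply func to the values whose keys fall in each half-open bin [lo, hi).

    Hypothetical helper for illustration, not part of numpy.
    """
    values = np.asarray(values)
    keys = np.asarray(keys)
    results = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (lo <= keys) & (keys < hi)   # boolean mask selecting this bin
        results.append(func(values[in_bin]))  # aggregate only the matching values
    return results

weights = [100, 130, 112, 150]
ages = [1, 14, 15, 25]
edges = np.arange(0, 40, 10)  # bin edges 0, 10, 20, 30

sums = binned_aggregate(weights, ages, edges)                 # sums per bin: 100, 242, 150
means = binned_aggregate(weights, ages, edges, func=np.mean)  # means per bin: 100.0, 121.0, 150.0
```

Note that with `np.mean` an empty bin yields `nan` (plus a RuntimeWarning), so you may want to handle empty bins explicitly. The loop runs once per bin, not once per data point, so it should stay reasonably fast as long as the number of bins is small.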
Good luck!
This can obviously be made "less ugly" by splitting up the one-liner, using a straightforward loop, etc. This solution goes for conciseness.
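For the specific case of sums and means, another option (not what the code above does) is the `weights` parameter of `np.histogram`, which sums a second array according to the bins of the first without any Python-level loop. A minimal sketch, assuming the same example data:

```python
import numpy as np

weights = np.array([100, 130, 112, 150])
ages = np.array([1, 14, 15, 25])
edges = np.arange(0, 40, 10)  # bin edges 0, 10, 20, 30

# Sum of weights per age bin: each age contributes its weight instead of 1.
sums, _ = np.histogram(ages, bins=edges, weights=weights)

# Number of ages per bin, used to turn the sums into means.
counts, _ = np.histogram(ages, bins=edges)

# Empty bins would divide by zero; silence the warning and let them become nan.
with np.errstate(invalid="ignore", divide="ignore"):
    means = sums / counts
```

One difference to keep in mind: `np.histogram` treats the last bin as closed on the right, so an age of exactly 30 would be counted in the 20-30 bin here, whereas the mask-based code above would exclude it. This approach only covers aggregations that can be built from weighted counts; for arbitrary functions the per-bin masking is more general.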
Thanks for the help. I was worried I wouldn't get an answer due to the negative score. I think it's maybe because my question is vague, but that's really the problem: I don't know what direction to approach this from. You've pointed me in the right one, though. I'll study this code now.
– Plezos
Nov 28 '18 at 10:09
I think it's actually because you didn't include any of your own attempts or show enough effort. For next time...
– ShlomiF
Nov 28 '18 at 10:10
answered Nov 28 '18 at 9:48 by ShlomiF