How can I achieve reduction using OpenMP Tasks?
I have this OpenMP code that performs a simple reduction:
    for (k = 0; k < m; k++)
    {
        #pragma omp parallel for private(i) reduction(+:mysum) schedule(static)
        for (i = 0; i < m; i++)
        {
            mysum += a[i][k] * a[i][k];
        }
    }
I want to write equivalent code using OpenMP tasks. Here is what I tried, following this article:
    for (k = 0; k < m; k++)
    {
        #pragma omp parallel reduction(+:mysum)
        {
            #pragma omp single
            {
                for (i = 0; i < m; i++)
                {
                    #pragma omp task private(i) shared(k)
                    {
                        partialSum += a[i][k] * a[i][k];
                    }
                }
            }
            #pragma omp taskwait
            mysum += partialSum;
        }
    }
The variable partialSum is a global variable declared as threadprivate:
    int partialSum = 0;
    #pragma omp threadprivate(partialSum)
a is a simple array of ints (m x m).
The problem is that when I run the code above (the one with tasks) multiple times, I get different results.
Do you have any idea what I should change to make this work?
Thank you respectfully
c parallel-processing openmp reduction
In your second code, partialSum is shared among all your threads. The reduction handles making private copies of mysum and combining them at the end, but the same treatment is not extended to partialSum, which therefore is the subject of a data race. The slide deck you linked uses a threadprivate() directive to address that problem. I'm not certain that would be sufficient for you, but it would at least resolve the data race.
– John Bollinger
Nov 24 '18 at 16:08
I don't think that partialSum is shared among all threads, because I also declare it as threadprivate, exactly as in that article.
– Cosmin Ioniță
Nov 25 '18 at 17:17
I guess I overlooked that at the end of the question. Please, present a Minimal, Complete, and Verifiable example exhibiting the problem. Not only will that reduce the likelihood of such misunderstandings, but the additional context may prove important.
– John Bollinger
Nov 25 '18 at 17:43
What misunderstandings? I stated that partialSum is threadprivate from the beginning. I think that you should have read the entire question.
– Cosmin Ioniță
Nov 26 '18 at 8:23
asked Nov 24 '18 at 15:49
Cosmin Ioniță
1 Answer
private variables are uninitialized inside the construct (they are not initialized with their outside value), so i should be firstprivate.
If you drop private(i) shared(k) and declare i locally (for (int i ...)), the default data-sharing attributes are correct. k comes from outside the parallel region and is therefore implicitly shared in the parallel region, which also makes it implicitly shared in the task-generating construct. Without the clauses, i, being declared outside, is also shared/shared. Declared locally instead, it becomes implicitly private to the parallel region and thus implicitly firstprivate in the task-generating construct.
You should also add:
    #pragma omp atomic
    mysum += partialSum;
On the other hand, you don't necessarily need the taskwait (see this answer)
Note that the talk uses firstprivate correctly.
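The points above can be put together in a minimal sketch. It is not the asker's full program: the fixed size M, the function name sum_of_squares_tasks, and the reset of partialSum at the top of each parallel region are assumptions added for illustration (the reset keeps a thread's accumulator from carrying over from one k to the next). The loop variable i is declared locally, so it is implicitly firstprivate in each task, and the taskwait is dropped because the implicit barrier at the end of single already guarantees that all tasks have finished.

```c
#include <stdio.h>

#define M 3  /* hypothetical fixed matrix size, for illustration only */

/* Per-thread accumulator, declared threadprivate as in the question. */
int partialSum = 0;
#pragma omp threadprivate(partialSum)

/* Hypothetical helper: sums a[i][k]^2 over the whole M x M matrix,
   creating one task per element. */
int sum_of_squares_tasks(int a[M][M])
{
    int mysum = 0;

    for (int k = 0; k < M; k++)
    {
        #pragma omp parallel reduction(+:mysum)
        {
            /* Assumption: reset this thread's accumulator so values
               from a previous k are not added a second time. */
            partialSum = 0;

            #pragma omp single
            {
                /* i is local to the parallel region, hence private there
                   and implicitly firstprivate in each task; k is shared. */
                for (int i = 0; i < M; i++)
                {
                    #pragma omp task
                    {
                        partialSum += a[i][k] * a[i][k];
                    }
                }
            }
            /* The implicit barrier at the end of single ensures every
               task has completed before this point. */

            /* atomic as suggested above; because mysum is also a
               reduction variable, each thread updates its own private
               copy here, so this is a conservative choice. */
            #pragma omp atomic
            mysum += partialSum;
        }
    }
    return mysum;
}
```

Called with a 3 x 3 matrix holding the values 1 to 9, the function should return the same sum on every run, whether or not the compiler has OpenMP enabled.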
You say about k that it is shared in the task-generating construct, but then you say that it's firstprivate. It's a bit unclear to me what's happening with k.
– Cosmin Ioniță
Nov 26 '18 at 21:05
Sorry, there was a typo and I misread your code regarding i. The first sentence (implicitly shared/shared) applies to k. You should define i locally; then it becomes private/firstprivate implicitly, which is what you want. You may also specify the data-sharing attributes manually instead (just with firstprivate(i)).
– Zulan
Nov 27 '18 at 8:02
Understood. Thanks a lot for your answer!
– Cosmin Ioniță
Nov 27 '18 at 14:08
@CosminIoniță please note that the recently released OpenMP 5.0 natively supports task_reduction, but it is a bit different and I have no experience with it yet. AFAIK it is only supported in the Intel 18.0 compiler so far.
– Zulan
Nov 28 '18 at 11:54
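For readers curious what the task_reduction route mentioned in this comment might look like, here is a rough sketch using the taskgroup task_reduction clause together with in_reduction on each task. The fixed size M and the function name sum_of_squares_task_reduction are hypothetical, and the code assumes a compiler that implements these OpenMP 5.0 clauses; it has not been checked against the Intel compiler named above.

```c
#include <stdio.h>

#define M 3  /* hypothetical fixed matrix size, for illustration only */

/* Hypothetical helper: sums a[i][k]^2 over the whole M x M matrix,
   letting the runtime combine per-task contributions to mysum. */
int sum_of_squares_task_reduction(int a[M][M])
{
    int mysum = 0;

    #pragma omp parallel
    #pragma omp single
    #pragma omp taskgroup task_reduction(+:mysum)
    {
        for (int k = 0; k < M; k++)
        {
            for (int i = 0; i < M; i++)
            {
                /* Each task takes part in the reduction declared on the
                   enclosing taskgroup; i and k are captured by value. */
                #pragma omp task in_reduction(+:mysum) firstprivate(i, k)
                mysum += a[i][k] * a[i][k];
            }
        }
    }
    /* The reduction result is available once the taskgroup has ended. */
    return mysum;
}
```

This variant needs neither the threadprivate accumulator nor an explicit atomic update, since combining the partial results is left to the runtime.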
Great to know! Thanks, @Zulan!
– Cosmin Ioniță
Nov 28 '18 at 17:18
edited Nov 27 '18 at 8:00
answered Nov 26 '18 at 12:27
Zulan