Vectorizing loop with running value that depends on previous value (+ if statement)
I'm accustomed to writing vectorized statements and list comprehensions in Python, but I've got a problem that involves both a "running" computation that depends on the previous value in the loop and an if statement. Schematically it looks like this:
import numpy as np

def my_loop(x, a=0.5, b=0.9):
    out = np.copy(x)
    prev_val = 0
    for i in np.arange(x.shape[0]):
        if x[i] < prev_val:
            new_val = (1-a)*x[i] + a*prev_val
        else:
            new_val = (1-b)*x[i] + b*prev_val
        out[i] = new_val
        prev_val = new_val
    return out
I haven't been able to figure out how one could vectorize this (e.g. via some kind of accumulator), so I'll ask: Is there a way to make this more Pythonic/faster?

I've seen previous posts about vectorizing when there's an if statement (usually solved via np.where()), but not one where there's a "running" value that depends on its previous state, so I haven't found any duplicate questions yet. (The existing "previous value" questions are about list indices, not vectorization in this sense.)

So far I have tried np.vectorize and numba's @jit, and they do run somewhat faster, but neither gives me the speed I'm hoping for. Is there something I'm missing? (Maybe something with map()?) Thanks.
(Yes, in the case of a=b this becomes easy!)
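For reference, the a == b special case mentioned above reduces to a first-order linear recurrence, y[i] = (1-a)*x[i] + a*y[i-1], which has the closed form y[i] = (1-a) * sum over j of a**(i-j) * x[j] and can therefore be computed with a cumulative sum instead of an explicit Python loop. A minimal sketch (function names are illustrative, not from the question; the a**i scaling is numerically unstable for long arrays, so treat this as a demonstration only):

```python
import numpy as np

def loop_version(x, a=0.5):
    # reference: explicit recurrence y[i] = (1-a)*x[i] + a*y[i-1], with y[-1] = 0
    out = np.empty_like(x)
    prev = 0.0
    for i in range(len(x)):
        prev = (1 - a) * x[i] + a * prev
        out[i] = prev
    return out

def vectorized_version(x, a=0.5):
    # closed form: y[i] = (1-a) * sum_j a**(i-j) * x[j], via cumsum
    # fine for short arrays; the a**i terms overflow/underflow for large i
    i = np.arange(len(x))
    return (1 - a) * a**i * np.cumsum(x / a**i)

x = np.random.uniform(-5.0, 5.0, size=30)
assert np.allclose(loop_version(x), vectorized_version(x))
```

The general a != b case has no such closed form because the coefficient applied at each step depends on the data, which is why the question remains hard to vectorize.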
Tags: python, numba
I doubt you will get much performance gain with np.vectorize. Quoting from the docs: "The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop."
– Deepak Saini, Nov 27 '18 at 4:21
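The comment can be checked directly: np.vectorize gives a scalar function NumPy-style broadcasting, but the results are just what a Python-level loop would produce, element by element. A small sketch (names are illustrative; the branch mirrors the one in the question, with the coefficients as module constants):

```python
import numpy as np

A, B = 0.5, 0.9

def scalar_step(xi, prev):
    # same branch as in the question, for a single element
    if xi < prev:
        return (1 - A) * xi + A * prev
    return (1 - B) * xi + B * prev

vec_step = np.vectorize(scalar_step)

x = np.array([1.0, -2.0, 3.0])
prev = np.array([0.0, 0.5, 4.0])
# element-wise results match calling the scalar function in a plain loop
assert np.allclose(vec_step(x, prev),
                   [scalar_step(xi, p) for xi, p in zip(x, prev)])
```

Note also that np.vectorize cannot express the running dependence at all: each element's prev comes from the previous element's output, which broadcasting has no way to chain.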
asked Nov 27 '18 at 3:13 by sh37211, edited Nov 27 '18 at 4:17 by eyllanesc
2 Answers
JITing in nopython mode is faster. Quoting from the numba docs:

"Numba has two compilation modes: nopython mode and object mode. The former produces much faster code, but has limitations that can force Numba to fall back to the latter. To prevent Numba from falling back, and instead raise an error, pass nopython=True."
import numpy as np
import numba as nb

@nb.njit(cache=True)
def my_loop5(x, a=0.5, b=0.9):
    out = np.zeros(x.shape[0], dtype=x.dtype)
    for i in range(x.shape[0]):
        # at i == 0, out[i-1] is out[-1], which is still 0.0 since out starts as zeros
        if x[i] < out[i-1]:
            out[i] = (1-a) * x[i] + a * out[i-1]
        else:
            out[i] = (1-b) * x[i] + b * out[i-1]
    return out
So that on:

x = np.random.uniform(low=-5.0, high=5.0, size=(1000000,))

the timings are:

my_loop4 : 0.235 s
my_loop5 : 0.193 s
HTH.

answered Nov 27 '18 at 4:32, edited Nov 27 '18 at 17:06, by Deepak Saini
Interesting, hey thanks for trying this. When I compare autojit with njit, the latter is about the same but a bit slower: 41.1 µs per loop for autojit vs. 41.2 µs per loop for njit.
– sh37211, Nov 27 '18 at 4:38
@sh37211 Try on larger inputs, as there is compile time also included; or time on a second run, after the compiled code is cached.
– Deepak Saini, Nov 27 '18 at 4:41
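To keep compile time out of the measurement, as the comment suggests, the usual pattern is to call the jitted function once on representative input before timing. A sketch of that warm-up pattern (a plain Python function stands in here for the numba-compiled one, so the snippet runs without numba installed):

```python
import timeit
import numpy as np

def my_loop5(x, a=0.5, b=0.9):  # stand-in; imagine this carries @nb.njit(cache=True)
    out = np.zeros(x.shape[0], dtype=x.dtype)
    for i in range(x.shape[0]):
        if x[i] < out[i - 1]:
            out[i] = (1 - a) * x[i] + a * out[i - 1]
        else:
            out[i] = (1 - b) * x[i] + b * out[i - 1]
    return out

x = np.random.uniform(low=-5.0, high=5.0, size=(10_000,))
my_loop5(x)  # warm-up call: for a jitted function this triggers (and caches) compilation
elapsed = timeit.timeit(lambda: my_loop5(x), number=3)  # now times only execution
assert elapsed > 0.0
```

With cache=True, subsequent processes also skip recompilation by loading the compiled code from disk.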
I realized that by removing dummy variables, this code could be put into a form where numba and @autojit could work their magic and make it "fast":
import numpy as np
from numba import jit, autojit

@autojit
def my_loop4(x, a=0.5, b=0.9):
    out = np.zeros(x.shape[0], dtype=x.dtype)
    for i in np.arange(x.shape[0]):
        if x[i] < out[i-1]:
            out[i] = (1-a)*x[i] + a*out[i-1]
        else:
            out[i] = (1-b)*x[i] + b*out[i-1]
    return out
Without the @autojit decorator, this is still painfully slow. But with it on, problem solved. So removing the unnecessary variables and adding @autojit is what did the trick.

answered Nov 27 '18 at 4:18 by sh37211
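The refactor is behavior-preserving: at i == 0, out[i-1] reads out[-1], which is still 0.0 because out starts as zeros, matching prev_val = 0 in the original. A quick equivalence check without numba (my_loop4_plain is a hypothetical undecorated copy of the answer's function):

```python
import numpy as np

def my_loop(x, a=0.5, b=0.9):          # original version, with prev_val
    out = np.copy(x)
    prev_val = 0
    for i in range(x.shape[0]):
        if x[i] < prev_val:
            new_val = (1 - a) * x[i] + a * prev_val
        else:
            new_val = (1 - b) * x[i] + b * prev_val
        out[i] = new_val
        prev_val = new_val
    return out

def my_loop4_plain(x, a=0.5, b=0.9):   # refactored version, indexing out[i-1]
    out = np.zeros(x.shape[0], dtype=x.dtype)
    for i in range(x.shape[0]):
        if x[i] < out[i - 1]:
            out[i] = (1 - a) * x[i] + a * out[i - 1]
        else:
            out[i] = (1 - b) * x[i] + b * out[i - 1]
    return out

x = np.random.uniform(-5.0, 5.0, size=1000)
assert np.allclose(my_loop(x), my_loop4_plain(x))
```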
Don't use np.arange(x.shape[0]). This creates an array, but you want a simple iterator like range (Python 3) or xrange (Python 2).
– max9111, Nov 27 '18 at 8:26
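The distinction behind this comment: np.arange materializes a full ndarray of indices up front, while range is a lazy sequence, and inside a numba-compiled loop range also lets numba lower the loop to a plain counter. A quick illustration of the allocation difference (pure Python/NumPy):

```python
import numpy as np

n = 1_000_000
arange_idx = np.arange(n)   # allocates an n-element array (~8 MB of int64)
range_idx = range(n)        # constant-size lazy object, no per-element allocation

assert isinstance(arange_idx, np.ndarray)
assert arange_idx.nbytes == n * arange_idx.itemsize
assert not isinstance(range_idx, np.ndarray)
# both iterate over the same indices
assert list(range(5)) == list(np.arange(5))
```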