Vectorizing loop with running value that depends on previous value (+ if statement)












I'm accustomed to writing vectorized statements and list comprehensions in Python, but I've got a problem that combines a "running" computation, where each value depends on the previous value in the loop, with an if statement. Schematically it looks like this:



import numpy as np

def my_loop(x, a=0.5, b=0.9):
    out = np.copy(x)
    prev_val = 0
    for i in np.arange(x.shape[0]):
        # blend x[i] with the running value, using weight a or b
        if x[i] < prev_val:
            new_val = (1-a)*x[i] + a*prev_val
        else:
            new_val = (1-b)*x[i] + b*prev_val
        out[i] = new_val
        prev_val = new_val  # carried into the next iteration
    return out


I haven't been able to figure out how one could vectorize this (e.g. via using some kind of accumulator), so I'll ask: Is there a way to make this more Pythonic/faster?
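One way to phrase the loop with an accumulator, as a minimal sketch: `itertools.accumulate` from the standard library (with its `initial` argument, available since Python 3.8) carries the previous output into each step. The names `step` and `my_loop_accumulate` are hypothetical, and the result is always a float array. Because the step function still runs in the interpreter once per element, this is likely tidier rather than faster.

```python
from itertools import accumulate

import numpy as np


def my_loop_accumulate(x, a=0.5, b=0.9):
    """Same recurrence as my_loop, written with itertools.accumulate."""

    def step(prev_val, xi):
        # pick the weight based on how the new sample compares to the running value
        w = a if xi < prev_val else b
        return (1 - w) * xi + w * prev_val

    # initial=0.0 plays the role of prev_val = 0; it is yielded first, so skip it
    running = accumulate(x.tolist(), step, initial=0.0)
    next(running)
    return np.fromiter(running, dtype=float, count=x.shape[0])
```

The function is called exactly like the original, for example `my_loop_accumulate(np.random.uniform(-5, 5, size=1000))`.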



I've seen previous posts about vectorizing when there's an if statement (usually solved via np.where()), but not one where there's a "running" value that depends on its previous state, so I haven't found any duplicate questions yet. (One existing post isn't about vectorization in the usual sense, and another is about a 'previous value' but refers to list indices.)



So far, I have tried np.vectorize and numba's @jit, and they do run somewhat faster, but neither gives me the speed I'm hoping for. Is there something I'm missing? (Maybe something with map()?) Thanks.



(Yes, in the case of a=b this becomes easy!)
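For reference, a sketch of why the a = b case is easy: the branch disappears, and the recurrence out[i] = (1-a)*x[i] + a*out[i-1] unrolls to out[i] = (1-a) * sum over k <= i of a**(i-k) * x[k], which is a scaled cumulative sum. The function name `ema_closed_form` is hypothetical. The a**(-k) factor overflows for long inputs, so this only illustrates the idea for short arrays with 0 < a < 1.

```python
import numpy as np


def ema_closed_form(x, a=0.5):
    """Closed form of out[i] = (1-a)*x[i] + a*out[i-1], starting from 0.

    Illustration only: a**(-k) grows quickly, so keep the input short.
    """
    k = np.arange(x.shape[0]).astype(float)
    # out[i] = (1-a) * a**i * sum_{k<=i} x[k] * a**(-k)
    return (1 - a) * a**k * np.cumsum(x * a ** (-k))
```

For long signals a dedicated recursive-filter routine would be the more robust choice; the cumulative-sum form is shown only because it needs nothing beyond NumPy.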







python numba














edited Nov 27 '18 at 4:17 by eyllanesc

asked Nov 27 '18 at 3:13 by sh37211








  • I doubt you will get much performance gain with np.vectorize. Quoting from the docs: "The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop."

    – Deepak Saini
    Nov 27 '18 at 4:21














2 Answers


















JIT-compiling in nopython mode is faster. Quoting from the Numba docs:




Numba has two compilation modes: nopython mode and object mode. The
former produces much faster code, but has limitations that can force
Numba to fall back to the latter. To prevent Numba from falling back,
and instead raise an error, pass nopython=True.




import numba as nb
import numpy as np

@nb.njit(cache=True)
def my_loop5(x, a=0.5, b=0.9):
    out = np.zeros(x.shape[0], dtype=x.dtype)
    for i in range(x.shape[0]):
        # at i == 0, out[i-1] is out[-1], which is still 0 (the initial prev_val)
        if x[i] < out[i-1]:
            out[i] = (1-a) * x[i] + a * out[i-1]
        else:
            out[i] = (1-b) * x[i] + b * out[i-1]
    return out


With this input:

x = np.random.uniform(low=-5.0, high=5.0, size=(1000000,))

the timings are (my_loop4 is the @autojit version from the other answer):

my_loop4 : 0.235s

my_loop5 : 0.193s




HTH.















answered Nov 27 '18 at 4:32 by Deepak Saini (edited Nov 27 '18 at 17:06)













  • Interesting, hey thanks for trying this. When I compare autojit with njit, the latter is about the same but a bit slower. 41.1 µs per loop for autojit vs. 41.2 µs per loop for njit.

    – sh37211
    Nov 27 '18 at 4:38











  • @sh37211 Try larger inputs, since compile time is also included, or time a second run after the compiled code is cached.

    – Deepak Saini
    Nov 27 '18 at 4:41
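The point about compile time can be checked with a small timing sketch: call the jitted function once to trigger compilation, then time a later call. The helper name `time_call` is hypothetical. The fallback decorator is an assumption added only so the sketch still runs where Numba is not installed, in which case nothing is compiled and both timings measure the plain Python loop.

```python
import time

import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: fall back to plain Python if Numba is missing
    def njit(func):
        return func


@njit
def my_loop5(x, a=0.5, b=0.9):
    out = np.zeros(x.shape[0], dtype=x.dtype)
    for i in range(x.shape[0]):
        # out[i-1] at i == 0 reads out[-1], which is still 0
        if x[i] < out[i-1]:
            out[i] = (1 - a) * x[i] + a * out[i-1]
        else:
            out[i] = (1 - b) * x[i] + b * out[i-1]
    return out


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


x = np.random.uniform(low=-5.0, high=5.0, size=(100000,))
_, first = time_call(my_loop5, x)    # includes JIT compilation when Numba is present
_, second = time_call(my_loop5, x)   # reuses the already compiled code
print(f"first call: {first:.4f}s, second call: {second:.4f}s")
```

For very short inputs the per-call overhead dominates, which may be why two decorators look identical at the microsecond scale.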



















I realized that by removing dummy variables, this code could be put into a form where numba and @autojit could work their magic and make it "fast":



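import numpy as np
from numba import jit, autojit  # autojit is deprecated in later Numba releases in favor of jit

@autojit
def my_loop4(x, a=0.5, b=0.9):
    out = np.zeros(x.shape[0], dtype=x.dtype)
    for i in np.arange(x.shape[0]):
        # at i == 0, out[i-1] wraps around to out[-1], which is still 0
        if x[i] < out[i-1]:
            out[i] = (1-a)*x[i] + a*out[i-1]
        else:
            out[i] = (1-b)*x[i] + b*out[i-1]
    return out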


Without the @autojit, this is still painfully slow. But with it on, the problem is solved. So removing the unnecessary variables and adding @autojit is what did the trick.

















answered Nov 27 '18 at 4:18 by sh37211













  • Don't use np.arange(x.shape[0]). This creates an array, but you want a simple iterator like range (Python 3) or xrange (Python 2).

    – max9111
    Nov 27 '18 at 8:26
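A quick way to see the difference, as a rough sketch: range is a small lazy object whose size does not depend on its length, while np.arange materializes every index in memory. Exact byte counts depend on the platform and Python build, so none are shown here.

```python
import sys

import numpy as np

n = 1_000_000

lazy_indices = range(n)        # constant-size object, yields indices on demand
array_indices = np.arange(n)   # allocates an array holding all n indices

print("range object bytes:", sys.getsizeof(lazy_indices))
print("arange buffer bytes:", array_indices.nbytes)
```

Both produce the same sequence of indices, so swapping one for the other in the loop header does not change the result.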


















