Is it possible to use apply function or vectorization on this code logic?
I am trying to calculate a closing balance.
Input DataFrame:
       open  inOut  close
    0     3    100      0
    1     0    300      0
    2     0    200      0
    3     0    230      0
    4     0    150      0
Output DataFrame:
       open  inOut  close
    0     3    100    103
    1   103    300    403
    2   403    200    603
    3   603    230    833
    4   833    150    983
I am able to achieve this with a crude for-loop, and to optimize it I have used iterrows().
For-Loop
    %%timeit
    for i in range(len(df.index)):
        if i>0:
            df.iloc[i]['open'] = df.iloc[i-1]['close']
            df.iloc[i]['close'] = df.iloc[i]['open']+df.iloc[i]['inOut']
        else:
            df.iloc[i]['close'] = df.iloc[i]['open']+df.iloc[i]['inOut']
    1.64 ms ± 51.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
iterrows
    %%timeit
    for index,row in dfOg.iterrows():
        if index>0:
            row['open'] = dfOg.iloc[index-1]['close']
            row['close'] = row['open']+row['inOut']
        else:
            row['close'] = row['open']+row['inOut']
    627 µs ± 28.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
This improved the runtime from 1.64 ms to 627 µs.
Following this blog, I am struggling to figure out how to write the above logic using apply() or vectorization.
For vectorization, I tried shifting the columns but was not able to achieve the desired output.
python pandas numpy vectorization apply
edited Nov 27 '18 at 9:25
asked Nov 27 '18 at 8:58
Mukesh Suthar
I'm sorry, I made a silly mistake in the closing balance logic.
– Mukesh Suthar
Nov 27 '18 at 9:25
.apply is not vectorization.
– juanpa.arrivillaga
Nov 27 '18 at 9:32
@juanpa.arrivillaga Yes, I agree, but as per the blog I mentioned, apply is faster than iterrows().
– Mukesh Suthar
Nov 27 '18 at 9:34
You should use itertuples, and apply won't really be faster than that. Note that your iterrows version doesn't work: it doesn't modify the original data-frame.
– juanpa.arrivillaga
Nov 27 '18 at 9:50
Thanks, @juanpa.arrivillaga, I'll check the performance of itertuples as well.
– Mukesh Suthar
Nov 27 '18 at 9:51
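For reference, the itertuples suggestion from the comments could be sketched roughly as below. This is only an illustration under assumptions: the sample frame is rebuilt from the question, and the names opens, closes and balance are made up for the example. The results are collected in plain lists and assigned back as whole columns, which sidesteps the problem that writing to the row objects yielded by an iterator does not update the original data-frame.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's frame).
df = pd.DataFrame({
    "open": [3, 0, 0, 0, 0],
    "inOut": [100, 300, 200, 230, 150],
    "close": [0, 0, 0, 0, 0],
})

opens, closes = [], []
balance = int(df["open"].iloc[0])  # opening balance of the first row

# itertuples yields lightweight namedtuples; we only read from them.
for row in df.itertuples(index=False):
    opens.append(balance)          # this row opens at the running balance
    balance = balance + row.inOut  # apply the movement for this row
    closes.append(balance)         # and close at the updated balance

# Assign whole columns so the original DataFrame is actually modified.
df["open"] = opens
df["close"] = closes
print(df)
```

This is still a Python-level loop, so it is not vectorized; it mainly avoids building a Series per row the way iterrows does.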
2 Answers
Edit: I changed things around to match the edits OP made to the question
You can do what you want in a vectorized way without any loops like this:
    import pandas as pd
    d = {'open': [3] + [0]*4, 'inOut': [100, 300, 200, 230, 150], 'close': [0]*5}
    df = pd.DataFrame(d)
    df['close'].values[:] = df['open'].values[0] + df['inOut'].values.cumsum()
    df['open'].values[1:] = df['close'].values[:-1]
Timing with %%timeit:
    529 µs ± 5.39 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Output:
       close  inOut  open
    0    103    100     3
    1    403    300   103
    2    603    200   403
    3    833    230   603
    4    983    150   833
So vectorizing your code this way is indeed somewhat faster. In fact, it's probably about as fast as possible. You can see this by timing just the dataframe creation code:
    %%timeit
    d = {'open': [3] + [0]*4, 'inOut': [100, 300, 200, 230, 150], 'close': [0]*5}
    df = pd.DataFrame(d)
Result:
    367 µs ± 5.67 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Subtracting out the time it takes to create the dataframe, the vectorized version of filling in your dataframe takes only about 160 µs (529 µs - 367 µs).
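The same idea can also be written with column-level pandas operations instead of writing through .values, which may be preferable on pandas versions where the arrays returned by .values are read-only or are copies. The following is a sketch under that assumption, not a re-timing of the code above; the variable name opening is made up for the example, and shift(fill_value=...) needs a reasonably recent pandas.

```python
import pandas as pd

# Same sample frame as in the answer above.
d = {'open': [3] + [0]*4, 'inOut': [100, 300, 200, 230, 150], 'close': [0]*5}
df = pd.DataFrame(d)

# Opening balance of the first row seeds the running total.
opening = df['open'].iloc[0]

# close_i = opening + inOut_0 + ... + inOut_i, i.e. a cumulative sum.
df['close'] = opening + df['inOut'].cumsum()

# Each row opens at the previous row's close; the first row keeps its opening.
df['open'] = df['close'].shift(fill_value=opening)
print(df)
```

Because every row's close is just the first opening balance plus the running total of inOut, no row-by-row dependency remains once the cumulative sum has been taken.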
edited Nov 27 '18 at 9:39
answered Nov 27 '18 at 9:10
tel
Kindly revisit the question. By the way, I like this simple approach, but I doubt whether it will work for calculating the closing balance.
– Mukesh Suthar
Nov 27 '18 at 9:26
You can use np.where (this needs import numpy as np):
    %%timeit
    df['open'] = np.where(df.index==0, df['open'], df['inOut'].shift())
    df['close'] = df['open'] + df['inOut']
    # 1.07 ms ± 16.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Output:
        open  inOut  close
    0    3.0    100  103.0
    1  100.0    300  400.0
    2  300.0    200  500.0
    3  200.0    230  430.0
    4  230.0    150  380.0
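Because the snippet above shifts inOut rather than the closing balance, each row's open only sees the previous row's movement, not the accumulated balance the question asks for. A hedged variant that keeps the np.where structure but feeds it a cumulative closing balance might look like this; the intermediate name running_close and the final cast back to integers are assumptions added for the example.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "open": [3, 0, 0, 0, 0],
    "inOut": [100, 300, 200, 230, 150],
    "close": [0, 0, 0, 0, 0],
})

# Running closing balance: first opening balance plus cumulative movements.
running_close = df["open"].iloc[0] + df["inOut"].cumsum()

# First row keeps its own opening balance; later rows open at the previous close.
# shift() puts NaN in row 0, but np.where never selects that element.
df["open"] = np.where(df.index == 0, df["open"], running_close.shift()).astype("int64")
df["close"] = df["open"] + df["inOut"]
print(df)
```

The np.where call itself is unchanged in spirit; only the shifted column differs.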
edited Nov 27 '18 at 9:27
answered Nov 27 '18 at 9:17
AkshayNevrekar
That's pretty slick, but it's slow. I guess that comes from the array construction that np.where does?
– tel
Nov 27 '18 at 9:26
@tel Yeah, it's a bit slower than your answer, as there's a condition check in np.where.
– AkshayNevrekar
Nov 27 '18 at 9:31