Awk to update file based on match and condition in another

The below awk will produce the tab-delimeted file1 with the difference between $3-$2 calulated for each line and printed in $6. Before the awk is executed only 5 fields exist.

What I am having trouble with updated each $2 value in file2 with the $7 value of file1 if the $1 value of file2 matches the $5 of file1 and $6 in file1 is not intron. If the value of $5 is intron then then the value of $7 in file1 is zero. So for example line 1 in file1 is intron so that is equvilant to zero or skipped (those lines are not needed in the calculation).

It is possible that a $1 value in file2 may not exist in file1 and in this case the value of $2 in file2 is zero. Line3 infile2 is an example and is set to zero because it does not exist in file1. Thank you:).

Awk w/ output

awk '

  FNR==NR{                 # process same line

     b[$4]=$3-$2;

     next  # process next line

}

{

 a[$5]+=($3-$2)

}

{

  split($1, b, " "); print b[0], a[b[0]]

}' OFS="t" file1 file2

Output

File1 tab-delimited

chr5    86667863    86667879    RASA1   intron  16

chr5    86669977    86669995    RASA1   splicing    18

chr5    86670703    86670805    RASA1   exon    102

chr5    86679453    86679547    RASA1   intron  94

chr5    86679571    86679673    RASA1   exon    102

chr19   15088950    15088961    NOTCH2  intron  50

chr19   15288950    15288961    NOTCH3  intron  11

chr19   15308240    15308275    NOTCH3  exon    35

File2 space delimited

RASA1 2135

NOTCH2 0

GIMAP8 87

NOTCH3 129

FOXF2 0

PRB3 63

Desired out after file2 is updated

RASA1 222  `(102+102+18)`

NOTCH2 0

GIMAP8 0

NOTCH3 35  `(35)`

FOXF2 0

PRB3 0

Maybe adding a | after the first awk with:

awk 'FNR==NR { a[$1]=$7; next } { if(a[$5]){$1=a[$5] }; print }'

To update file2

edited Nov 23 '18 at 20:58

asked Nov 23 '18 at 18:41

cm0728

1,486819

1

Instead of making the index of the array FNR make it $4 and increment the value stored in the array at that index by the result of your $3-$2 calcuation. Like a[$5]+=($3-$2) Then when processing the second file you can just pull the item from the array at the index that matches the first string in $1 like {split($1, b, " "); print b[0], a[b[0]] (pardon if that syntax isn't 100%, but that should get in the ballpark.
– JNevill
Nov 23 '18 at 18:54

I made the edits and don't think I follow, but included the output in the original post. Thank you :).
– cm0728
Nov 23 '18 at 20:55

add a comment |

The below awk will produce the tab-delimeted file1 with the difference between $3-$2 calulated for each line and printed in $6. Before the awk is executed only 5 fields exist.

Awk w/ output

awk '

  FNR==NR{                 # process same line

     b[$4]=$3-$2;

     next  # process next line

}

{

 a[$5]+=($3-$2)

}

{

  split($1, b, " "); print b[0], a[b[0]]

}' OFS="t" file1 file2

Output

File1 tab-delimited

chr5    86667863    86667879    RASA1   intron  16

chr5    86669977    86669995    RASA1   splicing    18

chr5    86670703    86670805    RASA1   exon    102

chr5    86679453    86679547    RASA1   intron  94

chr5    86679571    86679673    RASA1   exon    102

chr19   15088950    15088961    NOTCH2  intron  50

chr19   15288950    15288961    NOTCH3  intron  11

chr19   15308240    15308275    NOTCH3  exon    35

File2 space delimited

RASA1 2135

NOTCH2 0

GIMAP8 87

NOTCH3 129

FOXF2 0

PRB3 63

Desired out after file2 is updated

RASA1 222  `(102+102+18)`

NOTCH2 0

GIMAP8 0

NOTCH3 35  `(35)`

FOXF2 0

PRB3 0

Maybe adding a | after the first awk with:

awk 'FNR==NR { a[$1]=$7; next } { if(a[$5]){$1=a[$5] }; print }'

To update file2

edited Nov 23 '18 at 20:58

asked Nov 23 '18 at 18:41

cm0728

1,486819

1

Instead of making the index of the array FNR make it $4 and increment the value stored in the array at that index by the result of your $3-$2 calcuation. Like a[$5]+=($3-$2) Then when processing the second file you can just pull the item from the array at the index that matches the first string in $1 like {split($1, b, " "); print b[0], a[b[0]] (pardon if that syntax isn't 100%, but that should get in the ballpark.
– JNevill
Nov 23 '18 at 18:54

I made the edits and don't think I follow, but included the output in the original post. Thank you :).
– cm0728
Nov 23 '18 at 20:55

add a comment |

The below awk will produce the tab-delimeted file1 with the difference between $3-$2 calulated for each line and printed in $6. Before the awk is executed only 5 fields exist.

Awk w/ output

awk '

  FNR==NR{                 # process same line

     b[$4]=$3-$2;

     next  # process next line

}

{

 a[$5]+=($3-$2)

}

{

  split($1, b, " "); print b[0], a[b[0]]

}' OFS="t" file1 file2

Output

File1 tab-delimited

chr5    86667863    86667879    RASA1   intron  16

chr5    86669977    86669995    RASA1   splicing    18

chr5    86670703    86670805    RASA1   exon    102

chr5    86679453    86679547    RASA1   intron  94

chr5    86679571    86679673    RASA1   exon    102

chr19   15088950    15088961    NOTCH2  intron  50

chr19   15288950    15288961    NOTCH3  intron  11

chr19   15308240    15308275    NOTCH3  exon    35

File2 space delimited

RASA1 2135

NOTCH2 0

GIMAP8 87

NOTCH3 129

FOXF2 0

PRB3 63

Desired out after file2 is updated

RASA1 222  `(102+102+18)`

NOTCH2 0

GIMAP8 0

NOTCH3 35  `(35)`

FOXF2 0

PRB3 0

Maybe adding a | after the first awk with:

awk 'FNR==NR { a[$1]=$7; next } { if(a[$5]){$1=a[$5] }; print }'

To update file2

edited Nov 23 '18 at 20:58

asked Nov 23 '18 at 18:41

cm0728

1,486819

The below awk will produce the tab-delimeted file1 with the difference between $3-$2 calulated for each line and printed in $6. Before the awk is executed only 5 fields exist.

Awk w/ output

awk '

  FNR==NR{                 # process same line

     b[$4]=$3-$2;

     next  # process next line

}

{

 a[$5]+=($3-$2)

}

{

  split($1, b, " "); print b[0], a[b[0]]

}' OFS="t" file1 file2

Output

File1 tab-delimited

chr5    86667863    86667879    RASA1   intron  16

chr5    86669977    86669995    RASA1   splicing    18

chr5    86670703    86670805    RASA1   exon    102

chr5    86679453    86679547    RASA1   intron  94

chr5    86679571    86679673    RASA1   exon    102

chr19   15088950    15088961    NOTCH2  intron  50

chr19   15288950    15288961    NOTCH3  intron  11

chr19   15308240    15308275    NOTCH3  exon    35

File2 space delimited

RASA1 2135

NOTCH2 0

GIMAP8 87

NOTCH3 129

FOXF2 0

PRB3 63

Desired out after file2 is updated

RASA1 222  `(102+102+18)`

NOTCH2 0

GIMAP8 0

NOTCH3 35  `(35)`

FOXF2 0

PRB3 0

Maybe adding a | after the first awk with:

awk 'FNR==NR { a[$1]=$7; next } { if(a[$5]){$1=a[$5] }; print }'

To update file2

awk

edited Nov 23 '18 at 20:58

asked Nov 23 '18 at 18:41

cm0728

1,486819

edited Nov 23 '18 at 20:58

asked Nov 23 '18 at 18:41

cm0728

1,486819

edited Nov 23 '18 at 20:58

asked Nov 23 '18 at 18:41

cm0728

1,486819

asked Nov 23 '18 at 18:41

cm0728

1,486819

asked Nov 23 '18 at 18:41

cm0728

1,486819

1

Instead of making the index of the array FNR make it $4 and increment the value stored in the array at that index by the result of your $3-$2 calcuation. Like a[$5]+=($3-$2) Then when processing the second file you can just pull the item from the array at the index that matches the first string in $1 like {split($1, b, " "); print b[0], a[b[0]] (pardon if that syntax isn't 100%, but that should get in the ballpark.
– JNevill
Nov 23 '18 at 18:54

I made the edits and don't think I follow, but included the output in the original post. Thank you :).
– cm0728
Nov 23 '18 at 20:55

add a comment |

1

Instead of making the index of the array FNR make it $4 and increment the value stored in the array at that index by the result of your $3-$2 calcuation. Like a[$5]+=($3-$2) Then when processing the second file you can just pull the item from the array at the index that matches the first string in $1 like {split($1, b, " "); print b[0], a[b[0]] (pardon if that syntax isn't 100%, but that should get in the ballpark.
– JNevill
Nov 23 '18 at 18:54

I made the edits and don't think I follow, but included the output in the original post. Thank you :).
– cm0728
Nov 23 '18 at 20:55

Instead of making the index of the array FNR make it $4 and increment the value stored in the array at that index by the result of your $3-$2 calcuation. Like a[$5]+=($3-$2) Then when processing the second file you can just pull the item from the array at the index that matches the first string in $1 like {split($1, b, " "); print b[0], a[b[0]] (pardon if that syntax isn't 100%, but that should get in the ballpark.
– JNevill
Nov 23 '18 at 18:54

I made the edits and don't think I follow, but included the output in the original post. Thank you :).
– cm0728
Nov 23 '18 at 20:55

add a comment |

2 Answers
2

active

oldest

votes

Could you please try following. It will provide you sequence of output in same order of Input_file's order.

awk '

FNR==NR{

  if(!b[$1]++){

     c[++count]=$1

  }

  a[$1]

  next

}

($4 in a) && $5!="intron"{

  a[$4]+=$NF

}

END{

  for(i=1;i<=count;i++){

    print c[i],a[c[i]]?a[c[i]]:0

  }

}'  Input_file2   Input_file1

Since your Input_file1 is NOT TAB delimited as per your claim, so in case it is then edit Input_file2 Input_file1 -----> Input_file2 FS="t" Input_file1. To get output as TAB delimited either append above code's output to | column -t command or set OFS="t" near to FS="t" too.

Output will be as follows.

RASA1 222

NOTCH2 0

GIMAP8 0

NOTCH3 35

FOXF2 0

PRB3 0

answered Nov 23 '18 at 23:44

RavinderSingh13

25.7k41438

add a comment |

if I understood it correctly, this should do what you expect

$ awk 'FNR==NR && $5!="intron" {a[$4]+=$3-$2; next}

       {$2=($1 in a)?a[$1]:0}1' file1 file2 > file2.updated

edited Nov 23 '18 at 23:58

answered Nov 23 '18 at 23:24

karakfa

48.2k52738

IMHO karafka sir, IMHO, I think OP mentioned it wrongly as per shown samples field from Input_file1 is 4th please.
– RavinderSingh13
Nov 23 '18 at 23:48

1

Yes, thanks. Fixed to $5
– karakfa
Nov 23 '18 at 23:59

Thank you both very much :).
– cm0728
Nov 26 '18 at 13:24

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53451595%2fawk-to-update-file-based-on-match-and-condition-in-another%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

Could you please try following. It will provide you sequence of output in same order of Input_file's order.

awk '

FNR==NR{

  if(!b[$1]++){

     c[++count]=$1

  }

  a[$1]

  next

}

($4 in a) && $5!="intron"{

  a[$4]+=$NF

}

END{

  for(i=1;i<=count;i++){

    print c[i],a[c[i]]?a[c[i]]:0

  }

}'  Input_file2   Input_file1

Output will be as follows.

RASA1 222

NOTCH2 0

GIMAP8 0

NOTCH3 35

FOXF2 0

PRB3 0

answered Nov 23 '18 at 23:44

RavinderSingh13

25.7k41438

add a comment |

Could you please try following. It will provide you sequence of output in same order of Input_file's order.

awk '

FNR==NR{

  if(!b[$1]++){

     c[++count]=$1

  }

  a[$1]

  next

}

($4 in a) && $5!="intron"{

  a[$4]+=$NF

}

END{

  for(i=1;i<=count;i++){

    print c[i],a[c[i]]?a[c[i]]:0

  }

}'  Input_file2   Input_file1

Output will be as follows.

RASA1 222

NOTCH2 0

GIMAP8 0

NOTCH3 35

FOXF2 0

PRB3 0

answered Nov 23 '18 at 23:44

RavinderSingh13

25.7k41438

add a comment |

Could you please try following. It will provide you sequence of output in same order of Input_file's order.

awk '

FNR==NR{

  if(!b[$1]++){

     c[++count]=$1

  }

  a[$1]

  next

}

($4 in a) && $5!="intron"{

  a[$4]+=$NF

}

END{

  for(i=1;i<=count;i++){

    print c[i],a[c[i]]?a[c[i]]:0

  }

}'  Input_file2   Input_file1

Output will be as follows.

RASA1 222

NOTCH2 0

GIMAP8 0

NOTCH3 35

FOXF2 0

PRB3 0

answered Nov 23 '18 at 23:44

RavinderSingh13

25.7k41438

Could you please try following. It will provide you sequence of output in same order of Input_file's order.

awk '

FNR==NR{

  if(!b[$1]++){

     c[++count]=$1

  }

  a[$1]

  next

}

($4 in a) && $5!="intron"{

  a[$4]+=$NF

}

END{

  for(i=1;i<=count;i++){

    print c[i],a[c[i]]?a[c[i]]:0

  }

}'  Input_file2   Input_file1

Output will be as follows.

RASA1 222

NOTCH2 0

GIMAP8 0

NOTCH3 35

FOXF2 0

PRB3 0

answered Nov 23 '18 at 23:44

RavinderSingh13

25.7k41438

answered Nov 23 '18 at 23:44

RavinderSingh13

25.7k41438

answered Nov 23 '18 at 23:44

RavinderSingh13

25.7k41438

answered Nov 23 '18 at 23:44

RavinderSingh13

25.7k41438

add a comment |

if I understood it correctly, this should do what you expect

$ awk 'FNR==NR && $5!="intron" {a[$4]+=$3-$2; next}

       {$2=($1 in a)?a[$1]:0}1' file1 file2 > file2.updated

edited Nov 23 '18 at 23:58

answered Nov 23 '18 at 23:24

karakfa

48.2k52738

IMHO karafka sir, IMHO, I think OP mentioned it wrongly as per shown samples field from Input_file1 is 4th please.
– RavinderSingh13
Nov 23 '18 at 23:48

1

Yes, thanks. Fixed to $5
– karakfa
Nov 23 '18 at 23:59

Thank you both very much :).
– cm0728
Nov 26 '18 at 13:24

add a comment |

if I understood it correctly, this should do what you expect

$ awk 'FNR==NR && $5!="intron" {a[$4]+=$3-$2; next}

       {$2=($1 in a)?a[$1]:0}1' file1 file2 > file2.updated

edited Nov 23 '18 at 23:58

answered Nov 23 '18 at 23:24

karakfa

48.2k52738

IMHO karafka sir, IMHO, I think OP mentioned it wrongly as per shown samples field from Input_file1 is 4th please.
– RavinderSingh13
Nov 23 '18 at 23:48

1

Yes, thanks. Fixed to $5
– karakfa
Nov 23 '18 at 23:59

Thank you both very much :).
– cm0728
Nov 26 '18 at 13:24

add a comment |

if I understood it correctly, this should do what you expect

$ awk 'FNR==NR && $5!="intron" {a[$4]+=$3-$2; next}

       {$2=($1 in a)?a[$1]:0}1' file1 file2 > file2.updated

edited Nov 23 '18 at 23:58

answered Nov 23 '18 at 23:24

karakfa

48.2k52738

if I understood it correctly, this should do what you expect

$ awk 'FNR==NR && $5!="intron" {a[$4]+=$3-$2; next}

       {$2=($1 in a)?a[$1]:0}1' file1 file2 > file2.updated

edited Nov 23 '18 at 23:58

answered Nov 23 '18 at 23:24

karakfa

48.2k52738

edited Nov 23 '18 at 23:58

answered Nov 23 '18 at 23:24

karakfa

48.2k52738

answered Nov 23 '18 at 23:24

karakfa

48.2k52738

answered Nov 23 '18 at 23:24

karakfa

48.2k52738

IMHO karafka sir, IMHO, I think OP mentioned it wrongly as per shown samples field from Input_file1 is 4th please.
– RavinderSingh13
Nov 23 '18 at 23:48

1

Yes, thanks. Fixed to $5
– karakfa
Nov 23 '18 at 23:59

Thank you both very much :).
– cm0728
Nov 26 '18 at 13:24

add a comment |

IMHO karafka sir, IMHO, I think OP mentioned it wrongly as per shown samples field from Input_file1 is 4th please.
– RavinderSingh13
Nov 23 '18 at 23:48

1

Yes, thanks. Fixed to $5
– karakfa
Nov 23 '18 at 23:59

Thank you both very much :).
– cm0728
Nov 26 '18 at 13:24

IMHO karafka sir, IMHO, I think OP mentioned it wrongly as per shown samples field from Input_file1 is 4th please.
– RavinderSingh13
Nov 23 '18 at 23:48

Yes, thanks. Fixed to $5
– karakfa
Nov 23 '18 at 23:59

Thank you both very much :).
– cm0728
Nov 26 '18 at 13:24

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Btukfyl