Vala regex using groups with subgroups keeps ending in a segmentation fault


























It is really simple. I am trying to use a regex to identify certain property values in a line of a vCard string.



So, here's the code:



    int main(string[] args){
        string input = "TEL;VALUE=uri;PREF=1;TYPE=\"voice,home\":tel:+1-555-555-5555;ext=5555";

        string regString = "(tel:(?<phnum>.*);)*(?<pref>PREF=1;)*";

        Regex regex = new Regex(regString);
        MatchInfo match;

        regex.match(input_end, 0, out match);

        stdout.printf(match.fetch_named("phnum"));
        stdout.printf(match.fetch_named(pref));

        return 0;
    }



What I really want is for the named group phnum to capture a subgroup of characters whenever it appears in the input (hence the * on the outside of the group), so that match.fetch_named("phnum") returns "+1-555-555-5555".



I am just getting segmentation faults, even though regex tester apps seem to accept the pattern well enough.

















      regex vala














edited Nov 23 '18 at 22:58 by toolic
asked Nov 23 '18 at 22:39 by user3801839
























          4 Answers
There are a number of things that can be done to improve the Vala code:

• GLib's Regex binding to PCRE reports an invalid regular expression as an error whose message gives some details. In Vala this message can be read by putting new Regex () in a try...catch block.
• regex.match() returns true when a match is found, so wrapping regex.match() in an if statement makes the program more robust.
• Vala has the null coalescing operator, ??, which is a convenient way of providing an alternative value when a value is null.
• MatchInfo has the next() method, which combined with Vala's do {} while () loop gives a good way of retrieving multiple matches safely.

The regex you are using needs to exclude the terminating character, ;. So tel:(?<phnum>[^;|.]*); matches the characters after tel: up to the next ;. Inside a character class, | and . are literal characters, so this class also excludes | and .; [^;]* alone would exclude only the ;.

Here is a working example putting all that together:



    int main () {
        string input = "TEL;VALUE=uri;PREF=1;TYPE=\"voice,home\":tel:+1-555-555-5555;ext=5555";

        string regString = "tel:(?<phnum>[^;|.]*);|PREF=(?<pref>[0-9]*);";
        Regex regex;
        MatchInfo match;
        try {
            regex = new Regex(regString);
            if (regex.match(input, 0, out match)) {
                do {
                    stdout.printf("Phone number: %s\n", match.fetch_named("phnum") ?? "None");
                    stdout.printf("Preference: %s\n", match.fetch_named("pref") ?? "None");
                }
                while (match.next());
            }
        }
        catch (Error error) {
            print (@"$(error.message)\n");
            return 1;
        }

        return 0;
    }


This outputs:

    Phone number: 
    Preference: 1
    Phone number: +1-555-555-5555
    Preference: None

There are two matches. What is interesting is that the first match returns an empty string for the phone number. This is because phnum is a valid sub-pattern that didn't match anything. Why pref is null for the second match is unclear to me. That needs some more investigation into what is going on in the regex engine, but this hopefully gives you enough to get on with.

















answered Nov 24 '18 at 15:51 by AlThomas












• Excellent! Though, will this also work if PREF=1; is not in the input string at all?
  – user3801839, Nov 27 '18 at 12:10
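
On the follow-up about a line without PREF=1;, here is a minimal sketch that assumes the alternation approach from the answer. The helper name parse_tel and the shortened pattern (no trailing ; required, [0-9]+ for the preference) are illustrative choices, not part of the original answer. Each named group is copied only when it is non-null and non-empty, so a missing property is simply left as null. One plausible reading of the null in the output shown earlier, stated here as an assumption, is that GLib returns null for a group numbered after the last group that took part in the match, and an empty string for an unset group numbered before it.

```vala
// Hypothetical helper (not from the thread): collects the phone number and
// the PREF value from one vCard line. An absent property is left as null.
void parse_tel (string line, out string? phnum, out string? pref) throws RegexError {
    phnum = null;
    pref = null;

    // Assumed pattern: either alternative may match; neither is required.
    var regex = new Regex ("tel:(?<phnum>[^;]*)|PREF=(?<pref>[0-9]+)");
    MatchInfo match;

    if (!regex.match (line, 0, out match)) {
        return;
    }
    do {
        // An unset group can come back as null or as an empty string,
        // so both are treated as "not present in this match".
        string? text = match.fetch_named ("phnum");
        if (text != null && text != "") {
            phnum = text;
        }
        text = match.fetch_named ("pref");
        if (text != null && text != "") {
            pref = text;
        }
    } while (match.next ());
}

int main () {
    try {
        string? phnum;
        string? pref;
        // Same line as in the question, but without PREF=1;
        parse_tel ("TEL;VALUE=uri;TYPE=\"voice,home\":tel:+1-555-555-5555;ext=5555",
                   out phnum, out pref);
        stdout.printf ("Phone number: %s\n", phnum ?? "None");
        stdout.printf ("Preference: %s\n", pref ?? "None");
    } catch (RegexError e) {
        stderr.printf ("%s\n", e.message);
        return 1;
    }
    return 0;
}
```

For the PREF-less line in main, the phone number is still found and the preference falls back to None. Passing the original line from the question fills in both values.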


















printf takes a format string first. You need to change those to:

    stdout.printf("%s\n", match.fetch_named("phnum"));
    stdout.printf("%s\n", match.fetch_named("pref"));

If the format string is null, printf will segfault.

If you don't want to bother with a format string, you can use FileStream.puts, but you still need a null check:

    if (match.fetch_named("phnum") != null)
        stdout.puts(match.fetch_named("phnum"));
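
The null check can also be folded into a small helper so that each group is fetched only once. This is a sketch: group_text and its placeholder argument are hypothetical names, and treating an empty string like null is a choice made for this example rather than something the answer prescribes.

```vala
// Hypothetical helper: returns the text of a named group, or `placeholder`
// when the group is null or empty.
string group_text (MatchInfo match, string name, string placeholder) {
    string? text = match.fetch_named (name);
    if (text == null || text == "") {
        return placeholder;
    }
    return text;
}

int main () {
    try {
        var regex = new Regex ("tel:(?<phnum>[^;]+)");
        MatchInfo match;
        if (regex.match ("tel:+1-555-555-5555;ext=5555", 0, out match)) {
            // puts takes no format string, so the fetched text is never
            // interpreted as one.
            stdout.puts (group_text (match, "phnum", "(not set)"));
            stdout.putc ('\n');
        }
    } catch (RegexError e) {
        stderr.printf ("%s\n", e.message);
        return 1;
    }
    return 0;
}
```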
















answered Nov 24 '18 at 3:19 by apmasell












• This solves the segmentation fault, thanks. But I still cannot seem to form a regex that lets me do the above and then call match.fetch_named("phnum") and match.fetch_named("pref").
  – user3801839, Nov 24 '18 at 9:34


















I think regex.match(input_end should also be regex.match(input

Without taking the exact format of the phone number into account, one possible solution is to match the allowed characters with a character class and get the value from the group named phnum:

    tel:(?<phnum>[0-9+-]+)

A somewhat broader match is to use a negated character class, [^, to exclude what you don't want and get the value from the group named phnum:

    tel:(?<phnum>[^\r\n;]+)
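
As a rough sketch of how the negated-class pattern could be used from Vala (extract_phone is a hypothetical helper, not from the answer): a verbatim triple-quoted string keeps the backslashes in \r and \n intact, so they reach the regex engine as escapes.

```vala
// Hypothetical helper: returns the phone number from a vCard TEL line,
// or null when the line has no tel: value.
string? extract_phone (string line) throws RegexError {
    // Verbatim string: the backslashes are passed to the regex unchanged.
    var regex = new Regex ("""tel:(?<phnum>[^\r\n;]+)""");
    MatchInfo match;
    if (regex.match (line, 0, out match)) {
        return match.fetch_named ("phnum");
    }
    return null;
}

int main () {
    try {
        string input = "TEL;VALUE=uri;PREF=1;TYPE=\"voice,home\":tel:+1-555-555-5555;ext=5555";
        stdout.printf ("%s\n", extract_phone (input) ?? "None");
    } catch (RegexError e) {
        stderr.printf ("%s\n", e.message);
        return 1;
    }
    return 0;
}
```

Because the class stops at the first ;, the match does not depend on what follows the number, and the number may also be the last field on the line.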

















answered Nov 24 '18 at 11:19 by The fourth bird























Your regex should be shortened to:

    tel:(?<phnum>.*);$

The phnum group will then contain the phone number, provided the number is the last field on the line and is followed by a closing ;.

















answered Nov 23 '18 at 23:40 by Poul Bak





























