GridSearchCV fitting

I'm having trouble fitting my classifier with binarized labels.

    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    clf_linear = GridSearchCV(SVC(kernel='linear', class_weight='balanced'),
                              param_grid, cv=5)
    clf_linear = clf_linear.fit(X_train_pca, y_train)

y_train was binarized as follows:

    from sklearn.preprocessing import label_binarize

    y_train = label_binarize(y_train, classes=[1, 2, 3])

I get the following error:

      File "C:\Python\lib\site-packages\sklearn\utils\validation.py", line 788, in column_or_1d
        raise ValueError("bad input shape {0}".format(shape))
    ValueError: bad input shape (545, 3)

The input label array has shape (682, 3), not (545, 3).

My professor told me to use binarized labels with GridSearchCV, but from reading the scikit-learn docs I don't think I can.

Tags: python, scikit-learn, gridsearchcv






asked Nov 25 '18 at 17:56
Arthur Bernardo

1 Answer

It doesn't matter whether the shape is (682, 3) or (545, 3); the problem is that the target has 3 columns at all. (The 545 rows are most likely the training portion of one of the 5 cross-validation splits, which is about four fifths of 682.) For SVC, y (the targets) must be a 1-D array. You don't need the label_binarize step. Keep y_train as it is.

Doing this:

    y_train = label_binarize(y_train, classes=[1, 2, 3])

This converts y_train to a label-indicator matrix. That format is used for multi-label classification problems (where a sample can belong to more than one class at a time). It is not used for multi-class problems.
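
To see the difference concretely, here is a minimal sketch. The label values are made up for illustration, and `type_of_target` is the scikit-learn helper that reports how a target array is interpreted.

```python
import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn.utils.multiclass import type_of_target

# Hypothetical 1-D multi-class labels standing in for the asker's y_train.
y = np.array([1, 2, 3, 2, 1, 3])

# One column per class: the result is a 2-D label-indicator matrix.
y_bin = label_binarize(y, classes=[1, 2, 3])

print(y.shape, type_of_target(y))          # (6,) multiclass
print(y_bin.shape, type_of_target(y_bin))  # (6, 3) multilabel-indicator
```

The 2-D array is what ends up being rejected when it reaches SVC through the grid search.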



Leave y_train as a one-dimensional array of class labels, and SVC will handle the multi-class case itself.
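
As a rough sketch of that, the following fits the same grid search on 1-D labels. The data is synthetic (an assumption standing in for X_train_pca and y_train), and the parameter grid is hypothetical because the original param_grid is not shown in the question.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for X_train_pca / y_train (assumption: 682 samples, 3 classes).
X, y = make_classification(n_samples=682, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
y = y + 1  # relabel 0, 1, 2 as 1, 2, 3 to mirror the question

# Hypothetical grid; the question does not show the real one.
param_grid = {"C": [0.1, 1, 10]}

clf_linear = GridSearchCV(SVC(kernel="linear", class_weight="balanced"),
                          param_grid, cv=5)
clf_linear.fit(X, y)  # y is 1-D with shape (682,), so the fit is accepted

print(y.shape)
print(clf_linear.best_params_)
print(clf_linear.predict(X[:5]))  # predictions are the original labels 1, 2 or 3
```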







answered Nov 26 '18 at 14:11
Vivek Kumar

• Thank you @Vivek. I tried this because my ROC curve looks strange, and when I discussed it with my professor he suggested binarizing the labels. The ROC curve is not as smooth as he wants it to be.
  – Arthur Bernardo Nov 26 '18 at 14:54

• @ArthurBernardo A ROC curve is defined for binary tasks, so you need to convert your multi-class problem into several binary ones, and label_binarize is the right tool for that. Internally, SVC also decomposes a multi-class problem into binary ones (it uses a one-vs-one scheme). But for that you need to wrap the classifier in OneVsRestClassifier. Simply passing one-hot encoded labels switches on multi-label classification, which SVC does not support. Have you seen this: scikit-learn.org/stable/auto_examples/model_selection/… ?
  – Vivek Kumar Nov 26 '18 at 14:57
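
A minimal sketch of the approach described in the comment above, under stated assumptions: the data is synthetic, the train/test split and variable names are illustrative, and only the true test labels are binarized so that each column becomes its own binary ROC task.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize
from sklearn.svm import SVC

classes = [1, 2, 3]

# Synthetic 3-class data (assumption; not the asker's face dataset).
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
y = y + 1  # relabel 0, 1, 2 as 1, 2, 3
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# One binary SVC per class; the training labels stay 1-D.
ovr = OneVsRestClassifier(SVC(kernel="linear", class_weight="balanced"))
ovr.fit(X_train, y_train)

# Continuous scores, one column per class, in the order of ovr.classes_.
scores = ovr.decision_function(X_test)

# Binarize only the true test labels for evaluation.
y_test_bin = label_binarize(y_test, classes=classes)

roc_auc = {}
for i, label in enumerate(classes):
    fpr, tpr, _ = roc_curve(y_test_bin[:, i], scores[:, i])
    roc_auc[label] = auc(fpr, tpr)

print(roc_auc)
```

One possible reason for a jagged curve is the input passed to roc_curve: hard class predictions give only a single interior point, whereas continuous scores such as decision_function give one threshold per distinct score. A small test set also makes the curve look stepped.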













• Yes, I saw this topic. I'm plotting the ROC curve like this. The result is this graphic link. Do you see anything wrong with these curves? I'm classifying the three faces with the most samples in this dataset link. My question is whether these curves should be smoother. Thanks again for your help @VivekKumar.
  – Arthur Bernardo Nov 26 '18 at 16:55