Calculate evaluation metrics using sklearn's cross_val_predict



























On the sklearn.model_selection.cross_val_predict page it is stated:




Generate cross-validated estimates for each input data point. It is
not appropriate to pass these predictions into an evaluation metric.




Can someone explain what this means? If this gives an estimate of Y (a predicted y) for every Y (true y), why can't I calculate metrics such as RMSE or the coefficient of determination from these results?
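For example, I would have expected something like the following to be fine (a minimal sketch; the dataset and estimator are just placeholders for my actual setup):

    import numpy as np
    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import Lasso
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import mean_squared_error, r2_score

    # Placeholder data and model, just to illustrate the idea
    X, y = load_diabetes(return_X_y=True)
    model = Lasso()

    # cross_val_predict returns one out-of-fold prediction per sample
    y_pred = cross_val_predict(model, X, y, cv=5)

    # Why would metrics computed on these out-of-fold predictions be inappropriate?
    rmse = np.sqrt(mean_squared_error(y, y_pred))
    r2 = r2_score(y, y_pred)
    print(rmse, r2)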

















      python scikit-learn cross-validation








asked Nov 28 '18 at 16:21 by user88484
























1 Answer






































It seems to come down to how the samples are grouped when the metric is computed. From the user guide linked in the cross_val_predict docs:

Warning: Note on inappropriate usage of cross_val_predict

The result of cross_val_predict may be different from those obtained using cross_val_score as the elements are grouped in different ways. The function cross_val_score takes an average over cross-validation folds, whereas cross_val_predict simply returns the labels (or probabilities) from several distinct models undistinguished. Thus, cross_val_predict is not an appropriate measure of generalisation error.

In other words, cross_val_score computes the metric separately on each fold and averages those fold scores, while cross_val_predict just concatenates the predictions made by several distinct per-fold models; a metric computed on that pooled output is a different quantity and is not a proper estimate of generalisation error. For example, using the sample code from the sklearn page:

    import numpy as np
    from sklearn import datasets, linear_model
    from sklearn.model_selection import cross_val_predict, cross_val_score
    from sklearn.metrics import mean_squared_error, make_scorer

    diabetes = datasets.load_diabetes()
    X = diabetes.data[:200]
    y = diabetes.target[:200]
    lasso = linear_model.Lasso()
    y_pred = cross_val_predict(lasso, X, y, cv=3)

    print("Cross Val Prediction score:{}".format(mean_squared_error(y, y_pred)))
    print("Cross Val Score:{}".format(np.mean(cross_val_score(lasso, X, y, cv=3, scoring=make_scorer(mean_squared_error)))))

    Cross Val Prediction score:3993.771257795029
    Cross Val Score:3997.1789145156217
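To see concretely where the difference comes from, here is a rough sketch (reusing X, y, lasso, y_pred and mean_squared_error from the snippet above; the manual KFold loop assumes the default non-shuffled 3-fold split, which is what an integer cv gives for a regressor):

    import numpy as np
    from sklearn.base import clone
    from sklearn.model_selection import KFold

    # What cross_val_score does: compute one MSE per fold, then average the fold scores
    fold_mses = []
    for train_idx, test_idx in KFold(n_splits=3).split(X):
        fold_model = clone(lasso).fit(X[train_idx], y[train_idx])
        fold_mses.append(mean_squared_error(y[test_idx], fold_model.predict(X[test_idx])))

    print("Mean of per-fold MSEs:", np.mean(fold_mses))                 # ~ cross_val_score
    print("MSE of pooled predictions:", mean_squared_error(y, y_pred))  # ~ cross_val_predict

The two numbers only coincide when the metric decomposes as a simple mean over equally sized folds; for RMSE, R² and most other metrics the pooled figure is a different quantity, which is why the docs warn against treating it as a cross-validation score.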





answered Nov 28 '18 at 16:30 by G. Anderson (edited Nov 28 '18 at 20:46)


























• I read it, but I'm not sure I fully understand; that is why I posted this question in the first place, hoping someone could explain it in different words. Is it because each fold is based on a slightly different model (e.g., different PCs in PCA), and therefore calculating, for example, RMSE is not correct because it will be based on predictions from slightly different models?
  – user88484, Nov 28 '18 at 19:57











• See my edits above. Without digging into the sklearn source code (which you can do on GitHub), you can view the results as shown; the differences are slight, but noticeable. A sketch showing how to inspect the per-fold models follows these comments.
  – G. Anderson, Nov 28 '18 at 20:48













• Thank you, this is what I thought, and your answer helped to confirm it.
  – user88484, Nov 30 '18 at 11:38
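Regarding the "slightly different model per fold" point raised in the comments, one quick way to confirm it is to ask cross_validate for the fitted estimators (requires scikit-learn 0.20 or newer; a rough sketch reusing X, y and lasso from the answer above):

    from sklearn.model_selection import cross_validate

    # return_estimator=True hands back the estimator fitted on each training fold
    res = cross_validate(lasso, X, y, cv=3, return_estimator=True)

    # Each fold yields its own model with its own coefficients, so the pooled
    # predictions from cross_val_predict mix the outputs of three different models.
    for i, est in enumerate(res["estimator"]):
        print("fold", i, "coef_:", est.coef_)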










