Using a TensorFlow dataset with stratified sampling












Given a TensorFlow dataset:



Train_dataset = tf.data.Dataset.from_tensor_slices((Train_Image_Filenames, Train_Image_Labels))
Train_dataset = Train_dataset.map(Parse_JPEG_Augmented)
...


I would like to stratify my batches to deal with class imbalance. I found tf.contrib.training.stratified_sample and thought I could use it in the following way:



Train_dataset_iter = Train_dataset.make_one_shot_iterator()
Train_dataset_Image_Batch, Train_dataset_Label_Batch = Train_dataset_iter.get_next()
Train_Stratified_Images, Train_Stratified_Labels = tf.contrib.training.stratified_sample(
    Train_dataset_Image_Batch, Train_dataset_Label_Batch, [1/Classes]*Classes, Batch_Size)


But it gives the following error. I'm also not sure whether this approach would let me keep the performance benefits of a TensorFlow dataset, as I might then have to pass Train_Stratified_Images and Train_Stratified_Labels via feed_dict.



File "/xxx/xxx/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/training/python/training/sampling_ops.py", line 192, in stratified_sample
with ops.name_scope(name, 'stratified_sample', list(tensors) + [labels]):
File "/xxx/xxx/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 459, in __iter__
"Tensor objects are only iterable when eager execution is "
TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.


What would be the "best practice" way of using a dataset with stratified batches?










      tensorflow






      asked Nov 27 '18 at 17:34









      Agade

























          1 Answer














          I am looking into a similar issue and I found this. I haven't tried it in my pipeline though. Maybe it will work for your purpose?






          answered Dec 6 '18 at 23:47









          Neergaard
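
The comment below names the technique in question as rejection_resample. As a rough illustration of the idea behind rejection resampling, here is a plain-Python sketch. It is not the TensorFlow implementation: the function name rejection_resample_sketch, the toy two-class stream and the 90/10 class split are all assumptions made for the example. Each example is kept with a probability proportional to target/initial for its class, so the accepted examples follow the target distribution.

```python
import random
from collections import Counter


def rejection_resample_sketch(examples, class_of, target_dist, initial_dist, rng):
    """Yield a subset of `examples` whose classes follow `target_dist`.

    An example of class c is accepted with probability
    (target[c] / initial[c]) divided by the largest such ratio.
    """
    ratios = {c: target_dist[c] / initial_dist[c] for c in target_dist}
    max_ratio = max(ratios.values())
    accept_prob = {c: r / max_ratio for c, r in ratios.items()}
    for example in examples:
        if rng.random() < accept_prob[class_of(example)]:
            yield example


rng = random.Random(0)
initial = {0: 0.9, 1: 0.1}   # assumed class frequencies in the raw data
target = {0: 0.5, 1: 0.5}    # desired class frequencies after resampling

# Toy imbalanced stream of (index, label) pairs.
stream = ((i, 0 if rng.random() < 0.9 else 1) for i in range(20000))
resampled = list(
    rejection_resample_sketch(stream, lambda ex: ex[1], target, initial, rng))
counts = Counter(label for _, label in resampled)
```

In this sketch the minority class is always kept and most majority-class examples are discarded, so the resampled stream is much shorter than the input.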














          • I found that rejection_resample is indeed the recommended approach in similar questions. For my purposes I instead used the newer sample_from_datasets, applied to a list of shuffled datasets, one per class.

            – Agade
            14 hours ago
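
The approach described in the comment above could look roughly like the following. This is a minimal sketch, not the commenter's actual code: it assumes TensorFlow 2.x with eager execution, and the toy data, NUM_CLASSES, BATCH_SIZE, class_stream and label_counts are hypothetical names introduced for illustration. Each class gets its own shuffled, repeating dataset, and sample_from_datasets draws from them with equal weights, so batches are balanced in expectation rather than exactly stratified.

```python
import collections

import tensorflow as tf

NUM_CLASSES = 3  # assumed number of classes
BATCH_SIZE = 6   # assumed batch size

# Toy imbalanced data standing in for (filename, label) pairs.
features = tf.range(100)
labels = tf.constant([0] * 80 + [1] * 15 + [2] * 5, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((features, labels))


def class_stream(class_id):
    """Shuffled, endlessly repeating stream of one class's examples."""
    return (dataset
            .filter(lambda x, y: tf.equal(y, class_id))
            .shuffle(buffer_size=100, seed=0)
            .repeat())


# Newer releases expose this as a static method on Dataset; older ones
# provide it as tf.data.experimental.sample_from_datasets.
sample_from_datasets = getattr(tf.data.Dataset, "sample_from_datasets", None)
if sample_from_datasets is None:
    sample_from_datasets = tf.data.experimental.sample_from_datasets

# Draw each element from one of the per-class streams with equal probability.
balanced = sample_from_datasets(
    [class_stream(c) for c in range(NUM_CLASSES)],
    weights=[1.0 / NUM_CLASSES] * NUM_CLASSES,
    seed=0)
balanced = balanced.batch(BATCH_SIZE).prefetch(1)


def label_counts(ds, num_batches):
    """Count how often each label occurs in the first `num_batches` batches."""
    counts = collections.Counter()
    for _, batch_labels in ds.take(num_batches):
        counts.update(batch_labels.numpy().tolist())
    return counts
```

Calling label_counts(balanced, 200) gives a quick check that the classes come out roughly equally often. Everything stays inside the tf.data pipeline, so no feed_dict is needed. In a pipeline like the one in the question, an expensive map such as Parse_JPEG_Augmented could be applied after the sampling step, so that only sampled examples are decoded.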


















