Sort Spark Dataframe with two columns in different order












Let's say I have a table like this:



A,B
2,6
1,2
1,3
1,5
2,3


I want to sort it in ascending order by column A and, within each value of A, in descending order by column B, like this:



A,B
1,5
1,3
1,2
2,6
2,3


I have tried to use orderBy("A", desc("B")) but it gives an error.



How should I write the query using dataframe in Spark 2.0?










      scala sorting apache-spark dataframe apache-spark-sql






      edited Nov 27 '18 at 5:58









      Shaido

      asked Nov 27 '18 at 3:33









      kello

          2 Answers
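            Use the Column method desc, as shown below:

            // In spark-shell, spark.implicits._ is already imported; elsewhere, add:
            // import spark.implicits._
            val df = Seq(
              (2, 6), (1, 2), (1, 3), (1, 5), (2, 3)
            ).toDF("A", "B")

            // A sorts ascending by default; B is explicitly descending.
            df.orderBy($"A", $"B".desc).show
            // +---+---+
            // |  A|  B|
            // +---+---+
            // |  1|  5|
            // |  1|  3|
            // |  1|  2|
            // |  2|  6|
            // |  2|  3|
            // +---+---+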






            answered Nov 27 '18 at 3:46









            Leo C

            • I like to be as explicit as possible, so I would use asc on the first column ($"A".asc), even though ascending is the default sort order.

              – Luis Miguel Mejía Suárez
              Nov 27 '18 at 15:22
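
The explicit form suggested in this comment can be sketched as follows. This is a minimal example under stated assumptions: the local SparkSession (the `local[*]` master and the app name) is illustrative and not from the original post, and the code is written as top-level statements, as it would be pasted into spark-shell.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("explicit-asc").getOrCreate()
import spark.implicits._

val df = Seq((2, 6), (1, 2), (1, 3), (1, 5), (2, 3)).toDF("A", "B")

// Default direction on A (ascending) versus the explicit form.
val implicitAsc = df.orderBy($"A", $"B".desc)
val explicitAsc = df.orderBy($"A".asc, $"B".desc)

// Collect to the driver to compare; row order: (1,5), (1,3), (1,2), (2,6), (2,3)
val rows = explicitAsc.collect().map(r => (r.getInt(0), r.getInt(1))).toList
println(rows)
```

Both calls should give the same ordering; the explicit `.asc` only documents the intent.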



















                desc is the correct method to use; however, note that it is a method of the Column class. It should therefore be applied as follows:

                df.orderBy($"A", $"B".desc)

                $"B".desc returns a Column, so "A" must also be changed to $"A" (or col("A") if spark.implicits._ is not imported), because orderBy accepts either column names or Column expressions, not a mix of the two.







                answered Nov 27 '18 at 3:47









                Shaido
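
As a rough illustration of the col("A") variant mentioned in this answer, the sketch below avoids `spark.implicits._` entirely. The local session settings and the value names (`byMethod`, `byFunction`, `sortedRows`) are assumptions made for the example; `col` and `desc` come from `org.apache.spark.sql.functions`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, desc}

// Assumption: a local session for illustration; in spark-shell, `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("orderby-columns").getOrCreate()

// Built without spark.implicits._, so neither Seq.toDF nor the $"..." syntax is used.
val df = spark.createDataFrame(Seq((2, 6), (1, 2), (1, 3), (1, 5), (2, 3))).toDF("A", "B")

// Mixing a String with a Column matches no orderBy overload and does not compile:
// df.orderBy("A", desc("B"))

// Every argument is a Column, so the orderBy(sortExprs: Column*) overload applies.
val byMethod = df.orderBy(col("A"), col("B").desc)
val byFunction = df.orderBy(col("A"), desc("B"))

// Row order: (1,5), (1,3), (1,2), (2,6), (2,3)
val sortedRows = byFunction.collect().map(r => (r.getInt(0), r.getInt(1))).toList
println(sortedRows)
```

Either spelling of the descending key works here; the point is only that all arguments to orderBy share one type.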
