Difference between spark_session and sqlContext on loading a local file












I tried to load a local file as a DataFrame using both spark_session and sqlContext.



df = spark_session.read...load(localpath) 


It couldn't read the local file; df was empty.
But after creating a SQLContext from spark_context, the same file loaded.



sqlContext = SQLContext(spark_context)
df = sqlContext.read...load(localpath)


That worked fine, but I can't understand why. What is the cause?



Environment: Windows 10, Spark 2.2.1



EDIT



I've finally resolved this problem. The root cause was a version mismatch between the PySpark installed with pip and the PySpark installed on the local file system. PySpark failed to start because py4j was failing.
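A version mismatch like this can be checked before starting Spark. The following is only a minimal sketch, not the method used in the post: it assumes the local Spark installation is a binary distribution whose root directory contains a RELEASE file beginning with a line such as "Spark 2.2.1 built for ...", and the helper names (spark_home_version, pip_pyspark_version, versions_match) are hypothetical.

```python
import re
from importlib import metadata
from pathlib import Path


def spark_home_version(spark_home):
    """Return the version named in <spark_home>/RELEASE, or None if unavailable.

    Assumption: the distribution ships a RELEASE file whose text
    contains "Spark <version>".
    """
    release = Path(spark_home) / "RELEASE"
    if not release.is_file():
        return None
    match = re.search(r"Spark\s+(\d+(?:\.\d+)+)", release.read_text())
    return match.group(1) if match else None


def pip_pyspark_version():
    """Return the version of the pip-installed pyspark package, or None."""
    try:
        return metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        return None


def versions_match(pip_version, home_version):
    """True only when both versions are known and identical."""
    return pip_version is not None and pip_version == home_version


# Possible usage (SPARK_HOME must point at the local installation):
#   import os
#   pip_v = pip_pyspark_version()
#   home_v = spark_home_version(os.environ.get("SPARK_HOME", ""))
#   print(pip_v, home_v, versions_match(pip_v, home_v))
```

If the two versions differ, aligning them (for example by installing the pip package at the same version as the local installation) is one plausible way to avoid the kind of py4j failure described above.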










  • Almost the same issue stackoverflow.com/q/48026195/2565527

    – hiropon
    Nov 28 '18 at 10:55
















apache-spark pyspark






edited Jan 10 at 2:18







hiropon

















asked Nov 28 '18 at 10:19









hiropon

























1 Answer














Here is some sample code that might help. We used it to create a SparkSession object and read a local file with it:



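import org.apache.spark.sql.SparkSession

object SetTopBox_KPI1_1 {

  def main(args: Array[String]): Unit = {
    // Expect an input file and an output file on the command line.
    if (args.length < 2) {
      System.err.println("SetTopBox Data Analysis <Input-File> OR <Output-File> is missing")
      System.exit(1)
    }

    // SparkSession is the single entry point; no separate SQLContext is needed.
    val spark = SparkSession.builder().appName("KPI1_1").getOrCreate()

    // Read the input file as lines of text and convert it to an RDD.
    val record = spark.read.textFile(args(0)).rdd

    .....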



On the whole, creating a SparkSession object is the preferred way to use Spark 2.2.
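Since the question uses PySpark, a rough Python counterpart of the snippet above might look like the sketch below. It is an illustration under assumptions, not the answerer's code: the helpers to_file_uri and read_local_text are hypothetical names, and passing an explicit file:// URI is just one way to make sure the path is resolved against the local file system rather than a configured default file system.

```python
from pathlib import Path


def to_file_uri(local_path):
    """Turn a local path into an absolute file:// URI."""
    return Path(local_path).resolve().as_uri()


def read_local_text(spark, local_path):
    """Read a local text file through an existing SparkSession.

    `spark` is expected to be a pyspark.sql.SparkSession; the result is a
    DataFrame with one string column holding the lines of the file.
    """
    return spark.read.text(to_file_uri(local_path))


# Possible usage (requires a working PySpark installation):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("KPI1_1").getOrCreate()
#   df = read_local_text(spark, "input.txt")  # "input.txt" is a placeholder
```

The session is passed in as an argument so that the helpers themselves do not depend on how or where the SparkSession was created.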






answered Nov 28 '18 at 12:35

BDA































