Spark - python textFile creates weird rdd [duplicate]

This question already has an answer here:

  • How to save a spark dataframe as a text file without Rows in pyspark? (1 answer)
  • How to restore RDD of (key,value) pairs after it has been stored/read from a text file (2 answers)

I saved an RDD with

rdd.saveAsTextFile("file_dir")

When I read it back with

rdd = sc.textFile("path/to/file_dir")

an RDD is created. The only problem is that the resulting RDD isn't usable: every record comes back as a single string rather than the original tuple.

tail file
"('a', (('a1', '1'), ('a2', '2')))"

rdd.collect()[1]
"('a', (('a1', '1'), ('a2', '2')))"

rdd.collect()[1][0]
"("

How can I change the output format to something I can use?
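[Editor's note: a minimal sketch of one way to turn those string lines back into tuples, by parsing each line with Python's ast.literal_eval inside a map. This is illustrative code, not from the post; only sc and the directory path are taken from the question, and the variable name parsed is hypothetical.]

import ast

# Each line written by saveAsTextFile is just str() of the original record,
# e.g. "('a', (('a1', '1'), ('a2', '2')))".
# ast.literal_eval safely evaluates that literal back into a Python tuple.
parsed = sc.textFile("path/to/file_dir").map(ast.literal_eval)

parsed.collect()[1]       # ('a', (('a1', '1'), ('a2', '2')))
parsed.collect()[1][0]    # 'a'  -- the key, not the character "("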
python apache-spark pyspark

edited Nov 25 '18 at 10:31 by TrebuchetMS
asked Nov 25 '18 at 9:38 by J Doe

marked as duplicate by user6910411, Nov 25 '18 at 12:57
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

  • In which form would you like the rdd to be? What is a useful format for you?
    – Yaron, Nov 25 '18 at 10:32

  • I'd like to be able to use different values to perform transformations, i.e. rdd.collect()[1][0] = ('a1', '1').
    – J Doe, Nov 25 '18 at 11:14

  • It's called save as text file, not for nothing.
    – thebluephantom, Nov 25 '18 at 11:37

  • So I have to manually convert back to an RDD? Or is there a method I can use within pyspark?
    – J Doe, Nov 25 '18 at 12:04
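[Editor's note on the follow-up question: if the saved data only ever needs to be read back by Spark, the RDD can be persisted with saveAsPickleFile and reloaded with sc.pickleFile, which keeps the Python tuples intact and avoids parsing altogether. A sketch under that assumption; the directory name below is illustrative, only rdd and sc come from the question, and the files written are not human-readable.]

rdd.saveAsPickleFile("path/to/pickle_dir")    # writes pickled Python objects instead of text

restored = sc.pickleFile("path/to/pickle_dir")
restored.collect()[1]                         # ('a', (('a1', '1'), ('a2', '2'))) -- original tuple preserved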