Spark - python textFile creates weird rdd [duplicate]












This question already has an answer here:




  • How to save a spark dataframe as a text file without Rows in pyspark?

    1 answer



  • How to restore RDD of (key,value) pairs after it has been stored/read from a text file

    2 answers




I saved an RDD with
rdd.saveAsTextFile("file_dir")



When I read it back with
rdd = sc.textFile("path/to/file_dir")
an RDD is created.

The only problem is that the RDD created isn't usable: every element comes back as a plain string instead of the original tuple.



tail file
"('a', (('a1', '1'), ('a2', '2')))"



rdd.collect()[1]
"('a', (('a1', '1'), ('a2', '2')))"



rdd.collect()[1][0]
"("



How can I change the output format to something I can use?










marked as duplicate by user6910411 apache-spark
Nov 25 '18 at 12:57


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.



















  • In which form would you like the rdd to be? What is a useful format for you?

    – Yaron
    Nov 25 '18 at 10:32











  • I'd like to be able to use the individual values to perform transformations, i.e. rdd.collect()[1][0] = ('a1', '1')

    – J Doe
    Nov 25 '18 at 11:14













  • It's called save as text file for a reason.

    – thebluephantom
    Nov 25 '18 at 11:37











  • So I have to manually convert it back to an RDD? Or is there a method I can use within pyspark?

    – J Doe
    Nov 25 '18 at 12:04
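
On the question in the last comment: saveAsTextFile writes the string form of each element, so textFile returns one string per line. A minimal sketch of one way to get the tuples back, assuming every saved line is a valid Python literal like the sample above, is to map ast.literal_eval over the lines. The helper name parse_line and the sample list are illustrative only, and the Spark call is shown as a comment because it needs a running SparkContext (sc).

```python
import ast

def parse_line(line):
    """Turn one saved line back into the Python object it was printed from."""
    # literal_eval only accepts literals (tuples, strings, numbers, ...),
    # so unlike eval() it will not execute arbitrary code found in the file.
    return ast.literal_eval(line)

# Stand-in for what sc.textFile("path/to/file_dir").collect() returns.
lines = ["('a', (('a1', '1'), ('a2', '2')))"]
records = [parse_line(line) for line in lines]

print(records[0][0])     # a
print(records[0][1][0])  # ('a1', '1')

# With Spark (assumes an existing SparkContext named sc):
#   rdd = sc.textFile("path/to/file_dir").map(parse_line)
```

If the data only needs to be read back by PySpark, rdd.saveAsPickleFile("file_dir") together with sc.pickleFile("file_dir") should avoid the string round trip, since the elements are stored as pickled Python objects rather than text.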
















python apache-spark pyspark






edited Nov 25 '18 at 10:31 by TrebuchetMS
asked Nov 25 '18 at 9:38 by J Doe