Problem writing vectors of strings into a binary file



























I'm serializing data into a binary file using ofstream/ifstream. The data is split into two vectors of strings, one for data names and the other for data values: std::vector<std::string> dataNames and std::vector<std::string> dataValues.



I'm writing the data using this function:



void Data::SaveData(std::string path)
{
    std::ofstream outfile(path, std::ofstream::binary);
    outfile.write(reinterpret_cast<const char *>(&dataNames[0]), dataNames.size() * sizeof(std::string));
    outfile.write(reinterpret_cast<const char *>(&dataValues[0]), dataValues.size() * sizeof(std::string));
    outfile.close();
}


And reading it using:



bool Data::LoadData(std::string path)
{
    bool ret = false;

    std::ifstream file(path, std::ifstream::in | std::ifstream::binary);
    if (file.is_open())
    {
        // get length of file:
        file.seekg(0, file.end);
        int length = file.tellg();
        file.seekg(0, file.beg);

        char * buffer = new char[length];
        file.read(buffer, length);

        if (file)
        {
            char* cursor = buffer;
            uint32_t bytes = length / 2;
            dataNames.resize(bytes / sizeof(std::string));
            memcpy(dataNames.data(), cursor, bytes);

            cursor += bytes;
            dataValues.resize(bytes / sizeof(std::string));
            memcpy(dataValues.data(), cursor, bytes);

            delete buffer;
            buffer = nullptr;
        }

        file.close();
        ret = true;
    }

    return ret;
}


It works: I can write and read the data correctly, except when any of the strings in dataNames or dataValues has 16 or more characters.



Example of data using strings with fewer than 16 chars:



dataNames[0] = "Type"
dataNames[1] = "GameObjectCount"

dataValues[0] = "Scene"
dataValues[1] = "5"


(image: data 15 chars)



Example of data using a string with 16 chars:



dataNames[0] = "Type"
dataNames[1] = "GameObjectsCount" // Added an "s"; now 16 chars

dataValues[0] = "Scene"
dataValues[1] = "5"


(image: data 16 chars)



Here you can see that the word "GameObjectsCount" doesn't appear and strange characters are shown instead.
When reading this file back, the string is not valid: sometimes it's empty, sometimes it says "Error reading characters of string", and sometimes it's a random letter.



Any idea?










  • sizeof(std::string) needs to be replaced by sizeof(char).

    – unxnut
    Nov 26 '18 at 1:27











  • A vector is not a POD type. A std::string is not a POD type. Thus none of the code that looks like this: outfile.write(reinterpret_cast<const char *>(&dataNames[0]), dataNames.size() * sizeof(std::string)); will work. To prove this, make one of your strings 1,000 characters. How could dataNames.size() * sizeof(std::string) ever be anything close to 1,000?

    – PaulMcKenzie
    Nov 26 '18 at 1:30













  • It looks like you are taking the address of a std::string and casting it to a const char*. That's not going to work. A std::string is a bit like a std::vector: you need to access its internal array.

    – Galik
    Nov 26 '18 at 1:35











  • Also, the data you do see is probably an artifact from Short String Optimization (SSO), where the std::string stores its characters in a regular array. Once the string becomes longer than 16 bytes, memory is allocated from the heap to store the string, thus you no longer have the array representing the string, but a pointer to the heap.

    – PaulMcKenzie
    Nov 26 '18 at 1:36













  • Does this binary output need to be machine portable?

    – Galik
    Nov 26 '18 at 1:44
















c++ serialization ifstream ofstream






asked Nov 26 '18 at 1:24









Tino Martin

1 Answer
Reinterpreting a vector in the manner you have above isn't correct.



 outfile.write(reinterpret_cast<const char *>(&dataNames[0]), dataNames.size() * sizeof(std::string));


You don't know how the vector stores data (on the heap, etc.), and you can't assume that you can blindly cast the pointer and write whatever bytes you find to a file as a way of serializing the data. Furthermore, a std::string isn't necessarily an in-place character array the size of its contents. It more likely holds a pointer to a buffer on the heap, so writing the object's raw bytes saves that pointer rather than the characters.
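
As an illustration (a minimal sketch, not part of the original question's code), the helper below checks whether a string's characters live inside the std::string object itself. The name StoredInline is hypothetical, introduced only for this example, and the result for short strings depends on the standard library implementation.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: returns true when the characters of `s` are stored
// inside the std::string object itself (small-string buffer), and false
// when they live in a separately allocated buffer.
bool StoredInline(const std::string& s)
{
    const char* objectBegin = reinterpret_cast<const char*>(&s);
    const char* objectEnd = objectBegin + sizeof(std::string);

    // std::less provides a well-defined total order even for pointers
    // that do not point into the same object.
    std::less<const char*> before;
    return !before(s.data(), objectBegin) && before(s.data(), objectEnd);
}
```

sizeof(std::string) is a compile-time constant, so it stays the same however long the string is. A 1,000-character string cannot fit in that many bytes, so StoredInline returns false for it, and writing sizeof(std::string) raw bytes stores only the object's bookkeeping (a pointer, size, and capacity), not the text. Short strings may report true on implementations that use a small-string buffer, which would explain why the raw copy appeared to work below 16 characters.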



So, if you want to serialize the data in a vector or another standard library type, you'll need to write a function that does it manually, iterating over the items and writing each one in a properly delimited way.






        edited Nov 26 '18 at 1:41

























        answered Nov 26 '18 at 1:29









        Paul
