Writing “latin1” encoded byte array with escape characters

I'm working on a database import/export process in VB.NET which writes data from a MySQL (5.5) database to a plain text file. The application reads the data to a DataTable, then goes through the rows/columns to actually write the data to the OutputFile (System.IO.StreamWriter object). The encoding on the tables in this database is Latin1. There is a MediumBlob field in one of the tables I've been using for testing which contains image files stored as a byte array.

In my attempts to validate the output from my application, I've exported the data directly from the database using the MySQL Workbench, then compared that with the results I get when I write the same data from my application. In the direct export from MySQL Workbench, I see some of these bytes are exported with the backslash. When I read the data through my application, however, this escape character does not appear. Viewed through Notepad++, it clearly shows some distinct differences between the two output results (see screenshot).

Notepad++ comparison of output results

Obviously, while apparently very similar, the two are not completely identical. My application is not including the backslashes for escaped characters, and some characters such as NULL are coming out differently altogether. My code for writing this field to the file is:

OutputFile.Write("'" & System.Text.Encoding.GetEncoding(28591).GetString(CType(COPYRow(ColumnIndex), Byte())) & "'")

There doesn't appear to be an overload for the GetString method that allows me to specify an escape character, so I'm wondering if there's another way that, using this method, I can ensure the characters are correctly encoded, including escape characters.

I'm "assuming" that this method should also work in general when I start working with my PostgreSQL database, but with possibly a different encoding. I'm trying to build things as "generic" as possible, but I'll have to worry about specifying encodings at run-time instead of hard-coding them later.

EDIT

I just ran across another SO question, which might point me in the right direction: Convert a Unicode string to an escaped ASCII string. Obviously, it might take a bit more work to get it right, but this looks like the closest thing to what I'm trying to accomplish.

edited Nov 27 '18 at 17:01

asked Nov 27 '18 at 16:31

G_Hosa_Phat

287418

It's not clear, to me, what you're trying to do. Are you trying to conver an Image bytes to a US-ASCII representation? Or iso-8859-1 encoded strings to US-ASCII, preserving the non-convertible CodePoints to the escaped Unicode representation as ASCII text? What's the use of these?

– Jimi
Nov 27 '18 at 17:21

MySQL Workbench is showing you a "debugging" output (probably of its own invention). Why do you want your text file to be anything but a straightforward text file with a specific character encoding? (Or, why not a clean JSON file [which would use UTF-8]?).

– Tom Blodget
Nov 27 '18 at 17:22

For binary data, ideally, it wouldn't come out of your data access layer as a .NET String. A String is a text data type. If it must be a String can you convert the data on the server side to a Base64 string?

– Tom Blodget
Nov 27 '18 at 17:26

Perhaps I'm over-thinking things (or thinking about it incorrectly), but the intent is to be able to export the data in a format that is usable for importing back into the database at potentially some point in the future. To that end, I am comparing the output produced by my application to the output produced by MySQL Workbench and trying to reproduce the latter as closely as possible. Additionally, I may need to be able to transfer this data from one RDBMS to another (MySQL -> PostgreSQL or vice-versa), so I want to ensure the data is as "clean" as possible.

– G_Hosa_Phat
Nov 27 '18 at 17:31

1

You are over-thinking it. For Text fields, read the Collation value and reproduce the Encoding when saving the Text to a file. Or, convert everything to UTF-8, if possible. Byte Arrays representing Bitmaps must remain as they are. You might, if really required, convert the array to a Base64 representation.

– Jimi
Nov 27 '18 at 18:14

|
show 1 more comment

Notepad++ comparison of output results

OutputFile.Write("'" & System.Text.Encoding.GetEncoding(28591).GetString(CType(COPYRow(ColumnIndex), Byte())) & "'")

EDIT

edited Nov 27 '18 at 17:01

asked Nov 27 '18 at 16:31

G_Hosa_Phat

287418

It's not clear, to me, what you're trying to do. Are you trying to conver an Image bytes to a US-ASCII representation? Or iso-8859-1 encoded strings to US-ASCII, preserving the non-convertible CodePoints to the escaped Unicode representation as ASCII text? What's the use of these?

– Jimi
Nov 27 '18 at 17:21

MySQL Workbench is showing you a "debugging" output (probably of its own invention). Why do you want your text file to be anything but a straightforward text file with a specific character encoding? (Or, why not a clean JSON file [which would use UTF-8]?).

– Tom Blodget
Nov 27 '18 at 17:22

For binary data, ideally, it wouldn't come out of your data access layer as a .NET String. A String is a text data type. If it must be a String can you convert the data on the server side to a Base64 string?

– Tom Blodget
Nov 27 '18 at 17:26

Perhaps I'm over-thinking things (or thinking about it incorrectly), but the intent is to be able to export the data in a format that is usable for importing back into the database at potentially some point in the future. To that end, I am comparing the output produced by my application to the output produced by MySQL Workbench and trying to reproduce the latter as closely as possible. Additionally, I may need to be able to transfer this data from one RDBMS to another (MySQL -> PostgreSQL or vice-versa), so I want to ensure the data is as "clean" as possible.

– G_Hosa_Phat
Nov 27 '18 at 17:31

1

You are over-thinking it. For Text fields, read the Collation value and reproduce the Encoding when saving the Text to a file. Or, convert everything to UTF-8, if possible. Byte Arrays representing Bitmaps must remain as they are. You might, if really required, convert the array to a Base64 representation.

– Jimi
Nov 27 '18 at 18:14

|
show 1 more comment

Notepad++ comparison of output results

OutputFile.Write("'" & System.Text.Encoding.GetEncoding(28591).GetString(CType(COPYRow(ColumnIndex), Byte())) & "'")

EDIT

edited Nov 27 '18 at 17:01

asked Nov 27 '18 at 16:31

G_Hosa_Phat

287418

Notepad++ comparison of output results

OutputFile.Write("'" & System.Text.Encoding.GetEncoding(28591).GetString(CType(COPYRow(ColumnIndex), Byte())) & "'")

EDIT

mysql vb.net character-encoding

edited Nov 27 '18 at 17:01

asked Nov 27 '18 at 16:31

G_Hosa_Phat

287418

edited Nov 27 '18 at 17:01

asked Nov 27 '18 at 16:31

G_Hosa_Phat

287418

edited Nov 27 '18 at 17:01

asked Nov 27 '18 at 16:31

G_Hosa_Phat

287418

asked Nov 27 '18 at 16:31

G_Hosa_Phat

287418

asked Nov 27 '18 at 16:31

G_Hosa_Phat

287418

It's not clear, to me, what you're trying to do. Are you trying to conver an Image bytes to a US-ASCII representation? Or iso-8859-1 encoded strings to US-ASCII, preserving the non-convertible CodePoints to the escaped Unicode representation as ASCII text? What's the use of these?

– Jimi
Nov 27 '18 at 17:21

MySQL Workbench is showing you a "debugging" output (probably of its own invention). Why do you want your text file to be anything but a straightforward text file with a specific character encoding? (Or, why not a clean JSON file [which would use UTF-8]?).

– Tom Blodget
Nov 27 '18 at 17:22

For binary data, ideally, it wouldn't come out of your data access layer as a .NET String. A String is a text data type. If it must be a String can you convert the data on the server side to a Base64 string?

– Tom Blodget
Nov 27 '18 at 17:26

Perhaps I'm over-thinking things (or thinking about it incorrectly), but the intent is to be able to export the data in a format that is usable for importing back into the database at potentially some point in the future. To that end, I am comparing the output produced by my application to the output produced by MySQL Workbench and trying to reproduce the latter as closely as possible. Additionally, I may need to be able to transfer this data from one RDBMS to another (MySQL -> PostgreSQL or vice-versa), so I want to ensure the data is as "clean" as possible.

– G_Hosa_Phat
Nov 27 '18 at 17:31

1

You are over-thinking it. For Text fields, read the Collation value and reproduce the Encoding when saving the Text to a file. Or, convert everything to UTF-8, if possible. Byte Arrays representing Bitmaps must remain as they are. You might, if really required, convert the array to a Base64 representation.

– Jimi
Nov 27 '18 at 18:14

|
show 1 more comment

It's not clear, to me, what you're trying to do. Are you trying to conver an Image bytes to a US-ASCII representation? Or iso-8859-1 encoded strings to US-ASCII, preserving the non-convertible CodePoints to the escaped Unicode representation as ASCII text? What's the use of these?

– Jimi
Nov 27 '18 at 17:21

MySQL Workbench is showing you a "debugging" output (probably of its own invention). Why do you want your text file to be anything but a straightforward text file with a specific character encoding? (Or, why not a clean JSON file [which would use UTF-8]?).

– Tom Blodget
Nov 27 '18 at 17:22

For binary data, ideally, it wouldn't come out of your data access layer as a .NET String. A String is a text data type. If it must be a String can you convert the data on the server side to a Base64 string?

– Tom Blodget
Nov 27 '18 at 17:26

Perhaps I'm over-thinking things (or thinking about it incorrectly), but the intent is to be able to export the data in a format that is usable for importing back into the database at potentially some point in the future. To that end, I am comparing the output produced by my application to the output produced by MySQL Workbench and trying to reproduce the latter as closely as possible. Additionally, I may need to be able to transfer this data from one RDBMS to another (MySQL -> PostgreSQL or vice-versa), so I want to ensure the data is as "clean" as possible.

– G_Hosa_Phat
Nov 27 '18 at 17:31

1

You are over-thinking it. For Text fields, read the Collation value and reproduce the Encoding when saving the Text to a file. Or, convert everything to UTF-8, if possible. Byte Arrays representing Bitmaps must remain as they are. You might, if really required, convert the array to a Base64 representation.

– Jimi
Nov 27 '18 at 18:14

It's not clear, to me, what you're trying to do. Are you trying to conver an Image bytes to a US-ASCII representation? Or iso-8859-1 encoded strings to US-ASCII, preserving the non-convertible CodePoints to the escaped Unicode representation as ASCII text? What's the use of these?

– Jimi
Nov 27 '18 at 17:21

MySQL Workbench is showing you a "debugging" output (probably of its own invention). Why do you want your text file to be anything but a straightforward text file with a specific character encoding? (Or, why not a clean JSON file [which would use UTF-8]?).

– Tom Blodget
Nov 27 '18 at 17:22

For binary data, ideally, it wouldn't come out of your data access layer as a .NET String. A String is a text data type. If it must be a String can you convert the data on the server side to a Base64 string?

– Tom Blodget
Nov 27 '18 at 17:26

Perhaps I'm over-thinking things (or thinking about it incorrectly), but the intent is to be able to export the data in a format that is usable for importing back into the database at potentially some point in the future. To that end, I am comparing the output produced by my application to the output produced by MySQL Workbench and trying to reproduce the latter as closely as possible. Additionally, I may need to be able to transfer this data from one RDBMS to another (MySQL -> PostgreSQL or vice-versa), so I want to ensure the data is as "clean" as possible.

– G_Hosa_Phat
Nov 27 '18 at 17:31

You are over-thinking it. For Text fields, read the Collation value and reproduce the Encoding when saving the Text to a file. Or, convert everything to UTF-8, if possible. Byte Arrays representing Bitmaps must remain as they are. You might, if really required, convert the array to a Base64 representation.

– Jimi
Nov 27 '18 at 18:14

|
show 1 more comment

0

active

oldest

votes

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53504121%2fwriting-latin1-encoded-byte-array-with-escape-characters%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

0

active

oldest

votes

0

active

oldest

votes

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Btukfyl