c_str() vs. data() when it comes to return type
After C++11, I thought of c_str() and data() as equivalent.
C++17 introduces an overload for the latter that returns a non-constant pointer (according to the reference, which I am not sure is fully updated for C++17):
    const CharT* data() const;    (1)
    CharT* data();                (2) (since C++17)
c_str() only returns a constant pointer:
    const CharT* c_str() const;
Why were these two methods differentiated in C++17, especially when C++11 was the one that made them homogeneous? In other words, why did only one method get an overload, while the other didn't?
c++ string c++17 c-str
asked Nov 27 '18 at 13:03 by gsamaras
edited Nov 27 '18 at 19:10 by rrauenza
4
my bet is that it has to do with c_str being null terminated, while a std::string may contain a null in the middle, and I'd also expect data() to return just the raw buffer (whether it contains a null in the middle or not)
– user463035818
Nov 27 '18 at 13:11
@user463035818 they both return the same in this bad example I made...
– gsamaras
Nov 27 '18 at 13:20
Possible duplicate of Why Doesn't string::data() Provide a Mutable char*?
– Jonathan Mee
Nov 27 '18 at 15:34
@JonathanMee thanks for sharing, but where does this answer my question? From what I can understand from the answers here, "we can only speculate". I don't see how this is a duplicate, but if I am wrong, please let me know. :)
– gsamaras
Nov 27 '18 at 15:37
My understanding was that you were asking for the context of the decision to add a non-constant data. I believe that is covered in detail in the other question?
– Jonathan Mee
Nov 27 '18 at 15:41
4 Answers
The new overload was added by P0272R1 for C++17. Neither the paper itself nor the links therein discuss why only data was given a new overload but c_str was not. We can only speculate at this point (unless people involved in the discussion chime in), but I'd like to offer the following points for consideration:
- Even just adding the overload to data broke some code; keeping this change conservative was a way to minimize negative impact.
- The c_str function had so far been entirely identical to data and is effectively a "legacy" facility for interfacing with code that takes a "C string", i.e. an immutable, null-terminated char array. Since you can always replace c_str by data, there's no particular reason to add to this legacy interface.
I realize that the very motivation for P0272R1 was that there do exist legacy APIs that erroneously or for C reasons take only mutable pointers even though they don't mutate. All the same, I suppose we don't want to add more to string's already massive API than absolutely necessary.
One more point: as of C++17 you are now allowed to write to the null terminator, as long as you write the value zero. (Previously, it used to be UB to write anything to the null terminator.) A mutable c_str would create yet another entry point into this particular subtlety, and the fewer subtleties we have, the better.
Yes, I couldn't find any relevant information in that document on why c_str() didn't get an overload too... Thank you for the answer!
– gsamaras
Nov 27 '18 at 13:18
@gsamaras: No problem -- I added a note about writing to the null terminator.
– Kerrek SB
Nov 27 '18 at 13:20
Also, I can easily imagine a non-const c_str() overload breaking legacy code. Think about calling it on a non-const string, with an auto return type.
– rustyx
Nov 27 '18 at 13:21
@rustyx: The new data overload absolutely did break code. We coped, but it's not something you want to do gratuitously.
– Kerrek SB
Nov 27 '18 at 13:24
@KerrekSB yesterday in my sleep I was thinking about your first bullet. Why would the non-const overload break things? I mean, wouldn't the relevant const overload of the method be called wherever const is needed?
– gsamaras
Nov 28 '18 at 8:07
The reason why the data() member got an overload is explained in this paper at open-std.org.
TL;DR of the paper: The non-const .data() member function for std::string was added to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters.
Some relevant passages from the paper:
Abstract
Is std::string's lack of a non-const .data() member function an oversight or an intentional design based on pre-C++11 std::string semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const .data() member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
Use Cases
C libraries occasionally include routines that have char * parameters. One example is the lpCommandLine parameter of the CreateProcess function in the Windows API. Because the data() member of std::string is const, it cannot be used to make std::string objects work with the lpCommandLine parameter. Developers are tempted to use .front() instead, as in the following example.
    std::string programName;
    // ...
    if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
        // etc.
    } else {
        // handle error
    }
Note that when programName is empty, the programName.front() expression causes undefined behavior. A temporary empty C-string fixes the bug.
    std::string programName;
    // ...
    if( !programName.empty() ) {
        char emptyString[] = {'\0'};
        if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
            // etc.
        } else {
            // handle error
        }
    }
If there were a non-const .data() member, as there is with std::vector, the correct code would be straightforward.
    std::string programName;
    // ...
    if( !programName.empty() ) {
        char emptyString[] = {'\0'};
        if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
            // etc.
        } else {
            // handle error
        }
    }
A non-const .data() std::string member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
It just depends on the semantics of "what you want to do with it". Generally speaking, std::string is sometimes used as a buffer vector, i.e., as a replacement for std::vector<char>. This can often be seen in boost::asio. In other words, it's an array of characters.
- c_str(): strictly means that you're looking for a null-terminated string. In that sense, you should never modify the data and you should never need the string as non-const.
- data(): you may need the information inside the string as buffer data, and even as non-const. You may or may not need to modify the data, which you can do, as long as it doesn't involve changing the length of the string.
3
I think the null-termination is a red herring here. Both c_str and data are absolutely equivalent regarding null termination.
– Kerrek SB
Nov 27 '18 at 13:15
1
@KerrekSB is right, after C++11 both methods return a null terminated string.
– gsamaras
Nov 27 '18 at 13:16
2
@KerrekSB It's not about the null-termination in the sense of whether it exists or not. It's in the sense whether you want "null-terminated string" or "buffer vector", where you don't care about null termination.
– The Quantum Physicist
Nov 27 '18 at 13:16
@TheQuantumPhysicist: Yes, I see your point, but I would somewhat like to dispel the idea that you shouldn't use data to request null-termination (which you may or may not want to imply). It's perfectly fine to use data for the express purpose of getting a null-terminated string; I would not ask anyone to use c_str instead.
– Kerrek SB
Nov 27 '18 at 13:18
2
@KerrekSB You're right, but keep in mind that C++ is an expressive language, and the text of the code you write should ideally have meaning. Personally I'd consider it bad practice to use data() if all you want is a null-terminated string. You wouldn't be helping the guy who reads your code next. It's my opinion, anyway :-)
– The Quantum Physicist
Nov 27 '18 at 13:21
The two member functions c_str and data of std::string both exist because of the history of the std::string class.
Until C++11, a std::string could be implemented as copy-on-write. The internal representation did not need any null termination of the stored string. The member function c_str made sure the returned string was null terminated. The member function data simply returned a pointer to the stored string, which was not necessarily null terminated. To ensure that changes to the string were detected, as copy-on-write requires, both functions needed to return a pointer to const data.
This all changed with C++11, when copy-on-write was no longer allowed for std::string. Since c_str was still required to deliver a null-terminated string, the null is always appended to the actual stored string. Otherwise a call to c_str might need to change the stored data to make the string null terminated, which would make c_str a non-const function. Since data delivers a pointer to the stored string, it usually has the same implementation as c_str. Both functions still exist for backward compatibility.
First answer (on P0272R1): answered Nov 27 '18 at 13:14 by Kerrek SB; edited Nov 29 '18 at 9:17 by gsamaras
Yes I couldn't find any relevant information on that document on whyc_str()
didn't get an overload too... Thank you for the answer!
– gsamaras
Nov 27 '18 at 13:18
@gsamaras: No problem -- I added a note about writing to the null terminator.
– Kerrek SB
Nov 27 '18 at 13:20
Also, I can easily imagine a non-constc_str()
overload breaking legacy code. Think about calling it on a non-const string, with an auto return type.
– rustyx
Nov 27 '18 at 13:21
@rustyx: The newdata
overload absolutely did break code. We coped, but it's not something you want to do gratuitously.
– Kerrek SB
Nov 27 '18 at 13:24
@KerrekSB yesterday in my sleep I was thinking about your first bullet. Why the non-const overload would break things? I mean wouldn't it be that where the const is needed, the relevant const overload of the method would be called?
– gsamaras
Nov 28 '18 at 8:07
|
show 2 more comments
Yes I couldn't find any relevant information on that document on whyc_str()
didn't get an overload too... Thank you for the answer!
– gsamaras
Nov 27 '18 at 13:18
@gsamaras: No problem -- I added a note about writing to the null terminator.
– Kerrek SB
Nov 27 '18 at 13:20
Also, I can easily imagine a non-constc_str()
overload breaking legacy code. Think about calling it on a non-const string, with an auto return type.
– rustyx
Nov 27 '18 at 13:21
@rustyx: The newdata
overload absolutely did break code. We coped, but it's not something you want to do gratuitously.
– Kerrek SB
Nov 27 '18 at 13:24
@KerrekSB yesterday in my sleep I was thinking about your first bullet. Why the non-const overload would break things? I mean wouldn't it be that where the const is needed, the relevant const overload of the method would be called?
– gsamaras
Nov 28 '18 at 8:07
Yes I couldn't find any relevant information on that document on why
c_str()
didn't get an overload too... Thank you for the answer!– gsamaras
Nov 27 '18 at 13:18
Yes I couldn't find any relevant information on that document on why
c_str()
didn't get an overload too... Thank you for the answer!– gsamaras
Nov 27 '18 at 13:18
@gsamaras: No problem -- I added a note about writing to the null terminator.
– Kerrek SB
Nov 27 '18 at 13:20
@gsamaras: No problem -- I added a note about writing to the null terminator.
– Kerrek SB
Nov 27 '18 at 13:20
Also, I can easily imagine a non-const
c_str()
overload breaking legacy code. Think about calling it on a non-const string, with an auto return type.– rustyx
Nov 27 '18 at 13:21
Also, I can easily imagine a non-const
c_str()
overload breaking legacy code. Think about calling it on a non-const string, with an auto return type.– rustyx
Nov 27 '18 at 13:21
@rustyx: The new
data
overload absolutely did break code. We coped, but it's not something you want to do gratuitously.– Kerrek SB
Nov 27 '18 at 13:24
@rustyx: The new
data
overload absolutely did break code. We coped, but it's not something you want to do gratuitously.– Kerrek SB
Nov 27 '18 at 13:24
@KerrekSB yesterday in my sleep I was thinking about your first bullet. Why the non-const overload would break things? I mean wouldn't it be that where the const is needed, the relevant const overload of the method would be called?
– gsamaras
Nov 28 '18 at 8:07
@KerrekSB yesterday in my sleep I was thinking about your first bullet. Why the non-const overload would break things? I mean wouldn't it be that where the const is needed, the relevant const overload of the method would be called?
– gsamaras
Nov 28 '18 at 8:07
|
show 2 more comments
The reason why the data()
member got an overload is explained in this paper at open-std.org.
TL;DR of the paper: The non-const .data()
member function for std::string
was added to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters.
Some relevant passages from the paper:
Abstract
Isstd::string
's lack of a non-const.data()
member function an oversight or an intentional design based on pre-C++11std::string
semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const.data()
member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
Use Cases
C libraries occasionally include routines that have char * parameters. One example is thelpCommandLine
parameter of theCreateProcess
function in the Windows API. Because thedata()
member ofstd::string
is const, it cannot be used to make std::string objects work with thelpCommandLine
parameter. Developers are tempted to use.front()
instead, as in the following example.
std::string programName;
// ...
if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
Note that when
programName
is empty, theprogramName.front()
expression causes undefined behavior. A temporary empty C-string fixes the bug.
std::string programName;
// ...
if( !programName.empty() ) {
char emptyString = {''};
if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
}
If there were a non-const
.data()
member, as there is withstd::vector
, the correct code would be straightforward.
std::string programName;
// ...
if( !programName.empty() ) {
char emptyString = {''};
if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
}
A non-const
.data() std::string
member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
add a comment |
The reason why the data()
member got an overload is explained in this paper at open-std.org.
TL;DR of the paper: The non-const .data()
member function for std::string
was added to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters.
Some relevant passages from the paper:
Abstract
Isstd::string
's lack of a non-const.data()
member function an oversight or an intentional design based on pre-C++11std::string
semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const.data()
member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
Use Cases
C libraries occasionally include routines that have char * parameters. One example is thelpCommandLine
parameter of theCreateProcess
function in the Windows API. Because thedata()
member ofstd::string
is const, it cannot be used to make std::string objects work with thelpCommandLine
parameter. Developers are tempted to use.front()
instead, as in the following example.
std::string programName;
// ...
if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
Note that when
programName
is empty, theprogramName.front()
expression causes undefined behavior. A temporary empty C-string fixes the bug.
std::string programName;
// ...
if( !programName.empty() ) {
char emptyString = {''};
if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
}
If there were a non-const
.data()
member, as there is withstd::vector
, the correct code would be straightforward.
std::string programName;
// ...
if( !programName.empty() ) {
char emptyString = {''};
if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
// etc.
} else {
// handle error
}
}
A non-const .data() std::string member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
edited Nov 28 '18 at 8:04 by gsamaras
answered Nov 27 '18 at 13:12 by P.W
It just depends on the semantics of "what you want to do with it". Generally speaking, std::string is sometimes used as a buffer vector, i.e., as a replacement for std::vector<char>. This can often be seen in boost::asio. In other words, it's an array of characters.
c_str(): strictly means that you're looking for a null-terminated string. In that sense, you should never modify the data and you should never need the string as non-const.
data(): you may need the information inside the string as buffer data, and even as non-const. You may or may not need to modify the data, which you can do, as long as it doesn't involve changing the length of the string.
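To make the "string as buffer" use concrete, here is a small sketch, assuming C++17. The helper name format_int and the 32-byte buffer size are my own choices for illustration. The C function writes through data() into characters the string already owns; the length is only changed afterwards, through resize(), never through the pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>

// Uses a std::string as a writable character buffer for a C API.
std::string format_int(int value) {
    std::string buf(32, '\0');  // pre-sized buffer; 32 is an arbitrary choice
    // snprintf writes at most buf.size() bytes including its own '\0',
    // so it stays within [data(), data() + size()).
    int written = std::snprintf(buf.data(), buf.size(), "%d", value);
    if (written < 0) {
        return {};  // encoding error reported by snprintf
    }
    // Shrink to the characters actually produced, via the string's own API.
    buf.resize(std::min<std::size_t>(static_cast<std::size_t>(written),
                                     buf.size() - 1));
    return buf;
}
```

With this, format_int(42) should come back as "42". Before C++17 the same thing was usually written with &buf[0] instead of buf.data().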
edited Nov 29 '18 at 8:18 by gsamaras
answered Nov 27 '18 at 13:12 by The Quantum Physicist
3
I think the null-termination is a red herring here. Both c_str and data are absolutely equivalent regarding null termination.
– Kerrek SB
Nov 27 '18 at 13:15
1
@KerrekSB is right, after C++11 both methods return a null terminated string.
– gsamaras
Nov 27 '18 at 13:16
2
@KerrekSB It's not about the null-termination in the sense of whether it exists or not. It's in the sense whether you want "null-terminated string" or "buffer vector", where you don't care about null termination.
– The Quantum Physicist
Nov 27 '18 at 13:16
@TheQuantumPhysicist: Yes, I see your point, but I would somewhat like to dispel the idea that you shouldn't use data to request null-termination (which you may or may not want to imply). It's perfectly fine to use data for the express purpose of getting a null-terminated string; I would not ask anyone to use c_str instead.
– Kerrek SB
Nov 27 '18 at 13:18
2
@KerrekSB You're right, but keep in mind that C++ is an expressive language, and the text of the code you write should ideally have meaning. Personally I'd consider it bad practice to use data() if all you want is a null-terminated string. You wouldn't be helping the guy who reads your code next. It's my opinion, anyway :-)
– The Quantum Physicist
Nov 27 '18 at 13:21
The two member functions c_str and data of std::string exist due to the history of the std::string class.
Until C++11, a std::string could be implemented as copy-on-write. The internal representation did not need any null termination of the stored string. The member function c_str made sure the returned string was null terminated. The member function data simply returned a pointer to the stored string, which was not necessarily null terminated. To be sure that changes to the string were noticed so that copy-on-write could work, both functions needed to return a pointer to const data.
This all changed with C++11, when copy-on-write was no longer allowed for std::string. Since c_str was still required to deliver a null-terminated string, the null is always appended to the actual stored string. Otherwise a call to c_str might need to change the stored data to make the string null terminated, which would make c_str a non-const function. Since data delivers a pointer to the stored string, it usually has the same implementation as c_str. Both functions still exist for backward compatibility.
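A quick way to check this for yourself, as a sketch assuming a C++17 compiler: the static_asserts show the return types the question is about, and the helper same_buffer (a name I made up) checks that both functions point at the same null-terminated storage.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// On a non-const std::string, C++17 data() yields char*,
// while c_str() still yields const char*.
static_assert(std::is_same_v<
    decltype(std::declval<std::string&>().data()), char*>);
static_assert(std::is_same_v<
    decltype(std::declval<std::string&>().c_str()), const char*>);
// On a const std::string, both yield const char*.
static_assert(std::is_same_v<
    decltype(std::declval<const std::string&>().data()), const char*>);

// Since C++11 both functions return a pointer to the same storage,
// and the character at index size() is the terminating '\0'.
bool same_buffer(const std::string& s) {
    return s.data() == s.c_str() && s.data()[s.size()] == '\0';
}
```

same_buffer should hold for any string, including an empty one or one with an embedded null such as std::string("a\0b", 3).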
edited Nov 27 '18 at 20:46 by gsamaras
answered Nov 27 '18 at 20:25 by CAF
4
my bet is that it has to do with c_str being null terminated, while a std::string may contain a null in the middle and I'd expect also data() to return just the raw buffer (whether it contains null in the middle or not)
– user463035818
Nov 27 '18 at 13:11
@user463035818 they both return the same in this bad example I made...
– gsamaras
Nov 27 '18 at 13:20
Possible duplicate of Why Doesn't string::data() Provide a Mutable char*?
– Jonathan Mee
Nov 27 '18 at 15:34
@JonathanMee thanks for sharing, but where does this answer my question? From what I can understand from the answers here, "we can only speculate". I don't see how this is a duplicate, but if I am wrong, please let me know. :)
– gsamaras
Nov 27 '18 at 15:37
My understanding was you were asking for the context of the decision why a non-constant data was added. I believe that is covered in detail in the other question?
– Jonathan Mee
Nov 27 '18 at 15:41