#pragma pack effect












186














I was wondering if someone could explain to me what the #pragma pack preprocessor statement does, and more importantly, why one would want to use it.



I checked out the MSDN page, which offered some insight, but I was hoping to hear more from people with experience. I've seen it in code before, though I can't seem to find where anymore.






























  • 1




    It forces a particular alignment/packing of a struct, but like most #pragma directives it is implementation-defined.
    – dreamlax
    Jul 23 '10 at 13:20












  • A mod s = 0, where A is the address and s is the size of the datatype; this checks that a datum is properly aligned.
    – legends2k
    Nov 22 '13 at 11:34

















c c-preprocessor pragma-pack














asked Jul 23 '10 at 13:12 by Cenoc; edited Aug 22 '18 at 10:05 by Mohammadreza Panahi




















10 Answers


















343














#pragma pack instructs the compiler to pack structure members with particular alignment. Most compilers, when you declare a struct, will insert padding between members to ensure that they are aligned to appropriate addresses in memory (usually a multiple of the type's size). This avoids the performance penalty (or outright error) on some architectures associated with accessing variables that are not aligned properly. For example, given 4-byte integers and the following struct:



struct Test
{
    char AA;
    int BB;
    char CC;
};


The compiler could choose to lay the struct out in memory like this:



|   1   |   2   |   3   |   4   |  

| AA(1) | pad.................. |
| BB(1) | BB(2) | BB(3) | BB(4) |
| CC(1) | pad.................. |


and sizeof(Test) would be 4 × 3 = 12, even though it only contains 6 bytes of data. The most common use case for the #pragma (to my knowledge) is when working with hardware devices where you need to ensure that the compiler does not insert padding into the data and each member follows the previous one. With #pragma pack(1), the struct above would be laid out like this:



|   1   |

| AA(1) |
| BB(1) |
| BB(2) |
| BB(3) |
| BB(4) |
| CC(1) |


And sizeof(Test) would be 1 × 6 = 6.



With #pragma pack(2), the struct above would be laid out like this:



|   1   |   2   | 

| AA(1) | pad.. |
| BB(1) | BB(2) |
| BB(3) | BB(4) |
| CC(1) | pad.. |


And sizeof(Test) would be 2 × 4 = 8.

























  • 61




    It might be worth adding the downsides of packing. (unaligned object accesses are slow in the best case, but will cause errors on some platforms.)
    – jalf
    Jul 23 '10 at 14:55






  • 10




    Seems the alignments "performance penalty" mentioned could actually be a benefit on some systems danluu.com/3c-conflict .
    – user152949
    Jan 3 '14 at 15:22






  • 3




    @Pacerier Not really. That post talks about some fairly extreme alignment (aligning on 4KB boundaries). The CPU expects certain minimum alignments for various data types, but those require, in the worst case, 8-byte alignment (not counting vector types which may require 16 or 32 byte alignment). Not aligning on those boundaries generally gives you a noticeable performance hit (because a load may have to be done as two operations instead of one), but the type is either well-aligned or it isn't. Stricter alignment than that buys you nothing (and ruins cache utilization).
    – jalf
    May 14 '15 at 18:30






  • 4




    In other words, a double expects to be on an 8 byte boundary. Putting it on a 7 byte boundary will hurt performance. But putting it on a 16, 32, 64 or 4096 byte boundary buys you nothing above what the 8 byte boundary already gave you. You'll get the same performance from the CPU, while getting much worse cache utilization for the reasons outlined in that post.
    – jalf
    May 14 '15 at 18:30








  • 2




    So the lesson is not "packing is beneficial" (packing violates the types' natural alignment, so that hurts performance), but simply "don't over-align beyond what is required"
    – jalf
    May 14 '15 at 18:32



















21














#pragma is used to send non-portable (as in this compiler only) messages to the compiler. Disabling certain warnings and packing structs are common uses. Disabling specific warnings is particularly useful if you compile with the warnings-as-errors flag turned on.

#pragma pack specifically is used to indicate that the members of the struct being packed should not be padded out to their natural alignment. It's useful when you have a memory-mapped interface to a piece of hardware and need to control exactly where the different struct members point. It is notably not a good speed optimization, since most machines are much faster at dealing with aligned data.























  • 11




    To undo the packing afterwards, bracket the declarations with #pragma pack(push,1) and #pragma pack(pop).
    – malhal
    Jan 12 '14 at 20:14





















15














It tells the compiler the boundary to align objects in a structure to. For example, if I have something like:



struct foo {
    char a;
    int b;
};


With a typical 32-bit machine, you'd normally "want" to have 3 bytes of padding between a and b so that b will land at a 4-byte boundary to maximize its access speed (and that's what will typically happen by default).



If, however, you have to match an externally defined structure you want to ensure the compiler lays out your structure exactly according to that external definition. In this case, you can give the compiler a #pragma pack(1) to tell it not to insert any padding between members -- if the definition of the structure includes padding between members, you insert it explicitly (e.g., typically with members named unusedN or ignoreN, or something on that order).



























  • "you'd normally "want" to have 3 bytes of padding between a and b so that b will land at a 4-byte boundary to maximize its access speed" - how would having 3 byte of padding maximize access speed?
    – Ashwin
    Mar 31 '14 at 13:04






  • 7




    @Ashwin: Placing b at a 4-byte boundary means that the processor can load it by issuing a single 4-byte load. Although it depends somewhat on the processor, if it's at an odd boundary there's a good chance that loading it will require the processor to issue two separate load instructions, then use a shifter to put those pieces together. Typical penalty is on the order of 3x slower load of that item.
    – Jerry Coffin
    Mar 31 '14 at 14:07










  • ...if you look at the assembly code for reading aligned and unaligned int, aligned read is usually a single mnemonic. Unaligned read can be 10 lines of assembly easily as it pieces the int together, picking it byte by byte and placing at correct locations of the register.
    – SF.
    Jan 12 '16 at 14:09






  • 1




    @SF.: It can be--but even when it's not, don't be misled--on an x86 CPU (for one obvious example) the operations are carried out in hardware, but you still get roughly the same set of operations and slowdown.
    – Jerry Coffin
    Jan 12 '16 at 16:12



















7














Data elements (e.g. members of classes and structs) are typically aligned on WORD or DWORD boundaries for current generation processors in order to improve access times. Retrieving a DWORD at an address which isn't divisible by 4 requires at least one extra CPU cycle on a 32 bit processor. So, if you have e.g. three char members char a, b, c;, they actually tend to take 6 or 12 bytes of storage.



#pragma pack allows you to override this to achieve more efficient space usage, at the expense of access speed, or for consistency of stored data between different compiler targets. I had a lot of fun with this transitioning from 16 bit to 32 bit code; I expect porting to 64 bit code will cause the same kinds of headaches for some code.



























  • Actually, char a,b,c; will usually take either 3 or 4 bytes of storage (on x86 at least) -- that's because their alignment requirement is 1 byte. If it weren't, then how would you deal with char str[] = "foo";? Access to a char is always a simple fetch-shift-mask, while access to an int can be fetch-fetch-merge or just fetch, depending on whether it's aligned or not. int has (on x86) a 32-bit (4 byte) alignment because otherwise you'd get (say) half an int in one DWORD and half in the other, and that would take two lookups.
    – Tim Čas
    Jul 18 '12 at 14:17



















2














A compiler may place structure members on particular byte boundaries for reasons of performance on a particular architecture. This may leave unused padding between members. Structure packing forces members to be contiguous.

This may be important, for example, if you require a structure to conform to a particular file or communications format where the data needs to be at specific positions within a sequence. However, such usage does not deal with endianness issues, so although used, it may not be portable.

It may also be used to exactly overlay the internal register structure of some I/O device, such as a UART or USB controller, so that register access can go through a structure rather than direct addresses.



































    2














The compiler may align members in structures to achieve maximum performance on a given platform. The #pragma pack directive allows you to control that alignment. Usually you should leave it at the default for optimum performance. If you need to pass a structure to a remote machine, you will generally use #pragma pack(1) to exclude any unwanted padding.



































      1














      You'd likely only want to use this if you were coding to some hardware (e.g. a memory mapped device) which had strict requirements for register ordering and alignment.



      However, this looks like a pretty blunt tool to achieve that end. A better approach would be to code a mini-driver in assembler and give it a C calling interface rather than fumbling around with this pragma.



























      • I actually use it quite a lot to save space in large tables which are not accessed frequently. There, it's only to save space and not for any strict alignment. (Just voted you up, btw. Someone had given you a negative vote.)
        – Todd Lehman
        Apr 15 '15 at 21:27





















      1














      I've used it in code before, though only to interface with legacy code. This was a Mac OS X Cocoa application that needed to load preference files from an earlier, Carbon version (which was itself backwards-compatible with the original M68k System 6.5 version...you get the idea). The preference files in the original version were a binary dump of a configuration structure, that used the #pragma pack(1) to avoid taking up extra space and saving junk (i.e. the padding bytes that would otherwise be in the structure).



      The original authors of the code had also used #pragma pack(1) to store structures that were used as messages in inter-process communication. I think the reason here was to avoid the possibility of unknown or changed padding sizes, as the code sometimes looked at a specific portion of the message struct by counting a number of bytes in from the start (ewww).



































        1














        I have seen people use it to make sure that a structure takes a whole cache line to prevent false sharing in a multithreaded context. If you are going to have a large number of objects that are going to be loosely packed by default it could save memory and improve cache performance to pack them tighter, though unaligned memory access will usually slow things down so there might be a downside.



































          0














Note that there are other ways of achieving the data consistency that #pragma pack offers (for instance, some people use #pragma pack(1) for structures that should be sent across the network). For instance, see the following code and its subsequent output:



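#include <stdio.h>

/* Wire format expressed as raw bytes: every member has alignment 1, so no padding. */
struct a {
    char one;
    char two[2];
    char eight[8];
    char four[4];
};

/* The same fields as natural types: the compiler pads members to their alignment. */
struct b {
    char one;
    short two;
    long int eight;
    int four;
};

int main(void) {
    struct a twoa[2] = {0};
    struct b twob[2] = {0};
    /* sizeof yields a size_t, so it is printed with %zu. */
    printf("sizeof(struct a): %zu, sizeof(struct b): %zu\n", sizeof(struct a), sizeof(struct b));
    printf("sizeof(twoa): %zu, sizeof(twob): %zu\n", sizeof(twoa), sizeof(twob));
    return 0;
}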


          The output is as follows:
          sizeof(struct a): 15, sizeof(struct b): 24
          sizeof(twoa): 30, sizeof(twob): 48



          Notice how the size of struct a is exactly what the byte count is, but struct b has padding added (see this for details on the padding). By doing this as opposed to the #pragma pack you can have control of converting the "wire format" into the appropriate types. For instance, "char two[2]" into a "short int" et cetera.



























          • No, it's wrong. If you look at the position in memory of b.two, it's not one byte after b.one (the compiler can, and often will, align b.two for word access). For a.two, it's exactly one byte after a.one. If you need to access a.two as a short int, you have two alternatives: either use a union (but this usually fails if you have an endianness issue), or unpack/convert in code (using the appropriate ntohX function).
            – xryl669
            Sep 13 '16 at 15:39



















First answer (343 votes): answered Jul 23 '10 at 13:21 by Nick Meyer; edited May 14 '15 at 16:10 by Pacerier








          • 61




            It might be worth adding the downsides of packing. (unaligned object accesses are slow in the best case, but will cause errors on some platforms.)
            – jalf
            Jul 23 '10 at 14:55






          • 10




            Seems the alignments "performance penalty" mentioned could actually be a benefit on some systems danluu.com/3c-conflict .
            – user152949
            Jan 3 '14 at 15:22






          • 3




            @Pacerier Not really. That post talks about some fairly extreme alignment (aligning on 4KB boundaries). The CPU expects certain minimum alignments for various data types, but those require, in the worst case, 8-byte alignment (not counting vector types which may require 16 or 32 byte alignment). Not aligning on those boundaries generally gives you a noticeable performance hit (because a load may have to be done as two operations instead of one), but the type is either well-aligned or it isn't. Stricter alignment than that buys you nothing (and ruins cache utilization
            – jalf
            May 14 '15 at 18:30






          • 4




            In other words, a double expects to be on an 8 byte boundary. Putting it on a 7 byte boundary will hurt performance. But putting it on a 16, 32, 64 or 4096 byte boundary buys you nothing above what the 8 byte boundary already gave you. You'll get the same performance from the CPU, while getting much worse cache utilization for the reasons outlined in that post.
            – jalf
            May 14 '15 at 18:30








          • 2




            So the lesson is not "packing is beneficial" (packing violates the types' natural alignment, so that hurts performance), but simply "don't over-align beyond what is required"
            – jalf
            May 14 '15 at 18:32
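To make the alignment talk in these comments concrete, here is a small sketch of the "address mod alignment is zero" test, next to a struct whose double member loses its natural alignment under packing. It assumes a C11 compiler (for `_Alignof`) and support for `#pragma pack(push, n)`; is_aligned, NaturalSample and PackedSample are hypothetical names, not from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: d is placed on a boundary suitable for double. */
struct NaturalSample { char c; double d; };

#pragma pack(push, 1)
/* Packed layout: d starts right after c, at offset 1. */
struct PackedSample { char c; double d; };
#pragma pack(pop)

/* Nonzero when address p is a multiple of `alignment`. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return (uintptr_t)p % alignment == 0;
}

/* The natural layout keeps d aligned; the packed one puts it at offset 1. */
_Static_assert(offsetof(struct NaturalSample, d) % _Alignof(double) == 0,
               "natural layout keeps d aligned");
_Static_assert(offsetof(struct PackedSample, d) == 1,
               "packed layout places d at offset 1");
```

If a PackedSample object happens to start on a double boundary, its d member lives one byte past that boundary, so is_aligned would return 0 for it. Over-aligning (say, to 4096) would pass the check too, but as the comments note it buys nothing beyond what the natural boundary already gives.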



























          21














          #pragma sends non-portable (compiler-specific) directives to the compiler. Common uses include disabling certain warnings and packing structs. Disabling specific warnings is particularly useful if you compile with the warnings-as-errors flag turned on.



          #pragma pack specifically tells the compiler to lay out the members of the struct being packed with reduced alignment (with a value of 1, no alignment padding at all). It's useful when you have a memory-mapped interface to a piece of hardware and need to control exactly where the different struct members land. It is notably not a good speed optimization, since most machines are much faster at dealing with aligned data.























          • 11




            To undo the packing afterwards, bracket the declarations: put #pragma pack(push,1) before them and #pragma pack(pop) after them.
            – malhal
            Jan 12 '14 at 20:14
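A minimal sketch of the push/pop pattern from the comment above, assuming a C11 compiler (for `_Static_assert`) that supports the push/pop form of the pragma. The struct names Before, Packed and After are invented for the example.

```c
#include <stddef.h>

struct Before { char tag; int value; };  /* default packing */

#pragma pack(push, 1)  /* save the current setting, then pack to 1 byte */
struct Packed { char tag; int value; };
#pragma pack(pop)      /* restore the saved setting */

struct After { char tag; int value; };   /* default packing again */

/* Compile-time checks: only the bracketed struct is affected. */
_Static_assert(sizeof(struct Packed) == sizeof(char) + sizeof(int),
               "Packed has no padding");
_Static_assert(offsetof(struct Packed, value) == 1,
               "value follows tag directly");
_Static_assert(sizeof(struct After) == sizeof(struct Before),
               "pop restored the previous packing");
```

The point of the pair is scoping: without the pop, every struct declared later in the translation unit (including those pulled in by subsequent headers) would be packed as well.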




























          answered Jul 23 '10 at 13:24









          nmichaels














          15














          It tells the compiler the boundary to align objects in a structure to. For example, if I have something like:



          struct foo {
              char a;
              int b;
          };


          With a typical 32-bit machine, you'd normally "want" to have 3 bytes of padding between a and b so that b will land at a 4-byte boundary to maximize its access speed (and that's what will typically happen by default).



          If, however, you have to match an externally defined structure, you want to ensure the compiler lays out your structure exactly according to that external definition. In this case, you can give the compiler a #pragma pack(1) to tell it not to insert any padding between members. If the definition of the structure includes padding between members, you insert it explicitly (e.g., typically with members named unusedN or ignoreN, or something on that order).
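As an illustration of that last point, here is a sketch of a packed struct matching a made-up external layout in which the format itself reserves three bytes. The record layout, the ExternalRecord name and its fields are all hypothetical; the sketch assumes C11 and a compiler with `#pragma pack(push, n)`.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
struct ExternalRecord {
    uint8_t  kind;        /* offset 0 */
    uint32_t id;          /* offset 1: unaligned because the format says so */
    uint8_t  unused1[3];  /* offsets 5..7: padding defined by the format itself */
    uint16_t length;      /* offset 8 */
};
#pragma pack(pop)

/* Fail the build if the layout ever drifts from the external definition. */
_Static_assert(offsetof(struct ExternalRecord, id) == 1, "id offset");
_Static_assert(offsetof(struct ExternalRecord, unused1) == 5, "reserved bytes offset");
_Static_assert(offsetof(struct ExternalRecord, length) == 8, "length offset");
_Static_assert(sizeof(struct ExternalRecord) == 10, "record size");
```

Spelling the reserved bytes out as a member keeps the struct readable against the external spec, and the static assertions catch a mismatch at compile time rather than at run time.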



























          • "you'd normally "want" to have 3 bytes of padding between a and b so that b will land at a 4-byte boundary to maximize its access speed" - how would having 3 bytes of padding maximize access speed?
            – Ashwin
            Mar 31 '14 at 13:04






          • 7




            @Ashwin: Placing b at a 4-byte boundary means that the processor can load it by issuing a single 4-byte load. Although it depends somewhat on the processor, if it's at an odd boundary there's a good chance that loading it will require the processor to issue two separate load instructions, then use a shifter to put those pieces together. Typical penalty is on the order of 3x slower load of that item.
            – Jerry Coffin
            Mar 31 '14 at 14:07










          • ...if you look at the assembly code for reading an aligned and an unaligned int, the aligned read is usually a single instruction. The unaligned read can easily be 10 lines of assembly, as it pieces the int together, picking it up byte by byte and placing each byte at the correct location in the register.
            – SF.
            Jan 12 '16 at 14:09






          • 1




            @SF.: It can be--but even when it's not, don't be misled--on an x86 CPU (for one obvious example) the operations are carried out in hardware, but you still get roughly the same set of operations and slowdown.
            – Jerry Coffin
            Jan 12 '16 at 16:12
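The byte-by-byte assembly described in these comments can be written out in C. This is only a sketch of the idea, not what any particular compiler emits: load_u32_bytewise_le and load_u32_memcpy are hypothetical helper names, and the first one assumes the value is stored in little-endian order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Builds a 32-bit value from four bytes in little-endian order.
   Each byte is fetched separately and shifted into place, so the
   address p may have any alignment. */
uint32_t load_u32_bytewise_le(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Copies four bytes into an aligned local in host byte order.
   The compiler is free to turn this into a single load where the
   CPU tolerates unaligned access, or into byte operations where
   it does not. */
uint32_t load_u32_memcpy(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Either helper can read an int-sized field at an odd offset inside a buffer without relying on the hardware accepting a misaligned pointer dereference.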


























          answered Jul 23 '10 at 13:21









          Jerry Coffin












          7














          Data elements (e.g. members of classes and structs) are typically aligned on WORD or DWORD boundaries for current generation processors in order to improve access times. Retrieving a DWORD at an address which isn't divisible by 4 requires at least one extra CPU cycle on a 32 bit processor. So, if you have e.g. three char members char a, b, c;, they actually tend to take 6 or 12 bytes of storage.



          #pragma pack allows you to override this to achieve more efficient space usage, at the expense of access speed, or for consistency of stored data between different compiler targets. I had a lot of fun with this transitioning from 16-bit to 32-bit code; I expect porting to 64-bit code will cause the same kinds of headaches for some code.



























          • Actually, char a,b,c; will usually take either 3 or 4 bytes of storage (on x86 at least) -- that's because their alignment requirement is 1 byte. If it weren't, then how would you deal with char str[] = "foo";? Access to a char is always a simple fetch-shift-mask, while access to an int can be fetch-fetch-merge or just fetch, depending on whether it's aligned or not. int has (on x86) a 32-bit (4 byte) alignment because otherwise you'd get (say) half an int in one DWORD and half in the other, and that would take two lookups.
            – Tim Čas
            Jul 18 '12 at 14:17
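The claim in this comment is easy to check with the compiler itself. A small sketch, assuming C11 (`_Alignof`, `_Static_assert`); the struct names ThreeChars and CharThenInt and the helper padding_before_int are invented for the example.

```c
#include <stddef.h>

struct ThreeChars  { char a, b, c; };   /* alignment 1, so no padding is needed */
struct CharThenInt { char a; int b; };  /* b needs int alignment, so padding follows a */

/* Bytes of padding the compiler inserted between a and b. */
size_t padding_before_int(void)
{
    return offsetof(struct CharThenInt, b) - sizeof(char);
}

_Static_assert(_Alignof(char) == 1, "char can live at any address");
_Static_assert(sizeof("foo") == 4, "three characters plus the terminating NUL");
```

On mainstream x86 compilers sizeof(struct ThreeChars) is 3, not 6 or 12, while padding_before_int() typically returns 3 with a 4-byte int. Some other ABIs round small structs up to a larger size, so treat the 3 as typical rather than guaranteed.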


























          answered Jul 23 '10 at 13:22









          Pontus Gagge












          2














          A compiler may place structure members on particular byte boundaries for reasons of performance on a particular architecture. This may leave unused padding between members. Structure packing forces members to be contiguous.



          This may be important, for example, if you require a structure to conform to a particular file or communications format where you need the data to be at specific positions within a sequence. However, such usage does not deal with endianness issues, so although it is used, it may not be portable.
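To show what "does not deal with endianness" means in practice, here is a sketch that overlays a packed struct on a made-up 7-byte message header and then fixes the byte order by hand. The header format, WireHeader, ParsedHeader and parse_header are all hypothetical; the sketch assumes the multi-byte fields are big-endian on the wire and that the compiler supports `#pragma pack(push, n)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
struct WireHeader {        /* hypothetical 7-byte message header */
    uint8_t  version;      /* offset 0 */
    uint16_t length_be;    /* offset 1, big-endian on the wire */
    uint32_t sequence_be;  /* offset 3, big-endian on the wire */
};
#pragma pack(pop)

struct ParsedHeader { uint8_t version; uint16_t length; uint32_t sequence; };

/* Reinterprets the stored bytes of raw as a big-endian 16-bit value. */
static uint16_t be16_to_host(uint16_t raw)
{
    unsigned char b[2];
    memcpy(b, &raw, sizeof b);
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Reinterprets the stored bytes of raw as a big-endian 32-bit value. */
static uint32_t be32_to_host(uint32_t raw)
{
    unsigned char b[4];
    memcpy(b, &raw, sizeof b);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Packing fixes the field positions; the conversions fix the byte order. */
struct ParsedHeader parse_header(const unsigned char *bytes)
{
    struct WireHeader raw;
    struct ParsedHeader out;
    assert(bytes != NULL);
    memcpy(&raw, bytes, sizeof raw);
    out.version  = raw.version;
    out.length   = be16_to_host(raw.length_be);
    out.sequence = be32_to_host(raw.sequence_be);
    return out;
}
```

Without the two conversion helpers the struct would still line up with the bytes, but length and sequence would come out byte-swapped on a little-endian host, which is the portability gap mentioned above.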



          It may also be used to exactly overlay the internal register structure of some I/O device such as a UART or USB controller, for example, so that register access can go through a structure rather than direct addresses.
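A sketch of the register-overlay idea, with a big caveat: the register map, the UartRegs struct, the UART_STATUS_TX_READY bit and uart_try_send are all invented for illustration and do not describe any real UART. On real hardware the struct pointer would be set to the device's base address from its datasheet; here the function simply takes a pointer so it can be exercised against ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
struct UartRegs {                /* hypothetical register map */
    volatile uint8_t  data;      /* offset 0x00 */
    volatile uint8_t  status;    /* offset 0x01 */
    volatile uint16_t baud_div;  /* offset 0x02 */
    volatile uint32_t control;   /* offset 0x04 */
};
#pragma pack(pop)

#define UART_STATUS_TX_READY 0x01u  /* hypothetical status bit */

/* The overlay is only useful if the offsets match the device exactly. */
_Static_assert(offsetof(struct UartRegs, status) == 0x01, "status offset");
_Static_assert(offsetof(struct UartRegs, baud_div) == 0x02, "baud_div offset");
_Static_assert(offsetof(struct UartRegs, control) == 0x04, "control offset");
_Static_assert(sizeof(struct UartRegs) == 8, "register block size");

/* Writes one byte if the transmitter is ready; returns 1 on success, 0 otherwise. */
int uart_try_send(struct UartRegs *uart, uint8_t byte)
{
    assert(uart != NULL);
    if ((uart->status & UART_STATUS_TX_READY) == 0)
        return 0;
    uart->data = byte;
    return 1;
}
```

One thing worth checking on a real target: because packing lowers the struct's alignment, some compilers may split a wide register access into smaller ones, which certain devices do not tolerate. The offset assertions are the portable part of the pattern; whether the pragma itself is appropriate depends on the hardware.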










































              answered Jul 23 '10 at 13:22









              Clifford
























                  2














A compiler may align structure members to achieve maximum performance on a given platform. The #pragma pack directive lets you control that alignment. Usually you should leave it at the default for optimum performance. If you need to pass a structure to a remote machine, you will generally use #pragma pack(1) to exclude any unwanted padding.
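As a rough sketch of the "pass a structure to a remote machine" case: the header below is hypothetical (the field names, the 7-byte size and the choice of big-endian byte order are all assumptions for the example). Packing fixes the positions of the fields, but byte order still has to be handled separately, so the values are converted explicitly before the struct is copied out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
struct wire_header {           /* hypothetical message header */
    uint8_t  type;
    uint16_t length;           /* stored big-endian on the wire (assumed) */
    uint32_t sequence;         /* stored big-endian on the wire (assumed) */
};
#pragma pack(pop)

_Static_assert(sizeof(struct wire_header) == 7, "header must match the 7-byte wire format");

/* Return a value whose in-memory bytes are v in big-endian order,
   regardless of the host's byte order. */
static uint16_t to_be16(uint16_t v) {
    uint8_t b[2] = { (uint8_t)(v >> 8), (uint8_t)v };
    uint16_t out;
    memcpy(&out, b, sizeof out);
    return out;
}

static uint32_t to_be32(uint32_t v) {
    uint8_t b[4] = { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
                     (uint8_t)(v >> 8),  (uint8_t)v };
    uint32_t out;
    memcpy(&out, b, sizeof out);
    return out;
}

/* Fill a header and copy it into buf; returns the number of bytes written. */
size_t encode_header(uint8_t *buf, uint8_t type, uint16_t length, uint32_t sequence) {
    struct wire_header h;
    assert(buf != NULL);
    h.type = type;
    h.length = to_be16(length);
    h.sequence = to_be32(sequence);
    memcpy(buf, &h, sizeof h);
    return sizeof h;
}
```

The caller supplies a buffer of at least `sizeof(struct wire_header)` bytes and sends exactly the number of bytes returned.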






                      answered Jul 23 '10 at 13:24









                      Kirill V. Lyadvinsky
























                          1














You'd likely only want to use this if you were coding to some hardware (e.g. a memory-mapped device) which had strict requirements for register ordering and alignment.



                          However, this looks like a pretty blunt tool to achieve that end. A better approach would be to code a mini-driver in assembler and give it a C calling interface rather than fumbling around with this pragma.
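For comparison, the struct-overlay approach to a memory-mapped device might look roughly like the sketch below. The register names, offsets and the ready bit are invented for illustration and do not describe any real UART. The static assertions do most of the work here: they fail the build if the layout ever drifts from the offsets you expect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UART register block; offsets are made up for illustration. */
#pragma pack(push, 1)
struct uart_regs {
    volatile uint8_t  data;      /* offset 0: transmit/receive data */
    volatile uint8_t  status;    /* offset 1: status flags */
    volatile uint16_t divisor;   /* offset 2: baud-rate divisor */
    volatile uint32_t control;   /* offset 4: control bits */
};
#pragma pack(pop)

#define UART_STATUS_TX_READY 0x20u   /* assumed bit position */

_Static_assert(offsetof(struct uart_regs, status)  == 1, "status at offset 1");
_Static_assert(offsetof(struct uart_regs, divisor) == 2, "divisor at offset 2");
_Static_assert(offsetof(struct uart_regs, control) == 4, "control at offset 4");
_Static_assert(sizeof(struct uart_regs) == 8, "register block is 8 bytes");

/* Non-zero when the (hypothetical) transmitter can accept a byte. */
int uart_tx_ready(const struct uart_regs *u) {
    return (u->status & UART_STATUS_TX_READY) != 0;
}

/* Write one byte if the transmitter is ready; returns 1 on success, 0 otherwise. */
int uart_putc(struct uart_regs *u, uint8_t c) {
    assert(u != NULL);
    if (!uart_tx_ready(u)) {
        return 0;
    }
    u->data = c;
    return 1;
}
```

On real hardware the pointer would come from the device's base address. One caveat: on some targets a compiler may access members of a packed struct one byte at a time, which can matter for hardware registers, so check the generated code.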






                          answered Jul 23 '10 at 13:20









                          msw













                          • I actually use it quite a lot to save space in large tables which are not accessed frequently. There, it's only to save space and not for any strict alignment. (Just voted you up, btw. Someone had given you a negative vote.)
                            – Todd Lehman
                            Apr 15 '15 at 21:27

































                          1














I've used it in code before, though only to interface with legacy code. This was a Mac OS X Cocoa application that needed to load preference files from an earlier Carbon version (which was itself backwards-compatible with the original M68k System 6.5 version... you get the idea). The preference files in the original version were a binary dump of a configuration structure, which used #pragma pack(1) to avoid taking up extra space and saving junk (i.e. the padding bytes that would otherwise be in the structure).

The original authors of the code had also used #pragma pack(1) for structures that were used as messages in inter-process communication. I think the reason here was to avoid the possibility of unknown or changed padding sizes, as the code sometimes looked at a specific portion of the message struct by counting a number of bytes in from the start (ewww).
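If code really has to pick a field out of a raw message buffer, `offsetof` is a less fragile alternative to a hand-counted byte offset. This is only a sketch with a made-up message layout (`ipc_message` and its fields are hypothetical), and it assumes sender and receiver share the same byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
struct ipc_message {           /* hypothetical message layout */
    uint8_t  kind;
    uint32_t sender;
    uint16_t payload_len;
};
#pragma pack(pop)

/* Read payload_len out of a raw byte buffer holding an ipc_message.
   offsetof stays in sync with the struct definition, unlike a
   hand-counted constant such as 5. */
uint16_t message_payload_len(const uint8_t *raw) {
    uint16_t len;
    assert(raw != NULL);
    /* memcpy avoids an unaligned or type-punned read from the buffer. */
    memcpy(&len, raw + offsetof(struct ipc_message, payload_len), sizeof len);
    return len;
}
```

If a field is later added in front of `payload_len`, the function keeps working without anyone recounting bytes.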






                              answered Jul 23 '10 at 13:25







                              user23743






























                                  1














I have seen people use it to make sure that a structure takes up a whole cache line, to prevent false sharing in a multithreaded context. Also, if you are going to have a large number of objects that would be loosely packed by default, packing them tighter could save memory and improve cache performance, though unaligned memory access will usually slow things down, so there might be a downside.
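Both ideas can be sketched in a few lines. Note that for the cache-line case this sketch uses the C11 `_Alignas` specifier rather than `#pragma pack`, since packing can only lower alignment, not raise it. The 64-byte line size and all type names are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64          /* assumed cache-line size; check your target */

/* One counter per thread; the alignment makes each one occupy its own
   cache line, so neighbouring counters in an array never share a line. */
struct padded_counter {
    _Alignas(CACHE_LINE) uint64_t count;
};
_Static_assert(sizeof(struct padded_counter) == CACHE_LINE, "one counter per cache line");

/* A rarely accessed table entry, packed to save memory. */
#pragma pack(push, 1)
struct table_entry {
    uint8_t  kind;
    uint32_t id;
};
#pragma pack(pop)
_Static_assert(sizeof(struct table_entry) == 5, "no padding between kind and id");

/* Bytes needed for a table of n packed entries. */
size_t table_bytes(size_t n) {
    assert(n <= SIZE_MAX / sizeof(struct table_entry));   /* guard against overflow */
    return n * sizeof(struct table_entry);
}
```

With 1000 entries the packed table needs 5000 bytes. An unpacked entry would be 8 bytes wherever `uint32_t` is 4-byte aligned, at the price of `id` being misaligned in the packed version.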






                                      answered Jul 23 '10 at 13:26









                                      stonemetal
























                                          0














Note that there are other ways of achieving the data consistency that #pragma pack offers (for instance, some people use #pragma pack(1) for structures that are sent across the network). For instance, see the following code and its output:

    #include <stdio.h>

    struct a {
        char one;
        char two[2];
        char eight[8];
        char four[4];
    };

    struct b {
        char one;
        short two;
        long int eight;
        int four;
    };

    int main(void) {
        struct a twoa[2] = {0};
        struct b twob[2] = {0};
        /* sizeof yields a size_t, so print it with %zu */
        printf("sizeof(struct a): %zu, sizeof(struct b): %zu\n", sizeof(struct a), sizeof(struct b));
        printf("sizeof(twoa): %zu, sizeof(twob): %zu\n", sizeof(twoa), sizeof(twob));
        return 0;
    }

The output (on a typical 64-bit Linux system, where long is 8 bytes) is as follows:

    sizeof(struct a): 15, sizeof(struct b): 24
    sizeof(twoa): 30, sizeof(twob): 48

Notice how the size of struct a is exactly the byte count of its members, but struct b has padding added (the compiler inserts it to satisfy the members' alignment requirements). Doing this instead of using #pragma pack gives you explicit control over converting the "wire format" into the appropriate types, for instance "char two[2]" into a "short int", and so on.
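The conversion step itself could be sketched as follows. The helper names (`read_be16` and friends) are hypothetical, and the sketch assumes the wire format is big-endian (network byte order). It uses `unsigned char` arrays so the shifts never see negative values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wire layout described purely with byte arrays: no pragma needed.
   On mainstream compilers this struct has no padding. */
struct wire_record {           /* hypothetical name; mirrors struct a above */
    unsigned char one;
    unsigned char two[2];
    unsigned char eight[8];
    unsigned char four[4];
};

/* Assemble a big-endian 16-bit value from two bytes. */
uint16_t read_be16(const unsigned char *p) {
    assert(p != NULL);
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Assemble a big-endian 32-bit value from four bytes. */
uint32_t read_be32(const unsigned char *p) {
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Assemble a big-endian 64-bit value from eight bytes. */
uint64_t read_be64(const unsigned char *p) {
    return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}
```

Because the values are built with shifts, the result is the same on little-endian and big-endian hosts, and no misaligned access ever happens.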






                                          answered Feb 9 '16 at 16:19









                                          wangchow













                                          • No, this is wrong. If you look at the position in memory of b.two, it's not one byte after b.one (the compiler can, and often will, align b.two for word access). a.two, on the other hand, is exactly one byte after a.one. If you need to access a.two as a short int, you have two alternatives: either use a union (but this usually fails if you have endianness issues), or unpack/convert in code (using the appropriate ntohX function).
                                            – xryl669
                                            Sep 13 '16 at 15:39

















