[ACCEPTED]-Types of endianness-endianness
There are two approaches to endian mapping: address invariance and data invariance.
Address Invariance
In this type of mapping, the address of bytes is always preserved between big and little endian. This has the side effect of reversing the order of significance (most significant to least significant) of a particular datum (e.g. a 2- or 4-byte word) and therefore the interpretation of data. Specifically, in little-endian, the data is interpreted from least-significant to most-significant byte, whilst in big-endian it is interpreted from most-significant to least-significant byte. In both cases, the set of bytes accessed remains the same.
Example
Address invariance (also known as byte invariance): the byte address is constant but byte significance is reversed.
Addr   Memory
       7    0
       |    |    (LE)   (BE)
       |----|
 +0    | aa |    lsb    msb
       |----|
 +1    | bb |     :      :
       |----|
 +2    | cc |     :      :
       |----|
 +3    | dd |    msb    lsb
       |----|
       |    |
At Addr=0:      Little-endian     Big-endian
Read 1 byte:    0xaa              0xaa         (preserved)
Read 2 bytes:   0xbbaa            0xaabb
Read 4 bytes:   0xddccbbaa        0xaabbccdd
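The table above can be reproduced with a minimal C sketch. The helper names (`load_le16`, `load_be16`, `load_le32`, `load_be32`) are illustrative and not taken from any library; they assemble the value byte by byte, so the result does not depend on the endianness of the machine running the code.

```c
#include <stdint.h>

/* Hypothetical helpers: interpret the bytes starting at p as a
 * little-endian (LE) or big-endian (BE) value. The same bytes are
 * accessed in both cases; only their significance differs. */
uint16_t load_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));      /* p[0] is the lsb */
}

uint16_t load_be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);      /* p[0] is the msb */
}

uint32_t load_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

uint32_t load_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

With `mem = {0xaa, 0xbb, 0xcc, 0xdd}`, `load_le32(mem)` yields 0xddccbbaa and `load_be32(mem)` yields 0xaabbccdd, matching the 4-byte row.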
Data Invariance
In this type of mapping, the relative byte significance is preserved for a datum of a particular size. There are therefore different types of data-invariant endian mappings for different datum sizes. For example, a 32-bit word-invariant endian mapping would be used for a datum size of 32 bits. The effect of preserving the value of a particular sized datum is that the byte addresses of bytes within the datum are reversed between big and little endian mappings.
Example
32-bit data invariance (also known as word invariance): the datum is a 32-bit word which always has the value 0xddccbbaa, independent of endianness. However, for accesses smaller than a word, the addresses of the bytes are reversed between big and little endian mappings.
Addr           Memory
            | +3   +2   +1   +0 |  <- LE
            |-------------------|
 +0     msb | dd | cc | bb | aa |  lsb
            |-------------------|
 +4     msb | 99 | 88 | 77 | 66 |  lsb
            |-------------------|
     BE ->  | +0   +1   +2   +3 |
At Addr=0:      Little-endian         Big-endian
Read 1 byte:    0xaa                  0xdd
Read 2 bytes:   0xbbaa                0xddcc
Read 4 bytes:   0xddccbbaa            0xddccbbaa           (preserved)
Read 8 bytes:   0x99887766ddccbbaa    0x99887766ddccbbaa   (preserved)
Example
16-bit data invariance (also known as half-word invariance): the datum is a 16-bit half-word which always has the value 0xbbaa, independent of endianness. However, for accesses smaller than a half-word, the addresses of the bytes are reversed between big and little endian mappings.
Addr        Memory
            | +1   +0 |  <- LE
            |---------|
 +0     msb | bb | aa |  lsb
            |---------|
 +2     msb | dd | cc |  lsb
            |---------|
 +4     msb | 77 | 66 |  lsb
            |---------|
 +6     msb | 99 | 88 |  lsb
            |---------|
     BE ->  | +0   +1 |
At Addr=0:      Little-endian         Big-endian
Read 1 byte:    0xaa                  0xbb
Read 2 bytes:   0xbbaa                0xbbaa               (preserved)
Read 4 bytes:   0xddccbbaa            0xddccbbaa           (preserved)
Read 8 bytes:   0x99887766ddccbbaa    0x99887766ddccbbaa   (preserved)
Example
64-bit data invariance (also known as double-word invariance): the datum is a 64-bit word which always has the value 0x99887766ddccbbaa, independent of endianness. However, for accesses smaller than a double-word, the addresses of the bytes are reversed between big and little endian mappings.
Addr                     Memory
            | +7   +6   +5   +4   +3   +2   +1   +0 |  <- LE
            |---------------------------------------|
 +0     msb | 99 | 88 | 77 | 66 | dd | cc | bb | aa |  lsb
            |---------------------------------------|
     BE ->  | +0   +1   +2   +3   +4   +5   +6   +7 |
At Addr=0:      Little-endian         Big-endian
Read 1 byte:    0xaa                  0x99
Read 2 bytes:   0xbbaa                0x9988
Read 4 bytes:   0xddccbbaa            0x99887766
Read 8 bytes:   0x99887766ddccbbaa    0x99887766ddccbbaa   (preserved)
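The address reversal in the three data-invariant tables above can be expressed as a single address translation. The following is a rough sketch, not a description of any particular bus or CPU: `be_to_le_addr` and `read_le` are hypothetical names, sizes are assumed to be powers of two, and accesses are assumed to be aligned to their own size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: translate a big-endian address into the equivalent
 * little-endian address under a data-invariant mapping.
 * datum_size and access_size are in bytes and assumed to be powers of two;
 * addr is assumed to be aligned to access_size. */
size_t be_to_le_addr(size_t addr, size_t datum_size, size_t access_size) {
    if (access_size >= datum_size)
        return addr;                          /* whole datum: address preserved */
    return addr ^ (datum_size - access_size); /* mirror within the datum */
}

/* Read access_size bytes (at most 8) starting at a little-endian address,
 * least significant byte first. */
uint64_t read_le(const uint8_t *mem, size_t addr, size_t access_size) {
    uint64_t value = 0;
    for (size_t i = 0; i < access_size; i++)
        value |= (uint64_t)mem[addr + i] << (8 * i);
    return value;
}
```

For the 32-bit case, a big-endian byte read at address 0 becomes a little-endian read at address 0 ^ 3 = 3, which holds 0xdd, as in the word-invariance table.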
There's also middle or mixed endian. See Wikipedia for details.
The only time I had to worry about this was when writing some networking code in C. Networking typically uses big-endian, IIRC. Most languages either abstract the whole thing or offer libraries to guarantee that you're using the right endianness, though.
Philibert said,
bits were actually inverted
I doubt any architecture would break byte value invariance. The order of bit-fields may need inversion when mapping structs containing them against data. Such direct mapping relies on compiler specifics which are outside the C99 standard but which may still be common. Direct mapping is faster but does not comply with the C99 standard, which does not stipulate packing, alignment and byte order. C99-compliant code should use slow mapping based on values rather than addresses. That is, instead of doing this,
#if LITTLE_ENDIAN
struct breakdown_t {
    int least_significant_bit: 1;
    int middle_bits: 10;
    int most_significant_bits: 21;
};
#elif BIG_ENDIAN
struct breakdown_t {
    int most_significant_bits: 21;
    int middle_bits: 10;
    int least_significant_bit: 1;
};
#else
#error Huh
#endif
uint32_t data = ...;
struct breakdown_t *b = (struct breakdown_t *)&data;
one should write this (and this is how the compiler would generate code anyway, even for the above "direct mapping"),
uint32_t data = ...;
uint32_t least_significant_bit = data & 0x00000001;
uint32_t middle_bits = (data >> 1) & 0x000003FF;
uint32_t most_significant_bits = (data >> 11) & 0x001fffff;
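For completeness, the shifts and masks above can be wrapped into a pair of functions. This is only a sketch: the `breakdown_fields` struct and the `unpack_breakdown`/`pack_breakdown` names are made up for illustration. Because the code works on values rather than addresses, it behaves the same on little- and big-endian hosts.

```c
#include <stdint.h>

/* Illustrative container using plain members, so no compiler-specific
 * bit-field layout is involved. */
struct breakdown_fields {
    uint32_t least_significant_bit;   /* bit 0       */
    uint32_t middle_bits;             /* bits 1..10  */
    uint32_t most_significant_bits;   /* bits 11..31 */
};

/* Split a 32-bit value into its three fields by shifting and masking. */
struct breakdown_fields unpack_breakdown(uint32_t data) {
    struct breakdown_fields f;
    f.least_significant_bit = data & 0x00000001u;
    f.middle_bits           = (data >> 1) & 0x000003FFu;
    f.most_significant_bits = (data >> 11) & 0x001FFFFFu;
    return f;
}

/* Reassemble the 32-bit value; each field is masked to its width first. */
uint32_t pack_breakdown(struct breakdown_fields f) {
    return (f.least_significant_bit & 0x00000001u) |
           ((f.middle_bits & 0x000003FFu) << 1) |
           ((f.most_significant_bits & 0x001FFFFFu) << 11);
}
```

Packing the result of `unpack_breakdown(x)` returns `x` for any 32-bit `x`, since the three fields cover all 32 bits without overlap.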
The reason behind the need to invert the order of bit-fields in each endian-neutral, application-specific data storage unit is that compilers pack bit-fields into bytes of growing addresses.
The "order of bits" in each byte does not matter, as the only way to extract them is by applying masks of values and by shifting in the least-significant-bit or most-significant-bit direction. The "order of bits" issue would only become important in imaginary architectures with the notion of bit addresses. I believe all existing architectures hide this notion in hardware and provide only least vs. most significant bit extraction, which is the notion based on the endian-neutral byte values.
Actually, I'd describe the endianness of a machine as the order of bytes inside of a word, and not the order of bits.
By "bytes" up there I mean the "smallest unit of memory the architecture can manage individually". So, if the smallest unit is 16 bits long (what in x86 would be called a word) then a 32 bit "word" representing the value 0xFFFF0000 could be stored like this:
FFFF 0000
or this:
0000 FFFF
in memory, depending on endianness.
So, if you have 8-bit endianness, it means that every word consisting of 16 bits will be stored as:
FF 00
or:
00 FF
and so on.
Practically speaking, endianness refers to the way the processor will interpret the content of a given memory location. For example, if we have memory location 0x100 with the following content (hex bytes)
0x100:   12 34 56 78 90 ab cd ef
Reads    Little Endian              Big Endian
8-bit:   12                         12
16-bit:  34 12                      12 34
32-bit:  78 56 34 12                12 34 56 78
64-bit:  ef cd ab 90 78 56 34 12    12 34 56 78 90 ab cd ef
The two situations where you need to mind endianness are networking code and downcasting with pointers.
TCP/IP specifies that data on the wire should be big endian. If you transmit types other than byte arrays (such as structures), you should make sure to use the ntoh/hton macros to ensure the data is sent big endian. If you send from a little-endian processor to a big-endian processor (or vice versa) without converting, the data will be garbled...
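As a small sketch of the hton/ntoh conversion, assuming a POSIX system where `htonl` and `ntohl` are declared in `<arpa/inet.h>`; the wrapper names `put_net32` and `get_net32` are hypothetical:

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX; assumed available) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store value into buf in network (big-endian) order. */
void put_net32(uint8_t buf[4], uint32_t value) {
    uint32_t wire = htonl(value);      /* host order -> network order */
    memcpy(buf, &wire, sizeof wire);
}

/* Hypothetical helper: read a network-order 32-bit value back from buf. */
uint32_t get_net32(const uint8_t buf[4]) {
    uint32_t wire;
    memcpy(&wire, buf, sizeof wire);
    return ntohl(wire);                /* network order -> host order */
}
```

After `put_net32(buf, 0x12345678)`, the buffer holds the bytes 12 34 56 78 regardless of the host's endianness, and `get_net32(buf)` returns the original value.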
Casting issues:
uint32_t *lptr = (uint32_t *)0x100;
uint16_t data;
*lptr = 0x0000FFFF;
data = *((uint16_t *)lptr);
What will be the value of data? On a big-endian system, it would be 0. On a little-endian system, it would be FFFF.
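The same effect can be observed on an ordinary variable instead of a fixed address such as 0x100. The sketch below uses `memcpy` rather than a pointer cast to stay clear of aliasing problems; the function names are illustrative, and mixed-endian hosts are ignored.

```c
#include <stdint.h>
#include <string.h>

/* Return the 16-bit value stored at the lowest address of a 32-bit object. */
uint16_t low_address_half(uint32_t value) {
    uint16_t half;
    memcpy(&half, &value, sizeof half);   /* copies the first two bytes */
    return half;
}

/* On a little-endian host the low-order half sits at the lowest address,
 * so 0x0000FFFF yields 0xFFFF; on a big-endian host it yields 0. */
int host_is_little_endian(void) {
    return low_address_half(0x0000FFFFu) == 0xFFFFu;
}
```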
13 years ago I worked on a tool portable to both a DEC ALPHA system and a PC. On this DEC ALPHA the bits were actually inverted. That is:
1010 0011
actually translated to
1100 0101
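Taken literally, the transformation shown above reverses the order of the bits within one byte. A minimal sketch of that operation (the name `reverse_bits8` is made up for illustration):

```c
#include <stdint.h>

/* Reverse the order of the 8 bits in a byte: bit 0 becomes bit 7, etc. */
uint8_t reverse_bits8(uint8_t b) {
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (b & 1));  /* append the current lowest bit */
        b >>= 1;
    }
    return r;
}
```

Here `reverse_bits8(0xA3)` (1010 0011) returns 0xC5 (1100 0101).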
It was almost transparent and seamless in the C code except that I had a bitfield declared like
typedef struct {
    int firstbit:1;
    int middlebits:10;
    int lastbits:21;
};
that needed to be translated to (using #ifdef conditional compiling)
typedef struct {
    int lastbits:21;
    int middlebits:10;
    int firstbit:1;
};
As @erik-van-brakel answered on this post, be careful when communicating with certain PLCs: mixed-endian is still alive!
Indeed, I need to communicate with a PLC (from a well-known manufacturer) with the (Modbus-TCP) OPC protocol, and it seems to return mixed-endian data on every half word. So it is still used by some of the larger manufacturers.
Here is an example with the "pieces" string:
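As a rough sketch of what such a response can look like, assuming (hypothetically) that the device swaps the two bytes inside every 16-bit register, the string can be restored by swapping each byte pair back. The function name is illustrative and the exact behaviour of a given PLC should be checked against its documentation.

```c
#include <stddef.h>

/* Swap the two bytes of every 16-bit unit in buf, in place.
 * A trailing odd byte, if any, is left untouched. */
void swap_bytes_per_halfword(char *buf, size_t len) {
    for (size_t i = 0; i + 1 < len; i += 2) {
        char tmp = buf[i];
        buf[i] = buf[i + 1];
        buf[i + 1] = tmp;
    }
}
```

Under that assumption, "pieces" would arrive as "ipcese", and applying the function to the received bytes restores "pieces".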