Endian Buffer Types
Contents
Introduction, Example, Limitations, Feature set, Enums and typedefs, Class template endian_buffer, Synopsis, Members, Non-Members, FAQ, Design, C++11, Compilation
The internal byte order of arithmetic types is traditionally called endianness. See Wikipedia for a full exploration of endianness, including definitions of big endian and little endian.
Header boost/endian/buffers.hpp provides endian_buffer, a portable endian integer binary buffer class template with control over byte order, value type, size, and alignment independent of the platform's native endianness. Typedefs provide easy-to-use names for common configurations.
Use cases primarily involve data portability, either via files or network connections, but these byte-holders may also be used to reduce memory use, file size, or network activity since they provide binary numeric sizes not otherwise available.
Class endian_buffer is aimed at users who want explicit control over when endianness conversions occur. It also serves as the base class for the endian_arithmetic class template, which is aimed at users who want fully automatic endianness conversion and direct support for all normal arithmetic operations.
The example/endian_example.cpp program writes a binary file containing four-byte, big-endian and little-endian integers:

  #include <iostream>
  #include <cstdio>
  #include <boost/endian/buffers.hpp>  // see Synopsis below
  #include <boost/static_assert.hpp>

  using namespace boost::endian;

  namespace
  {
    // This is an extract from a very widely used GIS file format.
    // Why the designer decided to mix big and little endians in
    // the same file is not known. But this is a real-world format
    // and users wishing to write low level code manipulating these
    // files have to deal with the mixed endianness.

    struct header
    {
      big_int32_buf_t     file_code;
      big_int32_buf_t     file_length;
      little_int32_buf_t  version;
      little_int32_buf_t  shape_type;
    };

    const char* filename = "test.dat";
  }

  int main(int, char* [])
  {
    header h;

    BOOST_STATIC_ASSERT(sizeof(h) == 16U);  // reality check

    h.file_code   = 0x01020304;
    h.file_length = sizeof(header);
    h.version     = 1;
    h.shape_type  = 0x01020304;

    // Low-level I/O such as POSIX read/write or <cstdio>
    // fread/fwrite is sometimes used for binary file operations
    // when ultimate efficiency is important. Such I/O is often
    // performed in some C++ wrapper class, but to drive home the
    // point that endian integers are often used in fairly
    // low-level code that does bulk I/O operations, <cstdio>
    // fopen/fwrite is used for I/O in this example.

    std::FILE* fi = std::fopen(filename, "wb");  // MUST BE BINARY

    if (!fi)
    {
      std::cout << "could not open " << filename << '\n';
      return 1;
    }

    if (std::fwrite(&h, sizeof(header), 1, fi) != 1)
    {
      std::cout << "write failure for " << filename << '\n';
      std::fclose(fi);  // release the handle on the error path too
      return 1;
    }

    std::fclose(fi);

    std::cout << "created file " << filename << '\n';

    return 0;
  }
After compiling and executing example/endian_example.cpp, a hex dump of test.dat shows:
01020304 00000010 01000000 04030201
Notice that the first two 32-bit integers are big endian while the second two are little endian, even though the machine this was compiled and run on was little endian.
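The byte layout in that dump can be reproduced without the library. The following is a minimal sketch, assuming hypothetical helper names (append_big32, append_little32, to_hex, header_bytes) that are not part of Boost.Endian; it writes each field in an explicit byte order using shifts, so the result does not depend on the host's endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helpers, not part of Boost.Endian: append a 32-bit value
// to a byte string in an explicit byte order, regardless of host order.
inline void append_big32(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)      // most significant byte first
        out.push_back(static_cast<char>((v >> shift) & 0xFFu));
}

inline void append_little32(std::string& out, std::uint32_t v) {
    for (int shift = 0; shift <= 24; shift += 8)      // least significant byte first
        out.push_back(static_cast<char>((v >> shift) & 0xFFu));
}

// Render bytes as lowercase hex, two digits per byte.
inline std::string to_hex(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    for (unsigned char c : bytes) {
        hex.push_back(digits[c >> 4]);
        hex.push_back(digits[c & 0x0F]);
    }
    return hex;
}

// Build the same 16 bytes that the example's header struct holds.
inline std::string header_bytes() {
    std::string out;
    append_big32(out, 0x01020304u);     // file_code
    append_big32(out, 16u);             // file_length == sizeof(header)
    append_little32(out, 1u);           // version
    append_little32(out, 0x01020304u);  // shape_type
    return out;
}
```

Calling to_hex(header_bytes()) yields the four groups shown in the dump, concatenated without spaces.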
Requires <climits> CHAR_BIT == 8. If CHAR_BIT is some other value, compilation will result in an #error. This restriction is in place because the design, implementation, testing, and documentation have only considered issues related to 8-bit bytes, and there have been no real-world use cases presented for other sizes.
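A guard of this kind can be sketched in a few lines. This is an illustration of the technique under stated assumptions, not the library's actual source; bytes_for_bits is a hypothetical helper introduced only for the example.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Refuse to compile on platforms whose bytes are not 8 bits wide.
#if CHAR_BIT != 8
#  error "this sketch assumes 8-bit bytes (CHAR_BIT == 8)"
#endif

// Hypothetical helper: number of bytes occupied by a buffer of nbits bits,
// where nbits is a multiple of CHAR_BIT.
constexpr std::size_t bytes_for_bits(std::size_t nbits) {
    return nbits / CHAR_BIT;
}

static_assert(bytes_for_bits(24) == 3, "a 24-bit buffer occupies three bytes");
```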
In C++03, endian_buffer does not meet the requirements for POD types because it has constructors, private data members, and a base class. Common use cases therefore rely on unspecified behavior, since the C++ Standard does not guarantee memory layout for non-POD types. This has not been a problem in practice because all known C++ compilers lay out memory as if endian_buffer were a POD type. In C++11, the default constructor can be specified as trivial, and private data members and base classes no longer disqualify a type from being a POD type. Thus under C++11, endian_buffer no longer relies on unspecified behavior.
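The C++11 rules can be checked with <type_traits>. The sketch below uses a hypothetical class, demo_buf, standing in for a buffer-like type; it is not the library's definition. It has a defaulted default constructor, a converting constructor, and a private data member, and it still qualifies as trivial and standard-layout.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a buffer-like class; not the library's definition.
class demo_buf {
public:
    demo_buf() noexcept = default;              // defaulted, hence trivial
    explicit demo_buf(std::uint32_t v) noexcept {
        // Store the value most significant byte first.
        bytes_[0] = static_cast<unsigned char>(v >> 24);
        bytes_[1] = static_cast<unsigned char>(v >> 16);
        bytes_[2] = static_cast<unsigned char>(v >> 8);
        bytes_[3] = static_cast<unsigned char>(v);
    }
private:
    unsigned char bytes_[4];                    // private data member
};

// Under C++11 neither the extra constructor nor the private member
// prevents the type from being trivial and standard-layout.
static_assert(std::is_trivial<demo_buf>::value, "demo_buf should be trivial");
static_assert(std::is_standard_layout<demo_buf>::value,
              "demo_buf should be standard-layout");
```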
Two scoped enums are provided:
  enum class order {big, little, native};
  enum class align {no, yes};
One class template is provided:
template <order Order, typename T, std::size_t Nbits, align Align = align::no> class endian_buffer;
Typedefs, such as big_int32_buf_t, provide convenient naming conventions for common use cases:

  Name                 Alignment  Endianness  Sign      Sizes in bits (n)
  big_intn_buf_t       no         big         signed    8,16,24,32,40,48,56,64
  big_uintn_buf_t      no         big         unsigned  8,16,24,32,40,48,56,64
  little_intn_buf_t    no         little      signed    8,16,24,32,40,48,56,64
  little_uintn_buf_t   no         little      unsigned  8,16,24,32,40,48,56,64
  native_intn_buf_t    no         native      signed    8,16,24,32,40,48,56,64
  native_uintn_buf_t   no         native      unsigned  8,16,24,32,40,48,56,64
  big_intn_buf_at      yes        big         signed    8,16,32,64
  big_uintn_buf_at     yes        big         unsigned  8,16,32,64
  little_intn_buf_at   yes        little      signed    8,16,32,64
  little_uintn_buf_at  yes        little      unsigned  8,16,32,64
The unaligned types do not cause compilers to insert padding bytes in classes and structs. This is an important characteristic that can be exploited to minimize wasted space in memory, files, and network transmissions.
Warning: Code that uses aligned types is possibly non-portable because alignment requirements vary between hardware architectures and because alignment may be affected by compiler switches or pragmas. For example, alignment of a 64-bit integer may be to a 32-bit boundary on a 32-bit machine and to a 64-bit boundary on a 64-bit machine. Furthermore, aligned types are only available on architectures with 8, 16, 32, and 64-bit integer types.
Recommendation: Prefer unaligned buffer types.
Recommendation: Protect yourself against alignment ills. For example:
static_assert(sizeof(containing_struct) == 12, "sizeof(containing_struct) is wrong");
Note: One-byte big and little buffer types have identical layout on all platforms, so they never actually reverse endianness. They are provided to enable generic code, and to improve code readability and searchability.
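The padding point made above about unaligned types can be illustrated with plain byte arrays. In this sketch, unaligned_record and aligned_record are hypothetical types introduced only for the comparison. The byte-array record has no alignment requirement beyond 1, so mainstream compilers add no padding, while the built-in integer members of the second record usually force padding after the first field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical records for illustration only.
struct unaligned_record {
    unsigned char tag[1];    // 8-bit field
    unsigned char id[4];     // 32-bit field held as raw bytes
    unsigned char count[3];  // 24-bit field held as raw bytes
};

struct aligned_record {
    std::uint8_t  tag;
    std::uint32_t id;        // the compiler may insert padding before this member
    std::uint32_t count;     // no built-in 24-bit type, so 32 bits are used
};

// Guard the layout assumption, as recommended above.
static_assert(sizeof(unaligned_record) == 8, "sizeof(unaligned_record) is wrong");
static_assert(alignof(unaligned_record) == 1, "byte arrays need no alignment");
```

On common platforms sizeof(aligned_record) is 12, but that figure depends on the target and compiler settings, which is why the static_assert is applied only to the byte-array form.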
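Class template endian_buffer
An endian_buffer is an integer byte-holder with user-specified endianness, value type, size, and alignment. Construction, assignment, value retrieval, and access to the underlying bytes are supplied; arithmetic operations are provided by the derived endian_arithmetic class template.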
  namespace boost
  {
  namespace endian
  {
    // C++11 features emulated if not available

    enum class order
    {
      big,                            // big-endian
      little,                         // little-endian
      native = implementation-defined // same as order::big or order::little
    };

    enum class align {no, yes};

    template <order Order, class T, std::size_t Nbits, align Align = align::no>
    class endian_buffer
    {
    public:
      typedef T value_type;

      endian_buffer() noexcept = default;
      explicit endian_buffer(T v) noexcept;

      endian_buffer& operator=(T v) noexcept;
      value_type     value() const noexcept;
      const char*    data() const noexcept;
    protected:
      implementation-defined endian_value;  // for exposition only
    };

    // stream inserter
    template <class charT, class traits, order Order, class T,
      std::size_t n_bits, align Align>
    std::basic_ostream<charT, traits>&
      operator<<(std::basic_ostream<charT, traits>& os,
        const endian_buffer<Order, T, n_bits, Align>& x);

    // stream extractor
    template <class charT, class traits, order Order, class T,
      std::size_t n_bits, align Align>
    std::basic_istream<charT, traits>&
      operator>>(std::basic_istream<charT, traits>& is,
        endian_buffer<Order, T, n_bits, Align>& x);

    // typedefs

    // unaligned big endian signed integer buffers
    typedef endian_buffer<order::big, int_least8_t, 8>   big_int8_buf_t;
    typedef endian_buffer<order::big, int_least16_t, 16> big_int16_buf_t;
    typedef endian_buffer<order::big, int_least32_t, 24> big_int24_buf_t;
    typedef endian_buffer<order::big, int_least32_t, 32> big_int32_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 40> big_int40_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 48> big_int48_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 56> big_int56_buf_t;
    typedef endian_buffer<order::big, int_least64_t, 64> big_int64_buf_t;

    // unaligned big endian unsigned integer buffers
    typedef endian_buffer<order::big, uint_least8_t, 8>   big_uint8_buf_t;
    typedef endian_buffer<order::big, uint_least16_t, 16> big_uint16_buf_t;
    typedef endian_buffer<order::big, uint_least32_t, 24> big_uint24_buf_t;
    typedef endian_buffer<order::big, uint_least32_t, 32> big_uint32_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 40> big_uint40_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 48> big_uint48_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 56> big_uint56_buf_t;
    typedef endian_buffer<order::big, uint_least64_t, 64> big_uint64_buf_t;

    // unaligned little endian signed integer buffers
    typedef endian_buffer<order::little, int_least8_t, 8>   little_int8_buf_t;
    typedef endian_buffer<order::little, int_least16_t, 16> little_int16_buf_t;
    typedef endian_buffer<order::little, int_least32_t, 24> little_int24_buf_t;
    typedef endian_buffer<order::little, int_least32_t, 32> little_int32_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 40> little_int40_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 48> little_int48_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 56> little_int56_buf_t;
    typedef endian_buffer<order::little, int_least64_t, 64> little_int64_buf_t;

    // unaligned little endian unsigned integer buffers
    typedef endian_buffer<order::little, uint_least8_t, 8>   little_uint8_buf_t;
    typedef endian_buffer<order::little, uint_least16_t, 16> little_uint16_buf_t;
    typedef endian_buffer<order::little, uint_least32_t, 24> little_uint24_buf_t;
    typedef endian_buffer<order::little, uint_least32_t, 32> little_uint32_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 40> little_uint40_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 48> little_uint48_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 56> little_uint56_buf_t;
    typedef endian_buffer<order::little, uint_least64_t, 64> little_uint64_buf_t;

    // unaligned native endian signed integer types
    typedef implementation-defined_int8_buf_t  native_int8_buf_t;
    typedef implementation-defined_int16_buf_t native_int16_buf_t;
    typedef implementation-defined_int24_buf_t native_int24_buf_t;
    typedef implementation-defined_int32_buf_t native_int32_buf_t;
    typedef implementation-defined_int40_buf_t native_int40_buf_t;
    typedef implementation-defined_int48_buf_t native_int48_buf_t;
    typedef implementation-defined_int56_buf_t native_int56_buf_t;
    typedef implementation-defined_int64_buf_t native_int64_buf_t;

    // unaligned native endian unsigned integer types
    typedef implementation-defined_uint8_buf_t  native_uint8_buf_t;
    typedef implementation-defined_uint16_buf_t native_uint16_buf_t;
    typedef implementation-defined_uint24_buf_t native_uint24_buf_t;
    typedef implementation-defined_uint32_buf_t native_uint32_buf_t;
    typedef implementation-defined_uint40_buf_t native_uint40_buf_t;
    typedef implementation-defined_uint48_buf_t native_uint48_buf_t;
    typedef implementation-defined_uint56_buf_t native_uint56_buf_t;
    typedef implementation-defined_uint64_buf_t native_uint64_buf_t;

    // aligned big endian signed integer buffers
    typedef endian_buffer<order::big, int8_t, 8, align::yes>   big_int8_buf_at;
    typedef endian_buffer<order::big, int16_t, 16, align::yes> big_int16_buf_at;
    typedef endian_buffer<order::big, int32_t, 32, align::yes> big_int32_buf_at;
    typedef endian_buffer<order::big, int64_t, 64, align::yes> big_int64_buf_at;

    // aligned big endian unsigned integer buffers
    typedef endian_buffer<order::big, uint8_t, 8, align::yes>   big_uint8_buf_at;
    typedef endian_buffer<order::big, uint16_t, 16, align::yes> big_uint16_buf_at;
    typedef endian_buffer<order::big, uint32_t, 32, align::yes> big_uint32_buf_at;
    typedef endian_buffer<order::big, uint64_t, 64, align::yes> big_uint64_buf_at;

    // aligned little endian signed integer buffers
    typedef endian_buffer<order::little, int8_t, 8, align::yes>   little_int8_buf_at;
    typedef endian_buffer<order::little, int16_t, 16, align::yes> little_int16_buf_at;
    typedef endian_buffer<order::little, int32_t, 32, align::yes> little_int32_buf_at;
    typedef endian_buffer<order::little, int64_t, 64, align::yes> little_int64_buf_at;

    // aligned little endian unsigned integer buffers
    typedef endian_buffer<order::little, uint8_t, 8, align::yes>   little_uint8_buf_at;
    typedef endian_buffer<order::little, uint16_t, 16, align::yes> little_uint16_buf_at;
    typedef endian_buffer<order::little, uint32_t, 32, align::yes> little_uint32_buf_at;
    typedef endian_buffer<order::little, uint64_t, 64, align::yes> little_uint64_buf_at;

    // aligned native endian typedefs are not provided because
    // <cstdint> types are superior for this use case

  } // namespace endian
  } // namespace boost
The implementation-defined text in typedefs above is either big or little according to the native endianness of the platform.
The expository data member endian_value stores the current value of an endian_buffer object as a sequence of bytes ordered as specified by the Order template parameter. The implementation-defined type of endian_value is a type such as char[Nbits/CHAR_BIT] or T that meets the requirements imposed by the Nbits and Align template parameters. The CHAR_BIT macro is defined in <climits>. The only value of CHAR_BIT that is required to be supported is 8.
Template parameter T is required to be a standard integer type (C++std, 3.9.1) and sizeof(T)*CHAR_BIT is required to be greater than or equal to Nbits.
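The requirement on T can be expressed as a compile-time predicate. The following sketch assumes a hypothetical trait name, fits_in, which is not part of the library, and uses std::is_integral as an approximation of "standard integer type"; it shows how such a constraint could be checked before instantiating a buffer.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical trait: true when T is an integer type wide enough
// to hold Nbits bits, i.e. sizeof(T) * CHAR_BIT >= Nbits.
template <class T, std::size_t Nbits>
struct fits_in
    : std::integral_constant<bool,
          std::is_integral<T>::value && (sizeof(T) * CHAR_BIT >= Nbits)> {};

// A 24-bit buffer needs a value type of at least 24 bits.
static_assert(fits_in<std::int_least32_t, 24>::value,
              "int_least32_t can hold 24 bits");
// Assuming CHAR_BIT == 8, a single byte cannot hold 16 bits.
static_assert(!fits_in<unsigned char, 16>::value,
              "unsigned char is too narrow for 16 bits");
```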
endian_buffer() noexcept = default;
Effects: Constructs an uninitialized object of type endian_buffer<Order, T, Nbits, Align>.
explicit endian_buffer(T v) noexcept;
Effects: Constructs an object of type endian_buffer<Order, T, Nbits, Align>.
Postcondition: value() == (v & mask), where mask is a constant of type value_type with Nbits low-order bits set to one.
Remarks: If Align is align::yes then endianness conversion, if required, is performed by boost::endian::endian_reverse.
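The mask in the postcondition can be made concrete. This is a minimal sketch, assuming a hypothetical helper named low_mask that works on 64-bit unsigned values; it is not a library function. It shows that storing a value wider than Nbits keeps only the low-order Nbits bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a value with the nbits low-order bits set to one.
// nbits is expected to be in the range 1..64; the full-width case is
// handled separately because shifting by 64 would be undefined.
constexpr std::uint64_t low_mask(unsigned nbits) {
    return nbits >= 64 ? ~std::uint64_t(0)
                       : (std::uint64_t(1) << nbits) - 1;
}

// A 24-bit buffer keeps only the low three bytes of the stored value.
static_assert(low_mask(24) == 0xFFFFFFu, "24 low-order bits set");
static_assert((0x12345678u & low_mask(24)) == 0x345678u,
              "the top byte is discarded");
```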
endian_buffer& operator=(T v) noexcept;
Postcondition: value() == (v & mask), where mask is a constant of type value_type with Nbits low-order bits set to one.
Returns: *this.
Remarks: If Align is align::yes then endianness conversion, if required, is performed by boost::endian::endian_reverse.
value_type value() const noexcept;
Returns: endian_value, converted to value_type, if required, and having the endianness of the native platform.
Remarks: If Align is align::yes then endianness conversion, if required, is performed by boost::endian::endian_reverse.
const char* data() const noexcept;
Returns: A pointer to the first byte of endian_value.
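The relationship between operator=, value(), and data() can be sketched with a small stand-in class. big_u32_sketch is a hypothetical name and this is not the library's implementation; it fixes the order to big endian and the size to 32 bits to keep the example short.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 32-bit big-endian buffer.
class big_u32_sketch {
public:
    big_u32_sketch() noexcept = default;
    explicit big_u32_sketch(std::uint32_t v) noexcept { *this = v; }

    // Store v as four bytes, most significant byte first.
    big_u32_sketch& operator=(std::uint32_t v) noexcept {
        bytes_[0] = static_cast<unsigned char>(v >> 24);
        bytes_[1] = static_cast<unsigned char>(v >> 16);
        bytes_[2] = static_cast<unsigned char>(v >> 8);
        bytes_[3] = static_cast<unsigned char>(v);
        return *this;
    }

    // Reassemble the stored bytes into a native-order value.
    std::uint32_t value() const noexcept {
        return (std::uint32_t(bytes_[0]) << 24) | (std::uint32_t(bytes_[1]) << 16)
             | (std::uint32_t(bytes_[2]) << 8)  |  std::uint32_t(bytes_[3]);
    }

    // Pointer to the first stored byte, suitable for raw I/O calls.
    const char* data() const noexcept {
        return reinterpret_cast<const char*>(bytes_);
    }

private:
    unsigned char bytes_[4];
};
```

With b constructed from 0x01020304, b.value() returns the same number on any host, while b.data() points at the bytes 01 02 03 04 in that order.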
template <class charT, class traits, order Order, class T, std::size_t n_bits, align Align>
std::basic_ostream<charT, traits>& operator<<(std::basic_ostream<charT, traits>& os, const endian_buffer<Order, T, n_bits, Align>& x);
Returns: os << x.value().
template <class charT, class traits, order Order, class T, std::size_t n_bits, align Align>
std::basic_istream<charT, traits>& operator>>(std::basic_istream<charT, traits>& is, endian_buffer<Order, T, n_bits, Align>& x);
Effects: As if:
  T i;
  if (is >> i)
    x = i;
Returns: is.
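The "as if" wording for the extractor means the target is assigned only when the read succeeds. The sketch below demonstrates that behavior with a hypothetical holder type, int_holder, and a hypothetical helper, parse_or; neither is part of the library.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical holder with the same assign/value shape as a buffer type.
struct int_holder {
    int stored;
    int_holder& operator=(int v) { stored = v; return *this; }
    int value() const { return stored; }
};

// Extractor written exactly in the "as if" form: read into a temporary,
// then assign only if the read succeeded.
template <class charT, class traits>
std::basic_istream<charT, traits>&
operator>>(std::basic_istream<charT, traits>& is, int_holder& x) {
    int i;
    if (is >> i)
        x = i;
    return is;
}

// Parse text into a holder that starts out holding fallback.
inline int parse_or(const char* text, int fallback) {
    std::istringstream in(text);
    int_holder h;
    h = fallback;
    in >> h;            // leaves h untouched when parsing fails
    return h.value();
}
```

A failed extraction leaves the previous contents in place, so parse_or returns its fallback argument for non-numeric input.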
See the Endian home page FAQ for a library-wide FAQ.
Why not just use Boost.Serialization? Serialization involves a conversion for every object involved in I/O. Endian integers require no conversion or copying. They are already in the desired format for binary I/O. Thus they can be read or written in bulk.
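Bulk transfer works because the in-memory bytes are already in wire format. As a minimal sketch under stated assumptions, the following uses a hypothetical record type, wire_record, made of raw byte arrays, and hypothetical helpers make_record and bulk_bytes; a whole array of records is copied with a single std::memcpy and no per-object conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical record holding two 16-bit big-endian fields as raw bytes.
struct wire_record {
    unsigned char id[2];
    unsigned char len[2];
};
static_assert(sizeof(wire_record) == 4, "wire_record must have no padding");

// Fill a record, storing each field most significant byte first.
inline wire_record make_record(unsigned id, unsigned len) {
    wire_record r;
    r.id[0]  = static_cast<unsigned char>((id >> 8) & 0xFFu);
    r.id[1]  = static_cast<unsigned char>(id & 0xFFu);
    r.len[0] = static_cast<unsigned char>((len >> 8) & 0xFFu);
    r.len[1] = static_cast<unsigned char>(len & 0xFFu);
    return r;
}

// Copy every record into an output buffer in one operation.
inline std::vector<unsigned char>
bulk_bytes(const std::vector<wire_record>& records) {
    std::vector<unsigned char> out(records.size() * sizeof(wire_record));
    if (!records.empty())
        std::memcpy(out.data(), records.data(), out.size());
    return out;
}
```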
Are endian types PODs? Yes for C++11. No for C++03, although several macros are available to force PODness in all cases.
What are the implications of endian integer types not being PODs with C++03 compilers? They can't be used in unions. Also, compilers aren't required to align or lay out storage in portable ways, although this potential problem hasn't prevented use of Boost.Endian with real compilers.
What good is native endianness? It provides alignment and size guarantees not available from the built-in types. It eases generic programming.
Why bother with the aligned endian types? Aligned integer operations may be faster (as much as 10 to 20 times faster) if the endianness and alignment of the type matches the endianness and alignment requirements of the machine. The code, however, is likely to be somewhat less portable than with the unaligned types.
Why provide the arithmetic operations? Providing a full set of operations reduces program clutter and makes code both easier to write and to read. Consider incrementing a variable in a record. It is very convenient to write:
++record.foo;
Rather than:
int temp(record.foo); ++temp; record.foo = temp;
The availability of the C++11 Defaulted Functions feature is detected automatically, and will be used if present to ensure that objects of class endian_buffer are trivial, and thus PODs.
Boost.Endian is implemented entirely within headers, with no need to link to any Boost object libraries.
Several macros allow user control over features. One causes class endian_buffer to have no constructors. The intended use is for compiling user code that must be portable between compilers regardless of C++11 Defaulted Functions support. With it defined, use of constructors will always fail. Another ensures that objects of class endian_buffer are PODs, and so can be used in C++03 unions.
In C++11, class endian_buffer objects are PODs, even though they have constructors, so can always be used in unions.
Last revised: 25 March, 2015
© Copyright Beman Dawes, 2006-2009, 2013
Distributed under the Boost Software License, Version 1.0. See www.boost.org/LICENSE_1_0.txt