sbepp
In this section I'll try to gradually describe the structure of the generated code. It's just a brief description; for detailed documentation, see the corresponding reference pages.
For each schema compiled by sbeppc, there are two main sources of information: traits, which are accessed via tags, and representation types.
Sometimes traits and representation types offer functionality with the same name that is very different in nature. For example, sbepp::message_traits<msg_tag>::size_bytes() returns a value precomputed from the message structure in the XML, so it is guaranteed to be valid only for the current schema version. On the other hand, sbepp::size_bytes(msg) calculates the message size from the message buffer and the values it holds; it returns a valid value even for newer schema versions because the message representation type correctly handles schema extension.
It's possible to get a representation type from a tag using the value_type member of the traits (e.g. sbepp::message_traits<msg_tag>::value_type) and, vice versa, to get a tag from a representation type using sbepp::traits_tag_t. These helpers let you avoid mentioning both the tag and the representation type explicitly by deducing one from the other.
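To make the difference between a precomputed size and a buffer-based size more concrete, here's a toy model. Nothing in it is sbepp API: the 2-byte header, static_size_bytes, runtime_size_bytes and write_v2_message are hypothetical stand-ins, and the layout is deliberately simpler than real SBE.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical layout: a 2-byte header holding blockLength, then the block.
constexpr std::size_t header_size = 2;

// Stand-in for a trait: the size known at compile time from "schema v1",
// where the block is 8 bytes long.
constexpr std::size_t static_size_bytes()
{
    return header_size + 8;
}

// Stand-in for a buffer-based query: it trusts blockLength found in the buffer.
inline std::size_t runtime_size_bytes(const unsigned char* buf)
{
    std::uint16_t block_length;
    std::memcpy(&block_length, buf, sizeof(block_length));
    return header_size + block_length;
}

// Writes the header as a newer producer would: its block is 12 bytes long.
inline void write_v2_message(unsigned char* buf)
{
    const std::uint16_t block_length = 12;
    std::memcpy(buf, &block_length, sizeof(block_length));
}
```

For a buffer produced by write_v2_message, the compile-time value still describes the old schema, while the buffer-based one follows what was actually written.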
Here's the structure of generated code after compilation:
Here, detail, schema, messages and types are hardcoded and don't depend on schema names.
sbepp preserves the names of all schema entities without any modification. This means that messages, types, fields, etc. have the same class/function names in the public part of the generated code. Of course, standard C++ naming rules still apply, and you'll usually get an error from sbeppc if the schema uses an invalid name.
Although original names are preserved in the public interface, sometimes the underlying implementation is located in detail and a public alias is provided for it. Names from detail should never be used explicitly, but the compiler usually uses them in error messages, so it's useful to know how they are formed.
sbepp tries to preserve schema names for everything, but when that's not possible, the class name is mangled as <original_name>_<N>, where N is a number. Group entries have class names like <group_name>_entry, where group_name is a potentially mangled group name. Tag types can be mangled as well, and their names always match those of the corresponding implementation types.
For example, public encoding User is always accessible as schema_name::types::User but can actually be an alias to schema_name::detail::types::User_0. Similarly, its tag is schema_name::schema::types::User but it can be an alias to schema_name::detail::schema::types::User_0. When the schema doesn't have a lot of repetitive names, looking at the trailing class name is enough to understand which schema entity it represents.
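Here's a rough sketch of how such aliases could be laid out. It is inferred only from the qualified names mentioned above: empty structs stand in for the real generated classes, and whether each level is a namespace or a class in the actual generated code is an assumption.

```cpp
#include <type_traits>

namespace schema_name
{
namespace detail
{
// mangled implementation of the User encoding (empty placeholder)
namespace types { struct User_0 {}; }
// mangled tag with a matching name (empty placeholder)
namespace schema { namespace types { struct User_0 {}; } }
} // namespace detail

// public names are aliases which keep the original schema name
namespace types { using User = detail::types::User_0; }
namespace schema { namespace types { using User = detail::schema::types::User_0; } }
} // namespace schema_name

// The alias and the mangled name denote the same type, which is why a
// compiler diagnostic may mention User_0 instead of User.
static_assert(
    std::is_same<
        schema_name::types::User,
        schema_name::detail::types::User_0>::value,
    "public alias refers to the detail type");
```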
Most types generated by sbepp have reference semantics. It means they are just pointers to the actual data which they don't own and don't manage in any way.
They usually contain a single pointer (with one additional pointer in Debug mode) and are cheap to pass by value. Creating such an object usually involves no actions or parsing except the pointer initialization. They are templates with a Byte template parameter, which is a byte type (can be cv-qualified). Another consequence of reference semantics is that making an object const doesn't make the underlying data const, i.e., you can modify a message via a const msg<char> object. You need to use a const-qualified Byte type to make a view read-only.
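The const behavior is easier to see on a toy view. This is not one of sbepp's types, just a minimal sketch with a hypothetical toy_view class that stores a pointer and a size and asserts only on the bytes it actually touches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy view: it only points to bytes it neither owns nor manages.
template<typename Byte>
class toy_view
{
public:
    toy_view(Byte* ptr, std::size_t size) : ptr_{ptr}, size_{size} {}

    // getter: works for both `char` and `const char`
    std::uint32_t id() const
    {
        std::uint32_t v;
        assert(size_ >= sizeof(v)); // checks only the accessed bytes
        std::memcpy(&v, ptr_, sizeof(v));
        return v;
    }

    // setter: a const member function which still writes through the
    // pointer; it compiles only when Byte is not const-qualified
    void id(std::uint32_t v) const
    {
        assert(size_ >= sizeof(v));
        std::memcpy(ptr_, &v, sizeof(v));
    }

private:
    Byte* ptr_;
    std::size_t size_;
};
```

With this sketch, a const toy_view<char> can still call the setter, while calling it on toy_view<const char> fails to compile because the destination pointer is const.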
There are helpers to access raw underlying data which are available for all reference semantics types: sbepp::size_bytes, sbepp::size_bytes_checked, sbepp::addressof.
Non-array types, enums and sets are the only schema entities represented with value semantics types (including constants). They are small types, 64 bits at most, which behave like int.
As was said before, the client is responsible for ensuring that the provided buffer is large enough to hold the corresponding SBE data. To provide some sort of safety, sbepp inserts assertions in many places. They check only the accessed data, not the whole SBE message or other entity. For example, if a message has 10 fields and the provided buffer can only hold 5 of them, you'll get an assertion only when one of the last 5 fields is accessed. By default, these checks are controlled by NDEBUG, just like the standard assert().
sbepp doesn't make a distinction between encoding and decoding. It only provides an SBE view of the provided buffer. Functions have no hidden side effects beyond their main functionality. The only thing done implicitly is handling of the SBE schema extension mechanism by respecting blockLength. This has several consequences. First, be careful not to change things that affect the offsets of already-written fields that follow. For example, if you have dynamic-length fields data1 and data2, don't change data1's size after you've filled data2, because data2's offset will change and its data will become garbage. A simple rule is to fill groups/data fields in order. Second, when you encode a new message, you need to fill the message/group header explicitly via sbepp::fill_message_header or sbepp::fill_group_header, and it's better to do this as early as possible.
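The data1/data2 problem can be reproduced with a toy layout. This is only a sketch: the 1-byte length prefix and the write_data and data2_offset helpers are hypothetical and are not sbepp API or the exact SBE wire format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Toy layout: each data field is a 1-byte length followed by that many
// bytes, and data2 starts right after data1.
inline std::size_t write_data(
    unsigned char* buf, std::size_t offset, const std::string& s)
{
    assert(s.size() < 256); // length must fit into one byte
    buf[offset] = static_cast<unsigned char>(s.size());
    std::memcpy(buf + offset + 1, s.data(), s.size());
    return offset + 1 + s.size(); // offset of whatever follows
}

// data2's offset is always derived from data1's *current* length
inline std::size_t data2_offset(const unsigned char* buf)
{
    return 1 + buf[0];
}
```

If data1 is written as "abc" and data2 right after it, data2_offset returns 4. Changing data1's length byte afterwards makes data2_offset point somewhere else, so the bytes written for data2 are no longer found.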
Messages and composites are two root things from which any work with SBE data starts. Like any reference semantics type, they are created from pointer and size:
There are a couple of helpers which deduce the byte type for you: sbepp::make_view and sbepp::make_const_view. They are applicable to any reference semantics type except group entries.
The main purpose of messages, composites and group entries is of course to contain fields; their interface will be discussed later in the accessors section.
Since sbepp provides only a message's view, message header should be filled explicitly via sbepp::fill_message_header when a new message is created:
blockLength is required to correctly interpret the underlying data.
You can also fill it by hand using sbepp::get_header:
Typically, you need first to create a message header to check the type of the incoming message. Composite has the same form and construction approach as message:
In general, group has a container-like interface with iterators and other members you'd expect from standard container. There are two kinds of groups (message levels in general sense) which provide different interfaces:
- Flat groups have an interface similar to std::vector. See sbepp::detail::flat_group_base for complete reference.
- Nested groups: see sbepp::detail::nested_group_base for complete reference.

Because sbepp provides only views, there are no functions like push_back; there's nothing to push. To encode a group, one first needs to set its size and then access the corresponding group entries. Similar to the message header, the group header has to be filled explicitly, either by sbepp::fill_group_header, the resize() method, or manually via sbepp::get_header:
numInGroup and blockLength are used to correctly interpret the underlying data.

Group entries have no special properties and normally are never created explicitly. See sbepp::detail::entry_base for details if you are interested.
Variable-length arrays are represented using sbepp::detail::dynamic_array_ref. This type works like a reference to a vector-like type.
According to the SBE standard, strings stored inside data members never have a terminating null character. Conversion from sbepp::detail::dynamic_array_ref to a more string-specific type can be done using the data() and size() methods:
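As a minimal sketch of that conversion, the helper below builds a std::string_view from anything that exposes data() and size(). toy_array_ref and to_string_view are hypothetical names used only for illustration; they are not part of sbepp, and C++17 is assumed for std::string_view.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a view over variable-length data.
struct toy_array_ref
{
    const char* ptr;
    std::size_t len;

    const char* data() const { return ptr; }
    std::size_t size() const { return len; }
};

// Works with any object exposing data() and size() over char bytes.
template<typename ArrayRef>
std::string_view to_string_view(const ArrayRef& ref)
{
    // there is no terminating null, so the length must come from size()
    return std::string_view{ref.data(), ref.size()};
}
```

The result is itself a non-owning view, so it is valid only as long as the underlying buffer is.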
There are 2 options for string assignments:
- sbepp::detail::dynamic_array_ref::assign_string() to assign a raw string pointer.
- sbepp::detail::dynamic_array_ref::assign_range(), a generic method to assign from ranges. Since a range requires just begin()/end() methods, most string-specific types satisfy this requirement.

Example:
sbepp treats all <type>s with length != 1 (including 0) as fixed-size arrays. They are implemented in terms of sbepp::detail::static_array_ref which has std::span-like interface with assignment helpers.
Assignment from a string can be done using sbepp::detail::static_array_ref::assign_string(). It can handle both raw string pointers and string ranges like std::string or std::string_view. As a second parameter, it takes the sbepp::eos_null eos_mode that controls how to set trailing null bytes (if any). If a stored string is shorter than the array, the SBE standard requires all the remaining bytes to be set to null, and sbepp::eos_null::all is the default argument for the eos_mode parameter. In practice, however, it's not always required, and there are cases in which sbepp::eos_null::none is enough to correctly encode a string.

Example:
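As a rough illustration of the two modes named above, here's a toy assignment over std::array. toy_eos and toy_assign_string are hypothetical and only mimic the described behavior; they are not sbepp's implementation.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical mirror of the two modes mentioned in the text.
enum class toy_eos
{
    none, // copy the string, leave the remaining bytes untouched
    all   // copy the string, set all the remaining bytes to null
};

template<std::size_t N>
void toy_assign_string(
    std::array<char, N>& arr,
    std::string_view str,
    toy_eos mode = toy_eos::all)
{
    assert(str.size() <= N); // the string must fit into the array
    const auto tail = std::copy(str.begin(), str.end(), arr.begin());
    if(mode == toy_eos::all)
    {
        std::fill(tail, arr.end(), '\0');
    }
}
```

One case where both modes produce the same bytes is a string that occupies the entire array: there is simply no tail to fill.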
To convert sbepp::detail::static_array_ref to a more string-specific type, string length has to be calculated explicitly because the stored string might occupy the entire array without having the terminating null character. There are two ways to do this:
- sbepp::detail::static_array_ref::strlen(), calculates string length by looking for the first null character from left to right.
- sbepp::detail::static_array_ref::strlen_r(), calculates string length by looking for the first non-null character from right to left. This reversed approach might be useful when the user expects the string end to be closer to the end of the array than to its start. For it to work, it requires all padding bytes (if any) to be set to null.

Example:
See sbepp::char_t and sbepp::char_opt_t for examples of a required and an optional type, respectively.
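The two search directions can be sketched with plain functions over a pointer and a size. toy_strlen and toy_strlen_r are hypothetical names; they follow the descriptions above and are not sbepp's actual code.

```cpp
#include <cstddef>

// Left to right: index of the first null byte, or n if there is none.
inline std::size_t toy_strlen(const char* p, std::size_t n)
{
    std::size_t i = 0;
    while(i < n && p[i] != '\0')
    {
        ++i;
    }
    return i;
}

// Right to left: one past the last non-null byte, or 0 if all are null.
inline std::size_t toy_strlen_r(const char* p, std::size_t n)
{
    while(n > 0 && p[n - 1] == '\0')
    {
        --n;
    }
    return n;
}
```

Both agree on a properly padded array and on a string which fills the whole array. They disagree when the bytes after the first null are not null, which is why the reversed search needs null padding.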
sbepp::detail::required_base::in_range() and sbepp::detail::optional_base::has_value(), they don't enforce any checks on the underlying value.

Enums are represented using scoped enumerations. For example:
is represented like:
In set representation, each choice has a corresponding getter and setter, for example:
is represented like:
sbepp::visit_set

Although SBE doesn't allow optional enums/sets, some schema authors don't respect this rule and try to achieve it by specifying an optional type as encodingType for enum/set:
They consider such enum/set to be null if its value matches the null value of that encodingType. To facilitate getting "null" values for such cases, sbepp provides encoding_type_tag in sbepp::enum_traits and sbepp::set_traits. It represents the tag of the original type specified in schema and can be used to get the null value:
<validValue>. Similarly, a special "null" <choice> is required for set or simply treat 0 as its null value.

Constant accessors are represented via static functions. Non-array constants return the underlying value directly, without any wrapper. Only <field> can return it as an enum type. Array-like constants (strings) are represented using sbepp::detail::static_array_ref, like a fixed-size array.
There are multiple entities which can hold fields: messages, group entries and composites. They all provide the same interface for accessors:
That is, for value semantics fields there is a getter/setter pair; for reference semantics fields there is only a getter, which returns a view with the same byte type as its enclosing object. When the byte type is const-qualified, setters are not available.
Unlike cursor-based accessors, these "normal" accessors can be used in any order.
While normal accessors can be used in any order, this is not always efficient. Consider this message:
Here, access to field1/2/3 is still fast but access to the data1 is not. To get its offset we need:
1. read blockLength from the nested_msg header to get group's offset
2. read numInGroup from the group header
3. read blockLength from the group header
4. for each group entry:
   - use blockLength to get the offset to data2
   - read the data2 length
5. calculate the data1 offset

Moreover, to access the next data2, all that work has to be repeated! Now imagine if there were more groups and data in that message. It's a lot of work, and current compilers can't optimize normal accessors well even when everything is accessed in order.
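The offset walk listed above can be sketched over a toy wire format. The layout here is an assumption made for brevity (2-byte blockLength, 1-byte numInGroup, 1-byte data length) and does not match real SBE headers; read_u16, write_u16 and data1_offset are hypothetical helpers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy layout (not the exact SBE wire format):
//   message header: u16 blockLength
//   group header:   u16 blockLength, u8 numInGroup
//   data field:     u8 length followed by the bytes

inline std::uint16_t read_u16(const unsigned char* p)
{
    std::uint16_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

inline void write_u16(unsigned char* p, std::uint16_t v)
{
    std::memcpy(p, &v, sizeof(v));
}

// Recomputes data1's offset from the message start, visiting every entry.
inline std::size_t data1_offset(const unsigned char* msg)
{
    // skip the message header and the fixed-size block
    std::size_t pos = 2 + read_u16(msg);
    const std::uint16_t entry_block = read_u16(msg + pos);
    const std::uint8_t num_in_group = msg[pos + 2];
    pos += 3; // skip the group header
    for(std::uint8_t i = 0; i != num_in_group; ++i)
    {
        pos += entry_block;  // skip the entry's fixed-size block
        pos += 1 + msg[pos]; // skip data2: length byte + payload
    }
    return pos;
}
```

Every call repeats the whole loop, so the cost grows with the number of entries that precede the field.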
The way to solve it is to access things in a forward-only manner to avoid recalculation of the next field's offset each time from the message start. In this way, after we've read the group, offset for data1 is ready for free. This is the core idea behind cursor-based API.
A cursor (sbepp::cursor<Byte>) is just a pointer wrapper which is passed to field accessors as an additional parameter:
It's parameterized with a Byte type which has the same meaning as for other reference semantics types. Note that it can be more const-qualified than the byte type of an enclosing view; setters are not available for such cursors.
By default, each field assumes that cursor points to the end of the previous field (or to the end of group/message header) so the offset to current field can be calculated efficiently (usually a no-op), then cursor is advanced using these rules:
- a field other than the last one moves cursor up by field's size
- the last field moves cursor to the end of the block (calculated using blockLength)
- the first variable-length member (group or data) of the message/entry unconditionally initializes cursor to the end of the block before using it. It means that it's possible to use an uninitialized cursor with them, but I don't recommend it
- group moves cursor to the end of group's header
- data moves cursor to the end of the data (data() + size())

Cursor has to be initialized before the first usage. It can be done via sbepp::init_cursor/sbepp::init_const_cursor, by using sbepp::cursor_ops::init, or even by hand from the provided pointer. Only messages and group entries provide cursor-based accessors (composites cannot contain variable-sized fields). Note that cursor-based and normal accessors return the same objects. Those objects don't care how they were created.
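To show the forward-only idea in isolation, here's a toy cursor with two accessors that advance it. It's a simplified sketch: toy_cursor, read_field and read_data_size are hypothetical, the data field uses a 1-byte length, and the blockLength handling for the last field is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy cursor: just a pointer which accessors advance as they go.
struct toy_cursor
{
    const unsigned char* ptr;
};

// Fixed-size field: the value is right at the cursor, then the cursor
// moves up by the field's size.
inline std::uint32_t read_field(toy_cursor& c)
{
    std::uint32_t v;
    std::memcpy(&v, c.ptr, sizeof(v));
    c.ptr += sizeof(v);
    return v;
}

// Data field (u8 length + bytes): returns the payload size and leaves the
// cursor at the end of the data.
inline std::size_t read_data_size(toy_cursor& c)
{
    const std::size_t size = *c.ptr;
    c.ptr += 1 + size;
    return size;
}
```

Each accessor finds its field at the current position without any recalculation from the message start, but only if all the previous fields were visited in order.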
Here's an example of how to read the above message:
Note that you need to use cursor_range() to iterate over group entries. That's because now each entry is created from the cursor.
Here's how compiler will see it (simplified of course):
This approach is very efficient but the downside is that to access a field, you need to access all previous fields in their schema order.
To provide some sort of flexibility, there are various sbepp::cursor_ops helpers which can control cursor's position. Check out their documentation for examples. Here, I only want to duplicate one tricky case from sbepp::cursor_ops::dont_move, using cursor to write a data member:
Recall that data accessors by default move the cursor to data() + size(), so when done in a naive way:
at the time we access data, its length can have any value (your best hope is a message buffer initialized with zeros); thus, the cursor will be moved to an unknown position and its further use is unpredictable or even UB.
As you can see, using the cursor-based API might be tricky and requires additional care. Most schemas I've seen have only a single flat group and no data; for them, normal accessors work great. I recommend using cursors only for messages with complex structure or when you've done a benchmark and know for sure that you'll benefit from them.
It's possible to access fields and set choices using tags via sbepp::get_by_tag and sbepp::set_by_tag. Under the hood, they call the corresponding normal or cursor-based accessors and thus take and return the same types:
In combination with various traits like sbepp::message_traits::field_tags they provide a powerful API for automation tasks. See the examples page for demos.