<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2026-02-24T13:52:06+00:00</updated><id>/feed.xml</id><title type="html">Oleksandr Koval’s blog</title><subtitle>Thoughts, experiments, software, C++</subtitle><author><name>Oleksandr Koval</name></author><entry><title type="html">Making generic wrapper for a niche binary format using meta tags</title><link href="/2026/02/24/meta-tags-api.html" rel="alternate" type="text/html" title="Making generic wrapper for a niche binary format using meta tags" /><published>2026-02-24T12:50:00+00:00</published><updated>2026-02-24T12:50:00+00:00</updated><id>/2026/02/24/meta-tags-api</id><content type="html" xml:base="/2026/02/24/meta-tags-api.html"><![CDATA[<p>In this article I’ll show how meta tags turned out to be a surprisingly powerful
feature and how I used them to implement core parts of a polymorphic wrapper
around a binary message format.</p>

<hr />

<h2 id="problem-overview">Problem overview</h2>

<p>I work for a company that makes various software for financial markets. Some of
those markets use
<a href="https://github.com/FIXTradingCommunity/fix-simple-binary-encoding">FIX SBE format</a>
(or simply SBE) for message encoding. The details
of this format don’t really matter here but a small example will make the overall
picture clearer. In SBE, message layout is specified using an XML schema:</p>

<pre><code class="language-xml">&lt;!-- user-defined types... --&gt;
&lt;type name="uint32_t" primitiveType="uint32"/&gt;

&lt;!-- user-defined messages... --&gt;
&lt;sbe:message name="msg_a" id="1"&gt;
    &lt;field name="field_a" type="uint32_t" id="1"/&gt;
    &lt;field name="field_b" type="uint32_t" id="2"/&gt;
&lt;/sbe:message&gt;
</code></pre>

<p>In memory this message looks roughly like:</p>

<pre><code class="language-cpp">struct msg_a_layout{
    // common for all messages from the same schema, provides message ID
    message_header header;
    uint32_t field_a;
    uint32_t field_b;
};
</code></pre>

<p>Thanks to its binary nature, the format is very fast to work with, much faster
than its ancestor, text-based
<a href="https://en.wikipedia.org/wiki/Financial_Information_eXchange">FIX</a>.</p>

<p>The typical way of working with SBE is to generate language-specific code from the XML
schema (so the generated code is schema-specific) and include it in the
final app. This workflow is very similar to <code>protobuf</code> or <code>flatbuffers</code>.</p>

<p>At some point, being unhappy with our proprietary implementation, I decided to
write my own one - <a href="https://github.com/OleksandrKvl/sbepp">sbepp</a>. The main
goals I had in mind for that project were:</p>
<ul>
  <li>zero overhead, in financial apps we care about performance</li>
  <li>provide only the basic functionality, building blocks that allow users to
  create their own efficient solutions to their specific problems</li>
  <li>provide all information from SBE schema, high-level things like field
  description and low-level details like memory offsets and sizes</li>
</ul>

<p>As a result, generated code never allocates or does more work than one would do
by hand.</p>

<p><code>sbepp</code> compiles an XML schema into a header-only C++ library. For most SBE entities
it generates a class with field-named accessors:</p>

<pre><code class="language-cpp">// SBE-specific wrapper around `std::uint32_t`
class uint32_t{
    std::uint32_t value() const;
    // ...
};

class msg_a{
    uint32_t field_a();      // getter
    void field_a(uint32_t);  // setter

    uint32_t field_b();
    void field_b(uint32_t);
};

// other messages, types, etc.
</code></pre>

<p>Since SBE is a format from the financial world, its primary use case is encoding and
decoding market messages. It’s quite natural to generate and include
schema-specific headers in a market-specific component. We began to use <code>sbepp</code>
for this purpose and everything was quite good.</p>

<p>However, we also have a project that can work with multiple markets and SBE is
used internally as their common message format. The main market logic is still
located in market-specific components but there’s also a core component that has
to be able to work with <em>any</em> SBE message in a generic way. In particular, it
should be able to:</p>
<ul>
  <li>convert to/from JSON</li>
  <li>create and fill message given its schema/message/field name/value</li>
</ul>

<p>And here’s the problem: how do we make a generic, runtime-polymorphic wrapper
around a bunch of unrelated classes, each with its own set of field names and
types? There’s no way to convert <code>msg-&gt;get_by_name("field_a")</code> into
<code>msg.field_a()</code> in a generic fashion. Even the return type for such a function
is unclear.</p>

<p>In theory, <code>sbepp</code> knows everything during code generation but again, it’s not
possible to generate an interface/implementation that will satisfy all users. Not
to mention that most users don’t even need this polymorphic wrapper, it’s a very
ad-hoc problem.</p>

<p>So the goal of this task became to provide a minimal mechanism that allows users
to do literally anything they want without overhead. No ad-hoc solutions, only
building blocks for them.</p>

<p>However, there was a glimpse of light. If you remember, one of the initial
<code>sbepp</code> goals was providing all information from SBE schema. That means field
names, descriptions, offsets, basically every attribute that SBE schema defines.
The way it’s been done is a standalone set of meta tags that represent SBE
schema structure and various traits to access properties via those tags:</p>

<pre><code class="language-cpp">// tags
struct schema_name::schema{
    struct types{
        struct uint32_t{};
        // other types...
    };

    struct messages{
        struct msg_a{
            struct field_a{};
            // other fields...
        };
        // other messages...
    };
};

// traits
struct type_traits&lt;schema_name::schema::types::uint32_t&gt;{/*...*/};
struct message_traits&lt;schema_name::schema::messages::msg_a&gt;{/*...*/};
struct field_traits&lt;schema_name::schema::messages::msg_a::field_a&gt;{/*...*/};

// representation classes
class schema_name::messages::msg_a{
    uint32_t field_a();
    void field_a(uint32_t);
    // other fields...
};

// a mapping between them is available too
// representation type to meta tag
static_assert(std::same_as&lt;
    sbepp::traits_tag_t&lt;msg_a&gt;,
    schema_name::schema::messages::msg_a&gt;);
// meta tag to representation type
static_assert(std::same_as&lt;
    message_traits&lt;schema_name::schema::messages::msg_a&gt;::value_type,
    schema_name::messages::msg_a&gt;);
</code></pre>

<hr />

<h2 id="solution">Solution</h2>

<p>It turns out that both generic de/serialization and by-name accessors need
similar functionality: the ability to navigate over the fields at compile time to
access their meta information (e.g. name) and, at the same time, to access the field
value at run-time if needed.</p>

<p>From the previous section we know that <code>sbepp</code> provides a way to access both
kinds of information via two separate APIs which, however, can’t be combined
generically:</p>

<pre><code class="language-cpp">// access runtime value
auto field_value = msg.field_a();
// access meta-information
constexpr auto field_name = field_traits&lt;message_a::field_a&gt;::name();
// but no way to combine them
</code></pre>

<p>To solve the first part of the problem, compile-time navigation, we can use a
very simple tool - a type list.</p>

<pre><code class="language-cpp">template&lt;typename... Types&gt;
struct type_list{};
</code></pre>

<p>Existing meta tags and their traits already provide all possible information, we
only need to put the tags inside a list and expose that list via a trait:</p>

<pre><code class="language-cpp">// concrete message traits
template&lt;&gt;
struct message_traits&lt;msg_a&gt;{
    using field_tags = type_list&lt;
        schema_name::schema::messages::msg_a::field_a,
        schema_name::schema::messages::msg_a::field_b&gt;;
};
</code></pre>

<p>The same approach is used to represent all child tags within their parents’
traits, not just message fields. For more convenience, let’s add an <code>is_ABC_tag</code>
trait so the user can figure out the tag kind and use the proper trait to access its
properties, e.g. <code>is_enum_tag&lt;T&gt;</code> followed by <code>enum_traits&lt;T&gt;</code>.</p>

<p>The next step is runtime access by a compile-time tag. For that we add a couple
of functions: <code>get/set_by_tag&lt;Tag&gt;</code>. Every generated class that has children to
access by tag implements them by simply forwarding to the proper name-based
accessor:</p>

<pre><code class="language-cpp">class msg_a{
    // normal accessors
    uint32_t field_a();
    void field_a(uint32_t);

    // for exposition only
    auto get_by_tag(message_a::field_a){
        return field_a();
    }

    void set_by_tag(message_a::field_a, uint32_t value){
        return field_a(value);
    }
};

// public methods forward to internal ones above
sbepp::get_by_tag&lt;message_a::field_a&gt;(msg);
sbepp::set_by_tag&lt;message_a::field_a&gt;(msg, 123);
</code></pre>

<p>Now let’s see how these two seemingly minor additions allowed a very rich set of
functionality. Actually, they allow nearly everything one can imagine about
working with SBE data.</p>

<hr />

<h2 id="examples">Examples</h2>

<p><em>Disclaimer. Examples are intentionally simplified and are not intended for
direct use as-is. The main goal is to demonstrate overall approach, interested
reader can look near the bottom of the
<a href="https://oleksandrkvl.github.io/sbepp/1.7.0/examples.html">Examples</a> page for
examples that do compile.</em></p>

<p>A common technique for most examples is iteration over a type list that contains
child tags of various kinds. Its implementation is not important here, just
assume it has this signature:</p>

<pre><code class="language-cpp">// For each tag `T` in `TypeList`, calls `cb(T{})` until it returns `true`.
// Returns `true` if `cb` returned `true`, `false` otherwise.
template&lt;typename TypeList&gt;
auto for_each_tag_until(auto cb);
</code></pre>

<p>Also, I won’t spell out the <code>sbepp</code> namespace to make the code more compact.</p>

<hr />

<h3 id="enum-to-string">Enum to string</h3>

<p><code>sbepp</code> represents SBE enums as scoped C++ enums for which there’s a classic
problem: you have an enum and want to log its name. To solve it we can iterate
over all known enumerator values to find a match with the current one:</p>

<pre><code class="language-cpp">// returns `nullptr` if value is unknown
template&lt;typename Enum&gt;
const char* enum_to_string(Enum e){
    const char* res{};

    for_each_tag_until&lt;enum_traits&lt;traits_tag_t&lt;Enum&gt;&gt;::value_tags&gt;(
        [e, &amp;res]&lt;typename Tag&gt;(Tag){
            if(enum_value_traits&lt;Tag&gt;::value() == e){
                res = enum_value_traits&lt;Tag&gt;::name();
                return true;
            }
            return false;
    });

    return res;
}
</code></pre>

<p>Btw, here’s how to do the opposite conversion:</p>

<pre><code class="language-cpp">template&lt;typename Enum&gt;
Enum string_to_enum(std::string_view name){
    Enum res{};

    for_each_tag_until&lt;enum_traits&lt;traits_tag_t&lt;Enum&gt;&gt;::value_tags&gt;(
        [name, &amp;res]&lt;typename Tag&gt;(Tag){
        if(enum_value_traits&lt;Tag&gt;::name() == name){
            res = enum_value_traits&lt;Tag&gt;::value();
            return true;
        }
        return false;
    });

    return res;
}
</code></pre>

<p>Note that it’s not the only possible implementation. While for enum-to-string
conversion and relatively small enums we can expect compiler to optimize into a
lookup table, for the opposite conversion a compile time hash table can be used
to improve the performance in complex cases (again, can be decided at compile
time based on the number of enumerators).</p>
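<p>As an illustration of such an alternative, the tag list can be expanded into a
<code>constexpr</code> table once and then searched (or later sorted/hashed) like any
other array. The sketch below is self-contained, so the <code>side</code> enum, its
tags and the <code>enum_value_traits</code> specializations are hand-written stand-ins
for generated code, and <code>make_enum_table</code>/<code>side_to_string</code> are
hypothetical names.</p>

```cpp
#include <array>
#include <cstddef>
#include <string_view>

template<typename... Types>
struct type_list{};

// --- hand-written stand-ins for generated code (assumptions) ---
enum class side : unsigned char{ buy = 1, sell = 2 };
struct buy_tag{};
struct sell_tag{};

template<typename Tag>
struct enum_value_traits;

template<>
struct enum_value_traits<buy_tag>{
    static constexpr side value(){ return side::buy; }
    static constexpr const char* name(){ return "buy"; }
};

template<>
struct enum_value_traits<sell_tag>{
    static constexpr side value(){ return side::sell; }
    static constexpr const char* name(){ return "sell"; }
};

using side_value_tags = type_list<buy_tag, sell_tag>;
// --- end of stand-ins ---

template<typename Enum>
struct enumerator_info{
    Enum value;
    std::string_view name;
};

// expands the tag list into a table with one entry per enumerator
template<typename Enum, typename... Tags>
constexpr auto make_enum_table(type_list<Tags...>){
    return std::array<enumerator_info<Enum>, sizeof...(Tags)>{{
        {enum_value_traits<Tags>::value(), enum_value_traits<Tags>::name()}...}};
}

inline constexpr auto side_table = make_enum_table<side>(side_value_tags{});

// linear scan over the table; returns an empty view for unknown values
constexpr std::string_view side_to_string(const side s){
    for(const auto& entry : side_table){
        if(entry.value == s){
            return entry.name;
        }
    }
    return {};
}
```

<p>The table is built entirely at compile time, so choosing between a linear
scan, binary search or a hash lookup becomes an ordinary implementation detail.</p>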

<hr />

<h3 id="handle-all-schema-messages">Handle all schema messages</h3>

<p>Another common use case is when you receive an SBE message from the network and need to
handle it. To do that, you first need to figure out the message type by looking at
the message header and then create the corresponding message wrapper. Instead of writing
a bunch of <code>if</code>-s or <code>case</code>-s for each message type manually, we can make a
generic helper that will do that for us:</p>

<pre><code class="language-cpp">// calls `cb(Message)` when a buffer given by `data`/`size` pair represents
// any `Message` from schema represented by `SchemaTag`.
// returns `false` if message is unknown
template&lt;typename SchemaTag&gt;
bool handle_schema_message(const char* data, const size_t size, auto cb){
    // implementation of this function is not important here
    const auto msg_id = get_message_id_from_header&lt;SchemaTag&gt;(data, size);

    return for_each_tag_until&lt;schema_traits&lt;SchemaTag&gt;::message_tags&gt;(
        [data, size, msg_id, &amp;cb]&lt;typename Tag&gt;(Tag){
            if(message_traits&lt;Tag&gt;::id() == msg_id){
                cb(message_traits&lt;Tag&gt;::value_type{data, size});
                return true;
            }
            return false;
    });
}

// usage:
handle_schema_message&lt;MySchema::schema&gt;(data, size, overloaded{
    [](MySchema::messages::message_a msg){},
    [](MySchema::messages::message_b msg){},
    // other messages...
});
</code></pre>

<p>Note that this helper lets you either ensure that all messages are handled (if
an overload is missing, the <code>cb(...)</code> call for that message fails to compile) or
handle only some target messages by adding <code>[](auto){}</code> at the end of the
overload set. The optimizer can then eliminate the branches for ignored messages.</p>
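<p>The <code>overloaded</code> helper used in the snippet is not part of the standard
library. It is usually written as the well-known two-liner below. The message
types and the <code>classify</code> function are placeholders invented for this
sketch, they only make the overload selection observable.</p>

```cpp
// merges several callables into one object with an overloaded `operator()`
template<typename... Ts>
struct overloaded : Ts...{
    using Ts::operator()...;
};

// deduction guide; C++20 can deduce this for aggregates on its own, but
// spelling it out keeps the helper usable in C++17 as well
template<typename... Ts>
overloaded(Ts...) -> overloaded<Ts...>;

// placeholder message types (assumptions, not generated code)
struct message_a{};
struct message_b{};
struct message_c{};

// returns a number identifying which handler was selected
int classify(auto msg){
    return overloaded{
        [](message_a){ return 1; },
        [](message_b){ return 2; },
        [](auto){ return 0; }  // catch-all for messages we don't care about
    }(msg);
}
```

<p>Overload resolution prefers the non-template lambdas, so the generic
catch-all is only chosen for types without a dedicated handler.</p>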

<hr />

<h3 id="access-by-name">Access by name</h3>

<p>So far we’ve seen only tag iteration, let’s add <code>get/set_by_tag</code>
into the mix. Here’s a simplified by-name getter:</p>

<pre><code class="language-cpp">template&lt;typename Message&gt;
uint64_t get_by_name(Message m, std::string_view field_name){
    uint64_t res{};

    for_each_tag_until&lt;message_traits&lt;traits_tag_t&lt;Message&gt;&gt;::field_tags&gt;(
        [field_name, m, &amp;res]&lt;typename Tag&gt;(Tag){
        if(field_traits&lt;Tag&gt;::name() == field_name){
            res = get_by_tag&lt;Tag&gt;(m).value();
            return true;
        }
        return false;
    });

    return res;
}
</code></pre>

<p>It assumes all message field types can be represented by <code>uint64_t</code>. The full
implementation of this method can return a variant of basic types or even some
extra info, for example the enumerator name along with its value. It should also
handle complex field paths instead of just a single field name, e.g.
<code>field_a.sub_field_b</code>. That’s all doable using the same technique. Note that
this approach relies on string comparison: it compares the given field name to all
known field names at runtime. This might be suboptimal for some applications.
Further optimization is straightforward: we can compare numbers instead of strings
by relying on a compile-time hash function:</p>

<pre><code class="language-cpp">// constexpr-friendly hash implementation
constexpr uint64_t get_hash(const std::string_view str);

const auto field_name_hash = get_hash(field_name);

for_each_tag_until&lt;message_traits&lt;traits_tag_t&lt;Message&gt;&gt;::field_tags&gt;(
    [field_name, field_name_hash, m, &amp;res]&lt;typename Tag&gt;(Tag){
        if((get_hash(field_traits&lt;Tag&gt;::name()) == field_name_hash)
            &amp;&amp; (field_traits&lt;Tag&gt;::name() == field_name)){
                res = get_by_tag&lt;Tag&gt;(m).value();
                return true;
        }
        return false;
});
</code></pre>

<p>Because field names are available at compile time, their hash can be computed at
compile time too so at runtime this will mostly compare numbers. One string
comparison is unavoidable if we want to protect against collisions. To give you
an idea how to extend it to support, for example, SBE enums, we can leverage
<code>if constexpr</code> to distinguish field kind:</p>

<pre><code class="language-cpp">for_each_tag_until&lt;...&gt;(
    [...]&lt;typename Tag&gt;(Tag){
    // as before
    if(names_match){
        // field tag != its value type tag
        using value_type_tag = field_traits&lt;Tag&gt;::value_type_tag;
        if constexpr(is_enum_tag_v&lt;value_type_tag&gt;){
            res = to_underlying(get_by_tag&lt;Tag&gt;(m));
        }
        else{
            res = get_by_tag&lt;Tag&gt;(m).value();
        }
        return true;
    }
    return false;
});
</code></pre>

<p>Since SBE enums are scoped C++ <code>enum</code>-s, we use <code>to_underlying()</code> to get their
underlying value instead of <code>value()</code> for numeric fields.</p>

<p>OK, let’s see the setter:</p>

<pre><code class="language-cpp">template&lt;typename Message&gt;
bool set_by_name(Message m, std::string_view field_name, uint64_t value){
    return for_each_tag_until&lt;
        message_traits&lt;traits_tag_t&lt;Message&gt;&gt;::field_tags&gt;(
        [m, value, field_name]&lt;typename Tag&gt;(Tag){
            if(field_traits&lt;Tag&gt;::name() == field_name){
                set_by_tag&lt;Tag&gt;(m, value);
                return true;
            }
            return false;
    });
}
</code></pre>

<p>Again, the real implementation will need to handle various field types and check
that the given <code>value</code> can be properly converted to the underlying field type.</p>

<hr />

<h3 id="deserialization">De/serialization</h3>

<p>Conversion to/from any other format is not a big deal now as we can freely
iterate over the message structure and get any representation of its fields,
let’s imagine we want JSON:</p>

<pre><code class="language-cpp">template&lt;typename Message&gt;
std::string to_json(Message m){
    json j;

    for_each_tag_until&lt;message_traits&lt;traits_tag_t&lt;Message&gt;&gt;::field_tags&gt;(
        [m, &amp;j]&lt;typename Tag&gt;(Tag){
            j[field_traits&lt;Tag&gt;::name()] = get_by_tag&lt;Tag&gt;(m).value();
            return false;
        });

    return j.to_string();
}

template&lt;typename Message&gt;
Message from_json(const json&amp; j){
    Message m;

    for_each_tag_until&lt;message_traits&lt;traits_tag_t&lt;Message&gt;&gt;::field_tags&gt;(
        [&amp;m, &amp;j]&lt;typename Tag&gt;(Tag){
            set_by_tag&lt;Tag&gt;(m, j[field_traits&lt;Tag&gt;::name()]);
            return false;
        });

    return m;
}
</code></pre>

<p>Of course, the real implementation is a bit harder than that because a message can
have a nested structure and many other field types. But what’s important is that
it’s possible to tailor the code to specific needs. For example, in our internal
implementation enums are represented as JSON strings or numbers for unknown
values, optional numeric fields are numbers or JSON <code>null</code>, arrays are JSON
strings or JSON arrays of numbers and so on.</p>

<hr />

<h2 id="final-thoughts">Final thoughts</h2>

<p>Being able to navigate over the message structure at compile time and have
access to both its compile and run-time properties has become a surprisingly
powerful mechanism. I already used it in <code>sbepp</code> to replace some generated code
with a single generic implementation that works for all. When I initially wrote
this project, I was sure that “normal” accessors are a key part and meta tags
just provide an extra info that is useful in some relatively rare cases. But
with this new mechanism it’s kinda vice versa. Meta tags and their traits
actually contain <em>all</em> information about the schema. Instead of forwarding
<code>get/set_by_tag</code> to normal accessors that contain hardcoded values and types,
it’s possible to reverse it and let <code>get/set_by_tag</code> actually calculate them
at compile time. In fact, it’s possible to generate SBE implementation for any
other language now without touching <code>sbepp</code>’s own schema compiler. It
effectively becomes a project-specific reflection mechanism and it works even in
C++11.</p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[In this article I’ll show how meta tags turned out to be a surprisingly powerful feature and how I used them to implement core parts of a polymorphic wrapper around a binary message format.]]></summary></entry><entry><title type="html">Multi-version Doxygen documentation with GitHub Pages</title><link href="/2024/06/05/multi-version-doxygen.html" rel="alternate" type="text/html" title="Multi-version Doxygen documentation with GitHub Pages" /><published>2024-06-05T08:05:00+00:00</published><updated>2024-06-05T08:05:00+00:00</updated><id>/2024/06/05/multi-version-doxygen</id><content type="html" xml:base="/2024/06/05/multi-version-doxygen.html"><![CDATA[<h2 id="table-of-contents">Table of contents</h2>

<ul>
  <li><a href="#introduction">Introduction</a></li>
  <li><a href="#mono-docs-problems">Problems with mono-version documentation</a></li>
  <li><a href="#multi-version-docs">Welcome multi-version documentation</a></li>
  <li><a href="#prerequisites">Prerequisites</a></li>
  <li><a href="#overall-design">Overall design</a></li>
  <li><a href="#mechanics">Version switch mechanics</a>
    <ul>
      <li><a href="#folder-names">Adjusting folder names</a></li>
      <li><a href="#main-page">Main page</a></li>
      <li><a href="#version-selector">Version selector</a></li>
    </ul>
  </li>
  <li><a href="#automation">Automation</a>
    <ul>
      <li><a href="#actions">Actions</a>
        <ul>
          <li><a href="#build-docs-action">Building documentation</a></li>
          <li><a href="#version-selector-action">Updating <code>version_selector_html</code></a></li>
          <li><a href="#redirect-page-action">Updating redirect page</a></li>
        </ul>
      </li>
      <li><a href="#workflows">Workflows</a>
        <ul>
          <li><a href="#git-main-docs-workflow">Generating <code>git-main</code> docs</a></li>
          <li><a href="#release-docs-workflow">Generating release docs</a></li>
          <li><a href="#pr-docs-create-workflow">Generating PR docs</a></li>
          <li><a href="#pr-docs-remove-workflow">Removing PR docs</a></li>
        </ul>
      </li>
    </ul>
  </li>
  <li><a href="#mono-to-multi-docs">Upgrading from a mono to multi-version documentation</a></li>
  <li><a href="#wrap-up">Wrap-up</a></li>
</ul>

<hr />

<h2 id="introduction">Introduction</h2>

<p>In this article I’ll show how to host a multi-version Doxygen documentation on
GitHub Pages and automate its generation using GitHub Actions.</p>

<hr />

<h2 id="mono-docs-problems">Problems with mono-version documentation</h2>

<p>I have a small <a href="https://github.com/OleksandrKvl/sbepp">hobby project</a> that
generates documentation using Doxygen and hosts it on GitHub Pages. Nothing
special, but as the project began to evolve, I realized that having a single
set of documentation is not nice for multiple reasons:</p>
<ul>
  <li>it requires special notes like “available/deprecated since version N” around
API.</li>
  <li>it requires similar notes for non-API things, like describing overall design,
recommended practices, examples.</li>
  <li>the above notes are not generally useful because changes are usually driven by
users who are supposed to migrate to the latest version.</li>
</ul>

<p>Basically, over time, documentation gets polluted with those notes to
cover all the existing versions. Of course, another approach is just to remove
older documentation altogether but it’s not friendly to users who, for some
reason, can’t migrate to the latest version immediately.</p>

<hr />

<h2 id="multi-version-docs">Welcome multi-version documentation</h2>

<p>All of the above problems can be solved when a project has dedicated
documentation for each release. Pretty much every large project like
<a href="https://www.boost.org/doc/libs/1_85_0/">Boost</a> or
<a href="https://docs.python.org/3.10/">Python</a> uses this approach. Unfortunately,
Doxygen doesn’t support this out of the box. It generates a completely
standalone set of HTML pages for a given version but has no functionality to
combine multiple such sets and allow the user to switch between them.</p>

<p>I’ve found no existing tutorial to achieve this but ChatGPT suggested an
approach and guided me through most of its steps. Note that I’m neither a front-end
nor a DevOps expert so there’s probably room for further
improvement but, overall, it should be a solid start for anyone with a similar
goal.</p>

<hr />

<h2 id="prerequisites">Prerequisites</h2>

<p>This tutorial uses CMake 3.29.3 and Doxygen 1.10. It also assumes
that the project version is specified using the CMake <code>project()</code> command and is available
via the <code>${PROJECT_VERSION}</code> variable.</p>

<p>The very basic CMake + Doxygen setup looks like this:</p>

<pre><code class="language-cmake">find_package(Doxygen REQUIRED)

# Doxygen options in the form of `DOXYGEN_&lt;OPTION_NAME&gt;`...

doxygen_add_docs(
    doc                      # target name
    "${PROJECT_SOURCE_DIR}"  # sources to scan for docs
)
</code></pre>

<p>All the following changes are made on top of it. It’s not a problem if you’re
using another approach, just apply similar settings in a way you prefer.</p>

<p>The final demo repository is
<a href="https://github.com/OleksandrKvl/multi_version_doxygen_test">here</a>, it contains
all the parts described below. Its docs are
<a href="https://oleksandrkvl.github.io/multi_version_doxygen_test/">here</a>.</p>

<hr />

<h2 id="overall-design">Overall design</h2>

<p>Here’s the directory structure we need:</p>

<pre><code class="language-bash">/           # docs root
    1.0.0/  # version-specific docs
    2.0.0/
</code></pre>

<p>These dirs will be located in a separate branch (e.g. <code>gh-pages</code>) that contains
nothing but the docs themselves. On each release (or another event), new
documentation will be generated using Doxygen and pushed to the root of that
branch into a version-specific directory. As mentioned above,
generated docs know nothing about each other so our two main problems are:</p>
<ul>
  <li>version switch mechanics</li>
  <li>automated docs generation and population of the docs branch</li>
</ul>

<hr />

<h2 id="mechanics">Version switch mechanics</h2>

<h3 id="folder-names">Adjusting folder names</h3>

<p>By default, Doxygen generates docs into the <code>html</code> directory. To have them in a
version-specific directory we can use the <code>HTML_OUTPUT</code> option:</p>

<pre><code class="language-cmake">set(DOXYGEN_HTML_OUTPUT "${PROJECT_VERSION}")
</code></pre>

<p>It will produce the docs into a path like <code>build_dir/doc/1.0.0</code>, where <code>doc</code> is
the Doxygen target name. But there’s a minor problem here. When we build the
<code>doc</code> target, we don’t know the actual version and hence the path to generated
docs. Without this knowledge it’s pretty hard to automate the process. To solve
it, let’s add another directory into that path using Doxygen
<code>OUTPUT_DIRECTORY</code> setting:</p>

<pre><code class="language-cmake">set(DOXYGEN_OUTPUT_DIRECTORY "docs")
</code></pre>

<p>Now, docs are located in <code>build_dir/doc/docs/1.0.0</code> and the new <code>docs</code>
directory holds nothing else but our version-specific docs. To determine the
generated version, we can simply enumerate directories in <code>docs</code>, actually,
there will always be a single one. This trick will be used later in the
<a href="#build-docs-action">building documentation</a> section.</p>

<hr />

<h3 id="main-page">Main page</h3>

<p>OK, now we can have docs in separate version-specific directories but what
should be the main page? When docs are hosted on a standalone server over which
you have full control, you can try to update the path to index page on each new
release. GitHub Pages is not so flexible, it always looks for <code>index.html</code> in
the root directory. To make it work, we have to implement HTML redirect:</p>

<pre><code class="language-html">&lt;!DOCTYPE html&gt;
&lt;html lang="en"&gt;
&lt;head&gt;
    &lt;meta charset="UTF-8"&gt;
    &lt;meta http-equiv="refresh" content="0; url=2.0.0/index.html"&gt;
    &lt;title&gt;Redirecting...&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
    &lt;p&gt;If you are not redirected automatically, &lt;a href="2.0.0/index.html"&gt;click here&lt;/a&gt;.&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;
</code></pre>

<p>The target URL changes with each release so the generation of this
file will be automated later. Here’s what the root directory will look like at
this stage:</p>

<pre><code class="language-bash">/           # root of our documentation
    1.0.0/  # version-specific directories
        index.html # version-specific main page generated by Doxygen
        ...
    2.0.0/
        index.html
        ...

    index.html # redirects to the latest release docs, e.g. `2.0.0/index.html`
</code></pre>

<hr />

<h3 id="version-selector">Version selector</h3>

<p>Now comes the interesting part, we need that dropdown selector element that
does the actual switch between versions. Doxygen has <code>PROJECT_NUMBER</code> option,
when set, it nicely displays the project version next to the project name.
<code>doxygen_add_docs</code> automatically sets it to <code>${PROJECT_VERSION}</code>. If you’re
using standalone Doxygen config, be sure to set it by hand.
That static number has to be replaced with a version selector. Doxygen allows
customization of the header HTML part that’s common for all pages. First, we
need to generate the default one:</p>

<pre><code class="language-bash">doxygen -w html header.html footerFile styleSheetFile Doxyfile.doc
</code></pre>

<p>Here, <code>Doxyfile.doc</code> is the Doxygen configuration file generated by
<code>doxygen_add_docs</code>, it can be found in the binary directory of the <code>doc</code> target
(e.g. <code>build_dir/doc/Doxyfile.doc</code>). If you’re using a standalone configuration
file, use it here instead of <code>Doxyfile.doc</code>.
Of the 3 files generated by this command, we need only <code>header.html</code>, which we are
about to customize. Opening it, we can see where that version text is located:</p>

<pre><code class="language-html">&lt;span id="projectnumber"&gt;&amp;#160;$projectnumber&lt;/span&gt;
</code></pre>

<p>During docs generation, Doxygen replaces that <code>$projectnumber</code> with the
<code>PROJECT_NUMBER</code> value but we don’t need that. Instead, we cut it down to just:</p>

<pre><code class="language-html">&lt;span id="projectnumber"/&gt;
</code></pre>

<p>Then, we add a <code>version_selector_handler.js</code> script that does 3 things:</p>
<ul>
  <li>injects <code>&lt;select&gt;</code> element from <code>version_selector.html</code> inside the
<code>projectnumber</code> span</li>
  <li>implements redirect when different version is selected</li>
  <li>maintains proper active value of the <code>&lt;select&gt;</code> element</li>
</ul>

<p>Here’s the script:</p>

<pre><code class="language-js">// version_selector_handler.js
$(function () {
    var repoName = window.location.pathname.split('/')[1];
    $.get('/' + repoName + '/version_selector.html', function (data) {
        // Inject version selector HTML into the page
        $('#projectnumber').html(data);

        // Event listener to handle version selection
        document.getElementById('versionSelector').addEventListener('change', function () {
            var selectedVersion = this.value;
            window.location.href = '/' + repoName + '/' + selectedVersion + '/index.html';
        });

        // Set the selected option based on the current version
        var currentVersion = window.location.pathname.split('/')[2];
        $('#versionSelector').val(currentVersion);
    });
});
</code></pre>

<p>Note that the script requests <code>/&lt;repo_name&gt;/version_selector.html</code>.
That’s because when a project is published on GitHub Pages, all its URLs have the
<code>&lt;repo_name&gt;</code> prefix. The prefix is not needed if the docs are hosted on a fully
standalone domain.</p>

<p>This script has to be injected at the bottom of the <code>&lt;head&gt;</code> section of the
<code>header.html</code>:</p>

<pre><code class="language-html">&lt;!-- header.html --&gt;
&lt;head&gt;
  &lt;!-- other content... --&gt;
  &lt;script type="text/javascript" src="$relpath^version_selector_handler.js"&gt;&lt;/script&gt;
&lt;/head&gt;
</code></pre>

<p>The actual <code>&lt;select&gt;</code> element is located in the <code>/version_selector.html</code> file and
will be populated automatically. Here’s what it can look like:</p>

<pre><code class="language-html">&lt;!-- version_selector.html --&gt;
&lt;select id="versionSelector"&gt;
    &lt;option value="2.0.0"&gt;2.0.0&lt;/option&gt;
    &lt;option value="1.0.0"&gt;1.0.0&lt;/option&gt;
&lt;/select&gt;
</code></pre>

<p>Here’s the full structure of our documentation directory:</p>

<pre><code class="language-bash">/           # root of our documentation
    1.0.0/  # version-specific directories
        version_selector_handler.js # loads `/version_selector.html`
        index.html # version-specific main page generated by Doxygen
        ...
    2.0.0/
        version_selector_handler.js
        index.html
        ...

    index.html # redirects to the latest release, e.g. `2.0.0/index.html`
    version_selector.html # holds the actual version list
</code></pre>

<p>Note that while every generated documentation set has its own copy of
<code>version_selector_handler.js</code>, there’s only one instance of
<code>version_selector.html</code>. There’s no need to regenerate existing docs to
introduce a new version; only <code>version_selector.html</code> has to be updated.</p>

<p>We need to adjust Doxygen settings to bring the above changes together:</p>

<pre><code class="language-cmake"># specify custom header path
set(DOXYGEN_HTML_HEADER "header.html")

# extra files that will be copied alongside generated HTML files
set(DOXYGEN_HTML_EXTRA_FILES "version_selector_handler.js")
</code></pre>

<p>And that’s it. Now each HTML page has a <code>&lt;select&gt;</code> element with the version list,
which is updated automatically, and the user can switch between versions as they
want.</p>

<hr />

<h2 id="automation">Automation</h2>

<p>The next major step is automating the above steps using GitHub Actions. The
following assumes you know at least the basics of how Actions and Workflows work.</p>

<h3 id="actions">Actions</h3>

<p>We need 3 basic building blocks to achieve our goals:</p>
<ul>
  <li>build the docs</li>
  <li>update <code>version_selector.html</code> that contains version list</li>
  <li>update <code>/index.html</code> HTML redirect</li>
</ul>

<p>Using them, we can implement different strategies with GitHub Workflows.
I’ve extracted them into corresponding custom actions. Here, I will show only
their core parts to keep things short; you can find the full sources in the <a href="https://github.com/OleksandrKvl/multi_version_doxygen_test/tree/main/.github/actions">demo
repository</a>. Note that the actions below are configurable and can be
reused as is; check out their <code>inputs</code> section for a list of parameters.</p>

<hr />

<h4 id="build-docs-action">Building documentation</h4>

<pre><code class="language-yaml"># .github/actions/build-docs/action.yaml

steps:
  - name: Install deps
    shell: bash
    run:   |
            sudo apt install -y cmake
            sudo apt install -y wget
            wget -nv https://www.doxygen.nl/files/doxygen-1.10.0.linux.bin.tar.gz
            tar -xzf doxygen-1.10.0.linux.bin.tar.gz
            echo "$(pwd)/doxygen-1.10.0/bin" &gt;&gt; $GITHUB_PATH

  - name: CMake configuration
    shell: bash
    run:  cmake $ -B $

  - name: CMake build
    shell: bash
    run:  cmake --build $ --target $

  - name: Get docs version
    id: get-docs-version
    shell: bash
    run: |
      subdir=$(basename $(find $/$ -mindepth 1 -maxdepth 1 -type d | head -n 1))
      echo "version=$subdir" &gt;&gt; $GITHUB_OUTPUT

  # deploy to either version-specific or to explicitly provided directory
</code></pre>

<p>It just installs CMake and Doxygen and builds the target that is responsible for
invoking Doxygen. The only interesting part here is the last step: it retrieves
the version of the generated documentation by taking the name of the first
subdirectory found in the Doxygen <code>OUTPUT_DIRECTORY</code>.</p>

<hr />

<h4 id="version-selector-action">Updating <code>version_selector.html</code></h4>

<p><code>version_selector.html</code> should contain all the versions we have published on our
docs branch (e.g. <code>gh-pages</code>). This is achieved by enumerating the top-level
directories of that branch, skipping hidden ones, and then generating an HTML file from that list:</p>

<pre><code class="language-yaml"># .github/actions/update-version-selector/action.yaml

steps:
  - name: Discover versions
    id: discover-versions
    shell: bash
    run: |
      git fetch origin $
      dirs=$(git ls-tree --name-only -d origin/$ | sort -rV)
      echo "counter=$(echo "$dirs" | wc -l | xargs)" &gt;&gt; $GITHUB_OUTPUT

      mkdir $
      # Create HTML
      echo '&lt;select id="$"&gt;' &gt; $/$
      for dir in $dirs; do
          if [[ "$(basename "$dir")" != .* ]]; then
              version=$(basename "$dir")
              echo "    &lt;option value=\"$version\"&gt;$version&lt;/option&gt;" &gt;&gt; $/$
          fi
      done
      echo '&lt;/select&gt;' &gt;&gt; $/$

  # deploy step...
</code></pre>

<hr />

<h4 id="redirect-page-action">Updating redirect page</h4>

<p>This one is trivial: just take the target URL from the parameter and generate
the standard HTML redirect:</p>

<pre><code class="language-yaml"># .github/actions/update-redirect-page/action.yaml

steps:
  - name: Generate redirect HTML
    shell: bash
    run: |
      mkdir $
      cat &lt;&lt; EOF &gt; $/$
      &lt;!DOCTYPE html&gt;
      &lt;html lang="en"&gt;
      &lt;head&gt;
          &lt;meta charset="UTF-8"&gt;
          &lt;meta http-equiv="refresh" content="0; url=$"&gt;
          &lt;title&gt;Redirecting...&lt;/title&gt;
      &lt;/head&gt;
      &lt;body&gt;
          &lt;p&gt;If you are not redirected automatically, &lt;a href="$"&gt;click here&lt;/a&gt;.&lt;/p&gt;
      &lt;/body&gt;
      &lt;/html&gt;
      EOF

  # deploy step...
</code></pre>

<hr />

<h3 id="workflows">Workflows</h3>

<p>The above three actions are enough to implement almost any strategy for docs
generation. Let’s see a couple of common ones.</p>

<hr />

<h4 id="git-main-docs-workflow">Generating <code>git-main</code> docs</h4>

<p>It’s useful to have not only release-specific docs but the ones corresponding
to the latest, not yet released version of a project. This can be achieved by
generating docs from the <code>main</code> branch whenever new commits are pushed into it.
We’ll give such docs a <code>git-main</code> “version”:</p>

<pre><code class="language-yaml"># .github/workflows/create-git-main-docs.yml

on:
  push:
    branches:
      - main

jobs:
  create-git-main-docs:
    runs-on: ubuntu-22.04

    steps:
      - uses: actions/checkout@v4

      - name: Build docs
        id: build-docs
        uses: ./.github/actions/build-docs
        with:
          cmake_target: 'doc'
          docs_dir: 'doc/docs'
          destination_dir: git-main
          github_token: $

      - name: Update version selector
        id: update-version-selector
        uses: ./.github/actions/update-version-selector
        with:
          github_token: $

      - name: Create redirect page if there are no releases
        if: $
        uses: ./.github/actions/update-redirect-page
        with:
          github_token: $
          target_url: git-main/index.html
</code></pre>

<p>The first two steps are self-explanatory; the only thing to notice is the
<code>destination_dir: git-main</code> argument to the <code>build-docs</code> action. It forces
<code>build-docs</code> to deploy the documentation to the <code>git-main</code> directory rather than to a
version-specific one, because those are reserved for releases. The last step
creates the redirect page when your documentation branch has no
release-specific docs yet. It’s useful when you start experimenting with this setup and want to
test how everything works without making new releases.</p>

<hr />

<h4 id="release-docs-workflow">Generating release docs</h4>

<p>Generating release-specific docs is the main goal of this tutorial and its
workflow is even simpler than the above. This time <code>destination_dir</code> is not set
so the docs are published into a version-specific directory, e.g. <code>1.0.0</code>:</p>

<pre><code class="language-yaml">on:
  release:
    types: [released]

jobs:
  create-release-docs:
    runs-on: ubuntu-22.04

    steps:
      - uses: actions/checkout@v4

      - name: Build docs
        id: build-docs
        uses: ./.github/actions/build-docs
        with:
          cmake_target: 'doc'
          docs_dir: 'doc/docs'
          github_token: $

      - name: Update redirect HTML
        uses: ./.github/actions/update-redirect-page
        with:
          github_token: $
          target_url: $/index.html

      - name: Update version selector
        uses: ./.github/actions/update-version-selector
        with:
          github_token: $
</code></pre>

<hr />

<h4 id="pr-docs-create-workflow">Generating PR docs</h4>

<p>The above two workflows are a good basis for generating docs for a project. As an
example of something “extra”, let’s generate docs from a pull request. It can be
useful for projects with many contributors to check that docs for a new feature
are correct. Generating them for every PR makes no sense, so we need some
condition here. GitHub has different ways to control when to run such a
workflow; I’ve chosen the simplest one: run it when a PR is labeled as
<code>documentation</code>. It’s very similar to the <code>git-main</code> workflow but now
<code>destination_dir</code> is set to
<code>PR-$</code> and the redirect page is never touched:</p>

<pre><code class="language-yaml"># .github/workflows/create-pr-docs.yml

on:
  pull_request:
    types: [labeled, synchronize]
    branches:
      - main

jobs:
  create-pr-docs:
    if: $
    runs-on: ubuntu-22.04

    steps:
      - uses: actions/checkout@v4
        with:
          ref: $

      - name: Build docs
        id: build-docs
        uses: ./.github/actions/build-docs
        with:
          cmake_target: 'doc'
          docs_dir: 'doc/docs'
          destination_dir: PR-$
          github_token: $

      - name: Update version selector
        uses: ./.github/actions/update-version-selector
        with:
          github_token: $
</code></pre>

<hr />

<h4 id="pr-docs-remove-workflow">Removing PR docs</h4>

<p>Unlike <code>git-main</code> and release docs, which stay there forever, PR docs should be
removed when the PR is closed:</p>

<pre><code class="language-yaml"># .github/workflows/remove-pr-docs.yml

on:
  pull_request:
    types: [closed]
    branches:
      - main

jobs:
  remove-pr-docs:
    if: $
    runs-on: ubuntu-22.04

    steps:
      - name: Remove PR docs
        uses: peaceiris/actions-gh-pages@v4
        with:
          github_token: $
          publish_dir: $
          destination_dir: PR-$

      - uses: actions/checkout@v4
        with:
          sparse-checkout: .github

      - name: Update version selector
        uses: ./.github/actions/update-version-selector
        with:
          github_token: $
</code></pre>

<p>Here, <code>peaceiris/actions-gh-pages</code> removes <code>destination_dir</code> before pushing new
files to it, and since the directory passed as <code>publish_dir</code> is empty, this effectively
results in removing the PR docs.</p>

<hr />

<h2 id="mono-to-multi-docs">Upgrading from a mono to multi-version documentation</h2>

<p>Switching from mono- to multi-version documentation requires a bit of manual
intervention. <code>git-main</code> docs will be generated automatically once the new
functionality is merged into <code>main</code>; the same applies to new releases but not
to the previous ones. For my project, I’ve decided to integrate only the latest
release available at the time into the new multi-version docs branch. But that
release generates docs without the new version selector functionality, so it
wouldn’t work as is. Here’s how I’ve done it:</p>
<ul>
  <li>pulled release tag locally</li>
  <li>applied Doxygen-specific changes on top of it</li>
  <li>generated its documentation, now it has version selector functionality</li>
  <li>manually pushed it to my <code>gh-pages</code> branch into the corresponding
version-specific directory</li>
  <li>manually added that version to <code>version_selector.html</code></li>
  <li>updated <code>index.html</code> redirect page</li>
</ul>

<p>That’s it, now old release docs are fully integrated. Not a lot of work for a
single release but if one needs this for more releases, it makes sense to
automate the process.</p>

<hr />

<h2 id="wrap-up">Wrap-up</h2>

<p>We’re done. At this point we have a fully automated system that generates and
publishes documentation for a project, allowing users to switch between
versions. The presented approach is not the only one, and many other customizations
are possible but it should be a good place to start.</p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[Table of contents]]></summary></entry><entry><title type="html">From range projections to projected ranges</title><link href="/2021/10/11/projected-ranges.html" rel="alternate" type="text/html" title="From range projections to projected ranges" /><published>2021-10-11T16:22:00+00:00</published><updated>2021-10-11T16:22:00+00:00</updated><id>/2021/10/11/projected-ranges</id><content type="html" xml:base="/2021/10/11/projected-ranges.html"><![CDATA[<h2 id="table-of-contents">Table of contents</h2>

<ul>
  <li><a href="#intro">Introduction</a></li>
  <li><a href="#what-proj-is">What a projection is</a></li>
  <li><a href="#problems">Problems with existing design</a></li>
  <li><a href="#projected-ranges">Projected ranges to the rescue</a></li>
  <li><a href="#implementation">Implementation story</a>
    <ul>
      <li><a href="#cpp20-iterators">C++20 iterators overview</a></li>
      <li><a href="#need-for-better-design">Need for a better design</a></li>
      <li><a href="#next-iter-iter">The next iteration of iterators</a></li>
      <li><a href="#views-projection">views::projection</a></li>
      <li><a href="#views-narrow-projection">views::narrow_projection</a></li>
      <li><a href="#impact-on-algos">Impact on algorithms</a></li>
      <li><a href="#reducing-derefs">Reducing number of dereferences</a></li>
      <li><a href="#root-method">root() method</a></li>
      <li><a href="#the-flaw">Major flaw</a></li>
    </ul>
  </li>
  <li><a href="#other-use-cases">Other use-cases</a></li>
  <li><a href="#transform">The role of std::views::transform</a></li>
  <li><a href="#demo">Demo</a></li>
  <li><a href="#wrap-up">Wrap-up</a></li>
</ul>

<hr />

<h2 id="intro">Introduction</h2>

<p>When I first watched range-related talks, I liked the idea of projections. I
played with them a bit and still liked them. However, after trying to write
range-based algorithms I found them not good enough and not pleasant to work with.
In this post I’ll explain why I don’t like range projections in their current
form and how I propose to fix them (demo implementation is provided).</p>

<p><em>Update. After this article was published, I received some feedback and realized
that the proposed design has one problem. I described it in the section
<a href="#the-flaw">Major flaw</a> and left the rest of the article untouched. Please keep that
in mind while reading it. Thanks to all the people who shared their feedback and
thoughts.</em></p>

<hr />

<h2 id="what-proj-is">What a projection is</h2>

<p>If you are not familiar with projections, here’s a brief explanation.
A projection is an <em>invocable</em> entity that is applied to a range element before the
algorithm’s logic uses it. It can be a lambda, a pointer to member (either data
or function) or just a function pointer.
Throughout this article I will use these two structures for examples:</p>

<pre><code class="language-cpp">struct Y
{
    int a;
    int b;
    auto operator&lt;=&gt;(const Y&amp;) const = default;
};

struct X
{
    int x;
    Y y;
    auto operator&lt;=&gt;(const X&amp;) const = default;
};
</code></pre>

<p>If you don’t know what <code>operator&lt;=&gt;</code> is, don’t worry. In the context of this
article you only need to know that it provides all the comparison operations
(<code>==, !=, &lt;, &lt;=, &gt;, &gt;=</code>) for both <code>X</code> and <code>Y</code>, and that they operate in a member-wise
fashion. OK, back to the subject. Imagine that we want to sort
a vector of <code>X</code> based on <code>X::x</code>. Here’s how this can be done
with the pre-ranges STL:</p>

<pre><code class="language-cpp">std::vector&lt;X&gt; v;

std::sort(std::begin(v), std::end(v), [](const auto&amp; lhs, const auto&amp; rhs){
    return lhs.x &lt; rhs.x;
});
</code></pre>

<p>And here’s how it can be done using a projection and a range-based algorithm:</p>

<pre><code class="language-cpp">std::ranges::sort(v, std::less{}, &amp;X::x);
</code></pre>

<p>Now, <code>std::less</code> operates on <code>X::x</code> values but it’s important to understand that
the algorithm itself sorts original <code>X</code> elements, not just their <code>X::x</code> parts. 
Roughly:</p>

<pre><code class="language-cpp">auto sort(auto range, auto compare, auto projection){
    // `it1` and `it2` are iterators from `range`
    // comparator is invoked on projected values
    if(compare(std::invoke(projection, *it1), std::invoke(projection, *it2))){
        // but moving/swapping is done on non-projected values
        std::ranges::iter_swap(it1, it2);
    }
    // ...
}
</code></pre>

<p>A projection provides a clear separation of comparison logic from element
manipulation. These things are really orthogonal, and it’s nice that we can now
keep them separate. But while the idea behind projections is great, their
implementation has unpleasant side effects which sometimes make developers’
lives harder.</p>
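
<p>To see this separation in action, here is a minimal, self-contained sketch. The
<code>Item</code> type and the <code>sorted_by_key</code> helper are assumptions made up for this
illustration; they are not part of any library. Only <code>key</code> takes part in the
comparison, yet whole elements are moved, so each <code>payload</code> stays attached to
its <code>key</code>:</p>

<pre><code class="language-cpp">#include &lt;algorithm&gt;
#include &lt;functional&gt;
#include &lt;vector&gt;

// Hypothetical element type used only for this illustration.
struct Item
{
    int key;
    int payload;
};

// The comparator sees only `Item::key` (the projected value),
// but the algorithm swaps/moves whole `Item` objects.
inline std::vector&lt;Item&gt; sorted_by_key(std::vector&lt;Item&gt; items)
{
    std::ranges::sort(items, std::less{}, &amp;Item::key);
    return items;
}
</code></pre>

<p>For example, passing <code>{{3, 30}, {1, 10}, {2, 20}}</code> gives back
<code>{{1, 10}, {2, 20}, {3, 30}}</code>.</p>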

<hr />

<h2 id="problems">Problems with existing design</h2>

<p>Several months ago I became involved in the
<a href="https://wg21.link/p1708r5">P1708 Simple Statistical Functions</a> proposal. I needed
those functions for my hobby project and started implementing them. This was my
first experience of writing a range-based API, and that’s where most of my
unpleasant experience with the current projection design comes from.</p>

<h3 id="projections-uglify-function-signatures">Projections uglify function signatures</h3>

<p>In a range-based API you usually have at least one range, and for each range you need to
accept a projection. For example, here is a simplified signature of
<code>copy_if</code> with the return type and the <code>O</code> type requirements removed:</p>

<pre><code class="language-cpp">template&lt;ranges::input_range R, typename O,
    class Proj = std::identity,
    std::indirect_unary_predicate&lt;
        std::projected&lt;ranges::iterator_t&lt;R&gt;, Proj&gt;&gt; Pred&gt;
constexpr auto copy_if(R&amp;&amp; r, O result, Pred pred, Proj proj = {});
</code></pre>

<p>All range-based algorithms must have this additional function and template
parameter that defaults to a
no-op <code>std::identity</code> projection. Looks innocent? In P1708 we have
weighted statistics so we use two ranges: one for values and one for weights,
thus, we need one projection per range:</p>

<pre><code class="language-cpp">template&lt;
    typename Values,
    typename ValuesProj = std::identity,
    typename Weights,
    typename WeightsProj = std::identity&gt;
constexpr auto mean(
    Values&amp;&amp; v, Weights&amp;&amp; w, ValuesProj proj1 = {}, WeightsProj proj2 = {});
</code></pre>

<p>Add to it more algorithm specific parameters like comparators and you’ll get
something like <code>std::ranges::merge()</code>:</p>

<pre><code class="language-cpp">template&lt;
    ranges::input_range R1,
    ranges::input_range R2,
    std::weakly_incrementable O,
    class Comp = ranges::less,
    class Proj1 = std::identity,
    class Proj2 = std::identity&gt;
constexpr auto merge(R1&amp;&amp; r1, R2&amp;&amp; r2, O result,
        Comp comp = {}, Proj1 proj1 = {}, Proj2 proj2 = {});
</code></pre>

<p>I believe that in a good API default function arguments should be rare and
their number should be small. Here, we have 6 parameters and 3 of them have
default arguments. This signature is not good at all. We will also discuss the
usability of such an API in the following sections.</p>

<p>Another issue, though not so critical, is getting at the projected value type,
for example, in order to constrain it. Recall the constraint from <code>copy_if()</code>:
<code>std::indirect_unary_predicate&lt;std::projected&lt;ranges::iterator_t&lt;R&gt;, Proj&gt;&gt; Pred</code>.
It ensures that the predicate <code>Pred</code> can be called with the result of applying the
projection <code>Proj</code> to the value of the iterator of a range <code>R</code>. It’s understandable
but still quite complex. In P1708R5 functions are supposed to work only on
standard arithmetic types. Here is a way to achieve that:</p>

<pre><code class="language-cpp">template&lt;typename Range, typename Proj = std::identity&gt;
requires std::is_arithmetic_v&lt;
    std::remove_cvref_t&lt;
        std::indirect_result_t&lt;Proj&amp;, std::ranges::iterator_t&lt;Range&gt;&gt;&gt;&gt;
double mean(Range&amp;&amp; r, Proj = {});
</code></pre>

<p>I mean, OK, it works and with some effort you can do it properly. But I don’t
like its complexity. Writing your own algorithms in the classic STL style was
simple, writing them for ranges is not if you want to support projections
properly.</p>
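
<p>One way to keep such constraints manageable is to hide the trait machinery behind
an alias. The sketch below is only an illustration: <code>projected_value_t</code>,
<code>Sample</code> and <code>accepts_arithmetic</code> are hypothetical names, not standard
facilities and not part of P1708. It relies on <code>std::indirect_result_t</code>, which
expects an iterator type as its second argument:</p>

<pre><code class="language-cpp">#include &lt;concepts&gt;
#include &lt;functional&gt;
#include &lt;iterator&gt;
#include &lt;ranges&gt;
#include &lt;type_traits&gt;
#include &lt;vector&gt;

// Hypothetical alias, not a standard facility: the cv/ref-stripped type that
// an algorithm sees after applying `Proj` to an element of `Range`.
template&lt;typename Range, typename Proj = std::identity&gt;
using projected_value_t = std::remove_cvref_t&lt;
    std::indirect_result_t&lt;Proj&amp;, std::ranges::iterator_t&lt;Range&gt;&gt;&gt;;

// Hypothetical element type used only for this illustration.
struct Sample
{
    int count;
    double weight;
};

// Without a projection the algorithm sees the element itself.
static_assert(std::same_as&lt;projected_value_t&lt;std::vector&lt;Sample&gt;&gt;, Sample&gt;);
// A pointer to data member projects to the member's type.
static_assert(std::same_as&lt;
    projected_value_t&lt;std::vector&lt;Sample&gt;, decltype(&amp;Sample::count)&gt;, int&gt;);

// The constraint itself becomes a one-liner.
template&lt;typename Range, typename Proj = std::identity&gt;
requires std::is_arithmetic_v&lt;projected_value_t&lt;Range, Proj&gt;&gt;
constexpr bool accepts_arithmetic(Range&amp;&amp;, Proj = {})
{
    return true;
}
</code></pre>

<p>The alias doesn’t remove the complexity, it only moves it to one place, and every
algorithm still needs the extra <code>Proj</code> parameter.</p>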

<hr />

<h3 id="projections-are-not-easily-composable">Projections are not easily composable</h3>

<p>Imagine that you’re implementing a range-based algorithm and you need to call
another algorithm with one additional projection on top. For example, the <a href="https://en.wikipedia.org/wiki/Geometric_mean#Relationship_with_logarithms">geometric
mean</a>
is usually implemented as <code>std::exp()</code> of the arithmetic mean of logarithms.
This requires a combination of two projections, the original one
and <code>std::log()</code>:</p>

<pre><code class="language-cpp">template&lt;typename R, typename P = std::identity&gt;
constexpr double geometric_mean(R&amp;&amp; r, P proj = {})
{
    const auto logs_mean = mean(
        std::forward&lt;R&gt;(r),
        [&amp;](const auto&amp; value)
        {
            return std::log(std::invoke(proj, value));
        });

    return std::exp(logs_mean);
}
</code></pre>

<p>It would be nice to move <code>std::log()</code> part to a separate independent projection
but such a projection wouldn’t be really independent because it needs to know
about the preceding one:</p>

<pre><code class="language-cpp">// to make it a reusable function-like object we again need this additional
// parameter everywhere
template&lt;typename P = std::identity&gt;
class log_proj{
public:
    explicit log_proj(P proj = {}): p{std::move(proj)}{}

    auto operator()(const auto&amp; value){
        return std::log(std::invoke(p, value));
    }
private:
    P p;
};
// with the above it's possible to write:
// const auto logs_mean = mean(r, log_proj{proj});

// and this is what I want as a client:
struct nice_log_proj{
    auto operator()(const auto&amp; value){
        return std::log(value);
    }
};
</code></pre>

<p>Of course, it’s possible to create another utility to chain projections together
like <code>mean(r, chain(std::move(proj), nice_log_proj{}))</code> but at the moment
there’s no standard tool for that.
This problem also occurs when you want to sort <code>std::vector&lt;X&gt;</code> by <code>Y::a</code> member of
<code>X::y</code>. In C++ it’s not possible to get a pointer to a member of a member, <code>&amp;X::y::a</code>
doesn’t work, something like <code>chain(&amp;X::y, &amp;Y::a)</code> is needed.</p>
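
<p>Here is a minimal sketch of what such a helper could look like. The name
<code>chain</code> and its two-argument form are assumptions made for this illustration,
not a standard or library facility, and <code>Outer</code>/<code>Inner</code> are hypothetical types
that mirror <code>X</code> and <code>Y</code>:</p>

<pre><code class="language-cpp">#include &lt;algorithm&gt;
#include &lt;functional&gt;
#include &lt;utility&gt;
#include &lt;vector&gt;

// Hypothetical types mirroring `X` and `Y` from this article.
struct Inner
{
    int a;
    int b;
};

struct Outer
{
    int x;
    Inner y;
};

// Hypothetical helper (not a standard facility): applies `first`, then
// `second` to its result. With pointers to data members and an lvalue
// element, each step yields a reference into the original element.
template&lt;typename First, typename Second&gt;
constexpr auto chain(First first, Second second)
{
    return [first = std::move(first), second = std::move(second)](
               auto&amp;&amp; value) -&gt; decltype(auto)
    {
        return std::invoke(
            second, std::invoke(first, std::forward&lt;decltype(value)&gt;(value)));
    };
}

// Sorts whole `Outer` objects by the nested `Outer::y.a` member.
inline std::vector&lt;Outer&gt; sorted_by_inner_a(std::vector&lt;Outer&gt; items)
{
    std::ranges::sort(items, std::less{}, chain(&amp;Outer::y, &amp;Inner::a));
    return items;
}
</code></pre>

<p>It works, but every project ends up writing its own version of this utility, and
the algorithm signatures stay just as heavy as before.</p>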

<hr />

<h3 id="projections-complicate-callers-code">Projections complicate caller’s code</h3>

<p>Imagine a function with several default arguments:</p>

<pre><code class="language-cpp">template&lt;typename R, typename P = std::identity&gt;
void f(R&amp;&amp; range, int x = 1, int y = 2, P p = {});
</code></pre>

<p>Because the projection is usually placed at the end of signature, if you need to
use it, you have to specify all the default arguments by hand:</p>

<pre><code class="language-cpp">f(v);               // without projection
f(v, 1, 2, &amp;X::x);  // with projection
</code></pre>

<p>What if the author of <code>f()</code> decides to change default arguments? Clients will be
forced to rewrite the code to preserve “default” behavior. It’s less painful
with something like <code>std::ranges::sort()</code>:</p>

<pre><code class="language-cpp">std::ranges::sort(v);   // without projection
std::ranges::sort(v, std::less{}, &amp;X::x);   // with projection
std::ranges::sort(v, {}, &amp;X::x);    // less verbose but less readable too
</code></pre>

<p>But now it’s either too verbose with explicit <code>std::less{}</code> or less readable
with <code>{}</code>. So the client is either forced to
explicitly write arguments by hand or use less readable constructions if that’s
possible at all. Going back to weighted stats:</p>

<pre><code class="language-cpp">template&lt;
    typename Values,
    typename Weights,
    typename ValuesProj = std::identity,
    typename WeightsProj = std::identity&gt;
constexpr auto mean(
    Values&amp;&amp; v, Weights&amp;&amp; w, ValuesProj proj1 = {}, WeightsProj proj2 = {});

mean(values, weights);    // no projections, great
mean(values, weights, &amp;X::x); // only value projection, OK
mean(values, weights, {}, &amp;X::x); // only weight projection, ugly :(
</code></pre>
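
<p>For completeness, here is a runnable sketch of such a function together with the
awkward call. <code>weighted_mean</code>, <code>Entry</code> and <code>weighted_mean_demo</code> are
hypothetical names chosen for this illustration; the actual P1708 interface may
look different:</p>

<pre><code class="language-cpp">#include &lt;functional&gt;
#include &lt;iterator&gt;
#include &lt;ranges&gt;
#include &lt;vector&gt;

// Hypothetical weighted mean with one projection per range.
// Assumes both ranges are non-empty and the weights don't sum to zero.
template&lt;
    std::ranges::input_range Values,
    std::ranges::input_range Weights,
    typename ValuesProj = std::identity,
    typename WeightsProj = std::identity&gt;
double weighted_mean(
    Values&amp;&amp; values, Weights&amp;&amp; weights,
    ValuesProj vproj = {}, WeightsProj wproj = {})
{
    double weighted_sum = 0.0;
    double weight_total = 0.0;
    auto w = std::ranges::begin(weights);
    const auto w_end = std::ranges::end(weights);
    for (auto v = std::ranges::begin(values);
         v != std::ranges::end(values) &amp;&amp; w != w_end; ++v, ++w)
    {
        const double weight = std::invoke(wproj, *w);
        weighted_sum += std::invoke(vproj, *v) * weight;
        weight_total += weight;
    }
    return weighted_sum / weight_total;
}

// Hypothetical element type: only the weights need a projection.
struct Entry
{
    double weight;
};

inline double weighted_mean_demo()
{
    const std::vector&lt;double&gt; values{1.0, 3.0};
    const std::vector&lt;Entry&gt; weights{{1.0}, {3.0}};
    // `{}` is a placeholder for the values projection we don't care about.
    return weighted_mean(values, weights, {}, &amp;Entry::weight);
}
</code></pre>

<p>The <code>{}</code> placeholder compiles because the template parameter falls back to
<code>std::identity</code>, but the reader has to know the parameter order to understand
what it stands for.</p>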

<hr />

<h3 id="root-cause-of-all-the-problems">Root cause of all the problems</h3>

<p>From the interface point of view it’s pretty simple: the problem is that a range
and its projection logically represent a single entity but are passed to functions
separately via distinct parameters. It’s the same problem as with the
error-prone <code>f(const char* str, std::size_t len);</code>, and we all know it’s a bad
way of doing things. Clients are forced to separate things, developers are
forced to combine them back together. I want something better.</p>

<hr />

<h2 id="projected-ranges">Projected ranges to the rescue</h2>

<p>I had quite a simple idea: range and projection should be combined into a single
thing using some kind of view, e.g., <code>views::projection</code>. This would make all
those projection-related parameters redundant, algorithms wouldn’t care about
them at all, they would only operate on a range itself, just like in classic
STL. Here’s what I wanted:</p>

<pre><code class="language-cpp">// no projection-related parameters
auto sort(auto&amp;&amp; range, auto cmp = std::ranges::less{});

// sorts elements by `X::x` member, analog of current sort(v, {}, &amp;X::x);
sort(v | projection(&amp;X::x));

const auto log_proj = [](const auto value)
{
    return std::log(value);
};

// nested projections, actually, it's a nested range now
constexpr double geometric_mean(auto&amp;&amp; r)
{
    const auto logs_mean = mean(r | projection(log_proj));
    return std::exp(logs_mean);
}
</code></pre>

<p>Isn’t it great? No more projection-related parameters, signatures are clean,
everything is perfectly composable. It
simplifies projections in the same way as ranges simplified usage and
composition of iterator-based algorithms.</p>

<p>I call it a <em>projected range</em> because it combines a range and a projection.
Such a range
has a very important property: its <code>operator*()</code> returns the projected
value, while copy/move/swap/assign operations are performed on the whole
underlying object. Immediately, another type of projection
came to my mind, the so-called <code>narrow_projection</code>. All of its operations are
performed on the projected part only. It’s <em>narrow</em> in the sense that it
represents only a narrow part of the object, while the <em>wide</em> <code>projection</code>
represents the wider object behind it:</p>

<pre><code class="language-cpp">std::vector&lt;X&gt; v{{3, {30, 300}}, {2, {20, 200}}, {1, {10, 100}}};

// sorts the whole X objects using &amp;X::x member
std::sort(v | projection(&amp;X::x));
// {{1, {10, 100}}, {2, {20, 200}}, {3, {30, 300}}}

// sorts only X::x
std::sort(v | narrow_projection(&amp;X::x), std::ranges::greater{});
// {{3, {10, 100}}, {2, {20, 200}}, {1, {30, 300}}}

// sorts X::y by Y::a
std::sort(
    v | narrow_projection(&amp;X::y) | projection(&amp;Y::a), std::ranges::greater{});
// {{3, {30, 300}}, {2, {20, 200}}, {1, {10, 100}}}
</code></pre>

<p>Delighted, I started to think how to implement it and it turned out to be a
bit harder than I expected.</p>

<hr />

<h2 id="implementation">Implementation story</h2>

<p>If you don’t know how range views work, here’s the basic idea: all the work is
done inside custom “smart” iterators. For example, the iterator of <code>views::transform</code>,
the view most relevant to projections, has an <code>operator*()</code> which looks like this:</p>

<pre><code class="language-cpp">class transform_view_iterator{
private:
    Iterator it;    // underlying iterator
    F f;            // transform function

public:
    decltype(auto) operator*(){
        return std::invoke(f, *it);
    }
    // ...
};
</code></pre>

<p>Other operations mostly take care of advancing <code>it</code> properly. To implement
<code>views::projection</code> we will need to implement a custom iterator, so we
first need to understand how iterators work in C++20.</p>

<hr />

<h3 id="cpp20-iterators">C++20 iterators overview</h3>

<p>Here’s the brief overview of iterator-related types and operations. It’s
heavily based on articles/papers by Eric Niebler
(<a href="http://ericniebler.com/2015/01/28/to-be-or-not-to-be-an-iterator/">0</a>,
<a href="https://ericniebler.com/2015/02/03/iterators-plus-plus-part-1/">1</a>,
<a href="https://ericniebler.com/2015/02/13/iterators-plus-plus-part-2/">2</a>,
<a href="https://ericniebler.com/2015/03/03/iterators-plus-plus-part-3/">3</a>,
<a href="https://ericniebler.github.io/std/wg21/D0022.html">4</a>).
Read them if you want more details and reasoning behind current design.</p>

<p><code>iter_value_t/value_type</code> - the type of a value which the iterator represents. The
value of this type can be copied/moved from the iterator.</p>

<p><code>iter_reference_t operator*()</code> - dereference operator, usually returns lvalue
reference to <code>value_type</code> (but not required), must be convertible to <code>iter_value_t</code>.</p>

<p><code>iter_rvalue_reference_t iter_move(it)</code> - customization
point for moving value out of iterator, usually returns rvalue reference to
<code>value_type</code> (but not required). Also must be convertible to <code>iter_value_t</code>. If
not defined by iterator, <code>std::move(*it)</code> is used.</p>

<p><code>void iter_swap(it1, it2)</code> - customization point for swapping values between
two iterators. If not defined, performs <code>std::ranges::swap(*it1, *it2)</code> if 
possible, otherwise uses <code>iter_move()</code> to swap elements “by-hand”.</p>
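
<p>These associated types and defaults can be checked directly. The sketch below
compares a <code>std::vector&lt;int&gt;</code> iterator, which refers to stored elements, with a
<code>std::ranges::iota_view&lt;int&gt;</code> iterator, which computes its values on the fly.
The <code>VecIt</code>/<code>IotaIt</code> aliases and the <code>swap_ends</code> helper are names made up for
this illustration:</p>

<pre><code class="language-cpp">#include &lt;concepts&gt;
#include &lt;iterator&gt;
#include &lt;ranges&gt;
#include &lt;vector&gt;

// An iterator into stored elements: the "simple" int / int&amp; / int&amp;&amp; case.
using VecIt = std::vector&lt;int&gt;::iterator;
static_assert(std::same_as&lt;std::iter_value_t&lt;VecIt&gt;, int&gt;);
static_assert(std::same_as&lt;std::iter_reference_t&lt;VecIt&gt;, int&amp;&gt;);
static_assert(std::same_as&lt;std::iter_rvalue_reference_t&lt;VecIt&gt;, int&amp;&amp;&gt;);

// An iterator that generates values: `operator*` returns a prvalue, so the
// "reference" types are not references at all.
using IotaIt = std::ranges::iterator_t&lt;std::ranges::iota_view&lt;int&gt;&gt;;
static_assert(std::same_as&lt;std::iter_value_t&lt;IotaIt&gt;, int&gt;);
static_assert(std::same_as&lt;std::iter_reference_t&lt;IotaIt&gt;, int&gt;);
static_assert(std::same_as&lt;std::iter_rvalue_reference_t&lt;IotaIt&gt;, int&gt;);

// `std::ranges::iter_swap` swaps the values two iterators refer to; for
// vector iterators it falls back to swapping `*it1` and `*it2`.
inline std::vector&lt;int&gt; swap_ends(std::vector&lt;int&gt; v)
{
    if (v.size() &gt; 1)
    {
        std::ranges::iter_swap(v.begin(), v.end() - 1);
    }
    return v;
}
</code></pre>

<p>A custom iterator can provide its own <code>iter_move</code> and <code>iter_swap</code> to change
what gets moved or swapped, independently of what <code>operator*()</code> returns.</p>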

<p><code>common_reference</code> requirements for readable iterators. Now comes the tricky
part. As you might have
noticed, <code>iter_value_t</code>, <code>iter_reference_t</code>, <code>iter_rvalue_reference_t</code> are not
required to be as simple as <code>int</code>, <code>int&amp;</code> and <code>int&amp;&amp;</code> respectively. But there
must be pairwise <code>common_reference</code>s to represent the relationships between them.
Basically, <code>common_reference&lt;T,U&gt;</code> is a type to which both <code>T</code> and <code>U</code> can be
converted or bound; it’s not required to be a true reference type.</p>

<pre><code class="language-cpp">static_assert(std::same_as&lt;std::common_reference_t&lt;int&amp;, const int&amp;&gt;, const int&amp;&gt;);
static_assert(std::same_as&lt;std::common_reference_t&lt;int&amp;&amp;, int&amp;&amp;&gt;, int&amp;&amp;&gt;);
static_assert(std::same_as&lt;std::common_reference_t&lt;int&amp;&amp;, int&amp;&gt;, const int&amp;&gt;);
static_assert(std::same_as&lt;std::common_reference_t&lt;int&amp;, int&gt;, int&gt;);
</code></pre>

<p>You can find
these requirements in the <a href="https://en.cppreference.com/w/cpp/iterator/indirectly_readable"><code>std::indirectly_readable</code></a> concept:</p>

<pre><code class="language-cpp">template&lt;class In&gt;
concept __IndirectlyReadableImpl = // exposition only
requires(const In in) {
    typename std::iter_value_t&lt;In&gt;;
    typename std::iter_reference_t&lt;In&gt;;
    typename std::iter_rvalue_reference_t&lt;In&gt;;
    { *in } -&gt; std::same_as&lt;std::iter_reference_t&lt;In&gt;&gt;;
    { ranges::iter_move(in) } -&gt; std::same_as&lt;std::iter_rvalue_reference_t&lt;In&gt;&gt;;
} &amp;&amp;
std::common_reference_with&lt;
    std::iter_reference_t&lt;In&gt;&amp;&amp;, std::iter_value_t&lt;In&gt;&amp;&gt; &amp;&amp;
std::common_reference_with&lt;
    std::iter_reference_t&lt;In&gt;&amp;&amp;, std::iter_rvalue_reference_t&lt;In&gt;&amp;&amp;&gt; &amp;&amp;
std::common_reference_with&lt;
    std::iter_rvalue_reference_t&lt;In&gt;&amp;&amp;, const std::iter_value_t&lt;In&gt;&amp;&gt;;
</code></pre>

<p>Interestingly, there’s no requirement that all these <code>common_reference</code>s must
be the same type. In fact, they are not even required to be used and hence
defined, but they must be declared. Eric shows one example where <code>common_reference</code>
might be useful: the parameter types of the <code>unique_copy()</code> comparator. <code>unique_copy()</code>
needs to copy a <code>value_type</code> and then call the comparator with this copy and the
result of <code>operator*()</code>, which is <code>iter_reference_t</code>. But the order of arguments
is not specified. If for whatever reason your comparator cannot have templated
parameters, you need to use <code>common_reference</code> for the parameter types:</p>

<pre><code class="language-cpp">auto unique_copy(Iterator first, Iterator last, auto d_first, auto comparator){
    // somewhere inside `unique_copy()`
    Iterator it = first;
    std::iter_value_t&lt;Iterator&gt; copy = *it;   // copy current element
    ++it;
    comparator(copy, *it);  // compare it to the next one, it can be one way
    comparator(*it, copy);  // or the other
}

// client's code
auto generic_comparator = [](auto&amp; lhs, auto&amp; rhs){};   // no problems

// but if you need specific types, use common_reference
template&lt;std::indirectly_readable T&gt;
using iter_common_reference_t = std::common_reference_t&lt;
    std::iter_reference_t&lt;T&gt;, std::iter_value_t&lt;T&gt;&amp;&gt;;

auto non_generic_comparator = [](
    iter_common_reference_t&lt;Iterator&gt; lhs, iter_common_reference_t&lt;Iterator&gt; rhs){};
</code></pre>

<p>Note that since C++20, <code>iter_reference_t</code> is not required to be a true reference
for any kind of iterator which effectively allows random-access proxy iterators.</p>
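
<p>As a small illustration, the hypothetical <code>SquareIter</code> below (the name and the
whole type are made up for this example) returns a plain <code>int</code> from <code>operator*()</code>
and still models <code>std::forward_iterator</code>. Under the pre-C++20 requirements such an
iterator could only be an input iterator, because forward iterators had to return a
true reference:</p>

<pre><code class="language-cpp">#include &lt;cstddef&gt;
#include &lt;iterator&gt;

// hypothetical iterator over 0, 1, 4, 9, ... which computes each element on dereference
struct SquareIter
{
    using value_type = int;
    using difference_type = std::ptrdiff_t;

    int i = 0;

    constexpr int operator*() const { return i * i; }   // prvalue, not a reference
    constexpr SquareIter&amp; operator++() { ++i; return *this; }
    constexpr SquareIter operator++(int) { auto old = *this; ++i; return old; }
    constexpr bool operator==(const SquareIter&amp;) const = default;
};

// multi-pass, equality comparable, readable: enough for the C++20 concept
static_assert(std::forward_iterator&lt;SquareIter&gt;);
</code></pre>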

<p>Range-based versions of existing algorithms must be changed like this (current
<code>libstdc++</code> still doesn’t use <code>iter_move()</code>/<code>iter_swap()</code> in its range algorithms):</p>

<pre><code class="language-cpp">Iterator it1, it2;

// pre C++20 algorithms:
using value_type = typename std::iterator_traits&lt;Iterator&gt;::value_type;
value_type copied = *it1;             // copy
value_type moved = std::move(*it1);   // move
std::iter_swap(it1, it2);             // swap

// C++20 algorithms:
using value_type = std::iter_value_t&lt;Iterator&gt;;
value_type copied = *it1;                       // copy
value_type moved = std::ranges::iter_move(it1); // move
std::ranges::iter_swap(it1, it2);               // swap
</code></pre>

<p>The main purpose of this design (as I understand it) is to allow proxy iterators
of any kind, which, in theory, allows more “indirect” iterators and their usage
with standard algorithms. An imaginary proxy iterator must implement:</p>
<ul>
  <li>the functions required by its category (<code>operator++()</code>, <code>operator[]</code>, etc.)</li>
  <li>a custom proxy-reference type which supports reading, writing and conversions to/from
<code>value_type</code>, itself and <code>iter_rvalue_reference_t</code></li>
  <li>custom <code>iter_move()</code> and <code>iter_swap()</code></li>
  <li>the necessary <code>basic_common_reference</code> specializations (a helper for <code>common_reference</code>
described above) between its <code>value_type</code>,
<code>iter_reference_t</code> and <code>iter_rvalue_reference_t</code>, each yielding a type which is at least
declared</li>
</ul>

<hr />

<h3 id="need-for-better-design">Need for a better design</h3>

<p>Now that you have a basic idea of what iterators can do in C++20, we can start to
think about how <code>views::projection</code> should work. Recall the usage example:</p>

<pre><code class="language-cpp">sort(v | projection(&amp;X::x));    // sorts `v` by `X::x` member
</code></pre>

<p>For this to work, <code>operator*()</code> must return a reference-like thing which points to
the projected value (<code>X::x</code> member in this
case) so that the comparator will use it instead of the whole object. On the other
hand, copy/move/swap/assign operations must operate on the non-projected object (<code>X</code>).
In other words, we have two distinct types:</p>
<ul>
  <li><code>value_type</code> - the type exposed through <code>operator*()</code>, projected type</li>
  <li><code>iter_root_t</code> - the type of underlying object, root type</li>
</ul>

<p>and there’s no logical relationship between them, i.e., there’s no connection
between the <code>int x;</code> and <code>struct X;</code> types. One can argue that in fact we have <code>X</code>
and <code>&amp;X::x</code> types and there is a <code>member-of</code> relation, but in reality a projection
can also be a pure transformation, e.g.,
from <code>std::string</code> to <code>int</code>, so no particular relationship can be assumed here.<br />
In contrast, the existing design doesn’t leave space for a second type (<code>iter_root_t</code>).
It allows a proxy-reference as <code>iter_reference_t</code> but it enforces strict
relationships between it and <code>value_type</code> in terms of <code>common_reference</code>
requirements. At most, it allows representing a logically single <code>value_type</code>
with two different types, like an advanced form of pointer. That’s why the related
concepts have names like <code>indirectly_readable/writable/etc</code>: it’s all about
indirection mechanics, not true abstraction from one type to another.<br />
And even this indirection mechanism is over-complicated, I’d say it’s an
expert- <del>friendly</del> only utility. I mean, when Eric
Niebler <a href="https://twitter.com/ericniebler/status/1363892564831711235">says it’s hard</a>
(you can check his implementation <a href="https://github.com/ericniebler/range-v3/blob/master/include/range/v3/iterator/basic_iterator.hpp">here</a>), how can you expect people
to write their own iterators using it? It’s hard because if you need to use a
proxy-reference, you first need to check and understand which operations and conversions
the algorithms require from the proxy-reference (<code>iter_reference_t</code>), <code>value_type</code> and
<code>iter_rvalue_reference_t</code>, and only then <em>try</em> to implement them.</p>

<p>To summarize, there are two main problems: over-complicated design and its
inability to support true abstraction between two unrelated types. Now, let’s
fix it.</p>

<hr />

<h3 id="next-iter-iter">The next iteration of iterators</h3>

<p>In the <a href="#projected-ranges">Projected ranges to the rescue</a> section I said that
algorithms must operate on <code>iter_root_t</code> values only; <code>value_type</code> should be
used only when it’s passed to customizable logic like comparators. Thus, we need
to separate the <code>value_type</code> API from the <code>iter_root_t</code> API. Let’s summarize what we
have so far:</p>
<ul>
  <li><code>operator*()</code> to get an lvalue reference or copy of <code>value_type</code></li>
  <li><code>operator*()</code> to write <code>value_type</code></li>
  <li><code>iter_move(it)</code> to get rvalue reference to <code>value_type</code></li>
  <li><code>iter_swap(it1, it2)</code> to swap whatever we want, in our case it’s <code>iter_root_t</code></li>
</ul>

<p>Now we need similar functions for <code>iter_root_t</code>:</p>
<ul>
  <li><code>iter_copy_root(it)</code> to get an lvalue reference to <code>iter_root_t</code>, 
<code>iter_root_reference_t</code></li>
  <li><code>iter_move_root(it)</code> to get an rvalue reference to <code>iter_root_t</code>, 
<code>iter_root_rvalue_reference_t</code></li>
</ul>

<p>And to simplify assignment:</p>
<ul>
  <li><code>iter_assign_from(it, value)</code> to assign whatever is needed</li>
</ul>

<p>All these new functions are customization point objects (CPOs), which means
an iterator does not have to implement them if it is happy with the default
behavior. One of my goals was to preserve backward compatibility with all
existing iterators, so the default implementations mostly forward to the old
API. If you are not familiar with a typical CPO implementation, the idea is quite
simple: a CPO calls the function customized for a specific type if there is one, and the default
implementation otherwise. The
presence of a customized function is detected via an ADL check (<code>has_adl_[cpo_name]</code>
below). The implementation lives inside a <code>struct</code>, which is why the code below uses
<code>operator()(...)</code> instead of a plain function. <code>stdf</code> is the namespace
where I put all the new stuff, not a typo.</p>
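
<p>For readers who have not written a CPO before, here is a minimal, self-contained
sketch of that dispatch pattern. The <code>describe</code> customization point, the
<code>demo</code> and <code>user</code> namespaces and both test types are made up for illustration;
it only shows the general shape and is not taken from the demo:</p>

<pre><code class="language-cpp">namespace demo
{
namespace describe_cpo
{
    // "poison pill": hides demo::describe so the call below relies on ADL only
    void describe() = delete;

    template&lt;typename T&gt;
    concept has_adl_describe = requires(T&amp;&amp; t) {
        describe(static_cast&lt;T&amp;&amp;&gt;(t));
    };

    struct fn
    {
        template&lt;typename T&gt;
        constexpr int operator()(T&amp;&amp; t) const
        {
            if constexpr(has_adl_describe&lt;T&gt;)
            {
                // the type provides its own version, found via ADL
                return describe(static_cast&lt;T&amp;&amp;&gt;(t));
            }
            else
            {
                return 0;   // default behavior
            }
        }
    };
}

// the customization point object itself
inline constexpr describe_cpo::fn describe{};
}

namespace user
{
struct Plain {};    // happy with the default

struct Custom
{
    // hidden friend, picked up by the ADL check above
    friend constexpr int describe(const Custom&amp;) { return 42; }
};
}

static_assert(demo::describe(user::Plain{}) == 0);
static_assert(demo::describe(user::Custom{}) == 42);
</code></pre>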

<hr />

<h4 id="iter_copy_root">iter_copy_root()</h4>

<p>Returns an lvalue reference to <code>iter_root_t</code>. The default behavior is to return
the result of <code>operator*()</code>. I deliberately omit return type, <code>noexcept</code>-ness
and constraint specifications since they are trivial; interested readers can
find them in the demo implementation.</p>

<pre><code class="language-cpp">template&lt;typename From&gt;
constexpr decltype(auto) operator()(From&amp;&amp; from) const
{
    if constexpr(has_adl_iter_copy_root&lt;From&gt;)
    {
        return iter_copy_root(static_cast&lt;From&amp;&amp;&gt;(from));
    }
    else
    {
        return *from;
    }
}

// helper aliases
template&lt;typename T&gt;
using iter_root_t =
    std::remove_cvref_t&lt;decltype(stdf::iter_copy_root(std::declval&lt;T&gt;()))&gt;;

template&lt;typename T&gt;
using iter_root_reference_t = decltype(stdf::iter_copy_root(std::declval&lt;T&gt;()));

// usage example:
auto&amp; ref = stdf::iter_copy_root(it);
auto copy = stdf::iter_copy_root(it);
</code></pre>

<hr />

<h4 id="iter_move_root">iter_move_root()</h4>

<p>Returns an rvalue reference to the underlying object. When not customized, it forwards
to <code>iter_move()</code> or to <code>iter_copy_root()</code>. The reason is simple:
<code>iter_move_root()</code> is supposed to return an rvalue reference to the root type, and if
<code>iter_copy_root()</code> is not customized, it operates in terms of the value type and
<code>iter_move()</code> is responsible for moving it. This also preserves backward
compatibility: for existing iterators <code>iter_copy_root()</code> forwards
to <code>operator*()</code> and <code>iter_move_root()</code> to <code>iter_move()</code>.</p>

<pre><code class="language-cpp">template&lt;typename From&gt;
constexpr decltype(auto) operator()(From&amp;&amp; from) const
{
    if constexpr(has_adl_iter_move_root&lt;From&gt;)
    {
        return iter_move_root(static_cast&lt;From&amp;&amp;&gt;(from));
    }
    else if constexpr(
        iter_move_cpo::has_adl_iter_move&lt;From&gt; &amp;&amp;
        !iter_copy_root_cpo::has_adl_iter_copy_root&lt;From&gt;)
    {
        return stdf::iter_move(static_cast&lt;From&amp;&amp;&gt;(from));
    }
    else if constexpr(std::is_lvalue_reference_v&lt;
                            iter_root_reference_t&lt;From&gt;&gt;)
    {
        return std::move(
            stdf::iter_copy_root(static_cast&lt;From&amp;&amp;&gt;(from)));
    }
    else
    {
        return stdf::iter_copy_root(static_cast&lt;From&amp;&amp;&gt;(from));
    }
}

template&lt;typename T&gt;
using iter_root_rvalue_reference_t =
    decltype(stdf::iter_move_root(std::declval&lt;T&gt;()));

// usage example:
auto moved = stdf::iter_move_root(it);
</code></pre>

<hr />

<h4 id="iter_assign_from">iter_assign_from()</h4>

<p>It is responsible for assignment. The developer has full control over the supported
types. It’s possible to introduce separate <code>iter_assign_value</code> and <code>iter_assign_root</code> functions but
I don’t know of any use case where that would be useful. The default behavior assigns to
the root:</p>

<pre><code class="language-cpp">template&lt;typename To, typename From&gt;
constexpr void operator()(To&amp;&amp; to, From&amp;&amp; from) const
{
    if constexpr(has_adl_iter_assign_from&lt;To, From&gt;)
    {
        iter_assign_from(static_cast&lt;To&amp;&amp;&gt;(to), static_cast&lt;From&amp;&amp;&gt;(from));
    }
    else
    {
        stdf::iter_copy_root(static_cast&lt;To&amp;&amp;&gt;(to)) = static_cast&lt;From&amp;&amp;&gt;(from);
    }
}

// helper concept
template&lt;typename To, typename From&gt;
concept iter_assignable_from = requires(To&amp;&amp; to, From&amp;&amp; from)
{
    stdf::iter_assign_from(static_cast&lt;To&amp;&amp;&gt;(to), static_cast&lt;From&amp;&amp;&gt;(from));
};

// usage example:
stdf::iter_assign_from(it, T{});
</code></pre>

<hr />

<h4 id="iter_swap">iter_swap()</h4>

<p><code>iter_swap()</code> behaves almost like <code>std::ranges::iter_swap()</code> but it operates on
root values, i.e., it uses <code>iter_copy_root()</code> instead of <code>operator*()</code> and
<code>iter_move_root()</code>/<code>iter_assign_from()</code> instead of <code>iter_move()</code>/<code>operator=()</code>. I
don’t show implementation here because it’s not so short, you can find it in the
demo.</p>
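
<p>The “by-hand” fallback of such a swap can be pictured roughly as follows. This is a
simplified sketch and not the demo code: <code>move_root</code>, <code>assign_from</code> and
<code>swap_roots</code> are hypothetical stand-ins which only model the non-customized defaults
(root operations forwarding to <code>operator*()</code>), and the ADL dispatch is left out:</p>

<pre><code class="language-cpp">#include &lt;array&gt;
#include &lt;utility&gt;

namespace sketch
{
// stand-in for the default of iter_move_root(): move out of `*it`
template&lt;typename It&gt;
constexpr decltype(auto) move_root(It&amp;&amp; it)
{
    return std::move(*it);
}

// stand-in for the default of iter_assign_from(): assign to `*it`
template&lt;typename It, typename V&gt;
constexpr void assign_from(It&amp;&amp; it, V&amp;&amp; v)
{
    *it = std::forward&lt;V&gt;(v);
}

// swap expressed only through the two root operations above
template&lt;typename It1, typename It2&gt;
constexpr void swap_roots(It1&amp;&amp; a, It2&amp;&amp; b)
{
    auto tmp = move_root(a);        // move the first root into a temporary
    assign_from(a, move_root(b));   // move the second root into the first slot
    assign_from(b, std::move(tmp)); // move the temporary into the second slot
}
}

constexpr bool swap_roots_works()
{
    std::array&lt;int, 2&gt; a{1, 2};
    sketch::swap_roots(a.begin(), a.begin() + 1);
    return a[0] == 2 &amp;&amp; a[1] == 1;
}

static_assert(swap_roots_works());
</code></pre>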

<hr />

<h3 id="views-projection">views::projection</h3>

<p>Now that we have full control over the iterator’s behavior, we can finally
implement <code>views::projection</code> and <code>views::narrow_projection</code> and see how the new
API simplifies custom iterator implementation.
I will show only the core parts of the iterator. We need to store the current underlying
iterator and a pointer to the parent view where the projection function is stored:</p>

<pre><code class="language-cpp">class Iterator
{
private:
    BaseIter current{};
    ParentView* parent{};
};
</code></pre>

<p>Core parts:</p>

<pre><code class="language-cpp">constexpr decltype(auto) operator*() const
{
    return std::invoke(parent-&gt;fun, *current);
}

friend constexpr decltype(auto) iter_copy_root(const Iterator&amp; it)
{
    return stdf::iter_copy_root(it.current);
}

// enabled only if `BaseIter` has custom `iter_move_root`
friend constexpr decltype(auto) iter_move_root(const Iterator&amp; it)
{
    return stdf::iter_move_root(it.current);
}

// enabled only if `BaseIter` has custom `iter_swap`
friend constexpr void iter_swap(const Iterator&amp; x, const Iterator&amp; y)
{
    return stdf::iter_swap(x.current, y.current);
}

// enabled only if `BaseIter` has custom `iter_assign_from`
template&lt;typename T&gt;
friend constexpr void iter_assign_from(const Iterator&amp; it, T&amp;&amp; val)
{
    stdf::iter_assign_from(it.current, std::forward&lt;T&gt;(val));
}
</code></pre>

<p>As you can see, it’s trivial: all of them are one-liners.
<code>operator*()</code> returns the result of applying the projection to the iterator’s value. We
don’t need a custom <code>iter_move()</code> because the default implementation operates on the
result of <code>operator*()</code>. Since we want copy/move/assign/swap operations to
operate on the root value, we simply forward these calls to the underlying iterator. Note that the last three
functions are enabled (using a <code>requires</code>-clause) only when the underlying
iterator customizes them. Otherwise, their default versions operate on
the basis of <code>iter_copy_root()</code>, which is exactly what’s needed. There’s another
reason why it’s better to avoid customized versions of CPO-s when possible;
it’s described later in the <a href="#reducing-derefs">Reducing number of dereferences</a> section.</p>

<hr />

<h3 id="views-narrow-projection">views::narrow_projection</h3>

<p>It’s even simpler, all we need is:</p>

<pre><code class="language-cpp">constexpr decltype(auto) operator*() const
{
    return std::invoke(parent-&gt;fun, *current);
}
</code></pre>

<p>Because we don’t want to expose the underlying root value to copy/move/swap/assign 
operations, everything else works by default.</p>

<hr />

<h3 id="impact-on-algos">Impact on algorithms</h3>

<p>Just like with the C++20 iterator API, this one also requires algorithm authors
to update their implementations. The algorithms’ requirements have to be updated to
reflect usage of the new API. Changes to algorithm code are trivial: <code>operator*()</code>
is still used for customizable logic like comparators, but copy/move/swap/assign
must be replaced with the new functions. In the demo I implemented a couple of simple
algorithms to test how the new design fits in and found no major problems.</p>

<pre><code class="language-cpp">// read projected value
auto v1 = std::invoke(proj, *it);   // before
auto v2 = *it;                      // after

// copy underlying value, now copies `iter_root_t`
iter_value_t&lt;It&gt; copy1 = *it;               // before
iter_root_t&lt;It&gt; copy2 = iter_copy_root(it); // after

// move underlying value, now moves `iter_root_t`
iter_value_t&lt;It&gt; moved1 = iter_move(it);    // before
iter_root_t&lt;It&gt; moved2 = iter_move_root(it); // after

// assign to iterator
*it = val;                  // before
iter_assign_from(it, val);  // after
</code></pre>

<h4 id="iterator-based-versions-of-algorithms">Iterator-based versions of algorithms</h4>

<p>For some reason, all range-based algorithms also have iterator-based
counterparts, e.g., <code>copy_if(Range r, Out o, Pred pred, Proj proj)</code> and <code>copy_if(I begin, S end, Out o, Pred pred, Proj proj)</code>.
I don’t know why they are needed at all when
a pair of iterators can be converted into a range using <code>std::ranges::subrange</code>,
but they are here.
The <code>projection</code>/<code>narrow_projection</code> views described above combine a projection and a <em>range</em>. To
remove projections from iterator-based signatures we need something like a
<code>projection_iterator</code>. It should work just like <code>projection_view::iterator</code>
with the addition of comparison functions with its root iterator to support cases
like <code>std::ranges::sort(make_projection_iterator(std::begin(r), some_projection), std::end(r))</code>. Alternatively, this issue can be ignored altogether:
projections can be removed without introducing <code>projection_iterator</code>, which
forces callers to use <code>std::ranges::subrange</code> and apply a projection to the resulting
range.</p>
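
<p>To illustrate how a pair of iterators becomes a range, the sketch below wraps two
iterators into <code>std::ranges::subrange</code> and hands the result to a range-based
algorithm. The function name and the data are arbitrary:</p>

<pre><code class="language-cpp">#include &lt;algorithm&gt;
#include &lt;array&gt;
#include &lt;ranges&gt;

constexpr int count_even_in_middle()
{
    std::array&lt;int, 6&gt; a{1, 2, 3, 4, 5, 6};

    // [begin + 1, end - 1) as a range: 2, 3, 4, 5
    auto middle = std::ranges::subrange(a.begin() + 1, a.end() - 1);

    // any range-based algorithm (or range adaptor) accepts it
    return static_cast&lt;int&gt;(
        std::ranges::count_if(middle, [](int v) { return v % 2 == 0; }));
}

static_assert(count_even_in_middle() == 2);
</code></pre>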

<hr />

<h3 id="reducing-derefs">Reducing number of dereferences</h3>

<p>While implementing algorithms, I found one interesting issue.
Consider <code>copy_if()</code> algorithm. Here’s <code>libstdc++</code> implementation:</p>

<pre><code class="language-cpp">void copy_if(auto first, auto last, auto result, auto pred, auto proj)
{
    for (; first != last; ++first)
    {
        if (std::invoke(pred, std::invoke(proj, *first)))   // #1
        {
            *result = *first;   // #2
            ++result;
        }
    }
}
</code></pre>

<p>The subtle issue here is that <code>first</code> is dereferenced twice per iteration: first
to call the predicate, then to copy its value to the output <code>result</code> iterator. As I mentioned
before, <code>libstdc++</code> still uses old implementations for range-based
algorithms and this version is probably OK for old-school iterators. But in the
<code>ranges</code> world <code>operator*()</code> might do non-trivial things. For example, the input might
be a range which uses <code>views::transform</code> with an <code>int -&gt; string</code> transformation.
<code>range-v3</code> handles it better: whenever
possible it stores and reuses the result of the dereference:</p>

<pre><code class="language-cpp">void copy_if(auto first, auto last, auto result, auto pred, auto proj)
{
    for (; first != last; ++first)
    {
        auto&amp;&amp; x = *first;     // dereference is done only once now
        if (std::invoke(pred, std::invoke(proj, x)))
        {
            *result = (decltype(x) &amp;&amp;)x;    // analog of std::forward&lt;...&gt;(x)
            ++result;
        }
    }
}
</code></pre>
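
<p>The cost of a repeated dereference is easy to observe with <code>std::views::transform</code>.
In the small sketch below (the function name and the counter are only for
illustration) the transformation runs once per <code>operator*()</code> call, so dereferencing the
same iterator twice invokes it twice:</p>

<pre><code class="language-cpp">#include &lt;array&gt;
#include &lt;ranges&gt;

constexpr int calls_for_two_derefs()
{
    int calls = 0;
    std::array&lt;int, 1&gt; a{7};

    // the lambda counts how many times the transformation is invoked
    auto view = a | std::views::transform([&amp;calls](int v) { ++calls; return v * 2; });

    auto it = view.begin();
    int first = *it;    // invokes the lambda
    int second = *it;   // invokes it again, nothing is cached

    return first == second ? calls : -1;
}

static_assert(calls_for_two_derefs() == 2);
</code></pre>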

<p>With that in mind, I wrote the initial implementation using the new API:</p>

<pre><code class="language-cpp">constexpr void copy_if(auto&amp;&amp; in, auto out, auto pred)
{
    auto first = std::ranges::begin(in);
    auto last = std::ranges::end(in);
    for(; first != last; ++first)
    {
        if(std::invoke(pred, *first))
        {
            iter_assign_from(out, iter_copy_root(first));
            ++out;
        }
    }
}
</code></pre>

<p>Explicit dereference is done only once here but recall that when
<code>iter_copy_root()</code> is not customized by client, it falls back to <code>operator*()</code> so
the above code transforms to:</p>

<pre><code class="language-cpp">if(std::invoke(pred, *first))       // first dereference
{
    iter_assign_from(out, *first);  // second dereference
    ++out;
}
</code></pre>

<p>Taking into account that <code>operator*()</code> might now contain a projection, I want to
avoid calling it whenever possible. Also, standard algorithms guarantee a
specific number of projection calls, and any approach which cannot fulfill
that guarantee would be useless.
The fixed version would be:</p>

<pre><code class="language-cpp">constexpr void copy_if(auto&amp;&amp; in, auto out, auto pred)
{
    auto first = std::ranges::begin(in);
    auto last = std::ranges::end(in);
    for(; first != last; ++first)
    {
        auto&amp;&amp; x = *first;
        if(std::invoke(pred, x))
        {
            if constexpr(has_adl_iter_copy_root&lt;decltype(first)&gt;)
            {
                // call customization point
                iter_assign_from(out, iter_copy_root(first));
            }
            else
            {
                // reuse `x`
                iter_assign_from(out, std::forward&lt;decltype(x)&gt;(x));
            }
            ++out;
        }
    }
}
</code></pre>

<p>Now when there’s no customized <code>iter_copy_root()</code>, the dereferenced value can
safely be reused.
Obviously, having such an <code>if</code> statement in all algorithms for each call of
<code>iter_copy_root</code>, <code>iter_move_root</code> and <code>iter_assign_from</code> would be too verbose.
To
simplify it, I added a second version for each CPO with additional <code>dereferenced</code>
parameter at the end. Now <code>copy_if()</code> is shorter and dereferences only once:</p>

<pre><code class="language-cpp">// second version of iter_copy_root()
template&lt;typename From&gt;
constexpr decltype(auto) operator()(
    From&amp;&amp; from, std::iter_reference_t&lt;From&gt;&amp; dereferenced) const
{
    if constexpr(has_adl_iter_copy_root&lt;From&gt;)
    {
        return iter_copy_root(static_cast&lt;From&amp;&amp;&gt;(from));
    }
    else if constexpr(std::is_lvalue_reference_v&lt;std::iter_reference_t&lt;From&gt;&gt;)
    {
        return dereferenced;
    }
    else
    {
        return std::move(dereferenced);
    }
}

constexpr void copy_if(auto&amp;&amp; in, auto out, auto pred)
{
    auto first = std::ranges::begin(in);
    auto last = std::ranges::end(in);
    for(; first != last; ++first)
    {
        auto&amp;&amp; x = *first;
        if(std::invoke(pred, x))
        {
            stdf::iter_assign_from(out, stdf::iter_copy_root(first, x));
            ++out;
        }
    }
}
</code></pre>

<p>As a side effect, it reduces the number of dereferences for all currently
existing iterators because they don’t customize the new CPOs. Usually, an optimizer
is able to eliminate them but I like that now it’s guaranteed by design with or
without optimizations. The same problem exists for <code>iter_move()</code> and
<code>iter_swap()</code> because when not customized, they dereference. In the end, I
added a second version for them too.
That’s why in <code>views::projection</code> it’s important to enable custom
<code>iter_move_root()</code>, <code>iter_swap()</code>, <code>iter_assign_from()</code> only if they are
customized by the underlying iterator. Customizing them unconditionally prevents
reuse of the dereferenced value.</p>

<hr />

<h3 id="root-method">root() method</h3>

<p>Sometimes we need to use the result of a generic algorithm with a member function
of a container. One such example is <code>remove_if()</code>. It
returns a range of removed elements which are then <code>erase()</code>d using a member
function. The signature in <code>std::vector</code> is <code>constexpr iterator erase(const_iterator first, const_iterator last);</code>.
The problem is that it takes <code>std::vector::const_iterator</code> and when we do:</p>

<pre><code class="language-cpp">auto pv = v | projection(&amp;X::x);
auto removed = stdf::remove_if(pv, less_than&lt;int{3}&gt;{});
</code></pre>

<p><code>removed</code> contains a range of <code>projection_view::iterator</code> so we need a way to get
the underlying iterator from it. It’s possible to provide an implicit conversion
for it but implicit conversions are always dangerous so for now I made it a
normal member function. While the existing <code>base()</code> method of view iterators returns 
the last wrapped iterator, the new <code>root()</code> method returns the very first iterator
in the projection chain. To make it generic, I added a <code>stdf::root(it)</code> free function
which falls back to <code>it.root()</code> or just returns <code>it</code>. Now we can do:</p>

<pre><code class="language-cpp">auto pv = v | projection(&amp;X::x);
auto removed = stdf::remove_if(pv, less_than&lt;int{3}&gt;{});
v.erase(stdf::root(removed.begin()), stdf::root(removed.end()));
</code></pre>
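
<p>The fallback logic of such a free function could look roughly like this. It is a
simplified sketch rather than the exact demo code: <code>sketch::root</code>, <code>WrappedIt</code>
and <code>sample_value</code> are hypothetical names, and the real function may differ in
signature and constraints:</p>

<pre><code class="language-cpp">namespace sketch
{
template&lt;typename It&gt;
constexpr auto root(const It&amp; it)
{
    if constexpr(requires { it.root(); })
    {
        return it.root();   // the iterator knows its root iterator
    }
    else
    {
        return it;          // plain iterator: it is its own root
    }
}
}

// hypothetical wrapper iterator which remembers the innermost iterator
struct WrappedIt
{
    const int* innermost;
    constexpr const int* root() const { return innermost; }
};

inline constexpr int sample_value = 5;
static_assert(sketch::root(&amp;sample_value) == &amp;sample_value);
static_assert(sketch::root(WrappedIt{&amp;sample_value}) == &amp;sample_value);
</code></pre>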

<hr />

<h3 id="the-flaw">Major flaw</h3>

<p>Unfortunately, after this article was published I received some feedback and
realized that this design cannot replace projections when the algorithm operates on an
<em>input range/iterator</em>. The value represented by an input iterator is valid only until
the iterator is incremented, and all copies of the iterator may be invalidated
afterward. This restriction allows only single-pass algorithms. Consider one of
the simplest algorithms, <code>max</code>:</p>

<pre><code class="language-cpp">template&lt;typename R, typename C = std::less&lt;&gt;, typename P = std::identity&gt;
auto max(R&amp;&amp; r, C pred = {}, P proj = {})
{
    auto first = std::ranges::begin(r);
    auto last = std::ranges::end(r);
    
    std::ranges::range_value_t&lt;R&gt; result = *first;
    while(++first != last)
    {
        auto&amp;&amp; tmp = *first;
        if(invoke(pred, invoke(proj, result), invoke(proj, tmp))){
            result = (decltype(tmp) &amp;&amp;)tmp;
        }
    }

    return result;
}
</code></pre>

<p>Here, we need to store a copy of the current max element in the <code>result</code>
variable. The projection <code>proj</code> is later applied to that copied element and
that’s the problem. In the proposed design, I assumed that the projected value is
always accessed through the iterator, not through the root value. Here’s the
implementation using the new design:</p>

<pre><code class="language-cpp">template&lt;typename Rng, typename Cmp = std::less&lt;&gt;&gt;
stdf::iter_root_t&lt;std::ranges::iterator_t&lt;Rng&gt;&gt; max(Rng&amp;&amp; rng, Cmp pred = {})
{
    auto first = std::ranges::begin(rng);
    auto last = std::ranges::end(rng);
    using iterator_t = std::ranges::iterator_t&lt;Rng&gt;;

    std::iter_value_t&lt;iterator_t&gt; maxValue = *first;
    stdf::iter_root_t&lt;iterator_t&gt; root = stdf::iter_copy_root(first);
    while(++first != last)
    {
        auto&amp;&amp; tmp = *first;
        if(std::invoke(pred, maxValue, tmp)){
            maxValue = (decltype(tmp) &amp;&amp;)tmp;
            root = stdf::iter_copy_root(first);
        }
    }
    return root;
}
</code></pre>

<p>As you can see, the only option is to copy both root and projected value which
is not acceptable from the performance point of view. This problem exists only
for input ranges and vanishes with forward ranges because for them it’s safe to
copy the iterator and call its <code>operator*()</code> to get the projected value.
However, there are still plenty of algorithms which require only <code>input_range</code>
so the proposed design cannot be used in its current form. Any potential
projection replacement should be able to retrieve projected value from the root
value, not from the iterator.</p>

<hr />

<h2 id="other-use-cases">Other use-cases</h2>

<p>The introduced design significantly simplifies the creation of non-trivial iterators.
Because each aspect is handled separately, there’s no need for tricky proxy
reference objects. <code>common_reference</code> requirements are still there, now for
both <code>value_type</code> and <code>iter_root_t</code>, but it’s almost impossible to break them so
clients shouldn’t care or even know about their existence. For example, here’s
how the infamous <code>std::vector&lt;bool&gt;::iterator</code> can be implemented:</p>

<pre><code class="language-cpp">class Iterator
{
public:
    bool operator*();   // no need for proxy reference type
    // swaps bits
    friend void iter_swap(const Iterator&amp; lhs, const Iterator&amp; rhs);
    // assigns bit from bool value
    friend void iter_assign_from(const Iterator&amp; lhs, bool val);
};
</code></pre>

<p>Because there’s no sense in true copy/move of a single bit, <code>iter_move()</code>,
<code>iter_copy_root()</code>, <code>iter_move_root()</code> work in terms of <code>bool</code> value returned by
<code>operator*()</code>. But <code>iter_swap()</code> needs to actually swap bit values and
<code>iter_assign_from()</code> should assign <code>bool</code> to a specific bit, thus, they are 
customized. It’s possible to achieve it with the existing design but it
requires a custom proxy reference type and <code>basic_common_reference</code>
specializations.</p>

<p>Another use-case might be various wrappers. Once I wanted to
write a wrapper for the <a href="https://rapidjson.org/">rapidjson</a> library. The main part
was a wrapper class around
<code>rapidjson::Value</code> and <code>rapidjson::Document::AllocatorType</code>
which provided a more convenient interface similar to
<a href="https://github.com/nlohmann/json">nlohmann/json</a>. Writing the wrapper itself
was easy but I failed at
the point when I needed to provide a random-access iterator which returns my
wrapper by value. In C++17 it was
impossible simply because <code>operator*()</code> returns a value instead of a
true reference, so such an iterator couldn’t be a random-access one. In C++20 it
should be possible but, again, it requires a good understanding of proxy references
and <code>common_reference</code> requirements. With the proposed
design it’s straightforward: the root type is <code>rapidjson::Value</code> and the value type is
the wrapper:</p>

<pre><code class="language-cpp">class MyWrapper{
public:
    // interface methods...
private:
    rapidjson::Value* value;
    rapidjson::Document::AllocatorType* allocator;  // required for write ops
};

class MyIterator{
public:
    auto operator*(){
        return MyWrapper{*origIterator, allocator};
    }

    decltype(auto) iter_copy_root(){
        return *origIterator;
    }

    // other iterator methods...

private:
    rapidjson::Value::ValueIterator origIterator;
    rapidjson::Document::AllocatorType* allocator;
};
</code></pre>

<hr />

<h2 id="transform">The role of std::views::transform</h2>

<p>Someone might think that <code>std::views::transform</code> can be used to combine
a projection and a range but currently it’s mostly useless for that purpose.
Its <code>iter_move()</code> operates on the transformed value while <code>iter_swap()</code> operates on
the underlying non-transformed value, so you can’t use it with any algorithm that
might use
them both (like <code>sort()</code>, see
<a href="https://cplusplus.github.io/LWG/issue3520">LWG issue 3520</a>). The
proposed fix is to remove the customized <code>iter_swap()</code> so that the default version
will operate on the transformed value. With that fix,
<code>views::transform</code> will become almost the same as
<code>narrow_projection</code> (the only difference is that, for an unknown reason,
<code>views::transform</code> has a <a href="https://eel.is/c++draft/range.transform.iterator">customized</a>
<code>iter_move()</code> which behaves exactly like <a href="https://eel.is/c++draft/iterator.cust.move#1.2">the
default one</a>). But should it
be used instead of <code>narrow_projection</code>? It’s mostly a question of naming. I
think that the name <code>transform</code> fits the case when transformation is the real
purpose of the code, just like the classic <code>std::transform()</code>. The name <code>projection</code>
better fits cases when you don’t want to transform a range but only change its
representation for an algorithm. Of course, such a thing can still be called a
<em>transformation</em>, so it’s hard to give a clear answer here.</p>

<hr />

<h2 id="demo">Demo</h2>

<p>You can find the implementation of the new CPOs, <code>projection</code>, <code>narrow_projection</code>
and a few test algorithms <a href="https://github.com/OleksandrKvl/projected_ranges/blob/master/src/main.cpp">here</a>.
It’s just a single file which you can <a href="https://godbolt.org/z/685Pcavb3">copy-paste to godbolt</a>.
At the time of writing it works only with GCC 11 because Clang hasn’t implemented ranges yet.</p>

<hr />

<h2 id="wrap-up">Wrap-up</h2>

<p>The main benefit of the introduced design is the support of true abstraction behind
the iterator. Projection is only one of its use cases and
I believe that it can fully replace and enhance existing projections and, at the same time, simplify
the creation of new iterators. An important point
is that it’s backward compatible: there is no need to change existing iterators, only
the algorithms.
Let me know what you think about it. Do you like this design? Would you like to
see it in the standard? Can it solve some of your problems or will it create new
ones instead? Have I missed something else? Any meaningful feedback is welcome.</p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[Table of contents]]></summary></entry><entry><title type="html">All C++20 core language features with examples</title><link href="/2021/04/02/cpp-20-overview.html" rel="alternate" type="text/html" title="All C++20 core language features with examples" /><published>2021-04-02T11:12:00+00:00</published><updated>2021-04-02T11:12:00+00:00</updated><id>/2021/04/02/cpp-20-overview</id><content type="html" xml:base="/2021/04/02/cpp-20-overview.html"><![CDATA[<h2 id="introduction">Introduction</h2>

<p>The story behind this article is very simple: I wanted to learn about the new C++20
language features and to have a brief summary of all of them on a single page.
So, I decided to read all the proposals and create this “cheat sheet” that
explains and demonstrates each feature.
This is not a “best
practices” kind of article, it serves only a demonstrational purpose.
Most examples were inspired by or directly taken from the corresponding proposals,
all credit goes to their authors and to the members of the ISO C++ committee for
their work. Enjoy!</p>

<h2 id="table-of-contents">Table of contents</h2>

<ul>
  <li><a href="#concepts">Concepts</a></li>
  <li><a href="#modules">Modules</a></li>
  <li><a href="#coroutines">Coroutines</a></li>
  <li><a href="#three-way-comparison">Three-way comparison</a></li>
  <li><a href="#lambda-features">Lambda expressions</a>
    <ul>
      <li><a href="#lambda-this">Allow lambda-capture <code>[=, this]</code></a></li>
      <li><a href="#lambda-templ-params">Template parameter list for generic lambdas</a></li>
      <li><a href="#lambda-uneval-ctx">Lambdas in unevaluated contexts</a></li>
      <li><a href="#lambda-def-ctor">Default constructible and assignable stateless lambdas</a></li>
      <li><a href="#lambda-pack-exp">Pack expansion in lambda init-capture</a></li>
    </ul>
  </li>
  <li><a href="#constexpr-features">Constant expressions</a>
    <ul>
      <li><a href="#consteval">Immediate functions(<code>consteval</code>)</a></li>
      <li><a href="#constexpr-virtual"><code>constexpr</code> virtual function</a></li>
      <li><a href="#constexpr-try-catch"><code>constexpr</code> try-catch blocks</a></li>
      <li><a href="#constexpr-dyn-cast"><code>constexpr</code> <code>dynamic_cast</code> and polymorphic <code>typeid</code></a></li>
      <li><a href="#constexpr-union">Changing the active member of a <code>union</code> inside <code>constexpr</code></a></li>
      <li><a href="#constexpr-alloc"><code>constexpr</code> allocations</a></li>
      <li><a href="#constexpr-trivial-def-init">Trivial default initialization in <code>constexpr</code> functions</a></li>
      <li><a href="#constexpr-asm">Unevaluated <code>asm</code>-declaration in <code>constexpr</code> functions</a></li>
      <li><a href="#is-const-eval"><code>std::is_constant_evaluated()</code></a></li>
    </ul>
  </li>
  <li><a href="#aggregates">Aggregates</a>
    <ul>
      <li><a href="#aggr-no-ctor">Prohibit aggregates with user-declared constructors</a></li>
      <li><a href="#ctad-aggr">Class template argument deduction for aggregates</a></li>
      <li><a href="#aggr-paren-init">Parenthesized initialization of aggregates</a></li>
    </ul>
  </li>
  <li><a href="#nttp">Non-type template parameters</a>
    <ul>
      <li><a href="#class-types-nttp">Class types in non-type template parameters</a></li>
      <li><a href="#nttp-gen">Generalized non-type template parameters</a></li>
    </ul>
  </li>
  <li><a href="#struct-bindings">Structured bindings</a>
    <ul>
      <li><a href="#structbind-specs">Lambda capture and storage class specifiers for structured bindings</a></li>
      <li><a href="#fix-structbind-cp">Relaxing the structured bindings customization point finding rules</a></li>
      <li><a href="#fix-structbind-access">Allow structured bindings to accessible members</a></li>
    </ul>
  </li>
  <li><a href="#range-based-for">Range-based <code>for</code> loop</a>
    <ul>
      <li><a href="#init-range-for">init-statements for range-based <code>for</code> loop</a></li>
      <li><a href="#fix-range-for-cp">Relaxing the range-based <code>for</code> loop customization point finding rules</a></li>
    </ul>
  </li>
  <li><a href="#attributes">Attributes</a>
    <ul>
      <li><a href="#attr-likely"><code>[[likely]]</code> and <code>[[unlikely]]</code></a></li>
      <li><a href="#attr-no-uniq-addr"><code>[[no_unique_address]]</code></a></li>
      <li><a href="#discard-msg"><code>[[nodiscard]]</code> with message</a></li>
      <li><a href="#fix-nodiscard-ctor"><code>[[nodiscard]]</code> for constructors</a></li>
    </ul>
  </li>
  <li><a href="#encoding">Character encoding</a>
    <ul>
      <li><a href="#char8t"><code>char8_t</code></a></li>
      <li><a href="#stronger-unicode">Stronger Unicode requirements</a></li>
    </ul>
  </li>
  <li><a href="#sugar">Sugar</a>
    <ul>
      <li><a href="#designated-init">Designated initializers</a></li>
      <li><a href="#bitfield-def-init">Default member initializers for bit-fields</a></li>
      <li><a href="#less-typename">More optional <code>typename</code></a></li>
      <li><a href="#nested-inline-ns">Nested <code>inline</code> namespaces</a></li>
      <li><a href="#using-enum"><code>using enum</code></a></li>
      <li><a href="#fix-arr-size">Array size deduction in new-expressions</a></li>
      <li><a href="#ctad-alias">Class template argument deduction for alias templates</a></li>
    </ul>
  </li>
  <li><a href="#constinit"><code>constinit</code></a></li>
  <li><a href="#int-twos-compl">Signed integers are two’s complement</a></li>
  <li><a href="#va-opt"><code>__VA_OPT__</code> for variadic macros</a></li>
  <li><a href="#diff-except-spec">Explicitly defaulted functions with different exception specifications</a></li>
  <li><a href="#destr-delete">Destroying <code>operator delete</code></a></li>
  <li><a href="#explicit-conditional">Conditionally <code>explicit</code> constructors</a></li>
  <li><a href="#feature-test-macros">Feature-test macros</a></li>
  <li><a href="#array-conv">Known-to-unknown bound array conversions</a></li>
  <li><a href="#more-impl-moves">Implicit move for more local objects and rvalue references</a></li>
  <li><a href="#narrowing-ptr-bool-conv">Conversion from <code>T*</code> to <code>bool</code> is narrowing</a></li>
  <li><a href="#depr-volatile">Deprecate some uses of <code>volatile</code></a></li>
  <li><a href="#depr-comma-subs">Deprecate comma operator in subscripts</a></li>
  <li><a href="#fixes">Fixes</a>
    <ul>
      <li><a href="#fix-init-list-ctad">Initializer list constructors in class template argument deduction</a></li>
      <li><a href="#fix-const-qual"><code>const&amp;</code>-qualified pointers to members</a></li>
      <li><a href="#fix-impl-capture">Simplifying implicit lambda capture</a></li>
      <li><a href="#fix-const-mismatch"><code>const</code> mismatch with defaulted copy constructor</a></li>
      <li><a href="#fix-spec-access-check">Access checking on specializations</a></li>
      <li><a href="#fix-adl">ADL and function templates that are not visible</a></li>
      <li><a href="#fix-constexpr-inst">Specify when <code>constexpr</code> function definitions are needed for constant evaluation</a></li>
      <li><a href="#fix-impl-creation">Implicit creation of objects for low-level object manipulation</a></li>
    </ul>
  </li>
</ul>

<hr />

<h2 id="concepts">Concepts</h2>

<p>The basic idea behind concepts is to specify what’s needed from a template
argument so the compiler can check it before instantiation. As a result, the
error message, if any, is much cleaner, something like <code>constraint X was not satisfied</code>.
Before C++20 the options were tricky <code>enable_if</code> constructions or
a failure deep inside template instantiation with a cryptic error message. With concepts
the failure happens early, at the point of use.</p>
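
<p>As a rough illustration of the difference, here is a minimal sketch that constrains the
same function both ways. The names <code>twice_old()</code> and <code>twice()</code> are made up for this
example:</p>

<pre><code class="language-cpp">#include &lt;concepts&gt;
#include &lt;type_traits&gt;

// C++17 style: the requirement is hidden in an extra template parameter
template&lt;typename T, std::enable_if_t&lt;std::is_integral_v&lt;T&gt;, int&gt; = 0&gt;
constexpr T twice_old(T x){
    return x + x;
}

// C++20 style: the requirement is a part of the declaration
template&lt;std::integral T&gt;
constexpr T twice(T x){
    return x + x;
}

static_assert(twice_old(21) == 42);
static_assert(twice(21) == 42);
// twice(1.5);  // error, the std::integral constraint is not satisfied
</code></pre>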

<h3 id="requires-expression">Requires expression</h3>

<p>Let’s start with the <code>requires-expression</code>. It’s an expression that contains the
actual
requirements for template arguments. It evaluates to <code>true</code> if they are satisfied
and to <code>false</code> otherwise.</p>

<pre><code class="language-cpp">template&lt;typename T&gt; /*...*/
requires (T x) // optional set of fictional parameter(s)
{
    // simple requirement: expression must be valid
    x++;
    
    // type requirement: `typename T`, T must name a valid type
    typename T::value_type;
    typename S&lt;T&gt;;

    // compound requirement: {expression}[noexcept][-&gt; Concept];
    // {expression} -&gt; Concept&lt;A1, A2, ...&gt; is equivalent to
    // requires Concept&lt;decltype((expression)), A1, A2, ...&gt;
    {*x};  // dereference must be valid
    {*x} noexcept;  // dereference must be noexcept
    // dereference must be noexcept and must return T::value_type
    {*x} noexcept -&gt; std::same_as&lt;typename T::value_type&gt;;
    
    // nested requirement: requires ConceptName&lt;...&gt;;
    requires Addable&lt;T&gt;; // constraint Addable&lt;T&gt; must be satisfied
};
</code></pre>
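
<p>Since a requires-expression is just a compile-time <code>bool</code>, it can be checked
directly. A small sketch of that, where <code>has_value_type</code> and
<code>nothrow_dereferenceable</code> are names invented for this example:</p>

<pre><code class="language-cpp">#include &lt;vector&gt;

// type requirement: T::value_type must name a type
template&lt;typename T&gt;
constexpr bool has_value_type = requires { typename T::value_type; };

// compound requirement: *x must be valid and noexcept
template&lt;typename T&gt;
constexpr bool nothrow_dereferenceable = requires(T x) { {*x} noexcept; };

static_assert(has_value_type&lt;std::vector&lt;int&gt;&gt;);
static_assert(!has_value_type&lt;int&gt;);
static_assert(nothrow_dereferenceable&lt;int*&gt;);
static_assert(!nothrow_dereferenceable&lt;int&gt;);
</code></pre>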

<h3 id="concept">Concept</h3>

<p>A concept is simply a named set of such constraints or their logical combination.
Both a concept and a requires-expression evaluate to a compile-time <code>bool</code> value and
can be used as a normal value, for example in <code>if constexpr</code>.</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
concept Addable = requires(T a, T b)
{
    a + b;
};

template&lt;typename T&gt;
concept Dividable = requires(T a, T b)
{
    a/b;
};

template&lt;typename T&gt;
concept DivAddable = Addable&lt;T&gt; &amp;&amp; Dividable&lt;T&gt;;

template&lt;typename T&gt;
void f(T x)
{
    if constexpr(Addable&lt;T&gt;){ /*...*/ }
    else if constexpr(requires(T a, T b) { a + b; }){ /*...*/ }
}
</code></pre>

<h3 id="requires-clause">Requires clause</h3>

<p>To actually constrain something we need a <code>requires-clause</code>. It may appear right
after the <code>template&lt;&gt;</code> block, as the last element of a function declaration, or
even in both places at once, lambdas included:</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
requires Addable&lt;T&gt;
auto f1(T a, T b) requires Subtractable&lt;T&gt;; // Addable&lt;T&gt; &amp;&amp; Subtractable&lt;T&gt;

auto l = []&lt;typename T&gt; requires Addable&lt;T&gt;
    (T a, T b) requires Subtractable&lt;T&gt;{};

template&lt;typename T&gt;
requires Addable&lt;T&gt;
class C;

// infamous `requires requires`. First `requires` is requires-clause,
// second one is requires-expression. Useful if you don't want to introduce new
// concept.
template&lt;typename T&gt;
requires requires(T a, T b) {a + b;}
auto f4(T x);
</code></pre>

<p>A much cleaner way is to use a concept name instead of the <code>class/typename</code> keyword in
the template parameter list:</p>

<pre><code class="language-cpp">template&lt;Addable T&gt;
void f();
</code></pre>

<p>Template template parameters can also be constrained. In this case the argument must
be no more constrained than the parameter. Unconstrained template template
parameters can still accept constrained templates as arguments:</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
concept Integral = std::integral&lt;T&gt;;

template&lt;typename T&gt;
concept Integral4 = std::integral&lt;T&gt; &amp;&amp; sizeof(T) == 4;

// requires-clause also works here
template&lt;template&lt;typename T1&gt; requires Integral&lt;T1&gt; typename T&gt;
void f2(){}

// f() and f2() forms are equal
template&lt;template&lt;Integral T1&gt; typename T&gt;
void f(){
    f2&lt;T&gt;();
}

// unconstrained template template parameter can accept constrained arguments
template&lt;template&lt;typename T1&gt; typename T&gt;
void f3(){}

template&lt;typename T&gt;
struct S1{};

template&lt;Integral T&gt;
struct S2{};

template&lt;Integral4 T&gt;
struct S3{};

void test(){
    f&lt;S1&gt;();    // OK
    f&lt;S2&gt;();    // OK
    // error, S3 is constrained by Integral4 which is more constrained than
    // f()'s Integral
    f&lt;S3&gt;();

    // all are OK
    f3&lt;S1&gt;();
    f3&lt;S2&gt;();
    f3&lt;S3&gt;();
}
</code></pre>

<p>Functions with unsatisfied constraints become “invisible”:</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
struct X{
    void f() requires std::integral&lt;T&gt;
    {}
};

void f(){
    X&lt;double&gt; x;
    x.f();  // error
    auto pf = &amp;X&lt;double&gt;::f;    // error
}
</code></pre>
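
<p>“Invisible” here also means that the failure is detectable rather than a hard error.
A minimal sketch, assuming a helper concept <code>has_f</code> that is made up for this
example:</p>

<pre><code class="language-cpp">#include &lt;concepts&gt;

template&lt;typename T&gt;
struct X{
    void f() requires std::integral&lt;T&gt;
    {}
};

// hypothetical helper: is `t.f()` a valid call?
template&lt;typename T&gt;
concept has_f = requires(T t){ t.f(); };

static_assert(has_f&lt;X&lt;int&gt;&gt;);
// f() exists in X&lt;double&gt; but its constraint is not satisfied
static_assert(!has_f&lt;X&lt;double&gt;&gt;);
</code></pre>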

<h3 id="constrained-auto">Constrained <code>auto</code></h3>

<p><code>auto</code> parameters are now allowed for normal functions to make them generic just like
generic lambdas. Concepts can be used to constrain placeholder
types (<code>auto</code>/<code>decltype(auto)</code>) in various contexts.
For parameter packs, <code>MyConcept... Ts</code> requires
<code>MyConcept</code> to be true for each element of the pack, not for the whole pack at
once, i.e. <code>MyConcept&lt;T1&gt; &amp;&amp; MyConcept&lt;T2&gt; &amp;&amp; ... &amp;&amp; MyConcept&lt;TLast&gt;</code>.</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
concept is_sortable = true;

auto l = [](auto x){};
void f1(auto x){}               // unconstrained template
void f2(is_sortable auto x){}   // constrained template

template&lt;is_sortable auto NonTypeParameter, is_sortable TypeParameter&gt;
is_sortable auto f3(is_sortable auto x, auto y)
{
    // notice that nothing is allowed between constraint name and `auto`
    is_sortable auto z = 0;
    return 0;
}

template&lt;is_sortable auto... NonTypePack, is_sortable... TypePack&gt;
void f4(TypePack... args){}

int f();

// takes two parameters
template&lt;typename T1, typename T2&gt;
concept C = true;
// binds second parameter
C&lt;double&gt; auto v = f(); // means C&lt;int, double&gt;

struct X{
    operator is_sortable auto() {
        return 0;
    }
};

auto f5() -&gt; is_sortable decltype(auto){
    f4&lt;1,2,3&gt;(1,2,3);
    return new is_sortable auto(1);
}
</code></pre>
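
<p>The per-element rule for packs can be checked with a small sketch. The names
<code>sum()</code> and <code>summable</code> are assumptions made for this example:</p>

<pre><code class="language-cpp">#include &lt;concepts&gt;

// std::integral must hold for every element of the pack
template&lt;std::integral... Ts&gt;
constexpr auto sum(Ts... xs){
    return (xs + ... + 0);
}

// hypothetical helper: can sum() be called with these argument types?
template&lt;typename... Ts&gt;
concept summable = requires(Ts... xs){ sum(xs...); };

static_assert(sum(1, 2, 3) == 6);
static_assert(summable&lt;int, long, char&gt;);
// a single non-integral element is enough to reject the whole call
static_assert(!summable&lt;int, double&gt;);
</code></pre>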

<h3 id="partial-ordering-by-constraints">Partial ordering by constraints</h3>

<p><em>This section was inspired by the article
<a href="https://akrzemi1.wordpress.com/2020/05/07/ordering-by-constraints/">Ordering by constraints</a>
by Andrzej Krzemieński. Check it out for a more thorough explanation.</em></p>

<p>Aside from specifying requirements for a single declaration, constraints can be
used to select the best alternative for a normal function, template function or
a class template. To do so, constraints have a notion of partial ordering, that is,
one constraint can be <em>at least</em> or <em>more</em> constrained than the other or they can
be <em>unordered</em> (unrelated). The compiler decomposes (the Standard uses the term
<em>normalization</em> but for me <em>decomposition</em> sounds better) a constraint into a conjunction/
disjunction of <em>atomic</em> constraints. Intuitively, <code>C1 &amp;&amp; C2</code> is more constrained than
<code>C1</code>, <code>C1</code> is more constrained than <code>C1 || C2</code> and any constraint is more
constrained than the unconstrained declaration. When more than one candidate
with satisfied constraints is present, the most constrained one is chosen. If
constraints are unordered, the usage is ambiguous.</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
concept integral_or_floating = std::integral&lt;T&gt; || std::floating_point&lt;T&gt;;

template&lt;typename T&gt;
concept integral_and_char = std::integral&lt;T&gt; &amp;&amp; std::same_as&lt;T, char&gt;;

void f(std::integral auto){}        // #1
void f(integral_or_floating auto){} // #2
void f(std::same_as&lt;char&gt; auto){}   // #3

// calls #1 because std::integral is more constrained
// than integral_or_floating(#2)
f(int{});
// calls #2 because it's the only one whose constraint is satisfied
f(double{});
// error, #1, #2 and #3's constraints are satisfied but unordered
// because std::same_as&lt;char&gt; appears only in #3
f(char{});

void f(integral_and_char auto){}    // #4

// calls #4 because integral_and_char is more
// constrained than std::same_as&lt;char&gt;(#3) and std::integral(#1)
f(char{});
</code></pre>

<p>It’s important to understand how the compiler decomposes constraints and when it
can see that they have a common atomic constraint and deduce the order between
them.
During decomposition, a concept name is replaced with its definition but a
<code>requires-expression</code> is <em>not</em> further decomposed. Two atomic constraints are
identical only if they are represented by the same expression at the same
location.
For example, <code>concept C = C1 &amp;&amp; C2</code> is decomposed to a conjunction of <code>C1</code> and
<code>C2</code> but <code>concept C = requires{...}</code> becomes <code>concept C = Expression-Location-Pair</code>
and its body is not further decomposed.
If two concepts
have common or even the same requirements in their <code>requires-expression</code>s,
they will always be unordered because either their <code>requires-expression</code>s are
not equal or they are equal but at different source locations. The same happens
with duplicated usage of naked type traits: they always represent different
atomic constraints because of different locations and thus cannot be used for
ordering.</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
requires std::is_integral_v&lt;T&gt;  // uses type traits instead of concepts
void f1(){}  // #1

template&lt;typename T&gt;
requires std::is_integral_v&lt;T&gt; || std::is_floating_point_v&lt;T&gt;
void f1(){}  // #2

// error, #1 and #2 have common `std::is_integral_v&lt;T&gt;` expression
// but at different locations(line 2 vs. line 6), thus, #1 and #2 constraints
// are unordered and the call is ambiguous
f1&lt;int&gt;();

template&lt;typename T&gt;
concept C1 = requires{      // requires-expression is not decomposed
    requires std::integral&lt;T&gt;;
};

template&lt;typename T&gt;
concept C2 = requires{      // requires-expression is not decomposed
    requires (std::integral&lt;T&gt; || std::floating_point&lt;T&gt;);
};

void f2(C1 auto){}  // #3
void f2(C2 auto){}  // #4

// error, since requires-expressions are not decomposed, #3 and #4 have
// completely unrelated and hence unordered constraints and the call is
// ambiguous
f2(int{});
</code></pre>
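
<p>For contrast, here is a sketch of how the same traits become ordered once the common
part is given a name. The concepts <code>my_integral</code> and <code>my_number</code> and the function
<code>pick()</code> are made up for this example. Both overloads now refer to the very same atomic
constraint (the one inside <code>my_integral</code>), so the compiler can order them:</p>

<pre><code class="language-cpp">#include &lt;type_traits&gt;

template&lt;typename T&gt;
concept my_integral = std::is_integral_v&lt;T&gt;;

// reuses my_integral, so its atomic constraint has the same location
template&lt;typename T&gt;
concept my_number = my_integral&lt;T&gt; || std::is_floating_point_v&lt;T&gt;;

constexpr int pick(my_integral auto){ return 1; }   // #1
constexpr int pick(my_number auto){ return 2; }     // #2

// both are viable for int, #1 is more constrained
static_assert(pick(0) == 1);
// only #2 is viable for double
static_assert(pick(0.0) == 2);
</code></pre>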

<h3 id="conditionally-trivial-special-member-functions">Conditionally trivial special member functions</h3>

<p>For wrapper types like <code>std::optional</code> or <code>std::variant</code> it’s useful to
propagate <em>triviality</em> from the types they wrap. For example, <code>std::optional&lt;int&gt;</code>
should be trivial but <code>std::optional&lt;std::string&gt;</code> shouldn’t. In C++17 this can be
achieved using <a href="https://wg21.link/P0848R3#introduction">pretty cumbersome machinery</a>.
Concepts provide a natural solution for this: we can create multiple versions of
the same special member function with different constraints, the compiler will
choose the best one and ignore the others. In this particular case, we need
a trivial set of functions when the wrapped type is trivial and a non-trivial
set of
functions when it’s not. For this to work, some updates have been made to the
definition of a trivial type. In C++17, a trivially copyable class is required to
have <em>all</em> of its copy and move operations either deleted or trivial.
To take concepts into account, the notion of an <em>eligible special member function</em>
was introduced. It is a function that’s not deleted, whose constraints (if any) are
satisfied and for which no other special member function of the same kind, with the same
first parameter type (if any), is more constrained. Simply put, it’s the
function(s) with the most constrained satisfied constraints (if any). All declared
destructors (yes, now you can have more than one) are now called <em>prospective</em>
destructors. Only one “active” destructor is allowed, it’s selected
using normal overload resolution.<br />
A <em>trivially copyable</em> class is now a class that has a <em>trivial</em>
non-deleted destructor, <em>at least one</em> eligible
copy/move operation and all of whose eligible copy/move operations are trivial. A
<em>trivial</em> class is a trivially copyable class that has one or more eligible
default constructors, all of which are trivial.<br />
Here’s the skeleton of this technique:</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
class optional{
public:
    optional() = default;

    // trivial copy-constructor
    optional(const optional&amp;) = default;

    // non-trivial copy-constructor
    optional(const optional&amp; rhs)
        requires(!std::is_trivially_copy_constructible_v&lt;T&gt;){
        // ...
    }

    // trivial destructor
    ~optional() = default;

    // non-trivial destructor
    ~optional() requires(!std::is_trivially_destructible_v&lt;T&gt;){
        // ...
    }
    // ...
private:
    T value;
};

static_assert(std::is_trivial_v&lt;optional&lt;int&gt;&gt;);
static_assert(!std::is_trivial_v&lt;optional&lt;std::string&gt;&gt;);
</code></pre>

<hr />

<h2 id="modules">Modules</h2>

<p>Modules are a new way to organize C++ code into logical components. Historically,
C++ used the C model
which is based on the preprocessor and repetitive textual inclusion. It has
a lot of problems such as macro leakage into and out of headers,
inclusion-order-dependent
headers, repetitive compilation of the same code, cyclic dependencies, poor
encapsulation of implementation details and so on. Modules are meant to solve them
but not so fast. We won’t be able to use their full power until compilers <em>and</em>
build tools, such as CMake, support them too. A full description of modules is
well beyond the scope of this article, I will only show the basic ideas and use
cases. For more details you can read
<a href="https://vector-of-bool.github.io/2019/03/10/modules-1.html">a series of articles by <code>vector-of-bool</code></a>
or search for other blog posts and talks.</p>

<p>The main idea behind modules is to restrict what’s accessible(<code>export</code>ed) when 
a module is used(<code>import</code>ed) by its clients. This allows true hiding of 
implementation details.</p>

<pre><code class="language-cpp">// module.cpp
// dots in module name are for readability purpose, they have no special meaning
export module my.tool;  // module declaration

export void f(){}       // export f()
void g(){}              // but not g()

// client.cpp
import my.tool;

f();    // OK
g();    // error, not exported
</code></pre>

<p>Modules are macro-unfriendly: you
can’t pass manually <code>#define</code>d macros to a module (the compiler’s built-in and
command-line macros are still visible) and only in one special case can you import
macros from
a module. Modules can’t have cyclic dependencies. A module is a self-contained
entity, so the compiler can precompile
each module exactly once and overall compilation time can improve considerably. Import
order doesn’t matter for modules.</p>

<h3 id="module-units">Module units</h3>

<p>A module unit can be either an <em>interface</em> or an <em>implementation</em> unit. Only interface
units can contribute to the module’s interface, that’s why they have <code>export</code> in
their
declaration. A module can be a single file or scattered across <em>partitions</em>. Each
partition is named in the form <code>module_name:partition_name</code>. Partitions are
<code>import</code>able only within the same module and a client can <code>import</code> only the module as
a whole. This provides much better encapsulation than header files.</p>

<pre><code class="language-cpp">// tool.cpp
export module tool; // primary module interface unit
export import :helpers; // re-export(see below) helpers partition

export void f();
export void g();

// tool.internals.cpp
module tool:internals;  // implementation partition
void utility();

// tool.impl.cpp
module tool;    // implementation unit, implicitly imports primary module unit
import :internals;

void utility(){}

void f(){
    utility();
}

// tool.impl2.cpp
module tool;    // another implementation unit
void g(){}

// tool.helpers.cpp
export module tool:helpers; // module interface partition
import :internals;

export void h(){
    utility();
}

// client.cpp
import tool;

f();
g();
h();
</code></pre>

<p>Note that partitions are imported without specifying the module name. This prohibits
importing another module’s partitions. Multiple implementation units
(<code>module tool;</code>) are allowed, all other units and partitions of any kind must
be unique. All interface partitions must be re-exported by the primary module interface unit via
<code>export import</code>.</p>

<h3 id="export">Export</h3>

<p>Here are various forms of <code>export</code>, the general rule is that you can’t <code>export</code>
names with internal linkage:</p>

<pre><code class="language-cpp">// tool.cpp
export module tool;
export import :helpers; // import and re-export helpers interface partition

export int x{}; // export single declaration

export{         // export multiple declarations
    int y{};
    void f(){}
}

export namespace A{ // export the whole namespace
    void f();
    void g();
}

namespace B{
    export void f();// export a single declaration within a namespace
    void g();
}

namespace{
    export int x;   // error, x has internal linkage
    export void f();// error, f() has internal linkage
}

export class C; // export as incomplete type
class C{};
export C get_c();

// client.cpp
import tool;

C c1;    // error, C is incomplete
auto c2 = get_c();  // OK
</code></pre>

<h3 id="import">Import</h3>

<p>Import declarations must precede any other “non-module” declarations, which
allows quick dependency analysis. Otherwise, it’s pretty intuitive:</p>

<pre><code class="language-cpp">// tool.cpp
export module tool;
export import :helpers;  // interface partitions must be re-exported

export void f(){}

// tool.helpers.cpp
export module tool:helpers;

export void g(){}

// client.cpp
import tool;

f();
g();
</code></pre>

<h3 id="header-units">Header units</h3>

<p>There’s one special <code>import</code> form that allows importing <em>importable</em> headers:
<code>import &lt;header.h&gt;</code> or <code>import "header.h"</code>. The compiler creates a synthesized
<em>header unit</em> and makes all declarations implicitly exported.
Which headers are actually importable
is implementation-defined but all C++ library headers are. Perhaps there will
be a way to tell the compiler which user-provided headers are importable; such
headers should not contain non-inline function definitions or variables with
external linkage. It’s the only
<code>import</code> form that allows importing macros from headers (but you still can’t
re-export them via <code>export import "header.h"</code>). Don’t use it to import a random
legacy header if you’re not sure about its content.</p>

<h3 id="global-module-fragment">Global module fragment</h3>

<p>If you need to use old-school headers within a module, there’s a special place
to put <code>#include</code>s safely, the <em>global module fragment</em>:</p>

<pre><code class="language-cpp">// header.h
#pragma once
class A{};
void g(){}

// tool.cpp
module;             // global module fragment
#include "header.h"
export module tool; // ends here

export void f(){    // uses declarations from header.h
    g();
    A a;
}
</code></pre>

<p>It must appear before the named module declaration and it can contain only 
preprocessor
directives. All declarations from all global module fragments and non-modular
translation units are attached to a single global module. Thus, all rules for 
normal headers apply here.</p>

<h3 id="private-module-fragment">Private module fragment</h3>

<p>The final strange beast is the <em>private module fragment</em>. Its intent is to hide
implementation details in a single-file module (it’s not allowed elsewhere). In
theory, clients might
not need to be recompiled when things in a private module fragment change:</p>

<pre><code class="language-cpp">export module tool; // interface

export void f();    // declared here

module :private;    // implementation details

void f(){}          // defined here
</code></pre>

<h3 id="no-more-implicit-inline">No more implicit <code>inline</code></h3>

<p>There’s also an interesting change regarding <code>inline</code>. Member functions defined
within the class definition are <em>not</em> implicitly <code>inline</code> if that class is attached
to a named module. <code>inline</code> functions in a named module can use only names
that are visible to a client.</p>

<pre><code class="language-cpp">// header.h
struct C{
    void f(){}  // still inline because attached to a global module
};

// tool.cpp
module;
#include "header.h"

export module tool;

class A{};  // not exported

export struct B{// B is attached to module "tool"
    void f(){   // not implicitly inline anymore
        A a;    // can safely use non-exported name
    }

    inline void g(){
        A a;    // oops, uses non-exported name
    }

    inline void h(){
        f();    // fine, f() is not inline
    }
};

// client.cpp
import tool;

B b;
b.f();  // OK
b.g();  // error, A is undefined
b.h();  // OK
</code></pre>

<hr />

<h2 id="coroutines">Coroutines</h2>

<p>Finally, we have stackless (their state is stored on the heap, not on the stack)
<a href="https://en.wikipedia.org/wiki/Coroutine">coroutines</a> in C++. C++20 provides
nearly the lowest-level API possible and leaves the rest up to the user.
We’ve got the <code>co_await</code>,
<code>co_yield</code> and <code>co_return</code> keywords and rules for interaction between the caller
and the callee. Those rules are so low-level that I see no point in explaining them
here.
You can find more details on <a href="https://lewissbaker.github.io/">Lewis Baker’s blog</a>.
Hopefully, C++23 will fill this gap with some library utilities. Until then,
we can use third-party libraries. Here’s an example
that uses <a href="https://github.com/lewissbaker/cppcoro">cppcoro</a>:</p>

<pre><code class="language-cpp">cppcoro::task&lt;int&gt; someAsyncTask()
{
    int result;
    // get the result somehow
    co_return result;
}

// task&lt;&gt; is analog of void for normal function
cppcoro::task&lt;&gt; usageExample()
{
    // creates a new task but doesn't start executing the coroutine yet
    cppcoro::task&lt;int&gt; myTask = someAsyncTask();
    // ...
    // Coroutine is only started when we later co_await the task.
    auto result = co_await myTask;
}

// will lazily generate numbers from 0 to 9
cppcoro::generator&lt;std::size_t&gt; getTenNumbers()
{
    std::size_t n{0};
    while (n != 10)
    {
        co_yield n++;
    }
}

void printNumbers()
{
    for(const auto n : getTenNumbers())
    {
        std::cout &lt;&lt; n;    
    }
}
</code></pre>

<hr />

<h2 id="three-way-comparison">Three-way comparison</h2>

<p>Before C++20, to provide comparison operations for a class, you had to
implement 6 operators: <code>==, !=, &lt;, &lt;=, &gt;, &gt;=</code>.
Usually, four of them contain boilerplate code that works in terms of <code>==</code> and
<code>&lt;</code>, which contain the real comparison logic.
Common practice
is to implement them as free functions taking <code>const T&amp;</code> to allow comparison of
convertible types.
If you want to support non-convertible types, you need to add two more sets of 6
functions, <code>op(const T1&amp;, const T2&amp;)</code> and <code>op(const T2&amp;, const T1&amp;)</code>, and now you
have 18 comparison operators (check out <a href="https://en.cppreference.com/w/cpp/utility/optional/operator_cmp"><code>std::optional</code></a>).
C++20 gives us a better way to handle and think about comparisons. Now you need
to focus on <code>operator&lt;=&gt;()</code> and sometimes on <code>operator==()</code>.
The new <code>operator&lt;=&gt;</code> (the spaceship
operator) implements three-way comparison: it tells whether <code>a</code> is less than,
equal to, or greater than <code>b</code> in a single call, just like <code>strcmp()</code>. It returns a
comparison category (see below) that can be compared to zero. With this, the
compiler can replace calls to <code>&lt;, &lt;=, &gt;, &gt;=</code> with a call to <code>operator&lt;=&gt;()</code> and
a check of its result (<code>a &lt; b</code> becomes <code>a &lt;=&gt; b &lt; 0</code>), and calls to <code>==, !=</code> with a call to <code>operator==()</code> (<code>a != b</code> becomes <code>!(a == b)</code>).
Due to new lookup rules, the rewritten expressions can handle asymmetric comparisons, e.g. when you
provide a single
<code>T1::operator==(const T2&amp;)</code>, you get both <code>T1 == T2</code> and <code>T2 == T1</code>; the same
applies to <code>operator&lt;=&gt;()</code>.
Now you need to write at most 2 functions to get all 6 comparisons between
convertible types, and another 2 functions to get all 12 comparisons between
non-convertible types.</p>

<h4 id="comparison-categories">Comparison categories</h4>

<p>The Standard provides three comparison categories (which doesn’t prevent you from
having your own).
<code>strong_ordering</code> implies that exactly one of <code>a &lt; b</code>, <code>a &gt; b</code>, <code>a == b</code> must be
true and if <code>a == b</code> then <code>f(a) == f(b)</code>.
<code>weak_ordering</code> implies that exactly one of <code>a &lt; b</code>, <code>a &gt; b</code>, <code>a == b</code> must be
true and if <code>a == b</code> then <code>f(a)</code> may <em>not</em> be equal to <code>f(b)</code>. Such elements are
equivalent but not equal.
<code>partial_ordering</code> means that possibly none of <code>a &lt; b</code>, <code>a &gt; b</code>, <code>a == b</code>
is true, and if <code>a == b</code> then <code>f(a)</code> may not be equal to <code>f(b)</code>. That is, some
elements may be incomparable.
An important note here is that <code>f()</code> denotes a function that accesses only <em>salient</em>
attributes. For example, <code>std::vector&lt;int&gt;</code> is strongly ordered even though
two vectors with the same values can have different capacities. Here, capacity is
not a salient attribute. An example of a weakly ordered type is <code>CaseInsensitiveString</code>:
it can store the original string as-is but compare in a case-insensitive way.
An example of a partially ordered type is <code>float</code>/<code>double</code> because
<code>NaN</code> is not comparable to any other value. These categories form a hierarchy, i.e., <code>strong_ordering</code>
can be converted to <code>weak_ordering</code> and <code>partial_ordering</code>, and <code>weak_ordering</code>
can be converted to <code>partial_ordering</code>.</p>
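
<p>A minimal sketch of the three categories might look like this. Note that
<code>CaseInsensitiveString</code> and <code>nanIsUnordered()</code> are hypothetical names written
for this illustration (not library facilities), and the case folding assumes plain
ASCII text:</p>

<pre><code class="language-cpp">#include &lt;cctype&gt;
#include &lt;compare&gt;
#include &lt;cstddef&gt;
#include &lt;limits&gt;
#include &lt;string&gt;
#include &lt;type_traits&gt;

// int-s are strongly ordered, double-s are only partially ordered
static_assert(std::is_same_v&lt;decltype(1 &lt;=&gt; 2), std::strong_ordering&gt;);
static_assert(std::is_same_v&lt;decltype(1.0 &lt;=&gt; 2.0), std::partial_ordering&gt;);

// NaN is incomparable: none of &lt;, &gt;, == holds for it
inline bool nanIsUnordered(){
    const double nan = std::numeric_limits&lt;double&gt;::quiet_NaN();
    return (nan &lt;=&gt; 1.0) == std::partial_ordering::unordered;
}

// hypothetical type: keeps the original text, compares ignoring (ASCII) case
struct CaseInsensitiveString{
    std::string text;

    static int lower(char c){
        return std::tolower(static_cast&lt;unsigned char&gt;(c));
    }

    std::weak_ordering operator&lt;=&gt;(const CaseInsensitiveString&amp; other) const{
        const std::size_t n = text.size() &lt; other.text.size() ? text.size()
                                                               : other.text.size();
        for(std::size_t i = 0; i &lt; n; ++i){
            // int &lt;=&gt; int yields std::strong_ordering
            if(auto cmp = lower(text[i]) &lt;=&gt; lower(other.text[i]); cmp != 0)
                return cmp; // implicitly converted to std::weak_ordering
        }
        return text.size() &lt;=&gt; other.text.size();
    }

    bool operator==(const CaseInsensitiveString&amp; other) const{
        return (*this &lt;=&gt; other) == 0;
    }
};
</code></pre>

<p>With this sketch, <code>CaseInsensitiveString{"Hello"}</code> and <code>CaseInsensitiveString{"hELLO"}</code>
compare equal although their <code>text</code> members differ, i.e., they are equivalent but
not equal.</p>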

<h4 id="defaulted-comparisons">Defaulted comparisons</h4>

<p>Comparisons can be defaulted just like special member functions. In such a case
they operate in a member-wise fashion by comparing all base class subobjects and
non-static data members with their corresponding operators. Defaulted <code>operator&lt;=&gt;()</code> also
declares defaulted <code>operator==()</code> (if there was none),
so you can write <code>auto operator&lt;=&gt;(const T&amp;) const = default;</code> and get all six
comparison operations with member-wise semantics.</p>

<pre><code class="language-cpp">template&lt;typename T1, typename T2&gt;
void TestComparisons(T1 a, T2 b)
{
    (a &lt; b), (a &lt;= b), (a &gt; b), (a &gt;= b), (a == b), (a != b);
}

struct S2
{
    int a;
    int b;
};

struct S1
{
    int x;
    int y;
    // support homogeneous comparisons
    auto operator&lt;=&gt;(const S1&amp;) const = default;
    // this is required because there's operator==(const S2&amp;) which prevents
    // implicit declaration of defaulted operator==()
    bool operator==(const S1&amp;) const = default;

    // support heterogeneous comparisons
    std::strong_ordering operator&lt;=&gt;(const S2&amp; other) const
    {
        if (auto cmp = x &lt;=&gt; other.a; cmp != 0)
            return cmp;
        return y &lt;=&gt; other.b;
    }

    bool operator==(const S2&amp; other) const
    {
        return (*this &lt;=&gt; other) == 0;
    }
};

TestComparisons(S1{}, S1{});
TestComparisons(S1{}, S2{});
TestComparisons(S2{}, S1{});
</code></pre>

<p>Implicitly declared <code>operator==()</code> has the same signature as <code>operator&lt;=&gt;()</code>
except that return type is <code>bool</code>.</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
struct X
{
    friend constexpr std::partial_ordering operator&lt;=&gt;(X, X) requires(sizeof(T) != 1) = default;
    // implicitly declares:
    // friend constexpr bool operator==(X, X) requires(sizeof(T) != 1) = default;

    [[nodiscard]] virtual std::strong_ordering operator&lt;=&gt;(const X&amp;) const = default;
    // implicitly declares:
    //[[nodiscard]] virtual bool operator==(const X&amp;) const = default; 
};
</code></pre>

<p>The deduced comparison category is the weakest one among the categories of the type’s members.</p>

<pre><code class="language-cpp">struct S3{
    int x;      // int-s are strongly ordered
    double d;   // but double-s are partially ordered
    // thus, the resulting category is std::partial_ordering
    auto operator&lt;=&gt;(const S3&amp;) const = default;
};
static_assert(std::is_same_v&lt;decltype(S3{} &lt;=&gt; S3{}), std::partial_ordering&gt;);
</code></pre>

<p>Defaulted comparison operators must be members or friends, and only friends can take their arguments by value.</p>

<pre><code class="language-cpp">struct S4
{
    int x;
    int y;
    // member version must have op(const T&amp;) const; form
    auto operator&lt;=&gt;(const S4&amp;) const = default;

    // friend version can take arguments by const-reference or by-value
    // friend auto operator&lt;=&gt;(const S4&amp;, const S4&amp;) = default;
    // friend auto operator&lt;=&gt;(S4, S4) = default;
};
</code></pre>

<p>Comparisons can be defaulted out of class, just like special member functions.</p>

<pre><code class="language-cpp">struct S5
{
    int x;
    std::strong_ordering operator&lt;=&gt;(const S5&amp;) const;
    bool operator==(const S5&amp;) const;
};

std::strong_ordering S5::operator&lt;=&gt;(const S5&amp;) const = default;
bool S5::operator==(const S5&amp;) const = default;
</code></pre>

<p>Defaulted <code>operator&lt;=&gt;()</code> uses <code>operator&lt;=&gt;()</code> of class members or, if a member
has none, its ordering can be synthesized from the existing <code>Member::operator==()</code> and <code>Member::operator&lt;()</code>.
Note that this works only for members and not for the class itself:
an existing <code>T::operator&lt;()</code> is never used in a defaulted <code>T::operator&lt;=&gt;()</code>.</p>

<pre><code class="language-cpp">// not in our immediate control
struct Legacy
{
    bool operator==(Legacy const&amp;) const;
    bool operator&lt;(Legacy const&amp;) const;
};

struct S6
{
    int x;
    Legacy l;
    // deleted because Legacy doesn't have operator&lt;=&gt;(), comparison category
    // can't be deduced
    auto operator&lt;=&gt;(const S6&amp;) const = default;
};

struct S7
{
    int x;
    Legacy l;

    std::strong_ordering operator&lt;=&gt;(const S7&amp; rhs) const = default;
    /*
    Since comparison category is provided explicitly, ordering can be
    synthesized using operator&lt;() and operator==(). They must return exactly
    `bool` for this to work. It will work for weak and partial ordering as well.
    
    Here's an example of synthesized operator&lt;=&gt;():
    std::strong_ordering operator&lt;=&gt;(const S7&amp; rhs) const
    {
        // use operator&lt;=&gt;() for int
        if(auto cmp = x &lt;=&gt; rhs.x; cmp != 0) return cmp;

        // synthesize ordering for Legacy using operator&lt;() and operator==()
        if(l == rhs.l) return std::strong_ordering::equal;
        if(l &lt; rhs.l) return std::strong_ordering::less;
        return std::strong_ordering::greater;
    }
    */
};

struct NoEqual
{
    bool operator&lt;(const NoEqual&amp;) const;
};

struct S8
{
    NoEqual n;
    // deleted, NoEqual doesn't have operator&lt;=&gt;()
    // auto operator&lt;=&gt;(const S8&amp;) const = default;

    // deleted as well because NoEqual doesn't have operator==()
    std::strong_ordering operator&lt;=&gt;(const S8&amp;) const = default;
};

struct W
{
    std::weak_ordering operator&lt;=&gt;(const W&amp;) const = default;
};

struct S9
{
    W w;
    // ask for strong_ordering but W can provide only weak_ordering, this will
    // yield an error during instantiation
    std::strong_ordering operator&lt;=&gt;(const S9&amp;) const = default;
    void f()
    {
        (S9{} &lt;=&gt; S9{});    // error
    }
};
</code></pre>

<p><code>union</code> and reference members are not supported.</p>

<pre><code class="language-cpp">struct S4
{
    int&amp; r;
    // deleted because of reference member
    auto operator&lt;=&gt;(const S4&amp;) const = default;
};
</code></pre>

<hr />

<h2 id="lambda-features">Lambda expressions</h2>

<h3 id="lambda-this">Allow lambda-capture <code>[=, this]</code></h3>

<p>When captured implicitly, <code>this</code> is always captured by-reference, even with <code>[=]</code>.
To remove this confusion, C++20 deprecates such behavior and allows more explicit
<code>[=, this]</code>:</p>

<pre><code class="language-cpp">struct S{
    void f(){
        [=]{};          // captures this by reference, deprecated since C++20
        [=, *this]{};   // OK since C++17, captures this by value
        [=, this]{};    // OK since C++20, captures this by reference
    }
};
</code></pre>

<hr />

<h3 id="lambda-templ-params">Template parameter list for generic lambdas</h3>

<p>Sometimes generic lambdas are too generic. C++20 allows using the familiar
function template syntax to introduce type names directly.</p>

<pre><code class="language-cpp">// lambda that expects std::vector&lt;T&gt;
// until C++20:
[](auto vector){
    using T = typename decltype(vector)::value_type;
    // use T
};
// since C++20:
[]&lt;typename T&gt;(std::vector&lt;T&gt; vector){
    // use T
};

// access argument type
// until C++20
[](const auto&amp; x){
    using T = std::decay_t&lt;decltype(x)&gt;;
    // using T = decltype(x); // without decay_t&lt;&gt; it would be const T&amp;, so
    T copy = x;               // copy would be a reference type
    T::static_function();     // and these wouldn't work at all
    using Iterator = typename T::iterator;
};
// since C++20
[]&lt;typename T&gt;(const T&amp; x){
    T copy = x;
    T::static_function();
    using Iterator = typename T::iterator;
};

// perfect forwarding
// until C++20:
[](auto&amp;&amp;... args){
    return f(std::forward&lt;decltype(args)&gt;(args)...);
};
// since C++20:
[]&lt;typename... Ts&gt;(Ts&amp;&amp;... args){
    return f(std::forward&lt;Ts&gt;(args)...);
};

// and of course you can mix them with auto-parameters
[]&lt;typename T&gt;(const T&amp; a, auto b){};
</code></pre>
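
<p>As a small self-contained sketch of the first case above, here is a lambda that
accepts only <code>std::vector&lt;T&gt;</code> and uses <code>T</code> directly. The name <code>sumVector</code> is
hypothetical and exists only for this illustration:</p>

<pre><code class="language-cpp">#include &lt;vector&gt;

// hypothetical helper: T is deduced from the element type of the vector
inline constexpr auto sumVector = []&lt;typename T&gt;(const std::vector&lt;T&gt;&amp; values){
    T total{};                  // T is available without decltype() tricks
    for(const T&amp; v : values){
        total += v;
    }
    return total;
};
</code></pre>

<p>Unlike a plain <code>auto</code> parameter, a call such as <code>sumVector(42)</code> is rejected
because <code>int</code> is not a <code>std::vector&lt;T&gt;</code>.</p>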

<hr />

<h3 id="lambda-uneval-ctx">Lambdas in unevaluated contexts</h3>

<p>Lambda expressions can be used in unevaluated contexts, such as <code>sizeof()</code>,
<code>typeid()</code>, <code>decltype()</code>, etc. Here are some key points for this feature, for a
more real-world example see <a href="#lambda-def-ctor">Default constructible and assignable stateless 
lambdas</a>.</p>

<p>The main principle is that lambdas have a unique unknown type, two lambdas and 
their types are never equal.</p>

<pre><code class="language-cpp">using L = decltype([]{});   // lambdas have no linkage
L PublicApi();              // L can't be used for external linkage

// in a template, these are two different declarations
template&lt;class T&gt; void f(decltype([]{}) (*s)[sizeof(T)]);
template&lt;class T&gt; void f(decltype([]{}) (*s)[sizeof(T)]);

// again, lambda types are never equivalent
static decltype([]{}) f();
static decltype([]{}) f(); // error, return type mismatch

static decltype([]{}) g();
static decltype(g()) g(); // okay, redeclaration

// each specialization has its own lambda with unique type
template&lt;typename T&gt;
using R = decltype([]{});

static_assert(!std::is_same_v&lt;R&lt;int&gt;, R&lt;char&gt;&gt;);

// Lambda-based SFINAE and constraints are not supported, it just fails
template &lt;class T&gt;
auto f(T) -&gt; decltype([]() { T::invalid; } ());
void f(...);

template&lt;typename T&gt;
void g(T) requires requires{
    [](){typename T::invalid x;}; }
{}
void g(...){}

f(0);  // error
g(0);  // error
</code></pre>

<p>In the following example, <code>f()</code> increments the same counter in both
translation units because an <code>inline</code> function behaves as if there’s
only one definition of it. However, <code>g_s</code> violates the ODR: although
there’s only one definition of it, it has multiple declarations, and they
differ because the lambdas in <code>a.cpp</code> and <code>b.cpp</code> are two different lambdas.
Thus, <code>S</code> has a different non-type template argument in each translation unit:</p>

<pre><code class="language-cpp">// a.h
template&lt;typename T&gt;
int counter(){
    static int value{};
    return value++;
}

inline int f(){
    return counter&lt;decltype([]{})&gt;();
}

template&lt;auto&gt; struct S{ void call(){} };
// cast lambda to pointer
inline S&lt;+[]{}&gt; g_s;

// a.cpp
#include "a.h"
auto v = f();
g_s.call();

// b.cpp
#include "a.h"
auto v = f();
g_s.call();
</code></pre>

<hr />

<h3 id="lambda-def-ctor">Default constructible and assignable stateless lambdas</h3>

<p>In C++20 stateless lambdas are default constructible and assignable, which
allows using the type of a lambda to construct or assign it later. With
<a href="#lambda-uneval-ctx">Lambdas in unevaluated contexts</a> we can get the type of a
lambda with <code>decltype()</code>
and create a variable of that type later:</p>

<pre><code class="language-cpp">auto greater = [](auto x,auto y)
{
    return x &gt; y;
};
// requires default constructible type
std::map&lt;std::string, int, decltype(greater)&gt; map;
std::map&lt;std::string, int, decltype(greater)&gt; map2;
map2 = map;         // requires assignable type
</code></pre>

<p>Here, <code>std::map</code> takes a comparator type to instantiate it later. While we could
get a lambda type in C++17, it was not possible to instantiate it because 
lambdas were not default constructible.</p>
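
<p>Combined with lambdas in unevaluated contexts, the comparator doesn’t even need a
named variable. The following is a minimal sketch; <code>DescendingSet</code> and
<code>sortedDescending()</code> are hypothetical names used only for this illustration:</p>

<pre><code class="language-cpp">#include &lt;initializer_list&gt;
#include &lt;set&gt;
#include &lt;vector&gt;

// hypothetical alias: the comparator type comes from a lambda written
// directly inside decltype()
using DescendingSet = std::set&lt;int, decltype([](int a, int b){ return a &gt; b; })&gt;;

// hypothetical helper: returns the unique values in descending order
inline std::vector&lt;int&gt; sortedDescending(std::initializer_list&lt;int&gt; values){
    DescendingSet s(values);    // the comparator is default-constructed here
    return std::vector&lt;int&gt;(s.begin(), s.end());
}
</code></pre>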

<hr />

<h3 id="lambda-pack-exp">Pack expansion in lambda init-capture</h3>

<p>C++20 simplifies capturing parameter packs in lambdas. Until C++20, packs could only be
captured by value or by reference, and moving a pack required tricks with
<code>std::tuple</code>. Now it’s much easier: we can create an <em>init-capture pack</em>
and initialize it with the pack we want to capture. It’s not limited to
<code>std::move</code> or <code>std::forward</code>; any function can be
applied to pack elements.</p>

<pre><code class="language-cpp">void g(int, int){}

// C++17
template&lt;class F, class... Args&gt;
auto delay_apply(F&amp;&amp; f, Args&amp;&amp;... args) {
    return [f=std::forward&lt;F&gt;(f), tup=std::make_tuple(std::forward&lt;Args&gt;(args)...)]()
            -&gt; decltype(auto) {
        return std::apply(f, tup);
    };
}

// C++20
template&lt;typename F, typename... Args&gt;
auto delay_call(F&amp;&amp; f, Args&amp;&amp;... args) {
    return [f = std::forward&lt;F&gt;(f), ...f_args=std::forward&lt;Args&gt;(args)]()
            -&gt; decltype(auto) {
        return f(f_args...);
    };
}

void f(){
    delay_call(g, 1, 2)();
}
</code></pre>
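
<p>To illustrate the “any function” part, here is a minimal sketch that applies
<code>std::to_string()</code> to every pack element while capturing it. The name
<code>makeJoiner()</code> is hypothetical and exists only for this example:</p>

<pre><code class="language-cpp">#include &lt;string&gt;

// hypothetical helper: each argument is converted to std::string at capture time
template&lt;typename... Args&gt;
auto makeJoiner(Args... args){
    return [...parts = std::to_string(args)]{
        // fold expression over the captured pack
        return (parts + ... + std::string{});
    };
}
</code></pre>

<p>The returned lambda owns the converted strings, so it stays valid after the
original arguments are gone.</p>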

<hr />

<h2 id="constexpr-features">Constant expressions</h2>

<h3 id="consteval">Immediate functions (<code>consteval</code>)</h3>

<p>While <code>constexpr</code> implies that a function <em>can</em> be evaluated at compile-time,
<code>consteval</code> specifies that a function <em>must</em> be evaluated at compile-time (only).
<code>virtual</code> functions are allowed to be <code>consteval</code> but they can override and be
overridden by another <code>consteval</code> function only, i.e., a mix of <code>consteval</code> and
non-<code>consteval</code> is not allowed.
Destructors and allocation/deallocation functions can’t be <code>consteval</code>.</p>

<pre><code class="language-cpp">consteval int GetInt(int x){
    return x;
}

constexpr void f(){
    auto x1 = GetInt(1);
    constexpr auto x2 = GetInt(x1); // error x1 is not a constant-expression
}
</code></pre>
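
<p>Because the result of an immediate function is always a constant expression, it can
be used wherever one is required. A minimal sketch, where <code>fibonacci()</code> and
<code>buffer</code> are hypothetical names chosen for this example:</p>

<pre><code class="language-cpp">#include &lt;array&gt;
#include &lt;cstddef&gt;

// hypothetical helper, can only be evaluated at compile time
consteval std::size_t fibonacci(std::size_t n){
    return n &lt; 2 ? n : fibonacci(n - 1) + fibonacci(n - 2);
}

// usable as a template argument and in static_assert
inline constexpr std::array&lt;int, fibonacci(10)&gt; buffer{};
static_assert(fibonacci(10) == 55);
</code></pre>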

<hr />

<h3 id="constexpr-virtual"><code>constexpr</code> virtual function</h3>

<p>Virtual functions can now be <code>constexpr</code>. <code>constexpr</code> function can override
non-<code>constexpr</code> one and vice-versa.</p>

<pre><code class="language-cpp">struct Base{
    constexpr virtual ~Base() = default;
    virtual int Get() const = 0;    // non-constexpr
};

struct Derived1 : Base{
    constexpr int Get() const override {
        return 1;
    }
};

struct Derived2 : Base{
    constexpr int Get() const override {
        return 2;
    }
};

constexpr auto GetSum(){
    const Derived1 d1;
    const Derived2 d2;
    const Base* pb1 = &amp;d1;
    const Base* pb2 = &amp;d2;

    return pb1-&gt;Get() + pb2-&gt;Get();
}

static_assert(GetSum() == 1 + 2);   // evaluated at compile-time
</code></pre>

<hr />

<h3 id="constexpr-try-catch"><code>constexpr</code> try-catch blocks</h3>

<p><em>try-catch</em> blocks are now allowed inside <code>constexpr</code> functions but an exception
still can’t be thrown during constant evaluation, so the <code>catch</code> block is simply
ignored at compile-time. This can be useful, for example, in
combination with <code>constexpr new</code>: we can have a single function that works at
both run time and compile time:</p>

<pre><code class="language-cpp">constexpr void f(){
    try{
        auto p = new int;
        // ...
        delete p;
    }
    catch(...){     // ignored at compile-time
        // ...
    }
}
</code></pre>

<hr />

<h3 id="constexpr-dyn-cast"><code>constexpr</code> <code>dynamic_cast</code> and polymorphic <code>typeid</code></h3>

<p>Since virtual functions can now be <code>constexpr</code>, there’s no reason not to allow
<code>dynamic_cast</code> and polymorphic <code>typeid</code> in <code>constexpr</code>. Unfortunately,
<code>std::type_info</code> has
no <code>constexpr</code> members yet, so there’s little use for it now (thanks to Peter
Dimov for clarifying this for me).</p>

<pre><code class="language-cpp">struct Base1{
    virtual ~Base1() = default;
    constexpr virtual int get() const = 0;
};

struct Derived1 : Base1{
    constexpr int get() const override {
        return 1;
    }
};

struct Base2{
    virtual ~Base2() = default;
    constexpr virtual int get() const = 0;
};

struct Derived2 : Base2{
    constexpr int get() const override {
        return 2;
    }
};

template&lt;typename Base, typename Derived&gt;
constexpr auto downcasted_get(){
    const Derived d;
    const Base&amp; upcasted = d;
    const auto&amp; downcasted = dynamic_cast&lt;const Derived&amp;&gt;(upcasted);

    return downcasted.get();
}

static_assert(downcasted_get&lt;Base1, Derived1&gt;() == 1);
static_assert(downcasted_get&lt;Base2, Derived2&gt;() == 2);

// compile-time error, cannot cast Derived1 to Base2
static_assert(downcasted_get&lt;Base2, Derived1&gt;() == 1);
</code></pre>

<hr />

<h3 id="constexpr-union">Changing the active member of a <code>union</code> inside <code>constexpr</code></h3>

<p>Another relaxation for constant expressions. One can change an active member of
a <code>union</code> but can’t read an inactive member since it’s UB and UB is not allowed
in <code>constexpr</code> context.</p>

<pre><code class="language-cpp">union Foo {
  int i;
  float f;
};

constexpr int f() {
  Foo foo{};
  foo.i = 3;    // i is an active member
  foo.f = 1.2f; // valid since C++20, f becomes an active member

//   return foo.i;  // error, reading inactive union member
  return foo.f;
}
</code></pre>

<hr />

<h3 id="constexpr-alloc"><code>constexpr</code> allocations</h3>

<p>C++20 lays the foundation for <code>constexpr</code> containers. First, it allows <code>constexpr</code>
and
even <code>virtual constexpr</code> destructors for <em>literal</em> types (types that can be used as
a <code>constexpr</code> variable). Second, it allows calls to
<code>std::allocator&lt;T&gt;::allocate()</code> and <code>new-expression</code>s that
result in a call to one of the global <code>operator new</code> functions, provided the allocated
storage is deallocated at compile time. That is, memory can be allocated at compile-time
but it must also be freed at compile-time. This creates a bit of friction
if the final data has to be used at run-time. There’s no choice but to store it in
some non-allocating container like <code>std::array</code> and compute the compile-time value twice:
first, to get its size, and second, to actually copy it (thanks to
<strong>arthur-odwyer</strong>,
<strong>beached</strong> and <strong>luke</strong> from <a href="https://cpplang.slack.com">cpplang slack</a> for explaining
this to me):</p>

<pre><code class="language-cpp">constexpr auto get_str()
{
    std::string s1{"hello "};
    std::string s2{"world"};
    std::string s3 = s1 + s2;
    return s3;
}

constexpr auto get_array()
{
    constexpr auto N = get_str().size();
    std::array&lt;char, N&gt; arr{};
    std::copy_n(get_str().data(), N, std::begin(arr));
    return arr;
}

static_assert(!get_str().empty());

// error because it holds data allocated at compile-time
constexpr auto str = get_str();

// OK, string is stored in std::array&lt;char&gt;
constexpr auto result = get_array();
</code></pre>
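
<p>The same rule can be seen with a plain <code>new</code>/<code>delete</code> pair. This is a minimal
sketch; <code>sumOfSquares()</code> is a hypothetical function written for this example:</p>

<pre><code class="language-cpp">// hypothetical helper: sums squares of 0..n-1 using a temporary heap buffer
constexpr int sumOfSquares(int n){
    int* squares = new int[n];      // allocated during constant evaluation
    for(int i = 0; i &lt; n; ++i){
        squares[i] = i * i;
    }
    int sum = 0;
    for(int i = 0; i &lt; n; ++i){
        sum += squares[i];
    }
    delete[] squares;               // must be released before evaluation ends
    return sum;
}

static_assert(sumOfSquares(4) == 14);   // 0 + 1 + 4 + 9
</code></pre>

<p>Omitting the <code>delete[]</code> would make the call unusable in a constant expression,
while the function would still work (and leak) at run-time.</p>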

<hr />

<h3 id="constexpr-trivial-def-init">Trivial default initialization in <code>constexpr</code> functions</h3>

<p>In C++17, a <code>constexpr</code> constructor, among other requirements, must initialize all
non-static data members. This rule has been removed in C++20. But, because UB is
not allowed in <code>constexpr</code> context, you can’t read from such uninitialized members,
only write to them:</p>

<pre><code class="language-cpp">struct NonTrivial{
    bool b = false;
};

struct Trivial{
    bool b;
};

template &lt;typename T&gt;
constexpr T f1(const T&amp; other) {
    T t;        // default initialization
    t = other;
    return t;
}

template &lt;typename T&gt;
constexpr auto f2(const T&amp; other) {
    T t;
    return t.b;
}

void test(){
    constexpr auto a = f1(Trivial{});   // error in C++17, OK in C++20
    constexpr auto b = f1(NonTrivial{});// OK

    constexpr auto c = f2(Trivial{}); // error, uninitialized Trivial::b is used
    constexpr auto d = f2(NonTrivial{}); // OK
}
</code></pre>

<hr />

<h3 id="constexpr-asm">Unevaluated <code>asm</code>-declaration in <code>constexpr</code> functions</h3>

<p>An asm-declaration can now appear inside a <code>constexpr</code> function as long as it’s not
evaluated at compile-time. This allows having both compile-time and run-time (now with
asm) code inside a single function:</p>

<pre><code class="language-cpp">constexpr int add(int a, int b){
    if (std::is_constant_evaluated()){
        return a + b;
    }
    else{
        asm("asm magic here");
        //...
    }
}
</code></pre>

<hr />

<h3 id="is-const-eval"><code>std::is_constant_evaluated()</code></h3>

<p>With <code>std::is_constant_evaluated()</code> you can check whether the current invocation
occurs within a constant-evaluated context. I would like to say “during
compile-time” but, as the authors said, “C++ doesn’t make a clear distinction
between
compile-time and run-time”. Instead, C++20 declares a <a href="https://en.cppreference.com/w/cpp/types/is_constant_evaluated">list</a>
of expressions that are <em>manifestly constant-evaluated</em> and this function returns
<code>true</code> during their evaluation and <code>false</code> otherwise.<br />
Be careful not to use this function directly in such <em>manifestly constant-evaluated</em>
expressions (e.g. <code>if constexpr</code>, array size, template arguments, etc.).
By definition, in such cases
<code>std::is_constant_evaluated()</code> returns <code>true</code> even if the enclosing function
is not constant evaluated. Thanks to user <strong>destroyerrocket</strong> from <a href="https://www.reddit.com/r/cpp/">/r/cpp</a>
for bringing up this issue.</p>

<pre><code class="language-cpp">constexpr int GetNumber(){
    if(std::is_constant_evaluated()){   // should not be `if constexpr`
        return 1;
    }
    return 2;
}

constexpr int GetNumber(int x){
    if(std::is_constant_evaluated()){   // should not be `if constexpr`
        return x;
    }
    return x+1;
}

void f(){
    constexpr auto v1 = GetNumber();
    const auto v2 = GetNumber();

    // initialization of a non-const variable, not constant-evaluated
    auto v3 = GetNumber();

    assert(v1 == 1);
    assert(v2 == 1);
    assert(v3 == 2);

    constexpr auto v4 = GetNumber(1);
    int x = 1;

    // x is not a constant-expression, not constant-evaluated
    const auto v5 = GetNumber(x);

    assert(v4 == 1);
    assert(v5 == 2);    
}

// pathological examples
// always returns `true`
constexpr bool IsInConstexpr(int){
    if constexpr(std::is_constant_evaluated()){ // always `true`
        return true;
    }
    return false;
}

// always returns `sizeof(int)`
constexpr std::size_t GetArraySize(int){
    int arr[std::is_constant_evaluated()];  // always int arr[1];
    return sizeof(arr);
}

// always returns `1`
constexpr std::size_t GetStdArraySize(int){
    std::array&lt;int, std::is_constant_evaluated()&gt; arr;  // std::array&lt;int, 1&gt;
    return arr.size();
}
</code></pre>
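
<p>A typical use is to pick a compile-time-friendly algorithm while keeping an
optimized library call at run-time. The following is only a sketch; <code>power()</code> is a
hypothetical function written for this example:</p>

<pre><code class="language-cpp">#include &lt;cmath&gt;
#include &lt;type_traits&gt;

// hypothetical helper: raises base to a non-negative integer exponent
constexpr double power(double base, unsigned exponent){
    if(std::is_constant_evaluated()){   // plain `if`, not `if constexpr`
        // std::pow() is not constexpr in C++20, so use a simple loop
        double result = 1.0;
        for(unsigned i = 0; i &lt; exponent; ++i){
            result *= base;
        }
        return result;
    }
    // otherwise defer to the library implementation
    return std::pow(base, exponent);
}

static_assert(power(2.0, 10) == 1024.0);
</code></pre>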

<hr />

<h2 id="aggregates">Aggregates</h2>

<h3 id="aggr-no-ctor">Prohibit aggregates with user-declared constructors</h3>

<p>Aggregate types can no longer have <em>user-declared</em> constructors. Previously,
aggregates could still have constructors that were explicitly deleted or defaulted
on their first declaration (they’re <em>user-declared</em> but not <em>user-provided</em>), which
resulted in weird behavior.</p>

<pre><code class="language-cpp">// none of the types below are an aggregate in C++20
struct S{
    int x{2};
    S(int) = delete; // user-declared ctor
};

struct X{
    int x;
    X() = default;  // user-declared ctor
};

struct Y{
    int x;
    Y();            // user-provided ctor
};

Y::Y() = default;

void f(){
    S s(1);     // always an error
    S s2{1};    // OK in C++17, error in C++20, S is not an aggregate now
    X x{1};     // OK in C++17, error in C++20
    Y y{2};     // always an error
}
</code></pre>

<hr />

<h3 id="ctad-aggr">Class template argument deduction for aggregates</h3>

<p>In C++17 to use aggregates with CTAD we need explicit deduction guides,
that’s unnecessary now:</p>

<pre><code class="language-cpp">template&lt;typename T, typename U&gt;
struct S{
    T t;
    U u;
};
// deduction guide was needed in C++17
// template&lt;typename T, typename U&gt;
// S(T, U) -&gt; S&lt;T,U&gt;;

S s{1, 2.0};    // S&lt;int, double&gt;
</code></pre>

<p>Implicit deduction for aggregates isn’t performed when there are user-provided deduction guides:</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
struct MyData{
    T data;
};
MyData(const char*) -&gt; MyData&lt;std::string&gt;;

MyData s1{"abc"};   // OK, MyData&lt;std::string&gt; using deduction guide
MyData&lt;int&gt; s2{1};  // OK, explicit template argument
MyData s3{1};       // error, no aggregate deduction with user-provided guides
</code></pre>

<p>Can deduce array types:</p>

<pre><code class="language-cpp">template&lt;typename T, std::size_t N&gt;
struct Array{
    T data[N];
};

Array a{{1, 2, 3}}; // Array&lt;int, 3&gt;, notice additional braces
Array str{"hello"}; // Array&lt;char, 6&gt;
</code></pre>

<p>Brace elision doesn’t work for dependent non-array types or array types of
dependent bound.</p>

<pre><code class="language-cpp">template&lt;typename T, typename U&gt;
struct Pair{
    T first;
    U second;
};

template&lt;typename T, std::size_t N&gt;
struct A1{
    T data[N];
    T oneMore;
    Pair&lt;T, T&gt; p;
};

template&lt;typename T&gt;
struct A2{
    T data[3];
    T oneMore;
    Pair&lt;int, int&gt; p;
};

// A1::data is an array of dependent bound and A1::p is a dependent type, thus,
// no brace elision for them
A1 a1{{1,2,3}, 4, {5, 6}};  // A1&lt;int, 3&gt;
// A2::data is an array of non-dependent bound and A2::p is a non-dependent type,
// thus, brace elision works
A2 a2{1, 2, 3, 4, 5, 6};    // A2&lt;int&gt;
</code></pre>

<p>Works with pack expansions.
Trailing aggregate element that is a pack expansion corresponds to all 
remaining elements:</p>

<pre><code class="language-cpp">template&lt;typename... Ts&gt;
struct Overload : Ts...{
    using Ts::operator()...;
};
// no need for deduction guide anymore

Overload p{[](int){
        std::cout &lt;&lt; "called with int";
    }, [](char){
        std::cout &lt;&lt; "called with char";
    }
};     // Overload&lt;lambda(int), lambda(char)&gt;
p(1);   // called with int
p('c'); // called with char
</code></pre>

<p>A non-trailing element that is a pack expansion corresponds to no elements:</p>

<pre><code class="language-cpp">template&lt;typename T, typename...Ts&gt;
struct Pack : Ts... {
    T x;
};

// can deduce only the first element
Pack p1{1};         // Pack&lt;int&gt;
Pack p2{[]{}};      // Pack&lt;lambda()&gt;
Pack p3{1, []{}};   // error
</code></pre>

<p>Number of elements in the pack is deduced only once but types should match 
exactly if repeated:</p>

<pre><code class="language-cpp">struct A{};
struct B{};
struct C{};
struct D{
    operator C(){return C{};}
};

template&lt;typename...Ts&gt;
struct P : std::tuple&lt;Ts...&gt;, Ts...{
};

P{std::tuple&lt;A, B, C&gt;{}, A{}, B{}, C{}}; // P&lt;A, B, C&gt;

// equivalent to the above, since pack elements were deduced for
// std::tuple&lt;A, B, C&gt; there's no need to repeat their types
P{std::tuple&lt;A, B, C&gt;{}, {}, {}, {}}; // P&lt;A, B, C&gt;

// since we know the whole P&lt;A, B, C&gt; type after std::tuple initializer, we can
// omit trailing initializers, elements will be value-initialized as usual
P{std::tuple&lt;A, B, C&gt;{}, {}, {}}; // P&lt;A, B, C&gt;

// error, pack deduced from first initializer is &lt;A, B, C&gt; but got &lt;A, B, D&gt; for
// the trailing pack, implicit conversions are not considered
P{std::tuple&lt;A, B, C&gt;{}, {}, {}, D{}};
</code></pre>

<hr />

<h3 id="aggr-paren-init">Parenthesized initialization of aggregates</h3>

<p>Parenthesized initialization of aggregates now works in the same way as braced 
initialization except that narrowing conversions are permitted, designated 
initializers are not allowed, the lifetime of temporaries is not extended and
there is no brace elision. Elements without an initializer take their default
member initializer or are value-initialized. This allows
seamless usage of factory functions like <code>std::make_unique&lt;&gt;()/emplace()</code> with
aggregates.</p>

<pre><code class="language-cpp">struct S{
    int a;
    int b = 2;
    struct S2{
        int d;
    } c;
};

struct Ref{
    const int&amp; r;
};

int GetInt(){
    return 21;
}

S{0.1}; // error, narrowing
S(0.1); // OK

S{.a=1}; // OK
S(.a=1); // error, no designated initializers

Ref r1{GetInt()}; // OK, lifetime is extended
Ref r2(GetInt()); // dangling, lifetime is not extended

S{1, 2, 3}; // OK, brace elision, same as S{1,2,{3}}
S(1, 2, 3); // error, no brace elision

// members without initializers take their default member initializer
// or are value-initialized(T{})
S{1}; // {1, 2, 0}
S(1); // {1, 2, 0}

// make_unique works now
auto ps = std::make_unique&lt;S&gt;(1, 2, S::S2{3});

// arrays are also supported
int arr1[](1, 2, 3);
int arr2[2](1); // {1, 0}
</code></pre>
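
<p>The <code>emplace()</code> case mentioned above can be sketched in the same way. This is
only a minimal illustration: <code>Point</code> and <code>sum_coordinates()</code> are made-up names,
and it assumes a compiler and standard library that already implement
parenthesized initialization of aggregates.</p>

<pre><code class="language-cpp">#include &lt;vector&gt;

// hypothetical aggregate, no user-declared constructors
struct Point{
    int x;
    int y;
};

int sum_coordinates(){
    std::vector&lt;Point&gt; points;
    points.emplace_back(1, 2);  // constructs Point(1, 2) in place
    points.emplace_back(3);     // Point(3), `y` is value-initialized to 0

    int sum = 0;
    for(const Point&amp; p : points){
        sum += p.x + p.y;
    }
    return sum;     // 1 + 2 + 3 + 0
}
</code></pre>

<p>Before C++20 both <code>emplace_back()</code> calls would fail to compile because
<code>Point</code> has no matching constructor.</p>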

<hr />

<h2 id="nttp">Non-type template parameters</h2>

<h3 id="class-types-nttp">Class types in non-type template parameters</h3>

<p>Non-type template parameters now can be of literal class types(
types that can be used as a <code>constexpr</code> variable)
with all bases and non-static members being <code>public</code> and non-<code>mutable</code>(literally, 
there should be no <code>mutable</code> specifier). Instances of such classes are stored as
<code>const</code> objects and you can even call their member functions. There’s a new
kind of non-type template parameter: <em>placeholder for a deduced
class type</em>. In the example below, <code>fixed_string</code> is a template name, not a type 
name, but we can use it to declare template parameter <code>template&lt;fixed_string S&gt;</code>.
In such a case, the compiler will deduce template arguments for <code>fixed_string</code> 
before instantiating <code>f&lt;&gt;()</code> using an invented declaration in the form of 
<code>T x = template-argument;</code>. Here’s how it can be used to create a simple
compile-time string class:</p>

<pre><code class="language-cpp">template&lt;std::size_t N&gt;
struct fixed_string{
    constexpr fixed_string(const char (&amp;s)[N+1]) {
        std::copy_n(s, N + 1, str);
    }
    constexpr const char* data() const {
        return str;
    }
    constexpr std::size_t size() const {
        return N;
    }

    char str[N+1];
};

template&lt;std::size_t N&gt;
fixed_string(const char (&amp;)[N])-&gt;fixed_string&lt;N-1&gt;;

// user-defined literals are also supported
template&lt;fixed_string S&gt;
constexpr auto operator""_cts(){
    return S;
}

// N for `S` will be deduced
template&lt;fixed_string S&gt;
void f(){
    std::cout &lt;&lt; S.data() &lt;&lt; ", " &lt;&lt; S.size() &lt;&lt; '\n';
}

f&lt;"abc"&gt;(); // abc, 3
constexpr auto s = "def"_cts;
f&lt;s&gt;();     // def, 3
</code></pre>

<hr />

<h3 id="nttp-gen">Generalized non-type template parameters</h3>

<p>Non-type template parameters are generalized to so-called <em>structural</em> types.
Structural type is one of:</p>
<ul>
  <li>scalar type(arithmetic, pointer, pointer-to-member, enumeration, <code>std::nullptr_t</code>)</li>
  <li>lvalue reference</li>
  <li>literal class type with the following properties: all base classes and
non-static data members are public and non-<code>mutable</code>, and their types are
structural or array types.</li>
</ul>

<p>This allows usage of floating-point and class types as template parameters:</p>

<pre><code class="language-cpp">template&lt;auto T&gt;    // placeholder for any non-type template parameter
struct X{};

template&lt;typename T, std::size_t N&gt;
struct Arr{
    T data[N];
};

X&lt;5&gt; x1;
X&lt;'c'&gt; x2;
X&lt;1.2&gt; x3;
// with the help of CTAD for aggregates
X&lt;Arr{{1,2,3}}&gt; x4; // X&lt;Arr&lt;int, 3&gt;&gt;
X&lt;Arr{"hi"}&gt; x5;    // X&lt;Arr&lt;char, 3&gt;&gt;
</code></pre>
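
<p>As a rough sketch of what such parameters can be used for, here is a
floating-point value and a small class used as template arguments. The names
<code>scale</code>, <code>Range</code> and <code>contains</code> are invented for this example, and compiler
support for floating-point template parameters is assumed(it arrived later than
support for class types in some compilers).</p>

<pre><code class="language-cpp">// floating-point value as a template argument
template&lt;double Factor&gt;
constexpr double scale(double v){
    return v * Factor;
}

// hypothetical structural class type: public, non-mutable members only
struct Range{
    int lo;
    int hi;
};

template&lt;Range R&gt;
constexpr bool contains(int v){
    return R.lo &lt;= v &amp;&amp; v &lt;= R.hi;
}

static_assert(scale&lt;2.0&gt;(1.5) == 3.0);
static_assert(contains&lt;Range{1, 10}&gt;(5));
static_assert(!contains&lt;Range{1, 10}&gt;(11));
</code></pre>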

<p>An interesting point here is that non-type template arguments are compared not 
with their <code>operator==()</code> but in a bitwise-<em>like</em> manner(the exact
rules are <a href="https://en.cppreference.com/w/cpp/language/template_parameters#Template_argument_equivalence">here</a>). That is, roughly speaking, their
bit representation is used for comparison. <code>union</code>s are an exception because 
the compiler can track their active members. Two unions are equal if they both
have no active member or have the same active member with equal value.</p>

<pre><code class="language-cpp">template&lt;auto T&gt;
struct S{};

union U{
    int a;
    int b;
};

enum class E{
    A = 0,
    B = 0
};

struct C{
    int x;
    bool operator==(const C&amp;) const{    // never equal
        return false;
    }
};

constexpr C c1{1};
constexpr C c2{1};
assert(c1 != c2);                           // not equal using operator==()
assert(memcmp(&amp;c1, &amp;c2, sizeof(C)) == 0);   // but equal bitwise
// thus, equal at compile-time, operator==() is not used
static_assert(std::is_same_v&lt;S&lt;c1&gt;, S&lt;c2&gt;&gt;);

constexpr E e1{E::A};
constexpr E e2{E::B};
// equal bitwise, enum's identity isn't taken into account
assert(memcmp(&amp;e1, &amp;e2, sizeof(E)) == 0);
static_assert(std::is_same_v&lt;S&lt;e1&gt;, S&lt;e2&gt;&gt;); // thus, equal at compile-time

constexpr U u1{.a=1};
constexpr U u2{.b=1};
// equal bitwise but have different active members(a vs. b)
assert(memcmp(&amp;u1, &amp;u2, sizeof(U)) == 0);
// thus, not equal at compile-time
static_assert(!std::is_same_v&lt;S&lt;u1&gt;, S&lt;u2&gt;&gt;);
</code></pre>

<hr />

<h2 id="struct-bindings">Structured bindings</h2>

<h3 id="structbind-specs">Lambda capture and storage class specifiers for structured bindings</h3>

<p>Structured bindings are allowed to have the <code>[[maybe_unused]]</code> attribute and
the <code>static</code> and <code>thread_local</code> specifiers. Also, it’s now possible to capture
them by-value or by-reference in lambdas. Note that bound bit-fields can be
captured only by-value.</p>

<pre><code class="language-cpp">struct S{
    int a: 1;
    int b: 1;
    int c;
};

static auto [A,B,C] = S{};

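void f(){
    // static/thread_local bindings are not captured, lambdas use them directly
    [[maybe_unused]] thread_local auto [ta,tb,tc] = S{};

    // bindings with automatic storage duration have to be captured
    auto [a,b,c] = S{};
    auto l = [=](){
        return a + b + c;
    };

    auto m = [&amp;](){
        // error, can't capture bit-fields 'a' and 'b' by-reference
        // return a + b + c;
        return c;
    };
}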
</code></pre>

<hr />

<h3 id="fix-structbind-cp">Relaxing the structured bindings customization point finding rules</h3>

<p>One of the ways for a type to be decomposed for structured bindings is through
a tuple-like API. It consists of three “functions”: <code>std::tuple_element</code>, 
<code>std::tuple_size</code> and two options for <code>get</code>: <code>e.get&lt;I&gt;()</code> or <code>get&lt;I&gt;(e)</code> where the 
first has priority over the second. That is, the member <code>get()</code> is preferred over
the non-member one. Imagine a type that has <code>get()</code> but it’s not for a tuple-like
API, for example <code>std::shared_ptr::get()</code>. Such a type couldn’t be decomposed
because the compiler would try to use the member <code>get()</code> and fail. Now this rule
has been fixed in a way that the member version is preferred only if it’s
a template and its first template parameter is a non-type template parameter.</p>

<pre><code class="language-cpp">struct X : private std::shared_ptr&lt;int&gt;{
    std::string payload;
};

// due to new rules, this function is used instead of std::shared_ptr&lt;int&gt;::get
template&lt;int N&gt;
std::string&amp; get(X&amp; x) {
    if constexpr(N==0) return x.payload;
}

namespace std {
    template&lt;&gt;
    class tuple_size&lt;X&gt; 
        : public std::integral_constant&lt;int, 1&gt;
    {};
    
    template&lt;&gt;
    class tuple_element&lt;0, X&gt; {
    public:
        using type = std::string;
    };
}

void f(){
    X x;
    auto&amp; [payload] = x;
}
</code></pre>

<hr />

<h3 id="fix-structbind-access">Allow structured bindings to accessible members</h3>

<p>This fix allows structured bindings not only to <code>public</code> members but to
<em>accessible</em> members in the context of structured binding declaration.</p>

<pre><code class="language-cpp">struct A {
    friend void foo();
private:
    int i;
};

void foo() {
    A a;
    auto x = a.i;   // OK
    auto [y] = a;   // Ill-formed until C++20, now OK
}
</code></pre>

<hr />

<h2 id="range-based-for">Range-based for-loop</h2>

<h3 id="init-range-for">init-statements for range-based for-loop</h3>

<p>Similar to if-statement, range-based for-loop now can have init-statement. It
can be used to avoid dangling references:</p>

<pre><code class="language-cpp">struct Obj{
    std::vector&lt;int&gt;&amp; GetCollection();
};

Obj GetObj();

// dangling reference, lifetime of Obj returned by GetObj() is not extended
for(auto x : GetObj().GetCollection()){
    // ...
}

// OK
for(auto obj = GetObj(); auto item : obj.GetCollection()){
    // ...
}

// also can be used to maintain index
for(std::size_t i = 0; auto&amp; v : collection){
    // use v...
    i++;
}
</code></pre>

<hr />

<h3 id="fix-range-for-cp">Relaxing the range-based for-loop customization point finding rules</h3>

<p>This one is similar to <a href="#fix-structbind-cp">structured bindings customization point fix</a>.
To iterate over a range, range-based for-loop needs either free or member
<code>begin</code>/<code>end</code> functions.
Old rules worked in a way that if <em>any</em> member(function or variable) named
<code>begin</code>/<code>end</code> was found then the compiler would try to use member functions.
This creates a problem for types that have a member <code>begin</code> but no <code>end</code> or vice
versa. Now member functions are used only if both names exist, otherwise free
functions are used.</p>

<pre><code class="language-cpp">struct X : std::stringstream {
  // ...
};

std::istream_iterator&lt;char&gt; begin(X&amp; x){
    return std::istream_iterator&lt;char&gt;(x);
}

std::istream_iterator&lt;char&gt; end(X&amp; x){
    return std::istream_iterator&lt;char&gt;();
}

void f(){
    X x;
    // X has member with name `end` inherited from std::stringstream
    // but due to new rules free begin()/end() are used
    for (auto&amp;&amp; i : x) {
        // ...
    }
}
</code></pre>

<hr />

<h2 id="attributes">Attributes</h2>

<h3 id="attr-likely"><code>[[likely]]</code> and <code>[[unlikely]]</code></h3>

<p><code>[[likely]]</code> and <code>[[unlikely]]</code> attributes give the compiler a hint
about how likely an execution path is so it can better optimize the code.
They can be applied to statements(e.g. <code>if/else</code>-statements, loops) or
labels(<code>case/default</code>).</p>

<pre><code class="language-cpp">int f(bool b){
    if(b) [[likely]] {
        return 12;
    }
    else{
        return 10;
    }
}
</code></pre>
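
<p>The label form can be sketched like this; <code>handle()</code> and its error codes are
hypothetical and only illustrate where the attributes go:</p>

<pre><code class="language-cpp">// hypothetical error-code handler: success is the expected path
int handle(int code){
    switch(code){
    [[likely]] case 0:
        return 1;   // hot path
    [[unlikely]] case -1:
        return -1;  // rare failure
    default:
        return 0;
    }
}
</code></pre>

<p>The attributes are only hints, they don’t change the observable behavior of the
function.</p>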

<hr />

<h3 id="attr-no-uniq-addr"><code>[[no_unique_address]]</code></h3>

<p><code>[[no_unique_address]]</code> can be applied to a non-static non-bitfield data member to 
indicate that it doesn’t need a unique address. In practice, it’s applied to
a potentially empty data member and the compiler can optimize it to occupy no
space(like empty base optimization for members). Such a member can share the
address of another member or base class.</p>

<pre><code class="language-cpp">struct Empty{};

template&lt;typename T&gt;
struct Cpp17Widget{
    int i;
    T t;
};

template&lt;typename T&gt;
struct Cpp20Widget{
    int i;
    [[no_unique_address]] T t;
};

static_assert(sizeof(Cpp17Widget&lt;Empty&gt;) &gt; sizeof(int));
static_assert(sizeof(Cpp20Widget&lt;Empty&gt;) == sizeof(int));
</code></pre>

<hr />

<h3 id="discard-msg"><code>[[nodiscard]]</code> with message</h3>

<p>Like <code>[[deprecated("reason")]]</code>, <code>nodiscard</code> now can have a reason too.</p>

<pre><code class="language-cpp">// test whether it's supported
static_assert(__has_cpp_attribute(nodiscard) == 201907L);

[[nodiscard("Don't leave me alone")]]
int get();

void f(){
    get(); // warning: ignoring return value of function declared with 
           // 'nodiscard' attribute: Don't leave me alone
}
</code></pre>

<hr />

<h3 id="fix-nodiscard-ctor"><code>[[nodiscard]]</code> for constructors</h3>

<p>This fix explicitly allows applying <code>[[nodiscard]]</code> to constructors(compilers
were not required to support it prior to C++20).</p>

<pre><code class="language-cpp">struct resource{
    // empty resource, no harm if discarded
    resource() = default;
    
    [[nodiscard("don't discard non-empty resource")]]
    resource(int fd);
};

void f(){
    resource{};     // OK
    resource{1};    // warning
}
</code></pre>

<hr />

<h2 id="encoding">Character encoding</h2>

<h3 id="char8t"><code>char8_t</code></h3>

<p>The <code>u8</code> prefix for UTF-8 string literals(C++11) and character literals(C++17)
produced types based on plain <code>char</code>. The inability to distinguish an encoding by
type resulted in code that had to use various tricks to handle different
encodings. A new <code>char8_t</code> type was introduced to represent UTF-8 characters. It
has the same size, signedness, alignment, etc., as <code>unsigned char</code> but it’s
a distinct type, not an alias.</p>

<pre><code class="language-cpp">void HandleString(const char*){}
// distinct function name is required to handle UTF-8 in C++17
void HandleStringUTF8(const char*){}
// now it can be done using convenient overload
void HandleString(const char8_t*){}

void Cpp17(){
    HandleString("abc");        // char[4]
    HandleStringUTF8(u8"abc");  // C++17: char[4] but UTF-8, 
                                // C++20: error, type is char8_t[4]
}

void Cpp20(){
    HandleString("abc");    // char
    HandleString(u8"abc");  // char8_t
}
</code></pre>
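
<p>The properties listed above can be checked at compile-time. The following is
a small sketch; <code>utf8_length()</code> is a made-up helper that only shows that
<code>std::u8string</code> counts UTF-8 code units:</p>

<pre><code class="language-cpp">#include &lt;cstddef&gt;
#include &lt;string&gt;
#include &lt;type_traits&gt;

// distinct type with the properties of `unsigned char`
static_assert(!std::is_same_v&lt;char8_t, unsigned char&gt;);
static_assert(sizeof(char8_t) == sizeof(unsigned char));
static_assert(std::is_unsigned_v&lt;char8_t&gt;);

// both kinds of u8 literals are based on char8_t now
static_assert(std::is_same_v&lt;decltype(u8'a'), char8_t&gt;);
static_assert(std::is_same_v&lt;decltype(u8"abc"), const char8_t(&amp;)[4]&gt;);

// std::u8string is std::basic_string&lt;char8_t&gt;
std::size_t utf8_length(){
    const std::u8string pi = u8"\u03C0";    // GREEK SMALL LETTER PI
    return pi.size();   // number of UTF-8 code units, not characters
}
</code></pre>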

<hr />

<h3 id="stronger-unicode">Stronger Unicode requirements</h3>

<p>Types <code>char16_t</code> and <code>char32_t</code> are now explicitly required to represent UTF-16
and UTF-32 string literals respectively. Universal character names(<code>\Unnnnnnnn</code> and 
<code>\uNNNN</code>) must correspond to ISO/IEC 10646 code points (0x0 - 0x10FFFF inclusive) 
and not to surrogate code points (0xD800 - 0xDFFF inclusive), otherwise the 
program is ill-formed.</p>

<pre><code class="language-cpp">char32_t c{'\U00110000'};   // error: invalid universal character
</code></pre>

<hr />

<h2 id="sugar">Sugar</h2>

<h3 id="designated-init">Designated initializers</h3>

<p>Now it’s possible to initialize specific(designated) aggregate members and skip 
others. Unlike C, initialization order must be the same as in aggregate
declaration:</p>

<pre><code class="language-cpp">struct S{
    int x;
    int y{2};
    std::string s;
};
S s1{.y = 3};   // {0, 3, {}}
S s2 = {.x = 1, .s = "abc"};    // {1, 2, {"abc"}}
S s3{.y = 1, .x = 2};   // Error, x should be initialized before y
</code></pre>

<hr />

<h3 id="bitfield-def-init">Default member initializers for bit-fields</h3>

<p>Until C++20, to provide default value for a bit-field one had to create a default
constructor, now that can be achieved using convenient default member
initialization syntax:</p>

<pre><code class="language-cpp">// until C++20:
struct S{
    int a : 1;
    int b : 1;
    S() : a{0}, b{1}{}
};

// since C++20:
struct S{
    int a : 1 {0};
    int b : 1 = 1;
};
</code></pre>

<hr />

<h3 id="less-typename">More optional <code>typename</code></h3>

<p><code>typename</code> can be omitted in contexts where nothing but a type name can 
appear(type in casts, return type, type aliases, member type, argument type of a member function, etc.):</p>

<pre><code class="language-cpp">template &lt;class T&gt;
T::R f();  // OK, return type of a function declaration at global scope

template &lt;class T&gt;
void f(T::R);   // Ill-formed (no diagnostic required), attempt to declare a
                // void variable template

template&lt;typename T&gt;
struct PtrTraits{
    using Ptr = void*;
};

template &lt;class T&gt;
struct S {
  using Ptr = PtrTraits&lt;T&gt;::Ptr;  // OK, in a defining-type-id
  T::R f(T::P p) {                // OK, class scope
    return static_cast&lt;T::R&gt;(p);  // OK, type-id of a static_cast
  }
  auto g() -&gt; S&lt;T*&gt;::Ptr; // OK, trailing-return-type

  T::SubType t;
};

template &lt;typename T&gt;
void f() {
  void (*pf)(T::X); // Variable pf of type void* initialized with T::X
  void g(T::X);     // Error: T::X at block scope does not denote a type
                    // (attempt to declare a void variable)
}
</code></pre>

<hr />

<h3 id="nested-inline-ns">Nested <code>inline</code> namespaces</h3>

<p><code>inline</code> keyword is allowed to appear in nested namespace definitions:</p>

<pre><code class="language-cpp">// C++20
namespace A::B::inline C{
    void f(){}
}
// C++17
namespace A::B{
    inline namespace C{
        void f(){}
    }
}
</code></pre>

<hr />

<h3 id="using-enum"><code>using enum</code></h3>

<p>Scoped enumerations are great, the only problem with them is their verbose usage
(e.g. <code>my_enum::enum_value</code>). For example, in a switch-statement that checks
every possible enum value, <code>my_enum::</code> part should be repeated for each case-label.
<em>Using enum declaration</em> introduces all enumeration’s names into the
current scope so they are visible as unqualified names and <code>my_enum::</code> part can
be omitted. It can be applied to unscoped enumerations and even to a single
enumerator.</p>

<pre><code class="language-cpp">namespace my_lib {
enum class color { red, green, blue };
enum COLOR {RED, GREEN, BLUE};
enum class side {left, right};
}

void f(my_lib::color c1, my_lib::COLOR c2){
    using enum my_lib::color;   // introduce scoped enum
    using enum my_lib::COLOR;   // introduce unscoped enum
    using my_lib::side::left;   // introduce single enumerator id

    // C++17
    if(c1 == my_lib::color::red){/*...*/}
    
    // C++20
    if(c1 == green){/*...*/}
    if(c2 == BLUE){/*...*/}

    auto r = my_lib::side::right;   // qualified id is required for `right`
    auto l = left;                  // but not for `left`
}
</code></pre>
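
<p>The switch-statement case mentioned above could look roughly like this. It’s
a self-contained sketch with its own <code>color</code> enumeration and a hypothetical
<code>to_string()</code> helper:</p>

<pre><code class="language-cpp">#include &lt;string_view&gt;

enum class color { red, green, blue };

constexpr std::string_view to_string(color c){
    switch(c){
        using enum color;   // enumerators are visible in the switch body
        case red:   return "red";
        case green: return "green";
        case blue:  return "blue";
    }
    return "unknown";
}

static_assert(to_string(color::green) == "green");
</code></pre>

<p>Placing the declaration inside the switch keeps the unqualified names from
leaking into the rest of the function.</p>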

<hr />

<h3 id="fix-arr-size">Array size deduction in new-expressions</h3>

<p>This fix allows the compiler to deduce array size in new-expressions just like it
does for local variables.</p>

<pre><code class="language-cpp">// before C++20
int p0[]{1, 2, 3};
int* p1 = new int[3]{1, 2, 3};  // explicit size is required

// since C++20
int* p2 = new int[]{1, 2, 3};
int* p3 = new int[]{};  // empty
char* p4 = new char[]{"hi"};
// works with parenthesized initialization of aggregates
int p5[](1, 2, 3);
int* p6 = new int[](1, 2, 3);
</code></pre>

<hr />

<h3 id="ctad-alias">Class template argument deduction for alias templates</h3>

<p>CTAD works with alias templates now:</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
using IntPair = std::pair&lt;int, T&gt;;

double d{};
IntPair&lt;double&gt; p0{1, d};   // C++17
IntPair p1{1, d};   // std::pair&lt;int, double&gt;
IntPair p2{1, p1};  // std::pair&lt;int, std::pair&lt;int, double&gt;&gt;
</code></pre>

<hr />

<h2 id="constinit"><code>constinit</code></h2>

<p>C++ has the infamous “static initialization order fiasco”: the order of 
initialization of static storage variables from different translation units is 
not specified. Variables with zero/constant initialization avoid this problem because 
they are initialized at compile-time. <code>constinit</code> enforces that a variable is
initialized at compile-time and, unlike <code>constexpr</code>, it doesn’t make the variable
<code>const</code> and allows non-trivial destructors. The second use-case for <code>constinit</code>
is with non-initializing <code>thread_local</code> declarations. In such a case, it tells
the compiler that the variable is already initialized, otherwise the compiler
usually adds code to check and initialize it if required on each usage.</p>

<pre><code class="language-cpp">struct S {
    constexpr S(int) {}
    ~S(){}; // non-trivial
};

constinit S s1{42};  // OK
constexpr S s2{42};  // error because destructor is not trivial

// tls_definitions.cpp
thread_local constinit int tls1{1};
thread_local int tls2{2};

// main.cpp
extern thread_local constinit int tls1;
extern thread_local int tls2;

int get_tls1() {
    return tls1;  // pure TLS access
}

int get_tls2() {
    return tls2;  // has implicit TLS initialization code
}
</code></pre>
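
<p>One more difference from <code>constexpr</code> is worth a tiny sketch: a <code>constinit</code>
variable is not <code>const</code>, so it can be modified at run-time. <code>square()</code>,
<code>counter</code> and <code>bump()</code> are names invented for this illustration:</p>

<pre><code class="language-cpp">constexpr int square(int x){
    return x * x;
}

// constant-initialized at compile-time but, unlike constexpr, not const
constinit int counter = square(4);

int bump(){
    return ++counter;
}

// int runtime_value();
// error, initializer is not a constant expression:
// constinit int bad = runtime_value();
</code></pre>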

<hr />

<h2 id="int-twos-compl">Signed integers are two’s complement</h2>

<p>That is, signed integers are now guaranteed to be <a href="https://en.wikipedia.org/wiki/Two%27s_complement">two’s complement</a>.
This removes some undefined and implementation-defined behavior because
the binary representation is fixed. Overflow for signed integers is still UB but 
these are well-defined now:</p>

<pre><code class="language-cpp">int i1 = -1;
// left-shift for signed negative integers(previously undefined behavior)
i1 &lt;&lt;= 1;    // -2

int i2 = INT_MAX;
// "unrepresentable" left-shift for signed integers(previously undefined behavior)
i2 &lt;&lt;= 1;   // -2

int i3 = -1;
// right shift for signed negative integers, performs sign-extension(previously 
// implementation-defined)
i3 &gt;&gt;= 1;   // -1
int i4 = 1;
i4 &gt;&gt;= 1;   // 0

// "unrepresentable" conversions to signed integers(previously implementation-defined)
int i5 = UINT_MAX;  // -1
</code></pre>

<hr />

<h2 id="va-opt"><code>__VA_OPT__</code> for variadic macros</h2>

<p>Allows simpler handling of variadic macros. Expands to nothing if 
<code>__VA_ARGS__</code> is empty and to its content otherwise. It’s especially useful
when a macro calls a function with some predefined argument(s) followed by optional
<code>__VA_ARGS__</code>. In such a case, <code>__VA_OPT__</code> allows omitting the trailing comma when
<code>__VA_ARGS__</code> is empty(thanks to Jérôme Marsaguet for bringing up this issue).</p>

<pre><code class="language-cpp">#define LOG1(...)                   \
    __VA_OPT__(std::printf(__VA_ARGS__);) \
    std::printf("\n");

LOG1();                      // std::printf("\n");
LOG1("number is %d", 12);    // std::printf("number is %d", 12); std::printf("\n");

#define LOG2(msg, ...) \
    std::printf("[" __FILE__ ":%d] " msg, __LINE__, __VA_ARGS__)
#define LOG3(msg, ...) \
    std::printf("[" __FILE__ ":%d] " msg, __LINE__ __VA_OPT__(,) __VA_ARGS__)

// OK, std::printf("[" "file.cpp" ":%d] " "%d errors\n", 14, 0);
LOG2("%d errors\n", 0);

// Error, std::printf("[" "file.cpp" ":%d] " "No errors\n", 17, );
LOG2("No errors\n");

// OK, std::printf("[" "file.cpp" ":%d] " "No errors\n", 20);
LOG3("No errors\n");
</code></pre>

<hr />

<h2 id="diff-except-spec">Explicitly defaulted functions with different exception specifications</h2>

<p>This fix allows the exception specification of an explicitly defaulted function
to differ from the one the implicitly declared function would have. Until C++20 such 
declarations made the program ill-formed. Now it’s allowed and, of course,
the provided exception specification is the actual one. This is useful when you
want to enforce <code>noexcept</code>-ness of some operations. For example, due to the
strong exception guarantee, <code>std::vector</code> <em>moves</em> its elements into a new storage
only if their move constructors are <code>noexcept</code>, otherwise elements are <em>copied</em>.
Sometimes it’s desirable to allow this faster implementation even if elements
can actually throw during move. As usual, when a function marked <code>noexcept</code> throws,
<code>std::terminate()</code> is called.</p>

<pre><code class="language-cpp">struct S1{
    // ill-formed until C++20 because implicit constructor is noexcept(true)
    S1(S1&amp;&amp;)noexcept(false) = default; // can throw
};

struct S2{
    S2(S2&amp;&amp;) noexcept = default;
    // implicitly generated move constructor would be `noexcept(false)`
    // because of `s1`, now it's enforced to be `noexcept(true)`
    S1 s1;
};

static_assert(std::is_nothrow_move_constructible_v&lt;S1&gt; == false);
static_assert(std::is_nothrow_move_constructible_v&lt;S2&gt; == true);

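struct X1{
    X1() = default;
    X1(X1&amp;&amp;) noexcept = default;
    std::map&lt;int, int&gt; m;   // `std::map(std::map&amp;&amp;)` can throw
};

struct X2{
    X2() = default;
    // a user-declared move constructor deletes the implicit copy constructor,
    // bring it back so that `std::vector` can fall back to copying
    X2(const X2&amp;) = default;
    // same as implicitly generated, it's `noexcept(false)` because of `std::map`
    X2(X2&amp;&amp;) = default;
    std::map&lt;int, int&gt; m;   // `std::map(std::map&amp;&amp;)` can throw
};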

std::vector&lt;X1&gt; v1;
std::vector&lt;X2&gt; v2;
// ... at some point, `push_back()` needs to reallocate storage

// efficiently uses `X1(X1&amp;&amp;)` to move the elements to a new storage,
// calls `std::terminate()` if it throws
v1.push_back(X1{});

// uses `X2(const X2&amp;)`, thus, copies, not moves elements to a new storage
v2.push_back(X2{});
</code></pre>

<hr />

<h2 id="destr-delete">Destroying <code>operator delete</code></h2>

<p>C++20 introduces a class-specific <code>operator delete()</code> that takes a special
<code>std::destroying_delete_t</code> tag. In such a case, the compiler will not call the 
object’s
destructor before calling <code>operator delete()</code>, it should be called manually. This 
might be useful if object members should be used to extract information needed
to free memory it occupies, for example to extract its valid size and call sized
version of <code>delete</code>.</p>

<pre><code class="language-cpp">struct TrickyObject{
    void operator delete(TrickyObject *ptr, std::destroying_delete_t){
        // without destroying_delete_t object would have been destroyed here
        const std::size_t realSize = ptr-&gt;GetRealSizeSomehow();
        // now we need to call the destructor by-hand
        ptr-&gt;~TrickyObject();
        // and free storage it occupies
        ::operator delete(ptr, realSize);
    }
    // ...
};
</code></pre>
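
<p>A more complete sketch of the same idea is shown below. <code>Blob</code> is
a hypothetical variable-size object that stores the size of its trailing storage
in a member, so the destroying <code>operator delete()</code> has to read that member
before the destructor runs. The <code>destroyed</code> counter exists only to make the
behavior observable:</p>

<pre><code class="language-cpp">#include &lt;cstddef&gt;
#include &lt;new&gt;

// hypothetical variable-size object: a header followed by `extra` bytes
// of payload in the same allocation
struct Blob{
    std::size_t extra;

    static inline int destroyed = 0;    // counts destructor calls

    static Blob* create(std::size_t extra){
        void* mem = ::operator new(sizeof(Blob) + extra);
        return ::new(mem) Blob{extra};
    }

    ~Blob(){ ++destroyed; }

    void operator delete(Blob* ptr, std::destroying_delete_t){
        // the object is still alive here, so its members can be read
        const std::size_t total = sizeof(Blob) + ptr-&gt;extra;
        ptr-&gt;~Blob();
        ::operator delete(ptr, total);  // sized deallocation
    }
};

int destroy_one(){
    Blob* b = Blob::create(16);
    delete b;   // calls the destroying operator delete
    return Blob::destroyed;
}
</code></pre>

<p>The destructor runs exactly once, from inside <code>operator delete()</code>, because the
delete-expression itself no longer calls it.</p>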

<hr />

<h2 id="explicit-conditional">Conditionally <code>explicit</code> constructors</h2>

<p>Just like <code>noexcept(bool)</code> we now have <code>explicit(bool)</code> to make
constructor/conversion conditionally <code>explicit</code>.</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
struct S{
    explicit(!std::is_convertible_v&lt;T, int&gt;) S(T){}
};

void f(){
    S&lt;char&gt; sc = 'x';           // OK
    S&lt;std::string&gt; ss1 = "x";   // Error, constructor is explicit
    S&lt;std::string&gt; ss2{"x"};    // OK
}
</code></pre>
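
<p>The same specifier works for conversion functions. Here is a sketch with
a made-up <code>Meters</code> type that converts to <code>double</code> implicitly and to any other
type only explicitly:</p>

<pre><code class="language-cpp">#include &lt;type_traits&gt;

// hypothetical strong type: converts to double implicitly,
// to any other type only explicitly
struct Meters{
    double value;

    template&lt;typename U&gt;
    explicit(!std::is_same_v&lt;U, double&gt;) operator U() const {
        return static_cast&lt;U&gt;(value);
    }
};

static_assert(std::is_convertible_v&lt;Meters, double&gt;);   // implicit is OK
static_assert(!std::is_convertible_v&lt;Meters, int&gt;);     // explicit only
static_assert(std::is_constructible_v&lt;int, Meters&gt;);    // e.g. static_cast

int whole_meters(Meters m){
    return static_cast&lt;int&gt;(m);
}
</code></pre>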

<hr />

<h2 id="feature-test-macros">Feature-test macros</h2>

<p>C++20 defines a set of preprocessor macros for testing various language and library
features, the full list is <a href="https://en.cppreference.com/w/cpp/feature_test">here</a>.</p>

<pre><code class="language-cpp">#ifdef __has_cpp_attribute  // check __has_cpp_attribute itself before using it
#   if __has_cpp_attribute(no_unique_address) &gt;= 201803L
#       define CXX20_NO_UNIQUE_ADDR [[no_unique_address]]
#   endif
#endif

#ifndef CXX20_NO_UNIQUE_ADDR
#   define CXX20_NO_UNIQUE_ADDR
#endif

template&lt;typename T&gt;
class Widget{
    int x;
    CXX20_NO_UNIQUE_ADDR T obj;
};
</code></pre>
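
<p>Library features can be tested in a similar way. The sketch below uses the
<code>__cpp_lib_interpolate</code> macro to pick <code>std::midpoint()</code> when it’s available;
<code>middle()</code> is a hypothetical helper and the fallback branch is a simplified
replacement, not an equivalent of the library function:</p>

<pre><code class="language-cpp">#include &lt;numeric&gt;
#if __has_include(&lt;version&gt;)
#   include &lt;version&gt;   // C++20 header with library feature-test macros
#endif

// uses std::midpoint() when the library provides it,
// falls back to a hand-written version otherwise
constexpr int middle(int a, int b){
#if defined(__cpp_lib_interpolate) &amp;&amp; __cpp_lib_interpolate &gt;= 201902L
    return std::midpoint(a, b);
#else
    return a + (b - a) / 2; // assumes `b - a` doesn't overflow
#endif
}

static_assert(middle(2, 6) == 4);
</code></pre>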

<hr />

<h2 id="array-conv">Known-to-unknown bound array conversions</h2>

<p>Allows conversion from array of known bound to the reference to array of unknown
bound. Overload resolution rules have also been updated so that overload with
matching size is better than overload with unknown or non-matching size.</p>

<pre><code class="language-cpp">void f(int (&amp;&amp;)[]){};
void f(int (&amp;)[1]){};

void g() {
  int arr[1];

  f(arr);       // calls `f(int (&amp;)[1])`
  f({1, 2});    // calls `f(int (&amp;&amp;)[])`
  int(&amp;r)[] = arr;
}
</code></pre>

<hr />

<h2 id="more-impl-moves">Implicit move for more local objects and rvalue references</h2>

<p>In certain cases the compiler is allowed to replace a copy with a move. But it turned 
out that the rules were too restrictive. C++17 didn’t allow moving from rvalue references 
in <code>return</code> statements or from function parameters in <code>throw</code> expressions, and various 
forms of conversions unreasonably prevented moving. C++20 fixed these issues but some problems
remain, see
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2266r0.html">P2266R0 Simpler implicit move</a>.</p>

<pre><code class="language-cpp">std::unique_ptr&lt;T&gt; f0(std::unique_ptr&lt;T&gt; &amp;&amp; ptr) {
    return ptr; // copied in C++17(thus, error), moved in C++20, OK
}

std::string f1(std::string &amp;&amp; x) {
    return x;   // copied in C++17, moved in C++20
}

struct Widget{};

void f2(Widget w){
    throw w;    // copied in C++17, moved in C++20
}

struct From {
    From(Widget const &amp;);
    From(Widget&amp;&amp;);
};

struct To {
    operator Widget() const &amp;;
    operator Widget() &amp;&amp;;
};

From f3() {
    Widget w;
    return w;  // moved (no NRVO because of different types)
}

Widget f4() {
    To t;
    return t;// copied in C++17(conversions were not considered), moved in C++20
}

struct A{
    A(const Widget&amp;);
    A(Widget&amp;&amp;);
};

struct B{
    B(Widget);
};

A f5() {
    Widget w;
    return w;  // moved
}

B f6() {
    Widget w;
    return w; // copied in C++17(because there's no B(Widget&amp;&amp;)), moved in C++20
}

struct Derived : Widget{};

std::shared_ptr&lt;Widget&gt; f7() {
    std::shared_ptr&lt;Derived&gt; result;
    return result;  // moved
}

Widget f8() {
    Derived result;
    // copied in C++17(because there's no Widget(Derived&amp;&amp;)), moved in C++20
    return result;
}
</code></pre>

<hr />

<h2 id="narrowing-ptr-bool-conv">Conversion from <code>T*</code> to <code>bool</code> is narrowing</h2>

<p>Conversions from pointer or pointer-to-member types to <code>bool</code> are narrowing now 
and can’t be used in places where such conversions are not allowed. <code>nullptr</code> is
OK when used with direct initialization.</p>

<pre><code class="language-cpp">struct S{
    int i;
    bool b;
};

void f(){
    void* p;
    S s{1, p};          // error
    bool b1{p};         // error
    bool b2 = p;        // OK
    bool b3{nullptr};   // OK
    bool b4 = nullptr;  // error
    bool b5 = {nullptr};// error
    if(p){/*...*/}      // OK
}
</code></pre>

<hr />

<h2 id="depr-volatile">Deprecate some uses of <code>volatile</code></h2>

<p>Deprecates <code>volatile</code> in various contexts:</p>
<ul>
  <li>built-in prefix/postfix increment/decrement operators on volatile-qualified variables</li>
  <li>usage of the result of an assignment to volatile-qualified object</li>
  <li>built-in compound assignments in form of <code>E1 op= E2</code>(e.g. <code>a += b</code>) when E1 is volatile-qualified</li>
  <li>volatile-qualified return/parameter type</li>
  <li>volatile-qualified structured binding declarations</li>
</ul>

<p>Note that <code>volatile-qualified</code> means top-level qualification, not just any
<code>volatile</code> in a type. Something like <code>volatile int* px</code> is actually
pointer-to-volatile-int, thus, not volatile-qualified.</p>

<pre><code class="language-cpp">volatile int x{};
x++;            // deprecated
int y = x = 1;  // deprecated
x = 1;          // OK
y = x;          // OK
x += 2;         // deprecated

volatile int            //deprecated
    f(volatile int);    //deprecated
</code></pre>
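
<p>Most of the deprecated forms can be rewritten as separate, explicit reads and
writes. A minimal sketch of that idea (the names <code>reg</code>, <code>bump</code> and <code>load</code>
are made up for illustration); it also shows that a pointer-to-volatile parameter
is not affected, because the pointer itself is not volatile-qualified:</p>

<pre><code class="language-cpp">volatile int reg{};

// instead of `reg++` or `reg += 1`: one explicit read, one explicit write
int bump() {
    const int old = reg;    // single read
    reg = old + 1;          // single write, its result is not used
    return old;
}

// OK, not deprecated: the parameter is a pointer to volatile int,
// the pointer itself is not volatile-qualified
int load(volatile int* p) {
    return *p;
}
</code></pre>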

<hr />

<h2 id="depr-comma-subs">Deprecate comma operator in subscripts</h2>

<p>The comma operator inside subscripts is deprecated to allow a multidimensional
(variadic) subscript operator <a href="http://wg21.link/P2128R3">in the future</a>. The current
approach for this is to have
a custom <code>path_type</code> with overloaded <code>path_type::operator,()</code> and <code>operator[](path_type)</code>.
A variadic <code>operator[]</code> will eliminate the need for such dirty tricks.</p>

<pre><code class="language-cpp">// current approach
struct SPath{
    SPath(int);
    SPath operator,(const SPath&amp;);  // store path somehow
};

struct S1{
    int operator[](SPath); // use path
};

S1 s1;
auto x1 = s1[1,2,3];    // deprecated
auto x2 = s1[(1,2,3)];  // OK

// future approach
struct S2{
    int operator[](int, int, int);
    // or, as a variadic template
    template&lt;typename... IndexType&gt;
    int operator[](IndexType...);
};

S2 s2;
auto x3 = s2[1,2,3];
</code></pre>

<hr />

<h2 id="fixes">Fixes</h2>

<p>Here I put minor fixes. Some of them have been implemented by compilers for a
while but were not reflected in the Standard. Perhaps, you won’t notice any
major changes in practice.</p>

<h3 id="fix-init-list-ctad">Initializer list constructors in class template argument deduction</h3>

<pre><code class="language-cpp">// C++17
std::tuple t{std::tuple{1, 2}};     // std::tuple&lt;int, int&gt;
std::vector v{std::vector{1,2,3}};  // std::vector&lt;std::vector&lt;int&gt;&gt;
</code></pre>

<p>In this example, two syntactically similar initializations result
in surprisingly different CTAD-deduced types. That’s because <code>std::vector</code> has
a <code>std::initializer_list</code> constructor and prefers it, while <code>std::tuple</code> doesn’t have one,
so it prefers the copy constructor.<br />
With this fix, the copy constructor is preferred to the list constructor when
initializing from a single element whose type is a specialization of the class
template under construction, or a type derived from such a specialization.</p>

<pre><code class="language-cpp">// C++20
std::tuple t{std::tuple{1, 2}};     // std::tuple&lt;int, int&gt;
std::vector v{std::vector{1,2,3}};  // std::vector&lt;int&gt;

// this example is from "C++17" book by N. Josuttis, section 9.1.1
// now it has consistent behavior across compilers
template&lt;typename... Args&gt;
auto make_vector(const Args&amp;... elems)
{
    return std::vector{elems...};
}

auto v2 = make_vector(std::vector{1,2,3});  // std::vector&lt;int&gt;
</code></pre>
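
<p>The deduced types can also be checked at compile time. A small sketch,
assuming a compiler and standard library that already implement this fix:</p>

<pre><code class="language-cpp">#include &lt;type_traits&gt;
#include &lt;vector&gt;

// a single element that is itself a std::vector: the copy is preferred
static_assert(std::is_same_v&lt;
    decltype(std::vector{std::vector{1, 2, 3}}),
    std::vector&lt;int&gt;&gt;);

// several elements: the initializer-list constructor is still used
static_assert(std::is_same_v&lt;
    decltype(std::vector{std::vector{1}, std::vector{2}}),
    std::vector&lt;std::vector&lt;int&gt;&gt;&gt;);
</code></pre>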

<hr />

<h3 id="fix-const-qual"><code>const&amp;</code>-qualified pointers to members</h3>

<p>Previously, <code>.*</code> could not be applied to an rvalue when the pointer to member
function was <code>&amp;</code>-qualified, even if it was <code>const&amp;</code>-qualified. Now the <code>const&amp;</code> case is allowed.</p>

<pre><code class="language-cpp">struct S {
    void f() const&amp; {}
};

S{}.f();        // OK
(S{}.*&amp;S::f)(); // could be an error on some old compilers
</code></pre>

<hr />

<h3 id="fix-impl-capture">Simplifying implicit lambda capture</h3>

<p>This simplifies the wording for lambda capture. Lambdas within default member
initializers can now officially have a capture list; their enclosing scope is the class scope:</p>

<pre><code class="language-cpp">struct S{
    int x{1};
    int y{[&amp;]{ return x + 1; }()};  // OK, captures 'this'
};
</code></pre>

<p>Entities are implicitly captured even within discarded statements and <code>typeid</code>:</p>

<pre><code class="language-cpp">template&lt;bool B&gt;
void f1() {
    std::unique_ptr&lt;int&gt; p;
    [=]() {
        if constexpr (B) {
            (void)p;        // always captures p
        }
    }();
}
f1&lt;false&gt;();    // error, can't capture unique_ptr by-value

void f2() {
    std::unique_ptr&lt;int&gt; p;
    [=]() {
        typeid(p);  // error, can't capture unique_ptr by-value
    }();
}

void f3() {
    std::unique_ptr&lt;int&gt; p;
    [=]() {
        sizeof(p);  // OK, unevaluated operand
    }();
}
</code></pre>

<hr />

<h3 id="fix-const-mismatch"><code>const</code> mismatch with defaulted copy constructor</h3>

<p>This fix allows a type to have a defaulted copy constructor that takes
its argument by <code>const</code> reference even if some of its members or base classes have
a copy constructor that takes its argument by non-<code>const</code> reference. An error is
reported only when that constructor is actually needed:</p>

<pre><code class="language-cpp">struct NonConstCopyable{
    NonConstCopyable() = default;
    NonConstCopyable(NonConstCopyable&amp;){}   // takes by non-const reference
    NonConstCopyable(NonConstCopyable&amp;&amp;){}
};

// std::tuple(const std::tuple&amp; other) = default;   // takes by const reference

void f(){
    std::tuple&lt;NonConstCopyable&gt; t; // error in C++17, OK in C++20
    auto t2 = t;                    // always an error
    auto t3 = std::move(t);         // OK, move-ctor is used
}
</code></pre>

<hr />

<h3 id="fix-spec-access-check">Access checking on specializations</h3>

<p>Allows <code>protected</code>/<code>private</code> types to be used as template arguments for
partial specialization, explicit specialization and explicit instantiation.</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
void f(){}

template&lt;typename T&gt;
struct Trait{};

class C{
    class Impl; // private
};

template&lt;&gt;
struct Trait&lt;C::Impl&gt;{};    // OK

template struct Trait&lt;C::Impl&gt;; // OK

class C2{
    template&lt;typename T&gt;
    struct Impl;    // private
};

template&lt;typename T&gt;
struct Trait&lt;C2::Impl&lt;T&gt;&gt;;   // OK
</code></pre>

<hr />

<h3 id="fix-adl">ADL and function templates that are not visible</h3>

<p>Unqualified-id that is followed by a <code>&lt;</code> and for
which name lookup finds nothing or finds a function is treated as a
template-name in order to potentially cause argument dependent lookup to be 
performed.</p>

<pre><code class="language-cpp">int h;
void g();

namespace N {
    struct A {};
    template&lt;class T&gt; int f(T);
    template&lt;class T&gt; int g(T);
    template&lt;class T&gt; int h(T);
}

// OK: lookup of `f` finds nothing, `f` treated as a template name
auto a = f&lt;N::A&gt;(N::A{});
// OK: lookup of `g` finds a function, `g` treated as a template name
auto b = g&lt;N::A&gt;(N::A{});
// error: `h` is a variable, not a function template
auto c = h&lt;N::A&gt;(N::A{});
// OK, `N::h` is qualified-id
auto d = N::h&lt;N::A&gt;(N::A{});
</code></pre>

<p>In rare cases, this can break existing code if there’s <code>operator&lt;()</code> for
functions but it was considered as a pathological case by committee:</p>

<pre><code class="language-cpp">struct A {};
bool operator &lt;(void (*fp)(), A);
void f(){}
int main() {
    A a;
    f &lt; a;      // OK until C++20, now error
    (f) &lt; a;    // OK
}
</code></pre>

<hr />

<h3 id="fix-constexpr-inst">Specify when <code>constexpr</code> function definitions are needed for constant evaluation</h3>

<p>This fix specifies when <code>constexpr</code> functions are instantiated.
These rules are
pretty tricky but most of the time everything works as expected. Instead of 
copy-pasting them here I will only show a couple of examples to demonstrate the 
problem.</p>

<pre><code class="language-cpp">struct duration {
    constexpr duration() {}
    constexpr operator int() const { return 0; }
};

// duration d = duration(); // #1
int n = sizeof(short{duration(duration())});    // always OK since C++20
</code></pre>

<p>Remember that special member functions are defined only when they are <em>used</em>.
In C++17 terms, the move constructor is not used and not
defined here, so the program should be ill-formed. But if line <code>#1</code> were uncommented,
the move constructor would become <em>used</em> and defined, so the program would be OK. This makes no
sense, and the rules have been changed to fix it.</p>

<p>Another example:</p>

<pre><code class="language-cpp">template&lt;typename T&gt; constexpr int f() { return T::value; }

template&lt;bool B, typename T&gt; void g(decltype(B ? f&lt;T&gt;() : 0));
template&lt;bool B, typename T&gt; void g(...);

template&lt;bool B, typename T&gt; void h(decltype(int{B ? f&lt;T&gt;() : 0}));
template&lt;bool B, typename T&gt; void h(...);

void x() {
    g&lt;false, int&gt;(0); // OK
    h&lt;false, int&gt;(0); // error
}
</code></pre>

<p>Here we have <code>constexpr</code> template function that will potentially be instantiated
with type <code>int</code> and should lead to an error because <code>int::value</code> is wrong. Then
there are two functions that use <code>B ? f&lt;int&gt;() : 0</code> where <code>B</code> is always <code>false</code>
so <code>f&lt;int&gt;()</code> is never needed.
The question is: should <code>f&lt;int&gt;</code> be instantiated here?<br />
The new rules clarify
what’s <em>needed for constant evaluation</em>: templated variables or functions in such
expressions are always instantiated,
even if they are not required to evaluate the expression. One such case is a braced
initializer list; thus, in the expression <code>int{B ? f&lt;T&gt;() : 0}</code>, <code>f&lt;T&gt;</code> is always
instantiated, which leads to an error.</p>

<hr />

<h3 id="fix-impl-creation">Implicit creation of objects for low-level object manipulation</h3>

<p>In C++17 an object can be created by a definition, by a new-expression or by changing
the active member of a <code>union</code>. Now, consider this example:</p>

<pre><code class="language-cpp">struct X { int a, b; };
X *make_x() {
    X* p = (X*)malloc(sizeof(struct X));
    p-&gt;a = 1;   // UB in C++17, OK in C++20
    return p;
}
</code></pre>

<p>Although it looks natural, in C++17 this code has undefined behavior because an <code>X</code>
object is not created according
to the language rules, and writing to a member of a nonexistent object is UB.
Rules for such cases have been clarified by specifying which types can be created
implicitly and which operations can create such objects implicitly.
Types that can be created implicitly (implicit-lifetime types):</p>
<ul>
  <li>scalar types</li>
  <li>aggregate types</li>
  <li>class types with any eligible trivial constructor and trivial destructor</li>
</ul>

<p>Operations that can create implicit-lifetime objects implicitly:</p>
<ul>
  <li>operations that begin the lifetime of an array of <code>char</code>, <code>unsigned char</code>, 
<code>std::byte</code></li>
  <li><code>operator new</code> and <code>operator new[]</code></li>
  <li><code>std::allocator&lt;T&gt;::allocate(std::size_t n)</code></li>
  <li>C library allocation functions: <code>aligned_alloc</code>, <code>calloc</code>, <code>malloc</code>, and 
<code>realloc</code></li>
  <li><code>memcpy</code> and <code>memmove</code></li>
  <li><code>std::bit_cast</code></li>
</ul>
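
<p>As an illustration of the first list combined with the C allocation functions,
here is a slightly safer variant of <code>make_x()</code>. It’s only a sketch:
<code>FreeDeleter</code> is a hypothetical helper, not something from the proposal.</p>

<pre><code class="language-cpp">#include &lt;cstdlib&gt;
#include &lt;memory&gt;

struct X { int a, b; };     // aggregate, hence an implicit-lifetime type

// hypothetical deleter, so that malloc'ed storage is released with free()
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

std::unique_ptr&lt;X, FreeDeleter&gt; make_x(int a, int b) {
    // in C++20 malloc implicitly creates an X in the returned storage
    auto* p = static_cast&lt;X*&gt;(std::malloc(sizeof(X)));
    if (p == nullptr) {
        return nullptr;
    }
    p-&gt;a = a;
    p-&gt;b = b;
    return std::unique_ptr&lt;X, FreeDeleter&gt;{p};
}
</code></pre>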

<p>Also, the rule for pseudo-destructors (destructor calls for built-in types) has been
changed. Until C++20 such a call had no effect; now it ends the object’s lifetime:</p>

<pre><code class="language-cpp">int f(){
    using T = int;
    T n{1};
    n.~T();     // no effect in C++17, ends n's lifetime in C++20
    return n;   // OK in C++17, UB in C++20, n is dead now
}
</code></pre>

<p>You can find more detailed explanation in this post: <a href="https://blog.panicsoftware.com/objects-their-lifetimes-and-pointers/">Objects, their lifetimes and pointers</a>
by Dawid Pilarski.</p>

<hr />

<h2 id="references">References</h2>

<p><a href="https://en.cppreference.com/w/cpp/20">C++20 feature list</a><br />
<a href="https://gcc.gnu.org/projects/cxx-status.html#cxx20">Complete and grouped list of all papers for each feature</a><br />
<a href="https://www.youtube.com/channel/UCxHAlbZQNFU2LgEtiqd2Maw">C++ Weekly</a><br />
<a href="https://www.youtube.com/watch?v=8jNXy3K2Wpk&amp;t">CppCon 2019: Jonathan Müller “Using C++20’s Three-way Comparison ＜=＞”</a><br />
<a href="https://www.youtube.com/watch?v=Xb6u8BrfHjw">CppCon 2019: Timur Doumler “C++20: The small things”</a><br />
<a href="http://eel.is/c++draft/">C++ standard draft</a></p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[Introduction]]></summary></entry><entry><title type="html">Allow only pure data structs with clang-tidy</title><link href="/2020/11/04/pure-data-structs-clang-tidy.html" rel="alternate" type="text/html" title="Allow only pure data structs with clang-tidy" /><published>2020-11-04T16:33:00+00:00</published><updated>2020-11-04T16:33:00+00:00</updated><id>/2020/11/04/pure-data-structs-clang-tidy</id><content type="html" xml:base="/2020/11/04/pure-data-structs-clang-tidy.html"><![CDATA[<h2 id="introduction">Introduction</h2>

<p><code>struct</code> is for <em>data</em>, <code>class</code> is for <em>invariant</em>. This is what the guides tell us (<a href="https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#S-class">Core guidelines</a>, 
<a href="https://google.github.io/styleguide/cppguide.html#Structs_vs._Classes">Google C++ style guide</a>, 
<a href="https://www.fluentcpp.com/2017/06/13/the-real-difference-between-struct-class/">Fluent C++ blog</a>).
Sounds like a good candidate for another simple <code>clang-tidy</code> check. While
the implementation of such a check is more or less simple, defining the actual
constraints
for it is not. In this article I’ll discuss what a <em>pure data</em> <code>struct</code> could mean
and what flavors it could have. Then I’ll briefly show my implementation of a
corresponding <code>clang-tidy</code> check.</p>

<h2 id="pure-data-struct">Pure data struct</h2>

<p>Naively, <code>pure data struct</code> means that there are only data members and nothing
more, just like C <code>struct</code>:</p>

<pre><code class="language-cpp">struct Point{
    int x;
    int y;
};
</code></pre>

<p>But in the C++ world we can have much more in it and still consider it <em>data</em>.
It turns out that there can be several levels of strictness, that’s why I decided
to make this tool configurable instead of hardcoding a single <em>one true</em> way.</p>

<h4 id="static-members">Static members</h4>

<p>Because <code>static</code> members are not part of an object’s state, I decided to completely
ignore them. Also, I don’t care whether they are <code>public</code> or not.</p>

<h4 id="stateless-structs">Stateless structs</h4>

<p><code>struct</code>s are often used for various TMP tricks and for simple callables:</p>

<pre><code class="language-cpp">template&lt;&gt;
struct less&lt;Widget&gt;{
    bool operator()(const Widget&amp;, const Widget&amp;) const{
        //...
    }
};
</code></pre>

<p>In such contexts they are simply a shorthand for a stateless <code>class</code> with all
members being
<code>public</code>. However, in some codebases, especially old ones, you can find this
trick (which I also consider a non-data <code>struct</code>):</p>

<pre><code class="language-cpp">struct WidgetList : std::vector&lt;Widget&gt;{};
</code></pre>

<p>Thus, instead of always skipping stateless structs, I added an option 
<code>SkipStateless</code> to control it. By default it’s <code>true</code>.</p>

<h4 id="data-members">Data members</h4>

<p>Obviously, only <code>public</code> data members are allowed. What about default member
initialization, do you think it violates <em>pure data</em> notion? Consider this simple 
example:</p>

<pre><code class="language-cpp">struct Point{
    int x{};
    int y{};
};
</code></pre>

<p>It doesn’t have invariants but it has a kind of a contract, specifically 
postcondition:</p>

<pre><code class="language-cpp">Point p;
assert((p.x == 0) &amp;&amp; (p.y == 0));
</code></pre>

<p>It binds magic values (<code>0</code>s in this particular case) to its members. As opposed to
a <code>class</code>, a <code>struct</code> should model a <em>collection</em> of values without any control or
logic over them. Not everyone will share this view, so I added the
<code>AllowDefaultMemberInit</code> option for it; it’s <code>false</code> by default.</p>

<h4 id="member-functions">Member functions</h4>

<p>At first sight no member functions should be allowed. Since all data is <code>public</code>
there’s no need for them. How about special members? Google guide says:</p>
<blockquote>
  <p>Constructors, destructors, and helper methods may be present; however, these 
methods must not require or enforce any invariants.</p>
</blockquote>

<p>I feel that a <code>struct</code> should not add <em>behavior</em>, only collect related data. Thus,
this tool doesn’t allow destructors and <em>helper methods</em> (whatever that means).
Hand-written copy/move constructors are also not allowed, but you can still
<code>=default</code> or <code>=delete</code> them. What’s left I call <em>primary</em> constructors, that
is, non-copy non-move constructors.  <br />
A default constructor can be used instead of
default member initialization (before C++20, default values for bit-fields could
be set only in constructors), and it can also have a body. Having a body means
that the constructor does something beyond trivial initialization and should be
considered suspicious. Nevertheless, it’s not
uncommon, hence another option: <code>AllowNonEmptyCtorBody</code>, which is <code>false</code> by
default.  <br />
The first kind of non-default constructor is what I call a <em>memberwise</em> constructor:
it enforces the initialization of all members at once and looks like a good
practice:</p>

<pre><code class="language-cpp">struct Point{
    int x;
    int y;
    Point(int x, int y) : x{x}, y{y} {}
};
Point p1;       // error
Point p2{1};    // error
Point p3{1, 2}; // OK
</code></pre>

<p>Although there’s no invariant nor contract, I consider this a poor interface.
It guarantees that initial <code>Point</code> contains valid and related coordinate values
but there’s no guarantee that this relation will always be respected. The better
way is to have two separate types, one for coordinate values and 
another one to manipulate them:</p>

<pre><code class="language-cpp">// pure data
struct Coordinate{
    int x;
    int y;
};
// manipulation interface
class Point{
public:
    Point(Coordinate c) : c{c}{}    // or Point(int x, int y);
    void Update(Coordinate c) { this-&gt;c = c; }
    //...
private:
    Coordinate c;
};
</code></pre>

<p>Second kind of a non-default constructor is purely custom one. It can have any
kind of parameters, related or not to <code>struct</code>’s data members. I
consider this a bad design because it either implies invariant or works as 
a conversion from another set of values, which should be implemented as a free 
function.  <br />
With that in mind, I added another option: <code>AllowedCtors</code> which can have three
values: <code>none</code> - no constructors are allowed, <code>default</code> - only default constructors
are allowed, <code>primary</code> - all non-copy non-move constructors are allowed.</p>
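
<p>To illustrate the “conversion as a free function” idea mentioned above, here
is a minimal sketch. <code>PolarPoint</code> and <code>ToPoint</code> are hypothetical names that
are not part of the check, they only show how a conversion can live outside
of a pure data <code>struct</code>:</p>

<pre><code class="language-cpp">#include &lt;cmath&gt;

// pure data
struct Point {
    int x;
    int y;
};

// another set of values (hypothetical type, for illustration only)
struct PolarPoint {
    double radius;
    double angle;   // radians
};

// the conversion lives outside of Point, so Point needs no custom constructor
Point ToPoint(const PolarPoint&amp; p) {
    return Point{
        static_cast&lt;int&gt;(std::lround(p.radius * std::cos(p.angle))),
        static_cast&lt;int&gt;(std::lround(p.radius * std::sin(p.angle)))};
}
</code></pre>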

<h4 id="inheritance">Inheritance</h4>

<p>Things are simple here: only <code>public</code> inheritance is allowed and only another 
<code>struct</code> could be used as a base.</p>

<h4 id="conclusion">Conclusion</h4>

<p>So what’s a <em>pure data</em> <code>struct</code>? It’s a <code>struct</code> that implies no invariant or
contract on its members beyond their own, and doesn’t add any kind of
behavior to them. It should be used just to pack a set of logically related values.
It should <em>not</em> be used to model a new type. In practice
it means good old C-<code>struct</code>s with rare static members and inheritance.</p>

<h2 id="implementation">Implementation</h2>

<p>Let’s summarize options introduced above:</p>
<ul>
  <li><code>SkipStateless</code> - skip(<code>1</code>) or don’t skip(<code>0</code>) checking of <code>struct</code>s without
direct data members</li>
  <li><code>AllowDefaultMemberInit</code> - allows(<code>1</code>) or not(<code>0</code>) default member initialization</li>
  <li><code>AllowNonEmptyCtorBody</code> - allows(<code>1</code>) or not(<code>0</code>) non-empty body of allowed 
constructors</li>
  <li><code>AllowedCtors</code> - specifies what kind of constructors are allowed. Possible values: 
<code>none</code> - no constructors are allowed, <code>default</code> - only default constructors
are allowed, <code>primary</code> - all non-copy non-move constructors are allowed</li>
</ul>

<p>Our matcher should detect bad <code>struct</code>s by finding bad parts in them. The
top-level structure of the matcher is:</p>

<pre><code class="language-cpp">cxxRecordDecl(
    isStruct(),
    // proceed only if SkipStateless == false or has any data member
    anyOf(boolean(!SkipStateless), has(fieldDecl())),
    anyOf(
        // bad member method
        // OR bad data member
        // OR bad base specifier
    ))
</code></pre>

<p>We want to check only user-provided methods(hand-written, non-defaulted,
non-deleted), and it’s easier to specify what’s allowed(static methods and
several kinds of constructors):</p>

<pre><code class="language-cpp">// helper to check whether given constructor matches AllowedCtors requirements
AST_MATCHER_P(CXXConstructorDecl, shouldAllowCtor,
              NonDataStructsCheck::AllowedCtorKind, AllowedCtors) {
  if (Node.isCopyOrMoveConstructor() ||
      (AllowedCtors == NonDataStructsCheck::AllowedCtorKind::None)) {
    return false;
  } else if (AllowedCtors == NonDataStructsCheck::AllowedCtorKind::Primary) {
    return true;
  } else {
    return Node.isDefaultConstructor();
  }
}

// it should have trivial(empty) body or be explicitly allowed through config
const auto ShouldAllowNonEmptyCtorBody =
    anyOf(boolean(AllowNonEmptyCtorBody), hasTrivialBody());

cxxMethodDecl(isUserProvided(),
    unless(anyOf(isStaticStorageClass(),
                    cxxConstructorDecl(
                        shouldAllowCtor(AllowedCtors),
                        ShouldAllowNonEmptyCtorBody))))
</code></pre>

<p>Bad data members are either non-public or the ones with default member 
initializers(in case they are not allowed):</p>

<pre><code class="language-cpp">// returns true when either AllowDefaultMemberInit is set or there's no default
// member initializer
const auto ShouldAllowDefaultMemberInit =
    anyOf(boolean(AllowDefaultMemberInit), unless(has(initListExpr())));

fieldDecl(
    unless(allOf(isPublic(), ShouldAllowDefaultMemberInit)))
</code></pre>

<p>And finally, bad base specifier is the one that’s non-public or non-<code>struct</code>:</p>

<pre><code class="language-cpp">anyOf(hasNonPublicBase(cxxRecordDecl()),
    hasDirectBase(
        cxxRecordDecl(unless(isStruct()))))
</code></pre>

<p>The full matcher:</p>

<pre><code class="language-cpp">cxxRecordDecl(
    isStruct(), anyOf(boolean(!SkipStateless), has(fieldDecl())),
    anyOf(
        has(cxxMethodDecl(isUserProvided(),
                        unless(anyOf(isStaticStorageClass(),
                                        cxxConstructorDecl(
                                            shouldAllowCtor(AllowedCtors),
                                            ShouldAllowNonEmptyCtorBody))))
                .bind("method")),
        has(fieldDecl(
                unless(allOf(isPublic(), ShouldAllowDefaultMemberInit)))
                .bind("field")),
        anyOf(hasNonPublicBase(cxxRecordDecl().bind("np_base")),
            hasDirectBase(
                cxxRecordDecl(unless(isStruct())).bind("ns_base")))))
    .bind("record")
</code></pre>

<p>That’s it, you can find the full source code <a href="https://github.com/OleksandrKvl/clang-tidy-playground/blob/master/misc/NonDataStructsCheck.cpp">here</a>. Maybe it doesn’t
cover all possible cases but upgrading it to meet specific guide requirements
should not be hard.</p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[Introduction]]></summary></entry><entry><title type="html">Enforce explicit/implicit ‘this’ with custom clang-tidy check</title><link href="/2020/10/16/enforce-this-style-check.html" rel="alternate" type="text/html" title="Enforce explicit/implicit ‘this’ with custom clang-tidy check" /><published>2020-10-16T18:34:00+00:00</published><updated>2020-10-16T18:34:00+00:00</updated><id>/2020/10/16/enforce-this-style-check</id><content type="html" xml:base="/2020/10/16/enforce-this-style-check.html"><![CDATA[<h3 id="introduction">Introduction</h3>

<p>Recently, I’ve discovered an interesting topic: <code>clang-tidy</code>-based tools. The idea
is that you get an AST representing all the details of your C++ code, and what you
can do with it is limited mostly by your imagination: detect bugs, calculate
some code metrics, refactor, etc. You can take your old legacy codebase and convert
it into a modern one. This idea of making changes at scale really fascinates me.
To learn something you have to use it in the real world. As I’ve recently made
several contributions to the <code>CMake</code> codebase, an idea for a refactoring tool
quickly popped up in my mind. <code>CMake</code> has one strange part in its coding conventions which I’ve never seen in any other C++ codebase: explicit <code>this</code> usage. That is,
like in JS or Python:</p>

<pre><code class="language-cpp">class Widget{
    void Increment(){
        this-&gt;x++;
    }
    int x;
};
</code></pre>

<p>Because there’s no way to check it automatically, there are places where it’s not
respected. I don’t know how this
style was adopted, maybe it’s just some old artifact that is hard to get rid of
by hand. So, I decided to write a tool that can do both:</p>
<ul>
  <li>add explicit <code>this</code> if it’s missing</li>
  <li>remove explicit <code>this</code> wherever possible</li>
</ul>

<h3 id="workflow-outline">Workflow outline</h3>

<p>Writing this <code>clang-tidy</code> tool involves several steps:</p>
<ul>
  <li>understand what C++ code you want to fix</li>
  <li>find out how to detect it using <code>Clang AST API</code></li>
  <li>fix it</li>
</ul>

<p>I will describe the explicit/implicit parts separately after describing common things.
I used <code>clang-tidy-standalone</code> as a base for this tool; you can build it without
building LLVM itself, more information in my <a href="/2020/10/06/building-clang-tidy-without-llvm.html">previous article</a>.  <br />
Note that this is not a complete <code>Clang AST</code> tutorial; you can find more
information in the <a href="https://clang.llvm.org/extra/clang-tidy/Contributing.html">official documentation</a>,
various YouTube talks, and <code>clang-tidy</code> <a href="https://github.com/llvm/llvm-project/tree/master/clang-tools-extra/clang-tidy">sources</a>.</p>

<h4 id="templates-handling">Templates handling</h4>

<p>Since a template is only an outline for generated code, templates are represented
differently from non-templated code. In non-templated code all the types,
variables and functions are known and checked. In a template that’s not always possible
before the actual instantiation. As a result, they are represented with different
AST nodes. You have a choice: deal with the template definition, which contains unknown
or type-dependent entities, or deal with instantiations where everything is known.
You should know whether your check can produce different results for different 
template instantiations. Thankfully, that’s not the case for this tool, so I will
deal only with instantiations. This in turn requires that all of your templates
are used somewhere in a project or in a test suite so they are actually instantiated.</p>
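
<p>If some template isn’t used anywhere yet, one way to make its members visible
to an instantiation-based check is an explicit instantiation, e.g. in a test
file. A minimal sketch (<code>Counter</code> is a hypothetical class):</p>

<pre><code class="language-cpp">template&lt;typename T&gt;
class Counter {
public:
    void Increment() { value++; }   // member access the check should see
    T Get() const { return value; }
private:
    T value{};
};

// explicit instantiation definition: instantiates all members of
// Counter&lt;int&gt;, even those that are never called anywhere in the project
template class Counter&lt;int&gt;;
</code></pre>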

<h4 id="macros-handling">Macros handling</h4>

<p>Since macros are just text replacements, they can have very different meaning in 
different contexts. Final AST represents code after preprocessing, so your
tool can detect things that were composed from macro expansions. In most cases
you can just skip such code, macro usage should be rare nowadays. But there’s at
least one macro which I want to handle - <code>assert()</code>. It naturally contains things
that I want to fix, for example:</p>

<pre><code class="language-cpp">void Widget::Reset(){
    assert(this-&gt;ptr);
    *ptr = 0;
}
</code></pre>

<p>For this reason, I’ve added simple regex-based macro filter:</p>

<pre><code class="language-cpp">  if (ThisLocation.isMacroID()) {
    const auto MacroName =
        Lexer::getImmediateMacroName(ThisLocation, SM, getLangOpts());
    if (!llvm::Regex(AllowedMacroRegexp).match(MacroName)) {
      return false; // skip
    }
  }
  // continue...
</code></pre>

<p>Other examples of widely used macros are various loggers; my final macro filter for the
CMake
codebase looks like this: <code>^(assert|cm.*Log|cm.*Logger)$</code>. Keep in mind that
we can only handle things that are present after preprocessing: eliminated <code>#ifdef</code>
blocks won’t be there, so run your tool on various configurations.</p>

<h3 id="enforce-explicit-this">Enforce explicit <code>this</code></h3>

<h4 id="target-c-code">Target C++ code</h4>

<p>Let’s start with the easier case, enforcing explicit <code>this</code>. Here’s our test case:</p>

<pre><code class="language-cpp">class Widget{
    void Do(){
        DoConst();      // should become this-&gt;DoConst();
        x++;            // should become this-&gt;x++;
    }
    void DoConst() const{}
    int x{};
};
</code></pre>

<p>That is, every access to a member should become explicit. Since we’re dealing with
already valid code and adding explicit <code>this</code> wouldn’t change its meaning, there’s
nothing more to consider: we just need to find such
places, check whether they have explicit <code>this</code> or not, and add it if it’s missing.</p>

<h4 id="detecting-it-with-clang-ast-api">Detecting it with Clang AST API</h4>

<p><code>clang-check</code> is a useful tool to examine the generated AST; I left only the important
parts:</p>

<pre><code>clang-check-10 --ast-dump example.cpp --

  |-CXXMethodDecl 0x1ef3a88 &lt;line:2:5, line:5:5&gt; line:2:10 Do 'void ()'
  | `-CompoundStmt 0x1ef3e18 &lt;col:14, line:5:5&gt;
  |   |-CXXMemberCallExpr 0x1ef3d88 &lt;line:3:9, col:17&gt; 'void'
  |   | `-MemberExpr 0x1ef3d58 &lt;col:9&gt; '&lt;bound member function type&gt;' -&gt;DoConst 0x1ef3ba8
  |   |   `-ImplicitCastExpr 0x1ef3da8 &lt;col:9&gt; 'const Widget *' &lt;NoOp&gt;
  |   |     `-CXXThisExpr 0x1ef3d48 &lt;col:9&gt; 'Widget *' implicit this
  |   `-UnaryOperator 0x1ef3e00 &lt;line:4:9, col:10&gt; 'int' postfix '++'
  |     `-MemberExpr 0x1ef3dd0 &lt;col:9&gt; 'int' lvalue -&gt;x 0x1ef3c60
  |       `-CXXThisExpr 0x1ef3dc0 &lt;col:9&gt; 'Widget *' implicit this
</code></pre>

<p>You can see that our target parts are represented as <code>MemberExpr</code> and <code>CXXThisExpr</code>
with optional <code>ImplicitCastExpr</code>. Cast is there because we’re calling <code>const</code>
function from <code>non-const</code> one, hence, casting <code>Widget*</code> to <code>const Widget*</code>. AST 
matcher for it is straightforward:</p>

<pre><code class="language-cpp">memberExpr(has(
    ignoringImpCasts(
        cxxThisExpr().bind("thisExpr"))))
.bind("memberExpr")
</code></pre>

<p><code>bind()</code> is needed to get access to the matched node, in our case we need
<code>MemberExpr</code> and <code>CXXThisExpr</code>, thus, we bind them to names. In the <a href="https://clang.llvm.org/doxygen/classclang_1_1CXXThisExpr.html#a933d8a76b980a003a6bcfb7bb583aa5a">CXXThisExpr documentation</a>
we can see <code>isImplicit()</code> method that does exactly what we need:</p>

<pre><code class="language-cpp">void EnforceThisStyleCheck::check(const MatchFinder::MatchResult &amp;Result) {
  const auto ThisExpr = Result.Nodes.getNodeAs&lt;CXXThisExpr&gt;("thisExpr");
  const auto MembExpr = Result.Nodes.getNodeAs&lt;MemberExpr&gt;("memberExpr");
  // ...
  if (ThisExpr-&gt;isImplicit()) {
    addExplicitThis(*MembExpr);
  }
}
</code></pre>

<h4 id="fix">Fix</h4>

<p>The fix is really simple, and <code>clang-tidy</code> has plenty of examples of it. We have to 
provide a hint, a location, and the text for our fix:</p>

<pre><code class="language-cpp">void EnforceThisStyleCheck::addExplicitThis(const MemberExpr &amp;MembExpr) {
  const auto ThisLocation = MembExpr.getBeginLoc();
  diag(ThisLocation, "insert 'this-&gt;'")
      &lt;&lt; FixItHint::CreateInsertion(ThisLocation, "this-&gt;");
}
</code></pre>

<p>We use the <code>MemberExpr</code>’s location instead of the <code>CXXThisExpr</code>’s because for
qualified names (<code>Base::Method();</code>) <code>CXXThisExpr::getBeginLoc()</code> points to the start of <code>Method</code>, not to the start of the <code>Base::</code> qualifier.</p>

<h3 id="enforce-implicit-this">Enforce implicit <code>this</code></h3>

<h4 id="target-c-code-1">Target C++ code</h4>

<p>This case is a bit harder because in some cases removing explicit <code>this</code> could 
change the meaning of code due to name lookup rules, in other cases it could 
result in a compilation error.</p>

<h4 id="special-members">Special members</h4>

<p>We can’t remove the explicit <code>this</code> from calls to special member functions such as destructors
or operators:</p>

<pre><code class="language-cpp">void Widget::Do(){
    this-&gt;~Widget();
    this-&gt;operator=(Widget{});
}
</code></pre>

<p>This can happen only when the member expression refers to a method, not to a variable.
Thus, we need to get the member declaration, check whether it’s a method, and then
check its name:</p>

<pre><code class="language-cpp">static bool isNonSpecialMember(const MemberExpr &amp;MembExpr) {
  const auto MemberDecl = MembExpr.getMemberDecl();
  assert(MemberDecl);

  const auto MethodDecl = dyn_cast&lt;CXXMethodDecl&gt;(MemberDecl);
  // CXXMethodDecl::getIdentifier() returns nullptr for special members
  return !MethodDecl || MethodDecl-&gt;getIdentifier();
}
</code></pre>

<h4 id="name-conflicts">Name conflicts</h4>

<p>Consider this case:</p>

<pre><code class="language-cpp">void Widget::Do(int x){
    this-&gt;x++;  // increment member
    x++;        // increment argument
}
</code></pre>

<p>If we remove the explicit <code>this</code> from the expression at line 2, it will increment the 
argument instead of the data member. Generally, any visible local name hides a class member with the same name during lookup. Unfortunately,
<code>Clang</code> doesn’t have an API to detect such conflicts, so I chose a less precise
but easier to implement approach (thanks to Nicolás Alvarez for this idea):</p>

<pre><code class="language-cpp">static bool hasVariableWithName(const CXXMethodDecl &amp;Function,
                                ASTContext &amp;Context, const StringRef Name) {
  const auto Matches =
      match(decl(hasDescendant(varDecl(hasName(Name)))), Function, Context);

  return !Matches.empty();
}
</code></pre>

<p>This function considers every variable (including arguments) declared anywhere in the 
function, ignoring
its visibility at the point of use. It means that code like this
will be left untouched even where removal would be safe:</p>

<pre><code class="language-cpp">void Widget::Do(){
    this-&gt;x++;  // increment member
    x++;        // still increment member but confusing
    int x;
    x++;        // increment local variable
}
</code></pre>
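
<p>To see why the conservative check is worth it, here is a small self-contained sketch. The <code>Widget</code> type and its two methods are invented for this illustration and are not part of the check. It shows that dropping <code>this-&gt;</code> next to a same-named parameter still compiles but silently changes behavior:</p>

<pre><code class="language-cpp">#include &lt;cassert&gt;

// Hypothetical class used only for this illustration.
struct Widget {
    int x = 0;

    // Explicit this: the data member is incremented.
    void WithThis(int x) {
        this-&gt;x++;
        (void)x;  // the parameter is intentionally unused
    }

    // After a naive removal of this-&gt;: the parameter hides the member,
    // so only the local copy is incremented.
    void WithoutThis(int x) {
        x++;
        (void)x;
    }
};

int main() {
    Widget w;
    w.WithThis(10);
    assert(w.x == 1);   // member changed
    w.WithoutThis(10);
    assert(w.x == 1);   // member untouched, the parameter was modified instead
}
</code></pre>

<p>Since the compiler accepts both versions, a tool that removes <code>this-&gt;</code> has to be pessimistic here.</p>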

<h4 id="dependent-names">Dependent names</h4>

<pre><code class="language-cpp">template&lt;typename Base&gt;
class Derived : public Base{
    void Do(){
        this-&gt;baseCounter++;    // baseCounter is defined somewhere in Base
    }
};
</code></pre>

<p>C++ requires names of members that come from a dependent base class to be prefixed with an explicit <code>this</code> (or otherwise qualified), thus, 
removing
it here will yield a compile-time error. In our case it means that if a name is
provided by a base class, we treat the explicit <code>this</code> as required. So, removing the explicit
<code>this</code> from a name is considered safe only when this name is a direct (non-inherited) member of
the class:</p>

<pre><code class="language-cpp">static bool hasDirectMember(const CXXRecordDecl &amp;Class, ASTContext &amp;Context,
                            const StringRef Name) {
  const auto Matches =
      match(cxxRecordDecl(has(namedDecl(hasName(Name)))), Class, Context);

  return !Matches.empty();
}
</code></pre>
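
<p>As a standalone illustration of the rule behind this check, here is a minimal compilable sketch. The names <code>CounterBase</code> and <code>ownCounter</code> are assumptions added for the example. It contrasts a member inherited from a dependent base with a direct member:</p>

<pre><code class="language-cpp">#include &lt;cassert&gt;

// Hypothetical base class, invented for this example.
struct CounterBase {
    int baseCounter = 0;
};

template &lt;typename Base&gt;
struct Derived : Base {
    int ownCounter = 0;

    void Do() {
        // baseCounter lives in a dependent base class, so unqualified
        // lookup at template definition time cannot find it. A conforming
        // compiler rejects a plain `baseCounter++` here.
        this-&gt;baseCounter++;

        // ownCounter is a direct member, so this-&gt; is optional.
        ownCounter++;
    }
};

int main() {
    Derived&lt;CounterBase&gt; d;
    d.Do();
    assert(d.baseCounter == 1);
    assert(d.ownCounter == 1);
}
</code></pre>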

<p>Now, we can create our final <code>isRedundantExplicitThis()</code> function:</p>

<pre><code class="language-cpp">static bool isRedundantExplicitThis(const MemberExpr &amp;MembExpr,
                                    const CXXMethodDecl &amp;MethodDecl,
                                    ASTContext &amp;Context) {
  return (isNonSpecialMember(MembExpr) &amp;&amp;
          !hasVariableWithName(MethodDecl, Context,
                               MembExpr.getMemberDecl()-&gt;getName()) &amp;&amp;
          !isDependentName(MethodDecl, MembExpr, Context));
}
</code></pre>

<p>And, because we need access to the corresponding <code>CXXMethodDecl</code>, our final 
matcher for both cases becomes:</p>

<pre><code class="language-cpp">cxxMethodDecl(
    isDefinition(), isUserProvided(),
    forEachDescendant(
        memberExpr(has(ignoringImpCasts(cxxThisExpr().bind("thisExpr"))))
            .bind("memberExpr")))
    .bind("methodDecl")
</code></pre>

<p><code>isUserProvided()</code> is self-explanatory: we’re interested only in user-provided
functions, not in compiler-generated ones.</p>

<h4 id="fix-1">Fix</h4>

<p>Again, fixing is mostly simple. We have to provide a hint and a range for removal.
Qualified names require special handling because we don’t want to remove 
the qualifier.</p>

<pre><code class="language-cpp">void EnforceThisStyleCheck::removeExplicitThis(const SourceManager &amp;SM,
                                               const MemberExpr &amp;MembExpr) {
  const auto ThisStart = MembExpr.getBeginLoc();
  auto ThisEnd = MembExpr.getMemberLoc();
  if (MembExpr.hasQualifier()) {
    ThisEnd = MembExpr.getQualifierLoc().getBeginLoc();
  }

  const auto ThisRange = Lexer::makeFileCharRange(
      CharSourceRange::getCharRange(ThisStart, ThisEnd), SM, getLangOpts());

  diag(ThisStart, "remove 'this-&gt;'") &lt;&lt; FixItHint::CreateRemoval(ThisRange);
}
</code></pre>

<h3 id="results">Results</h3>

<p>Results of applying the check to the CMake codebase:</p>
<ul>
  <li>enforce explicit <code>this</code>: 129 files changed, 4689 insertions</li>
  <li>enforce implicit <code>this</code>: 406 files changed, 23237 insertions</li>
</ul>

<p>Full source code is <a href="https://github.com/OleksandrKvl/clang-tidy-playground/blob/master/misc/EnforceThisStyleCheck.cpp">here</a>.  <br />
CMake branch with explicit <code>this</code> is <a href="https://gitlab.kitware.com/OleksandrKvl/cmake/-/tree/explicit-this-fix">here</a>.  <br />
CMake branch with implicit <code>this</code> is <a href="https://gitlab.kitware.com/OleksandrKvl/cmake/-/tree/implicit-this-fix">here</a>.</p>

<p>I’m pretty satisfied with the result. The whole tool takes fewer than 170 lines of code.
I hope that in the future there will be more good tutorials to make this framework 
more available to more people.</p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[Introduction]]></summary></entry><entry><title type="html">Creating your own clang-tidy checks without building LLVM</title><link href="/2020/10/06/building-clang-tidy-without-llvm.html" rel="alternate" type="text/html" title="Creating your own clang-tidy checks without building LLVM" /><published>2020-10-06T14:10:00+00:00</published><updated>2020-10-06T14:10:00+00:00</updated><id>/2020/10/06/building-clang-tidy-without-llvm</id><content type="html" xml:base="/2020/10/06/building-clang-tidy-without-llvm.html"><![CDATA[<h3 id="what-is-clang-tidy">What is clang-tidy</h3>

<p><a href="https://clang.llvm.org/extra/clang-tidy/">clang-tidy</a> is a static analysis tool
based on <code>Clang</code>’s <a href="https://clang.llvm.org/docs/LibTooling.html">LibTooling</a> 
library. It can find and sometimes fix subtle problems 
in your code or just make it look better. The <a href="https://clang.llvm.org/extra/clang-tidy/checks/list.html">list of checks</a>
is pretty extensive.</p>

<h3 id="create-your-own-tool">Create your own tool</h3>

<p>What’s more interesting is that you can create your own tool that can detect/fix
some problems with your code, enforce your custom coding style, refactor it and so 
on.</p>

<p>There are two options for that:</p>
<ul>
  <li>use <a href="https://clang.llvm.org/docs/LibTooling.html">LibTooling</a></li>
  <li><a href="https://clang.llvm.org/extra/clang-tidy/Contributing.html">create</a> custom <code>clang-tidy</code> check</li>
</ul>

<p>When I started, I chose the <code>LibTooling</code> way. But it turned out to be
pretty low-level: you need to understand a lot more things and write more 
boilerplate code to create a useful tool.
A custom <code>clang-tidy</code> check is a much better option. It’s actively supported, and you can 
use existing checks as examples for your own. It also comes with a whole 
infrastructure: 
diagnostic messages, deduplication of fix-it replacements, <code>run-clang-tidy.py</code> 
to run your check in parallel, and so on.</p>

<h3 id="little-problem">Little problem</h3>

<p>The only problem with a <code>clang-tidy</code> check is that, according to the official manual, you 
have to build it from source, which means you have to build the whole of LLVM. Why 
should someone who just wants to play with simple checks have to build LLVM? And I’m afraid
to imagine how long it would take on my 2013 2-core MBP laptop.</p>

<h3 id="solution">Solution</h3>

<p>So, I wanted to have full <code>clang-tidy</code> infrastructure without building LLVM.
Thankfully, LLVM has Debian/Ubuntu <a href="https://apt.llvm.org/">prebuilt packages</a> 
which include
all the libraries required for <code>clang-tidy</code>. The only remaining thing is to link the
<code>clang-tidy</code> sources against them. It turned out to be pretty simple. We need to edit
three <code>CMakeLists.txt</code> files, replacing the parts responsible for building libraries from 
source with parts that link against the prebuilt static libraries. 
I’ve removed all existing checks to make it as light as possible. Now you can follow the
<a href="https://clang.llvm.org/extra/clang-tidy/Contributing.html">official manual</a>, use
<code>add_new_check.py</code> to create a new check, build it with prebuilt packages, and run 
it with <code>run-clang-tidy.py</code> in parallel.</p>

<p>The resulting repository is <a href="https://github.com/OleksandrKvl/clang-tidy-standalone">here</a>.
It requires LLVM 10 packages. Porting it to newer versions should be simple but 
it’s not in my plans right now.</p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[What is clang-tidy]]></summary></entry><entry><title type="html">Reducing CMake heap usage part 2: Know your tools</title><link href="/2020/09/06/reducing-cmake-heap-usage-2.html" rel="alternate" type="text/html" title="Reducing CMake heap usage part 2: Know your tools" /><published>2020-09-06T18:14:00+00:00</published><updated>2020-09-06T18:14:00+00:00</updated><id>/2020/09/06/reducing-cmake-heap-usage-2</id><content type="html" xml:base="/2020/09/06/reducing-cmake-heap-usage-2.html"><![CDATA[<h3 id="introduction">Introduction</h3>

<p>At the end of the <a href="https://oleksandrkvl.github.io/2020/08/30/reducing-cmake-heap-usage.html">previous post</a>,
after all those optimizations I stated:</p>
<blockquote>
  <p>For more complex configurations the economy is of course lower. Partly because 
<strong>there’s an old parsing routine that allocates a lot and becomes the major 
memory consumer.</strong></p>
</blockquote>

<p>After a while, I decided to investigate why so much memory is used and found 
a surprisingly easy way to fix it.</p>

<h3 id="previous-results">Previous results</h3>

<p>Here are the overall results of the previous optimizations (total allocated bytes 
(number of allocations)):</p>
<ul>
  <li>empty project: 65 MB (394k) -&gt; 39 MB (280k)</li>
  <li>google benchmark: 233 MB (1344k) -&gt; 196 MB (1190k)</li>
  <li>heaptrack: 305 MB (1308k) -&gt; 268 MB (1148k)</li>
</ul>

<p>As you can see, not a big improvement for the last two projects. Let’s take a
look at heaptrack report of heaptrack itself:</p>

<p><img src="/assets/images/heaptrack-expand-old.jpg" alt="heaptrack-expand-old" /></p>

<p>We can see that the top consumer is <code>cmCommandArgument_yyalloc()</code>(1), it stems
from <code>cmMakefile::ExpandVariablesInStringOld()</code>(2) that in turn stems from 
<code>cmMakefile::ExpandArguments()</code>(3). Also, notice the huge difference between 
the first(1) and second(4) heap consumers: 119 MB vs 1 MB respectively.</p>

<h3 id="whats-going-on">What’s going on?</h3>

<p>Looking at that report, I had several questions:</p>
<ol>
  <li>What is <code>ExpandVariablesInStringOld()</code>?</li>
  <li>Why does it eat so much memory?</li>
  <li>Why doesn’t it exist in the report of the empty project?</li>
</ol>

<p>I’ll answer (1) and (3) first, then (2).</p>

<h4 id="argument-expansion">Argument expansion</h4>

<p>Argument expansion is the process of replacing a variable reference (<code>${var}</code>) with 
the variable’s value (<code>var_value</code>) and, for unquoted arguments, replacing 
a list (<code>a;b;c</code>) with its elements as separate arguments (<code>a</code>, <code>b</code>, <code>c</code>). CMake does 
this for every argument of every command call using <code>ExpandArguments()</code>, 
so this function is called pretty frequently.</p>

<p>Variable references can be nested and mixed with plain 
strings (<code>ab_${cd_${ef}}_$ENV{env_var}</code>), so the algorithm for their expansion is 
not trivial. CMake uses a Flex scanner and a Bison 
parser to do this, and the driver function that runs them is called 
<code>ExpandVariablesInStringOld()</code>.
Why <em>old</em>? Because that was the implementation used before CMake 3.1. Back then, the Flex/Bison 
implementation was considered slow and inefficient (which is strange
because these tools have been used for decades) and the whole thing was
replaced with a hand-crafted 
implementation (<code>cmMakefile::ExpandVariablesInStringNew()</code>).</p>
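
<p>To make the expansion rules more concrete, here is a minimal sketch of nested reference expansion and list splitting. It is not CMake’s algorithm: the names <code>Expand()</code>, <code>ExpandFrom()</code>, <code>SplitList()</code> and <code>Variables</code> are made up for this example, <code>$ENV{}</code> and escape sequences are not handled, and undefined variables and empty list elements are simply dropped.</p>

<pre><code class="language-cpp">#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;map&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

using Variables = std::map&lt;std::string, std::string&gt;;

// Expands text starting at pos until the matching '}' (when nested is
// true) or until the end of input. Undefined variables expand to nothing.
std::string ExpandFrom(const std::string&amp; in, std::size_t&amp; pos,
                       const Variables&amp; vars, bool nested) {
    std::string out;
    while (pos &lt; in.size()) {
        if (nested &amp;&amp; in[pos] == '}') {
            ++pos;  // consume the closing brace
            return out;
        }
        if (in.compare(pos, 2, "${") == 0) {
            pos += 2;
            // The reference name may itself contain references.
            const std::string name = ExpandFrom(in, pos, vars, true);
            const auto it = vars.find(name);
            if (it != vars.end()) {
                out += it-&gt;second;
            }
        } else {
            out += in[pos++];
        }
    }
    return out;
}

std::string Expand(const std::string&amp; in, const Variables&amp; vars) {
    std::size_t pos = 0;
    return ExpandFrom(in, pos, vars, false);
}

// Splits an expanded unquoted argument on ';', skipping empty elements.
std::vector&lt;std::string&gt; SplitList(const std::string&amp; value) {
    std::vector&lt;std::string&gt; items;
    std::string current;
    for (const char c : value) {
        if (c == ';') {
            if (!current.empty()) items.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (!current.empty()) items.push_back(current);
    return items;
}

int main() {
    const Variables vars{{"ef", "x"}, {"cd_x", "y"}, {"list", "a;b;c"}};
    // The inner reference is resolved first and becomes part of the outer name.
    assert(Expand("ab_${cd_${ef}}_z", vars) == "ab_y_z");
    // An unquoted list turns into separate arguments.
    assert(SplitList(Expand("${list}", vars)) ==
           (std::vector&lt;std::string&gt;{"a", "b", "c"}));
}
</code></pre>

<p>Even this toy version needs recursion to resolve inner references before outer ones, which hints at why a generated scanner and parser looked attractive for the job.</p>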

<h4 id="old-vs-new">Old vs. new</h4>

<p>If it was replaced in CMake 3.1 then why was it used in CMake 3.18 when I
configured google benchmark and heaptrack? Because of the <code>cmake_minimum_required()</code> 
command. Roughly speaking, when you set a minimum required version for your
project, CMake adjusts its behavior to that version. So, if you call it with 
anything below <code>3.1</code>, CMake will use <code>ExpandVariablesInStringOld()</code>.
Heaptrack calls <code>cmake_minimum_required(VERSION 2.8.12)</code> and google benchmark 
includes google test which calls <code>cmake_minimum_required(VERSION 2.8.8)</code>.</p>

<h3 id="the-problem">The problem</h3>

<p>Now, we can reproduce that enormous heap consumption with an empty project:</p>
<pre><code class="language-cmake">cmake_minimum_required(VERSION 2.9)
project(empty)
</code></pre>

<p>Heaptrack report is quite similar to the above one:</p>

<p><img src="/assets/images/heaptrack-empty-exp-old.jpg" alt="heaptrack-empty-exp-old" /></p>

<p>From here on I’ll omit the <code>cmCommandArgument_</code> part of function names; that’s
just a prefix to avoid name clashes.
We can see that <code>yylex()</code>(1) allocates a lot of memory using 
<code>yy_create_buffer()</code>(2). FYI, <code>yylex()</code> (the Flex part) is responsible for reading
the input and returning tokens to <code>yyparse()</code> (the Bison part).
Here’s the troublesome part of the <code>yylex()</code> code:</p>

<pre><code class="language-cpp">#define YY_BUF_SIZE 16384

int yylex()
{
    if(!init)
    {
        yy_create_buffer(YY_BUF_SIZE);
    }    
    //...
}
</code></pre>

<p>We can see that Flex allocates 16 KB of memory for some purpose. Recall that
<code>ExpandVariablesInStringOld()</code> is called for every argument, thus, Flex
allocates 16 KB of RAM thousands of times, regardless of the argument’s 
structure or size. Looks pretty bad, huh?</p>

<h4 id="flex-input-management">Flex input management</h4>

<p>Why does Flex need that buffer? Flex can be configured to take its input from
a file handle (the default is <code>stdin</code>, i.e., the terminal) or from a provided 
buffer. When it’s a file handle, obviously, it needs a buffer for the data to be 
read, so it allocates 16 KB for that purpose. When it’s configured to read from
a buffer it doesn’t need that additional 16 KB because all the data is already 
provided. Flex needs a mutable buffer, so the client has a choice: provide a mutable 
buffer or let Flex make a copy of an immutable one. Sounds reasonable?</p>

<p>Arguments are of course located in string buffers, not in a file, so why, 
after all, was that 16 KB allocated? Well, because for whatever reason CMake
uses a third, kinda tricky, way: it doesn’t configure Flex to read from the 
buffer, instead it replaces the <em>file reading routine</em> through the Flex macro <code>YY_INPUT</code>
with code that actually reads from the buffer. Thus, Flex thinks it’s going to
read from a file, allocates 16 KB for the file buffer, and calls the overridden 
<em>file reading routine</em>. Arguments are usually small strings, much smaller than
16 KB, hence that huge overconsumption.</p>
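
<p>A back-of-the-envelope sketch of the difference follows. The sample arguments and the function names are invented. The two extra bytes per argument in the sized variant model the end-of-buffer markers Flex appends, which is an assumption of this sketch rather than a measured value.</p>

<pre><code class="language-cpp">#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

constexpr std::size_t kFixedBufferSize = 16384;  // YY_BUF_SIZE from above

// One fixed-size "file" buffer is allocated per expanded argument.
std::size_t BytesWithFixedBuffer(const std::vector&lt;std::string&gt;&amp; args) {
    return args.size() * kFixedBufferSize;
}

// One buffer sized to the argument itself (plus 2 assumed marker bytes).
std::size_t BytesWithSizedBuffer(const std::vector&lt;std::string&gt;&amp; args) {
    std::size_t total = 0;
    for (const auto&amp; arg : args) {
        total += arg.size() + 2;
    }
    return total;
}

int main() {
    // Hypothetical arguments of a single if() call.
    const std::vector&lt;std::string&gt; args{"${CMAKE_SOURCE_DIR}", "STREQUAL",
                                        "${var}"};
    assert(BytesWithFixedBuffer(args) == 49152);  // 3 * 16384
    assert(BytesWithSizedBuffer(args) == 39);     // 21 + 10 + 8
}
</code></pre>

<p>Three short arguments cost about 48 KB in one model and a few dozen bytes in the other, and a real configuration run expands many thousands of arguments.</p>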

<h3 id="the-fix">The fix</h3>

<p>The fix was fairly simple. I replaced all that hackery with the call to 
public API <code>yy_scan_string()</code> and everything just worked. Let’s check heaptrack 
report:</p>

<p><img src="/assets/images/heaptrack-empty-exp-old-fixed.jpg" alt="heaptrack-empty-exp-old-fixed" /></p>

<p>Now, <code>yyalloc()</code>(1) has allocated 1.1 MB instead of 83 MB :)</p>

<p>Overall results(total allocated bytes (number of allocations)):</p>
<ul>
  <li>empty project: 120 MB (354k) -&gt; 38 MB (351k)</li>
  <li>google benchmark: 233 MB (1344k) -&gt; 137 MB (1153k)</li>
  <li>heaptrack: 305 MB (1308k) -&gt; 140 MB (1113k)</li>
</ul>

<h4 id="benchmark">Benchmark</h4>

<p>Like I said before, I was surprised that the Flex/Bison solution was replaced by a 
hand-crafted parser due to the poor performance of the former. So I decided to
do a little benchmark of three implementations: the old <code>ExpandVariablesInStringOld()</code>, the new
(fixed) one, and the hand-written <code>ExpandVariablesInStringNew()</code>. I used this simple
file:</p>

<pre><code class="language-cmake"># cmake_policy(SET CMP0053 OLD)     # use ExpandVariablesInStringOld()
cmake_policy(SET CMP0053 NEW)       # use ExpandVariablesInStringNew()

function(sink)
endfunction()

foreach(i RANGE 1000000)
    sink(simple_var ${simple_var_ref} ${nested_${var_${ref}}})
endforeach()
</code></pre>

<p>Its main purpose is to generate a lot of argument expansion calls. The measurement
was done with</p>
<pre><code class="language-shell">/usr/bin/time -v cmake -P bench.cmake
</code></pre>

<p>Results:</p>
<ul>
  <li>hand-written <code>ExpandVariablesInStringNew()</code>: 4.53 sec</li>
  <li>new <code>ExpandVariablesInStringOld()</code>: 5.38 sec</li>
  <li>old <code>ExpandVariablesInStringOld()</code>: 12.71 sec</li>
</ul>

<p>So yes, the old version was really slow, and had to be replaced. But was it slow
due to Flex or Bison problems? Of course not. Actually, I suppose that if the old
version had been done right, nobody would have thought about replacing it.
Yes, it’s still slower by ~20% than the hand-written version, but this
difference isn’t noticeable in real-world cases. Not to mention how much easier it is
to understand and maintain Flex/Bison specs compared to hand-written code.</p>

<h3 id="conclusion">Conclusion</h3>

<p>Know your tools, don’t reinvent the wheel. Widely used tools usually have all
the needed APIs for nearly all common use cases. If you find that a tool doesn’t fit
your needs, consider that you might be doing something wrong. Think hard to comprehend the 
problem, then learn what the tool provides. And only when you understand them both,
you can use some hacks or custom solutions.</p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[Introduction]]></summary></entry><entry><title type="html">Reducing CMake heap usage with Heaptrack</title><link href="/2020/08/30/reducing-cmake-heap-usage.html" rel="alternate" type="text/html" title="Reducing CMake heap usage with Heaptrack" /><published>2020-08-30T20:02:00+00:00</published><updated>2020-08-30T20:02:00+00:00</updated><id>/2020/08/30/reducing-cmake-heap-usage</id><content type="html" xml:base="/2020/08/30/reducing-cmake-heap-usage.html"><![CDATA[<h3 id="introduction">Introduction</h3>

<p>While working on the <a href="https://oleksandrkvl.github.io/2020/08/09/allowing-cmake-functions-to-return-value.html">previous CMake experiment</a> 
I noticed several places that looked suspiciously suboptimal. As with any
performance consideration, everything should be measured, so I decided to run it
through <a href="https://github.com/KDE/heaptrack">Heaptrack</a> and check what’s going on.</p>

<h3 id="clean-run">Clean run</h3>

<p>I used Heaptrack 1.2.80, CMake 3.18.1 with debug symbols and this nearly empty 
CMakeLists.txt:</p>

<pre><code class="language-cmake">cmake_minimum_required(VERSION 3.17)
project(empty)
</code></pre>

<p>Let’s take a look at what heaptrack gave us:</p>

<p><img src="/assets/images/heaptrack_clean.jpg" alt="heaptrack_clean" /></p>

<p>I don’t expand all entries but you can see that top two heap consumers(1 and 2) 
are related to <code>std::vector&lt;cmListFileFunction&gt;</code> and 
<code>std::vector&lt;cmListFileArgument&gt;</code> operations which in turn stem from 
<code>cmFunctionBlocker::IsFunctionBlocked()</code>(3 and 4). This is exactly what I 
expected. Now let’s look at these types and their role.</p>

<h3 id="representation">Representation</h3>

<p>These types are used to represent parsed code. Here are their simplified 
definitions:</p>

<pre><code class="language-cpp">// command argument representation
struct cmListFileArgument
{
    std::string value;  // argument itself
    Delimiter delim;    // argument's type: quoted/unquoted/bracket
    long line;          // argument's position
};

// command representation
struct cmListFileFunction
{
    std::string nameLower;
    std::string nameOriginal;
    long line;
    std::vector&lt;cmListFileArgument&gt; arguments;
};

// file representation
struct cmListFile
{
    std::vector&lt;cmListFileFunction&gt; functions;
};
</code></pre>

<p>For example, representation for this code</p>
<pre><code class="language-cmake">Find_Package(Boost 1.41.0 COMPONENTS system filesystem iostreams)
</code></pre>

<p>would be</p>

<pre><code class="language-cpp">cmListFile
{
    std::vector&lt;cmListFileFunction&gt;
    {
        cmListFileFunction
        {
            std::string{"find_package"},
            std::string{"Find_Package"},
            long{1},
            std::vector&lt;cmListFileArgument&gt;
            {
                {std::string{"Boost"}, Unquoted, long{1}},
                {std::string{"1.41.0"}, Unquoted, long{1}},
                {std::string{"COMPONENTS"}, Unquoted, long{1}},
                {std::string{"system"}, Unquoted, long{1}},
                {std::string{"filesystem"}, Unquoted, long{1}},
                {std::string{"iostreams"}, Unquoted, long{1}},
            }
        }
    }
};
</code></pre>

<p>When the file is parsed, <code>cmListFile</code> contains all its commands. Afterwards, they
are executed one after another.</p>

<h4 id="function-blocker">Function blocker</h4>

<p>Everything in CMake is a command, including things like functions, conditionals 
and loops. Hence, it needs a way to represent a <em>scope</em>, 
e.g. it can’t just execute a function body on its first occurrence. Instead, it 
needs to collect the function’s
inner commands as its body and create a new function definition. To achieve that, CMake
has a <em>function blocker</em>, which provides that <code>IsFunctionBlocked()</code> function:</p>

<pre><code class="language-cpp">class cmFunctionBlocker
{
public:
    bool IsFunctionBlocked(cmListFileFunction const&amp; function)
    {
        //...
        functions.push_back(function);  // copy!
        return true;
    }
    //...
private:
    std::vector&lt;cmListFileFunction&gt; functions;
};
</code></pre>

<p>When a <em>starting</em> command (e.g. <em>function()</em>, <em>if()</em>) is executed, its blocker starts
to collect the scope body using <code>IsFunctionBlocked()</code>, and when the <em>ending</em> command (<em>endfunction()</em>, <em>endif()</em>) is 
met, it does something with that body. The important point here: once the body is 
collected, it will never be modified.</p>

<h3 id="the-problem">The problem</h3>

<p>As you can see it copies functions inside.
Consider how this simple code would be represented:</p>

<pre><code class="language-cmake">function(f)
    # function() block
    if(${var})
        # if() block
        message("hello world")
    endif()
endfunction()

f()
</code></pre>

<p>Here’s how it’s stored in memory:</p>
<ul>
  <li><code>cmListFile</code> - stores commands at lines [1; 9] (ranges are closed)</li>
  <li><code>function() blocker</code> - stores commands at lines [2; 6]</li>
  <li><code>if() blocker</code> - stores commands at lines [4; 5]</li>
</ul>

<p>That’s the problem: blockers copy the same commands multiple times, depending on 
the code structure. It’s quite expensive because each command contains two strings
(for its name) and a vector of strings (for arguments). Moreover, this happens 
repeatedly. Inner blockers are populated each time the outer one is executed. In
the above example, the if-blocker is recreated on each <code>f()</code> call.</p>

<h4 id="solution">Solution</h4>

<p>My first thought was to store raw pointers in blockers but it doesn’t work. When 
CMake reads a dependent file (via <em>include()</em>) it really needs to copy <code>cmListFileFunction</code> 
because the corresponding <code>cmListFile</code> is destroyed after parsing but we still need
to use the included functions. So the actual
solution is to store each command in a <code>std::shared_ptr</code> to make copying cheap:</p>

<pre><code class="language-cpp">// old cmListFileFunction
struct cmListFileFunctionImpl
{
    std::string nameLower;
    std::string nameOriginal;
    long line;
    std::vector&lt;cmListFileArgument&gt; arguments;
};

using cmListFileFunction = std::shared_ptr&lt;cmListFileFunctionImpl&gt;;
</code></pre>
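
<p>Here is a small sketch of what this buys us. It reuses the type names from above, but simplifies the arguments to plain strings and models the file and the blocker as two bare vectors, which is an assumption made for brevity:</p>

<pre><code class="language-cpp">#include &lt;cassert&gt;
#include &lt;memory&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

struct cmListFileFunctionImpl {
    std::string nameLower;
    std::string nameOriginal;
    long line;
    std::vector&lt;std::string&gt; arguments;  // simplified: plain strings
};

using cmListFileFunction = std::shared_ptr&lt;cmListFileFunctionImpl&gt;;

int main() {
    // Stand-in for the parsed file.
    std::vector&lt;cmListFileFunction&gt; file;
    file.push_back(std::make_shared&lt;cmListFileFunctionImpl&gt;(
        cmListFileFunctionImpl{"message", "message", 5, {"hello world"}}));

    // Stand-in for a blocker: "copying" the command only bumps a
    // reference count, no strings or vectors are duplicated.
    std::vector&lt;cmListFileFunction&gt; blocker;
    blocker.push_back(file[0]);
    assert(blocker[0].get() == file[0].get());
    assert(file[0].use_count() == 2);

    // The command stays valid after the file representation is gone,
    // which is what raw pointers could not guarantee.
    file.clear();
    assert(blocker[0]-&gt;nameLower == "message");
    assert(blocker[0].use_count() == 1);
}
</code></pre>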

<h4 id="another-little-problem">Another little problem</h4>

<p>The second place where it makes an unnecessary copy of commands is the 
<code>ExecuteCommand()</code> function. Although it’s not critical, I still think obviously
unneeded copy operations should be avoided.</p>

<pre><code class="language-cpp">using Command = std::function&lt;bool(std::vector&lt;cmListFileArgument&gt; const&amp;)&gt;;

std::map&lt;std::string, Command&gt; RegisteredCommands;

Command GetCommandByExactName(std::string const&amp; name);

bool ExecuteCommand(const cmListFileFunction&amp; function)
{
    //...
    if (auto command = GetCommandByExactName(function.nameLower))   // copy!
    {
        // execute
        command(function.arguments);
    }
    //...
}
</code></pre>

<p>As you can see, commands are stored in a <code>std::function</code> and during 
execution that object is copied. Why? Because commands can be redefined, thus
the currently executing function object might be reassigned, and continuing its 
execution would be UB if a copy weren’t kept somewhere.
Copying a <code>std::function</code> usually involves copying the 
Callable stored inside it. Most built-in commands in CMake are stored as raw 
function pointers so copying them is relatively cheap. But user-defined functions
(<em>function()/endfunction()</em>) contain a vector of commands in their blocker, and copying that is not cheap.
Again, the solution is to use <code>std::shared_ptr</code>:</p>

<pre><code class="language-cpp">using CommandPtr = std::shared_ptr&lt;Command&gt;;
std::map&lt;std::string, CommandPtr&gt; RegisteredCommands;
CommandPtr GetCommandByExactName(std::string const&amp; name);
</code></pre>
<p>No more excessive copies of vectors of strings. Both solutions could be even 
better with something like <code>boost::local_shared_ptr</code> which avoids synchronization
overhead.</p>
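
<p>The redefinition scenario can be sketched in a few lines. This is a simplified model, not CMake’s code: <code>Command</code> here takes no arguments and returns an <code>int</code> so that the result is easy to check, and <code>ExecuteCommand()</code> looks commands up by name only.</p>

<pre><code class="language-cpp">#include &lt;cassert&gt;
#include &lt;functional&gt;
#include &lt;map&gt;
#include &lt;memory&gt;
#include &lt;string&gt;

// Simplified command type for the sketch: no arguments, int result.
using Command = std::function&lt;int()&gt;;
using CommandPtr = std::shared_ptr&lt;Command&gt;;

std::map&lt;std::string, CommandPtr&gt; RegisteredCommands;

int ExecuteCommand(const std::string&amp; name) {
    // Copy the pointer, not the callable: the local copy keeps the
    // currently running command alive even if it gets redefined meanwhile.
    const CommandPtr command = RegisteredCommands.at(name);
    return (*command)();
}

int main() {
    RegisteredCommands["f"] = std::make_shared&lt;Command&gt;([] {
        // The command replaces its own registration while it is running.
        RegisteredCommands["f"] = std::make_shared&lt;Command&gt;([] { return 2; });
        return 1;
    });

    assert(ExecuteCommand("f") == 1);  // the old definition ran to completion
    assert(ExecuteCommand("f") == 2);  // the new definition is picked up
}
</code></pre>

<p>The local <code>CommandPtr</code> costs one reference count increment per call, regardless of how large the stored callable is.</p>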

<h3 id="optimized-run">Optimized run</h3>

<p>Let’s measure those changes.</p>

<p><img src="/assets/images/heaptrack_optimized.jpg" alt="heaptrack_optimized" /></p>

<p>Now top heap consumers(1 and 2) are related to <code>std::string</code> operations. Some of 
them also could be fixed but it requires more effort to get significant 
improvement.
Total bytes allocation decreased from 65MB to 39MB. Number of allocations 
decreased from 394k to 280k.</p>

<p>For more complex configurations the economy is of course lower. Partly because 
there’s an old parsing routine that allocates a lot and becomes the major memory 
consumer (I’ve fixed it in the <a href="https://oleksandrkvl.github.io/2020/09/06/reducing-cmake-heap-usage-2.html">next post</a>). Here are the results for the configuration step of heaptrack itself and Google benchmark 
(total allocated bytes (number of allocations)).</p>

<p>Heaptrack:</p>
<ul>
  <li>before: 305 MB (1308k)</li>
  <li>after: 268 MB (1148k)</li>
</ul>

<p>Google benchmark:</p>
<ul>
  <li>before: 233 MB (1344k)</li>
  <li>after: 196 MB (1190k)</li>
</ul>

<h3 id="conclusion">Conclusion</h3>

<p>I’m not a fan of deep optimizations in a project like CMake where resources
are not scarce and actual execution is rare. However, in this case it’s
not really an <em>optimization</em>, just avoidance of unneeded copy
operations, just the C++ way of doing things.</p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[Introduction]]></summary></entry><entry><title type="html">Allowing CMake functions to return(value)</title><link href="/2020/08/09/allowing-cmake-functions-to-return-value.html" rel="alternate" type="text/html" title="Allowing CMake functions to return(value)" /><published>2020-08-09T10:55:09+00:00</published><updated>2020-08-09T10:55:09+00:00</updated><id>/2020/08/09/allowing-cmake-functions-to-return-value</id><content type="html" xml:base="/2020/08/09/allowing-cmake-functions-to-return-value.html"><![CDATA[<h2 id="introduction">Introduction</h2>

<p>This is the story of implementing a CMake feature that I call a <code>command reference</code>
(similar to the existing <code>variable reference</code>), i.e., using the result of a command 
invocation as an argument. I’d had this idea for a long time but never had enough 
time to dig into it. Now, being unemployed, I decided at least to try it before 
looking for my next job. It was not as easy as I expected but I’m pretty satisfied 
with the result.</p>

<p>This post consists of two parts:</p>
<ol>
  <li>The first part contains motivation, design and results.</li>
  <li>The second part explains some implementation details, such as why a new lexer and 
parser are needed.</li>
</ol>

<h2 id="motivation">Motivation</h2>

<p>For most of my career I used Visual Studio,
and when I switched to Linux I was slightly shocked. Compared to MSVS, makefiles 
felt like bows and arrows against a machine gun. Then I discovered CMake and it felt
much better: instead of cryptic makefiles we got a distinct language with 
commands and variables. And since it’s just another language, the same rules 
apply to its code: meaningful names, small functions, separation of abstractions, 
etc. Unfortunately, many CMake files look like one very big function 
that mixes everything in it. CMake allows us to handle almost all those things 
right except one: it doesn’t have return values, which limits the usefulness 
of function abstraction. As a result, some parts of your 
CMakeLists look bad.</p>

<p>Let’s look at some examples.</p>

<pre><code class="language-cmake">if(${CMAKE_CURRENT_LIST_DIR} STREQUAL ${CMAKE_SOURCE_DIR})    # is top level list?
if(WIN32)   # surprisingly short name compared to other CMake vars
if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
if((CMAKE_CXX_COMPILER_ID STREQUAL "Clang") 
    OR (CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang"))
if(CMAKE_SIZEOF_VOID_P EQUAL 8)    # my favorite one, x64 check
</code></pre>
<p>The problem with such code is that it doesn’t express logic, only implementation
details. It requires you to remember all those long and tricky variable names,
magic values and the relations between them.</p>

<p>We can move that into a function but it doesn’t solve the problem. We are lazy,
nobody wants to write two lines instead of one:</p>
<pre><code class="language-cmake">check_is_top_level_list(is_top_level_list)
if(is_top_level_list)...

# is it better than one-liner?
if(${CMAKE_CURRENT_LIST_DIR} STREQUAL ${CMAKE_SOURCE_DIR})
</code></pre>

<p>Almost all feature tests become:</p>
<pre><code class="language-cmake">check_feature_available(RESULT is_feature_available)
if(is_feature_available)
</code></pre>
<p>It’s better than manipulating values directly but it’s ugly. Now you need to
check the documentation for the name of this output argument. Intuitive
candidates are
<code>RESULT</code>, <code>RESULT_VAR</code>, <code>OUTPUT</code>, or no such argument at all:
<code>check_feature_available(is_feature_available)</code> (or <code>check_feature_available()</code> with <code>FATAL_ERROR</code>).
It also forces you to think about names for variables, many
of which are used only once. In any popular language we can write all of the above
in a clear manner:</p>

<pre><code class="language-cpp">if(IsTopLevelProject()){}
if(IsWindowsBuild()){}
if(IsClang() || IsGcc()){}
if(IsX64Build()){}
if(IsFeatureAvailable()){}
</code></pre>

<p>Why should I know how those checks are performed?</p>

<p>Again, you can handle all that stuff, CMake has been successfully used for years. But why not make it simpler? CMake is a big part of C++ so why 
shouldn’t we make it easier for new people as we do with C++ itself?</p>

<h2 id="design">Design</h2>

<h3 id="why-fh-isnt-possible">Why f(h()) isn’t possible</h3>

<p>The initial idea was to allow something like <code>get_name_by_id(get_id())</code>, but it quickly
turned out to be unworkable for two reasons:</p>
<ol>
  <li>
    <p>CMake syntax is too simple: it doesn’t have keywords, command names are not
restricted, and everything is a string (including parens).
For example, the expression <code>if(x AND (y OR z))</code> means
call <code>if_impl("x", "AND", "(", "y", "OR", "z", ")");</code> where <code>AND</code>, <code>OR</code> and parens
are just plain strings that are handled in a specific way by <code>if_impl</code>.
The only requirement here is that parens must match, e.g. <code>if(x AND (y))))</code>
isn’t allowed. Because of that, the following is ambiguous:</p>

    <pre><code class="language-cmake"> function(AND a b c)
 endfunction()

 if(x AND (y OR z))  # if_impl(x, AND, (, y, OR, z, )) or
                     # if_impl(x, AND_impl(y, OR, z)) ?
</code></pre>
    <p>The same ambiguity extends to named arguments which, unlike boolean operators,
can have non-trivial names.</p>
  </li>
  <li>
    <p>But the main reason is that the above form isn’t flexible enough. How would you use it
within a quoted argument or mix it with plain strings?</p>

    <pre><code class="language-cmake"> function(h)
 endfunction()

 f(a_h() "b h()_c")  # not possible
</code></pre>
  </li>
</ol>

<h3 id="meet-the-command-reference">Meet the command reference</h3>

<p>The syntax mimics a variable reference: <code>${command_name( args... )}</code> (note that there are no
spaces before the command name or after the final paren). It works
just as you would expect:</p>

<pre><code class="language-cmake">function(get_name)
    return("Alex")
endfunction()

message(${get_name()})    # prints "Alex"
</code></pre>

<p>More generally:</p>

<pre><code class="language-cmake">function(f)
    return(return-value-expr)
endfunction()
use_f(${f()})

# is equal to
set(__ret_var_name return-value-expr)
use_f(${__ret_var_name})
</code></pre>

<p>It can be used wherever a variable reference can.
Comments, nested calls and lists are also allowed. Let’s mix it all together:</p>

<pre><code class="language-cmake">function(format_name first last)
    return("First: ${first}, last: ${last}")
endfunction()

function(get_first_name)
    return("John")          # return quoted
endfunction()

function(get_last_name)
    return(Doe)             # return unquoted
endfunction()

function(get_first_and_last)
    return([[John]] Doe)    # return list
endfunction()

message(
    ${format_name(      # pass separate args
        ${get_first_name()}     # comments
        ${get_last_name()}      #[[ inside 
                                    command
                                    reference ]]
    )}
)                       # First: John, last: Doe

message(
    ${format_name(      # pass as a list, expands into two arguments
        ${get_first_and_last()}
    )}
)                       # First: John, last: Doe

# return() becomes a function that returns its arguments
message(${CMAKE_${return("VERSION")}})  # 3.18.1-...
</code></pre>

<h2 id="downloads">Downloads</h2>

<p>The current implementation is based on the CMake 3.18.1 release. You can build it
from <a href="https://github.com/OleksandrKvl/CMake/tree/command-reference-3.18.1">sources</a> or use pre-built binaries:</p>
<ol>
  <li><a href="https://github.com/OleksandrKvl/cmake_cmdref_builds/raw/master/cmake-3.18.1-win64-x64.zip">cmake-3.18.1-win64-x64.zip</a></li>
  <li><a href="https://github.com/OleksandrKvl/cmake_cmdref_builds/raw/master/cmake-3.18.1-Linux-x86_64.tar.gz">cmake-3.18.1-Linux-x86_64.tar.gz</a></li>
</ol>

<h3 id="warning">Warning</h3>

<p><em>Some CMake features or policies, especially those related to syntax or variable
expansion, might not work. One such policy I’m aware of is the OLD behavior of
<a href="https://cmake.org/cmake/help/v3.1/policy/CMP0053.html">CMP0053</a>. Syntax-related
error messages are also slightly different. Everything else should work: I’ve
successfully built Google Test, Google Benchmark and fmt with it. I’m not a
CMake developer and the integration is quite dirty in some places, so don’t expect it to
be production-ready right away.</em></p>

<h2 id="part-2-implementation-details">Part 2. Implementation details</h2>

<p>At the beginning I naively supposed that since CMake can already parse a single
command invocation, it would be enough to call that function recursively on
every argument :) But it turned out to be far more complex and required a
completely new lexer and parser. I’ve created them using Flex&amp;Bison; you can find
a separate project that does parsing and pseudo-evaluation <a href="https://github.com/OleksandrKvl/cmake_parser">here</a>.</p>

<h3 id="existing-cmake-parser-and-why-its-not-enough">Existing CMake parser and why it’s not enough</h3>

<p>The current implementation is relatively simple (though its code is not). It consists of a
Flex-based scanner and a hand-written parser.
The scanner detects separate arguments and their kinds,
which is easy since we know how each argument starts and ends. The current parser
mostly verifies basic syntax rules such as valid separation, matching parens, etc.
For example,
<code>command(a "${b}")</code> is parsed as <code>call("command").with_args(unquoted_arg{"a"},
quoted_arg{"${b}"})</code>. Notice that the variable reference <code>${b}</code> is passed as plain text.
During command execution each argument is parsed again with another parser that
can detect, verify and evaluate variable references. If such a command appears in a
loop, this additional parsing is done on every iteration.
Also, if you make a mistake inside a reference, it won’t be detected until the
expression is evaluated:</p>

<pre><code class="language-cmake">if(${ALWAYS_TRUE_IN_YOUR_ENV})
    # no errors or warnings on your machine
    message("hello world")
else()
    # syntax error at run-time on another machine
    message(${@:-:@})
endif()
</code></pre>
<p>Once we allow another command to appear inside an argument, separating arguments is no longer so easy:</p>
<pre><code class="language-cmake">command("result: ${get_result("a" b)}")
</code></pre>
<p>You can see that the highlighter marks “a” in black because it thinks that the arguments are
<code>"result: ${get_result("</code>, <code>a</code>, <code>" b)}"</code>. The existing CMake parser sees it
the same way.
To separate arguments correctly we have to parse recursively
whenever we meet a command reference.</p>

<p>In terms of BNF, the existing syntax looks like this (simplified):</p>

<pre><code class="language-bnf">command_invocation ::= identifier '(' argument* ')'
</code></pre>

<p>With command reference we get:</p>

<pre><code class="language-bnf">command_invocation ::= identifier '(' (argument | command_invocation)* ')'
</code></pre>

<p>The only difference is that a command reference may also appear inside an argument, not only
as a separate one.</p>

<p>As you can see, we now need to parse much deeper than the existing parser does,
so there’s no sense in trying to extend it. Writing a parser for recursive rules
by hand is not trivial either, so I had no choice but to write both the scanner and the parser
from scratch. Flex and Bison were chosen because they’re already used
in CMake.</p>

<h3 id="bnf-for-a-new-syntax">BNF for a new syntax</h3>

<p>Let’s slightly update the <a href="https://cmake.org/cmake/help/latest/manual/cmake-language.7.html#syntax">official BNF</a> to reflect the new syntax:</p>

<pre><code class="language-bnf">command_invocation  ::=  identifier space* '(' arguments ')'

quoted_argument     ::= '"' (quoted_element | reference)* '"'
unquoted_argument   ::= (unquoted_element | reference)+

reference           ::= var_reference | command_reference
var_reference       ::= var_ref_open (variable_name | reference)* ref_close
command_reference   ::= cmd_ref_open command_invocation ref_close

var_ref_open        ::= "${" | "$ENV{" | "$CACHE{"
cmd_ref_open        ::= "${"
ref_close           ::= "}"

quoted_element      ::= &lt;check official docs&gt;
unquoted_element    ::= &lt;check official docs&gt;
variable_name       ::= &lt;check official docs&gt;
</code></pre>

<p>Unlike the existing implementation, I want to avoid parsing during execution and get
all details in one pass. Now, each quoted/unquoted argument consists of
strings (<code>quoted/unquoted_element+</code>) and references. To get its real value
at run-time we need to evaluate and concatenate all its parts. For example,
<code>a_${b}_c</code> has 3 elements: <code>string("a_")</code>, <code>var_ref("b")</code>, <code>string("_c")</code>. At
run-time we get the value of <code>b</code> and concatenate them together: <code>a_B_VALUE_c</code>.</p>
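
<p>A minimal sketch of this evaluate-and-concatenate step. The <code>Part</code> type, the
<code>evaluate_argument</code> function and the plain map standing in for CMake’s variable scope
are assumptions made for illustration; they are not taken from the actual implementation:</p>

<pre><code class="language-cpp">#include &lt;map&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical representation of one piece of an argument.
struct Part {
    enum class Kind { String, VarRef } kind;
    std::string text;  // literal text or variable name
};

using Variables = std::map&lt;std::string, std::string&gt;;

// Evaluates each part and concatenates the results.
// Unset variables contribute an empty string.
std::string evaluate_argument(
    const std::vector&lt;Part&gt;&amp; parts, const Variables&amp; variables) {
    std::string result;
    for (const auto&amp; part : parts) {
        if (part.kind == Part::Kind::String) {
            result += part.text;
        } else {
            const auto it = variables.find(part.text);
            if (it != variables.end()) {
                result += it-&gt;second;
            }
        }
    }
    return result;
}
</code></pre>

<p>Nested references and command references are left out here to keep the sketch short.</p>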

<h3 id="expression-representation-and-evaluation">Expression representation and evaluation</h3>

<p>Here’s a brief overview of the key expressions:</p>
<ul>
  <li>A call expression is a list of arguments.</li>
  <li>A quoted/unquoted argument expression is a list of strings and references.</li>
  <li>A variable reference expression is a list of strings and references.</li>
  <li>A command reference expression is similar to a call expression.</li>
</ul>

<p>Now we need a good representation that can store and evaluate 
such expressions efficiently.</p>

<h4 id="ast">AST</h4>

<p>The first approach was to use the classic <a href="https://en.wikipedia.org/wiki/Interpreter_pattern">Interpreter pattern</a> and compose expressions into a tree. Since each expression is
list-like, we can represent them all as a <code>std::vector&lt;std::unique_ptr&lt;IExpression&gt;&gt;</code>.
It works, but even a simple command becomes quite
involved: <code>command(a b)</code> is represented roughly as</p>

<pre><code class="language-cpp">vector{         // vector of arguments
    "command",  // command name
    vector{     // each argument is a vector itself
        "a"
    },
    vector{
        "b"
    }
}
</code></pre>

<p>Things get worse when we add a reference, <code>command(a ${b}_c)</code>:</p>

<pre><code class="language-cpp">vector{
    "command",
    vector{
        "a"
    },
    vector{
        vector{     // reference is also a vector
            "b"
        },
        "_c"
    }
}
</code></pre>

<p>Too many vectors %)</p>

<h4 id="rpn">RPN</h4>

<p><a href="https://en.wikipedia.org/wiki/Reverse_Polish_notation">Reverse Polish (or postfix) Notation</a> is a notation in which arguments come before
the operator. It shines when you need to represent a “linear” expression without branches, and it doesn’t need parens to express precedence:</p>

<pre><code>Normal(infix) notation: a + b
RPN: a b +

Normal: (a + b) * c
RPN: a b + c *
</code></pre>
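
<p>A minimal sketch of how such an expression is evaluated with a stack.
<code>evaluate_rpn</code> is a hypothetical helper written for illustration: it handles only
integers with <code>+</code> and <code>*</code> and assumes well-formed, space-separated input:</p>

<pre><code class="language-cpp">#include &lt;sstream&gt;
#include &lt;stack&gt;
#include &lt;string&gt;

// Evaluates a space-separated RPN expression with integer operands
// and the binary operators + and *.
int evaluate_rpn(const std::string&amp; expression) {
    std::istringstream tokens{expression};
    std::stack&lt;int&gt; values;
    std::string token;
    while (tokens &gt;&gt; token) {
        if (token == "+" || token == "*") {
            // Operands were pushed before the operator: pop right, then left.
            const int rhs = values.top();
            values.pop();
            const int lhs = values.top();
            values.pop();
            values.push(token == "+" ? lhs + rhs : lhs * rhs);
        } else {
            values.push(std::stoi(token));
        }
    }
    return values.top();
}
</code></pre>

<p>For example, <code>evaluate_rpn("1 2 + 3 *")</code> computes <code>(1 + 2) * 3</code> and returns 9.</p>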

<p>Now <code>command(a ${b}_c)</code> is represented with:</p>
<pre><code class="language-cpp">vector&lt;IExpression&gt;{
    StringExpr{"command"},                  // command name
    StringExpr{"a"}, UnquotedArgExpr{1},    // 1 is the number of subexpressions to concat
    StringExpr{"b"}, VarRefExpr{1},         // same for VarRefExpr
    StringExpr{"_c"}, UnquotedArgExpr{2},   // ${b} and "_c" are concatenated
    CallExpr{3}                             // 3 is the number of arguments including the name
}
</code></pre>
<p>One vector instead of four with the AST approach, regardless of how complex the expression
is. Win :)</p>

<p>It also nicely fits a bottom-up parser like the one Bison generates because of the order in which
symbols are discovered.
In the example above, Bison will discover symbols exactly in
their order in that vector, so you can just push expressions without any knowledge
of previous symbols or other context.</p>

<h4 id="evaluation">Evaluation</h4>

<p>RPN is evaluated using a stack. Each expression knows its arity
(number of arguments); it pops them from the stack and pushes back the result. But
there’s a little problem here. CMake expands list strings into multiple arguments:</p>

<pre><code class="language-cmake">set(my_list a;b;c)
command(${my_list})       # called with 3 args: a, b, c
</code></pre>

<p>It means that even if our CallExpr has a single argument expression, at run-time it might expand into any
number of arguments, including zero. Classical RPN evaluation doesn’t work here. To overcome
this we need to adjust the definition of <code>arity</code>: now arity means the <em>number of
expressions whose results should be taken as arguments</em>. And we need an additional
stack to track this <em>results count</em>. Consider the RPN representation of the above example:</p>

<pre><code class="language-cpp">{
    StringExpr{"command"},
    StringExpr{"my_list"}, VarRefExpr{1}, UnquotedArgExpr{1},
    CallExpr{2}
}
</code></pre>

<p>Take a look at both stacks before CallExpr evaluation for two cases:</p>
<ol>
  <li>my_list expands into 3 arguments
    <pre><code class="language-cpp"> results:        {"command", "a", "b", "c"}
 results_count:  {1, 3}
</code></pre>
    <p>CallExpr arity is 2, thus the actual arity is the sum of the last two elements of the
results_count stack,
and that will be the final number of its arguments: <code>1 + 3 = 4</code>.</p>
  </li>
  <li>my_list expands into 0 arguments
    <pre><code class="language-cpp"> results:        {"command"}
 results_count:  {1, 0}
</code></pre>
    <p>Here, actual arity is <code>1 + 0 = 1</code>.</p>
  </li>
</ol>
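
<p>The two-stack scheme can be sketched as follows. This is a simplified model under stated
assumptions, not the code from the actual implementation: the <code>Expr</code> and
<code>Stacks</code> types, <code>evaluate_call</code>, and the naive <code>split_list</code>
(no escapes or brackets, empty items dropped) are all hypothetical, and a plain map stands in
for CMake’s variable scope:</p>

<pre><code class="language-cpp">#include &lt;cstddef&gt;
#include &lt;map&gt;
#include &lt;sstream&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical RPN node; arity is the number of preceding expressions to consume.
struct Expr {
    enum class Kind { String, VarRef, UnquotedArg, Call } kind;
    std::string text;   // used by String only
    std::size_t arity;  // ignored by String
};

struct Stacks {
    std::vector&lt;std::string&gt; results;
    std::vector&lt;std::size_t&gt; results_count;

    void push(const std::vector&lt;std::string&gt;&amp; values) {
        results.insert(results.end(), values.begin(), values.end());
        results_count.push_back(values.size());
    }

    // Pops the results of the last `arity` expressions, keeping their order.
    std::vector&lt;std::string&gt; pop(std::size_t arity) {
        std::size_t total = 0;
        for (; arity &gt; 0; --arity) {
            total += results_count.back();
            results_count.pop_back();
        }
        const auto first = results.end() - static_cast&lt;std::ptrdiff_t&gt;(total);
        std::vector&lt;std::string&gt; values(first, results.end());
        results.erase(first, results.end());
        return values;
    }
};

std::string concat(const std::vector&lt;std::string&gt;&amp; parts) {
    std::string joined;
    for (const auto&amp; part : parts) joined += part;
    return joined;
}

// Simplified list splitting: no escapes or brackets, empty items are dropped.
std::vector&lt;std::string&gt; split_list(const std::string&amp; value) {
    std::vector&lt;std::string&gt; items;
    std::istringstream stream{value};
    for (std::string item; std::getline(stream, item, ';');) {
        if (!item.empty()) items.push_back(item);
    }
    return items;
}

// Returns the arguments of the trailing CallExpr, command name first.
std::vector&lt;std::string&gt; evaluate_call(
    const std::vector&lt;Expr&gt;&amp; rpn,
    const std::map&lt;std::string, std::string&gt;&amp; variables) {
    Stacks stacks;
    std::vector&lt;std::string&gt; call;
    for (const auto&amp; expr : rpn) {
        switch (expr.kind) {
        case Expr::Kind::String:
            stacks.push({expr.text});
            break;
        case Expr::Kind::VarRef: {
            // The concatenated parts form the variable name.
            const auto it = variables.find(concat(stacks.pop(expr.arity)));
            stacks.push({it != variables.end() ? it-&gt;second : std::string{}});
            break;
        }
        case Expr::Kind::UnquotedArg:
            // One expression may produce zero, one or many results.
            stacks.push(split_list(concat(stacks.pop(expr.arity))));
            break;
        case Expr::Kind::Call:
            call = stacks.pop(expr.arity);
            break;
        }
    }
    return call;
}
</code></pre>

<p>Feeding it the RPN sequence above with <code>my_list</code> set to <code>a;b;c</code> yields
four results (<code>command</code>, <code>a</code>, <code>b</code>, <code>c</code>); with
<code>my_list</code> unset it yields only <code>command</code>.</p>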

<h3 id="another-small-benefits-of-this-implementation">Other small benefits of this implementation</h3>

<h4 id="easy-to-change">Easy to change</h4>

<p>Writing syntax rules in Bison makes them much easier to change,
understand, review and support than a hand-written parser.</p>

<h4 id="symbol-locations">Symbol locations</h4>

<p>Bison makes tracking symbol locations almost automatic. With a <a href="https://github.com/OleksandrKvl/cmake_parser/blob/master/src/scanner.l#L73">simple action</a>,
you only need to track lines manually.</p>

<h4 id="error-messages">Error messages</h4>

<p>Bison’s out-of-the-box error messages are pretty good:</p>
<pre><code class="language-cmake">f(${@}) # 1.5 : syntax error, unexpected invalid token, expecting command name or 
        # reference opening or reference closing or variable name
</code></pre>

<h4 id="bom-and-line-breaks-handling">BOM and line breaks handling</h4>

<p>CMake supports a BOM header but only UTF-8
is allowed. Instead of reading it <a href="https://github.com/Kitware/CMake/blob/master/Source/LexerParser/cmListFileLexer.in.l#L421">by hand</a>
we can handle it easily with another rule in the parser.</p>

<p>CMake converts all <code>\r\n</code> into <code>\n</code> during <a href="https://github.com/Kitware/CMake/blob/0cd3b5d0ca8d541fc3769f467db71a07a95be7f6/Source/LexerParser/cmListFileLexer.in.l#L326">file reading</a> by <a href="https://github.com/Kitware/CMake/blob/master/Source/LexerParser/cmListFileLexer.in.l#L59">replacing</a> Flex’s input routine.
Honestly, I can’t fully understand that code. Supposedly it just replaces <code>\r\n</code>
with <code>\n</code> and <code>memcpy()</code>s the rest, but I want something better. In many places we can
just use <code>\r?\n</code> regexp endings in scanner rules. In theory it’s possible
that a string literal contains <code>\r\r\n</code>, which should become <code>\r\n</code> (I’m talking
about raw bytes <code>0x0D 0x0A</code>, not escapes). To handle this, the scanner removes the trailing
<code>\r</code> (if any) on the fly when <code>\n</code> is met in a string literal. Since
rules are written to take input line-by-line, it doesn’t involve much overhead.
These simple solutions make it possible to eliminate custom reading routines and tons of
<code>memcpy()</code> calls.</p>
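
<p>A minimal sketch of that on-the-fly fix, assuming the scanner hands over a string literal
one line at a time. <code>append_line</code> is a hypothetical helper for illustration, not a
function from the actual scanner:</p>

<pre><code class="language-cpp">#include &lt;string&gt;

// Appends one scanned line (including its line ending, if any) to the string
// literal being built, dropping a single '\r' that directly precedes '\n'.
void append_line(std::string&amp; literal, const std::string&amp; line) {
    const auto size = line.size();
    if (size &gt;= 2 &amp;&amp; line[size - 2] == '\r' &amp;&amp; line[size - 1] == '\n') {
        literal.append(line, 0, size - 2);
        literal += '\n';
    } else {
        literal += line;
    }
}
</code></pre>

<p>Only the <code>\r</code> directly before <code>\n</code> is removed, so a line ending in
<code>\r\r\n</code> is stored as <code>\r\n</code>, and a lone <code>\r</code> elsewhere in the
line is kept.</p>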

<h2 id="aftenotes">Afternotes</h2>

<p>It’s not an official CMake feature of course. If you like it, let me or CMake 
devs know to increase chances of having it in future CMake versions.</p>]]></content><author><name>Oleksandr Koval</name></author><summary type="html"><![CDATA[Introduction]]></summary></entry></feed>