A few published C++ libraries implement some of the HTTP protocol. We analyze the message model chosen by those libraries and discuss the advantages and disadvantages relative to Beast.
The general strategy used by the author to evaluate external libraries is as follows:
Note: Declaration examples from external libraries have been edited; portions have been removed for simplification.
cpp-netlib is a network programming library previously intended for Boost, although it never went through formal review. As of this writing it still uses the Boost name, namespace, and directory structure, although the project states that Boost acceptance is no longer a goal. The library is based on Boost.Asio and bills itself as "a collection of network related routines/implementations geared towards providing a robust cross-platform networking library". It cites "Common Message Type" as a feature. As of the branch previously linked, it uses these declarations:
    template <class Tag>
    struct basic_message {
    public:
        typedef Tag tag;

        typedef typename headers_container<Tag>::type headers_container_type;
        typedef typename headers_container_type::value_type header_type;
        typedef typename string<Tag>::type string_type;

        headers_container_type& headers() { return headers_; }
        headers_container_type const& headers() const { return headers_; }

        string_type& body() { return body_; }
        string_type const& body() const { return body_; }

        string_type& source() { return source_; }
        string_type const& source() const { return source_; }

        string_type& destination() { return destination_; }
        string_type const& destination() const { return destination_; }

    private:
        friend struct detail::directive_base<Tag>;
        friend struct detail::wrapper_base<Tag, basic_message<Tag> >;

        mutable headers_container_type headers_;
        mutable string_type body_;
        mutable string_type source_;
        mutable string_type destination_;
    };
This container is the base class template used to represent HTTP messages. It uses "tag" type style specializations for a variety of trait classes, allowing for customization of the various parts of the message. For example, a user specializes headers_container<T> to determine what container type holds the header fields. We note some problems with the container declaration:
- [...] body_ to after the headers are read in.
- The use of string_type (a customization point) for source, destination, and body suggests that string_type models a ForwardRange whose value_type is char. This representation is less than ideal, considering that the library is built on Boost.Asio. Adapting a DynamicBuffer to the required forward range destroys information conveyed by the ConstBufferSequence and MutableBufferSequence used in dynamic buffers. The consequence is that cpp-netlib implementations will be less efficient than an equivalent Networking.TS conforming implementation.
- string<Tag> is specialized to change the type of string used everywhere, including the body, field name and value pairs, and extraneous metadata such as source and destination. The user may only choose a single type: field names, field values, and the body container will all use the same string type. This limits the utility of the customization point. The library's use of the string trait is limited to selecting between std::string and std::wstring. We do not find this use-case compelling given the limitations.
- [...] boost::network::http namespace. The way the traits are used in the library limits the usefulness of the traits to trivial purposes.
The design of the message container in this library is cumbersome with its system of customization using trait specializations. The use of these customizations is extremely limited due to the way they are used in the container declaration, making the design overly complex without corresponding benefit.
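To make the single-string-type limitation concrete, here is a minimal sketch of the alternative: a message whose field container and body are independent template parameters. Every name in it (`sketch::message`, `field_map`, `text_message`, `binary_message`, `body_size`) is hypothetical and chosen for illustration; none of it is cpp-netlib or Beast code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <type_traits>
#include <vector>

namespace sketch {

// Hypothetical message: the field container and the body are
// independent template parameters, so choosing one does not
// constrain the other (unlike a single string<Tag> trait).
template <class Fields, class Body>
struct message
{
    Fields fields;
    Body body;
};

using field_map = std::map<std::string, std::string>;

// Text fields combined with a text body.
using text_message = message<field_map, std::string>;

// The same text fields combined with a binary body.
using binary_message = message<field_map, std::vector<std::uint8_t>>;

static_assert(
    std::is_same<decltype(text_message::fields),
                 decltype(binary_message::fields)>::value,
    "the field type is chosen independently of the body type");

// Returns the number of body elements for any body that has size().
template <class Fields, class Body>
std::size_t body_size(message<Fields, Body> const& m)
{
    return m.body.size();
}

// Usage: two messages share a field type but differ in body type.
inline void traits_usage_example()
{
    text_message t;
    t.fields["Host"] = "example.com";
    t.body = "hello";
    assert(body_size(t) == 5);

    binary_message b;
    b.body = {0x01, 0x02, 0x03};
    assert(body_size(b) == 3);
}

} // namespace sketch
```

With this shape, no trait specialization in a foreign namespace is needed to vary one part of the message; the user simply names the types at the point of use.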
boost.http is a library resulting from the 2014 Google Summer of Code. It was submitted for a Boost formal review and rejected in 2015. It is based on Boost.Asio, and development on the library has continued to the present. As of the branch previously linked, it uses these message declarations:
    template<class Headers, class Body>
    struct basic_message
    {
        typedef Headers headers_type;
        typedef Body body_type;

        headers_type &headers();
        const headers_type &headers() const;

        body_type &body();
        const body_type &body() const;

        headers_type &trailers();
        const headers_type &trailers() const;

    private:
        headers_type headers_;
        body_type body_;
        headers_type trailers_;
    };

    typedef basic_message<boost::http::headers, std::vector<std::uint8_t>> message;

    template<class Headers, class Body>
    struct is_message<basic_message<Headers, Body>>: public std::true_type {};
- This container cannot model a complete message. The start-line items (method and target for requests, reason-phrase for responses) are communicated out of band, as is the http-version. A function that operates on the message including the start line requires additional parameters. This is evident in one of the example programs. The 500 and "OK" arguments represent the response status-code and reason-phrase respectively:

      ...
      http::message reply;
      ...
      self->socket.async_write_response(500, string_ref("OK"), reply, yield);

- headers_, body_, and trailers_ may only be default-constructed, since there are no explicitly declared constructors.
- [...] std::vector is a model of Body. More formally, that a body is represented by the ForwardRange concept whose value_type is an 8-bit integer. This representation is less than ideal, considering that the library is built on Boost.Asio. Adapting a DynamicBuffer to the required forward range destroys information conveyed by the ConstBufferSequence and MutableBufferSequence used in dynamic buffers. The consequence is that Boost.HTTP implementations will be less efficient when dealing with body containers than an equivalent Networking.TS conforming implementation.
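The loss of information described in the last item can be sketched without any networking library. The example below uses hypothetical stand-ins (`const_buffer`, `buffer_sequence`, `append_by_region`, `append_by_char`), not the real Boost.Asio types, and it counts operations rather than measuring time. It shows only that region boundaries allow one bulk operation per buffer, while a character-at-a-time range hides them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a const buffer: a pointer and a size
// describing one contiguous region of memory.
using const_buffer = std::pair<char const*, std::size_t>;

// Hypothetical stand-in for a buffer sequence: several contiguous
// regions that together hold the body.
using buffer_sequence = std::vector<const_buffer>;

// Consumes the sequence region by region: one bulk append per buffer.
// Returns the number of append operations performed.
inline std::size_t append_by_region(buffer_sequence const& bs, std::string& out)
{
    std::size_t ops = 0;
    for (auto const& b : bs)
    {
        out.append(b.first, b.second);
        ++ops;
    }
    return ops;
}

// Consumes the same data as a forward range of char: the region
// boundaries are not visible, so every octet is handled individually.
// Returns the number of append operations performed.
inline std::size_t append_by_char(buffer_sequence const& bs, std::string& out)
{
    std::size_t ops = 0;
    for (auto const& b : bs)
    {
        for (std::size_t i = 0; i < b.second; ++i)
        {
            out.push_back(b.first[i]);
            ++ops;
        }
    }
    return ops;
}

// Usage: both functions produce the same bytes with different op counts.
inline void buffer_usage_example()
{
    char const part1[] = "Hello, ";
    char const part2[] = "world";
    buffer_sequence bs{{part1, 7}, {part2, 5}};

    std::string by_region;
    std::string by_char;
    assert(append_by_region(bs, by_region) == 2);
    assert(append_by_char(bs, by_char) == 12);
    assert(by_region == by_char);
}

} // namespace sketch
```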
This representation addresses a narrow range of use cases. It has limited potential for customization and performance. It is more difficult to use because it excludes the start line fields from the model.
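For contrast, the following is a minimal sketch of a response type that keeps the start-line items inside the message, so that a function operating on the whole message needs no extra parameters. The names `sketch::response` and `to_header_string` are hypothetical. This is an illustration of the idea under simplified assumptions (for example, header fields held in a `std::map`), not the API of boost.http or Beast.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {

// Hypothetical response type that stores the start-line items
// (http-version, status-code, reason-phrase) in the message itself.
struct response
{
    int version_major = 1;
    int version_minor = 1;
    unsigned status = 200;
    std::string reason = "OK";
    std::map<std::string, std::string> fields;
    std::string body;
};

// Because the start line is part of the model, serializing the
// header requires only the message.
inline std::string to_header_string(response const& res)
{
    std::string out = "HTTP/";
    out += std::to_string(res.version_major);
    out += ".";
    out += std::to_string(res.version_minor);
    out += " ";
    out += std::to_string(res.status);
    out += " ";
    out += res.reason;
    out += "\r\n";
    for (auto const& f : res.fields)
        out += f.first + ": " + f.second + "\r\n";
    out += "\r\n"; // blank line that ends the header section
    return out;
}

// Usage: the status-code and reason-phrase travel with the message.
inline void start_line_usage_example()
{
    response res;
    res.status = 500;
    res.reason = "Internal Server Error";
    res.fields["Content-Length"] = "0";
    assert(to_header_string(res) ==
        "HTTP/1.1 500 Internal Server Error\r\n"
        "Content-Length: 0\r\n"
        "\r\n");
}

} // namespace sketch
```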
cpprestsdk is a Microsoft project which "...aims to help C++ developers connect to and interact with services". It offers the most functionality of the libraries reviewed here, including support for WebSocket services using its websocket++ dependency. It can use native APIs such as HTTP.SYS when building Windows-based applications, and it can use Boost.Asio. The WebSocket module uses Boost.Asio exclusively.
As cpprestsdk is developed by a large corporation, it contains quite a bit of functionality and necessarily has more interfaces. We will break down the interfaces used to model messages into more manageable pieces. This is the container used to store the HTTP header fields:
    class http_headers
    {
    public:
        ...

    private:
        std::map<utility::string_t, utility::string_t, _case_insensitive_cmp> m_headers;
    };
This declaration is quite bare-bones. We note the typical problems of most field containers:
Now we analyze the structure of the larger message container. The library uses a handle/body idiom. There are two public message container interfaces, one for requests (http_request) and one for responses (http_response). Each interface maintains a private shared pointer to an implementation class. Public member function calls are routed to the internal implementation. This is the first implementation class, which forms the base class for both the request and response implementations:
    namespace details {

    class http_msg_base
    {
    public:
        http_headers &headers() { return m_headers; }

        _ASYNCRTIMP void set_body(const concurrency::streams::istream &instream, const utf8string &contentType);

        /// Set the stream through which the message body could be read
        void set_instream(const concurrency::streams::istream &instream) { m_inStream = instream; }

        /// Set the stream through which the message body could be written
        void set_outstream(const concurrency::streams::ostream &outstream, bool is_default)
        {
            m_outStream = outstream;
            m_default_outstream = is_default;
        }

        const pplx::task_completion_event<utility::size64_t> & _get_data_available() const { return m_data_available; }

    protected:
        /// Stream to read the message body.
        concurrency::streams::istream m_inStream;

        /// stream to write the msg body
        concurrency::streams::ostream m_outStream;

        http_headers m_headers;
        bool m_default_outstream;

        /// <summary> The TCE is used to signal the availability of the message body. </summary>
        pplx::task_completion_event<utility::size64_t> m_data_available;
    };
To understand these declarations we first need to understand that cpprestsdk uses the asynchronous model defined by Microsoft's Concurrency Runtime. Identifiers from the pplx namespace define common asynchronous patterns such as tasks and events. The concurrency::streams::istream parameter and the m_data_available data member indicate a lack of separation of concerns. The representation of HTTP messages should not be conflated, in the message declarations, with the asynchronous model used to serialize or parse those messages.
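One way to picture the separation is a message that is plain data, with the I/O mechanism supplied separately by the caller. The sketch below is a simplified illustration with hypothetical names (`sketch::request`, `write_request`). It is synchronous, and it assumes the caller provides any callable that accepts a `std::string const&`. It is neither cpprestsdk nor Beast code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical request type: data only. It holds no streams, tasks,
// events, or other state belonging to an asynchronous model.
struct request
{
    std::string method;
    std::string target;
    std::map<std::string, std::string> fields;
    std::string body;
};

// Serialization lives outside the message. The transport is whatever
// callable the caller passes in; the message type never sees it.
template <class WriteFn>
void write_request(request const& req, WriteFn&& write)
{
    write(req.method + " " + req.target + " HTTP/1.1\r\n");
    for (auto const& f : req.fields)
        write(f.first + ": " + f.second + "\r\n");
    write("\r\n");
    write(req.body);
}

// Usage: the same message is written to an in-memory string here,
// but the callable could equally forward to a socket or a file.
inline void separation_usage_example()
{
    request req;
    req.method = "POST";
    req.target = "/submit";
    req.fields["Host"] = "example.com";
    req.fields["Content-Length"] = "2";
    req.body = "hi";

    std::string wire;
    write_request(req, [&wire](std::string const& s) { wire += s; });
    assert(wire ==
        "POST /submit HTTP/1.1\r\n"
        "Content-Length: 2\r\n"
        "Host: example.com\r\n"
        "\r\n"
        "hi");
}

} // namespace sketch
```

Because the message carries no knowledge of how it is sent, the same type can be reused with a different asynchronous model without changing its declaration.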
The next declaration forms the complete implementation class referenced by the handle in the public interface (which follows after):
    /// Internal representation of an HTTP request message.
    class _http_request final : public http::details::http_msg_base, public std::enable_shared_from_this<_http_request>
    {
    public:
        _ASYNCRTIMP _http_request(http::method mtd);

        _ASYNCRTIMP _http_request(std::unique_ptr<http::details::_http_server_context> server_context);

        http::method &method() { return m_method; }

        const pplx::cancellation_token &cancellation_token() const { return m_cancellationToken; }

        _ASYNCRTIMP pplx::task<void> reply(const http_response &response);

    private:
        // Actual initiates sending the response, without checking if a response has already been sent.
        pplx::task<void> _reply_impl(http_response response);

        http::method m_method;

        std::shared_ptr<progress_handler> m_progress_handler;
    };

    } // namespace details
As before, we note that the implementation class for HTTP requests concerns itself more with the mechanics of sending the message asynchronously than it does with actually modeling the HTTP message as described in RFC 7230:

- The constructor taking std::unique_ptr<http::details::_http_server_context> breaks encapsulation and separation of concerns. This cannot be extended for user defined server contexts.
- The _reply_impl function implies that the message implementation also shares responsibility for the means of sending back an HTTP reply. It would be better if this were completely separate from the message container.
Finally, here is the public class which represents an HTTP request:
    class http_request
    {
    public:
        const http::method &method() const { return _m_impl->method(); }

        void set_method(const http::method &method) const { _m_impl->method() = method; }

        /// Extract the body of the request message as a string value, checking that the content type is a MIME text type.
        /// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
        pplx::task<utility::string_t> extract_string(bool ignore_content_type = false)
        {
            auto impl = _m_impl;
            return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->extract_string(ignore_content_type); });
        }

        /// Extracts the body of the request message into a json value, checking that the content type is application/json.
        /// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
        pplx::task<json::value> extract_json(bool ignore_content_type = false) const
        {
            auto impl = _m_impl;
            return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->_extract_json(ignore_content_type); });
        }

        /// Sets the body of the message to the contents of a byte vector. If the 'Content-Type'
        void set_body(const std::vector<unsigned char> &body_data);

        /// Defines a stream that will be relied on to provide the body of the HTTP message when it is
        /// sent.
        void set_body(const concurrency::streams::istream &stream, const utility::string_t &content_type = _XPLATSTR("application/octet-stream"));

        /// Defines a stream that will be relied on to hold the body of the HTTP response message that
        /// results from the request.
        void set_response_stream(const concurrency::streams::ostream &stream)
        {
            return _m_impl->set_response_stream(stream);
        }

        /// Defines a callback function that will be invoked for every chunk of data uploaded or downloaded
        /// as part of the request.
        void set_progress_handler(const progress_handler &handler);

    private:
        friend class http::details::_http_request;
        friend class http::client::http_client;

        std::shared_ptr<http::details::_http_request> _m_impl;
    };
It is clear from this declaration that the goal of the message model in this library is driven by its use-case (interacting with REST servers) and not to model HTTP messages generally. We note problems similar to the other declarations:
- [...] concurrency::streams::istream and concurrency::streams::ostream reference parameters. Presumably, these are abstract interfaces which may be subclassed by users to achieve custom behaviors.
- [...] concurrency::streams::istream. No user defined types are possible.
- [...] set_response_stream member). Again this is likely purpose-driven but the lack of separation of concerns limits this library to only the uses explicitly envisioned by the authors.
The general theme of the HTTP message model in cpprestsdk is "no user definable customizations". There is no allocator support, and no separation of concerns. It is designed to perform a specific set of behaviors. In other words, it does not follow the open/closed principle.
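As an illustration of what a user-definable customization point could look like, the sketch below parameterizes a message on a body type, so that a new kind of body can be added without modifying the message template. All names (`sketch::body_message`, `string_body`, `chunk_list_body`, `payload_size`) are hypothetical. The example is deliberately simplified and does not reproduce any library's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical message template: closed for modification, but open
// for extension through the Body parameter. A Body supplies a
// value_type and a static size() function.
template <class Body>
struct body_message
{
    typename Body::value_type body;

    std::size_t payload_size() const { return Body::size(body); }
};

// A body stored in a single string.
struct string_body
{
    using value_type = std::string;

    static std::size_t size(value_type const& v) { return v.size(); }
};

// A user-defined body, added without touching body_message:
// the payload is kept as a list of separately stored chunks.
struct chunk_list_body
{
    using value_type = std::vector<std::string>;

    static std::size_t size(value_type const& v)
    {
        std::size_t total = 0;
        for (auto const& chunk : v)
            total += chunk.size();
        return total;
    }
};

// Usage: the same message template works with both body types.
inline void body_usage_example()
{
    body_message<string_body> m;
    m.body = "abc";
    assert(m.payload_size() == 3);

    body_message<chunk_list_body> c;
    c.body = {"ab", "cde"};
    assert(c.payload_size() == 5);
}

} // namespace sketch
```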
Tasks in the Concurrency Runtime operate in a fashion similar to std::future, but with some improvements such as continuations which are not yet in the C++ standard. The cost of using a task-based asynchronous interface instead of completion handlers is well documented: synchronization points along the call chain of composed task operations which cannot be optimized away. See: A Universal Model for Asynchronous Operations (Kohlhoff).