When designing the Boost.Interprocess architecture, I followed some basic guidelines that can be summarized by these points:
Boost.Interprocess is built on three basic classes: a memory algorithm, a segment manager and a managed memory segment:
The memory algorithm is an object that is placed in the first bytes of a shared memory/memory mapped file segment. The memory algorithm can return portions of that segment to users, marking them as used, and the user can return those portions to the memory algorithm so that the memory algorithm marks them as free again. There is one exception: some bytes just beyond the end of the memory algorithm object are reserved and can't be used for this dynamic allocation. This "reserved" zone is used to place other additional objects in a well-known place.
To sum up, a memory algorithm has the same mission as malloc/free of the standard C library, but it can only return portions of the segment where it is placed. The layout of a memory segment would be:
Layout of the memory segment:
 ____________ __________ ____________________________________________
|            |          |                                            |
|   memory   | reserved | The memory algorithm will return portions  |
| algorithm  |          | of the rest of the segment.                |
|____________|__________|____________________________________________|
The memory algorithm takes care of memory synchronization, just as malloc/free guarantee that two threads can call malloc/free at the same time. This is usually achieved by placing a process-shared mutex as a member of the memory algorithm. Keep in mind that the memory algorithm knows nothing about the segment (whether it is shared memory, a memory mapped file, etc.). For the memory algorithm, the segment is just a fixed-size memory buffer.
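The idea of an allocator object that lives in the first bytes of the buffer it manages can be sketched with a toy class. This is only an illustration under simplifying assumptions, not the library's interface: `toy_memory_algorithm` and `create_in` are hypothetical names, `std::mutex` stands in for a process-shared mutex, and deallocation is left out so the sketch never reuses memory.

```cpp
#include <cstddef>
#include <mutex>
#include <new>

// Hypothetical toy, not a Boost.Interprocess class: a bump allocator that is
// constructed in the first bytes of the buffer it hands out portions of.
class toy_memory_algorithm {
public:
    // segment_size: total size of the buffer this object was constructed in.
    // reserved_bytes: zone right after this object that is never handed out.
    toy_memory_algorithm(std::size_t segment_size, std::size_t reserved_bytes)
        : segment_size_(segment_size),
          next_offset_(round_up(sizeof(toy_memory_algorithm) + reserved_bytes)) {}

    // Returns a portion of the segment, or nullptr when nothing fits.
    void* allocate(std::size_t bytes) {
        std::lock_guard<std::mutex> guard(mutex_);  // two threads may call this at once
        if (next_offset_ > segment_size_ || bytes > segment_size_ - next_offset_)
            return nullptr;
        // Only offsets from "this" are stored, never absolute addresses.
        void* portion = reinterpret_cast<unsigned char*>(this) + next_offset_;
        next_offset_ += round_up(bytes);
        return portion;
    }

private:
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }

    std::mutex  mutex_;         // assumption: stand-in for a process-shared mutex
    std::size_t segment_size_;
    std::size_t next_offset_;   // offset of the first byte not yet handed out
};

// Places the toy algorithm at the start of "buffer", which must be aligned
// for std::max_align_t and larger than the object itself.
inline toy_memory_algorithm* create_in(void* buffer, std::size_t size,
                                       std::size_t reserved_bytes) {
    return new (buffer) toy_memory_algorithm(size, reserved_bytes);
}
```

The class never learns whether the buffer is shared memory, a mapped file or a plain array; it only sees a size, which mirrors the separation described above.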
The memory algorithm is also a configuration point for the rest of the Boost.Interprocess framework since it defines two basic types as member typedefs:
typedef /*implementation dependent*/ void_pointer;
typedef /*implementation dependent*/ mutex_family;
The void_pointer typedef defines the pointer type that will be used in the Boost.Interprocess framework (segment manager, allocators, containers). If the memory algorithm is ready to be placed in a shared memory/mapped file mapped in different base addresses, this pointer type will be defined as offset_ptr<void> or a similar relative pointer. If the memory algorithm will be used just with fixed address mapping, void_pointer can be defined as void*.
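To see why a relative pointer survives a mapping at a different base address, here is a minimal sketch. `toy_offset_ptr` is a hypothetical stand-in written for this example, not the real offset_ptr: it stores only the distance from itself to its target, and it uses an offset of 0 to mean null (so it cannot point to itself). Copying the whole buffer to another address imitates a second process mapping the segment elsewhere.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical relative pointer: remembers the distance from itself to the
// target instead of the target's absolute address.
struct toy_offset_ptr {
    std::ptrdiff_t offset = 0;  // 0 means null in this toy

    void set(const void* target) {
        offset = target ? reinterpret_cast<const char*>(target) -
                          reinterpret_cast<const char*>(this)
                        : 0;
    }
    void* get() const {
        return offset ? const_cast<char*>(reinterpret_cast<const char*>(this)) + offset
                      : nullptr;
    }
};

struct node {
    toy_offset_ptr next;
    int value;
};

// Builds two linked nodes in one buffer, copies the raw bytes to a second
// buffer (a different "base address") and follows the link in the copy.
// Returns the value reached through the copied link, or -1 if the link does
// not land inside the copy.
int value_seen_through_copy() {
    alignas(node) unsigned char first_mapping[2 * sizeof(node)];
    alignas(node) unsigned char second_mapping[2 * sizeof(node)];

    node* first  = new (first_mapping) node{};
    node* second = new (first_mapping + sizeof(node)) node{};
    second->value = 42;
    first->next.set(second);

    std::memcpy(second_mapping, first_mapping, sizeof first_mapping);

    node* copied_first  = reinterpret_cast<node*>(second_mapping);
    node* copied_second = static_cast<node*>(copied_first->next.get());
    node* expected      = reinterpret_cast<node*>(second_mapping + sizeof(node));
    return copied_second == expected ? copied_second->value : -1;
}
```

A raw `node*` stored in the first buffer would still point into the first buffer after the copy, which is exactly the problem a relative void_pointer avoids.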
The rest of the interface of a Boost.Interprocess memory algorithm is described in the Writing a new shared memory allocation algorithm section. As memory algorithm examples, you can see the implementations of the simple_seq_fit or rbtree_best_fit classes.
The segment manager is an object also placed in the first bytes of the managed memory segment (shared memory, memory mapped file) that offers more sophisticated services built above the memory algorithm. How can both the segment manager and the memory algorithm be placed at the beginning of the segment? Because the segment manager owns the memory algorithm: the memory algorithm is embedded in the segment manager:
The layout of managed memory segment:
 _______ _________________
|       |         |       |
| some  | memory  | other |<- The memory algorithm considers
|members|algorithm|members|   "other members" as reserved memory, so
|_______|_________|_______|   it does not use it for dynamic allocation.
|_________________________|____________________________________________
|                         |                                            |
|    segment manager      |  The memory algorithm will return portions |
|                         |  of the rest of the segment.               |
|_________________________|____________________________________________|
The segment manager initializes the memory algorithm and tells the memory algorithm that it should not use the memory where the rest of the segment manager's members are placed for dynamic allocations. The other members of the segment manager are a recursive mutex (defined by the memory algorithm's mutex_family::recursive_mutex typedef member) and two indexes (maps): one to implement named allocations, and another one to implement "unique instance" allocations.
typeid(T).name() operation.
The memory needed to store [name pointer, object information] pairs in the index is also allocated via the memory algorithm, so we can say that internal indexes are just like ordinary user objects built in the segment. The rest of the memory needed to store the name of the object, the object itself, and meta-data for destruction/deallocation is allocated using the memory algorithm in a single allocate() call.
As seen, the segment manager knows nothing about shared memory/memory mapped files. The segment manager itself does not allocate portions of the segment, it just asks the memory algorithm to allocate the needed memory from the rest of the segment. The segment manager is a class built above the memory algorithm that offers named object construction, unique instance constructions, and many other services.
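The layering described above, a named-object service that only asks an underlying allocator for memory, can be sketched as follows. Everything here is hypothetical and simplified: `toy_segment_manager` is not the library's class, `std::malloc`/`std::free` stand in for the memory algorithm, the index is an ordinary `std::map` on the heap (in the library the index itself is allocated through the memory algorithm), there is no locking, and over-aligned types are not supported. What it does keep is the single allocation that holds the meta-data, the object and the name, with the index key pointing at the stored name.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <map>
#include <new>
#include <utility>

// Hypothetical toy, not boost::interprocess::segment_manager.
class toy_segment_manager {
public:
    // Builds a T under "name"; returns nullptr if the name is taken or memory runs out.
    template <class T, class... Args>
    T* construct(const char* name, Args&&... args) {
        if (index_.find(name) != index_.end()) return nullptr;
        const std::size_t name_bytes = std::strlen(name) + 1;
        // One allocation for [header | object | name].
        char* block = static_cast<char*>(
            std::malloc(header_bytes() + sizeof(T) + name_bytes));
        if (!block) return nullptr;
        new (block) header{sizeof(T)};
        T* object = new (block + header_bytes()) T(std::forward<Args>(args)...);
        char* stored_name = block + header_bytes() + sizeof(T);
        std::memcpy(stored_name, name, name_bytes);
        index_[stored_name] = object;  // the key is a pointer to the stored name
        return object;
    }

    template <class T>
    T* find(const char* name) const {
        auto it = index_.find(name);
        return it == index_.end() ? nullptr : static_cast<T*>(it->second);
    }

    // Size recorded in the block's meta-data, or 0 if the name is unknown.
    std::size_t stored_size(const char* name) const {
        auto it = index_.find(name);
        if (it == index_.end()) return 0;
        const char* block = static_cast<const char*>(it->second) - header_bytes();
        return reinterpret_cast<const header*>(block)->object_size;
    }

    template <class T>
    bool destroy(const char* name) {
        auto it = index_.find(name);
        if (it == index_.end()) return false;
        T* object = static_cast<T*>(it->second);
        index_.erase(it);  // erase first: the key lives inside the block
        object->~T();
        std::free(reinterpret_cast<char*>(object) - header_bytes());
        return true;
    }

private:
    struct header { std::size_t object_size; };
    struct cstr_less {
        bool operator()(const char* a, const char* b) const {
            return std::strcmp(a, b) < 0;
        }
    };
    // Header size rounded up so the object that follows is suitably aligned.
    static constexpr std::size_t header_bytes() {
        return (sizeof(header) + alignof(std::max_align_t) - 1) /
               alignof(std::max_align_t) * alignof(std::max_align_t);
    }

    std::map<const char*, void*, cstr_less> index_;
};
```

Objects that are still registered when the toy manager goes away are simply leaked, since it does not remember their types.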
The segment manager is implemented in Boost.Interprocess by the segment_manager class.
template<class CharType
        ,class MemoryAlgorithm
        ,template<class IndexConfig> class IndexType>
class segment_manager;
As seen, the segment manager is quite generic: we can specify the character type used to identify named objects, the memory algorithm that will dynamically manage the portions of the memory segment, and the index type that will store the [name pointer, object information] mapping. We can construct our own index types as explained in the Building custom indexes section.
Boost.Interprocess managed memory segments are the classes that construct the shared memory/memory mapped file, place the segment manager there and forward user requests to the segment manager. For example, basic_managed_shared_memory is a Boost.Interprocess managed memory segment that works with shared memory, basic_managed_mapped_file works with memory mapped files, etc.
Basically, the interface of a Boost.Interprocess managed memory segment is the same as that of the segment manager, but it also offers functions to "open", "create", or "open or create" shared memory/memory mapped file segments and initialize all needed resources. Managed memory segment classes are not built in shared memory or memory mapped files; they are normal C++ classes that store a pointer to the segment manager (which is built in shared memory or memory mapped files).
Apart from this, managed memory segments offer specific functions: managed_mapped_file offers functions to flush memory contents to the file, managed_heap_memory offers functions to expand the memory, etc.
Most of the functions of Boost.Interprocess managed memory segments can be shared between all managed memory segments, since many times they just forward the functions to the segment manager. Because of this, in Boost.Interprocess all managed memory segments derive from a common class that implements memory-independent (shared memory, memory mapped files) functions: boost::interprocess::ipcdetail::basic_managed_memory_impl
Deriving from this class, Boost.Interprocess implements several managed memory classes, for different memory backends:
basic_managed_shared_memory (for shared memory).
basic_managed_mapped_file (for memory mapped files).
basic_managed_heap_memory (for heap allocated memory).
basic_managed_external_buffer (for user provided external buffer).
The Boost.Interprocess STL-like allocators are fairly simple and follow the usual C++ allocator approach. Normally, allocators for STL containers are built on top of the new/delete operators and, above those, they implement pools, arenas and other allocation tricks.
In Boost.Interprocess allocators, the approach is similar, but all allocators are based on the segment manager. The segment manager is the single provider of every service, from simple memory allocation to named object creation. Boost.Interprocess allocators always store a pointer to the segment manager, so that they can obtain memory from the segment or share a common pool between allocators.
As you can imagine, the member pointers of the allocator are not raw pointers, but pointer types defined by the segment_manager::void_pointer type. Apart from this, the pointer typedef of Boost.Interprocess allocators is also of the same kind as segment_manager::void_pointer.
This means that if our allocation algorithm defines void_pointer as offset_ptr<void>, boost::interprocess::allocator<T> will store an offset_ptr<segment_manager> to point to the segment manager and the boost::interprocess::allocator<T>::pointer type will be offset_ptr<T>. This way, Boost.Interprocess allocators can be placed in the memory segment managed by the segment manager, that is, shared memory, memory mapped files, etc.
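The "allocator that only stores a pointer to its memory source" pattern can be shown with a minimal standard-conforming allocator. This is a sketch under assumptions, not boost::interprocess::allocator: `toy_arena` is a hypothetical stand-in for the segment manager, the stored pointer is a raw pointer (the library would store a relative pointer derived from void_pointer), and the arena never reuses memory.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stand-in for the segment manager: a fixed buffer handed out
// front to back and never reused.
struct toy_arena {
    alignas(std::max_align_t) unsigned char storage[4096];
    std::size_t used = 0;
    std::size_t allocations = 0;

    void* allocate(std::size_t bytes) {
        const std::size_t a = alignof(std::max_align_t);
        const std::size_t needed = (bytes + a - 1) / a * a;
        if (needed > sizeof(storage) - used) throw std::bad_alloc();
        void* p = storage + used;
        used += needed;
        ++allocations;
        return p;
    }
    void deallocate(void*, std::size_t) {}  // this toy never gives memory back
};

// Minimal allocator whose only state is a pointer to the arena it draws from.
template <class T>
class toy_allocator {
public:
    using value_type = T;

    explicit toy_allocator(toy_arena* arena) noexcept : arena_(arena) {}
    template <class U>
    toy_allocator(const toy_allocator<U>& other) noexcept : arena_(other.arena()) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(arena_->allocate(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) noexcept {
        arena_->deallocate(p, n * sizeof(T));
    }
    toy_arena* arena() const noexcept { return arena_; }

private:
    toy_arena* arena_;  // assumption: raw pointer; the library uses a void_pointer-based type
};

// Two allocators are interchangeable exactly when they share the same arena.
template <class T, class U>
bool operator==(const toy_allocator<T>& a, const toy_allocator<U>& b) noexcept {
    return a.arena() == b.arena();
}
template <class T, class U>
bool operator!=(const toy_allocator<T>& a, const toy_allocator<U>& b) noexcept {
    return !(a == b);
}

// Fills a vector whose buffer comes from the arena and returns the sum 0 + ... + 99.
int sum_of_vector_built_in(toy_arena& arena) {
    toy_allocator<int> alloc(&arena);
    std::vector<int, toy_allocator<int>> values(alloc);
    for (int i = 0; i < 100; ++i) values.push_back(i);
    int sum = 0;
    for (int v : values) sum += v;
    return sum;
}
```

Because the allocator's state is just that one pointer, swapping the raw pointer for a relative one is what would let such an allocator itself live inside the segment.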
Segregated storage pools are simple and follow the classic segregated storage algorithm.
The pool is implemented by the private_node_pool and shared_node_pool classes.
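For readers who have not met the classic segregated storage algorithm, here is a minimal sketch of it. `toy_node_pool` is a hypothetical class written for this example and does not reflect how private_node_pool or shared_node_pool are implemented: it carves one fixed chunk into equally sized nodes and threads the free ones into a singly linked list, so allocation and deallocation are a pointer pop and push. A real pool would ask its memory source for another chunk when the list runs dry; this toy just returns nullptr.

```cpp
#include <cstddef>

// Hypothetical toy pool of NodeCount nodes of at least NodeSize bytes each.
template <std::size_t NodeSize, std::size_t NodeCount>
class toy_node_pool {
public:
    toy_node_pool() {
        // Thread every node of the chunk into the free list.
        for (std::size_t i = 0; i < NodeCount; ++i) {
            slots_[i].next = free_list_;
            free_list_ = &slots_[i];
        }
    }

    // Pops a node from the free list; nullptr when the pool is exhausted.
    void* allocate() {
        if (!free_list_) return nullptr;
        slot* s = free_list_;
        free_list_ = s->next;
        return s;
    }

    // Pushes a node previously returned by allocate() back onto the free list.
    void deallocate(void* p) {
        slot* s = static_cast<slot*>(p);
        s->next = free_list_;
        free_list_ = s;
    }

private:
    // A free node stores the link to the next free node in its own bytes.
    union slot {
        slot* next;
        alignas(std::max_align_t) unsigned char bytes[NodeSize];
    };

    slot  slots_[NodeCount];
    slot* free_list_ = nullptr;
};
```

The free list costs no extra memory because a node only needs the link while nobody is using it.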
Adaptive pools are a variation of segregated lists but they have a more complicated approach:
The adaptive pool is implemented by the private_adaptive_node_pool and adaptive_node_pool classes.
Boost.Interprocess containers are standard-conforming counterparts of STL containers in the boost::interprocess namespace, but with these little details:
operator==() to know if this is possible.
pointer type defined by the allocator of the container. This allows placing containers in managed memory segments mapped in different base addresses.
This section tries to explain the performance characteristics of Boost.Interprocess, so that you can optimize Boost.Interprocess usage if you need more performance.
You can have two types of raw memory allocations with Boost.Interprocess classes:
allocate() and deallocate() functions of managed_shared_memory/managed_mapped_file... managed memory segments. This call is translated to a MemoryAlgorithm::allocate() function, which means that you will need just the time that the memory algorithm associated with the managed memory segment needs to allocate data.
boost::interprocess::allocator<...> with Boost.Interprocess containers. This allocator calls the same MemoryAlgorithm::allocate() function as the explicit method, every time a vector/string has to reallocate its buffer or every time you insert an object in a node container.
If you see that memory allocation is a bottleneck in your application, you have these alternatives:
flat_map family instead of the map family if you mainly do searches and the insertion/removal is mainly done in an initialization phase. The overhead then occurs when the ordered vector has to reallocate its storage and move data. You can also call the reserve() method of these containers when you know beforehand how much data you will insert. However, in these containers iterators are invalidated by insertions, so this substitution is only effective in some applications.
allocate() only when the pool runs out of nodes. This is pretty efficient (much more than the current default general-purpose algorithm) and this can save a lot of memory. See Segregated storage node allocators and Adaptive node allocators for more information.
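The trade-off behind the ordered vector alternative mentioned above can be made concrete with a small sketch. `toy_flat_map` is a hypothetical class for illustration only, not the flat_map implementation: it keeps key/value pairs sorted in a single `std::vector`, so a lookup is a binary search over contiguous memory, while an insertion may shift elements and, if capacity runs out, reallocate the whole buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sorted-vector map from int to int.
class toy_flat_map {
public:
    // One allocation up front instead of several while inserting.
    void reserve(std::size_t n) { items_.reserve(n); }

    // Inserts the pair if the key is new; returns false if the key exists.
    // Elements after the insertion point are shifted, which invalidates iterators.
    bool insert(int key, int value) {
        auto pos = lower_bound(key);
        if (pos != items_.end() && pos->first == key) return false;
        items_.insert(pos, std::make_pair(key, value));
        return true;
    }

    // Binary search; returns a pointer to the value or nullptr if absent.
    const int* find(int key) const {
        auto pos = std::lower_bound(
            items_.begin(), items_.end(), key,
            [](const std::pair<int, int>& item, int k) { return item.first < k; });
        return (pos != items_.end() && pos->first == key) ? &pos->second : nullptr;
    }

    std::size_t capacity() const { return items_.capacity(); }

private:
    std::vector<std::pair<int, int>>::iterator lower_bound(int key) {
        return std::lower_bound(
            items_.begin(), items_.end(), key,
            [](const std::pair<int, int>& item, int k) { return item.first < k; });
    }

    std::vector<std::pair<int, int>> items_;
};
```

With `reserve()` called before an initialization phase, the insertions in that phase need no further calls to the underlying allocator, which is the situation where this substitution pays off.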
Boost.Interprocess allows the same parallelism as two threads writing to a common structure, except when the user creates/searches named/unique objects. The steps when creating a named object are these:
insert() function in the index. So the time this requires is dependent on the index type (ordered vector, tree, hash...). This can require a call to the memory algorithm allocation function if the index has to be reallocated, it's a node allocator, uses pooled allocations...
The steps when destroying a named object using the name of the object (destroy<T>(name)) are these:
find(const key_type &) and erase(iterator) members of the index. This can require element reordering if the index is a balanced tree, an ordered vector...
The steps when destroying a named object using the pointer of the object (destroy_ptr(T *ptr)) are these:
boost::interprocess::is_node_index specialization): Take the iterator stored near the object and call erase(iterator). This can require element reordering if the index is a balanced tree, an ordered vector...
If you see that the performance is not good enough you have these alternatives: