std::numeric_limits<> constants

is_specialized

true for all arithmetic types (integer, floating-point and fixed-point) for which std::numeric_limits<T> is specialized.

A typical test is

if (std::numeric_limits<T>::is_specialized == false)
{
  std::cout << "type " << typeid(T).name()  << " is not specialized for std::numeric_limits!" << std::endl;
// ...
}

Typically numeric_limits<T>::is_specialized is true for all T where the compile-time constant members of numeric_limits are indeed known at compile time, and don't vary at runtime. For example floating-point types with runtime-variable precision such as mpfr_float have no numeric_limits specialization as it would be impossible to define all the members at compile time. In contrast the precision of a type such as mpfr_float_50 is known at compile time, and so it does have a numeric_limits specialization.

Note that not all the std::numeric_limits member constants and functions are meaningful for all user-defined types (UDT), such as the decimal and binary multiprecision types provided here. More information on this is given in the sections below.
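Because is_specialized is a compile-time constant, the runtime test shown earlier can also be written as a static_assert, so that an unsupported type is rejected when the code is compiled. The following is a minimal sketch using only the standard library; the helper name checked_max is hypothetical and chosen purely for illustration.

```cpp
#include <limits>
#include <string>

// Hypothetical helper: returns the largest finite value of T, and fails to
// compile when instantiated for a type without a numeric_limits specialization.
template <typename T>
constexpr T checked_max()
{
    static_assert(std::numeric_limits<T>::is_specialized,
                  "std::numeric_limits<T> is not specialized for this type");
    return (std::numeric_limits<T>::max)();
}

// Built-in arithmetic types are specialized; an arbitrary class type is not.
static_assert(std::numeric_limits<int>::is_specialized, "int is specialized");
static_assert(!std::numeric_limits<std::string>::is_specialized,
              "std::string is not an arithmetic type");
```

A call such as checked_max<double>() compiles, whereas checked_max<std::string>() would stop the build with the message given in the static_assert.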

infinity

For floating-point types, ∞ is defined wherever possible, but clearly infinity is meaningless for arbitrary-precision arithmetic backends, and there is one floating-point type (GMP's mpf_t, see gmp_float) which has no notion of infinity or NaN at all.

A typical test whether infinity is implemented is

if(std::numeric_limits<T>::has_infinity)
{
   std::cout << std::numeric_limits<T>::infinity() << std::endl;
}

and using tests like this is strongly recommended to improve portability.

If the backend is switched to a type that does not support infinity then, without checks like this, infinity() returns a meaningless value (zero for the built-in integer types) instead of reporting an error.
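One way to keep generic code portable across backends is to fall back to the largest finite value when no infinity exists. The sketch below illustrates that idea with the standard library only; upper_sentinel is a hypothetical name, not part of any library.

```cpp
#include <limits>

// Hypothetical helper: a value that compares greater than or equal to every
// other value of T. Uses infinity when T has one, otherwise the largest
// finite value.
template <typename T>
constexpr T upper_sentinel()
{
    return std::numeric_limits<T>::has_infinity
               ? std::numeric_limits<T>::infinity()
               : (std::numeric_limits<T>::max)();
}
```

For double this yields infinity, while for int it yields the largest int, so code that initializes a running minimum with upper_sentinel<T>() behaves sensibly for both.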

is_signed

std::numeric_limits<T>::is_signed == true if the type T is signed.

For built-in binary types, the sign is held in a single bit, but for other types (cpp_dec_float and cpp_bin_float) it may be a separate storage element, usually bool.

is_exact

std::numeric_limits<T>::is_exact == true if type T uses exact representations.

This is defined as true for all integer types and false for floating-point types.

A usable definition has been discussed.

ISO/IEC 10967-1, Language independent arithmetic, noted by the C++ Standard defines

A floating-point type F shall be a finite subset of ℜ.

The important practical distinction is that all integers (up to max()) can be stored exactly.

Rational types using two integer types are also exact.

Floating-point types cannot store all real values (those in the set of ℜ) exactly. For example, 0.5 can be stored exactly in a binary floating-point, but 0.1 cannot. What is stored is the nearest representable real value, that is, rounded to nearest.
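The effect of inexact storage can be seen with a small check. This is only a sketch: it assumes double is IEEE binary64 with round-to-nearest and no excess intermediate precision, and the helper name sum_is_exact is hypothetical.

```cpp
#include <limits>

// Integer types are exact; binary floating-point types are not.
static_assert(std::numeric_limits<int>::is_exact, "int is exact");
static_assert(!std::numeric_limits<double>::is_exact, "double is not exact");

// Hypothetical helper: true when a + b, computed in double, equals the
// expected value exactly.
constexpr bool sum_is_exact(double a, double b, double expected)
{
    return a + b == expected;
}
```

Under those assumptions sum_is_exact(0.5, 0.25, 0.75) is true, because all three values are sums of powers of two, whereas sum_is_exact(0.1, 0.2, 0.3) is false, because none of the three decimal values is stored exactly.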

Fixed-point types (usually decimal) are also defined as exact, in that they only store a fixed precision, so half cents or pennies (or less) cannot be stored. The results of computations are rounded up or down, just like the result of integer division stored as an integer result.

There are a number of proposals to add decimal floating-point support to C++.

Decimal TR.

And also C++ Binary Fixed-Point Arithmetic.

is_bounded

std::numeric_limits<T>::is_bounded == true if the set of values represented by the type T is finite.

This is true for all built-in integer, fixed and floating-point types, and most multi-precision types.

It is only false for a few arbitrary-precision types like cpp_int.

Rational and fixed-exponent representations are exact but not integer.

is_modulo

std::numeric_limits<T>::is_modulo is defined as true if adding two positive values of type T can yield a result less than either value.

is_modulo == true means that the type does not overflow, but, for example, 'wraps around' to zero, when adding one to the max() value.

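For built-in unsigned integer types, std::numeric_limits<>::is_modulo is true. For built-in signed integer types the value depends on the implementation, and many current implementations report false because signed overflow is undefined behaviour.

bool is never modulo.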

The modulo behaviour is sometimes useful, but also can be unexpected, and sometimes undesired, behaviour.

Overflow of signed integers can be especially unexpected, possibly causing change of sign.

Boost.Multiprecision integer type cpp_int is not modulo because, as an arbitrary-precision type, it expands to hold any value that the machine resources permit.

However fixed precision cpp_int's may be modulo if they are unchecked (i.e. they behave just like built in integers), but not if they are checked (overflow causes an exception to be raised).

Built-in and multi-precision floating-point types are normally not modulo.

Where possible, overflow is to std::numeric_limits<>::infinity(), provided std::numeric_limits<>::has_infinity == true.
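The wrap-around described for modulo types can be demonstrated with the built-in unsigned integers, for which it is well defined. The following sketch uses only the standard library; max_plus_one is a hypothetical helper name.

```cpp
#include <limits>

// Unsigned built-in types are modulo: arithmetic wraps around modulo 2^digits.
static_assert(std::numeric_limits<unsigned int>::is_modulo,
              "unsigned int wraps around");

// Hypothetical helper: adds one to the largest value of an unsigned type T.
// The cast back to T undoes integral promotion for types narrower than int.
template <typename T>
constexpr T max_plus_one()
{
    static_assert(std::numeric_limits<T>::is_integer && !std::numeric_limits<T>::is_signed,
                  "only meaningful for unsigned integer types");
    return static_cast<T>((std::numeric_limits<T>::max)() + 1u);
}
```

For every unsigned type other than bool the result is zero. The helper deliberately rejects signed types, since overflowing a signed built-in integer is undefined behaviour rather than a guaranteed wrap.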

radix

Constant std::numeric_limits<T>::radix returns either 2 (for built-in and binary types) or 10 (for decimal types).

digits

The number of radix digits that can be represented without change:

for integer types, the number of non-sign bits in the significand.
for floating types, the number of radix digits in the significand.

The values include any implicit bit, so for example, for the ubiquitous double using 64 bits (IEEE binary64), digits == 53, even though there are only 52 actual bits of the significand stored in the representation. The value of digits reflects the fact that there is one implicit bit which is always set to 1.

The Boost.Multiprecision binary types do not use an implicit bit, so the digits member reflects exactly how many bits of precision were requested:

typedef number<cpp_bin_float<53, digit_base_2> >   float64;
typedef number<cpp_bin_float<113, digit_base_2> >  float128;
static_assert(std::numeric_limits<float64>::digits == 53, "53 bits were requested");
static_assert(std::numeric_limits<float128>::digits == 113, "113 bits were requested");

For the most common case of radix == 2, std::numeric_limits<T>::digits is the number of bits in the representation, not counting any sign bit.

For a decimal integer type, when radix == 10, it is the number of decimal digits.
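For the built-in types, these rules can be checked at compile time. The sketch below assumes integer types without padding bits (true on mainstream platforms) and guards the floating-point check with is_iec559; value_bits is a hypothetical helper.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper: number of value bits of a built-in integer type,
// assuming the object representation contains no padding bits.
template <typename T>
constexpr int value_bits()
{
    return static_cast<int>(sizeof(T) * CHAR_BIT)
           - (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// For integers, digits counts every bit except the sign bit.
static_assert(std::numeric_limits<unsigned int>::digits == value_bits<unsigned int>(),
              "unsigned int: all bits are value bits");
static_assert(std::numeric_limits<int>::digits == value_bits<int>(),
              "int: one bit is the sign");

// For IEEE binary64, digits is 52 stored bits plus one implicit bit.
static_assert(!std::numeric_limits<double>::is_iec559
                  || std::numeric_limits<double>::digits == 53,
              "binary64 has a 53-bit significand");
```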

digits10

Constant std::numeric_limits<T>::digits10 returns the number of decimal digits that can be represented without change or loss.

For example, numeric_limits<unsigned char>::digits10 is 2.

This somewhat inscrutable definition means that an unsigned char can hold decimal values 0..99 without loss of precision or accuracy, usually from truncation.

Had the definition been 3 then that would imply it could hold 0..999, but as we all know, an 8-bit unsigned char can only hold 0..255, and an attempt to store 256 or more will involve loss or change.
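For binary integer types the value follows directly from the number of bits: digits10 is the integer part of digits × log₁₀(2). A minimal sketch, using the integer approximation 301/1000 for log₁₀(2) and a hypothetical helper name:

```cpp
#include <limits>

// Hypothetical helper: decimal digits guaranteed to survive in a binary
// integer with the given number of value bits, floor(bits * log10(2)),
// approximating log10(2) by 301/1000.
constexpr int guaranteed_decimal_digits(int bits)
{
    return bits * 301 / 1000;
}

// Agrees with the library's own value for unsigned char.
static_assert(guaranteed_decimal_digits(std::numeric_limits<unsigned char>::digits)
                  == std::numeric_limits<unsigned char>::digits10,
              "matches digits10 for unsigned char");
```

With 8 bits this gives 8 × 0.301 = 2.408, truncated to 2. Note that this simple form applies to integer types only; floating-point types lose one further bit in the corresponding formula.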

For bounded integers, it is thus one less than the number of decimal digits you need to display the biggest integer std::numeric_limits<T>::max(). This value can be used to predict the layout width required for output:

std::cout
  << std::setw(std::numeric_limits<short>::digits10 +1 +1) // digits10+1, and +1 for sign.
  << std::showpos << (std::numeric_limits<short>::max)() // +32767
  << std::endl
  << std::setw(std::numeric_limits<short>::digits10 +1 +1)
  << (std::numeric_limits<short>::min)() << std::endl;   // -32768

For example, unsigned short is often stored in 16 bits, so the maximum value is 0xFFFF or 65535.

std::cout
  << std::setw(std::numeric_limits<unsigned short>::digits10 +1 +1) // digits10+1, and +1 for sign.
  << std::showpos << (std::numeric_limits<unsigned short>::max)() //  65535
  << std::endl
  << std::setw(std::numeric_limits<unsigned short>::digits10 +1 +1) // digits10+1, and +1 for sign.
  << (std::numeric_limits<unsigned short>::min)() << std::endl;   //      0
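The width calculation used in these snippets can be wrapped in a small generic helper. The sketch below is one possible formulation, not the library's own: field_width and in_column are hypothetical names, the sign column is reserved only for signed types, and character types are excluded because streams print them as characters.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: widest text needed for any value of integer type T,
// digits10 + 1 digits plus one character for a sign if T is signed.
template <typename T>
constexpr int field_width()
{
    return std::numeric_limits<T>::digits10 + 1
           + (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// Formats a value right-aligned in a column of that width.
template <typename T>
std::string in_column(T value)
{
    std::ostringstream os;
    os << std::setw(field_width<T>()) << value;
    const std::string text = os.str();
    // The width is sufficient for every value, so the column never overflows.
    assert(static_cast<int>(text.size()) == field_width<T>());
    return text;
}
```

Both the most negative and the largest value of a type then occupy exactly the same number of characters, which is what keeps columns aligned.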

For bounded floating-point types, if we create a double with a value with digits10 (usually 15) decimal digits, 1e15 or 1000000000000000:

std::cout.precision(std::numeric_limits<double>::max_digits10);
double d =  1e15;
double dp1 = d+1;
std::cout << d << "\n" << dp1 << std::endl;
// 1000000000000000
// 1000000000000001
std::cout <<  dp1 - d << std::endl; // 1

and we can increment this value to 1000000000000001 as expected and show the difference too.

But if we try to repeat this with more than digits10 digits,

std::cout.precision(std::numeric_limits<double>::max_digits10);
double d =  1e16;
double dp1 = d+1;
std::cout << d << "\n" << dp1 << std::endl;
// 10000000000000000
// 10000000000000000
  std::cout << dp1 - d << std::endl; // 0 !!!

then we find that adding one has no effect, and the display shows that there is loss of precision. See Loss of significance or cancellation error.
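The exact point at which an increment of one is lost lies between these two examples, at 2 raised to the power digits. The sketch below assumes double is IEEE binary64 (digits == 53) with round-to-nearest; absorbs_increment is a hypothetical helper.

```cpp
#include <limits>

// Hypothetical helper: true when adding 1 to x is lost to rounding,
// that is, when x + 1 == x in double arithmetic.
constexpr bool absorbs_increment(double x)
{
    return x + 1.0 == x;
}
```

Under those assumptions absorbs_increment(9007199254740991.0) is false, since 2^53 - 1 and 2^53 are both representable, while absorbs_increment(9007199254740992.0) is true, since 2^53 + 1 is not representable and rounds back to 2^53.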

So digits10 is the number of decimal digits guaranteed to be correct.

For example, 'round-tripping' for double:

If a decimal string with at most digits10 (== 15) significant decimal digits is converted to double and then converted back to the same number of significant decimal digits, then the final string will match the original 15 decimal digit string.
If a double floating-point number is converted to a decimal string with at least 17 decimal digits and then converted back to double, then the result will be binary identical to the original double value.
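The first of these two guarantees, text to double and back to text, can be sketched as follows. The helper text_survives_double is hypothetical, and the sketch assumes correctly rounded library conversions and input that has no trailing zeros and prints in fixed (non-exponent) notation.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: converts decimal text to double and back again using
// digits10 significant digits, and reports whether the text is unchanged.
inline bool text_survives_double(const std::string& text)
{
    std::istringstream in(text);
    double value = 0;
    in >> value;
    assert(!in.fail()); // the text must parse as a number

    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::digits10) << value;
    return out.str() == text;
}
```

Any string with at most 15 significant digits, such as "0.123456789012345", should survive this conversion unchanged; strings with more digits may or may not.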

For most purposes, you will much more likely want std::numeric_limits<>::max_digits10, the number of decimal digits that ensure that a change of one least significant bit (ULP) produces a different decimal digits string.

For the most common double floating-point type, max_digits10 is digits10+2, but you should use C++11 max_digits10 where possible (see below).

max_digits10

std::numeric_limits<T>::max_digits10 was added for floating-point because digits10 decimal digits are insufficient to show a least significant bit (ULP) change giving puzzling displays like

0.666666666666667 != 0.666666666666667

from failure to 'round-trip', for example:

double write = 2./3; // Any arbitrary value that cannot be represented exactly.
double read = 0;
std::stringstream s;
s.precision(std::numeric_limits<double>::digits10); // or `float64_t` for 64-bit IEEE 754 double.
s << write;
s >> read;
if(read != write)
{
  std::cout <<  std::setprecision(std::numeric_limits<double>::digits10)
    << read << " != " << write << std::endl;
}

If you wish to ensure that a change of one least significant bit (ULP) produces a different decimal digits string, then max_digits10 is the precision to use.

For example:

double pi = boost::math::double_constants::pi;
std::cout.precision(std::numeric_limits<double>::max_digits10);
std::cout << pi << std::endl; // 3.1415926535897931

will display π to the maximum possible precision using a double, and similarly for a much higher precision type:

using namespace boost::multiprecision;

typedef number<cpp_dec_float<50> > cpp_dec_float_50; // 50 decimal digits.

using boost::multiprecision::cpp_dec_float_50;

cpp_dec_float_50 pi = boost::math::constants::pi<cpp_dec_float_50>();
std::cout.precision(std::numeric_limits<cpp_dec_float_50>::max_digits10);
std::cout << pi << std::endl;
// 3.141592653589793238462643383279502884197169399375105820974944592307816406

For integer types, max_digits10 is implementation-dependent (the C++ standard defines it as 0 for the built-in integer types, where it is not meaningful); where it is provided, it is usually digits10 + 2. This is the output field-width required for the maximum value of the type T, std::numeric_limits<T>::max(), including a sign and a space.

So, for an integer type T that provides it, this will produce neat columns.

std::cout << std::setw(std::numeric_limits<T>::max_digits10) ...

The extra two or three least-significant digits are 'noisy' and may be junk, but if you want to 'round-trip' - printing a value out as a decimal digit string and reading it back in - (most commonly during serialization and de-serialization) you must use os.precision(std::numeric_limits<T>::max_digits10).
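The difference between the two precisions can be checked with a helper that takes the precision as a parameter. This is a sketch using only the standard library; round_trips is a hypothetical name, and the behaviour described assumes correctly rounded stream conversions.

```cpp
#include <cassert>
#include <limits>
#include <sstream>

// Hypothetical helper: writes value as decimal text with the given precision,
// reads it back, and reports whether the result equals the original.
inline bool round_trips(double value, int precision)
{
    assert(precision > 0);
    std::stringstream s;
    s.precision(precision);
    s << value;
    double read = 0;
    s >> read;
    return read == value;
}
```

For 2.0/3 the call with std::numeric_limits<double>::digits10 reports false, while the call with std::numeric_limits<double>::max_digits10 reports true.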

	Note
	For Microsoft Visual Studio 2010, `std::numeric_limits<float>::max_digits10` is wrongly defined as 8. It should be 9.

Note

For Microsoft Visual Studio before 2013 and the default floating-point format, a small range of double-precision floating-point values with a significand of approximately 0.0001 to 0.004 and exponent values of 1010 to 1014 do not round-trip exactly, being off by one least significant bit for probably every third value of the significand.

A workaround is using the scientific or exponential format std::scientific.

Other compilers also fail to implement round-tripping entirely fault-free, for example, see Incorrectly Rounded Conversions in GCC and GLIBC.

For more details see Incorrect Round-Trip Conversions in Visual C++, and references therein and Easy Accurate Reading and Writing of Floating-Point Numbers, Aubrey Jaffer (August 2018).

Microsoft VS2017 and other recent compilers now use the Ryu fast float-to-string conversion algorithm by Ulf Adams, claimed to be both exact and fast for 32-bit and 64-bit floating-point numbers.

	Note
	BOOST_NO_CXX11_NUMERIC_LIMITS is a suitable feature-test macro to determine if `std::numeric_limits<float>::max_digits10` is implemented on any platform.

If max_digits10 is not available, you should use the Kahan formula for floating-point type T.

In C++, the equations for what Kahan (on page 4) describes as 'at least' and 'at most' are:

static long double const log10Two = 0.30102999566398119521373889472449L; // log10(2.)

static_cast<int>(floor((significand_digits - 1) * log10Two)); // == digits10  - 'at least' .
static_cast<int>(ceil(1 + significand_digits * log10Two)); // == max_digits10  - 'at most'.

Unfortunately, these cannot be evaluated (at least by C++03) at compile-time. So the following expression is often used instead.

max_digits10 = 2 + std::numeric_limits<T>::digits * 3010U/10000U;

// == 2 + std::numeric_limits<T>::digits10 for double and 64-bit long double.
// == 3 + std::numeric_limits<T>::digits10 for float,  80-bit long-double and __float128.

Often the actual values are computed for the C limits macros:

#define FLT_MAXDIG10 (2 + (FLT_MANT_DIG * 3010U)/10000U)  // 9
#define DBL_MAXDIG10 (2 + (DBL_MANT_DIG * 3010U)/10000U) // 17
#define LDBL_MAXDIG10 (2 + (LDBL_MANT_DIG * 3010U)/10000U) // 17 for MSVC (64-bit long double), 21 for 80-bit long double.

The factor 3010U/10000U approximates log₁₀(2) ≈ 0.3010 and can be evaluated at compile time using only unsigned integer arithmetic, so the result can be a desirable const or constexpr (and usually also static).

Boost macros allow this to be done portably, see BOOST_CONSTEXPR_OR_CONST or BOOST_STATIC_CONSTEXPR.

(See also Richard P. Brent and Paul Zimmermann, Modern Computer Arithmetic Equation 3.8 on page 116).
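With C++11 the integer form of the formula can be wrapped in a constexpr function and compared against the library's own value. The sketch below is an illustration only: max_digits10_estimate is a hypothetical name, and the formula is assumed to apply to radix-2 floating-point types.

```cpp
#include <limits>

// Hypothetical stand-in for max_digits10 on platforms that lack it:
// 2 + floor(digits * 3010 / 10000), for radix-2 floating-point types.
template <typename T>
constexpr int max_digits10_estimate()
{
    static_assert(std::numeric_limits<T>::radix == 2,
                  "the formula assumes a binary type");
    return 2 + std::numeric_limits<T>::digits * 3010 / 10000;
}

// On a C++11 standard library the estimate agrees with the standard member.
static_assert(max_digits10_estimate<float>() == std::numeric_limits<float>::max_digits10,
              "float");
static_assert(max_digits10_estimate<double>() == std::numeric_limits<double>::max_digits10,
              "double");
```

For a 24-bit significand this gives 2 + 7 = 9, and for a 53-bit significand 2 + 15 = 17.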

For example, to be portable (including obsolete platforms) for type T where T may be: float, double, long double, 128-bit quad type, cpp_bin_float_50 ...

  typedef float T;

#if defined BOOST_NO_CXX11_NUMERIC_LIMITS
   // No max_digits10 implemented.
    std::cout.precision(max_digits10<T>());
#else
  #if defined(_MSC_VER) && (_MSC_VER <= 1600)
   //  Wrong value for std::numeric_limits<float>::max_digits10.
    std::cout.precision(max_digits10<T>());
  #else // Use the C++11 max_digits10.
     std::cout.precision(std::numeric_limits<T>::max_digits10);
  #endif
#endif

  std::cout << "std::cout.precision(max_digits10) = " << std::cout.precision() << std::endl; // 9

  double x = 1.2345678901234567889;

  std::cout << "x = " << x << std::endl; //

which should output:

std::cout.precision(max_digits10) = 9
x = 1.23456789

round_style

The rounding style determines how the result of floating-point operations is treated when the result cannot be exactly represented in the significand. Various rounding modes may be provided:

round to nearest up or down (default for floating-point types).
round up (toward positive infinity).
round down (toward negative infinity).
round toward zero (integer types).
no rounding (if decimal radix).
rounding mode is not determinable.

For integer types, std::numeric_limits<T>::round_style is always towards zero, so

std::numeric_limits<T>::round_style == std::round_toward_zero;

A decimal type, cpp_dec_float rounds in no particular direction, which is to say it doesn't round at all. And since there are several guard digits, it's not really the same as truncation (round toward zero) either.

For floating-point types, it is normal to round to nearest.

std::numeric_limits<T>::round_style == std::round_to_nearest;

See function std::numeric_limits<T>::round_error for the maximum error (in ULP) that rounding can cause.
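The two most common styles can be observed directly with the built-in types. The following sketch assumes double is IEEE binary64 evaluated with round-to-nearest; one_plus is a hypothetical helper.

```cpp
#include <limits>

// Built-in integers round toward zero: -7 / 2 is -3, not -4.
static_assert(std::numeric_limits<int>::round_style == std::round_toward_zero,
              "int truncates");
static_assert(-7 / 2 == -3, "integer division rounds toward zero");

// Hypothetical helper: 1 + delta, rounded to the nearest double.
constexpr double one_plus(double delta)
{
    return 1.0 + delta;
}
```

With eps equal to std::numeric_limits<double>::epsilon(), one_plus(eps / 4) is exactly 1.0 because the sum is nearer to 1.0 than to the next double, whereas one_plus(0.75 * eps) is 1.0 + eps because the sum is nearer to the next double. Rounding toward zero would give 1.0 in both cases.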

has_denorm_loss

true if a loss of precision is detected as a denormalization loss, rather than an inexact result.

Always false for integer types.

false for all types which do not have has_denorm == std::denorm_present.

denorm_style

Denormalized values are representations with a reduced number of significand bits that permit gradual underflow, so that, if type T is double:

std::numeric_limits<T>::denorm_min() < std::numeric_limits<T>::min()

A type may have any of the following enum float_denorm_style values:

std::denorm_absent, if it does not allow denormalized values. (Always used for all integer and exact types).
std::denorm_present, if the floating-point type allows denormalized values.
std::denorm_indeterminate, if indeterminate at compile time.
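Whether a type actually offers values below its smallest normalized value can be tested with the comparison given earlier in this section. A minimal sketch, with a hypothetical helper name:

```cpp
#include <limits>

// Hypothetical helper: true when T has positive values strictly smaller than
// the smallest normalized value min(), that is, when gradual underflow is
// available.
template <typename T>
constexpr bool has_subnormal_range()
{
    return std::numeric_limits<T>::denorm_min() < (std::numeric_limits<T>::min)();
}
```

For an IEEE double this is true. For the built-in integer types it is false, since denorm_min() is zero there and min() is zero or negative.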

Tinyness before rounding

bool std::numeric_limits<T>::tinyness_before

true if a type can determine that a value is too small to be represented as a normalized value before rounding it.

Generally true for is_iec559 floating-point built-in types, but false for integer types.

Standard-compliant IEEE 754 floating-point implementations may detect the floating-point underflow at three predefined moments:

After computation of a result with absolute value smaller than std::numeric_limits<T>::min(), such an implementation detects tinyness before rounding (e.g. UltraSparc).
After rounding of the result to std::numeric_limits<T>::digits bits, if the result is tiny, such an implementation detects tinyness after rounding (e.g. SuperSparc).
If the conversion of the rounded tiny result to subnormal form resulted in the loss of precision, such an implementation detects denorm loss.