...one of the most highly regarded and expertly designed C++ library projects in the world.
— Herb Sutter and Andrei Alexandrescu, C++ Coding Standards
max function

Function (std::numeric_limits<T>::max)() returns the largest finite value that can be represented by the type T. If there is no such value (that is, if numeric_limits<T>::is_bounded is false), it returns T().
For built-in types there is usually a corresponding macro value TYPE_MAX, where TYPE is CHAR, INT, FLT, DBL etc. Other types, including those provided by a typedef, may also provide a macro definition: for example, INT64_MAX for int64_t (from <cstdint>).
To cater for situations where no numeric_limits specialization is available (for example, because the precision of the type varies at runtime), packaged versions of this (and other functions) are provided using

```cpp
#include <boost/math/tools/precision.hpp>

T m = boost::math::tools::max_value<T>();
```

Of course, these simply use (std::numeric_limits<T>::max)() if available, but otherwise 'do something sensible'.
Since C++11, std::numeric_limits<T>::lowest() returns the lowest (most negative) finite value. For integer types this is the same as min(); for floating-point types it is usually equal to -max() (but this is implementation-dependent).

```cpp
-(std::numeric_limits<double>::max)() == std::numeric_limits<double>::lowest();
```
min function

Function (std::numeric_limits<T>::min)() returns the minimum finite value that can be represented by the type T.
For built-in types, there is usually a corresponding macro value TYPE_MIN, where TYPE is CHAR, INT etc. Other types, including those provided by a typedef, may also provide a macro definition: for example, INT64_MIN for int64_t (from <cstdint>).
For floating-point types, it is more fully defined as the minimum positive normalized value. See std::numeric_limits<T>::denorm_min() for the smallest denormalized value, provided std::numeric_limits<T>::has_denorm == std::denorm_present.
To cater for situations where no numeric_limits specialization is available (for example, because the precision of the type varies at runtime), packaged versions of this (and other functions) are provided using

```cpp
#include <boost/math/tools/precision.hpp>

T m = boost::math::tools::min_value<T>();
```

Of course, these simply use (std::numeric_limits<T>::min)() if available.
Function std::numeric_limits<T>::denorm_min() returns the smallest denormalized value, provided std::numeric_limits<T>::has_denorm == std::denorm_present.

```cpp
std::cout.precision(std::numeric_limits<double>::max_digits10);
if (std::numeric_limits<double>::has_denorm == std::denorm_present)
{
  double d = std::numeric_limits<double>::denorm_min();
  std::cout << d << std::endl; // 4.9406564584124654e-324
  int exponent;
  double significand = frexp(d, &exponent);
  std::cout << "exponent = " << std::hex << exponent << std::endl; // fffffbcf
  std::cout << "significand = " << std::hex << significand << std::endl; // 0.50000000000000000
}
else
{
  std::cout << "No denormalization. " << std::endl;
}
```
The exponent is effectively reduced from -308 to -324 (though it remains encoded as zero and leading zeros appear in the significand, thereby losing precision until the significand reaches zero).
Function std::numeric_limits<T>::round_error() returns the maximum error (in units of ULP) that can be caused by any basic arithmetic operation. The related member round_style reports how the type rounds; if

```cpp
round_style == std::round_indeterminate;
```

then the rounding style cannot be determined at compile time.
For floating-point types, when rounding is to nearest, only half a bit is lost by rounding, and round_error == 0.5. In contrast, when rounding is towards zero, or towards plus or minus infinity, we can lose up to one bit from rounding, and round_error == 1. For integer types, rounding is always towards zero, so at worst almost one bit can be rounded away, and round_error == 1.
round_error() can be used with std::numeric_limits<T>::epsilon() to estimate the maximum potential error caused by rounding. For typical floating-point types, round_error() == 1/2, so half epsilon is the maximum potential error.
```cpp
double round_err = std::numeric_limits<double>::epsilon()      // 2.2204460492503131e-016
                 * std::numeric_limits<double>::round_error(); // 1/2
std::cout << round_err << std::endl; // 1.1102230246251565e-016
```
There are, of course, many occasions when a much bigger loss of precision occurs, for example caused by loss of significance (cancellation error), or by very many iterations.
Function std::numeric_limits<T>::epsilon() is meaningful only for non-integral types. It returns the difference between 1.0 and the next value representable by the floating-point type T, so it is a one least-significant-bit change in this floating-point value. For double (float64_t) it is 2.2204460492503131e-016, showing all 17 possibly-significant decimal digits.
```cpp
std::cout.precision(std::numeric_limits<double>::max_digits10);
double d = 1.;
double eps = std::numeric_limits<double>::epsilon();
double dpeps = d + eps;
std::cout << std::showpoint    // Ensure all trailing zeros are shown.
  << d << "\n"                 // 1.0000000000000000
  << dpeps << std::endl;       // 1.0000000000000002
std::cout << dpeps - d         // 2.2204460492503131e-016
  << std::endl;
```
We can explicitly increment by one bit using the function boost::math::float_next(), and the result is the same as adding epsilon.
```cpp
double one = 1.;
double nad = boost::math::float_next(one);
std::cout << nad << "\n"       // 1.0000000000000002
  << nad - one                 // 2.2204460492503131e-016
  << std::endl;
```
Adding any smaller value, like half epsilon, will have no effect on this value.
```cpp
std::cout.precision(std::numeric_limits<double>::max_digits10);
double d = 1.;
double eps = std::numeric_limits<double>::epsilon();
double dpeps = d + eps/2;
std::cout << std::showpoint    // Ensure all trailing zeros are shown.
  << dpeps << "\n"             // 1.0000000000000000
  << eps/2 << std::endl;       // 1.1102230246251565e-016
std::cout << dpeps - d         // 0.00000000000000000
  << std::endl;
```
So this cancellation error leaves the values equal, despite adding half epsilon.
To achieve greater portability over platform and floating-point type, Boost.Math and Boost.Multiprecision provide a package of functions that 'do something sensible' if the standard numeric_limits is not available. To use these:

```cpp
#include <boost/math/tools/precision.hpp>
```
A tolerance might be defined using this version of epsilon thus:

```cpp
RealType tolerance = boost::math::tools::epsilon<RealType>() * 2;
```
Machine epsilon ε is very useful to compute a tolerance when comparing floating-point values, a much more difficult task than is commonly imagined.
The C++ standard specifies std::numeric_limits<>::epsilon(), and Boost.Multiprecision implements this (where possible) for its program-defined types, analogous to the fundamental (built-in) floating-point types like double and float. For more information than you probably want (but still need), see What Every Computer Scientist Should Know About Floating-Point Arithmetic.
The naive test comparing the absolute difference between two values and a tolerance does not give useful results if the values are too large or too small.
So Boost.Test uses an algorithm first devised by Knuth for reliably checking if floating-point values are close enough.
See Donald E. Knuth, The Art of Computer Programming, Vol. II, 3rd edition, Addison-Wesley Professional, 1998, ISBN 0-201-89684-2. (The relevant equations are in section 4.2.2, Eq. 36 and 37.)
See Boost.Math floating_point comparison for more details.
See also:

- Alberto Squassia, Comparing floats
- Alberto Squassia, Comparing floats code
- Boost.Test Floating-Point_Comparison
For example, if we want a tolerance that might suit about 9 arithmetical operations, say sqrt(9) = 3, we could define:
```cpp
T tolerance = 3 * std::numeric_limits<T>::epsilon();
```
This is very widely used in Boost.Math testing with Boost.Test's macro BOOST_CHECK_CLOSE_FRACTION, used thus:

```cpp
T expected = 1.0;
T calculated = 1.0 + std::numeric_limits<T>::epsilon();
BOOST_CHECK_CLOSE_FRACTION(expected, calculated, tolerance);
```
(There is also a version BOOST_CHECK_CLOSE using tolerance as a percentage rather than a fraction; usually the fraction version is simpler to use).
```cpp
using boost::multiprecision::number;
using boost::multiprecision::cpp_dec_float;
using boost::multiprecision::et_off;
typedef number<cpp_dec_float<50>, et_off> cpp_dec_float_50; // 50 decimal digits.
```
Note: Boost.Test does not yet allow floating-point comparisons with expression templates on, so the default expression template parameter has been replaced by et_off.
```cpp
cpp_dec_float_50 tolerance = 3 * std::numeric_limits<cpp_dec_float_50>::epsilon();
cpp_dec_float_50 expected = boost::math::constants::two_pi<cpp_dec_float_50>();
cpp_dec_float_50 calculated = 2 * boost::math::constants::pi<cpp_dec_float_50>();
BOOST_CHECK_CLOSE_FRACTION(expected, calculated, tolerance);
```
For floating-point types only, for which std::numeric_limits<T>::has_infinity == true, function std::numeric_limits<T>::infinity() provides an implementation-defined representation for ∞. The 'representation' is a particular bit pattern reserved for infinity. For IEEE 754 systems (for which std::numeric_limits<T>::is_iec559 == true), positive and negative infinity are assigned bit patterns for all defined floating-point types.
Confusingly, the string produced when outputting this representation is also implementation-defined, as is the string that can be input to generate the representation. For example, the output is 1.#INF on Microsoft systems, but inf on most *nix platforms. This implementation-defined behavior has hampered use of infinity (and NaNs), but Boost.Math and Boost.Multiprecision work hard to provide a sensible representation for all floating-point types, not just the built-in types. With the use of suitable facets to define the input and output strings, this makes it possible to use these useful features portably, including with Boost.Serialization.
For floating-point types only, for which std::numeric_limits<T>::has_quiet_NaN == true, function std::numeric_limits<T>::quiet_NaN() provides an implementation-defined representation for NaN. NaNs are values that indicate that the result of an assignment or computation is meaningless. A typical example is 0/0, but there are many others.
NaNs may also be used to represent missing values: for example, these could, by convention, be ignored in calculations of statistics like means.
As with infinity, many problems with the representation of Not-A-Number have hampered portable use.
NaN can be used with binary multiprecision types like cpp_bin_float_quad:
```cpp
using boost::multiprecision::cpp_bin_float_quad;

if (std::numeric_limits<cpp_bin_float_quad>::has_quiet_NaN == true)
{
  cpp_bin_float_quad tolerance = 3 * std::numeric_limits<cpp_bin_float_quad>::epsilon();
  cpp_bin_float_quad NaN = std::numeric_limits<cpp_bin_float_quad>::quiet_NaN();
  std::cout << "cpp_bin_float_quad NaN is " << NaN << std::endl; // cpp_bin_float_quad NaN is nan
  cpp_bin_float_quad expected = NaN;
  cpp_bin_float_quad calculated = 2 * NaN;
  // Comparisons of NaNs always fail:
  bool b = expected == calculated;
  std::cout << b << std::endl;
  BOOST_CHECK_NE(expected, expected);
  BOOST_CHECK_NE(expected, calculated);
}
else
{
  std::cout << "Type " << typeid(cpp_bin_float_quad).name()
    << " does not have NaNs!" << std::endl;
}
```
But using Boost.Math and suitable facets can permit portable use of both NaNs and positive and negative infinity.
See boost:/libs/math/example/nonfinite_facet_sstream.cpp; we also need:

```cpp
#include <boost/math/special_functions/nonfinite_num_facets.hpp>
```

Then we can equally well use a multiprecision type cpp_bin_float_quad:
```cpp
using boost::multiprecision::cpp_bin_float_quad;
typedef cpp_bin_float_quad T;
using boost::math::nonfinite_num_put;
using boost::math::nonfinite_num_get;
{
  std::locale old_locale;
  std::locale tmp_locale(old_locale, new nonfinite_num_put<char>);
  std::locale new_locale(tmp_locale, new nonfinite_num_get<char>);
  std::stringstream ss;
  ss.imbue(new_locale);
  T inf = std::numeric_limits<T>::infinity();
  ss << inf; // Write out.
  BOOST_ASSERT(ss.str() == "inf");
  T r;
  ss >> r; // Read back in.
  BOOST_ASSERT(inf == r); // Confirms that the floating-point values really are identical.
  std::cout << "infinity output was " << ss.str() << std::endl;
  std::cout << "infinity input was " << r << std::endl;
}
```
infinity output was inf
infinity input was inf
Similarly we can do the same with NaN (except that we cannot assert equality of the value read back, because any comparison with NaN always returns false).
```cpp
{
  std::locale old_locale;
  std::locale tmp_locale(old_locale, new nonfinite_num_put<char>);
  std::locale new_locale(tmp_locale, new nonfinite_num_get<char>);
  std::stringstream ss;
  ss.imbue(new_locale);
  T n;
  T NaN = std::numeric_limits<T>::quiet_NaN();
  ss << NaN; // Write out.
  BOOST_ASSERT(ss.str() == "nan");
  std::cout << "NaN output was " << ss.str() << std::endl;
  ss >> n; // Read back in.
  std::cout << "NaN input was " << n << std::endl;
}
```
NaN output was nan
NaN input was nan
For floating-point types only, for which std::numeric_limits<T>::has_signaling_NaN == true, function std::numeric_limits<T>::signaling_NaN() provides an implementation-defined representation for a NaN that causes a hardware trap. It should be noted, however, that at least one implementation of this function causes a hardware trap to be triggered simply by calling std::numeric_limits<T>::signaling_NaN(), and not only by using the value returned.