Comparison of floating-point values has always been a source of endless difficulty and confusion.
Unlike integral values, which are exact, any floating-point operation may produce an inexact result that is rounded to the nearest available binary representation. Even apparently innocuous operations such as assigning 0.1 to a double produce an inexact result (as this decimal number has no exact binary representation).
Floating-point computations also involve rounding so that some 'computational noise' is added, and hence results are also not exact (although repeatable, at least under identical platforms and compile options).
Sadly, this conflicts with the expectation of most users, as many articles and innumerable cries for help show all too well.
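The effect is easy to reproduce. The following is a minimal sketch, assuming a typical IEEE 754 platform where double arithmetic is evaluated in double precision; the helper name repeated_sum is hypothetical and is not part of Boost.

```cpp
#include <cassert>

// Hypothetical helper: adds `step` to an accumulator `count` times.
double repeated_sum(double step, int count) {
    assert(count >= 0);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        total += step;  // each addition rounds to the nearest double
    }
    return total;
}
```

Under those assumptions, repeated_sum(0.1, 10) == 1.0 is false, even though the two values differ by only about one part in 10^16.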
Some background reading is:
Boost provides a number of ways to compare floating-point values to see if they are tolerably close enough to each other, but first we must decide what kind of comparison we require:
The absolute difference between two values a and b is simply fabs(a-b). This is the only meaningful comparison to make if we know that the result may have cancellation error (see below).
If float_distance is a surgeon's scalpel, then relative_difference is more like a Swiss army knife: both have important but different use cases.
#include <boost/math/special_functions/relative_difference.hpp>
template <class T, class U>
calculated-result-type relative_difference(T a, U b);

template <class T, class U>
calculated-result-type epsilon_difference(T a, U b);
The function relative_difference returns the relative distance/error E between two values as defined by:
E = fabs((a - b) / min(a,b))
The function epsilon_difference is a convenience function that returns relative_difference(a, b) / eps, where eps is the machine epsilon for the result type.
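The two definitions above can be sketched directly. This is a hand-written illustration, not the library source: the names relative_difference_sketch and epsilon_difference_sketch are hypothetical, both arguments are assumed to share one type, and the values are assumed to be finite and strictly positive, so no special cases are handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical sketch of E = fabs((a - b) / min(a, b)).
// Assumption: a and b are finite and strictly positive.
template <class T>
T relative_difference_sketch(T a, T b) {
    assert(a > 0 && b > 0);
    return std::fabs((a - b) / (std::min)(a, b));
}

// The same quantity expressed in units of machine epsilon for T.
template <class T>
T epsilon_difference_sketch(T a, T b) {
    return relative_difference_sketch(a, b) / std::numeric_limits<T>::epsilon();
}
```

For a = 1.0f and b = 1.0f + epsilon, the sketch yields a relative difference of exactly epsilon, and hence an epsilon difference of 1.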
The following special cases are handled as follows:
For type double, the largest representable value is std::numeric_limits<double>::max(), which is the same as DBL_MAX or 1.7976931348623157e+308.
These rules were primarily designed to assist with our own test suite. They are intended to be robust enough that the function can in most cases be used blindly, including in cases where the expected result is actually too small to represent in type T and underflows to zero.
Some using statements will ensure that the functions we need are accessible.
using namespace boost::math;
or
using boost::math::relative_difference;
using boost::math::epsilon_difference;
using boost::math::float_distance; // also used by the examples below
using boost::math::float_next;
using boost::math::float_prior;
The following examples display values with all possibly significant digits.
Newer compilers should provide std::numeric_limits<FPT>::max_digits10 for this purpose, and here we use float precision, where max_digits10 = 9, to avoid displaying a distracting number of decimal digits.
Note | |
---|---|
Older compilers can use this formula to calculate |
One can set the display to include all trailing zeros (helpful in this example to show all potentially significant digits), and also to display bool values as words rather than integers:
std::cout.precision(std::numeric_limits<float>::max_digits10);
std::cout << std::boolalpha << std::showpoint << std::endl;
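To see what these settings do without touching std::cout, the same manipulators can be applied to a string stream. This is a minimal sketch; the helper name show_all_digits is hypothetical, and the comment about 9 digits assumes float is IEEE 754 binary32.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: formats a float with max_digits10 significant digits,
// keeping trailing zeros, so that distinct floats print as distinct strings.
std::string show_all_digits(float value) {
    std::ostringstream out;
    out.precision(std::numeric_limits<float>::max_digits10);  // 9 for binary32
    out << std::showpoint << value;
    return out.str();
}
```

With these settings 1.0f is rendered as 1.00000000, and its upper neighbour 1.0f + epsilon as 1.00000012, so the two adjacent values remain distinguishable in the output.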
When comparing values that are quite close or approximately equal, we could use either float_distance or relative_difference/epsilon_difference. For example, with type float, these two values are adjacent to each other:
float a = 1;
float b = 1 + std::numeric_limits<float>::epsilon();
std::cout << "a = " << a << std::endl;
std::cout << "b = " << b << std::endl;
std::cout << "float_distance = " << float_distance(a, b) << std::endl;
std::cout << "relative_difference = " << relative_difference(a, b) << std::endl;
std::cout << "epsilon_difference = " << epsilon_difference(a, b) << std::endl;
Which produces the output:
a = 1.00000000
b = 1.00000012
float_distance = 1.00000000
relative_difference = 1.19209290e-007
epsilon_difference = 1.00000000
In the example above, it just so happens that the edit distance as measured by float_distance and the difference measured in units of epsilon were equal. However, due to the way floating-point values are represented, that is not always the case:
a = 2.0f / 3.0f; // 2/3 inexactly represented as a float
b = float_next(float_next(float_next(a))); // 3 floating point values above a
std::cout << "a = " << a << std::endl;
std::cout << "b = " << b << std::endl;
std::cout << "float_distance = " << float_distance(a, b) << std::endl;
std::cout << "relative_difference = " << relative_difference(a, b) << std::endl;
std::cout << "epsilon_difference = " << epsilon_difference(a, b) << std::endl;
Which produces the output:
a = 0.666666687
b = 0.666666865
float_distance = 3.00000000
relative_difference = 2.68220901e-007
epsilon_difference = 2.25000000
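The discrepancy between 3 and 2.25 follows from the spacing of floats: between 0.5 and 1 adjacent values are half an epsilon apart, so three steps span 1.5 epsilon in absolute terms, and dividing by a value near 2/3 gives roughly 2.25. The sketch below illustrates this with the standard library only. The helper names steps_between and gap_in_epsilons are hypothetical; they belong to neither Boost nor the example file, and they assume IEEE 754 binary32 floats.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: number of representable floats passed when stepping
// upward from a to b. Assumes a <= b, both finite, and only a few steps apart.
int steps_between(float a, float b) {
    assert(a <= b);
    int steps = 0;
    while (a < b) {
        a = std::nextafter(a, b);  // next representable float towards b
        ++steps;
    }
    return steps;
}

// Absolute gap b - a measured in units of machine epsilon (no division by a).
float gap_in_epsilons(float a, float b) {
    return (b - a) / std::numeric_limits<float>::epsilon();
}
```

For a = 2.0f / 3.0f and b three representations above it, steps_between(a, b) counts 3 while gap_in_epsilons(a, b) is 1.5; dividing 1.5 by a then gives approximately 2.25.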
There is another important difference between float_distance and the relative_difference/epsilon_difference functions: float_distance returns a signed result that reflects which argument is larger in magnitude, whereas relative_difference/epsilon_difference simply return an unsigned value that represents how far apart the values are.
For example if we swap the order of the arguments:
std::cout << "float_distance = " << float_distance(b, a) << std::endl;
std::cout << "relative_difference = " << relative_difference(b, a) << std::endl;
std::cout << "epsilon_difference = " << epsilon_difference(b, a) << std::endl;
The output is now:
float_distance = -3.00000000
relative_difference = 2.68220901e-007
epsilon_difference = 2.25000000
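One way to picture a signed edit distance is to map each float to an integer index so that neighbouring floats get neighbouring indices, and then subtract. The following is a rough sketch of that idea, not how float_distance is implemented; the names ordered_index and signed_steps are hypothetical, and the bit layout assumed is IEEE 754 binary32 with finite inputs.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// Hypothetical helper: maps a finite float to an integer so that adjacent
// floats map to adjacent integers; +0 and -0 both map to 0.
std::int64_t ordered_index(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // reinterpret the bits safely
    const std::int64_t magnitude = bits & 0x7FFFFFFFu;
    return (bits & 0x80000000u) ? -magnitude : magnitude;
}

// Signed count of representations from a to b: positive when b > a.
std::int64_t signed_steps(float a, float b) {
    return ordered_index(b) - ordered_index(a);
}
```

Swapping the arguments of signed_steps flips the sign of the result, which mirrors the behaviour shown for float_distance above, while a quantity built from fabs cannot change sign.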
Zeros are always treated as equal, as are infinities as long as they have the same sign:
a = 0;
b = -0.0f; // signed (negative) zero; note that the integer expression -0 would yield +0
std::cout << "relative_difference = " << relative_difference(a, b) << std::endl;
a = b = std::numeric_limits<float>::infinity();
std::cout << "relative_difference = " << relative_difference(a, b) << std::endl;
std::cout << "relative_difference = " << relative_difference(a, -b) << std::endl;
Which produces the output:
relative_difference = 0.000000000
relative_difference = 0.000000000
relative_difference = 3.40282347e+038
Note that finite values are always infinitely far away from infinities even if those finite values are very large:
a = (std::numeric_limits<float>::max)();
b = std::numeric_limits<float>::infinity();
std::cout << "a = " << a << std::endl;
std::cout << "b = " << b << std::endl;
std::cout << "relative_difference = " << relative_difference(a, b) << std::endl;
std::cout << "epsilon_difference = " << epsilon_difference(a, b) << std::endl;
Which produces the output:
a = 3.40282347e+038
b = 1.#INF0000
relative_difference = 3.40282347e+038
epsilon_difference = 3.40282347e+038
Finally, all denormalized values and zeros are treated as being effectively equal:
a = std::numeric_limits<float>::denorm_min();
b = a * 2;
std::cout << "a = " << a << std::endl;
std::cout << "b = " << b << std::endl;
std::cout << "float_distance = " << float_distance(a, b) << std::endl;
std::cout << "relative_difference = " << relative_difference(a, b) << std::endl;
std::cout << "epsilon_difference = " << epsilon_difference(a, b) << std::endl;
a = 0;
std::cout << "a = " << a << std::endl;
std::cout << "b = " << b << std::endl;
std::cout << "float_distance = " << float_distance(a, b) << std::endl;
std::cout << "relative_difference = " << relative_difference(a, b) << std::endl;
std::cout << "epsilon_difference = " << epsilon_difference(a, b) << std::endl;
Which produces the output:
a = 1.40129846e-045
b = 2.80259693e-045
float_distance = 1.00000000
relative_difference = 0.000000000
epsilon_difference = 0.000000000
a = 0.000000000
b = 2.80259693e-045
float_distance = 2.00000000
relative_difference = 0.000000000
epsilon_difference = 0.000000000
Notice how, in the above example, two denormalized values that are a factor of 2 apart are nonetheless only one representation apart!
All the above examples are contained in float_comparison_example.cpp.
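The behaviour seen in these examples for zeros, infinities and denormalized values can be imitated with a few guards placed in front of the basic formula. The sketch below is one possible way to do it and should not be read as the library's source: the name guarded_relative_difference is hypothetical, and the treatment of NaNs and of finite values with opposite signs are assumptions of this sketch rather than behaviour shown above.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical sketch, not the library source. Assumptions are marked below.
template <class T>
T guarded_relative_difference(T a, T b) {
    const T max_val = (std::numeric_limits<T>::max)();
    const T min_val = (std::numeric_limits<T>::min)();  // smallest normalized value

    // Assumption: a NaN argument is reported as the largest possible difference.
    if (std::isnan(a) || std::isnan(b)) return max_val;

    // Infinities: same sign means equal, anything else is maximally different.
    if (std::isinf(a) || std::isinf(b)) {
        const bool both = std::isinf(a) && std::isinf(b);
        return (both && std::signbit(a) == std::signbit(b)) ? T(0) : max_val;
    }

    // Assumption: non-zero values of opposite sign are maximally different.
    if (a != 0 && b != 0 && std::signbit(a) != std::signbit(b)) return max_val;

    // Zeros and denormals are clamped to min_val, so they all compare as equal.
    a = (std::max)(std::fabs(a), min_val);
    b = (std::max)(std::fabs(b), min_val);
    return std::fabs((a - b) / (std::min)(a, b));
}
```

Clamping small magnitudes to the smallest normalized value is the design choice that makes zero, denorm_min and twice denorm_min all compare as equal, at the cost of ignoring genuine differences between denormalized values.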
Imagine we're testing the following function:
double myspecial(double x) { return sin(x) - sin(4 * x); }
This function has multiple roots, some of which are quite predictable in that both sin(x) and sin(4x) are zero together. Others occur because the values returned from those two functions precisely cancel out. At such points the relative difference between the true value of the function and the actual value returned may be arbitrarily large due to cancellation error.
In such a case, testing the function above by requiring that the values returned by relative_difference or epsilon_difference are below some threshold is pointless: the best we can do is to verify that the absolute difference between the true and calculated values is below some threshold.
Of course, determining what that threshold should be is often tricky, but a
good starting point would be machine epsilon multiplied by the largest of the
values being summed. In the example above, the largest value returned by sin(whatever)
is 1, so simply using machine epsilon as the
target for maximum absolute difference might be a good start (though in practice
we may need a slightly higher value - some trial and error will be necessary).
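As a concrete illustration, x = pi/5 is a root of this kind: there 4x = pi - x, so sin(x) and sin(4x) are equal and non-zero. The sketch below shows an absolute-difference check at that point. The helper names within_absolute_tolerance and cancelling_root are hypothetical, and the value of pi is obtained from acos(-1.0), which is itself only an approximation.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// The function under test, as given in the text.
double myspecial(double x) { return std::sin(x) - std::sin(4 * x); }

// Hypothetical helper: absolute-difference check against a known true value.
bool within_absolute_tolerance(double computed, double expected, double tolerance) {
    assert(tolerance >= 0.0);
    return std::fabs(computed - expected) <= tolerance;
}

// Approximation to the root pi/5, where sin(x) and sin(4x) cancel
// without either of them being zero.
double cancelling_root() { return std::acos(-1.0) / 5.0; }
```

A test might then call within_absolute_tolerance(myspecial(cancelling_root()), 0.0, k * eps), where eps is std::numeric_limits<double>::epsilon() and k is a small factor found by trial and error, as the text suggests.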