Testing

We work under the assumption that untested code doesn't work, so some tests for your new special function are in order. We'll divide these up into 3 main categories:

Spot Tests

Spot tests consist of checking that the expected exception is generated when the inputs are in error (or otherwise generate undefined values), and checking any special values. We can check for expected exceptions with BOOST_CHECK_THROW, so for example if it's a domain error for the last parameter to be outside the range [0,1] then we might have:

BOOST_CHECK_THROW(my_special(0, -0.1), std::domain_error);
BOOST_CHECK_THROW(my_special(0, 1.1), std::domain_error);

When the function has known exact values (typically integer values) we can use BOOST_CHECK_EQUAL:

BOOST_CHECK_EQUAL(my_special(1.0, 0.0), 0);
BOOST_CHECK_EQUAL(my_special(1.0, 1.0), 1);

When the function has known values which are not exact (from a floating point perspective) then we can use BOOST_CHECK_CLOSE_FRACTION:

// Assumes 4 epsilon is as close as we can get to a true value of 2Pi:
BOOST_CHECK_CLOSE_FRACTION(my_special(0.5, 0.5), 2 * constants::pi<double>(), std::numeric_limits<double>::epsilon() * 4);
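
To make the tolerance argument above concrete, here is a minimal sketch of the kind of comparison such a check performs: the difference between the two values is measured relative to each of them and compared against a tolerance given as a fraction (not a percentage). The helper close_fraction is hypothetical and is not part of Boost.Test; the real macro may treat edge cases such as zeros and infinities differently.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper, for illustration only (not a Boost.Test API).
// Returns true when |a - b| is within tol of both |a| and |b|.
inline bool close_fraction(double a, double b, double tol)
{
   assert(tol >= 0);
   if(a == b)
      return true;   // Also covers a == b == 0.
   const double diff = std::fabs(a - b);
   // Measure the difference relative to both values.
   return diff <= tol * std::fabs(a) && diff <= tol * std::fabs(b);
}
```

With a tolerance of 4 epsilon, a result that is 2 epsilon away from a true value of 1 passes, while one that is 8 epsilon away fails.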

Independent Test Values

If the function is implemented by some other known good source (for example Mathematica or its online versions functions.wolfram.com or www.wolframalpha.com) then it's a good idea to sanity check our implementation by having at least one independently generated value for each code branch our implementation may take. To slot these in nicely with our testing framework it's best to tabulate these like this:

// function values calculated on http://functions.wolfram.com/
static const boost::array<boost::array<T, 3>, 10> my_special_data = {{
    {{ SC_(0), SC_(0), SC_(1) }},
    {{ SC_(0), SC_(1), SC_(1.26606587775200833559824462521471753760767031135496220680814) }},
    /* More values here... */
}};

We'll see how to use this table and the meaning of the SC_ macro later. One important point is to make sure that the input values have exact binary representations, so choose values such as 1.5, 1.25, 1.125 etc. This ensures that if my_special is unusually sensitive in one area, we don't get apparently large errors just because the inputs are 0.5 ulp in error.
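
As a rough illustration of why values like 1.125 are safe inputs while values like 0.1 are not, the sketch below checks whether a double survives a round trip through float unchanged. The helper name same_in_float_and_double is made up for this example, and the check is only a proxy: it tells us the value needs no more bits than a float provides, which is what matters when the same table is used to test several floating point types.

```cpp
#include <cassert>

// Hypothetical helper, for illustration only.
// True if x is represented identically as a float and as a double,
// i.e. narrowing the value to float introduces no rounding error.
inline bool same_in_float_and_double(double x)
{
   return static_cast<double>(static_cast<float>(x)) == x;
}

// Small usage example: sums of a few powers of two are safe,
// most short decimal fractions are not.
inline void exact_input_example()
{
   assert(same_in_float_and_double(1.125));   // 1 + 1/8
   assert(!same_in_float_and_double(0.1));    // Not a finite binary fraction.
}
```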

Random Test Values

We can generate a large number of test values to check both for future regressions, and for accumulated rounding or cancellation error in our implementation. Ideally we would use an independent implementation for this (for example my_special may be defined directly in terms of other special functions but not implemented that way for performance or accuracy reasons). Alternatively we may use our own implementation directly, but with any special cases (asymptotic expansions etc) disabled. We have a set of tools to generate test data directly; here's a typical example:

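#include <boost/multiprecision/cpp_dec_float.hpp>
#include <boost/math/tools/test_data.hpp>
#include <boost/test/included/prg_exec_monitor.hpp>
#include <boost/algorithm/string/trim.hpp> // boost::algorithm::trim
#include <fstream>
#include <iomanip>   // std::setprecision
#include <iostream>  // std::cout, std::cin
#include <string>    // std::string, std::getline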

using namespace boost::math::tools;
using namespace boost::math;
using namespace std;
using namespace boost::multiprecision;

template <class T>
T my_special(T a, T b)
{
   // Implementation of my_special here...
   return a + b;
}

int cpp_main(int argc, char*argv [])
{
   //
   // We'll use so many digits of precision that any
   // calculation errors will still leave us with
   // 40-50 good digits.  We'll only run this program
   // once so it doesn't matter too much how long this takes!
   //
   typedef number<cpp_dec_float<500> > bignum;

   parameter_info<bignum> arg1, arg2;
   test_data<bignum> data;

   bool cont;
   std::string line;

   if(argc < 1)
      return 1;

   do{
      //
      // User interface which prompts for 
      // range of input parameters:
      //
      if(0 == get_user_parameter_info(arg1, "a"))
         return 1;
      if(0 == get_user_parameter_info(arg2, "b"))
         return 1;

      //
      // Get a pointer to the function and call
      // test_data::insert to actually generate
      // the values.
      //
      bignum (*fp)(bignum, bignum) = &my_special;
      data.insert(fp, arg1, arg2);

      std::cout << "Any more data [y/n]?";
      std::getline(std::cin, line);
      boost::algorithm::trim(line);
      cont = (line == "y");
   }while(cont);
   //
   // Just need to write the results to a file:
   //
   std::cout << "Enter name of test data file [default=my_special.ipp]";
   std::getline(std::cin, line);
   boost::algorithm::trim(line);
   if(line == "")
      line = "my_special.ipp";
   std::ofstream ofs(line.c_str());
   line.erase(line.find('.'));
   ofs << std::scientific << std::setprecision(50);
   write_code(ofs, data, line.c_str());

   return 0;
}

Typically several sets of data will be generated this way, including random values in some "normal" range, extreme values (very large or very small), and values close to any "interesting" behaviour of the function (singularities etc).

The Test File Header

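We split the actual test file into 2 distinct parts: a header that contains the testing code as a series of function templates, and the actual .cpp test driver that decides which types are tested, and sets the "expected" error rates for those types. It's done this way so that the same testing code can be shared by more than one driver (for example the multiprecision drivers described later), and so that each driver can supply its own definition of the SC_ macro.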

The test header contains 2 functions:

template <class Real, class T>
void do_test(const T& data, const char* type_name, const char* test_name);

template <class T>
void test(T, const char* type_name);

Before implementing those, we'll include the headers we'll need, and provide a default definition for the SC_ macro:

// A couple of Boost.Test headers in case we need any BOOST_CHECK_* macros:
#include <boost/test/unit_test.hpp>
#include <boost/test/floating_point_comparison.hpp>
// Our function to test:
#include <boost/math/special_functions/my_special.hpp>
// We need boost::array for our test data, plus a few headers from
// libs/math/test that contain our testing machinery:
#include <boost/array.hpp>
#include "functor.hpp"
#include "handle_test_result.hpp"
#include "table_type.hpp"

#ifndef SC_
#define SC_(x) static_cast<typename table_type<T>::type>(BOOST_JOIN(x, L))
#endif
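
To show what the default SC_ definition above does, here is a small stand-alone sketch: the literal has an L suffix pasted onto it so that it is first read as a long double (or long, for an integer literal), and the result is then converted to the type stored in the table. DEMO_JOIN and DEMO_SC are hypothetical stand-ins for BOOST_JOIN and SC_, and the sketch passes the target type explicitly rather than going through table_type<T>.

```cpp
#include <cassert>

// Hypothetical stand-ins for BOOST_JOIN and SC_, for illustration only.
#define DEMO_JOIN_IMPL(a, b) a##b
#define DEMO_JOIN(a, b) DEMO_JOIN_IMPL(a, b)
// Paste an L suffix onto the literal, then convert to the element type T.
#define DEMO_SC(T, x) static_cast<T>(DEMO_JOIN(x, L))

// One row of a test table: two inputs and the expected result.
template <class T>
struct demo_row
{
   T a, b, expected;
};

template <class T>
demo_row<T> first_row()
{
   demo_row<T> r = { DEMO_SC(T, 0.5), DEMO_SC(T, 1.25), DEMO_SC(T, 1.75) };
   return r;
}

// Small usage example: the same source text yields a row of any type.
inline void sc_macro_example()
{
   assert(first_row<float>().expected == 1.75f);
   assert(first_row<long double>().a == 0.5L);
}
```

Because the literal is read at long double precision before being narrowed, the same table text can serve float, double and long double without losing digits for the widest type.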

The easiest function to implement is the "test" function, which is what we'll be calling from the test-driver program. It simply includes the files containing the tabular test data and calls the do_test function for each table, along with a description of what's being tested:

template <class T>
void test(T, const char* type_name)
{
   //
   // The actual test data is rather verbose, so it's in a separate file
   //
   // The contents are as follows, each row of data contains
   // three items, input value a, input value b and my_special(a, b):
   //
#  include "my_special_1.ipp"

   do_test<T>(my_special_1, type_name, "MySpecial Function: Mathematica Values");

#  include "my_special_2.ipp"

   do_test<T>(my_special_2, type_name, "MySpecial Function: Random Values");

#  include "my_special_3.ipp"

   do_test<T>(my_special_3, type_name, "MySpecial Function: Very Small Values");
}

The function do_test takes each table of data and calculates values for each row of data, along with statistics for max and mean error etc. Most of this is handled by some boilerplate code:

template <class Real, class T>
void do_test(const T& data, const char* type_name, const char* test_name)
{
   // Get the type of each row and each element in the rows:
   typedef typename T::value_type row_type;
   typedef Real                   value_type;

   // Get a pointer to our function, we have to use a workaround here
   // as some compilers require the template types to be explicitly
   // specified, while others don't much like it if it is!
   typedef value_type (*pg)(value_type, value_type);
#if defined(BOOST_MATH_NO_DEDUCED_FUNCTION_POINTERS)
   pg funcp = boost::math::my_special<value_type, value_type>;
#else
   pg funcp = boost::math::my_special;
#endif

   // Somewhere to hold our results:
   boost::math::tools::test_result<value_type> result;
   // And some pretty printing:
   std::cout << "Testing " << test_name << " with type " << type_name
      << "\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n";

   //
   // Test my_special against data:
   //
   result = boost::math::tools::test_hetero<Real>(
      /* First argument is the table */
      data,
      /* Next comes our function pointer, plus the indexes of its arguments in the table */
      bind_func<Real>(funcp, 0, 1),
      /* Then the index of the result in the table - potentially we can test several
      related functions this way, each having the same input arguments, and different
      output values in different indexes in the table */
      extract_result<Real>(2));
   //
   // Finish off with some boilerplate to check the results were within the expected errors,
   // and pretty print the results:
   //
   handle_test_result(result, data[result.worst()], result.worst(), type_name, "boost::math::my_special", test_name);
}
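
The statistics gathered above can be pictured with the following stand-alone sketch, which computes the relative error of each row in units of machine epsilon and records the maximum, the mean, and the index of the worst row. The names demo_stats, error_in_epsilon and summarise are invented for this example; the real test_hetero and test_result classes handle more cases (zero expected values, for instance) and should be treated as the reference.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical summary type, for illustration only.
struct demo_stats
{
   double max_error;    // Largest relative error, in units of epsilon.
   double mean_error;   // Mean relative error, in units of epsilon.
   std::size_t worst;   // Index of the row with the largest error.
};

// Relative error of found against expected, in units of epsilon.
inline double error_in_epsilon(double found, double expected)
{
   assert(expected != 0);   // This sketch ignores the zero case.
   return std::fabs((found - expected) / expected)
      / std::numeric_limits<double>::epsilon();
}

inline demo_stats summarise(const std::vector<double>& found,
                            const std::vector<double>& expected)
{
   assert(found.size() == expected.size() && !found.empty());
   demo_stats s = { 0, 0, 0 };
   double total = 0;
   for(std::size_t i = 0; i < found.size(); ++i)
   {
      const double err = error_in_epsilon(found[i], expected[i]);
      total += err;
      if(err > s.max_error)
      {
         s.max_error = err;
         s.worst = i;
      }
   }
   s.mean_error = total / found.size();
   return s;
}
```

The worst index plays the same role as result.worst() above: it identifies the row to print when a test exceeds its expected error.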

Now we just need to write the test driver program. At its most basic it looks something like this:

#include <boost/math/special_functions/math_fwd.hpp>
#include <boost/math/tools/test.hpp>
#include <boost/math/tools/stats.hpp>
#include <boost/type_traits.hpp>
#include <boost/array.hpp>
#include "functor.hpp"

#include "handle_test_result.hpp"
#include "test_my_special.hpp"

BOOST_AUTO_TEST_CASE( test_main )
{
   //
   // Test each floating point type, plus real_concept.
   // We specify the name of each type by hand as typeid(T).name()
   // often gives an unreadable mangled name.
   //
   test(0.1F, "float");
   test(0.1, "double");
   //
   // Testing of long double and real_concept is protected
   // by some logic to disable these for unsupported
   // or problem compilers.
   //
#ifndef BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS
   test(0.1L, "long double");
#ifndef BOOST_MATH_NO_REAL_CONCEPT_TESTS
#if !BOOST_WORKAROUND(__BORLANDC__, BOOST_TESTED_AT(0x582))
   test(boost::math::concepts::real_concept(0.1), "real_concept");
#endif
#endif
#else
   std::cout << "<note>The long double tests have been disabled on this platform "
      "either because the long double overloads of the usual math functions are "
      "not available at all, or because they are too inaccurate for these tests "
      "to pass.</note>" << std::endl;
#endif
}

That's almost all there is to it, except that if the above program is run it's very likely that all the tests will fail, as the default maximum allowable error is 1 epsilon. So we'll define a function (don't forget to call it from the start of the test_main above) to raise the limits to something sensible, based on the function we're calling, the particular tests, and the platform and compiler:

void expected_results()
{
   //
   // Define the max and mean errors expected for
   // various compilers and platforms.
   //
   const char* largest_type;
#ifndef BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS
   if(boost::math::policies::digits<double, boost::math::policies::policy<> >() == boost::math::policies::digits<long double, boost::math::policies::policy<> >())
   {
      largest_type = "(long\\s+)?double|real_concept";
   }
   else
   {
      largest_type = "long double|real_concept";
   }
#else
   largest_type = "(long\\s+)?double";
#endif
   //
   // We call add_expected_result for each error rate we wish to adjust, these tell
   // handle_test_result what level of error is acceptable.  We can have as many calls
   // to add_expected_result as we need, each one establishes a rule for acceptable error
   // with rules set first given preference.
   //
   add_expected_result(
      /* First argument is a regular expression to match against the name of the compiler
         set in BOOST_COMPILER */
      ".*",
      /* Second argument is a regular expression to match against the name of the
         C++ standard library as set in BOOST_STDLIB */
      ".*",
      /* Third argument is a regular expression to match against the name of the
         platform as set in BOOST_PLATFORM */
      ".*",
      /* Fourth argument is the name of the type being tested, normally we will
         only need to up the acceptable error rate for the widest floating
         point type being tested */
      largest_type,
      /* Fifth argument is a regular expression to match against
         the name of the group of data being tested */
      "MySpecial Function:.*Small.*",
      /* Sixth argument is a regular expression to match against the name
         of the function being tested */
      "boost::math::my_special",
      /* Seventh argument is the maximum allowable error expressed in units
         of machine epsilon passed as a long integer value */
      50,
      /* Eighth argument is the maximum allowable mean error expressed in units
         of machine epsilon passed as a long integer value */
      20);
}
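
The rule lookup described in the comments above (regular expressions matched against names, with rules set first given preference) can be sketched as follows. This is a simplified model using std::regex and only two of the patterns; demo_expected_results is a hypothetical class, and the assumptions that each pattern must match the whole name and that the fallback is 1 epsilon are ours. See handle_test_result.hpp for the actual behaviour.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical rule: patterns for the type and test names, plus a limit.
struct demo_rule
{
   std::regex type_name;
   std::regex test_name;
   long max_error;   // In units of machine epsilon.
};

// Hypothetical rule table, for illustration only.
class demo_expected_results
{
public:
   void add(const std::string& type_re, const std::string& test_re, long max_error)
   {
      demo_rule r = { std::regex(type_re), std::regex(test_re), max_error };
      rules_.push_back(r);
   }
   // Rules are searched in the order they were added: the first match wins.
   long max_error_for(const std::string& type_name, const std::string& test_name) const
   {
      assert(!type_name.empty());
      for(const demo_rule& r : rules_)
      {
         if(std::regex_match(type_name, r.type_name)
            && std::regex_match(test_name, r.test_name))
            return r.max_error;
      }
      return 1;   // Assumed default: 1 epsilon when no rule matches.
   }
private:
   std::vector<demo_rule> rules_;
};
```

In this model, specific rules should be added before catch-all rules such as ".*", otherwise the catch-all would shadow them.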

Testing Multiprecision Types

Testing of multiprecision types is handled by the test drivers in libs/multiprecision/test/math; please refer to these for examples. Note that these tests are run only occasionally as they take a lot of CPU cycles to build and run.

Improving Compile Times

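As noted above, these test programs can take a while to build as we're instantiating a lot of templates for several different types, and our test runners are already stretched to the limit, probably using outdated "spare" hardware. There are two things we can do to speed things up: use a precompiled header, and use separately compiled instantiations of our special function rather than instantiating the templates afresh in every test program.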

We can make these changes by changing the list of includes from:

#include <boost/math/special_functions/math_fwd.hpp>
#include <boost/math/tools/test.hpp>
#include <boost/math/tools/stats.hpp>
#include <boost/type_traits.hpp>
#include <boost/array.hpp>
#include "functor.hpp"

#include "handle_test_result.hpp"

To just:

#include <pch_light.hpp>

And changing

#include <boost/math/special_functions/my_special.hpp>

To:

#include <boost/math/special_functions/math_fwd.hpp>

The Jamfile target that builds the test program will need the targets

test_instances//test_instances pch_light

added to its list of source dependencies (see the Jamfile for examples).

Finally the project in libs/math/test/test_instances will need modifying to instantiate function my_special.

These changes should be made last, when my_special is stable and the code is in Trunk.
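
For readers unfamiliar with the technique, "instantiating" a function template separately relies on explicit instantiation. The sketch below shows the general C++ mechanism in a single file; it is not the actual contents of test_instances, which uses its own headers and macros, and my_special_demo is a hypothetical stand-in for the real function.

```cpp
#include <cassert>

// Hypothetical stand-in for the real function template.
template <class T>
T my_special_demo(T a, T b)
{
   return a + b;
}

// What a test driver would see: a promise that the double instantiation
// is compiled elsewhere, so it need not be instantiated here.
extern template double my_special_demo<double>(double, double);

// What the separately compiled file would contain: the explicit
// instantiation definition that fulfils that promise.
template double my_special_demo<double>(double, double);

// Small usage example: the call binds to the explicit instantiation.
inline void instantiation_example()
{
   assert(my_special_demo(1.0, 2.0) == 3.0);
}
```

In a real build the declaration and the definition live in different translation units, so the expensive template code is compiled once and linked into every test program.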

Concept Checks

Our concept checks verify that your function's implementation makes no assumptions that aren't required by our Real number conceptual requirements. They also check for various common bugs and programming traps that we've fallen into over time. To add your function to these tests, edit libs/math/test/compile_test/instantiate.hpp to add calls to your function: there are 7 calls to each function, each with a different purpose. Search for something like "ibeta" or "gamma_p" and follow their examples.

