Programmer's Documentation for the Adpative Sparse Integration Project

For the current version of this package, updated documentation and further material, please see the code homepage.

The Adaptive Sparse Integration Project provides a framework and algorithms for a dimension-adaptive version of sparse grid integration. The adaptive sparse integration code is embedded in a general integration framework. This also includes Monte-Carlo and Quasi-Monte-Carlo integration. However, the code for these is rudimentary, and they are not the focus of development.

There are two basic possibilities for using the code in this project. You may use the code either only as an end-user to calculate integrals, and not care exactly how the integration works. That is, you use the supplied code as a black-box for integration. This will be the preferred approach for most users. It is for them that the user section of this manuscript is written.

If you wish to modify the code, or even participate in its development, please also read the developer section, which deals with the class hierarchy, and describes the implementation of the code.

The mathematics and theoretical background for the algorithm are described in my diploma thesis. The algorithm can simply be used as a black box without this background, but is highly recommend for a developer that plans to extend this project.

Bugs

The estimation of the error of the integral value from the correct value is not yet satisfactory. It works well with most functions, but may underestimate the error for irregular, especially singular functions. This is can be seen with the Genz Corner Peak test function. This is true of both the EvaluateController and the EstimateController, although the latter suffers less from it.

Requirements

You need the Java Development Kit of at least version 1.6 (marketed as version 6 by Sun) to run the code.

To check whether the JDK is installed properly:

If both command work you are set and ready to go. In all other cases, you need to obtain the Java Development Kit. It is available for download at http://java.sun.com.

This document is divided into the following sections:

  1. A First Example
  2. Ways to Integrate
  3. The Integration Modules
  4. The Controller
  5. The Index Evaluator
  6. The Visualizer
  7. Other integrators
  8. Why Java?
  9. Optimizing Performance

A First Example

Let's get started with calculation a simple integral:

import de.torstennahm.integrate.*;
import de.torstennahm.integrate.sparse.*;
import de.torstennahm.math.*;

public class Example {
    public static void main(String[] args) {
        Integrator integrator = new DefaultSparseIntegrator();
        Function function = new Circle();
        
        try {
            IntegrationResult result = integrator.integrateAbsTol(function, 1e-6);
            System.out.println(result.value());
        } catch (IntegrationFailedException e) {
            System.out.println("Integration failed:");
            e.printStackTrace();
        }
    }
}

class Circle extends Function {
    public int dimension() {
        return 1;
    }
    
    @Override
    public double sEvaluate(double[] x) {
        checkArgument(x);
        
        return 4 * Math.sqrt(1 - x[0] * x[0]);
    }
    
    @Override
    public int inputDimension() {
        return 1;
    }
}

In case you are new to Java, here is an ultra-quick tutorial that may or may not work. In any case, it assumes that you have Java installed (see requirements above).

If this doesn't work, there is something wrong with the setup of your Java environment. Correcting this is beyond the scope of this document. Please see http://java.sun.com for information on installation and troubleshooting.

You can see in the source that the program consists of two parts. One is concerned with initializing the integrator (class AdaptiveSparseIntegrator), the other defines the function that is to be integrated (class Circle). The function defined is f(x)=4*sqrt(1-x^2). It is intergrated on the interval [0,1], yielding π as the integral value.

This example may seem to be quite a bit of code for just defining this simple function and calling the integrator. This complexity arises for two reasons. The first is that Java is object-orientated. While this makes for easy maintainance of code, it goes along with overhead to separate different code parts neatly from each other. The second is that the code has many safety checks. checkArgument(x) in the method evaluate, for example, makes sure that the vector passed as function argument actually has the same dimension as the function. And the part of the code revolving around the IntegrationFailedException makes sure unforeseen errors are reported. In standard C programming style, both of these cases might go unchecked, and cause difficult to locate errors or even wrong results.

Of course, this was a very simple example of a one-dimensional function. Now let use try something more interesting. In the code above, add the lines:

class Product extends Function {
    @Override
    public double sEvaluate(double[] x) {
        checkArgument(x);
        
        double p = 1.0;
        for (int i = 0; i < x.length; i++) {
            p *= x[i];
        }
        return p;
    }

    @Override
    public int inputDimension() {
        return 8;
    }
}
Now replace the line
        Function function = new Circle();
by
        Function function = new Product();

and recompile and run the program. This yields the result 0.00390625, which is equal to 1/256. Indeed, we have integrated the 8-dimensional function x_1*x_2*...*x_8 on the unit cube, giving us the result (1/2)^8.

As you can see, we specify the dimension in the method inputDimension() of a Function. In evaluate, we calculate the value of the function at the specified vector x. The call to checkArgument(x) ensures that the dimension of the vector x is the same as the dimension of the function, otherwise an error will be generated. Here we see one advantage of object-oriented programming. Since the dimension of the function is specified in the function itself, we do not need to tell the integrator about it. We simply call integrator.integrateAbsTol(function, tol), without needing to supply the dimension, and the intergrator obtains the dimension from the function itself. By default, the integrator will always integrate with the Lesbegue measure on the multi-dimensional unit cube.

You are now ready to test the integrator on your own functions.

Ways to Integrate

You will have noticed that we have called the method integrateAbsTol to do our integration. This integrates until the absolute tolerance of the integral value (that is, its estimated difference from the real integral value) is judged to be less than the specified tolerance. In addtition to integrating to an absolute tolerance, there are several other possibilities. Have a look at the AdaptiveSparseIntegrator API Specification. You will notice it is also possible to integrate until a relative tolerance is reached or until the function has been evaluated a minimum number of times.

The API specification is a reference for all the different classes used in the code. It is available here, and should serve you as a handy programming reference.

The Integration Modules

Internally, the adaptive sparse grid integrator. Here is an excerpt from the AdaptiveSparseIntegrator API Specification:

The integration process is executed by several modules that plug into this main class:
We will now cover the three different modules. So far, we have simply used the default modules with default settings. We did this by using the default constructor new AdaptiveSparseIntegrator(), without supplying any further arguments. To use different modules, you could for example use new AdaptiveSparseIntegrator(new EvaluateController(), new DeltaQuadratureEvaluator(new ClenshawCurtis()), new GridVisualizer()), specifying the contoller, index integrator and visuailzer in that order. Any of these arguments may be null, in which case the default module at that spot is used. See the method description.

The Controller

There are two different controllers available at the moment: EvaluateController and EstimateController. As regards performance, the EvaluateController is a bit faster, whereas the EstimateController is a bit more robust.

The EvaluateController implements the algorithm originally proposed in [1]. new EvaluateController() generates the controller. You probably don't want to use the constructor new EvaluateController(workWeight). It is there for historical reasons, since it was mentioned in [3], but it has turned out that it yields inferior results and actually asympotically only does standard non-adaptive sparse grind integration for workWeight>0.

The EstimateController can be used with the default constructor new EstimateController(). In this case, it is very similar to the EvaluateController. The constructor new EstimateController(simplexQuota) actually does what the new EvaluateController(workWeight) was supposed to do, that is, to partially include standard non-adaptive sparse grid integration. Generally, the closer simplexQuota is to 1, the more the algorithm will look like the non-adaptive form. This means more robustness but also less adaptiveness and hence slower convergence.

To use the estimate controller with a 50-50 mix of convential and adaptive sparse grid integration, use the following code to create your integrator:  new AdaptiveSparseIntegrator(new EstimateController(0.5), null, null). The two null arguments mean that you do want to use the default index evaluator and the default visualizer.

The Index Evaluator

At the moment there are two index evaluators available: DeltaQuadratureEvaluator and ProductIndexEvaluator. You should always use DeltaQuadratureEvaluator. ProductIndexEvaluator should theoretically be somewhat better in some cases (when the quadrature formula is not nested), but this has not been borne out in practice.

The DeltaQuadratureEvaluator and ProductIndexEvaluator work with quadrature formula generators. (See the package description). These generators produce a 1-dimensional quadrature formula as a series of nodes x_i and weight w_i, to yield a numerical integral of f as the sum over i for w_i * f(x_i). This approach is well-known in numerical integration theory. The quadrature formula generator supplies these quadrature formulas for different numbers of nodes. Every quadrature formula is constructed for a specific measures on the real numbers. In accordance with the sparse grid approach, for functions of a dimension larger than 1, the tensor product of the quadrature formulas is used for integrating the function.

Well-known quadrature formulas supplied in the package include the Clenshaw-Curtis rule, the trapezoidal rule and the Patterson (Gauss-Kronrod) rule. All of these quadrature formulas integrate with the Lesbegue measure on the interval [0,1].

The Gauss-Hermite formula instead uses the measure with the weight f(x)=1/sqrt(2*sigma*pi)*exp(-x^2/(2*sigma)) on the real numbers. sigma can be specified in the constructor, using new GaussHermite(sigma). For sigma=1 you get the Gaussian normal distribution as integration measure. This is especially useful for calculation path integrals. You should also check out the class BrownianBridge, which generates a Brownian path. If you compose your own function with this, you can directly work in path space, and save yourself the translation into a Brownian path.

The quadrature formula is supplied to the index evaluator, an this is in turn given to the adaptive sparse integrator. The following code would give you an integrator that integrates with the Gaussian normal distribution as measure: new AdaptiveSparseIntegrator(null, new DeltaQuadratureEvaluator(new GaussHermite(1.0)), null). Again, the two null arguments mean that you do want to use the default controller and the default visualizer.

If you only need to integrate on the multi-dimensional unit cube, you can stay with the default index evaluator, with uses the Patterson quadrature formula, and works well in most cases. However, the Patterson formulas are hard to compute and not available for high numbers of nodes. If you see the error "Generator maximum level exceeded" you can use the Clenshaw-Curtis rule instead. You do this by supplying new DeltaQuadratureEvaluator(new ClenshawCurtis()) as the index evaluator for the adaptive sparse grid integrator.

The Visualizer

I will not say all to much on the visualizers, but rather just give a few hints to using them, since these do not belong directly to the code. Rather, they were programmed ad-hoc to improve the program by graphically displaying the integration process. The visualizers plug into the integration process, and are supplied with information by callback methods by the other integration modules. For example, information is sent out when the integration starts, or when an index was evaluated.

To use the visualizers, simply create the via their constructors and add them to a list of the type List<Visualizer>. This list is then given as the last argument to an integrate call.

The GridVisualizer displays the integration set. The left panel color-codes the different indices according to their contribution. The right panel is more specific, and will not be covered here. While the information display is intuitive, you should always be aware that the grid visualizer displays only a two-dimensional section thru a high-dimensional index space. Thus, you will only be able to see all information about the index set for functions with 1 or 2 dimension. For higher dimension, you can adjust which dimensions are displayed in the grid, but you will only see those indices that have 0 as entry for all other dimensions.

The ResultVisualizer shows how the difference between the integral value computed and the correct value change over time, and also the algorithm's estimate of the error. Obviously, this visualizer will only work if the correct value of the integral is known (or know to a sufficient degree of accucary) and supplied to the visualizer.

The ExtentVisualizer shows the adaptive nature of the algorithm. It shows the maximum level of refinement for each dimension of the function as vertical black bars. At the top, the maximum total length (that is the sum of the levels for all dimensions) of the index set is shown in a red horizontal bar. The more heterogeneous the black bars are, the higher the advantage of the adaptive algorithm is over standard non-adaptive sparse grid integration.

The MultiVisualizer allows you to use several visualizers at once. Supply them to the multi visualizer with the addVisualizer method, and then pass the multi visualizer as visualizer to the adaptive sparse integrator.

Check the "Direct Known Subclasses" of the Visualizer interface for other implementing visualizer classes.

Other integrators

The adaptive sparse grid integrator belongs to a larger framework of integrators. This framework also supports standard sparse grid integration, Monte-Carlo integration and Quasi-Monte-Carlo integration. These are all part of the Integrator hierarchy, and due to this can be used instead adaptive sparse grid integrator with only minimal code modification. In the first example, change the line

        Integrator integrator = new AdaptiveSparseIntegrator();
to
        Integrator integrator = new QMCIntegrator();
Then recompile and run the program. You should approximately the same results as before, but now you are using a totally different integration method, the Quasi Monte Carlo method. See the "Direct Known Subclasses" of Integrator to find the other integrators.

Why Java?

“We should forget about small efficiencies, say about 97% of the time: Premature optimization is the root of all evil.”—Donald Knuth

I realize it is extremely unusual to offer a program to the numerical computation community that is written in Java. For a community that has been slow to embrace object-oriented programming, relying on Fortran and C, in which even C++ is a rarity, the use of an interpreted language, that is not compiled into native code, must seem something of a blasphemy. Please, let me defend myself.

I strongly believe in the statement by Donald Knuth I have cited above. The whole project of adaptive sparse grid integration is still in its early stages. Rather than aiming to produce the fastest possible code, I designed the project for a high degree of flexibility and robustness. High flexibility can only be achieved in an object-oriented language, which makes it possible to divide the program different self-contained modules. The issue of robustness led me to use Java, since it has many mechanisms of self-checking and validation during runtime. Also, since this program subscribes strongly to the object-oriented approach, the automatic garbage collection comes in very handy. Both are features which are not offered by C++.

Indeed, I have saved myself from countless hours of unproductive debugging. I would guess the total time of development I have spent with debugging to be somewhere between 5% and 10%. This time bought at the expense of spending a lot of time on designing class hierarchies and refactoring the code. Still, I think that altogether I would not have written the code faster if I had used C, but the result would be far less readable and manageable.

It is certainly planned to also implement adaptive sparse grid integration as soon as the algorithm has a reached a stable form. But for the momentary, experimental phase, I felt that Java is the best solution.

It is also important to note that Java is by far not as slow as it is perceived. It has long left its slow interpreter-only times behind. It now includes a sophisticated just-in-time compiler, so the code is in effect compiled. Certainly, just in time complilation will never be quite as good as pre-compilation. But still, the differences are not big. When I compared the integration of a high-dimensional function for an early C implementation of the algorithm to the Java implementation, the Java implementation was only about 10% slower. This is due to the fact the the lion's share of the running time is spent evaluating the integrand function anyway. This is usually a pure mathematical algorithm, which the Java just in time compiler handles about as well as a C compiler.

Optimizing Performance

If you are looking for a bit of extra performance, you should start Java with the option -server. This will make Java use a slower, but more efficient just in time compiler. For large integrations, this will increase performance by maybe 10%. If the integral is very large, the set of indices to manage will become huge, and Java may run out of memory. You will then need to increase the heap space allocated to Java. Please consult the help for the Java interpreter.


Packages

de.torstennahm.distribution
de.torstennahm.integrate
de.torstennahm.integrate.error
de.torstennahm.integrate.quadratureformula
de.torstennahm.integrate.sparse
de.torstennahm.integrate.sparse.evaluateindex
de.torstennahm.integrate.sparse.index
de.torstennahm.integrate.sparse.visualize
de.torstennahm.integrate.visualize
de.torstennahm.integrate.visualizerdata
de.torstennahm.math
de.torstennahm.series
de.torstennahm.statistics
de.torstennahm.util