Extended precision arithmetic for Julia



DoubleDouble.jl is a Julia package for performing extended-precision arithmetic using pairs of floating-point numbers. This is commonly known as "double-double" arithmetic, as the most common format is a pair of C-doubles (Float64 in Julia), although DoubleDouble.jl will actually work for any floating-point type. Its aim is to provide accurate results without the overhead of BigFloat types.

The core routines are based on the ideas and algorithms of Dekker (1971).


The main type is Double, with two floating-point fields: hi, storing the leading bits, and lo storing the remainder. hi is stored to full precision and rounded to nearest; hence, for any Double x, we have abs(x.lo) <= 0.5 * eps(x.hi). Although these types can be created directly, the usual interface is the Double function:

julia> using DoubleDouble

julia> x = Double(pi)

julia> eps(x.hi)

The other type defined is Single, which is simply a wrapper for a floating-point type, but whose results will be promoted to Double.


Exact products and remainders

By exploiting this property, we can compute exact products of floating point numbers.

julia> u, v = 64 * rand(), 64 * rand()

julia> w = Single(u) * Single(v)
Double{Float64}(609.3097112086186, -5.3107663829696295e-14)

Note that the product of two Singles is a Double: the hi element of this double is equal to the usual rounded product, and the lo element contains the exact difference between the exact value and the rounded.

This can be used to get an accurate remainder

julia> r = rem(w, 1.0)
Double{Float64}(0.309711208618584, 1.6507898617445858e-17)

julia> Float64(r)

This is much more accurate than taking ordinary products, and gives the same answer as using BigFloats:

julia> rem(u*v, 1.0)

julia> Float64(rem(big(u) * big(v), 1.0))

However, since the DoubleDouble version is carried out using ordinary floating-point operations, it is of the order of 1000x faster than the BigFloat version.

Correct rounding with non-exact floats

If a number cannot be exactly represented by a floating-point number, it may be rounded incorrectly when used later, e.g.

julia> pi * 0.1

julia> Float64(big(pi) * 0.1)

We can also do this computation using Doubles (note that the promotion rules mean that only one needs to be specified):

julia> Float64(Double(pi) * 0.1)

julia> Float64(pi * Single(0.1))

Emulated FMA

The fused multiply-add (FMA) operation is an intrinsic floating-point operation that allows the evaluation of a * b + c, with rounding occurring only at the last step. This operation is unavailable on 32-bit x86 architecture, and available only on the most recent x86_64 chips, but can be emulated via double-double arithmetic:

julia> 0.1 * 0.1 + 0.1

julia> Float64(big(0.1) * 0.1 + 0.1)

julia> Base.fma(a::Float64,b::Float64,c::Float64) = Float64(Single(a) * Single(b) + Single(c))
fma (generic function with 1 method)

julia> fma(0.1, 0.1, 0.1)

First Commit


Last Touched

2 months ago


40 commits


Used By: