## Java Performance: Efficiently Formatting Doubles

by Jack Shirazi
12/15/2000

In my book Java Performance Tuning, I considered efficient ways to convert numbers to strings, and I provided conversion algorithms that are faster than those released with the Java SDK (up to and including version 1.3). In this article, I'll consider how to format `double`s using a variation of that conversion algorithm, and show that the conversion of numbers with formatting can be faster than the original unformatted conversion.

### The SDK conversion methodology

First, let's look at how the SDK converts `double`s to strings and see why that procedure is slow. Calling `Double.toString(double)` to convert a `double` value to a string immediately creates an instance of `java.lang.FloatingDecimal`, a class that is private to the `java.lang` package. `FloatingDecimal` is the internal SDK class that provides all the logic necessary to handle floating point numbers. If you look at the source code for this class, you will see that converting floating point numbers to strings appears to be horrendously complicated. The algorithm needs to deal with several special cases: infinity, -infinity, not-a-number, -0.0, as well as digits before and after the decimal point, exponential values, and rounding accuracy. In addition, the SDK conversion algorithm works out the minumum number of digits necessary to identify uniquely the underlying bit-storage value.

This last factor is a little subtle: I'll illustrate it with an example. The number 5.39E-322 is relatively close to the smallest IEEE-754 `double` value. The underlying `double` representation cannot actually hold the full accuracy of a number that small (note that this representation is not Java-specific: all IEEE-754 floating point numbers in all languages have the same limitations). Instead the `double` value is held as 5.4E-322. The number 5.41E-322 is also stored to the same accuracy, resulting in the same number, i.e. the code

``````double d1 = 5.39E-322D;
double d2 = 5.41E-322D;
System.out.println(d1);
System.out.println(d2);
System.out.println(d2==d1);``````

produces the output

``````5.4E-322
5.4E-322
true``````

In fact, the underlying bit-stored number is closer to 5.396039603960396E-322. However, the SDK algorithm correctly identifies that all these numbers are stored as the same underlying value, and that there is no other close `double` value that can be stored which requires more than one digit of accuracy to be shown, i.e. 5.4E-322 uniquely identifies the underlying `double` value, while using the minimum number of digits. (The next nearest `double` value is 5.4455E-322 which is represented by 5.43E-322 using the SDK conversion algorithm, and this `double` would be assigned for all values between 5.42E-322 and 5.45E-322.)

### A faster conversion methodology

As you can see, converting `double`s to strings is a nuanced affair requiring lots of work. That explains why the SDK is so slow to convert them. In my book, I presented an alternative algorithm to convert `double`s to strings. The algorithm works by determining the magnitude of the `double` (there is a rapid algorithm to do this using some of the bits of the `double`), then scaling the `double` value into a `long` and converting the `long` value using another efficient algorithm. Essentially, the algorithm looks similar to

``````double example = 3.14159D;
//magnitude is 0 for 3.14159
long magnitude = magnitude(example);
// scaling_factor could be 1.0E18 here
long scaled = example * scaling_factor;
//now print the long, inserting a decimal point
//after the (magnitude+1)th digit.
...``````

The `long`-to-string conversion algorithm I use is also faster than the SDK version, as I use an algorithm which successively strips off the highest digit, thus avoiding the use of an intermediate `char` array or `StringBuffer` to hold a partially built string. (The documented code for all algorithms is included in the source code for this article.) The nth digit of a particular `long` can be obtained using a relatively simple formula

``````  //Input assumes l is positive. If negative, then pass -l.
//Note that if negating, should treat Long.MAX_VALUE specially,
//as that would overflow when negated.
public static long getNthDigit(long l, int n)
{
//Obviously, if you are successively accessing digits, you should
//inline this code, and avoid repeatedly calling tenthpower(long)
//since you only need to determine the tenthpower(long) for the
//first digit, then each subsequent digit has a tenthpower(long)
//which is the previous tenthpower(long) divided by 10.
return (l/(tenthPower(l)/l_tenthPower[n-1]))%10;
}

static long[] l_tenthPower = {1, 10L, 100L, 1000L, 10000L, 100000L,
1000000L, 10000000L, 100000000L, 1000000000L, 10000000000L,
100000000000L, 1000000000000L, 10000000000000L, 100000000000000L,
1000000000000000L, 10000000000000000L, 100000000000000000L,
1000000000000000000L,
};

private static long tenthPower(long l)
{
if (l < 10L) return 1;
else if (l < 100L) return 10L;
else if (l < 1000L) return 100L;
else if (l < 10000L) return 1000L;
else if (l < 100000L) return 10000L;
else if (l < 1000000L) return 100000L;
else if (l < 10000000L) return 1000000L;
else if (l < 100000000L) return 10000000L;
else if (l < 1000000000L) return 100000000L;
else if (l < 10000000000L) return 1000000000L;
else if (l < 100000000000L) return 10000000000L;
else if (l < 1000000000000L) return 100000000000L;
else if (l < 10000000000000L) return 1000000000000L;
else if (l < 100000000000000L) return 10000000000000L;
else if (l < 1000000000000000L) return 100000000000000L;
else if (l < 10000000000000000L) return 1000000000000000L;
else if (l < 100000000000000000L) return 10000000000000000L;
else if (l < 1000000000000000000L) return 100000000000000000L;
else return  1000000000000000000L;
}``````

By successively stripping off the highest digit, the numbers are printed or appended in order. This means that there is no need for any intermediate temporary structures when printing numbers using my method. This can be a great help when printing large amounts of data to streams. Avoiding the temporary objects that are generated during the normal SDK number-to-string conversions both speeds up the conversion and avoids garbage collection further down the line.

In my original `double`-to-string conversion algorithm, I didn't bother with printing the shortest unique representation of the `double`. In fact the algorithm always prints out all the digits available, up to about 15 decimal places. During the conversion, however, rounding errors mean the last few decimal places are not necessarily correct. Therefore, if you need full accuracy to the 15th decimal place, then you should not be using my conversion algorithm to print `double` values. I'd also argue that you shouldn't be using `double`s at all if that is your requirement: instead math manipulation packages and `BigDecimal` are probably more appropriate for your problem.

 Pages: 1, 2