Scala tail recursion vs non-tail receursion performance testing with dynaTrace

The scala compiler has partial tail call optimization (

Running some like methods with and without allowing the scala compiler to optimize demonstrated the performance improvements gained by this optimization.

First up, simple factoral functions (source:

//this function will be tail recusive optimized
def tFactorial(number: Long) : Long = {
 def tfactorialWithAccumulator(accumulator: Long, number: Long) : Long = {
 if (number == 1) 
 return accumulator
 tfactorialWithAccumulator(accumulator * number, number - 1)
 tfactorialWithAccumulator(1, number)
//Non-tail recursive function
def ntFactorial(number:Long) : Long = {
 if (number == 1)
 return 1
 number * ntFactorial (number - 1)

For explanation of these functions and why/not they are tail recursive see the source above.


Non-Tail recursive average response time: approx 7 ms

Tail recursive average response time: approx 0.7ms

The number of test conducted was limited to about 10 each and the test environment was an Ubuntu VM. The results will not be highly accurate but with a difference this significant it is clear that tail optimized recursion is very different from normal recursion (On a JVM). Looking at the purepath for each method reveals tail-recursion optimization working as expected and saving alot of execution time:

Comparing the two purepaths, the 10x increase in response time is clearly caused by the lack of tail recursion on the left.

The next example looks at what the compiler does if there is extra code within a tail optimized funtion:

// Tail recursive version 
 def execTest3Tail(loopCount: Int): String = {
 def test2WithAccumulator(accumulator: Int, number: Int): String = {
 println("Hello World " + accumulator + "\n")
 if (accumulator >= loopCount)
 return "Completed"
 test2WithAccumulator(loopCount, (accumulator + 1))
 test2WithAccumulator(1, loopCount)
// Non-Tail recursive version
 def execTest3nt(loopCount: Int, cur: Int): String = {
 println("Hello World " + cur + "\n") 
 if (cur >= loopCount)
 return "Completed"
 execTest3nt(loopCount, (cur + 1))

The output of the non-tail recursive function was as expected, printing all of the output expected. The scala compiler optimized over the expected (perhaps incorrectly expected) behaviour of the optimized function. The output was:

enter num of iterations:1000
Hello World 1
Hello World 1000

As the function was optimized not to run through all of the loops – the println’s simply did not occur. The purepath comparision:

Results are similar to the first test. The right (tail-optimized) purepath call hierarchy shows why there were only 2 screen outputs..


Leave a Reply

Your email address will not be published.