Controlling the infinite in Scala with Lazy List

Carlos Hernández
carlos-hernandez
Published in
5 min readJul 8, 2020

--

I would like to start this article by showing you a piece of code (see below).

Do you think this code will work?

In the code above we can see a function called ‘fibs’, and how this function calls itself to infinity. We are using this function to calculate the Fibonacci sequence. Maybe at this point quite a few questions you’d like to ask. How is this possible? Is this code really going to work?

Before we get to our example of Lazy List, we need to cover some basics. What are strictness and non-strictness, and how are these concepts expressed in Scala?

Strictness and non-strictness

Strictness and non-strictness are properties of functions. When we say that a function is non-strict we are saying that perhaps the function chooses not to evaluate one or more of its arguments. In contrast, a strict function always evaluates its arguments.

Let’s see these properties in action.

In this code, we can see how we are passing an exception in one of our arguments. This function is strict and the exception will be evaluated at the moment of calling the function. So this code is never going to work, an exception always returns. On the other hand, we can see the same example but this time using non-strictness. In this case the exception will be thrown only when isException will be true.

By default, all the functions in Scala are strict. If we need to define a non-strictness argument we have to add arrow => immediately before their type. In our second example the argument will only be evaluated if the argument isException is true. We don’t need to do anything special to evaluate an argument annotated with =>. We just reference the identifier as usual.

Lazy evaluation

In the previous section we saw how passing arguments to the functions is defined as strictness or non-strictness. With either syntax, the arguments will be evaluated once for each place it’s referenced on the function. By default, Scala doesn’t cache the result of evaluations.

In the first example lines 1 to 9 we are creating a non-strictness function that receives a function by parameter. In this case we can replace the references in the body for the function as we can see below:

In this case the argument isn’t evaluated until the body of the function is executed, but it will be evaluated two times by the function. On the other hand, using the property lazy (lines 11 to 21) we are going to evaluate the argument once and save the value in cache. In this case the evaluation will be the first time that we use the variable fLazy in line 13.

In this example we are going to print to console but only for the first time, saving the value 1 in cache and using it for the following substitution. The purpose of the last code is to show you how it is working, it isn’t compiled code.

In the console picture we can see how this is working better. First we initialized the variable fLazy as lazy with the value { println(“Once”); 1 }. In the next line we are evaluating it. In this evaluation we are going to print to console the value ‘Once’ as side effect and save as cache the value ‘1’ returned by the function. In the last line we are adding this value two times.

Lazy will cache the result so that subsequent references to it don’t trigger repeated evaluations.

Lazy List

Here’s a simple Lazy List definition:

LazyList’s object contains a smart constructor for creating a nonempty and empty LazyList. By convention, smart constructors typically type the first letter of the corresponding data constructor in lowercase. The apply method is a convenient method for constructing a LazyList from multiple elements.

In this example we are using an explicit thunk (() => A and () => LazyList[A]). A value of type () => A is a function that accepts zero arguments and returns a value of type A. Thunks are primarily used to delay a calculation until its result is needed, or to insert operations at the beginning or end of the other subroutine. For this reason Scala takes care of wrapping the arguments in thunk, so the as.head and apply(as.tail: _*) won’t be evaluated until we force the LazyList.

This give us a powerful ability. We may choose to describe a larger expression than we need, and then evaluate only a portion of it.

We can see this powerful ability in our first example:

Here we have a function that is going to calculate the Fibonacci sequence until infinity (extreme example). By using LazyList, we are ultimately defining what to do (similar to a recipe) and the expression will be evaluated when we look for the last element. At this point it is going to evaluate our “recipe” step by step and as soon as we take the desired value we aren’t going to continue evaluating it. This is the main key behind Apache Spark, for example.

Let me introduce another example to see how this is working in action. Imagine that we have the following expression:

In the first example (line 2) using List we are going to map all the values in the List. In this case the List has infinite values, so it is never going to finish. On the other hand, in the second case (line 5) we are going to evaluate laziness values one by one and the process will end as soon as we find our result. This particular case never is going to happen in the real word. However, we could have a huge file and the difference would be between map 1 gb (all the values) or only 1 kb (just some values).

Laziness leads to efficiency in code. Now, we are not making multiple copies of an object. You can defer execution until a much later time and have those executed very nicely, that’s one of the biggest benefits you get out of this.

Conclusion

Lazy is one of the most important topics in functional programming, and it can be very advantageous. However, it can only benefit us as long as the initialization of lazy vals doesn’t produce side effects nor depend on them. In the presence of side effects, initialization order starts to matter. So lazy vals are ideal when we are using referential transparency.

References

--

--