Coroutine Essentials

Sometimes it’s helpful to do more than one thing at a time. For example, when you’re on hold during a phone call, you might also check your email. While brewing coffee, you might also cook breakfast. And while driving a car, you might listen to a podcast.

In the same way, sometimes it’s helpful for the software that we write to do more than one thing at a time. For example, it could make two or three network calls at one time - all while updating the screen to show the progress of each call.

In Kotlin, we can use coroutines to do multiple things at a time. There are many ways to use coroutines. In fact, it’d be easy to dedicate an entire book to exploring them. In this chapter, we’re going to focus on the most essential coroutine concepts you need to be productive with them in your day-to-day coding. This knowledge will give you the confidence to start using them in your applications, and it’ll provide a solid foundation for understanding more advanced coroutine concepts in the future, if you choose to explore them further.

Let’s begin our adventure with coroutines by visiting Rusty and his trusty robot, Bot-1.

One Thing at a Time…

Rusty McAnnick spends most of his days engineering and managing Bot-1, his heavy-duty robot who works on construction projects. One day, Rusty and Bot-1 received a work order for constructing a new building. Their checklist for this project includes three tasks:

Lay the bricks for the foundation.
Install the windows.
Install the doors.

Rusty and Bot-1 have a large pile of bricks on hand, but the windows and doors are ordered specifically for each project, so they have to be delivered from the warehouses. They arrived at the construction site, ready to get started. Here’s how the day went.

Bot-1 called a warehouse to order the windows needed for this project. The delivery took a while to arrive. While he was waiting, he sat on the curb, twiddling his thumbs. Once the truck finally showed up, he got the windows off of the truck.
Next, he called a different warehouse to order the doors. Again, he sat down on the curb, bored, waiting for the doors to be delivered. When they finally arrived, he unloaded them.
Next, he laid the bricks. This part required some time and effort, but he completed the job skillfully.
Then, he installed the windows.
Finally, he installed the doors.

If we lay out the work that Bot-1 did over time, here’s what the timeline would look like.

Once the project was complete, Rusty took inventory of all the tasks, and shook his head. “That project took a really long time. Those deliveries were so slow. And the client wasn’t happy that we took so long to deliver the project. I wonder… what can we do to speed things up?”

Single-Threaded, Blocking Code

Just like Rusty’s construction project, our Kotlin code can be inefficient when we do only one thing at a time, especially when it involves waiting for slow operations, such as network requests. To start with, let’s create an enum class to represent the products that can be ordered from a warehouse, along with a function to place an order.

enum class Product(val description: String, val deliveryTime: Long) {
    DOORS("doors", 750),
    WINDOWS("windows", 1_250)
}

fun order(item: Product): Product {
    println("ORDER EN ROUTE  >>> The ${item.description} are on the way!")
    Thread.sleep(item.deliveryTime)
    println("ORDER DELIVERED >>> Your ${item.description} have arrived.")
    return item
}

In the code above, we’re using Thread.sleep() to simulate the amount of time that it takes for the truck to deliver the product. Since each product is at a different warehouse, they have different delivery times. Thread.sleep() takes a Long argument that indicates how long to pause, in milliseconds - so to wait one second, we can pass 1_000.

Next, let’s do the same thing for the tasks that Bot-1 needs to perform.

fun perform(taskName: String) {
    println("STARTING TASK   >>> $taskName")
    Thread.sleep(1_000)
    println("FINISHED TASK   >>> $taskName")
}

With these functions in place, we can model Rusty’s recent project like this.

fun main() {
    val windows = order(Product.WINDOWS)
    val doors = order(Product.DOORS)
    perform("laying bricks")
    perform("installing ${windows.description}")
    perform("installing ${doors.description}")
}

When we run this code, we’ll see output that looks like this:

ORDER EN ROUTE  >>> The windows are on the way!
ORDER DELIVERED >>> Your windows have arrived.
ORDER EN ROUTE  >>> The doors are on the way!
ORDER DELIVERED >>> Your doors have arrived.
STARTING TASK   >>> laying bricks
FINISHED TASK   >>> laying bricks
STARTING TASK   >>> installing windows
FINISHED TASK   >>> installing windows
STARTING TASK   >>> installing doors
FINISHED TASK   >>> installing doors
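
To put a number on how long this takes, here is a minimal, self-contained sketch that repeats the listings above and wraps the whole project in measureTimeMillis() from the Kotlin standard library. The runProject() helper is a name made up for this illustration.

```kotlin
import kotlin.system.measureTimeMillis

enum class Product(val description: String, val deliveryTime: Long) {
    DOORS("doors", 750),
    WINDOWS("windows", 1_250)
}

fun order(item: Product): Product {
    println("ORDER EN ROUTE  >>> The ${item.description} are on the way!")
    Thread.sleep(item.deliveryTime) // blocks the thread while "waiting" for the truck
    println("ORDER DELIVERED >>> Your ${item.description} have arrived.")
    return item
}

fun perform(taskName: String) {
    println("STARTING TASK   >>> $taskName")
    Thread.sleep(1_000) // every task takes one second
    println("FINISHED TASK   >>> $taskName")
}

// Hypothetical helper: runs the whole project and returns the elapsed milliseconds.
fun runProject(): Long = measureTimeMillis {
    val windows = order(Product.WINDOWS)
    val doors = order(Product.DOORS)
    perform("laying bricks")
    perform("installing ${windows.description}")
    perform("installing ${doors.description}")
}

fun main() {
    println("Project took about ${runProject()} ms")
}
```

Since nothing overlaps, the waits simply add up: 1,250 ms + 750 ms + three tasks of 1,000 ms each, so the reported time should land a little above 5,000 ms.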

Just like with Rusty’s construction work, this code is a very slow way to get the job done. Let’s see what ideas Rusty comes up with to make his work more efficient!

Coroutines and Concurrency

Later that night, Rusty sat on his comfortable couch, flipping through channels on the television. He finally settled on a channel featuring professional tag-team wrestling. This tag-team wrestling match involved two teams of wrestlers, each with two teammates, where only one wrestler from each team can be in the ring at a time. When a wrestler tags his teammate, who is waiting outside of the ring, the two switch places.

Rusty watched with excitement, rooting for the team of Sledge and Hammer as they took on the current champions, Villain and Vandal. Here’s what happened as he watched:

Sledge took his turn in the ring, suplexing Vandal. Then, he tagged Hammer, and the two teammates switched places.
Next, Hammer entered the ring, using a clothesline move to knock Vandal to the floor. Then he tagged Sledge, and they switched places again.
Sledge returned to the ring and put Vandal into a figure-four leglock. He tagged Hammer again, and they changed places.
Hammer stepped back into the ring, and used a piledriver on Vandal. He tagged Sledge one more time.
Finally, Sledge entered the ring again and pinned Vandal to the mat for three seconds, winning the match!

While Rusty keeps watching the wrestling event, let’s get back to some Kotlin code!

Introduction to Coroutines

Code written with coroutines works a lot like tag-team wrestling - one coroutine can do some work, then tag out and let another coroutine run for a while. The execution path can alternate between the coroutines, like this:

The code above demonstrates the essence of coroutines, which is that the execution path can bounce back and forth between parts of different functions. When code is written this way, we say that the tasks run concurrently.

By the way: The Coroutines Library

You’ll need to add a dependency on the kotlinx.coroutines library for the rest of the code in this chapter. The way you do this differs depending on your build tool. Appendix B (which is not yet written) will include instructions for Maven and Gradle.
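
As a rough sketch in the meantime, a Gradle project that uses the Kotlin DSL (build.gradle.kts) would typically declare the dependency like this. The version number shown here is only an example and is an assumption of this sketch, so check for the current release before copying it.

```kotlin
// build.gradle.kts
dependencies {
    // Example version only - replace it with the latest kotlinx.coroutines release.
    implementation("org.jetbrains.kotlinx:kotlinx-coroutines-core:1.8.1")
}
```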

Ready to create our first coroutine? We can build a new coroutine by calling a special kind of function known as a coroutine builder. Our very first coroutine builder is named runBlocking(). This function takes a lambda, which contains the code that this coroutine will run. (And be sure to include import kotlinx.coroutines.runBlocking in the file where you’re writing the code!)

import kotlinx.coroutines.runBlocking

fun main() {
    runBlocking {
        println("Sledge: Suplex!")
        println("Hammer: Clothesline!")
        println("Sledge: Figure-four Leglock!")
        println("Hammer: Piledriver!")
        println("Sledge: Pinning 1-2-3!")
    }
}

Let’s look more closely at each part of this code.

runBlocking() is a function that creates (and usually starts) a coroutine. As mentioned above, this kind of function is called a coroutine builder.
The lambda passed to runBlocking() is a special kind of function called a suspending function. Like a tag-team wrestler, a suspending function can “tag out” the coroutine, allowing another coroutine to run in the meantime. Instead of saying that it “tags out,” we say that it suspends execution. Since this suspending function is written as a lambda, this one can also be referred to as a suspending lambda.
This suspending function runs inside a coroutine. The coroutine can be suspended by the suspending function.

Where’s the Coroutine?

When we look at the code, we can point to the coroutine builder, and we can point to the suspending function. But oddly, we can’t point to the actual coroutine itself. That’s because a coroutine is an instance of some code, along with configuration and information about its state - such as whether it’s currently running, paused, completed, and so on.

Have you ever run the same Kotlin project in more than one IDE at the same time? Or have you run it from the command line in more than one console window at the same time? When you do this, each execution of the program is its own instance. It has its own state. For example, if you create a program to count to one million, you could run the same program twice at the same time, and at any given moment, each instance would be on a different number from the other.

In a similar way, a coroutine is an instance of execution, with its own state. But instead of executing an entire program, it’s executing the block of code that was passed to its coroutine builder. And instead of its execution being managed by machinery within the operating system, it’s managed by machinery within our Kotlin program.

Let’s Start Suspending!

Now, the code in Listing 20.4 includes a suspending function, which can suspend, but so far, it’s not actually suspending. All it’s doing is printing lines to the console. We could have done that without runBlocking()! We also have only a single coroutine, which is like a single wrestler instead of a team. Let’s create a second coroutine that can work together with the first. To do this, we can call a coroutine builder from within the runBlocking() lambda.

Since runBlocking() is the only coroutine builder we’ve used so far, let’s try using it again here. We’ll put the second and fourth wrestling moves in the nested coroutine builder, and the first, third, and fifth in the outer lambda, with the intention that the wrestling moves will alternate between the coroutines like they did with the wrestlers, Sledge and Hammer.

import kotlinx.coroutines.runBlocking

fun main() {
    runBlocking {
        runBlocking {
            println("Hammer: Clothesline!")
            println("Hammer: Piledriver!")
        }
        println("Sledge: Suplex!")
        println("Sledge: Figure-four Leglock!")
        println("Sledge: Pinning 1-2-3!")
    }
}

Since we’ve got one runBlocking() function call inside the lambda of another, we’re creating a coroutine from within another coroutine. This creates a parent-child relationship between the two coroutines, resulting in a simple hierarchy structure that looks like this.

Later in this chapter, we’ll see why this structure is important. Meanwhile, when we run the code above, we’ll get output that looks like this.

Hammer: Clothesline!
Hammer: Piledriver!
Sledge: Suplex!
Sledge: Figure-four Leglock!
Sledge: Pinning 1-2-3!

Hmm… this prints all of the wrestling moves, but they’re not in the order that we wanted - the moves are printed in the same order that they appear in the code. Hammer did all of his moves, and then Sledge did all of his moves. One problem here is that the runBlocking() coroutine builder waits until its code finishes before moving on. So, the code in the nested runBlocking() lambda runs until it’s done (printing “Hammer: Clothesline!” and “Hammer: Piledriver!”), and then the rest of the println() statements in the outer lambda run.

Since runBlocking() waits until its coroutine completes before moving on, it’s typically only used to build a root-level coroutine. In other words, it’s frequently only used directly in the main() function. From there, other coroutines are usually built with other coroutine builders.
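
The blocking behavior is easy to observe from a regular function. The following is a small sketch, not part of Rusty’s project, in which the hypothetical blockingDemo() helper records the order of events. It also shows that runBlocking() returns whatever the last expression of its lambda evaluates to.

```kotlin
import kotlinx.coroutines.runBlocking

// Hypothetical helper that records the order in which things happen.
fun blockingDemo(): List<String> {
    val events = mutableListOf<String>()
    events += "before runBlocking"
    val result = runBlocking {
        events += "inside the coroutine"
        "done" // the lambda's last expression becomes runBlocking()'s return value
    }
    // This line is not reached until the coroutine above has finished.
    events += "after runBlocking (result: $result)"
    return events
}

fun main() {
    blockingDemo().forEach(::println)
}
```

The three events are always recorded in the order they appear in the code, because the calling thread waits inside runBlocking() until its coroutine completes.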

It’s time to introduce our second coroutine builder, named launch(). Like runBlocking(), the launch() function also accepts a suspending lambda as an argument. Let’s replace the nested runBlocking() call with a call to launch(). Be sure to include the import for kotlinx.coroutines.launch.

import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

fun main() {
    runBlocking {
        launch {
            println("Hammer: Clothesline!")
            println("Hammer: Piledriver!")
        }
        println("Sledge: Suplex!")
        println("Sledge: Figure-four Leglock!")
        println("Sledge: Pinning 1-2-3!")
    }
}

When we run this version, here’s the output we get:

Sledge: Suplex!
Sledge: Figure-four Leglock!
Sledge: Pinning 1-2-3!
Hammer: Clothesline!
Hammer: Piledriver!

Well… that’s still not right! The result is almost the same as that of Listing 20.5, except that the wrestlers traded places. Last time, all the wrestling moves from the inner lambda were printed first. This time, all the wrestling moves from the outer lambda were printed first. In fact, with this output, their opponent was pinned in the middle of the match!

The problem is that our wrestlers aren’t tagging out after each move, so they never yield the ring to the other wrestler. In Kotlin, if we want a coroutine to tag out, it has to encounter a suspension point. Generally speaking, this happens when it calls a suspending function.¹

To demonstrate this, let’s update our code so that after each move, we include a call to a function named yield().

import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

fun main() {
    runBlocking {
        launch {
            println("Hammer: Clothesline!")
            yield()
            println("Hammer: Piledriver!")
            yield()
        }
        println("Sledge: Suplex!")
        yield()
        println("Sledge: Figure-four Leglock!")
        yield()
        println("Sledge: Pinning 1-2-3!")
    }
}

yield() is a suspending function, and each time we call it, the coroutine hits a suspension point. This yields the wrestling ring back to the wrestler’s teammate - in other words, it gives the other coroutine a chance to run some of its code. So, the execution path bounces back and forth between the runBlocking() lambda and the launch() lambda, producing the following output, which shows the wrestling moves in the correct order.

Sledge: Suplex!
Hammer: Clothesline!
Sledge: Figure-four Leglock!
Hammer: Piledriver!
Sledge: Pinning 1-2-3!

Declaring a Suspending Function

Let’s update our code so that the output indicates each time that a wrestler is tagging out. So instead of directly calling yield(), we could create another function that first prints “Tagging out!” and then calls yield(). If we try to declare a regular function for this, we’ll get a compiler error:

fun tagOut() {
    println("    Tagging out!    ")
    yield()
}

Error

The reason for this error is that a suspending function can’t be called from just anywhere. It can only be called from another suspending function! In other words, a regular function can call a regular function, and a suspending function can call either a regular function or another suspending function.

Kind of function	Can it call a regular function?	Can it call a suspending function?
Regular function	Yes	No
Suspending function	Yes	Yes

To fix this error, we can simply prepend the suspend modifier to this function, like this:

suspend fun tagOut() {
    println("    Tagging out!    ")
    yield()
}

By doing this, we’ve changed the tagOut() function from a regular function to a suspending function, so it’s now possible for it to call yield().
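
The calling rules from the table can be seen side by side in a short sketch. All of the function names here (regularGreeting(), suspendingGreeting(), and callBoth()) are made up for this illustration.

```kotlin
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// A regular function: it can only call other regular functions.
fun regularGreeting(): String = "regular"

// A suspending function: it is allowed to call yield(), which is also a suspending function.
suspend fun suspendingGreeting(): String {
    yield()
    return "suspending"
}

// A suspending function can call both kinds of function.
suspend fun callBoth(): List<String> = listOf(regularGreeting(), suspendingGreeting())

// Uncommenting the next line causes a compiler error, because a regular
// function cannot call a suspending function directly:
// fun broken(): String = suspendingGreeting()

fun main() {
    // The runBlocking() lambda is a suspending lambda, so it can call callBoth().
    runBlocking {
        println(callBoth())
    }
}
```

Notice that main() is still a regular function. It reaches the suspending code only by way of a coroutine builder, whose lambda is itself a suspending function.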

By the way: “Suspend Function”

Since the modifier is named suspend, many Kotlin developers simply refer to this kind of function as a “suspend function” instead of “suspending function” - but they both mean the same thing. It would be most precise to say “a function that is capable of suspending a coroutine,” but that’s quite a mouthful! So, throughout the rest of this chapter, we’ll use the term “suspending function”.

Now that we’ve created the tagOut() function, we can go back to our main() function and replace each call to yield() with a call to tagOut():

fun main() {
    runBlocking {
        launch {
            println("Hammer: Clothesline!")
            tagOut()
            println("Hammer: Piledriver!")
            tagOut()
        }
        println("Sledge: Suplex!")
        tagOut()
        println("Sledge: Figure-four Leglock!")
        tagOut()
        println("Sledge: Pinning 1-2-3!")
    }
}

When we run this, the output looks like this:

Sledge: Suplex!
    Tagging out!
Hammer: Clothesline!
    Tagging out!
Sledge: Figure-four Leglock!
    Tagging out!
Hammer: Piledriver!
    Tagging out!
Sledge: Pinning 1-2-3!

So, we can write our own suspending functions that call other suspending functions, like yield(). Often the suspending functions that we write in our own application code will simply call other suspending functions, usually from a library. For example, we could use the Ktor HTTP client library to make an HTTP call, which would suspend the coroutine until a response is received.

suspend fun getExample(): String {
    return client.get("https://www.example.com/").bodyAsText()
}

Before we move on, let’s review the main concepts we’ve learned in this section:

Coroutines can run concurrently with one another. In other words, their execution can be suspended in order to give other coroutines a chance to run.
Suspending functions are able to suspend the coroutines that run them. They can only be called from other suspending functions.
The runBlocking() builder creates a coroutine. Any code that comes after runBlocking() won’t run until its coroutine has finished running.
The launch() builder also creates a coroutine, but any code that comes after it continues right away, without waiting for that coroutine to finish.

Well, it’s been fun considering the similarities between coroutines and tag-team wrestlers, but Rusty has some big projects coming up, and he needs to find more efficient ways to complete his projects! So let’s see what he’s up to…

Two Things at a Time…

After watching the rest of the wrestling event, Rusty went to bed. As he tried to fall asleep, he wondered about how he could make his construction projects more efficient. He thought, “What if our tasks could be more like tag-team wrestlers? What if Bot-1 could get started on one task, and then that task could ‘tag out’ and let Bot-1 work on another task for a while… and then that task could also be paused, and Bot-1 could return to the first task…?”

The next day, he decided to test it out. He received a work order to construct another building. “This time,” he said, “whenever we order the supplies, instead of just waiting for the delivery, let’s start working on the next available task.” So, Bot-1 got to work. Here’s how the day went.

First he called the warehouse to order the windows.
Without waiting for them, he immediately called the other warehouse to order the doors.
Then, he got to work on the bricks right away. While he was still working on the bricks, the delivery truck dropped off the doors. Soon after he laid the last brick, the second truck arrived and dropped off the windows.
Bot-1 installed the windows.
Finally, he installed the doors.

Rusty was much happier after this project was done. By doing some work, putting it down, working on something else, and eventually coming back to the first thing, they were able to save lots of time. Bot-1 spent much less time sitting on the curb!

Modeling the Construction Site

Now that we’ve been introduced to the concepts of coroutines, coroutine builders, suspending functions, and suspension points, let’s bring our knowledge back to Rusty and Bot-1. Back in Listing 20.2, we introduced a function to order supplies such as windows and doors. Let’s convert it to a suspending function, and replace Thread.sleep() with a suspending function from the coroutines library, named delay().

import kotlinx.coroutines.delay

suspend fun order(item: Product): Product {
    println("ORDER EN ROUTE  >>> The ${item.description} are on the way!")
    delay(item.deliveryTime)
    println("ORDER DELIVERED >>> Your ${item.description} have arrived.")
    return item
}

Just like with Thread.sleep(), the delay() function accepts a Long argument that tells it how long to delay. So what are the differences between Thread.sleep() and delay()?

Thread.sleep() does not suspend a coroutine. Instead, it simply blocks the thread for the designated amount of time. Since it doesn’t suspend, it doesn’t give another coroutine on that thread a chance to run in the meantime. This function can be called from either a regular function or a suspending function.
delay() does suspend the coroutine. This means the coroutine can set down its work for the designated amount of time, allowing some other coroutine to run in the meantime. This function can only be called from within a suspending function.
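
The difference shows up clearly when we time it. In this sketch, two coroutines share the single thread used by runBlocking(). The helper names timeWithSleep() and timeWithDelay(), and the 300 ms duration, are assumptions chosen just for this illustration.

```kotlin
import kotlin.system.measureTimeMillis
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: two coroutines that each BLOCK the shared thread for 300 ms.
fun timeWithSleep(): Long = measureTimeMillis {
    runBlocking {
        launch { Thread.sleep(300) }
        launch { Thread.sleep(300) } // cannot start until the first sleep is over
    }
}

// Hypothetical helper: two coroutines that each SUSPEND for 300 ms.
fun timeWithDelay(): Long = measureTimeMillis {
    runBlocking {
        launch { delay(300) }
        launch { delay(300) } // starts waiting while the first one is still suspended
    }
}

fun main() {
    println("Thread.sleep(): about ${timeWithSleep()} ms")
    println("delay():        about ${timeWithDelay()} ms")
}
```

With Thread.sleep(), the two waits happen one after the other, so the total should be at least 600 ms. With delay(), the waits overlap, so the total should be much closer to 300 ms.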

By the way: What about `perform()`?

Back in Listing 20.2, we added a second function, named perform(). Both order() and perform() included a call to Thread.sleep(). Although we just updated order() to use delay(), we did not update perform() to do the same. Why not?

The order() function is a metaphor for requesting an external resource - like loading a file from the file system or requesting a JSON document from a REST service.
The perform() function, on the other hand, is a metaphor representing work that keeps the CPU busy, such as transforming large data sets or running complex calculations.

Input/output operations (like the one simulated by order()) can happen in the background, without blocking execution. Then, once the operation is finished, our code can continue with the result. So, these operations are best simulated with a suspending function like delay().

CPU-heavy operations, however, keep the processor busy with whatever it’s calculating, blocking other work from happening. So, these operations are best simulated with a blocking operation like Thread.sleep() (…although if you’re feeling inspired, you could give your perform() function some busy-work instead, such as figuring out some quantity of prime numbers).
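
If you’d like to try the busy-work idea, here is one possible sketch. The countPrimes() helper and the limit of 20,000 are made up for this illustration, and how long it takes will depend on your computer, so it will not match the one-second timing used in the rest of this chapter.

```kotlin
// Hypothetical helper: counts the primes from 2 up to `limit` using simple trial division.
fun countPrimes(limit: Int): Int =
    (2..limit).count { candidate ->
        (2 until candidate).none { divisor -> candidate % divisor == 0 }
    }

fun perform(taskName: String) {
    println("STARTING TASK   >>> $taskName")
    countPrimes(20_000) // deliberately inefficient work that keeps the CPU busy
    println("FINISHED TASK   >>> $taskName")
}

fun main() {
    perform("laying bricks")
}
```

Like Thread.sleep(), this version of perform() never suspends, so no other coroutine on the same thread gets a turn until the counting is done.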

Just like with regular functions, suspending functions can return a value, as the two functions above are doing. There are many Kotlin libraries that use suspending functions to return something important. For example, the Ktor client library mentioned earlier includes suspending functions that return a response from a Web service.

Naturally, we can also assign the result to a variable. To try this out, let’s take our code from Listing 20.3, and put it inside a runBlocking() lambda.

fun main() {
    runBlocking {
        val windows = order(Product.WINDOWS)
        val doors = order(Product.DOORS)
        perform("laying bricks")
        perform("installing ${windows.description}")
        perform("installing ${doors.description}")
    }
}

Even though this code is running in a coroutine, and even though the order() function is now a suspending function, we still get the exact same output as we had in Listing 20.3:

ORDER EN ROUTE  >>> The windows are on the way!
ORDER DELIVERED >>> Your windows have arrived.
ORDER EN ROUTE  >>> The doors are on the way!
ORDER DELIVERED >>> Your doors have arrived.
STARTING TASK   >>> laying bricks
FINISHED TASK   >>> laying bricks
STARTING TASK   >>> installing windows
FINISHED TASK   >>> installing windows
STARTING TASK   >>> installing doors
FINISHED TASK   >>> installing doors

This demonstrates that the code in a coroutine is still run top-to-bottom, just like regular Kotlin code - and this is true even when calling suspending functions instead of regular functions.

In a moment, we’ll update this so that the tasks can run concurrently. But first, let’s consider the timing of this code. The output above appears over the course of about five seconds - 750ms for delivering the doors, 1250ms for the windows, and one second for each call to perform(). (The delivery times were specified in the enum class back in Listing 20.1.)

Now, in Rusty’s most recent project, Bot-1 called one warehouse for the windows, then immediately called the other warehouse for the doors. After that, he started laying the bricks, all without waiting for the deliveries. As we’ve discovered, if we want to do things concurrently in Kotlin, we can’t just throw our code into a single coroutine - we need two or more coroutines. So let’s update our code from Listing 20.13 so that Bot-1 can lay the bricks while waiting on the deliveries.

One idea is to wrap each of our order() calls with a launch() coroutine builder, and then all of the perform() calls in another launch(). That way, each of the two order() calls will happen in its own coroutine, concurrently with the brick-laying. When we do this, though, we’ll get a compiler error.

fun main() {
    runBlocking {
        val windows = launch { order(Product.WINDOWS) }
        val doors = launch { order(Product.DOORS) }
        launch {
            perform("laying bricks")
            perform("installing ${windows.description}")
            perform("installing ${doors.description}")
        }
    }
}

Error

The reason for this error is that launch() does not return the result of order(). Instead, it returns an object whose type is Job. This Job object is helpful, as we’ll see later in this chapter, but it doesn’t give us any way to get the result of the call to order().

Instead, we’ll need to use a third kind of coroutine builder, named async(). This builder works a lot like launch(), but instead of returning a Job object, it returns an object that is a subtype of Job, named Deferred. This object gives us a function named await(), which allows us to get the result from order(). Here’s how we can use it.

fun main() {
    runBlocking {
        val windows = async { order(Product.WINDOWS) }
        val doors = async { order(Product.DOORS) }
        launch {
            perform("laying bricks")
            perform("installing ${windows.await().description}")
            perform("installing ${doors.await().description}")
        }
    }
}

In this code, after laying the bricks, we call await() on both of the Deferred objects - the deferred windows and the deferred doors. await() is a suspending function, and it will suspend the coroutine until its async() coroutine has completed.

If the windows arrived while the bricks were still being laid, then the windows will be installed as soon as the bricks are finished. If they haven’t arrived by the time the bricks are finished, then the coroutine created by launch() will be suspended until order(Product.WINDOWS) has completed.

Here’s the output we get from running Listing 20.15 above:

ORDER EN ROUTE  >>> The windows are on the way!
ORDER EN ROUTE  >>> The doors are on the way!
STARTING TASK   >>> laying bricks
FINISHED TASK   >>> laying bricks
ORDER DELIVERED >>> Your doors have arrived.
ORDER DELIVERED >>> Your windows have arrived.
STARTING TASK   >>> installing windows
FINISHED TASK   >>> installing windows
STARTING TASK   >>> installing doors
FINISHED TASK   >>> installing doors

Just like Rusty, by making these changes, we end up saving quite a bit of time! Instead of taking 5 seconds, this code now takes only 3.25 seconds to complete.
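
To check that figure yourself, here is a self-contained sketch that repeats the code from this section and wraps it in measureTimeMillis() from the Kotlin standard library. The runConcurrentProject() helper is a name made up for this illustration.

```kotlin
import kotlin.system.measureTimeMillis
import kotlinx.coroutines.async
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

enum class Product(val description: String, val deliveryTime: Long) {
    DOORS("doors", 750),
    WINDOWS("windows", 1_250)
}

suspend fun order(item: Product): Product {
    println("ORDER EN ROUTE  >>> The ${item.description} are on the way!")
    delay(item.deliveryTime) // suspends, so other coroutines can run while we wait
    println("ORDER DELIVERED >>> Your ${item.description} have arrived.")
    return item
}

fun perform(taskName: String) {
    println("STARTING TASK   >>> $taskName")
    Thread.sleep(1_000) // blocks, standing in for CPU-heavy work
    println("FINISHED TASK   >>> $taskName")
}

// Hypothetical helper: runs the concurrent project and returns the elapsed milliseconds.
fun runConcurrentProject(): Long = measureTimeMillis {
    runBlocking {
        val windows = async { order(Product.WINDOWS) }
        val doors = async { order(Product.DOORS) }
        launch {
            perform("laying bricks")
            perform("installing ${windows.await().description}")
            perform("installing ${doors.await().description}")
        }
    }
}

fun main() {
    println("Project took about ${runConcurrentProject()} ms")
}
```

The bricks are finished after 1,000 ms, but the windows don’t arrive until 1,250 ms, so the installation work can’t start until then. Adding one second for each of the two installations gives 1,250 + 1,000 + 1,000 = 3,250 ms, plus a little overhead.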

With the code in Listing 20.15, we’ve got a coroutine structure that looks like this:

It’s almost time to get back to Rusty, but before we do, let’s review a few concepts from this section.

The launch() builder returns a Job object, so it’s the right choice when you do not need a result from that coroutine.
The async() builder creates a coroutine, and returns a Deferred object. You can call the await() function on this object to wait for its result.
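
The two return types can be compared directly in a short sketch. The fetchGreeting() and builderDemo() functions are made up for this illustration, and the explicit type annotations are only there to make the difference visible.

```kotlin
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.Job
import kotlinx.coroutines.async
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for any suspending function that returns a value.
suspend fun fetchGreeting(): String {
    delay(100)
    return "hello"
}

fun builderDemo(): String = runBlocking {
    // launch() hands back a Job. The String returned by fetchGreeting() is discarded.
    val job: Job = launch { fetchGreeting() }

    // async() hands back a Deferred<String>, a kind of Job that will eventually hold a String.
    val deferred: Deferred<String> = async { fetchGreeting() }

    // await() suspends until the result is ready, and then returns it.
    deferred.await()
}

fun main() {
    println(builderDemo())
}
```

There is no await() function on a plain Job, which is why swapping launch() for async() was necessary in order to get the windows and doors back from order().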

Rusty and Bot-1 saw some great gains by doing work concurrently, but they’re about to find ways to speed up their projects even more! Let’s see what they’re up to!

Two Robots, Two Things at a Time…

Later that night, Rusty relaxed at home, enjoying a car-racing event on television. As one racecar swooped into a pit stop, he marveled at the pit crew’s efficiency! Within seconds, they jacked up the car, replaced the tires with precision, and filled up the gas tank. All of this happened so quickly because the pit crew had multiple team members, all doing different things at the same time.

Rusty said to himself, “If I had multiple robots in my construction crew, I bet we could speed up our projects even more!”

So the next day, Rusty got to work creating more robots. While building them, he considered the kinds of tasks involved in his projects. Some tasks involve hard work like moving and laying bricks, but other tasks only involve phoning for supplies and watching for the delivery. So, he split up his robots into different teams.

The first team was tasked with doing the hard work and heavy lifting. He called them the “Default” Team, because they do the main construction work.
The second team was called the “IO” Team, because they handle input (I) and output (O) - or, “inbound” and “outbound” - communications with facilities that are offsite.

Each team had a foreman who was responsible for assigning tasks to the different robots at different times, depending on the need of the moment.

When the next work order came in, Rusty was excited to try out his new crew of robots!

The robots on the IO Team called the warehouses to order the windows and doors, and watched for their arrival.
Meanwhile, the Default Team got started laying the bricks.
The doors were delivered first, but the Default team wasn’t done laying the bricks. Since the bricks have to be done before installing the doors, the doors sat on the side for a little while.
Once the bricks were done, a robot started installing the doors.
Soon, the second delivery truck showed up with the windows. A second bot on the Default Team was ready to install the windows as soon as they arrived, so that the windows and doors were being installed at the same time.

This saved even more time than in the previous project!

Rusty was thrilled! Not everything could happen at once, of course. The bricks still had to be laid before the windows and doors could be installed. However, by installing windows and doors at the same time, his project was completed in record time! If we want to accomplish the same thing in our Kotlin code, we need to start by learning about threads, concurrency, and parallelism.

Multithreaded Concurrency

Anything you run on your computer - whether it’s your own Kotlin program, an application that you installed, or a service running in the background - runs on a thread in the operating system.² Since most computers these days have more than one processor core, they can process multiple threads at the same moment.

Just like with Rusty’s robots, we could use a single thread to do multiple things over a period of time, but if our computer has a multi-core processor we can also use multiple threads to do multiple things at the same moment.³

This brings up an important distinction.

When a single execution path of our code bounces back and forth between two or more tasks, those tasks are running concurrently.
When there are multiple execution paths, each running a different task at the same moment, those tasks are running in parallel.

So far, our coroutines have run code concurrently, but always on a single thread. Let’s get those coroutines running on multiple threads, so that they can run in parallel!

When we last left our code in Listing 20.15, it looked like this.

fun main() {
    runBlocking {
        val windows = async { order(Product.WINDOWS) }
        val doors = async { order(Product.DOORS) }
        launch {
            perform("laying bricks")
            perform("installing ${windows.await().description}")
            perform("installing ${doors.await().description}")
        }
    }
}

Just like each of Rusty’s teams had a foreman who assigned tasks to the robots on that team, in Kotlin, we’ve got different dispatchers that can assign coroutines to run on the threads they manage. If we want a particular dispatcher to manage a coroutine, we simply pass the dispatcher as an argument to the coroutine builder.

For example, we can assign the product-ordering tasks to the IO Team, by using Dispatchers.IO. The rest of the tasks can go to the Default Team, by using Dispatchers.Default.

fun main() {
    runBlocking {
        val windows = async(Dispatchers.IO) { order(Product.WINDOWS) }
        val doors = async(Dispatchers.IO) { order(Product.DOORS) }
        launch(Dispatchers.Default) {
            perform("laying bricks")
            perform("install ${windows.await().description}")
            perform("install ${doors.await().description}")
        }
    }
}

Previously, everything was running on a single thread - in other words, one robot was doing every task. But now, by assigning the coroutines to Dispatchers.IO and Dispatchers.Default, the work is happening on three different threads:

Two threads managed by the IO dispatcher order the products and keep an eye out for their delivery.
One thread managed by the Default dispatcher is laying the bricks, then installing the windows, and finally installing the doors.

Even though we’ve assigned the work to different teams, this work is still taking three seconds to complete. In order to speed this up even more, we need to install the windows and doors in parallel.

As usual, we need to put a coroutine builder around the code that we want to run concurrently or in parallel with our other code. Let’s wrap the last two perform() calls with another launch(). Note that, because the bricks must be laid before the windows and doors can be installed, we’re not launching these coroutines until after that work has finished.

fun main() {
    runBlocking {
        val windows = async(Dispatchers.IO) { order(Product.WINDOWS) }
        val doors = async(Dispatchers.IO) { order(Product.DOORS) }
        launch(Dispatchers.Default) {
            perform("laying bricks")
            launch { perform("install ${windows.await().description}") }
            launch { perform("install ${doors.await().description}") }
        }
    }
}

With this change, the windows and doors are installed at the same time, completing all of the work in only about 2.25 seconds!

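This code creates six different coroutines - one is created with runBlocking(), two are created with async(), and three are created with launch(). The result is a hierarchy of coroutines: the runBlocking() coroutine sits at the top, with the two async() coroutines and the outer launch() coroutine as its children, and the two inner launch() coroutines as children of the outer one.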

This works like we want, but we can also do this more efficiently with just four coroutines. To make this happen, we can use a function named withContext().

withContext(): Handing Work to Another Dispatcher

So far, we’ve separated the code that orders a product from the code that installs the product.

However, ordering a product and installing a product are closely related concepts, so it could be helpful to keep those tasks near each other in the code. For example, to keep the window-ordering code and the window-installing code near each other - yet still ensure that the correct team is responsible for each step - we don’t need to create a new coroutine. We can just use withContext().

The withContext() function allows us to switch dispatchers without launching a whole new coroutine. In other words, it’s like a robot on the IO Team placing the order and waiting for its arrival; then, once it arrives, it hands the windows over to a robot on the Default Team to do the actual installation work. Here’s how the code would look.

fun main() {
    runBlocking {
        launch(Dispatchers.IO) {
            val windows = order(Product.WINDOWS)
            withContext(Dispatchers.Default) { 
                perform("install ${windows.description}") 
            }
        }
        launch(Dispatchers.IO) {
            val doors = order(Product.DOORS)
            withContext(Dispatchers.Default) {
                perform("install ${doors.description}") 
            }
        }
        launch(Dispatchers.Default) {
            perform("laying bricks")
        }
    }
}

In this code, we call launch() to create three coroutines - one for dealing with the windows, one for dealing with the doors, and one for laying the bricks. Within the first two, the product is ordered on a thread managed by Dispatchers.IO. But, once the product has arrived, we use withContext() to change the dispatcher, so that perform() happens on a thread managed by Dispatchers.Default.

By making this change, we’ve created fewer coroutines, and the related work (e.g., ordering the windows and installing the windows) stays closer together in the code. We’ve introduced a problem, though! The windows and doors are supposed to be installed only after the bricks have been laid. When we look at the output from Listing 20.19, we’ll notice that we start installing doors before the bricks job has finished!

STARTING TASK   >>> laying bricks
ORDER EN ROUTE  >>> The windows are on the way!
ORDER EN ROUTE  >>> The doors are on the way!
ORDER DELIVERED >>> Your doors have arrived.
STARTING TASK   >>> install doors
FINISHED TASK   >>> laying bricks
ORDER DELIVERED >>> Your windows have arrived.
STARTING TASK   >>> install windows
FINISHED TASK   >>> install doors
FINISHED TASK   >>> install windows

How can we wait for the bricks job to finish before we start installing the doors and windows?

As you might remember, when we call the async() builder, it returns a Deferred object, and we can call await() on this object to suspend the coroutine until the result is ready. Similarly, the launch() builder returns a Job object, which includes a function called join(). Like await(), the join() function suspends the coroutine until the code in the launch() block has completed. Let’s rearrange our code once more. This time, we’ll make sure the bricks job has completed before installing the windows and doors.

fun main() {
    runBlocking {
        val bricksJob = launch(Dispatchers.Default) {
            perform("laying bricks")
        }
        launch(Dispatchers.IO) {
            val windows = order(Product.WINDOWS)
            bricksJob.join()
            withContext(Dispatchers.Default) { 
                perform("install ${windows.description}") 
            }
        }
        launch(Dispatchers.IO) {
            val doors = order(Product.DOORS)
            bricksJob.join()
            withContext(Dispatchers.Default) { 
                perform("install ${doors.description}") 
            }
        }
    }
}

Here, we rearranged the launch() calls so that the bricks job comes first. We assigned the result of that launch() call to a variable named bricksJob. Then, inside the remaining two launch() blocks, we call bricksJob.join() to suspend until the bricks job is complete. When we run this code, we can see that the doors and windows are not installed until the bricks have been laid.

STARTING TASK   >>> laying bricks
ORDER EN ROUTE  >>> The windows are on the way!
ORDER EN ROUTE  >>> The doors are on the way!
ORDER DELIVERED >>> Your doors have arrived.
FINISHED TASK   >>> laying bricks
STARTING TASK   >>> install doors
ORDER DELIVERED >>> Your windows have arrived.
STARTING TASK   >>> install windows
FINISHED TASK   >>> install doors
FINISHED TASK   >>> install windows

Rusty and his crew have nailed their most recent project, but in the construction world, you never know what wrenches will get thrown into the works! Let’s see what challenges they’re about to face.

Cancellations

Canceling the Entire Job

One day, Rusty and his construction crew were busy working on a project when he got a call from the client.

“Hey, Rusty, here’s the thing…” began the client, “We’re moving our entire operation to another part of town. So you know that building that you’re working on? We don’t need it any more. Just cancel the whole project.”

Well, the team was already in the middle of the job, but if the client didn’t need the building any more, it wouldn’t make sense to keep working on it. Two robots were waiting for the delivery of the windows and doors, and one was laying the bricks. Rusty ran up to each robot in turn and gave the news. The robots who were waiting for the deliveries got the message right away, so they packed up their things and got ready to go home.

Bot-3, who was laying the bricks, however, was hard at work with his headphones on, so he didn’t notice Rusty at first. After the final brick was laid, at last he looked up and saw Rusty signaling that everyone was about to head home, so he finally gathered his things and wrapped up.

Canceling Top-Level Coroutines

Just like at the construction site, sometimes a coroutine job needs to get canceled. We can call a function named cancel() inside the lambdas that we pass to the coroutine builders. Let’s update our code so that we cancel the job after all the work has begun.

fun main() {
    runBlocking {
        val bricksJob = launch(Dispatchers.Default) {
            perform("laying bricks")
        }
        launch(Dispatchers.IO) {
            val windows = order(Product.WINDOWS)
            bricksJob.join()
            withContext(Dispatchers.Default) { perform("install ${windows.description}") }
        }
        launch(Dispatchers.IO) {
            val doors = order(Product.DOORS)
            bricksJob.join()
            withContext(Dispatchers.Default) { perform("install ${doors.description}") }
        }
        cancel()
    }
}

Running this, we’ll see output that looks something like this:

STARTING TASK   >>> laying bricks
ORDER EN ROUTE  >>> The windows are on the way!
ORDER EN ROUTE  >>> The doors are on the way!
FINISHED TASK   >>> laying bricks
Exception in thread "main" kotlinx.coroutines.JobCancellationException:
BlockingCoroutine was cancelled; job=BlockingCoroutine{Cancelled}@7006c658

By calling cancel(), we got a JobCancellationException. And just like at Rusty’s work site, the bricks task was not interrupted. In a moment, we’ll learn why this happened. For now, let’s take a closer look at how cancellation works.

As we’ve seen throughout this chapter, when one coroutine launches another (which might launch another!) we end up with a hierarchy of coroutines. In Listing 20.21, the coroutine created by runBlocking() ends up creating three other coroutines, resulting in a structure with the runBlocking() coroutine at the top and the three launch() coroutines as its children.

Coroutines that are all part of the same hierarchy exist within the same coroutine scope. In fact, CoroutineScope is an actual interface, and the launch() and async() coroutine builders are extension functions on that interface, which allows them to tie their new coroutines to that CoroutineScope.
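
As a rough sketch of what that means in practice, we can write our own function the same way launch() and async() are declared. The orderAsync() and collectDeliveries() names are hypothetical, and plain strings stand in for the chapter’s Product type so that the example is self-contained.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper, declared as an extension function on CoroutineScope,
// just like the launch() and async() builders are.
fun CoroutineScope.orderAsync(product: String): Deferred<String> = async {
    delay(100) // stand-in for waiting on a delivery
    "$product delivered"
}

fun collectDeliveries(): List<String> = runBlocking {
    // Inside this lambda, `this` is a CoroutineScope, so orderAsync() can be
    // called directly, and its coroutines become children of runBlocking().
    val windows = orderAsync("windows")
    val doors = orderAsync("doors")
    listOf(windows.await(), doors.await())
}

fun main() {
    println(collectDeliveries())
}
```

Because orderAsync() needs a CoroutineScope as its receiver, every coroutine it starts is tied to whichever scope it was called on.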

By structuring coroutines into a scope, Kotlin can keep track of that scope’s coroutines and the parent-child relationships between them. That way, if work is canceled or something goes wrong, Kotlin will ensure that each coroutine is accounted for, without requiring the developer to handle these situations manually. This feature is called structured concurrency. Let’s see how structured concurrency applies when a job is canceled.

In Listing 20.21, we called cancel() inside the runBlocking() lambda - that is, the coroutine at the top of the hierarchy.

Thanks to structured concurrency, when the root coroutine is canceled, we don’t need to manually cancel each of its children. Instead, each child coroutine is automatically sent a signal to cancel. And if that child happens to have child coroutines of its own, it sends along the cancellation signal to them, as well.

However, just like Bot-3 was heads-down with his headphones on, not paying attention to Rusty, a coroutine won’t notice the cancellation signal unless it remembers to look for it.

It’s been a while, but let’s look at the code for the perform() function from Listing 20.2 again.

fun perform(taskName: String) {
    println("STARTING TASK   >>> $taskName")
    Thread.sleep(1_000)
    println("FINISHED TASK   >>> $taskName")
}

In order to simulate the time it takes to perform a task, this function is using Thread.sleep(). Since the thread running this code is busy the whole time, it never looks up to see if the job is canceled.

Instead of doing work for 1,000 milliseconds straight, let’s divide up the work into five units, so that the robot gets a break every 200 milliseconds. We’ll use a function named repeat() to loop five times, and then sleep for only one-fifth of the time on each iteration.

We’ll call yield() to take a break. This will give our robot a chance to notice if the job has been canceled. Remember, we can’t call a suspending function from a regular function. Since yield() is a suspending function, we have to also make perform() a suspending function.

suspend fun perform(taskName: String) {
    println("STARTING TASK   >>> $taskName")
    repeat(5) {
        Thread.sleep(200)
        yield()
    }
    println("FINISHED TASK   >>> $taskName")
}

Calling yield() every once in a while gives the coroutine a chance to look up and see whether the work has been canceled. Running the code again now produces output that looks like the following:

STARTING TASK   >>> laying bricks
ORDER EN ROUTE  >>> The windows are on the way!
ORDER EN ROUTE  >>> The doors are on the way!
Exception in thread "main" kotlinx.coroutines.JobCancellationException:
BlockingCoroutine was cancelled; job=BlockingCoroutine{Cancelled}@7006c658

This time, the thread that was laying the bricks had a chance to notice that the job was canceled, so it quit the work without finishing it.

This demonstrates that cancellation in coroutines is cooperative. A coroutine that’s hard at work (laying bricks or running computations) will not notice a cancellation unless it takes an occasional break from its work. If its code doesn’t cooperate by choosing to check for cancellation, it won’t notice the signal. We can check for cancellation by calling yield(), as we’re doing in Listing 20.23 above. Alternatively, we can check the isActive property of the CoroutineScope, or call its ensureActive() function.
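
Here is one way the ensureActive() alternative might look. This is a sketch, not one of the chapter’s listings: performChecked() is a hypothetical variation on perform(), and the unitsDone counter exists only so that we can see how far the work got before it was canceled.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical variation on perform(): checks for cancellation with
// ensureActive() instead of yield().
suspend fun performChecked(taskName: String, unitsDone: AtomicInteger) {
    println("STARTING TASK   >>> $taskName")
    repeat(5) {
        Thread.sleep(200)           // simulate one unit of blocking work
        unitsDone.incrementAndGet() // record that the unit finished
        // Throws CancellationException if this coroutine has been canceled.
        currentCoroutineContext().ensureActive()
    }
    println("FINISHED TASK   >>> $taskName")
}

// Starts the task, cancels it partway through, and reports how many
// units of work were completed.
fun unitsCompletedBeforeCancel(): Int = runBlocking {
    val unitsDone = AtomicInteger(0)
    val job = launch(Dispatchers.Default) {
        performChecked("laying bricks", unitsDone)
    }
    delay(300)          // long enough for the first unit of work to finish
    job.cancelAndJoin() // cancel the job, then wait for it to wind down
    unitsDone.get()
}

fun main() {
    println("Units completed: ${unitsCompletedBeforeCancel()}")
}
```

Unlike yield(), ensureActive() only checks for cancellation and never hands the thread to another coroutine. The isActive property works similarly, except that it returns a Boolean instead of throwing, so it suits a loop condition.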

The good news is that many suspending functions in real projects - such as those in the coroutines library, in Ktor, and so on - are written in a way where they’ll notice the cancellation. But if you’ve got a coroutine that’s doing some heavy lifting, be sure it has a chance to suspend once in a while, so that it can quickly respond to cancellations without doing unnecessary work.

Not all cancellations have to affect the entire job, though, as Rusty and his crew are about to find out!

Canceling Part of a Job

One day as the crew was hard at work constructing another building, the client called, saying “I know you’ve already started installing the doors, but we decided we want a more open-space feeling. So, don’t bother installing the doors. I still want the building - just without the doors.”

This time, instead of telling all of the robots to stop working, Rusty went directly to the robot who was waiting on the door delivery, and gave him the cancellation signal. The windows were still installed, and the building was completed successfully without the doors.

[Illustration: The IO-Bot waiting for the door delivery notices Rusty’s signal.]

Canceling a Child Coroutine

As we saw earlier, because of structured concurrency, when you cancel a coroutine, that coroutine is canceled along with all of its children. However, the cancellation does not affect parent or sibling coroutines. To demonstrate this, let’s cancel the door job, like Rusty’s client did.

fun main() {
    runBlocking {
        val bricksJob = launch(Dispatchers.Default) {
            perform("laying bricks")
        }
        launch(Dispatchers.IO) {
            val windows = order(Product.WINDOWS)
            bricksJob.join()
            withContext(Dispatchers.Default) { perform("install ${windows.description}") }
        }
        launch(Dispatchers.IO) {
            val doors = order(Product.DOORS)
            bricksJob.join()
            cancel()
            withContext(Dispatchers.Default) { perform("install ${doors.description}") }
        }
    }
}

When we run this, we get output that looks like this:

STARTING TASK   >>> laying bricks
ORDER EN ROUTE  >>> The windows are on the way!
ORDER EN ROUTE  >>> The doors are on the way!
ORDER DELIVERED >>> Your doors have arrived.
FINISHED TASK   >>> laying bricks
ORDER DELIVERED >>> Your windows have arrived.
STARTING TASK   >>> install windows
FINISHED TASK   >>> install windows

This works exactly as we hoped it would! Notice that the doors arrived, but they were never installed. Everything else continued according to plan - the bricks were laid, and the windows were installed. So, when we cancel a coroutine, that coroutine itself is canceled, and if it has any children, they are also canceled. But parent and sibling coroutines are unaffected.

Cancellation isn’t the only surprise that can affect a job, though, as Rusty and his crew are about to find out!

By the way: Canceling from Outside

In the code above, the coroutine canceled itself by calling cancel() inside its own lambda. However, it’s also possible to cancel a coroutine from outside of its body. As we’ve seen, when we create a coroutine with launch() or async(), we get a Job or Deferred object back. We can call cancel() on that object, as well.

val job = launch { perform("laying bricks") }
job.cancel()
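
For a runnable variation on that snippet, the sketch below cancels only the door job from outside and then reports which jobs ended up canceled. The function and variable names are hypothetical, and delay() stands in for the real work. It uses cancelAndJoin(), which calls cancel() and then join().

```kotlin
import kotlinx.coroutines.*

// Cancels only the door job from outside, then reports which jobs were canceled.
fun cancelDoorJobFromOutside(): Map<String, Boolean> = runBlocking {
    val windowJob = launch { delay(200) } // sibling of the door job

    lateinit var installJob: Job          // will be a child of the door job
    val doorJob = launch {
        installJob = launch { delay(10_000) }
        delay(10_000)
    }

    delay(50)               // give the door job time to start its child
    doorJob.cancelAndJoin() // cancel from outside and wait for it to finish
    windowJob.join()        // the sibling keeps running to completion

    mapOf(
        "door" to doorJob.isCancelled,
        "install" to installJob.isCancelled,
        "window" to windowJob.isCancelled
    )
}

fun main() {
    println(cancelDoorJobFromOutside())
}
```

The door job and its child report true, while the sibling window job reports false, matching the rule that cancellation flows down to children but not across to siblings.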

When We Can’t Recover From a Problem

One day, the construction crew was back at it, working on one more construction project. While the bricks were being laid, one of the robots on the IO team called to request the delivery of the doors. However, the warehouse gave him some surprising news.

“Sorry, but we can’t deliver those doors to you. Your client has exceeded his budget limits, and he can’t afford any more doors,” said the voice on the other end of the phone.

Without the doors, and indeed, without any more money, the project simply couldn’t continue. The robot canceled his work, then ran up to Rusty to inform him of the situation. “Well, it looks like we’re going to have to abandon this project,” said Rusty, who then walked over to the rest of his crew and signaled for them to stop their work.

Exceptions in Coroutines

Sometimes there are unrecoverable problems. In Chapter 17, we learned all about exceptions, and how an exception that isn’t caught will eventually work its way to the beginning of the call stack, and cause the entire app to crash. There’s a similar situation with coroutines, but it also involves canceling work that’s in progress. To demonstrate this, let’s throw an exception inside the coroutine that is ordering doors.

fun main() {
    runBlocking {
        val bricksJob = launch(Dispatchers.Default) {
            perform("laying bricks")
        }
        launch(Dispatchers.IO) {
            val windows = order(Product.WINDOWS)
            bricksJob.join()
            withContext(Dispatchers.Default) { perform("install ${windows.description}") }
        }
        launch(Dispatchers.IO) {
            val doors = order(Product.DOORS)
            throw Exception("Out of money!")
            bricksJob.join()
            withContext(Dispatchers.Default) { perform("install ${doors.description}") }
        }
    }
}

When running this, we’ll see output that looks something like this:

STARTING TASK   >>> laying bricks
ORDER EN ROUTE  >>> The doors are on the way!
ORDER EN ROUTE  >>> The windows are on the way!
ORDER DELIVERED >>> Your doors have arrived.
Exception in thread "main" java.lang.Exception: Out of money!

Although the work was started for each of the three jobs, none of them finished, because of the exception that was thrown from inside the door job.

By default, an uncaught exception inside a coroutine will affect all of the coroutines within its scope:

The coroutine with the uncaught exception will cancel all of its children.
Then, it hands the exception up to its parent, which in turn cancels all of its other children, which in turn cancel any children that they have, and so on.
This process continues until the exception reaches the top of the coroutine hierarchy.

This behavior - canceling children and propagating exceptions upward throughout the coroutine scope - is another feature of structured concurrency. As with cancellation, it removes a lot of manual work that we would otherwise have to do ourselves to ensure that all coroutines are properly shut down.
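
The sketch below shows those steps from the outside. It is an illustration under a few assumptions: OutOfMoneyException and runProject() are hypothetical names, delay() stands in for the real work, and a try/catch around runBlocking() shows where the exception finally arrives.

```kotlin
import kotlinx.coroutines.*

// Hypothetical exception type, used in place of the plain Exception from the listing.
class OutOfMoneyException(message: String) : Exception(message)

fun runProject(log: MutableList<String>): String = try {
    runBlocking {
        launch {
            try {
                delay(5_000) // stand-in for ordering and installing windows
                log += "windows installed"
            } catch (e: CancellationException) {
                log += "window job canceled"
                throw e // rethrow so that cancellation can finish normally
            }
        }
        launch {
            delay(100)
            throw OutOfMoneyException("Out of money!")
        }
    }
    "project completed"
} catch (e: OutOfMoneyException) {
    // The exception traveled from the child up through runBlocking().
    "project failed: ${e.message}"
}

fun main() {
    val log = mutableListOf<String>()
    println(runProject(log))
    println(log)
}
```

The window job never throws anything itself, yet it is canceled because its sibling failed, and the original exception surfaces at the top of the hierarchy.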

By the way: Supervisors

This default behavior is exactly what we want most of the time, but there are occasions when you might want to limit the blast radius of an exception. If you want to keep the sibling and parent coroutines running when an exception happens, you can use SupervisorJob or supervisorScope(). More information about supervisors can be found in the official Kotlin documentation.
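
As a minimal sketch of the supervisorScope() option, the example below lets the door job fail while the window job carries on. The runSupervised() name and the log messages are made up for illustration. A CoroutineExceptionHandler is passed to the failing coroutine so that its exception is recorded rather than left uncaught.

```kotlin
import kotlinx.coroutines.*

fun runSupervised(): List<String> = runBlocking {
    val log = mutableListOf<String>()

    // Receives the exception from the failed child. Without a handler, the
    // exception would go to the thread's uncaught exception handler instead.
    val handler = CoroutineExceptionHandler { _, e ->
        log += "door job failed: ${e.message}"
    }

    supervisorScope {
        launch(handler) {
            throw Exception("Out of money!")
        }
        launch {
            delay(100) // stand-in for installing the windows
            log += "windows installed"
        }
    }
    log
}

fun main() {
    println(runSupervised())
}
```

Inside supervisorScope(), a failing child does not cancel its siblings or its parent, so the window job still finishes.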

Farewell, Rusty & Company!

As the sun set over the newly-constructed skyline, Rusty stood admiring the work that he and his robot construction crew had accomplished. The latest building looked great and was completed in record time. He thought back a few days to when Bot-1 was inefficiently waiting on the curb for a delivery. How far they had come! By working their tasks concurrently and in parallel, they constructed the buildings in less time, taking their client satisfaction scores to an all-time high.

As the robots powered down, Rusty dozed off for the night, dreaming about expanding his robot crew enough to one day build a skyscraper!

Summary

Most software projects these days need some amount of concurrency, and in this chapter, we learned:

How coroutines create concurrency, with an execution path that bounces back and forth between functions - like tag-team wrestlers.
How to use coroutines to do more than one thing at a time - like laying bricks while awaiting deliveries.
How to use coroutines to do more than one thing at the same moment, in parallel.
How to use withContext() to hand work off to another dispatcher.
How structured concurrency affects cancellation of coroutines in a hierarchy.
How structured concurrency affects exceptions that are thrown within a coroutine.

Congratulations on working your way through this chapter! Coroutines can be a challenging topic for many developers, but with these essentials under your belt, you’ll be prepared to use them with confidence!

¹ Many Kotlin developers find it a helpful generalization to think of a suspension point as any call to a suspending function, but that’s not entirely precise. It’s possible to write a suspending function that doesn’t actually suspend. Down in the lowest depths of the coroutine machinery, there’s a function called suspendCoroutineUninterceptedOrReturn(). A suspension point is technically when this function is called with a lambda that returns a special value called COROUTINE_SUSPENDED. When functions like delay() or yield() are called, they might call another function or two, but somewhere inside the call stack, it will result in a call to suspendCoroutineUninterceptedOrReturn(), which is the actual suspension point.
² In most operating systems, each thread is also owned by a process, which contains resources and other context. So, a process can have one or more threads. For more information about this topic, consider reading Grokking Concurrency by Kirill Bobrov (2024), published by Manning Publications.
³ Even if your computer has a single core, you typically can still use multiple threads. The work of these threads will get sliced up so that each thread gets a little time on the processor. In fact, it works a lot like the coroutines we’ve created so far, except that instead of choosing to tag out (cooperative multitasking), the thread is kicked out at some point - it’s like the referee blows a whistle and kicks the wrestler out of the ring, and calls him back in later (preemptive multitasking).