# Inside Sequences: Create Your Own Sequence Operations

Sequences are a fantastic way to process collections of data in a way that can perform better than the standard collection operations, as we saw in the previous article, Kotlin Sequences: An Illustrated Guide. Today, we’re going even deeper! We’re going to look under the hood - inside Kotlin sequences - in order to understand how they work like never before!

In fact, by the end of this article, we’ll even create our very own `Sequence` class and plug it into an operation chain!

If you’re new to sequences, you’ll definitely want to check out that previous article first, as it explains what sequences are, and provides some simple illustrations and diagrams to get you up to speed quickly.

Ready to go? All right, let’s do this!

# Sequences, Iterables, and Iterators

As you recall from the previous article, the magic that can give sequences a performance edge over normal collection operations is the order in which the individual operations are applied.

• In collection operation chains, each operation is applied to all elements before moving onto the next operation. In other words, one operation at a time.
• In sequence operation chains, each element receives all applicable operations before moving onto the next element. In other words, one element at a time.

Here’s the diagram from last time, which shows this difference visually:

The key takeaway here is that Sequences work on one item at a time.

One item at a time…

That sure sounds a lot like an `Iterable`, doesn’t it? And in fact, if you were to look at the definitions of `Iterable` and `Sequence` side-by-side, you would hardly notice a difference!

 ``````public interface Iterable { public operator fun iterator(): Iterator } `````` ``````public interface Sequence { public operator fun iterator(): Iterator } ``````

That’s right! Other than their names, their definitions are identical! They both simply contain an `iterator()` function, which, when paired with the `operator` keyword, makes it so that instances of these types can be used as the subject of a `for` loop.

So, `Iterable` and `Sequence` look quite similar.

But…

Whereas subtypes of `Iterable` are often named for the way the items are structured (e.g., an `ArrayList` is backed by an array) or the way the items relate to each other (e.g., a `List` is ordered, items in a `Set` are distinct, etc.), subtypes of `Sequence` are named for the operations that they perform on those items.

For example, in order to perform a `filter` operation, there’s a sequence class called `FilteringSequence`. And in order to perform a `map` operation, there’s a sequence class called `TransformingSequence`.

By wrapping one kind of `Sequence` instance in another, we can compose operations that can be applied to each item of the `Iterator` in turn.

For example, in the last article, we modeled a sequence that processed crayons. Here’s what it looked like:

``````val fireSet = BigBoxOfCrayons
.asSequence()
.filter { it.color in includedColors }
.map { it.copy(label = "New!") }
.take(5)
.toSet()
``````

As we call each operation like `filter()`, `map`, and `take()`, we’re really just wrapping one sequence instance in another. To help visualize this, let’s walk through the operation chain one step at a time, and see what it looks like!

First, we call `asSequence`, which creates a sequence by using `BigBoxOfCrayons`’s iterator as the source data. This gives us a single `Sequence` instance:

 ``````val fireSet = BigBoxOfCrayons .asSequence() .filter { it.color in includedColors } .map { it.copy(label = "New!") } .take(5)``````

Next, we call `.filter()`. This wraps our last `Sequence` with a new instance of `FilteringSequence`:

 ``````val fireSet = BigBoxOfCrayons .asSequence() .filter { it.color in includedColors } .map { it.copy(label = "New!") } .take(5)``````

Then, we call `.map()`. This wraps those two sequences in a third, called `TransformingSequence`:

 ``````val fireSet = BigBoxOfCrayons .asSequence() .filter { it.color in includedColors } .map { it.copy(label = "New!") } .take(5)``````

Finally, we call `take()`, which wraps those three sequence instances in a fourth, called `TakeSequence`:

 ``````val fireSet = BigBoxOfCrayons .asSequence() .filter { it.color in includedColors } .map { it.copy(label = "New!") } .take(5) ``````

As you can see, every time we call a new operation, all we’re really doing is wrapping the previous sequence in another sequence!

In fact, the exact same operation chain above could be built by just calling the constructors for each sequence subtype instead…

``````TakeSequence(
TransformingSequence(
FilteringSequence(
BigBoxOfCrayons.asSequence(),
true,
{ it.color in includedColors }
),
{ it.copy(label = "New!") }
),
5
)
``````

except that these classes are all marked `internal` in the standard library, so you can’t actually call their constructors. But c’mon, who wants to call a bunch of constructors when that dot notation (`.filter { }`) looks sooo much nicer?!

Now, if each operation is just wrapping more and more sequence instances, when do the operations actually execute? In other words, when does the filtering, mapping, and taking actually happen?

## Intermediate and Terminal Operations

Sequence operation chains have two phases:

1. The building phase, where the sequence is built by wrapping a sequence in another sequence, probably multiple times.
2. The execution phase, where the `iterator()` of the `Sequence` is finally traversed.

If the building phase is like setting up a row of standing dominoes, then the execution phase is where you tip the first one over and watch it all play out.

Let’s look at our code from earlier, and see which calls are a part of which phase:

``````val fireSet = BigBoxOfCrayons
.asSequence()                            // Building...
.filter { it.color in includedColors }   // Building...
.map { it.copy(label = "New!") }         // Building...
.take(5)                                 // Building...
.toSet()                                 // Executing!
``````

In this code, the building phase consists of the first four operations that we perform the `BigBoxOfCrayons` - `asSequence()`, `filter()`, `map()`, and `take()`. We’ve wrapped the sequence a few times, but haven’t actually looped over the items yet.

Then, when we call `toSet()`, the iterator is finally kicked off, and the whole chain of operations gets going!

• The functions that build out the sequence chain are called intermediate operations.
• The function that executes the sequence chain is called a terminal operation.

As the name implies, once you invoke a terminal operation, you can’t build that sequence any further - that operation chain is done.

It’s worth noting, though, that in many cases you can store off the `Sequence` to a variable, and fire off multiple terminal operations on it.

``````    val fireSequence = BigBoxOfCrayons
.asSequence()                            // Building...
.filter { it.color in includedColors }   // Building...
.map { it.copy(label = "New!") }         // Building...
.take(5)                                 // Building...

val fireSet = fireSequence.toSet()           // Executing!
val crayonCount = fireSequence.count()       // Executing!
``````

### How can you tell the difference between an intermediate and terminal operation?

As a general rule, you can recognize an intermediate operation because it returns another `Sequence`.

The terminal operations, on the other hand, generally return something that’s not a `Sequence`, such as:

• Some other kind of collection, like a `List` or `Set`.
• A single value from the sequence, such as the first `String` from a `Sequence<String>`.
• A primitive class like `Boolean`, `Int`, `Double`, and so on. Often these are the result of an aggregate operation, like `sum()` or `average()`.

The distinction between intermediate and terminal operations is important to keep in mind, because it can mean the difference between whether your code executes or not. For example, what will the following code print?

``````sequenceOf("Hello", "Kotlin", "World")
.onEach { println("I spy: \$it") }
``````

Nothing at all!

Why?

Because `onEach()` is an intermediate operation. Since we never finished it off with a terminal operation, the iterator was never traversed, and the `println()` function was never invoked!

To get it to actually call `println()`, we’d have to cap it off with a terminal operation, such as `toSet()`:

``````sequenceOf("Hello", "Kotlin", "World")
.onEach { println("I spy: \$it") }
.toSet()
``````

But more likely, you didn’t want to use the intermediate operation `onEach()`; you wanted to use the terminal operation `forEach()`:

``````sequenceOf("Hello", "Kotlin", "World")
.forEach { println("I spy: \$it") }
``````

# Creating Our Own Sequence

Now that we’ve got a good grasp on what sequence classes are, and the difference between intermediate and terminal operations, we can use that knowledge to create our very own sequence operation!

Instead of using `onEach { println("I spy: \$it") }` as we did above, let’s create our very own custom sequence operation that can print out the element for us!

When we’re done, we’ll be able to write this:

``````sequenceOf("Hello", "Kotlin", "World")
.spy()
.toSet()``````

… which will print each item to the console.

## Step 1: Implementing the `Sequence` Interface

Naturally, if we want to make our own sequence, we’ll need to start with a class that implements the `Sequence` interface. In order to do that, we also need to include the `iterator()` function. As a placeholder for the implementation of that function, let’s start with just `TODO()`:

``````class SpyingSequence<T> : Sequence<T> {
override fun iterator() = TODO()
}
``````

## Step 2: Wrapping Another Sequence

Now, we know that sequence classes wrap other sequences, so let’s go ahead and pass the underlying sequence into the constructor. Remember - the underlying sequence is just the previous sequence in the operation chain.

``````class SpyingSequence<T>(private val underlyingSequence: Sequence<T>) : Sequence<T> {
override fun iterator() = TODO()
}``````

When one sequence wraps another, the iterator also has to wrap the underlying sequence’s iterator. As a first pass at that, we can just do some simple delegation like this:

``````class SpyingSequence<T>(private val underlyingSequence: Sequence<T>) : Sequence<T> {
override fun iterator() = underlyingSequence.iterator()
}``````

## Step 3: Adding the Functionality

The code above compiles, and would simply serve as a no-op. That is, it’s got “no operation”; it just a passes through to the iterator of the underlying sequence. We want to spy on the elements, so we’ll need to make a custom iterator where we can put our `println()`.

Let’s take a small step in that direction by making a custom `Iterator` that delegates to the underlying iterator for each of its two functions - `hasNext()` and `next()`:

``````class SpyingSequence<T>(private val underlyingSequence: Sequence<T>) : Sequence<T> {
override fun iterator() = object : Iterator<T> {
val underlyingIterator = underlyingSequence.iterator()

override fun hasNext() = underlyingIterator.hasNext()
override fun next(): T = underlyingIterator.next()
}
}``````

We’re almost there! All that’s left to do is to print out the item. We’ll expand the `next()` function a little bit. Instead of just returning the result of the underlying `iterator.next()` call, we’ll print out the item first.

``````class SpyingSequence<T>(private val underlyingSequence: Sequence<T>) : Sequence<T> {
override fun iterator() = object : Iterator<T> {
val iterator = underlyingSequence.iterator()

override fun hasNext() = iterator.hasNext()

override fun next(): T {
val item = iterator.next()
println("I spy: \$item")
return item
}
}
}``````

And now our custom `Sequence` class is done!

## Step 4: Creating the Extension Function

But, in order to use it from the middle of a sequence chain, we’ve got one more step: we need to create an extension function!

``````fun <T> Sequence<T>.spy() = SpyingSequence(this)
``````

We’re done! Now we can include it like any other sequence operation!

``````val fireSet = BigBoxOfCrayons
.asSequence()
.filter { it.color in includedColors }
.map { it.copy(label = "New!") }
.spy()  // <---- here's our sequence!
.take(5)
.toSet()``````

This is, in fact, exactly how sequences are included in Kotlin’s standard library! They’re included by:

1. Creating a class that implements `Sequence`, and then…
2. Creating an extension function that calls that class’ constructor.

There are dozens of intermediate sequence operations included in the standard library - `.filter()`, `.map()`, `withIndex()`, `distinct()`, and `chunked()`, just to name a few. So you might expect that this could result in a lot of different `Sequence` classes. But in fact, just a handful of well-crafted, dynamic `Sequence` classes power almost all of those different operations!

Here are a few examples. Notice how some of the same sequence classes are used for multiple operations.

Sequence Extension Function Sequence Class(es)
`.map()` `TransformingSequence()`
`.mapIndexed()` `TransformingIndexedSequence()`
`.mapIndexedNotNull()` `FilteringSequence(TransformingIndexedSequence())`
`.filter()` `FilteringSequence()`
`.filterNot()` `FilteringSequence()`
`.filterNotNull()` `FilteringSequence()`
`.filterIsInstance()` `FilteringSequence()`
`.filterIndexed()` `TransformingSequence(FilteringSequence(IndexingSequence()))`

Different extension functions can invoke the constructor of the same sequence class with different arguments to achieve different results, and some extension functions combine multiple sequence classes.

# Summary

Congratulations! You’ve gone way beyond just understanding what Kotlin’s sequences are - you’ve now got a solid understanding of how Kotlin’s sequences work on the inside! You’ve seen how sequence classes wrap other sequence instances, and understand the difference between intermediate and terminal operations.

Before we finish up this series, though, we still need to answer the big question - “how do I know when I should use sequences instead of normal collection operations?” Performance is always a complex issue, and for that reason, we’re going to cover this in the third and final article in this series.

And for more fun, you can also check out the episode of the Talking Kotlin podcast where Hadi Hariri and I discuss sequences!

Thanks to Kartik Gujarati, Mitch Ware, and Krzysztof Zienkiewicz for reviewing this article.