Generators landed back in PHP 5.5 and I’ve mostly ignored them. I had a vague understanding that they were a feature that allowed you to build iterators that didn’t require loading up a huge data structure with all your information. This also seemed to be the gist of most online generator tutorials. So, in the practical world of business programming where jamming everything into a giant PHP array is usually good enough, there wasn’t much of a need to understand generators.
So imagine my surprise when I discovered that generators are actually an alternative to linear code flow. Or maybe you don’t need to imagine any surprise and are thinking
Alternative to linear code flow — what does that even mean?
Today we’re going to cover generators “from scratch”. By the end of this article you should be able to reason about any generator function in PHP and understand the flow of code when a generator is invoked.
Generator Functions
For now, pretend we didn’t tell you that generators are for building iterators.
New Definition: Generators are a special type of function in PHP that always returns a Generator object. Generator function definitions are similar to regular function definitions, with one exception. Instead of using a return keyword, they use a yield keyword. Here’s a simple example program that demonstrates this.
#File: generator-example.php
<?php
function myGeneratorFunction()
{
    yield;
}
$returnValue = myGeneratorFunction();
echo get_class($returnValue),"\n";
This program defines a function named myGeneratorFunction. This function doesn’t have a return statement, but it does include the keyword yield. (We’re not quite ready to explain what yield does, but if you like to read ahead, it’s similar to the return keyword. We’ll get to those details momentarily.)
Next, we call myGeneratorFunction and assign its return value to a variable named (appropriately) $returnValue. Finally, we pass $returnValue into the get_class function and echo the output.
If you aren’t familiar with generators, you might expect $returnValue to contain the value null. After all, myGeneratorFunction didn’t return anything. However, if you run the above program, you’ll see our function returned an object instantiated from the built-in Generator class.
$ php generator-example.php
Generator
Although our function is defined with the regular old function keyword, PHP’s internals treat it differently because the function includes the yield keyword. PHP will always treat a function that includes the yield keyword as a generator function, and a generator function will always return a Generator object.
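One way to convince yourself of that rule is a function whose yield can never be reached. The sketch below is only an illustration (neverYields is a made-up name, not something from PHP itself): the mere presence of yield in the body is enough for PHP to hand back a Generator object, even though that generator has nothing to iterate over.

```php
<?php
// Hypothetical function for illustration: the `yield` below can never
// be reached, but its presence still makes this a generator function.
function neverYields()
{
    return;
    yield;
}

$result = neverYields();
var_dump($result instanceof Generator); // bool(true)
var_dump($result->valid());             // bool(false): nothing to iterate
```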
Yield and Program Flow
Generator objects are PHP iterators. If you haven’t used iterators before, they are (from one point of view) objects that let you loop over a set of values. This sample program demonstrates the built-in array iterator.
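#File: generator-example.php
<?php
$values = [1,2,3,4,5];
// using foreach
foreach($values as $number) {
    echo $number, "\n";
}
// using an iterator
$iterator = new ArrayIterator($values);
// valid() reports whether there is a current element, so the loop
// also works when the data contains falsy values such as 0
while($iterator->valid()) {
    $number = $iterator->current();
    echo $number, "\n";
    $iterator->next();
}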
In PHP an array iterator is a bit more verbose than a foreach statement. Syntactic sugar is popular when it creates less code, so iterators aren’t often used in day-to-day PHP code.
However, PHP also includes a special built-in Iterator interface. This interface allows end-user-programmers (you!) to define their own objects with rules for how a set of data is traversed, and you can use these objects as an iterator or directly in PHP’s foreach loops. If you’ve ever used a collection object in a framework like Magento, under the hood these collections all implement one of PHP’s built-in iterator interfaces, such as Iterator or IteratorAggregate.
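To make that concrete, here is a minimal sketch of a hand-written iterator. The WordList class is hypothetical and exists only for this illustration, and the return types on its methods assume PHP 8.0 or later. The five methods are the ones the Iterator interface requires.

```php
<?php
// Hypothetical class for illustration: walks a list of words in order.
class WordList implements Iterator
{
    private array $words;
    private int $position = 0;

    public function __construct(array $words)
    {
        $this->words = array_values($words); // re-index from 0
    }

    // Move back to the first element
    public function rewind(): void
    {
        $this->position = 0;
    }

    // Is there an element at the current position?
    public function valid(): bool
    {
        return $this->position < count($this->words);
    }

    // The value at the current position
    public function current(): mixed
    {
        return $this->words[$this->position];
    }

    // The key at the current position
    public function key(): mixed
    {
        return $this->position;
    }

    // Advance to the next position
    public function next(): void
    {
        $this->position++;
    }
}

$words = new WordList(['alpha', 'beta', 'gamma']);
foreach ($words as $index => $word) {
    echo $index, ': ', $word, "\n";
}
```

That is a fair amount of boilerplate for a simple traversal, which is part of why the generator approach is attractive.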
Generators are another special type of iterator object. However, instead of relying on a defined class for their functionality, they rely on generator functions and the special properties of the yield keyword.
Yield as Return
In PHP, the yield keyword tells PHP to pause the current function’s execution and hand a value to the generator/iterator object. The function body starts running the first time the generator is asked for a value, for example when its current method is called, and it runs until it reaches a yield. When an end-user-programmer calls the generator object’s next method, PHP returns to the generator function and continues execution immediately after the point where it yielded.
If you’re a little confused by that, don’t worry. It involves breaking a bunch of base assumptions about how PHP code flows. This quick test program should help clear things up.
#File: generator-example.php
<?php
function myGeneratorFunction()
{
    echo "One","\n";
    yield 'first return value';
    echo "Two","\n";
    yield 'second return value';
    echo "Three","\n";
    yield 'third return value';
}
// get our Generator object (remember, all generator functions return
// a generator object, and a generator function is any function that
// uses the yield keyword)
$iterator = myGeneratorFunction();
// get the current value of the iterator
$value = $iterator->current();
// advance the iterator to its next value
// $value = $iterator->next();
// and advance it again to the value after that
// $value = $iterator->next();
The output of this first program will be
$ php generator-example.php
One
When we called current on the iterator object, PHP began executing the code in the myGeneratorFunction function, and stopped when it reached the first yield.
You probably noticed a few lines commented at the bottom of our test program. If we uncomment the first call to next
#File: generator-example.php
<?php
function myGeneratorFunction()
{
    echo "One","\n";
    yield;
    echo "Two","\n";
    yield;
    echo "Three","\n";
    yield;
}
// get our Generator object (remember, all generator functions return
// a generator object, and a generator function is any function that
// uses the yield keyword)
$iterator = myGeneratorFunction();
// get the current value of the iterator
$value = $iterator->current();
// advance the iterator to its next value (next() itself returns nothing)
$value = $iterator->next();
// and advance it again to the value after that
// $value = $iterator->next();
we’ll see the following output
$ php generator-example.php
One
Two
When we called next, PHP resumed executing myGeneratorFunction at the point it had previously stopped. That’s what we mean when we say the yield keyword pauses the function. Uncomment the last $value = $iterator->next(); and you’ll see that execution resumes after the second yield.
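If the back-and-forth is hard to picture, recording the order of events can help. The following is a minimal sketch, not part of the original programs: loggingGenerator and the shared $log object are hypothetical names. Both the generator and the calling code append to the same ArrayObject, so the final list shows how control passes between them.

```php
<?php
// Hypothetical example: $log is an ArrayObject so both the generator
// and the calling code can append to the same list.
function loggingGenerator(ArrayObject $log)
{
    $log[] = 'generator: before first yield';
    yield;
    $log[] = 'generator: after first yield';
    yield;
}

$log = new ArrayObject();
$iterator = loggingGenerator($log);
$log[] = 'caller: generator object created';
$iterator->current();
$log[] = 'caller: current() returned';
$iterator->next();
$log[] = 'caller: next() returned';

echo implode("\n", $log->getArrayCopy()), "\n";
```

Note that nothing from the generator body is logged until current is called, even though the generator object already exists at that point.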
So that explains yield’s power to pause a function, but what about when we said
[The yield keyword is] similar to the return keyword
Here’s another sample program that demonstrates this.
#File: generator-example.php
<?php
function myGeneratorFunction()
{
    echo "One","\n";
    yield 'first return value';
    echo "Two","\n";
    yield 'second return value';
    echo "Three","\n";
    yield 'third return value';
}
// get our Generator object (remember, all generator functions return
// a generator object, and a generator function is any function that
// uses the yield keyword)
$iterator = myGeneratorFunction();
// get the current value of the iterator
$value = $iterator->current();
echo 'The value returned: ', $value, "\n";
// advance to the next value of the iterator and fetch it
$iterator->next();
$value = $iterator->current();
echo 'The value returned: ', $value, "\n";
// and do the same for the value after that
$iterator->next();
$value = $iterator->current();
echo 'The value returned: ', $value, "\n";
Run this program and you’ll see the following output.
$ php generator-example.php
One
The value returned: first return value
Two
The value returned: second return value
Three
The value returned: third return value
This program is very similar to our previous one, with two exceptions
- We’ve included string values after the yield keywords (yield "a string value";)
- After calling next on the iterator object, we fetch the iterator’s current value with the current method
In addition to pausing a function, the yield keyword also returns a value that the generator/iterator object will know to use as the current value.
All this next/current business may seem verbose. Don’t forget that PHP knows how to handle an iterator in a foreach loop. Give the following program a try
#File: generator-example.php
<?php
function myGeneratorFunction()
{
    yield 'first return value';
    yield 'second return value';
    yield 'third return value';
}
$generator = myGeneratorFunction();
foreach($generator as $value) {
    echo 'My Value Is: ', $value, "\n";
}
Run it, and you’ll get the following output
$ php generator-example.php
My Value Is: first return value
My Value Is: second return value
My Value Is: third return value
Under the hood, when you use an iterator object in a foreach loop, PHP is making the same calls to that iterator’s current and next methods, along with rewind and valid to start the loop and to decide when it ends.
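As a rough sketch of what that looks like, here is the same loop written out by hand with the Iterator methods. It is an approximation for illustration, not PHP’s actual internals, and the $seen array is only there to collect the values.

```php
<?php
function myGeneratorFunction()
{
    yield 'first return value';
    yield 'second return value';
    yield 'third return value';
}

$generator = myGeneratorFunction();
$seen = [];

// Roughly what foreach($generator as $value) does for us
$generator->rewind();                 // run the body up to the first yield
while ($generator->valid()) {         // false once the function has finished
    $value = $generator->current();   // the value handed over by yield
    echo 'My Value Is: ', $value, "\n";
    $seen[] = $value;
    $generator->next();               // resume the body until the next yield
}
```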
Pausing State
So far we’ve discussed generators and yield as though they were just a fancy version of the goto statement. There’s one key piece of information we’ve left out. When you yield inside a generator function and return control to the other part of your program, PHP pauses everything about that function. This includes the state of any variables inside the generator function.
The implications of that might not be immediately obvious. Let’s use the classic generator example (reimplementing the range function) to demonstrate them.
#File: generator-example.php
<?php
#1. Define a generator function
function generator_range($min, $max)
{
    #3b. Start executing once `current` has been called
    for($i=$min;$i<=$max;$i++) {
        echo "Starting Loop","\n";
        yield $i;   #3c. Return execution to the main program
                    #4c. Return execution to the main program again
        #4b. Resume execution here when `next` is called
        echo "Ending Loop","\n";
    }
}
#2. Call the generator function
$generator = generator_range(1, 5);
#3a. Call the `current` method on the generator object
echo $generator->current(), "\n";
#4a. Call `next` on the generator object to resume execution
$generator->next();
#5. echo out the value we yielded when calling `next` above
echo $generator->current(), "\n";
// give this a try when you have some free time
// foreach(generator_range(1, 5) as $value) {
//     echo $value, "\n";
// }
If we run this program we’ll see the following output.
Starting Loop
1
Ending Loop
Starting Loop
2
In plain English, this program
1. Defines a generator function
2. Calls that generator function to get a generator object
3. Starts executing the generator function when the program calls current, which yields a value
4. Returns to the generator function when we call $generator->next() and makes another trip through the loop until yield is called again
5. echos out the value of the second yield when we call current again
The most important step is #4. When we call next and return execution to the generator function, the values of $i, $min, and $max are all the same as when we left the function in step #3. PHP held on to these values when it paused the function. That’s the magic of generators, and it’s what allows them to be more memory efficient than building and storing a full set of values in an array: only the current value and the paused function’s state need to exist at any one time.
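A small sketch makes the memory point tangible. It reuses generator_range without the echo statements and asks for a range that ends at PHP_INT_MAX, far more values than an array could ever hold. The $firstThree array is just a name chosen for this illustration. Because each value is produced only when the loop asks for it, stopping after three values means the rest are never created.

```php
<?php
function generator_range($min, $max)
{
    for ($i = $min; $i <= $max; $i++) {
        yield $i; // pause here; $i, $min and $max are kept until we resume
    }
}

$firstThree = [];
foreach (generator_range(1, PHP_INT_MAX) as $value) {
    $firstThree[] = $value;
    if (count($firstThree) === 3) {
        break; // stop asking; the generator never computes the rest
    }
}

echo implode(', ', $firstThree), "\n"; // 1, 2, 3
```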
Wrap Up
There’s a lot more to learn about generators. Here are just a few topics
- The yield from statement, which allows you to yield from another generator
- Sending values and throwing exceptions back into the generator function
- The effect of a return statement inside a generator function
but I think we’ll wrap things up here today. The main thing I wanted to get across, which I think too many generator articles skip, is how generator code flow actually works. Once you understand that, generators become just another piece of code to reason about.
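For readers who want a head start on that list, here is a brief sketch that touches yield from, send, and return in one place. It assumes PHP 7.0 or later, the function names inner and outer are made up for this illustration, and it leaves out throwing exceptions into a generator entirely. Treat it as a teaser, not a full explanation.

```php
<?php
function inner()
{
    yield 1;
    yield 2;
    return 'inner done'; // becomes the result of `yield from` in outer()
}

function outer()
{
    $result = yield from inner(); // hand iteration over to inner()
    $received = yield 3;          // a value passed to send() arrives here
    return "$result, got $received";
}

$generator = outer();
$values = [];
$values[] = $generator->current(); // 1, yielded by inner()
$generator->next();
$values[] = $generator->current(); // 2, yielded by inner()
$generator->next();
$values[] = $generator->current(); // 3, yielded by outer()
$generator->send('hello');         // resume outer() with a value
$final = $generator->getReturn();  // available once the generator has finished

echo implode(', ', $values), "\n"; // 1, 2, 3
echo $final, "\n";                 // inner done, got hello
```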
In all honesty, you can probably get by as a PHP programmer without ever touching generators. In practice, when they’re used by other developers, it tends to be behind the scenes and transparent to end users of a library or API. However, as PHP starts to evolve towards providing support for asynchronous programming features, you’ll be hearing a lot more about generators. Generators are an example of a coroutine, and when combined with asynchronous PHP frameworks like React (not the UI framework of the same name) they unlock a lot of new, powerful programming metaphors and techniques.