So awhile back I documented my experience implementing recursive coroutines into Sleep 2.1. Looking back I didn't quite cover the full saga so I'll catch you up real quick.
I was reading over the python PEPs and I was really impressed by what recursive coroutines could do. Just the nice simple power of them. And I knew I wanted this in Sleep. I implemented basic generators without to much trouble. When I started allowing coroutines to call themselves I built a simple example based on the Python PEP 255. An example that basically did a preorder, inorder, and postorder traversal of a tree data structure. Just to fill in space I'll paste it here now:
# do a preorder traversal of a tree closure
sub preorder
{
local('$x');
if ($1 !is $null)
{
while $x (preorder([$1 left]))
{
yield $x;
}
while $x (preorder([$1 right]))
{
yield $x;
}
yield [$1 label];
}
return $null;
}
sub tree
{
this('$label $left $right');
if ($0 eq "left")
{
return $left;
}
if ($0 eq "right")
{
return $right;
}
if ($0 eq "label")
{
return $label;
}
}
sub node
{
return lambda(&tree, $label => $1, $left => $2, $right => $3);
}
$root = node("root", node("+", node(3), node(4)), node("*", node("-", node(6), node(2)), node(30)));
println("Pre order tree traversal");
while $traversal (preorder($root))
{
println($traversal);
}
I think dissecting this code would make a perfect Sleep snippets blog post. I'll add that to my todo list. Anyways I think it was the preorder function that kept going into an infinite loop on me. I had no idea why and trying to track down the bug was incredibly frustrating.
Finally I put in some println statements into the code and everything worked without a problem. I was quite confused. And then I noticed when I put a println statement into the code following the third yield statement it worked. Anywhere else, it didn't. I was really confused. Finally I broke down and I added a null op instruction to the sleep interpreter. This instruction was added after every yield request. A hacky fix, I didn't understand why it fixed the problem, but it did.
So here we are several Sleep betas later. Marty, one of the guys I work with, was looking for some code to generate a decent listing of two directories and compare them. I read somewhere that comparing two trees with a good coroutine implementation is absolute cake. Doing so without coroutines sucks. So I proceeded to try to write a function to act as a directory iterator.
sub traverse
{
local('$f $temp');
foreach $f (ls($1))
{
yield $f;
if (-isDir $f)
{
while $temp (traverse($f))
{
yield $temp;
}
}
}
}
while $blah (traverse("/Users/raffi/sleepdev/"))
{
println($blah);
}
Unfortunately the traverse function would work for a few iterations and then eventually it would completely crap out. Again I had no idea why. I just started tackling this problem today. I was shaking in my boots at the thought of going through another 3 week hunt-for-the-bug-in-the-coroutine-implementation binge.
Then I figured it out. I have no idea why my null op fix made so many situations work, but here was the real problem.
When a yield is requested, Sleep knows it needs to save up all of the past context. This is done by setting a flag (same mechanism as return) that forces all of the blocks to immediately exit all the way up to the top level caller. Note: reseting this flag is very important, this is why you always use SleepUtils.runCode to call Sleep code. Not reseting this flag was a cause of many early and hard to find bugs back in the day. Anyways... when a yield is requested Sleep knows to start saving each block and "program counter" value to what I call a context stack.
When a coroutine is resumed, Sleep starts iterating through the bottom of the saved context stack. It just grabs the block and the program counter value and starts ticking away again. Once that block is exited the next block in the context stack is executed, so on and so forth until all the blocks are exhausted. Without coroutines this happens thanks to Java's call stack. When a block exits, the evaluate function for that block returns and the calling block picks up executing where it left off. Nothing special.
Now another detail. When a coroutine is resumed, a new context stack is made and the old one is iterated through as mentioned above. The idea here is that if a yield happens then the new context stack won't screw up the old one i.e. they won't conflict with eachother. Especially important when we're doing recursive coroutines (i.e. the same function calling itself, resuming itself, saving itself, etc..).
Now here is a fun question. What happens when we're evaluating the current context stack and a yield is requested. Well of course all that bubbling up and saving happens, right? This is fine. This captures the stuff from the Java call stack quite well but not the stuff from the resumed call stack.
Without explicitly dealing with this situation, the old code was basically continuing to evaluate the old context steps which would cause an instruction to be advanced in the old saved stuff and thanks to the flow_control_yield flag the program counter and block would be saved. This is why the null op fixed some of these problems. So long as the simulated stack wasn't too deep all that would get skipped was the null op.
I added some code to the sleep interpreter to check for the yield flag within the evaluate old context function. If a yield is requested then the remaining current context is immediately dumped to the new context that is being saved.
Once I did that everything worked. I was also able to eliminate the null op following the yield step.
Good news.
Ok, now I'm going to crash. I've done enough coding and debugging for one day. |