I haven't specifically refactored from sequential to parallel, but the approach in the pattern seems very logical. If you have a stable build, you can do unit testing by comparing the output of the sequential version to that of the current stage of evolution, and you can even dynamically generate test cases, since you already have a working (sequential) version to serve as the oracle.
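As a rough sketch of that idea, the snippet below (all names hypothetical) treats the sequential version as a trusted oracle and checks a parallelized version against it on randomly generated inputs. It uses threads rather than processes purely to keep the example self-contained:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def sequential_sum_of_squares(data):
    # Trusted sequential "oracle" implementation.
    return sum(x * x for x in data)

def parallel_sum_of_squares(data, chunks=4):
    # Parallel version under test: split the input into slices,
    # compute partial sums concurrently, then combine them.
    size = max(1, len(data) // chunks)
    parts = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=chunks) as pool:
        return sum(pool.map(sequential_sum_of_squares, parts))

def differential_test(trials=20, seed=42):
    # Dynamically generate test cases and compare the parallel
    # output against the sequential oracle on each one.
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 500))]
        assert parallel_sum_of_squares(data) == sequential_sum_of_squares(data)
    return True
```

The same harness applies to any refactoring step: as long as the transformation is meant to preserve behavior, each evolution stage can be checked against the original.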
I have used some tools in the .NET environment to profile execution for performance bottlenecks, though my course of action in those cases was sequential optimization rather than parallelization. I can see that some applications might not require such tools to isolate the best portion to parallelize, but I would argue that a profiler is more "fact based": what you think is the bottleneck might not truly be one, and you are likely to make better decisions when assisted by tools.
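The point that intuition can mislead is easy to demonstrate. As a stand-in for the .NET tooling, here is a minimal sketch using Python's built-in cProfile; the two function names are hypothetical, with the "innocent-looking" one deliberately doing far more work:

```python
import cProfile
import io
import pstats

def suspected_bottleneck(n):
    # Looks expensive, but is a single pass over the data.
    return sum(i * i for i in range(n))

def actual_bottleneck(n):
    # Looks innocent, but hides a nested loop (100x the work).
    total = 0
    for i in range(n):
        for j in range(100):
            total += i ^ j
    return total

def profile_run(n=10_000):
    # Profile both candidates and return a report sorted by
    # cumulative time, so the real hotspot rises to the top.
    profiler = cProfile.Profile()
    profiler.enable()
    suspected_bottleneck(n)
    actual_bottleneck(n)
    profiler.disable()
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
    return stream.getvalue()
```

Reading the report rather than guessing is exactly the "fact based" step: the profiler attributes time to the nested-loop function regardless of which one looked suspicious.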
I can definitely see that a number of different transformation/evolution iterations would be needed before any performance improvement appears. Early transformations would likely add overhead to support the first pieces of parallelization, and that cost would only be amortized over more parallel execution further down the road.
I think a task queue could be an implementation-level detail of a fork/join or divide-and-conquer algorithm. Fork/join, for example, deals with how to decompose a problem into smaller subproblems that can be executed in parallel, and a task queue would be one way of scheduling and executing those tasks once they are defined.
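To make that relationship concrete, here is a minimal sketch (assuming a simple summation problem; all names are mine) in which the divide-and-conquer structure is expressed entirely through a shared task queue: large slices are split and re-enqueued (the fork), small slices are computed directly, and the partial results are combined at the end (the join):

```python
import queue
import threading

def queue_based_sum(data, threshold=64, workers=4):
    # Each task is a (lo, hi) slice of the input. Workers pull
    # tasks from the queue; a large slice is split into two
    # subtasks (the "fork"), a small one is summed directly.
    tasks = queue.Queue()
    tasks.put((0, len(data)))
    partials = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                lo, hi = tasks.get(timeout=0.1)
            except queue.Empty:
                return  # queue drained: this worker retires
            if hi - lo <= threshold:
                part = sum(data[lo:hi])
                with lock:
                    partials.append(part)
            else:
                mid = (lo + hi) // 2
                tasks.put((lo, mid))   # fork: enqueue subproblems
                tasks.put((mid, hi))
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    tasks.join()   # block until every task, including spawned ones, is done
    for t in threads:
        t.join()
    return sum(partials)   # the "join" step: combine partial results
```

The decomposition logic (when and how to split) belongs to the fork/join pattern; the queue is just the machinery that hands the resulting tasks to workers, which is why it reads as an implementation detail rather than a pattern of its own.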
As for the inclusion of source code, this seems like one of the patterns that needed it less. I think everyone gets the idea of a queue, so presenting just the variations from the standard conception would probably have been sufficient.
I'll have to admit, quite a lot of this pattern was beyond my realm of expertise. It seems to me that you would need quite a complex, dynamic (from execution to execution), and computationally intensive problem to warrant the overhead of this pattern. Additionally, given the complexity of implementing it, you'd better *really* need that performance boost to risk the bugs that might creep into the program because of that complexity.