When deciding whether to parallelize an existing sequential application or to re-architect it for parallelism, I think it is essential to consider whether the underlying algorithms and data structures are sufficiently decoupled (for algorithms, whether the iterations are decoupled) for parallelization to be practical and useful. In applications where each iteration explicitly depends on the result of the previous one, and where there is little or no "fan out", the underlying algorithms should be inspected to see whether the solution can be formulated differently; if it cannot, parallelization through refactoring should be decided against. If, on the other hand, the structures and algorithms exhibit good decoupling, parallelization through refactoring may bring great results, and should be further investigated and possibly undertaken.
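A minimal sketch of the distinction, using two hypothetical loops: the first carries a dependence from one iteration to the next and so resists naive parallelization, while the second has fully independent iterations that partition cleanly across workers.

```python
def prefix_sums(xs):
    # Loop-carried dependence: each iteration needs the running total
    # produced by the previous iteration, so the loop cannot simply be
    # split across workers as written.
    out, total = [], 0
    for x in xs:
        total += x
        out.append(total)
    return out

def squares(xs):
    # Decoupled iterations: each result depends only on its own input,
    # so the work distributes across any number of workers.
    return [x * x for x in xs]

print(prefix_sums([1, 2, 3, 4]))  # [1, 3, 6, 10]
print(squares([1, 2, 3, 4]))      # [1, 4, 9, 16]
```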
Parallel libraries are of great advantage to the programmer for several reasons. Firstly, they abstract much of the complexity of concurrent programming away from the developer, who can then work with a simplified model. This reduces the subtle timing bugs that arise in hard-to-test situations from the developer's incomplete knowledge. Secondly, they disseminate knowledge of parallel patterns by encapsulating the relevant functionality: the library becomes a catalogue of paradigms the developer comes to know, providing a variety of new perspectives from which to model their algorithms.
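As one illustration of this kind of abstraction, Python's standard `concurrent.futures` library hides worker creation, work distribution, and result collection behind a familiar `map()` interface; the workload function here is a made-up stand-in.

```python
from concurrent.futures import ThreadPoolExecutor

def expensive(n):
    # Hypothetical stand-in for an independent unit of work. (For
    # CPU-bound Python code, ProcessPoolExecutor would sidestep the GIL;
    # the interface is the same.)
    return sum(i * i for i in range(n))

inputs = [1_000, 2_000, 3_000, 4_000]

# The executor manages the worker pool and ordering for us; no locks,
# queues, or join logic appear in application code.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(expensive, inputs))

print(results == [expensive(n) for n in inputs])  # True
```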
When it comes to semi-automatic vs. fully automatic refactoring, I believe there is an appropriate place for each. To the extent that exact semantics can be preserved, a fully automatic approach is preferable, as it keeps the codebase simpler and more to the point. If, however, the semantics of the application would be changed, no matter how slightly, it is best to place that decision in the developer's hands: ultimately, they must decide what the application does and stand responsible for its operation.
Cluttering of code due to parallel refactorings is certainly an issue for maintainability. I believe that as languages become more expressive, and essentially more functional, code will become more conducive to fully automatic refactorings performed during compilation rather than at the source-code level.
Not so much a refactoring per se, but to achieve parallelism in a number of my applications I use a SQL Server database, which has a great deal of internal parallelization built in, and I formulate my solution in terms of set-based operations across that database. In this manner, I gain a great deal of parallelism over a purely sequential program.
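A hedged sketch of the idea, using SQLite as a stand-in for SQL Server (the table and values are invented): the work is expressed as a single declarative, set-based statement that the engine can optimize and, on SQL Server, parallelize internally, rather than as a row-at-a-time loop in application code.

```python
import sqlite3

# In-memory database standing in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, float(i)) for i in range(1, 101)],
)

# One set-based statement over the whole table, instead of fetching each
# row and aggregating sequentially in the application.
(total,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()
print(total)  # 5050.0
```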
Another factor I would have liked addressed is how these parallelizations affect total system scalability when a large number of copies of the same algorithm run at once, i.e. when different users run instances of the algorithms on a shared system. This might be accomplished by providing a single-core benchmark, to show the overhead of the parallel refactorings.
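One way such a benchmark might look, as a rough sketch with an invented workload: restricting the pool to a single worker approximates the single-core case, so any slowdown relative to the plain sequential loop is overhead introduced by the parallel machinery itself.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def work(n):
    # Invented stand-in for one instance of the algorithm.
    return sum(i * i for i in range(n))

def bench(fn):
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

inputs = [20_000] * 8

# Baseline: the original sequential formulation.
t_seq = bench(lambda: [work(n) for n in inputs])

# Parallel formulation pinned to one worker: the gap between t_par and
# t_seq estimates the refactoring's fixed overhead.
with ThreadPoolExecutor(max_workers=1) as pool:
    t_par = bench(lambda: list(pool.map(work, inputs)))

print(f"sequential: {t_seq:.4f}s, 1-worker pool: {t_par:.4f}s")
```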