Tuesday, September 15, 2009

Pipes & Filters

While I haven't done much with UNIX-style pipes and filters, I have built a system that performs a series of data transformations and applies business logic in a fashion akin to the pipes and filters paradigm, though due to the technologies involved, it was never explicitly viewed as such.
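As a loose illustration of that kind of pipeline, here is a minimal Python sketch: each transformation is an independent filter function, and the "pipes" are simply the hand-off of one filter's output to the next. The record layout and filter names (parse_record, apply_business_rules, format_output) are hypothetical, not taken from the actual system.

```python
def parse_record(raw: str) -> dict:
    """Filter 1: turn a raw delimited line into a structured record."""
    fields = raw.strip().split("|")
    return {"id": fields[0], "amount": float(fields[1])}

def apply_business_rules(record: dict) -> dict:
    """Filter 2: apply a simple business-logic transformation."""
    record["amount_with_tax"] = round(record["amount"] * 1.08, 2)
    return record

def format_output(record: dict) -> str:
    """Filter 3: render the processed record for downstream consumers."""
    return f'{record["id"]},{record["amount_with_tax"]}'

def pipeline(raw_lines):
    """Compose the filters; each record flows through the whole chain."""
    for raw in raw_lines:
        yield format_output(apply_business_rules(parse_record(raw)))

if __name__ == "__main__":
    for line in pipeline(["A1|100.00", "A2|250.50"]):
        print(line)
```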

In the same system, it would have been hard to segregate responsibilities strictly along a pipes and filters model because of the diversity of errors that can occur, the care with which they must be handled, and the need to respond differently to different compound conditions in the system.
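To make that difficulty concrete, here is a rough, hypothetical sketch of how compound error conditions cut across filter boundaries: the decision to abort depends on the mix of failures seen across stages, not on any single filter in isolation, which is exactly what resists a strict filter-by-filter decomposition. The exception types and abort policy are invented for illustration.

```python
class ParseError(Exception):
    pass

class RuleViolation(Exception):
    pass

def parse(raw: str) -> float:
    """First filter: may fail on malformed input."""
    try:
        return float(raw)
    except ValueError:
        raise ParseError(raw)

def validate(value: float) -> float:
    """Second filter: may fail on business-rule violations."""
    if value < 0:
        raise RuleViolation(value)
    return value

def run(raw_lines):
    failures = []
    for raw in raw_lines:
        try:
            yield validate(parse(raw))
        except ParseError:
            failures.append("parse")
        except RuleViolation:
            failures.append("rule")
        # Compound condition: the response depends on the overall mix of
        # failures, not on what any one filter saw.
        if failures.count("parse") > 3 and failures.count("rule") > 0:
            raise RuntimeError("aborting run: systemic input problem suspected")

if __name__ == "__main__":
    print(list(run(["1.0", "-2.0", "3.5"])))
```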

As far as parallelization goes, I believe the primary criteria for selecting the pipes and filters pattern are the incremental nature of the data processing and a common data representation.  If every data flow can be modeled so that a filter needs only a very small input buffer to perform its work, and all pipes can agree on a common format that keeps the "glue logic" overhead minimal, parallelization gains are likely.
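A minimal sketch of those two criteria, assuming Python and a thread pool for the fan-out: each filter needs only one record at a time (a tiny input buffer), and every pipe carries the same record format (a plain dict), so a stage can be parallelized with very little glue logic. The stage functions here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def normalize(record: dict) -> dict:
    """Stage 1: clean up the record; needs only this one record to work."""
    record["name"] = record["name"].strip().lower()
    return record

def enrich(record: dict) -> dict:
    """Stage 2: add derived data; same common dict format on every pipe."""
    record["name_length"] = len(record["name"])
    return record

def run_parallel(records):
    # Because each record is independent and the per-record buffer is small,
    # a stage can be fanned out across workers with minimal glue logic.
    with ThreadPoolExecutor(max_workers=4) as pool:
        normalized = pool.map(normalize, records)
        return list(pool.map(enrich, normalized))

if __name__ == "__main__":
    data = [{"name": "  Alice "}, {"name": "BOB"}]
    print(run_parallel(data))
```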

Active filters would be best in scenarios where large flows of data are likely to arrive at irregular intervals and parallelization gains are desired; in those cases, having filters ready and able to process input the moment it becomes available makes good sense.  Passive filters would work better in a subsystem-type environment, where a processing pipeline exists to handle some type of request but is not the primary work of the system, and therefore not worth the overhead of active processes.
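A small sketch of the contrast, again with hypothetical filters: the active filter owns a thread and works as soon as data appears on its inbound pipe, while the passive filter is just a callable that the surrounding driver invokes on demand.

```python
import queue
import threading

STOP = object()  # sentinel marking end-of-stream

def active_filter(inbound: queue.Queue, outbound: queue.Queue):
    """Active: blocks on its pipe and processes input whenever it shows up."""
    while True:
        item = inbound.get()
        if item is STOP:
            outbound.put(STOP)
            return
        outbound.put(item * 2)

def passive_filter(item):
    """Passive: does nothing until the surrounding subsystem calls it."""
    return item + 1

if __name__ == "__main__":
    inbound, outbound = queue.Queue(), queue.Queue()
    threading.Thread(target=active_filter, args=(inbound, outbound)).start()

    for value in (1, 2, 3):
        inbound.put(value)
    inbound.put(STOP)

    results = []
    while (item := outbound.get()) is not STOP:
        results.append(passive_filter(item))  # driver invokes the passive stage
    print(results)  # [3, 5, 7]
```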
