

Very short Tutorial

The FastFlow programming model

The FastFlow programming model is a structured parallel programming model. The framework provides several predefined, general-purpose, customizable and composable parallel patterns (or algorithmic skeletons). Any application whose parallel structure can be modelled using the provided parallel patterns, used alone or in composition, can be implemented using FastFlow.

The basic


Pipelining is one of the simplest parallel patterns: data flows through a series of stages (or nodes), and each stage processes its input in some way, producing as output either a modified version of the input or new data. We will call these data flows streams of data, or simply streams. A pipeline's stages can operate sequentially or in parallel, and may or may not have an internal state.

/* this is a 3-stage pipeline example */
#include <iostream>
#include <ff/pipeline.hpp>
using namespace ff;
typedef long fftask_t;
// first stage: generates the stream (10 tasks), then signals End-Of-Stream
struct firstStage: ff_node_t<fftask_t> {
    fftask_t *svc(fftask_t *t) {
        for(long i=0;i<10;++i) ff_send_out(new fftask_t(i));
        return EOS; // End-Of-Stream
    }
};
// second stage: a plain function that forwards each task unchanged
fftask_t* secondStage(fftask_t *t,ff_node*const node) {
    std::cout << "Hello I'm stage" << node->get_my_id() << "\n";
    return t;
}
// third stage: consumes each task and frees it
struct thirdStage: ff_node_t<fftask_t> {
    fftask_t *svc(fftask_t *t) {
        std::cout << "stage" << get_my_id() << " received " << *t << "\n";
        delete t;
        return GO_ON; // produce no output, wait for the next task
    }
};
int main() {
    ff_pipe<fftask_t> pipe(new firstStage, secondStage, new thirdStage);
    pipe.cleanup_nodes(); // cleanup at exit
    if (pipe.run_and_wait_end()<0) error("running pipe");
    return 0;
}
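
To make the notions of stream and End-Of-Stream more concrete, here is a minimal sketch of a 3-stage pipeline written with the C++ standard library only. It is not how FastFlow implements its channels: the `Item`, `Channel` and `run_pipeline` names are hypothetical, and the blocking queue is just one simple way to connect stages that run concurrently.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stream item (not a FastFlow type): eos == true marks End-Of-Stream.
struct Item { bool eos; long value; };

// Hypothetical channel: an unbounded blocking FIFO queue connecting two stages.
class Channel {
    std::queue<Item> q;
    std::mutex m;
    std::condition_variable cv;
public:
    void push(Item v) {
        { std::lock_guard<std::mutex> lk(m); q.push(v); }
        cv.notify_one();
    }
    Item pop() {  // blocks until an item is available
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return !q.empty(); });
        Item v = q.front();
        q.pop();
        return v;
    }
};

// Stage 1 generates 0..n-1, stage 2 doubles each item, stage 3 collects them.
std::vector<long> run_pipeline(long n) {
    assert(n >= 0);
    Channel c1, c2;
    std::vector<long> out;
    std::thread s1([&] {
        for (long i = 0; i < n; ++i) c1.push(Item{false, i});
        c1.push(Item{true, 0});                       // End-Of-Stream
    });
    std::thread s2([&] {
        for (Item it = c1.pop(); !it.eos; it = c1.pop())
            c2.push(Item{false, it.value * 2});
        c2.push(Item{true, 0});                       // propagate End-Of-Stream
    });
    std::thread s3([&] {
        for (Item it = c2.pop(); !it.eos; it = c2.pop())
            out.push_back(it.value);
    });
    s1.join(); s2.join(); s3.join();
    return out;
}
```

Since each channel has a single producer and a single consumer, items leave the pipeline in the order in which they entered it.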


The task-farm pattern is a stream parallel paradigm based on the replication of a purely functional computation (let's call the function F). Its parallel semantics ensures that tasks are processed such that the latency of a single task is close to the time needed to compute F sequentially (call it T_F), while the service time (the average interval between two consecutive results, i.e. the inverse of the throughput) is, under certain conditions, close to T_F/n, where n is the number of parallel agents used to execute the farm (called Workers). The concurrent scheme of a farm is composed of three distinct parts: the Emitter (E), the pool of Workers (Ws) and the Collector (C). The Emitter receives the farm's input tasks and distributes them to the Workers using a given scheduling strategy (round-robin, auto-scheduling, user-defined). The Collector gathers the results from the Workers and sends them to the farm's output stream.

/* the third stage is transformed into a farm */
#include <vector>
#include <ff/farm.hpp>
#include <ff/pipeline.hpp>
// reuses fftask_t, firstStage, secondStage and thirdStage
// (and "using namespace ff") from the 3-stage pipeline example
int main() {
    std::vector<ff_node*> W = {new thirdStage, new thirdStage}; // the farm has 2 workers
    ff_pipe<fftask_t> pipe(new firstStage, secondStage, new ff_farm<>(W));
    if (pipe.run_and_wait_end()<0) error("running pipe");
    return 0;
}
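
The Emitter/Workers/Collector scheme can also be sketched without FastFlow. The following is a simplified illustration using only the C++ standard library, assuming a static round-robin policy decided up front rather than a real streaming Emitter; `farm_round_robin` is a hypothetical helper, not part of the FastFlow API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not FastFlow): applies the pure function F to every
// task using nworkers threads. Task i is assigned to worker i % nworkers.
std::vector<long> farm_round_robin(const std::vector<long>& tasks,
                                   std::size_t nworkers,
                                   const std::function<long(long)>& F) {
    assert(nworkers > 0);
    std::vector<long> results(tasks.size());  // "Collector": one slot per task
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < nworkers; ++w) {
        // "Emitter" policy: worker w gets tasks w, w+nworkers, w+2*nworkers, ...
        workers.emplace_back([&tasks, &results, &F, w, nworkers] {
            for (std::size_t i = w; i < tasks.size(); i += nworkers)
                results[i] = F(tasks[i]);     // each task writes a distinct slot
        });
    }
    for (auto& t : workers) t.join();         // wait for all the workers
    return results;
}
```

Because F is purely functional and every worker writes to distinct result slots, no locking is needed. This is the same property that allows the farm to replicate F freely.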


A sequential iterative kernel with independent iterations is also known as a parallel loop. Parallel loops can clearly be parallelized using the map or farm patterns, but this typically requires substantial refactoring of the original loop code, with the risk of introducing bugs and of not preserving sequential equivalence. The FastFlow framework provides a set of data parallel patterns, implemented on top of the basic FastFlow skeletons, to ease the implementation of parallel loops: ParallelFor, ParallelForReduce, ParallelForPipeReduce.

Here is a very basic usage example of the ParallelFor pattern:

#include <ff/parallel_for.hpp>
using namespace ff;
int main() {
    long A[100];
    ParallelFor pf;
    // iterations in [0,100) are independent and are executed in parallel
    pf.parallel_for(0,100,[&A](const long i) {
        A[i] = i;
    });
    return 0;
}
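
When the iterations of a loop also accumulate a value (the case targeted by ParallelForReduce), each worker usually reduces its own chunk into a private partial result, and the partial results are combined at the end. The sketch below shows this idea with the C++ standard library only. It does not use FastFlow, and `parallel_sum` is a hypothetical helper whose chunking policy is an assumption, not a description of FastFlow's scheduler.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not FastFlow's API): sums all the elements of data
// using nworkers threads, one contiguous chunk of the index range per thread.
long parallel_sum(const std::vector<long>& data, std::size_t nworkers) {
    assert(nworkers > 0);
    const std::size_t n = data.size();
    const std::size_t chunk = (n + nworkers - 1) / nworkers;  // ceiling division
    std::vector<long> partial(nworkers, 0);   // 0 is the identity of the sum
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < nworkers; ++w) {
        workers.emplace_back([&data, &partial, w, chunk, n] {
            const std::size_t begin = w * chunk;
            const std::size_t end = (begin + chunk < n) ? begin + chunk : n;
            long local = 0;                   // private accumulator
            for (std::size_t i = begin; i < end; ++i) local += data[i];
            partial[w] = local;               // one write per worker
        });
    }
    for (auto& t : workers) t.join();
    long total = 0;
    for (long p : partial) total += p;        // final sequential reduction
    return total;
}
```

The result matches the sequential loop because integer addition is associative and commutative. With floating-point values the parallel result may differ slightly from the sequential one.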

Data Dependency Tasks Executor (aka MDF)


Some valid combinations of pipeline and farm (and feedback)

ffnamespace/tutorial.1410714489.txt.gz · Last modified: 2014/09/14 19:08 by aldinuc