Papers

The BibTeX file can be found here: bibtex file. An up-to-date listing, arranged by venue and including abstracts, can be found here: FastFlow papers by venue.

Overview

M. Aldinucci, M. Danelutto, P. Kilpatrick, and M. Torquati, “FastFlow: high-level and efficient streaming on multi-core,” in Programming Multi-core and Many-core Computing Systems, S. Pllana and F. Xhafa, Ed., Wiley, 2014.

Methodology and theory

M. Danelutto and M. Torquati, “Loop parallelism: a new skeleton perspective on data parallel patterns,” in Proc. of Intl. Euromicro PDP 2014: Parallel Distributed and network-based Processing, Torino, Italy, 2014. doi:10.1109/PDP.2014.13
M. Aldinucci, S. Campa, M. Danelutto, P. Kilpatrick, and M. Torquati, “Design patterns percolating to parallel programming framework implementation,” International Journal of Parallel Programming, 2013. doi:10.1007/s10766-013-0273-6
M. Danelutto and M. Torquati, “A RISC building block set for structured parallel programming,” in Proc. of Intl. Euromicro PDP 2013: Parallel Distributed and network-based Processing, Belfast, Northern Ireland, U.K., 2013. doi:10.1109/PDP.2013.17
M. Aldinucci, M. Danelutto, P. Kilpatrick, and M. Torquati, “Targeting heterogeneous architectures via macro data flow,” Parallel Processing Letters, vol. 22, iss. 2, 2012.
M. Aldinucci, S. Campa, M. Danelutto, P. Kilpatrick, and M. Torquati, “Targeting Distributed Systems in FastFlow,” in Euro-Par 2012 Workshops, Proc. of the CoreGrid Workshop on Grids, Clouds and P2P Computing, 2013, pp. 47-56.
M. Aldinucci, M. Danelutto, P. Kilpatrick, M. Meneghin, and M. Torquati, “An Efficient Unbounded Lock-Free Queue for Multi-core Systems,” in Proc. of 18th Intl. Euro-Par 2012 Parallel Processing, Rhodes Island, Greece, 2012, pp. 662-673.
M. Aldinucci, M. Danelutto, L. Anardu, M. Torquati, and P. Kilpatrick, “Parallel patterns + Macro Data Flow for multi-core programming,” in Proc. of Intl. Euromicro PDP 2012: Parallel Distributed and network-based Processing, Garching, Germany, 2012, pp. 27-36.
M. Aldinucci, M. Danelutto, P. Kilpatrick, M. Meneghin, and M. Torquati, “Accelerating code on multi-cores with FastFlow,” in Proc. of 17th Intl. Euro-Par 2011 Parallel Processing, Bordeaux, France, 2011, pp. 170-181.

Applications, comparison against other frameworks

Clusters of multicores, distributed systems and cloud

M. Aldinucci, M. Torquati, C. Spampinato, M. Drocco, C. Misale, C. Calcagno, and M. Coppo, “Parallel stochastic systems biology in the cloud,” Briefings in Bioinformatics, 2013. doi:10.1093/bib/bbt040
A. Secco, I. Uddin, G. P. Pezzi, and M. Torquati, “Message passing on InfiniBand RDMA for parallel run-time supports,” in Proc. of Intl. Euromicro PDP 2014: Parallel Distributed and network-based Processing, Torino, Italy, 2014. doi:10.1109/PDP.2014.23
M. Danelutto, L. Deri, D. De Sensi, “Network Monitoring on Multicores with Algorithmic Skeletons,” in: Proc. of Intl. Parallel Computing (PARCO), September 2011.
M. Drocco, “Parallel stochastic simulators in systems biology: the evolution of the species,” Master Thesis, 2013.

Shared-memory Multicore (cache-coherent)

M. Aldinucci, S. Ruggieri, and M. Torquati, “Decision Tree Building on Multi-Core using FastFlow,” Concurrency and Computation: Practice and Experience, 26(3):800–820, Mar. 2014.
C. Misale, “Accelerating Bowtie2 with a lock-less concurrency approach and memory affinity,” in Proc. of Intl. Euromicro PDP 2014: Parallel Distributed and network-based Processing, Torino, Italy, 2014. doi:10.1109/PDP.2014.50 (Best paper award)
M. Aldinucci, F. Tordini, M. Drocco, M. Torquati, and M. Coppo, “Parallel stochastic simulators in system biology: the evolution of the species,” in Proc. of Intl. Euromicro PDP 2013: Parallel Distributed and network-based Processing, Belfast, Northern Ireland, U.K., 2013.
M. Aldinucci, M. Coppo, F. Damiani, M. Drocco, E. Sciacca, S. Spinella, M. Torquati, and A. Troina, “On Parallelizing On-Line Statistics for Stochastic Biological Simulations,” in Euro-Par 2011 Workshops, Proc. of the 2nd Workshop on High Performance Bioinformatics and Biomedicine (HiBB), Bordeaux, France, 2012, pp. 3-12.
M. Aldinucci, M. Coppo, F. Damiani, M. Drocco, M. Torquati, and A. Troina, “On Designing Multicore-Aware Simulators for Biological Systems,” in Proc. of Intl. Euromicro PDP 2011: Parallel Distributed and network-based Processing, Ayia Napa, Cyprus, 2011, pp. 318-325.
M. Aldinucci, S. Ruggieri, and M. Torquati, “Porting Decision Tree Algorithms to Multicore using FastFlow,” in Proc. of European Conference in Machine Learning and Knowledge Discovery in Databases (ECML PKDD), Barcelona, Spain, 2010, pp. 7-23. (18% acceptance rate)
M. Aldinucci, A. Bracciali, P. Liò, A. Sorathiya, and M. Torquati, “StochKit-FF: Efficient Systems Biology on Multicore Architectures,” in Euro-Par 2010 Workshops, Proc. of the 1st Workshop on High Performance Bioinformatics and Biomedicine (HiBB), Ischia, Italy, 2011, pp. 167-175.
Marco Aldinucci, Salvatore Ruggieri, and Massimo Torquati, “Porting decision tree building and pruning algorithms to multicore using FastFlow,” Università di Pisa, Dipartimento di Informatica, Italy, number TR-11-06, March 2011.
M. Aldinucci, M. Meneghin, and M. Torquati, “Efficient Smith-Waterman on multi-core with FastFlow,” in Proc. of Intl. Euromicro PDP 2010: Parallel Distributed and network-based Processing, Pisa, Italy, 2010, pp. 195-199.
M. Aldinucci, M. Danelutto, M. Meneghin, P. Kilpatrick, and M. Torquati, “Efficient streaming applications on multi-core with FastFlow: the biosequence alignment test-bed,” in Parallel Computing: From Multicores and GPU’s to Petascale (Proc. of PARCO 2009, Lyon, France), Lyon, France, 2010, pp. 273-280.

Heterogeneous Multicore + Many-core (GPGPU, Tilera Tile64, Xeon Phi, ...)

M. Aldinucci, G. P. Pezzi, M. Drocco, F. Tordini, P. Kilpatrick, and M. Torquati, “Parallel video denoising on heterogeneous platforms,” in Proc. of Intl. Workshop on High-level Programming for Heterogeneous and Hierarchical Parallel Systems (HLPGPU), 2014.
D. Buono, M. Danelutto, S. Lametti, and M. Torquati, “Parallel Patterns for General Purpose Many-Core,” in Proc. of Intl. Euromicro PDP 2013: Parallel Distributed and network-based Processing, Belfast, Northern Ireland, U.K., 2013.
M. Aldinucci, C. Spampinato, M. Drocco, M. Torquati, and S. Palazzo, “A Parallel Edge Preserving Algorithm for Salt and Pepper Image Denoising,” in Proc. of 2nd Intl. Conference on Image Processing Theory Tools and Applications (IPTA), Istanbul, Turkey, 2012, pp. 97-102.

Speculations, future directions, ideas

C. Misale, M. Aldinucci, and M. Torquati, “Memory affinity in multi-threading: the Bowtie2 case study,” in Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems (ACACES) — Poster Abstracts, Fiuggi, Italy, 2013.
M. Aldinucci, S. Campa, F. Tordini, M. Torquati, and P. Kilpatrick, “An abstract annotation model for skeletons,” in Formal Methods for Components and Objects: Intl. Symposium, FMCO 2011, Torino, Italy, October 3–5, 2011, Revised Invited Lectures, B. Beckert, F. Damiani, F. S. de Boer, and M. M. Bonsangue, Ed., Springer, 2013, vol. 7542, pp. 257-276.
K. Hammond, M. Aldinucci, C. Brown, F. Cesarini, M. Danelutto, H. González-Vélez, P. Kilpatrick, R. Keller, M. Rossbory, and G. Shainer, “The ParaPhrase Project: Parallel Patterns for Adaptive Heterogeneous Multicore Systems,” in Formal Methods for Components and Objects: Intl. Symposium, FMCO 2011, Torino, Italy, October 3–5, 2011, Revised Invited Lectures, B. Beckert, F. Damiani, F. S. de Boer, and M. M. Bonsangue, Ed., Springer, 2013, vol. 7542, pp. 218-236.
F. Tordini, M. Aldinucci, and M. Torquati, “High-level lock-less programming for multicore,” in Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems (ACACES) — Poster Abstracts, Fiuggi, Italy, 2012.
M. Aldinucci, A. Bracciali, and P. Liò, “Formal Synthetic Immunology,” ERCIM News, vol. 82, pp. 40-41, 2010.

Talks

Would you like to know more about FastFlow? Invite us for a talk at your university or company!

FastFlow: high-level programming patterns with non-blocking lock-free run-time support. Politecnico di Milano, Dipartimento di Elettronica ed informazione, Milano, Italy, December 2012.
A Parallel Edge Preserving Algorithm for Salt and Pepper Image Denoising. Intl. Conference on Image Processing Theory Tools and Applications (IPTA), Istanbul, Turkey, October 2012.
Turning Big data into knowledge: Techniques and Tools for Parallel Computing on Online Data Streams in Systems Biology and Epidemiology. Invited talk given at BioIt World conference, Vienna, Austria. October 2012.
FastFlow: high-level programming patterns with non-blocking lock-free run-time support. Invited talk given at UPMARC Workshop on Task-Based Parallel Programming, Uppsala, Sweden. September 2012.
An efficient Unbounded Lock-Free Queue for Multi-Core Systems. Talk given at Euro-Par 2012, Rhodes Island, Greece. August 2012.
Targeting distributed systems in FastFlow. Talk given at CoreGrid Symposium (co-located with Euro-Par 2012), August 2012.
Pattern-based Parallel Edge Preserving Algorithm for Salt-and-Pepper Image Denoising. Talk given at HPC Advisory Council Swiss Workshop, March 2012. Video of the talk from InsideHPC (including several comments on theory of synchronization in the shared memory)
Accelerating code on multi-cores with FastFlow. Talk given at Euro-Par 2011, Bordeaux, France. September 2011.
On Parallelizing On-Line Statistics for Stochastic Biological Simulations. Talk given at HiBB 2011: High Performance Bioinformatics and Biomedicine, Bordeaux, France. September 2011.
On Designing Multicore-Aware Simulators for Biological Systems. Talk given at IEEE PDP 2011: Parallel Distributed and network-based Processing, Ayia Napa, Cyprus. February 2011.
Skeletons from grids to multicores. Invited talk at Vrije University, Amsterdam, The Netherlands. October 2010.
Porting Decision Tree Algorithms to Multicore Using FastFlow. Extended version of the slides presented at ECML-PKDD 2010, Barcelona, Spain. September 2010.
FastFlow: a pattern-based programming framework for multicores. Dagstuhl seminar 10191, Schloss Dagstuhl, Germany. May 2010. Invited. Slides available on-demand.
FastFlow: why we need yet another programming framework. Guest lecture, Computer science Dept. Queen’s University Belfast, UK. March 2010. Invited within Master Course CSC4005. Slides available on-demand.
Efficient Smith-Waterman on multi-core with FastFlow. Talk given at IEEE PDP 2010: Parallel Distributed and network-based Processing, Pisa, Italy. February 2010.
Efficient streaming applications on multi-core with FastFlow: the biosequence alignment test-bed. Talk given at ParCo 2009, Lyon, France. September 2009.

Other Papers

Papers from people not directly involved in FastFlow design. The listing is not meant to be complete or always up to date.

Suresh Boob, Horacio González–Vélez, Alina Madalina Popescu. Automated Instantiation of Heterogeneous FastFlow CPU/GPU Parallel Pattern Applications in Clouds, in Proc. of Intl. Euromicro PDP 2014: Parallel Distributed and network-based Processing, Torino, Italy, 2014.
Mehdi Goli, Michael T. Garba, Horacio González–Vélez. Streaming Dynamic Coarse-Grained CPU/GPU Workloads with Heterogeneous Pipelines in FastFlow, in: Proc. of the Intl. Conference on High Performance Computing and Communications (HPCC), 2012.
Alexander Collins, Christian Fensch, Hugh Leather. Optimization Space Exploration of the FastFlow Parallel Skeleton Framework, in: HiPEAC'12/HLPGPU: High-level programming for heterogeneous and hierarchical parallel systems, Paris, France. January 2012.
Zalán Szűgyi, Norbert Pataki. Generative Version of the FastFlow Multicore Library, Electronic Notes in Theoretical Computer Science, Vol. 279, Issue 3, 27 Dec. 2011, pages 73–84 (Proc. of the 3rd Workshop on Generative Technologies), 2011.
Alexander James Collins. Automatically Optimising Parallel Skeletons, M.Sc. Thesis in Computer Science, School of Informatics University of Edinburgh, UK, 2011.