Differences

This shows you the differences between two versions of the page.

ffnamespace:performance [2014/08/31 02:46]
aldinuc
ffnamespace:performance [2014/08/31 02:52] (current)
aldinuc
Line 1: Line 1:
 ===== Applications and Performances =====
-==== NGS tools (Bowtie2, BWA) ====
+==== [2014] NGS tools (Bowtie2, BWA) ====
 Bowtie2.0.6, Bowtie-2.2.1, and BWA are compared in performance against their ports to the FastFlow library. Tested on
      * Intel 4-socket 8-core Nehalem (64 HT) @2.0GHz, 72MB L3, 64 GB mem, Linux x86_64
Line 12: Line 12:
 |{{:ffnamespace:bowtie2-speedup.png?300|}}|{{:ffnamespace:bowtie-bwa-maxspeedup.png?300|}}|
  
 +==== [2012] Yadt-ff (parallel C4.5) ====
 +The well-known C4.5 statistical classifier is doubly hard to parallelize. First, data miners are reluctant to spend time on yet another brand new parallel version :-) Past experience has shown that small improvements to the sequential algorithm can yield much more performance than a substantial investment in parallelization. This does not mean that parallelization is useless but, at least in our understanding, that a low-effort, conservative parallelization is the only kind likely to be welcomed by the data-mining community. Second, that kind of parallelization, i.e. loop and recursion parallelization, is technically complex: the independent tasks generated in this way may exhibit several undesirable properties, including a very wide variability in task size, which in turn may induce both severe synchronization overheads and non-trivial load-balancing problems that limit the speedup.
  
 +The YaDT-FastFlow application addresses both problems. [[http://ieeexplore.ieee.org/Xplore/login.jsp?url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F9460%2F30023%2F01374196.pdf|YaDT]] is a third-party, main-memory implementation of a C4.5-like decision tree algorithm by Salvatore Ruggieri. YaDT-FastFlow is a //low-effort// parallelization of the sequential algorithm that required less than 10 hours of development (including tuning and testing) while producing a significant speedup over the sequential version.
 +
 +This application aims to demonstrate the ability of FastFlow and the FastFlow accelerator to support rapid and efficient development via semi-automatic parallelization of loops and Divide&Conquer in third-party and legacy code.
 +
 +Stay tuned for a brand new Technical Report on this work. The code will be made publicly available with the Technical Report. The C4.5-FastFlow application has been developed in cooperation with Salvatore Ruggieri, University of Pisa, Italy.
 +
 +=== Performances ===
 +Tests on andromeda (2 x quad-core HT - 16 contexts, Linux) and ottavinareale (2 x quad-core, Linux).
 +
 +|{{:ffnamespace:model_cr2_speedup.png?320|Speedup on andromeda}}|{{:ffnamespace:ottavina_cr2_speedup.png?320|Speedup on ottavinareale}}|
 +| On Andromeda (HT, 8 cores, 16 contexts) | On Ottavinareale (8 cores) |
  
 <note important>​ <note important>​
Line 124: Line 137:
  
  
-==== Yadt-ff (parallel C4.5)  ==== 
-The well-known C4.5 statistical classifier is a double hard algorithm. First of all, because data-miners simply would not like to spend time on a yet another brand new parallel version :-) Many past experiences demonstrated that tiny improvements of the sequential algorithm could bring much more performance than a robust investment on parallelization. This clearly does not absolutely mean that parallelization is useless, but, at least in our understanding,​ that a low-effort and conservative parallelization is the only fairly welcome parallelization in the data-mining community. Unfortunately that kind of parallelization,​ i.e. loop and recursion parallelization,​ is technically complex because independent tasks generated in this way may exhibit several non nice proprieties,​ including a huge range of variability in the task size that in turn may induce both severe synchronization overheads and non-trivial load balancing problems that limit the speedup. 
  
-The YaDT-FastFlow application faces both problems. [[http://​ieeexplore.ieee.org/​Xplore/​login.jsp?​url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel5%2F9460%2F30023%2F01374196.pdf|YaDT]] is a third-party,​ main-memory implementation of the C4.5-like decision tree algorithm by Salvatore Ruggieri. YaDT-FastFlow is a //​low-effort//​ parallelization of the sequential algorithm that required less than 10 hours of development (including tuning and testing) while producing a significant speedup over the sequential version. 
- 
-This application aims at demonstrating the ability of FastFlow and FastFlow accelerator to support rapid and efficient development via semi-automatic parallelization of loops and Divide&​Conquer in third-party and legacy codes. ​ 
- 
-Stay tuned for a brand new Technical Report about that. The code will be publicly available with the Technical Report. The C.4.5-FastFlow application has been developed in cooperation with Salvatore Ruggieri, University of Pisa, Italy. ​ 
- 
-=== Performances === 
-Tests on andromeda (2 x quad-core HT - 16 contexts, Linux) and ottavinareale (2 x quad-core, Linux). 
- 
-|{{:​ffnamespace:​model_cr2_speedup.png?​320|Speedup on ottavinareale}}|{{:​ffnamespace:​ottavina_cr2_speedup.png?​320|Speedup on andromeda}}| 
-| On Andromeda (HT, 8 cores, 16 contexts) | On Ottavinareale (8 cores) | 
 ==== Smith-Waterman ====
 In bioinformatics, sequence database searches are used to find the
ffnamespace/performance.1409445968.txt.gz · Last modified: 2014/08/31 02:46 by aldinuc