GPU-based parallel implementations of algorithms are usually compared against the corresponding sequential versions compiled for a single-core CPU machine, without taking advantage of the multi-core and SIMD capabilities of modern processors. This leads to unfair comparisons, in which the reported speed-up figures are much larger than what would actually be obtained if the CPU-based version were properly parallelized and optimized. The availability of OpenCL, which compiles parallel code for both GPUs and multi-core CPUs, has made it much easier to compare the execution speed of different architectures while fully exploiting each architecture’s best features. We tested our latest parallel implementations of Particle Swarm Optimization (PSO), compiled under OpenCL for both GPUs and multi-core CPUs and separately optimized for the two hardware architectures. Our results show that, for PSO, a GPU-based parallelization is still generally more efficient than a multi-core CPU-based one. However, the speed-up of the GPU-based version over the CPU-based one is far lower than the orders-of-magnitude figures reported in papers that compare GPU-based parallel implementations with basic single-thread CPU code.

OpenCL implementation of particle swarm optimization: A comparison between multi-core CPU and GPU performances / Cagnoni, Stefano; Bacchini, A.; Mussi, Luca. - PRINT. - 7248:(2012), pp. 406-415. (Paper presented at the EvoApplications 2012 conference, held in Malaga, 11-13 April 2012) [10.1007/978-3-642-29178-4_41].

OpenCL implementation of particle swarm optimization: A comparison between multi-core CPU and GPU performances

CAGNONI, Stefano; MUSSI, Luca
2012-01-01

2012
9783642291777

Use this identifier to cite or link to this document: https://hdl.handle.net/11381/2651925