Graphics processing units: to use or not to use?

dc.contributor.authorThambawita, D. R. V. L. B.
dc.contributor.authorEllepola, N. C.
dc.contributor.authorRagel, R. G.
dc.contributor.authorElkaduwe, D.
dc.date.accessioned2024-09-14T10:30:25Z
dc.date.available2024-09-14T10:30:25Z
dc.date.issued2013-07-04
dc.description.abstractString matching is a critical component of many database and text-processing applications. Bioinformatics, signature-based anti-virus software, and many other important applications depend heavily on the efficiency of string-matching tools. With the advent of parallel computing, the drawbacks of traditional sequential string matching have been mitigated, improving application performance. Over the past few years, the use of Graphics Processing Units (GPUs), which follow the SPMD (Single Program Multiple Data) programming model, to achieve parallelism has shown promising results. NVIDIA has introduced the CUDA (Compute Unified Device Architecture) programming API, enabling programmers to use the threaded processors of a GPU to achieve higher data parallelism. In our research, we use a basic string-matching algorithm as a benchmark for comparing CPU and low-end GPU performance on single string matching. We vary the memory type (global, constant, and shared memory), the data file size, and the number of threads on both the CPU and the GPU, and analyse the results to compare their performance trade-offs. We place the maximum workload on both the GPU and the CPU by repeating the same pattern throughout the data file. We observe performance while varying the data file size, and the experiments indicate that, when only the kernel execution time is considered, the time taken to match the strings grows more slowly with increasing data file size on the GPU than on the CPU. However, for most GPU kernels the data must be moved onto the device before the kernel can use it, which adds extra time to the computation. In our experiments, this, together with the context initialization time, degrades performance as the data load on the device increases. Consequently, when the GPU load is low, low-end GPUs perform worse than the CPU on basic string-matching operations because of the GPU initialization overhead. GPU performance improves gradually as the input data file size grows. In the next phase of the project, we plan to conduct the experiment with different data types and with GPUs that have higher bandwidth capabilities, to minimize the effect of the data transfer overhead.
dc.identifier.citationPeradeniya University Research Sessions (PURSE) 2012, Book of Abstracts, University of Peradeniya, Sri Lanka, Vol. 17, 4 July 2012, p. 79
dc.identifier.isbn9789555891646
dc.identifier.issn13914111
dc.identifier.urihttps://ir.lib.pdn.ac.lk/handle/20.500.14444/1067
dc.language.isoen
dc.publisherThe University of Peradeniya
dc.subjectComputer engineering
dc.subjectGraphics processing units
dc.titleGraphics processing units: to use or not to use?
dc.typeArticle
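
The abstract above attributes the GPU's disadvantage at small inputs largely to data transfer and initialization overhead, in contrast to the kernel execution time alone. The following is a minimal CUDA sketch, not the authors' benchmark code, of that kind of measurement: a naive single-pattern matching kernel reading text from global memory, with the host-to-device copy and the kernel execution timed separately. The pattern, data size, and launch configuration are illustrative placeholders.

```cuda
// Minimal sketch: naive single-pattern string matching on the GPU, timing the
// host-to-device transfer separately from the kernel. Assumes a synthetic text
// built by repeating the pattern, mirroring the "maximum workload" setup in
// the abstract. Pattern, sizes, and launch parameters are placeholders.
#include <cstdio>
#include <cstring>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void matchKernel(const char *text, int textLen,
                            const char *pattern, int patLen,
                            unsigned int *matchCount)
{
    // Grid-stride loop: each thread checks several candidate start positions.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x;
         i <= textLen - patLen;
         i += blockDim.x * gridDim.x) {
        bool hit = true;
        for (int j = 0; j < patLen; ++j) {
            if (text[i + j] != pattern[j]) { hit = false; break; }
        }
        if (hit) atomicAdd(matchCount, 1u);   // record one match
    }
}

int main()
{
    const char *pattern = "GATTACA";          // placeholder pattern
    const int patLen = (int)strlen(pattern);
    const int textLen = 1 << 24;              // ~16 MB synthetic text

    // Build a text that repeats the pattern end to end.
    char *hText = (char *)malloc(textLen);
    for (int i = 0; i < textLen; ++i) hText[i] = pattern[i % patLen];

    char *dText, *dPattern;
    unsigned int *dCount, hCount = 0;
    cudaMalloc(&dText, textLen);
    cudaMalloc(&dPattern, patLen);
    cudaMalloc(&dCount, sizeof(unsigned int));
    cudaMemcpy(dCount, &hCount, sizeof(unsigned int), cudaMemcpyHostToDevice);

    cudaEvent_t t0, t1, t2;
    cudaEventCreate(&t0); cudaEventCreate(&t1); cudaEventCreate(&t2);

    cudaEventRecord(t0);
    cudaMemcpy(dText, hText, textLen, cudaMemcpyHostToDevice);       // transfer
    cudaMemcpy(dPattern, pattern, patLen, cudaMemcpyHostToDevice);
    cudaEventRecord(t1);

    matchKernel<<<1024, 256>>>(dText, textLen, dPattern, patLen, dCount);
    cudaEventRecord(t2);
    cudaEventSynchronize(t2);

    float copyMs = 0.0f, kernelMs = 0.0f;
    cudaEventElapsedTime(&copyMs, t0, t1);    // host-to-device copy time
    cudaEventElapsedTime(&kernelMs, t1, t2);  // kernel execution time
    cudaMemcpy(&hCount, dCount, sizeof(unsigned int), cudaMemcpyDeviceToHost);

    printf("matches=%u copy=%.3f ms kernel=%.3f ms\n", hCount, copyMs, kernelMs);

    cudaFree(dText); cudaFree(dPattern); cudaFree(dCount);
    free(hText);
    return 0;
}
```

The same kernel could be varied along the axes mentioned in the abstract, for example by placing the pattern in constant memory or staging text tiles in shared memory, and by sweeping the data file size and thread count, to observe the trade-offs described.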

Files

Original bundle

Name: D.R.V.L.B.Thambawita.pdf
Size: 201.66 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Description: Item-specific license agreed to upon submission
