Preparing OpenCL programs for efficient execution on different architectures
ŠEMROV, JURE (Author), Lotrič, Uroš (Mentor)

PDF - Presentation file (1.04 MB)
MD5: 5DAB64A9040640783829DCFE109DD693

Abstract
In this thesis we focus primarily on the question of how to write OpenCL programs so that they run efficiently on different architectures. The difficulty we face is the architectural differences between systems. To achieve maximum efficiency, we must therefore adapt the program accordingly. The adaptations cover the number of compute units, the number of threads per work-group, the use of the vector unit, local memory and caches, and other ways of hiding latency. In short, we must exploit any architectural advantages of the device and parallelism at both the instruction level and the thread level. We examine five programs: histogram, matrix multiplication, prefix sum, the n-body problem and bitonic sort. We adapt these programs to three different systems: the Intel Core i5-2450M CPU, the Xeon Phi 5110P manycore processor and the Nvidia Tesla K20 GPU. To try out these adaptations in practice, we measured the execution time of the programs for different work-group sizes and tried to work out what was happening. Generalizing our findings, we can assume that there should be at least as many work-groups as there are compute units, and that the work-groups should be just large enough to reduce work-group switching overhead and memory latency without increasing communication overhead or reducing the number of work-groups that run concurrently on a compute unit. For efficient execution on the CPU and the manycore processor we must take into account the caches and the width of the vector unit, whereas on the GPU we must make the best possible use of its high throughput and hide latency with a large number of threads and local memory.

Language: Slovenian
Keywords: OpenCL, heterogeneous systems, compute units, work-groups, threads, SIMD, SIMT, local memory
Work type: Bachelor thesis/paper (mb11)
Organization: FRI - Faculty of Computer and Information Science
Year: 2017

Secondary language

Language: English
Title: Optimizing OpenCL programs for different hardware architectures
Abstract:
The main question this thesis addresses is how to write OpenCL programs so that they run efficiently on different architectures. The problem to overcome is the architectural differences between systems. To maximize efficiency, we need to adapt the program to each system. The adaptations concern the number of compute units, the number of threads in a work-group, and the use of the vector unit, local memory and caches to hide latency. In short, we need to exploit both instruction-level and thread-level parallelism as well as any other architectural advantages of the device. We used five programs: histogram, matrix multiplication, prefix sum, the n-body problem and bitonic sort. We adapted them to three different systems: the Intel Core i5-2450M CPU, the Xeon Phi 5110P manycore processor and the Tesla K20 GPU. To test these adaptations in practice, we measured program run time for different work-group sizes and tried to explain the observed behavior. Our conclusions show that we need at least as many work-groups as there are compute units. Work-groups have to be large enough to reduce the overhead of switching between work-groups and to hide memory latency. At the same time, they should be small enough to limit communication overhead and to keep several work-groups executing simultaneously on each compute unit. To execute programs efficiently on a CPU or a manycore processor, we need to take into account the caches and the width of the vector unit, while on a GPU we need to exploit high memory throughput and hide latency with large work-groups and local memory.

Keywords: OpenCL, heterogeneous systems, compute unit, work-groups, work-items, SIMD, SIMT, local memory
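
The rule of thumb stated in the abstract (at least as many work-groups as compute units, with work-groups otherwise as large as practical) can be illustrated with a minimal sketch. This is not the author's code: the function name `pick_work_group_size`, the candidate sizes and the example numbers are all hypothetical, and the sketch covers only the lower bound on the number of work-groups. In a real OpenCL host program the compute-unit count would typically come from a device query such as `CL_DEVICE_MAX_COMPUTE_UNITS`, and the final size would still have to be confirmed by measurement, as the thesis does.

```python
# Hypothetical helper illustrating the abstract's rule of thumb.
# All names and default values here are assumptions for illustration only.

def pick_work_group_size(global_size, compute_units,
                         candidates=(256, 128, 64, 32, 16, 8, 4, 2, 1)):
    """Return the largest candidate work-group size that divides
    global_size evenly and still yields at least `compute_units`
    work-groups. Falls back to 1 if no candidate qualifies."""
    for size in sorted(candidates, reverse=True):
        if global_size % size != 0:
            continue  # work-group size must divide the global size evenly
        if global_size // size >= compute_units:
            return size  # largest size that keeps every compute unit busy
    return 1  # too few work-items to occupy every compute unit


if __name__ == "__main__":
    # 4096 work-items on a hypothetical device with 13 compute units
    print(pick_work_group_size(4096, 13))
    # the same problem on a hypothetical device with 60 compute units
    print(pick_work_group_size(4096, 60))
```

With these assumed inputs the device with more compute units gets a smaller work-group size, because larger groups would leave some compute units without work.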
