Designing efficient machine learning algorithms for near-sensor data processing on
the edge has been at the research forefront in recent years. To achieve the required
edge processing constraints, massively parallel binary neural networks have been
developed. Binary neural networks implemented in purely combinational circuits
provide efficient resource utilization and high performance. This thesis describes and investigates massively parallel combinational binary neural network logic and its use in real-world deployments, including the training and construction of networks for a variety of examples. A high-level synthesis toolchain
is designed, which enables users to produce the hardware description language
models of combinational binary neural network circuits directly from application
datasets. Standard and optimized combinational architectures are built for
different edge processing applications by using this toolchain. For machine vision, Ethernet packet classification, and experimental physics as edge processing examples, hardwired Verilog hardware description language code is generated using the toolchain and synthesized for an FPGA system to create designs for a set of concrete edge processing problems. Synthesis results show that massively
parallel binary neural networks use minimal resources, achieving inference delays below 30 ns (crucial for high-speed applications), power consumption below 2 W, and fewer than 60,000 FPGA slices. This shows that parallel
binary neural networks enable efficient hardware machine learning performance
for a variety of edge processing problems. However, both these examples and previous work lead to the conclusion that more efficient circuit design and optimization algorithms are still needed. Therefore, I design, describe, and implement in the toolchain three novel optimization techniques that require fewer adders and
overall operations for parallel neuron activation computations. The first proposed
optimization algorithm looks for similarities between the neurons to reduce the number and size of adders needed. It achieves a 39.9 % improvement in terms of
FPGA slice usage, a 28.2 % improvement in nets used, and a 51.9 % reduction in
power consumption compared to the naive implementation. Using the second optimization algorithm, called the genetically optimized ripple architecture, networks are constructed and trained to classify Ethernet packets efficiently for intrusion detection systems. Shallow,
single-hidden-layer binary neural networks are trained on benchmark NSL-KDD
and UNSW-NB15 datasets and achieve accuracy rates (77.77 % to 98.96 %) comparable
to those of similar compact networks used for detecting intrusions. These
networks are then implemented in FPGA using this novel combinational ripple
architecture, which is optimized using a genetic algorithm and exploits neuron-to-neuron similarities to achieve state-of-the-art performance in terms of resource usage (8,606 to 17,990 lookup tables) and classification latency (16–19 ns). With
the third optimization algorithm, I present the development and simulation of a ship-detecting edge-processing system for deployment on an aerial FPGA platform.
A ship detection chain was developed with imager-specific pre-processing
algorithms, massively parallel FPGA neural network inference, and host post-processing
procedures. The ship detection binary neural network is implemented in combinational logic, which enables high frame and detection rates and achieves 93.59 % patch classification accuracy. A new algorithm for optimizing a combinational
binary neural network circuit is presented that merges multiple neurons in
a network layer by taking advantage of similarities between neuron weights, which
leads to lower adder logic size and power consumption. Thus, state-of-the-art
performance is achieved in comparison to the naive implementation and similar
previous works using combinational binary neural networks, achieving 38.2 ns
inference latency, 0.425 W of power dissipation, and only 19,000 FPGA slices.