Details

MASIVNO VZPOREDNE BINARNE NEVRONSKE MREŽE ZA PROGRAMIRLJIVA VEZJA
MUROVIČ, TADEJ (Author), Trost, Andrej (Mentor)

PDF - Presentation file (4.28 MB)
MD5: BCC0ABEA6813E031D147DF6E1D9467B4

Abstract
Designing efficient machine learning algorithms for data processing on the edge has been at the forefront of academic and industrial research in recent years. To meet application requirements and the constraints of edge hardware (latency, power consumption, size), binary neural networks and their massively parallel variants have been developed. Binary neural networks implemented in purely combinational, or asynchronous, circuits provide efficient resource usage and exceptional speed. In this dissertation I describe and investigate in detail parallel combinational binary neural networks for FPGA programmable circuits and their use in edge computing applications, which includes training, constructing, and implementing the networks. I developed a tool for high-level synthesis of hardware description code for massively parallel binary neural network (MPBNN) circuits. The tool enables fast training and construction for an arbitrary training set or application. Using this tool, I built, trained, and synthesized networks for example edge applications such as machine vision, Internet packet classification, and experimental physics. Synthesis shows that MPBNNs achieve latencies below 30 ns for all example applications, which enables fast classification in edge data processing systems. In addition, power consumption is lower than in comparable work on quantized neural networks. The number of lookup tables required does not exceed 60 thousand in any of the examples, which enables implementation on lower-cost FPGA chips. Compared with more capable networks, the synthesized networks are among the smaller ones in terms of the number of layers and neurons. Both previous work and my own findings indicate that new circuit architectures or optimization techniques are needed to reduce logic size. I therefore developed, described, and built into the tool three new optimization algorithms that enable more efficient circuits for parallel binary networks. The first algorithm looks for similarities between neuron weights in order to reduce the number of adders required.
The results show that, compared with a direct implementation, the proposed optimization improves logic size by 24.7 % for the machine vision circuit, 39.9 % for the experimental physics circuit, and 38.1 % for the network packet classification circuit. Power consumption also improves by 37.5 % to 51.9 %. To test the second algorithm, which looks for similarities between consecutive neurons, I built networks specifically for Internet packet classification applications. I trained small networks on the NSL-KDD and UNSW-NB15 datasets and achieved accuracies from 77.77 % to 98.96 %, which are comparable to similar work. Synthesis shows that these optimized networks use 8606 to 17990 lookup tables and have a latency of up to 19 ns, which enables their use in modern high-speed Internet networks. To test the third algorithm, which merges clusters of neurons, I developed a complete procedure for detecting ships in satellite images. The procedure comprises preprocessing algorithms, the inference step of combinational networks on an FPGA, and postprocessing algorithms. The circuit achieves a latency of up to 38.2 ns with 0.425 W of power consumption and only 19000 lookup tables. This enables use of the circuit in low-cost FPGA systems and at high frame rates.

Language:Slovenian
Keywords:binarne nevronske mreže, masivno vzporedne nevronske mreže, robno računalništvo, robna obdelava podatkov, FPGA, kombinacijska vezja
Work type:Doctoral dissertation
Organization:FE - Faculty of Electrical Engineering
Year:2021
PID:20.500.12556/RUL-124547
Publication date in RUL:31.01.2021
Views:2311
Downloads:229

Secondary language

Language:English
Title:MASSIVELY PARALLEL BINARY NEURAL NETWORKS FOR PROGRAMMABLE CIRCUITS
Abstract:
Designing efficient machine learning algorithms for near-sensor data processing on the edge has been at the research forefront in recent years. To meet the constraints of edge processing, massively parallel binary neural networks have been developed. Binary neural networks implemented in purely combinational circuits provide resource utilization efficiency and performance. This thesis describes and researches massively parallel combinational binary neural network logic and its use in real-world deployment situations, which include training and constructing networks for a variety of examples. A high-level synthesis toolchain is designed, which enables users to produce hardware description language models of combinational binary neural network circuits directly from application datasets. Standard and optimized combinational architectures are built for different edge processing applications by using this toolchain. For machine vision, Ethernet packet classification, and experimental physics as edge processing examples, hardwired Verilog hardware description language code is built using the toolchain. It is synthesized for an FPGA system to create designs for a set of concrete edge processing problems. Synthesis results show that massively parallel binary neural networks use minimal resources and achieve inference delays of less than 30 ns, which is crucial for high-speed applications, with less than 2 W of power consumption and fewer than 60,000 FPGA slices. This shows that parallel binary neural networks enable efficient hardware machine learning performance for a variety of edge processing problems. However, both these examples and previous work lead to the conclusion that more efficient circuit design and optimization algorithms are still needed. Therefore, I design, describe, and implement into the toolchain three novel optimization techniques that require fewer adders and fewer overall operations for parallel neuron activation computations. 
The first proposed optimization algorithm looks for similarities between the neurons to reduce the number and size of adders needed. It reaches a 39.9 % improvement in terms of FPGA slice usage, a 28.2 % improvement in nets used, and a 51.9 % reduction in power consumption compared to the naive implementation. With the second optimization algorithm, called the genetically optimized ripple architecture, networks are constructed and trained with the aim of classifying Ethernet packets efficiently for intrusion detection systems. Shallow, single-hidden-layer binary neural networks are trained on the benchmark NSL-KDD and UNSW-NB15 datasets and achieve accuracy rates (77.77 % to 98.96 %) comparable to those of similar compact networks used for detecting intrusions. These networks are then implemented in FPGA using this novel combinational ripple architecture, which is optimized using a genetic algorithm and uses neuron-to-neuron similarities to achieve state-of-the-art performance in terms of resource usage (8,606 to 17,990 lookup tables) and classification latency (16–19 ns). With the third optimization algorithm, I present the development and simulation of a ship-detecting edge-processing system for deployment on an aerial FPGA platform. A ship detection chain was developed with imager-specific pre-processing algorithms, massively parallel FPGA neural network inference, and host post-processing procedures. The ship detection binary neural network is implemented in combinational logic, which enables high frame and detection rates, and achieves 93.59 % patch classification accuracy. A new algorithm for optimizing a combinational binary neural network circuit is presented that merges multiple neurons in a network layer, taking advantage of similarities between neuron weights, which leads to lower adder logic size and power consumption. 
Thus, state-of-the-art performance is achieved in comparison to the naive implementation and similar previous works using combinational binary neural networks, with 38.2 ns inference latency, 0.425 W of power dissipation, and only 19,000 FPGA slices.
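
The idea that similar neuron weights can reduce adder logic can be illustrated with a small software model. The sketch below is not the dissertation's toolchain or any of its three algorithms; it assumes the common XNOR-and-popcount formulation of a binary neuron with an integer threshold, and all function names are hypothetical. It shows that two neurons whose weights agree on some inputs can reuse one partial sum for those inputs, and that on the inputs where their weights differ the second neuron's match count follows from the first.

```python
from typing import List, Sequence


def popcount_match(inputs: Sequence[int], weights: Sequence[int]) -> int:
    """Count positions where the input bit equals the weight bit (XNOR, then popcount)."""
    return sum(1 for x, w in zip(inputs, weights) if x == w)


def binary_neuron(inputs: Sequence[int], weights: Sequence[int], threshold: int) -> int:
    """Assumed neuron model: fire (1) when the match count reaches the threshold."""
    return 1 if popcount_match(inputs, weights) >= threshold else 0


def layer_naive(inputs: Sequence[int],
                weight_rows: Sequence[Sequence[int]],
                thresholds: Sequence[int]) -> List[int]:
    """Direct evaluation: every neuron sums over all of its inputs independently."""
    return [binary_neuron(inputs, w, t) for w, t in zip(weight_rows, thresholds)]


def pair_shared(inputs: Sequence[int],
                w_a: Sequence[int], w_b: Sequence[int],
                t_a: int, t_b: int) -> List[int]:
    """Evaluate two neurons while sharing the sum over inputs where their weights agree."""
    same = [i for i in range(len(w_a)) if w_a[i] == w_b[i]]
    diff = [i for i in range(len(w_a)) if w_a[i] != w_b[i]]
    # Partial sum computed once and used by both neurons.
    shared = sum(1 for i in same if inputs[i] == w_a[i])
    # Where the weights differ, an input bit matches exactly one of the two neurons.
    diff_a = sum(1 for i in diff if inputs[i] == w_a[i])
    diff_b = len(diff) - diff_a
    return [1 if shared + diff_a >= t_a else 0,
            1 if shared + diff_b >= t_b else 0]


x = [1, 0, 1, 1, 0, 0, 1, 0]
w_a = [1, 0, 1, 0, 0, 1, 1, 0]
w_b = [1, 0, 0, 0, 1, 1, 1, 0]
print(layer_naive(x, [w_a, w_b], [5, 5]))
print(pair_shared(x, w_a, w_b, 5, 5))
```

Both calls give the same outputs for any input vector; in hardware, the shared partial sum would correspond to adder logic that is instantiated once instead of once per neuron.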

Keywords:binary neural networks, massively parallel neural networks, edge processing, FPGA, combinational circuits
