Document number:
15922
Call number:
14218
Author:
Tavakoli, Mohammadreza
Title:

Design and Implementation of Convolution Operation in Neural Network Systems

Degree level:
M.Sc.
Specialization:
Electronics (Integrated Circuits)
Place of study:
Isfahan: Isfahan University of Technology
Defense year:
1399
Pagination:
Fourteen, 111 p.: illustrated, tables, charts
Supervisor:
Masoud Sayedi
Descriptors:
Convolutional Neural Networks (CNN), concurrent multi-layer processing, pipeline processing, off-chip memory access, hardware implementation, FPGA
Examiners:
Amirreza Ahmadimehr, Nasrin Rezaei
Record entry date:
1399/08/06
Bibliography:
Includes bibliography
Field of study:
Electrical Engineering
Faculty:
Electrical and Computer Engineering
Record revision date:
1399/08/06
IranDoc code:
2646132
English abstract:
Design and Implementation of Convolution Operation in Neural Network Systems
Mohammadreza Tavakoli, mohammadreza.tavakoli@ec.iut.ac.ir
August 24, 2020
Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156-83111, Iran
Degree: M.Sc. — Language: Farsi
Supervisor: Prof. Sayed Masoud Sayedi, m.sayedi@cc.iut.ac.ir

Abstract: In recent years, research on deep learning models, and especially on convolutional neural networks (CNNs), has increased significantly due to their high accuracy and excellent performance in many image recognition algorithms. The huge number of computations and the large amount of data in these networks require high-performance accelerators for their hardware implementation, and many efficient accelerators have accordingly been proposed. In the conventional design approach, the CNN layers are processed iteratively, layer by layer; because of the large amount of intermediate data, the accelerator must then use off-chip memory to store data between the layers. In this work, by exploiting the dataflow across the convolutional layers, parts of the input data are stored in internal memory and, with an appropriate computation scheme, adjacent CNN layers are computed in a pipeline structure without the need to store intermediate data. In this approach, only the output data of the last layer needs to be stored in off-chip memory. To evaluate the performance of the proposed accelerator, named the MLCP architecture, three adjacent convolution layers were processed concurrently in a pipeline structure, and the results were compared with those of the SLCP architecture, in which calculations are performed layer by layer. Both the SLCP and MLCP architectures were designed at the RTL level in Verilog HDL and implemented on an FPGA chip from the Zynq-7000 family. The MLCP architecture shows a 73% on-chip storage reduction when intermediate data are stored in on-chip memory, and a 6.6-times lower off-chip memory access rate when intermediate data are stored in off-chip memory. Moreover, by applying optimization techniques and using parallel computation, the throughput of the MLCP architecture is 2.7 times higher than that of the SLCP architecture. The same approach was also used to implement the first two convolution layers of the VGG-16 network; along with achieving a performance of 232 GOPS, the number of BRAMs and the number of external memory accesses are reduced compared to traditional implementations, which increases the energy efficiency of this implementation relative to other works.
Keywords: Convolutional Neural Network (CNN), Multi-Layer Processing, Pipeline Processing, Off-chip Memory Access, Hardware Implementation, FPGA
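The key idea the abstract describes — pipelining adjacent convolution layers so that only a few rows of the intermediate feature map ever need to be buffered, instead of the whole map — can be sketched in software. The following is a minimal NumPy sketch of that fusion idea for two single-channel layers, not the thesis's actual RTL design; the function names, the row-by-row `conv2d_row` helper, and the Python-list line buffer are illustrative assumptions.

```python
import numpy as np

def conv2d_row(x, w, r):
    """Compute one output row r of a 'valid' 2D convolution
    (cross-correlation, as in CNNs) for a single channel."""
    k = w.shape[0]
    cols = x.shape[1] - k + 1
    return np.array([np.sum(x[r:r + k, c:c + k] * w) for c in range(cols)])

def fused_two_layer_conv(x, w1, w2):
    """Pipeline two conv layers: instead of materializing the full
    intermediate feature map, keep only the k2 most recent layer-1
    output rows in a rolling line buffer (the on-chip storage analog)."""
    k1, k2 = w1.shape[0], w2.shape[0]
    h1 = x.shape[0] - k1 + 1          # rows of the intermediate map
    out_rows = h1 - k2 + 1
    # prime the line buffer with the first k2 layer-1 output rows
    buf = [conv2d_row(x, w1, r) for r in range(k2)]
    out = []
    for r in range(out_rows):
        inter = np.stack(buf)          # k2-row window of the middle map
        out.append(conv2d_row(inter, w2, 0))
        if r + 1 < out_rows:           # slide: drop oldest row, compute next
            buf.pop(0)
            buf.append(conv2d_row(x, w1, r + k2))
    return np.stack(out)
```

With 3x3 kernels the buffer holds only 3 intermediate rows at any time, which is the software analog of the on-chip storage reduction the MLCP architecture reports; the result is identical to running the two layers sequentially with the full intermediate map stored.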