ارائه روش هاي پيشنهادي به منظور موازي سازي الگوريتم چينش VPR

چكيده انگليسي :

The Proposed Methods to Parallelize the VPR Placement Milad Ghorbani Moghaddam m ghorbanimoghaddam@ec iut ac ir June 19 2011 Department of Electrical and Computer Engineering Isfahan University of Technology Isfahan 84156 83111 Iran Degree M Sc Language Farsi Dr Kiarash Bazargan kia@cc iut ac irAbstractImprovements in the fabrication technology have resulted in the exponential growth of FPGA size andcomplexity in recent years One of the most critical stages in the FPGA CAD flow is the block placementthat tries to place the logic elements on the FPGA surface in order to shorten the wire length and the timedelay of the circuit In most designing tools annealing methods are widely chosen due to superior quality ofresults and robustness In terms of desktop computing power technology improvements have resulted in an increase in the numberof parallel cores and consequently achieving potentially significant speed ups by applying parallelizedprogramming techniques In recent years devices such as General Purpose computing on GraphicsProcessing Units GPGPU have offered a promising solution to improve runtime using only commodityhardware Therefore changing the serial version of an algorithm to a parallel one which is able to act onmultiple threads is a suitable way for gaining a high speed up In this thesis we studied the VPR tool which is one of the most popular FPGA placement and routing toolsin research and industry The VPR placement algorithm works based on the simulated annealing method thatacts serially as a result finding a proper parallel method to achieve a good speed up on multiple threads andmaintain exactly the same deterministic result as the serial version seems difficult On the other hand if oneis to implement a fully parallel implementation using completely independent annealing moves the speedupwould be great but the quality drastically worse than the serial version Therefore our goal was to conquerthis serial nature using multiple threads concurrently to obtain proper speed up without a significantreduction in the quality of result We proposed and analyzed four parallel methods and finally simulatedthem on a CPU mimicking to be able to run on SIMD device architecture such as a GPGPU with hundredsof concurrent threads The parallel moves method which tries to distribute the swapping function betweenthe threads can yield good speed up on a number of limited threads but the quality might be reduced whenwe increase the number of the threads While the upper bound of the speed up achieved by a method basedon the speculative computation which is dependent on the rate of the move acceptance in the VPR waslimited The idea was starting a new move using the predicted result of the current move The third method called the average coordination of the blocks failed due to irresolvable block conflicts It tried to derive thenew placement by gathering the results of all of the threads and finding the average blocks coordinationbetween them in order to improve the quality of the results obtained by the Markov chain method Finallyour last proposed method was partitioning based In this method we try to divide the main problem into anumber of sub problems and then solve these sub problems concurrently by our parallel processors But inorder not to degrade the quality of results the speed up got limited tightly while using hundreds of threadsKey WordsParallelizing FPGA Placement VPR Simulated Annealing

قرباني مقدم، ميلاد

ارائه روش هاي پيشنهادي به منظور موازي سازي الگوريتم چينش VPR

FPGA , مكان دهي , گداختگي شبيه سازي شده