
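Research on the design and optimization method of CNN accelerator based on HLS tools

Application of Electronic Technique, 2021, Issue 3

Cheng Jiafeng, Wang Hongliang

National Key Laboratory for Electronic Measurement Technology, North University of China, Taiyuan, Shanxi 030051, China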

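Abstract: Following a software/hardware co-design approach, a convolutional neural network accelerator was designed and implemented on the PYNQ-Z2 platform using HLS tools. The convolution operation is optimized with a matrix-cutting method, which balances resource consumption against computing resources so that the accelerator's performance is optimized. The accelerator IP core was tested on the MNIST data set. The experimental results show that, for a single image, the accelerator achieves a speedup of 5.785 over the ARM platform, and for 1,000 images the speedup reaches 9.72; as the number of test images increases, the accelerator's performance improves further.

Keywords: convolutional neural network; PYNQ-Z2; HLS tool; accelerator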

CLC number: TN108.1
Document code: A
DOI: 10.16157/j.issn.0258-7998.200841
Chinese citation format: 程佳風，王紅亮. 基于HLS工具的CNN加速器的設計與優化方法研究[J].電子技術應用，2021，47(3)：18-21，26.
English citation format: Cheng Jiafeng, Wang Hongliang. Research on the design and optimization method of CNN accelerator based on HLS tools[J]. Application of Electronic Technique, 2021, 47(3): 18-21, 26.

Research on the design and optimization method of CNN accelerator based on HLS tools

Cheng Jiafeng, Wang Hongliang

National Key Laboratory for Electronic Measurement Technology, North University of China, Taiyuan 030051, China

Abstract: Based on the idea of software/hardware co-design, this paper uses HLS tools to design and implement a convolutional neural network accelerator on the PYNQ-Z2 platform. A matrix-cutting optimization is applied to the convolution operation to balance resource consumption against computing resources, so that the accelerator's performance is optimized. The accelerator IP core is tested on the MNIST data set. The experimental results show that, for a single image, the accelerator achieves a speedup of 5.785 over the ARM platform, and for 1,000 images the speedup reaches 9.72; as the number of test images increases, the accelerator's performance improves further.

Key words: convolutional neural network (CNN); PYNQ-Z2; HLS tool; accelerator

0 Introduction

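In recent years, convolutional neural networks (CNNs) have been applied ever more widely and in increasingly complex scenarios. Their compute-intensive and memory-intensive nature has become more pronounced and now limits fast, efficient CNN implementation. Accelerator platforms based on GPUs^[1], ASICs^[2], and FPGAs^[3] have therefore been proposed in succession to improve CNN performance. GPUs consume a great deal of power and have a fixed hardware structure, which limits the use of CNNs on embedded devices; ASICs are very costly to develop and inflexible, which makes them ill-suited to complex and changing CNNs; FPGAs offer low power consumption, high performance, and good flexibility, and are therefore better suited to developing CNN hardware acceleration. However, Verilog HDL has a high barrier to entry and a relatively long development cycle, which has hindered wider adoption of FPGAs for CNN applications^[4-5].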

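Following a software/hardware co-design approach, this paper uses HLS tools to implement a CNN accelerator on the PYNQ-Z2 and optimizes the convolution kernel operation with a matrix-cutting design method.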
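To make the matrix-cutting idea concrete, the following is a minimal Python sketch, not the authors' HLS implementation. It assumes that "matrix cutting" means splitting the convolution output into fixed-size tiles so that each tile needs only a small input patch (a stand-in for an on-chip buffer). The function names `conv2d_direct` and `conv2d_tiled` and the default tile size are hypothetical and chosen only for illustration.

```python
def conv2d_direct(image, kernel):
    """Valid (no padding, stride 1) 2D convolution in the CNN sense
    (cross-correlation) over nested lists."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


def conv2d_tiled(image, kernel, tile=4):
    """Same result as conv2d_direct, computed one output tile at a time.

    Assumption: each tile only touches a (tile + k - 1)-sized input patch,
    which is what would be copied into a small local buffer in hardware.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r0 in range(0, oh, tile):
        for c0 in range(0, ow, tile):
            th = min(tile, oh - r0)  # edge tiles may be smaller
            tw = min(tile, ow - c0)
            # Cut out only the input patch this output tile depends on.
            patch = [row[c0:c0 + tw + kw - 1]
                     for row in image[r0:r0 + th + kh - 1]]
            block = conv2d_direct(patch, kernel)
            # Write the finished tile back into the full output.
            for r in range(th):
                out[r0 + r][c0:c0 + tw] = block[r]
    return out
```

For an MNIST-sized 28×28 input and a 5×5 kernel, `conv2d_tiled(image, kernel, tile=4)` returns the same 24×24 output as `conv2d_direct(image, kernel)`. Smaller tiles need less buffer per step but more iterations, which is the kind of resource-versus-throughput trade-off a tile size would be tuned against.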



