Learn FPGA (from Verilog to HLS)
2022-04-23 09:19:00 【Fei Xiaoxing】
[Statement: all rights reserved. Reprints are welcome, but not for commercial use. Contact: feixiaoxing @163.com]
HLS stands for high-level synthesis, meaning that circuit synthesis starts from a higher-level language. In the past there was only one way to program an FPGA: write a hardware description language such as Verilog. Being able to write synthesizable logic in C or C++ has therefore become very important, since C/C++ engineers far outnumber FPGA engineers.
1、The importance of HLS
In a sense, HLS will greatly expand the fields where FPGAs are applied. Compared with MCUs and ARM SoCs, FPGAs today are still concentrated in scenarios such as signal sampling, digital signal processing, and A/D conversion. In artificial intelligence, however, FPGAs have made little mark. The GPU is the counterexample: it started out as nothing more than a graphics accelerator, then kept pushing into games and artificial intelligence, so its application fields kept widening and the companies behind it kept growing.
2、The tension between C and concurrency
C code is inherently serial, which is not the same thing as concurrency. Seen this way, it can be regarded as a special, fully serialized kind of FPGA code. The difficulty in design is therefore not the language itself but thinking in parallel. C code does not become a netlist directly: it is first translated into Verilog, and the Verilog is then turned into a netlist. To make this possible, certain modifications and restrictions are placed on the C language, and that is what HLS is for.
3、Verilog and waveforms cannot be discarded
For protocols with strict timing requirements, Verilog cannot be given up, because these are hard to implement with HLS. General logic, non-standard algorithmic logic, and acceleration of large-scale algorithms are where HLS is strong. Moreover, after the HLS conversion you still have to confirm that the result is what you wanted. Besides checking the synthesis results, you also need to look at the corresponding waveforms, and nothing replaces that. Do not expect performance to improve immediately just because the code is well written.
4、Write the C code first, then optimize with HLS
For software engineers, the first step is to get the C logic right, then optimize step by step. The basic way to optimize is to add directives, that is, pragma markers. There are three basic techniques: 1. process data while it is still being collected; 2. parallelism; 3. pipelining. In essence, all three break serial restrictions in order to reduce the algorithm's latency.
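As a rough illustration of the three techniques, the sketch below assumes the pragma spellings used by Xilinx/AMD Vitis HLS (`DATAFLOW`, `PIPELINE`, `UNROLL`). The text above does not name a tool, and the function names and the size `N` are made up for the example.

```c
#define N 8  /* hypothetical buffer size, not from the original text */

/* Stage 1: PIPELINE asks the tool to start a new iteration every cycle. */
static void scale(const int in[N], int out[N])
{
    for (int i = 0; i < N; i++) {
#pragma HLS PIPELINE II=1
        out[i] = in[i] * 2;
    }
}

/* Stage 2: UNROLL factor=2 asks for two copies of the loop body side by side. */
static void offset(const int in[N], int out[N])
{
    for (int i = 0; i < N; i++) {
#pragma HLS UNROLL factor=2
        out[i] = in[i] + 1;
    }
}

/* Top level: DATAFLOW asks the tool to let stage 2 start consuming tmp
   while stage 1 is still producing it (process while collecting). */
void scale_then_offset(const int in[N], int out[N])
{
#pragma HLS DATAFLOW
    int tmp[N];
    scale(in, tmp);
    offset(tmp, out);
}
```

An ordinary C compiler ignores the `HLS` pragmas, so on a PC the function simply runs serially and computes `out[i] = 2 * in[i] + 1`. Whether the tool actually achieves the requested overlap has to be checked in the synthesis report.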
5、HLS still needs timing diagrams and waveform design
By default, HLS generally processes things serially. Take the following code, for example:
for (int i = 0; i < 10; i++)
{
    b[i] = a[i] + c + d;
}
Without any directive, the operations in the loop body are essentially executed one after another, 10 times. If you need explicit acceleration, you can unroll the loop or pipeline it, which generally speeds up the processing. Acceleration is not free, though. The basic approach trades space for time, so there is a trade-off: the algorithm may get faster, but the resources may run out. A smart approach is to draw the expected timing diagram first and then, at the testbench stage, compare the simulated waveforms against it. That gets twice the result with half the effort. It is equally important not to over-optimize.
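The variants discussed above can be sketched as follows. This is a minimal example that assumes Vitis HLS style pragmas (`PIPELINE`, `UNROLL`, `ARRAY_PARTITION`). The function names and `LEN` are hypothetical, and the resource comments describe the intent of each directive rather than a measured result.

```c
#define LEN 10

/* Baseline: no directives, iterations run one after another. */
void add_serial(const int a[LEN], int b[LEN], int c, int d)
{
    for (int i = 0; i < LEN; i++) {
        b[i] = a[i] + c + d;
    }
}

/* Pipelined: one copy of the adders is reused and iterations overlap in time. */
void add_pipelined(const int a[LEN], int b[LEN], int c, int d)
{
    for (int i = 0; i < LEN; i++) {
#pragma HLS PIPELINE II=1
        b[i] = a[i] + c + d;
    }
}

/* Fully unrolled: the loop body is replicated LEN times (space for time).
   The arrays are partitioned so every element can be reached in the same
   cycle; otherwise memory ports would limit the parallelism. */
void add_unrolled(const int a[LEN], int b[LEN], int c, int d)
{
#pragma HLS ARRAY_PARTITION variable=a complete dim=1
#pragma HLS ARRAY_PARTITION variable=b complete dim=1
    for (int i = 0; i < LEN; i++) {
#pragma HLS UNROLL
        b[i] = a[i] + c + d;
    }
}
```

All three functions compute the same values, so a C testbench can only confirm that they agree. The differences in latency and resource usage show up in the synthesis report and in the simulated waveforms.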
6、Pay attention to interfaces, memory, and the functions HLS provides
How the generated logic communicates with bus interfaces, and how memory is mapped inside a function, are both handled by mechanisms that HLS provides. In addition, HLS offers optimized versions of common functions. In particular, some of the functions provided by OpenCV have corresponding HLS versions.
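As one possible illustration of the interface mapping, the sketch below assumes the `INTERFACE` pragma in the form accepted by Xilinx/AMD Vitis HLS, with `m_axi` for memory-mapped arrays and `s_axilite` for control registers. The function name, the bundle name `gmem`, and the `depth` value are assumptions for the example; the exact option spelling varies between tool versions and should be checked against the tool's documentation.

```c
#define LEN 10

void add_offsets_axi(const int *a, int *b, int c, int d)
{
    /* Assumed mapping: the arrays live in external memory and are
       reached through an AXI master port named gmem. */
#pragma HLS INTERFACE m_axi port=a offset=slave bundle=gmem depth=10
#pragma HLS INTERFACE m_axi port=b offset=slave bundle=gmem depth=10
    /* Assumed mapping: scalars and the block-level start/done control
       are exposed as AXI4-Lite registers for the host processor. */
#pragma HLS INTERFACE s_axilite port=c
#pragma HLS INTERFACE s_axilite port=d
#pragma HLS INTERFACE s_axilite port=return

    for (int i = 0; i < LEN; i++) {
#pragma HLS PIPELINE II=1
        b[i] = a[i] + c + d;
    }
}
```

The C body is unchanged from the plain loop; only the pragmas describe how the arguments reach the outside world. On a PC the pragmas are ignored and the function behaves like any other C function.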
7、Another way to think about learning HLS
A good approach is to learn HLS by comparison with concurrent programming, for example OpenMP. OpenMP code is really just C or C++, but #pragma directives make it run concurrently, which is somewhat similar to HLS. HLS usually suits the optimization of complex algorithms and does not suit precise hardware protocols. If FPGAs are to be applied in more settings, HLS is, at least for now, the only way.
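For comparison, here is the same kind of loop written with OpenMP. The function name and `LEN` are hypothetical. The point is only that one `#pragma` line changes how the loop is executed without changing the C code itself.

```c
#define LEN 10

void add_offsets_openmp(const int a[LEN], int b[LEN], int c, int d)
{
    /* With OpenMP enabled (for example gcc -fopenmp) the iterations are
       divided among threads; without it the pragma is ignored and the
       loop runs serially. */
#pragma omp parallel for
    for (int i = 0; i < LEN; i++) {
        b[i] = a[i] + c + d;
    }
}
```

The iterations are independent, so the result is the same either way. The difference from HLS is that OpenMP spreads the work over CPU threads at run time, while an HLS directive changes the circuit that gets generated.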
8、HLS video tutorial
https://www.bilibili.com/video/BV1J5411t7uE
PS:
Many people may find this hard to understand: if doing algorithms on an FPGA is such a hassle, why use one? I think the main reason is that a low-frequency FPGA running an algorithm can perform as well as an SoC clocked several times higher, while the circuit is relatively simple and cost and supply chain are less constrained. It is particularly well suited to non-standard products.