Hello Sangrae_Kim,
The VA FFT operator works in 1D, but because the FFT is separable you can extend it to 2D: run the 1D FFT per line, rotate the image, and run the 1D FFT again.
Do you require the FFT in 2D or is 1D (per line) enough?
Best regards,
Hi Simon,
Yes, you have found a valid approach for increasing the DMA performance for small images in a dynamic process.
Concatenating 10 of these images increases the DMA efficiency.
For static tasks you can simply use AppendImage to get the maximum DMA performance.
Best regards,
Dear Sangrae_Kim,
Does that mean that it only supports cameras up to 8k?
You can split a longer line into shorter parts to remove the high-frequency details.
Is this a valid approach for you?
Best regards,
Hello,
The operator SignalWidth is frequently used to generate a pulse of a defined width in the signal domain.
A common use is a pulse-width-controlled exposure pulse on the trigger interface.
Connecting Vcc to the tick input provides the system clock (mE5 = 125 MHz).
The resulting pulse width is set by the parameter Width, in units of ticks.
Best regards,
Hello together,
What are 'ticks' in relation to the system clock and the signal library?
Go here to get some more hints on it...
Best regards,
Björn
Hello together,
What are 'ticks' in relation to the system clock and the signal library?
The word 'ticks' appears often, especially in the signal library, so here is an explanation:
The system clock is the clock used by the FPGA on each VA platform.
All current microEnable5 (mE5) boards run at 125 MHz, while the older mE4 runs at 62.5 MHz.
A single tick corresponds to one period of the system clock:
system clock = S = 125 MHz
tick duration = T = 1 / S
1 / (125 MHz) = 8 nanoseconds
1 second in ticks:
S = 125 MHz
1 second is represented by 125,000,000 = 125 million ticks
duration = D = 1 ms (example value)
system clock = S = 125 MHz
tick duration = T = 1 / S = 1 / 125 MHz = 8 ns (see above)
D / T = amount of ticks representing D:
(1 millisecond) / (8 nanoseconds) = 125,000 ticks
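The conversion above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names are made up for this example and are not part of VisualApplets.

```python
# Illustrative helpers (hypothetical names, not a VisualApplets API).
ME5_CLOCK_HZ = 125_000_000   # mE5 system clock from the post
ME4_CLOCK_HZ = 62_500_000    # older mE4 system clock from the post


def tick_duration_ns(clock_hz: int) -> float:
    """Duration of one tick (one clock period) in nanoseconds."""
    return 1e9 / clock_hz


def duration_to_ticks(duration_s: float, clock_hz: int = ME5_CLOCK_HZ) -> int:
    """Number of ticks representing a duration given in seconds."""
    return round(duration_s * clock_hz)


if __name__ == "__main__":
    print(tick_duration_ns(ME5_CLOCK_HZ))   # tick length on mE5 in ns
    print(duration_to_ticks(1e-3))          # ticks for 1 ms on mE5
    print(duration_to_ticks(1.0))           # ticks for 1 s on mE5
```

For a SignalWidth pulse of 1 ms on an mE5, the value returned by duration_to_ticks(1e-3) would be the number to enter as Width.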
If you have questions about this, do not hesitate to post them below.
The community will answer as soon as possible...
Best regards,
Dear Sangrae_Kim,
To use the 1D FFT on a 16k line to remove high frequencies, you can simply split it into two 8k lines, process each of them (forward and inverse FFT), and merge both results into a single 16k line again.
As a good starting point you can use the following example:
https://docs.baslerweb.com/liv…ntent/examples%20FFT.html
Since you want to remove the high frequencies, this is a valid approach as far as I know.
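To illustrate the split, process, merge idea outside of VA, here is a minimal pure-Python sketch. It is not the VA FFT operator and not the linked example: the small radix-2 FFT, the function names and the `keep` cut-off parameter are all assumptions made for this illustration. Each segment is filtered independently, so small artefacts at the segment boundary are possible.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def lowpass_segment(segment, keep):
    """Forward FFT, zero every bin above frequency index `keep`, inverse FFT."""
    n = len(segment)
    spectrum = fft([complex(v) for v in segment])
    for k in range(n):
        if min(k, n - k) > keep:      # bins k and n-k share one frequency
            spectrum[k] = 0j
    restored = fft(spectrum, inverse=True)
    return [v.real / n for v in restored]   # normalise the inverse transform


def lowpass_line(line, segment_len, keep):
    """Split a line into segments, low-pass each one, and merge the results."""
    if segment_len & (segment_len - 1) or len(line) % segment_len:
        raise ValueError("segment_len must be a power of two dividing the line")
    merged = []
    for start in range(0, len(line), segment_len):
        merged.extend(lowpass_segment(line[start:start + segment_len], keep))
    return merged


if __name__ == "__main__":
    # Tiny stand-in for a 16k line split into two 8k halves: 16 samples, 2 x 8.
    line = [0.0, 2.0] * 8                 # mean level plus highest frequency
    print(lowpass_line(line, segment_len=8, keep=1))
```

In the real use case the line would have 16384 samples and segment_len would be 8192.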
Best regards,
Björn
Dear Community,
The design below MeasureDurationLatency_v001.va will give you exact timing and latency of the VA design and its acquisition:
An interesting part can be found within H-Box GetDataImmediately:
This function extracts the relevant image data from the CXP data stream and gives minimum latency for CXP-related data acquisition. It is only required for CoaXPress interfaces.
For all measurements in here the operator SignalToDelay is used:
The screenshot above shows how to measure the time between End of Acquisition and End of Transfer within H-Box Added_Latency.
How SignalToDelay works:
Best regards,
Dear Theo,
Your post has already started an internal discussion.
I will keep you updated concerning feedback.
Best regards,
Hello Theo,
That is an interesting point:
You know, I am thinking about documentation for people who are not developing new VAs but still need to know what is possible, because they want to use a VA in their application and are responsible for the specification.
Documentation that describes in detail which applications or processing options are possible or beneficial using VA, or even combinations of VA + CPU + GPU. I will forward this idea to our product management and marketing department.
Here you can find a simple PDF showing some slides VAoptions_SiliconSoftware_BjoernRudde_V1.01_EN.pdf on VA options in general.
It is a sub-set of slides I have shown on a series of distributors technology forums.
These slides may serve as a good introduction while this kind of documentation is still missing.
During VACC and VADC trainings some presentation material is given to all participants.
Since your question started with Parameter Libraries, you can have a detailed look into the options here, but it is focused on VA developers.
Best regards,
Dear Community,
Here you can find the official VA download now:
- Download -
related current release notes,
and the install guide.
Best regards,
Dear Community,
a VA example on exposure control using a closed-loop control can be found here including detailed explanation (link/design):
Best regards,
Dear Community,
A detailed look at bandwidth considerations, including additional calculations, can be found here:
The linked thread mainly covers the bandwidth details of a CXP system.
If you are using a CXP camera, the following formulas may be of interest.
A CXP link runs at up to 12.5 or 6.25 Gbit/s, depending on the grabber used.
Due to 8b/10b encoding, this gives per link at CXP6:
(8 / 10) * (6.25 (Gbit / s)) = 625 MB / s
A quad link ends up at:
((4 * 8) / 10) * (6.25 (Gbit / s)) = 2500 MB / s
Since the interface is protocol based, you will not get the full bandwidth for image data.
But it is at least the possible peak performance of the CXP link itself.
The possible camera sensor data output will always be below these limits.
More options are supported by the VA design, depending on the platform used:
CXP-1 | 1.25 Gbit/s | up to 212 m |
CXP-2 | 2.5 Gbit/s | up to 185 m |
CXP-3 | 3.125 Gbit/s | up to 169 m |
CXP-5 | 5 Gbit/s | up to 102 m |
CXP-6 | 6.25 Gbit/s | up to 60 m |
CXP-10 | 10 Gbit/s | up to 40 m |
CXP-12 | 12.5 Gbit/s | up to 30 m |
Max link speed for mE5-MA-VCX-QP and mE5-VQ8-CXP6D is CXP6.
Status: 2nd of July 2020, current overview
Some math again for CXP1 speed with 4 links:
((4 * 8) / 10) * (1.25 (Gbit / s)) = 500 MB / s, where this peak bandwidth is very likely transporting 400 MB/s of image data.
Similar for CXP5 configuration with one link:
((1 * 8) / 10) * (5 (Gbit / s)) = 500 MB / s
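The per-link arithmetic used in this post can be written as a small Python helper. This is only a sketch of the formula; the function and table names are made up for this example, and the result is the peak after 8b/10b decoding, not the image data rate you will actually get through the protocol.

```python
# Line rates in Gbit/s as listed in the table of this post.
CXP_LINE_RATE_GBIT = {
    "CXP-1": 1.25,
    "CXP-2": 2.5,
    "CXP-3": 3.125,
    "CXP-5": 5.0,
    "CXP-6": 6.25,
    "CXP-10": 10.0,
    "CXP-12": 12.5,
}


def cxp_peak_mb_per_s(speed: str, links: int = 1) -> float:
    """Peak bandwidth in MB/s after 8b/10b decoding (protocol overhead ignored)."""
    line_rate_bit_s = CXP_LINE_RATE_GBIT[speed] * 1e9
    payload_bit_s = line_rate_bit_s * 8 / 10     # 8 data bits per 10 line bits
    return links * payload_bit_s / 8 / 1e6       # bits -> bytes -> MB


if __name__ == "__main__":
    for speed, links in [("CXP-6", 1), ("CXP-6", 4), ("CXP-1", 4), ("CXP-5", 1)]:
        print(speed, "x", links, "->", cxp_peak_mb_per_s(speed, links), "MB/s")
```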
Always check the CXP configuration using the hardware dialog in GenICam Explorer or microDisplayX.
The link topology will tell you what you are using precisely for CXP.
Best regards
Dear IhShin,
Your question asks for a bandwidth of:
2000 * 2000 * (8 bit) * (239 Hz) = 956 MB / s
coming from the camera, while reporting a bandwidth limit at:
2000 * 2000 * (8 bit) * (96 Hz) = 384 MB / s
While your initial buffer uses 2 RAMs for 8 bit at parallelism 32 (receiving up to 20), the buffer approach inside the EDoF receives 16 bit at parallelism 32.
Hardware Configuration microEnable 5 ironman:
Resource | mE5VQ8-CXP6B/mE5VQ8-CXP6D |
---|---|
Vision Processor | Xilinx Virtex6 XC6VLX240T FPGA |
LUT | 150720 |
FlipFlop | 301440 |
Block RAM | 832 x 18432 bit |
Embedded Arithmetic Logic Unit (DSP48) | 768 |
RAM | 4 x 256 MiB DDR3 |
Data Width per RAM | 128 bit |
Bandwidth per RAM | 4 GB/s |
Base Design Clock | 125 MHz |
Host Interface | PCIe x8 Gen2 |
Host Interface (PCIe x8 Gen2) Bandwidth (theor.) | 4 GByte/s per direction on PCIe bus |
Host Interface (PCIe x8 Gen2) Bandwidth (typ./max.) | up to 3.6 GByte/s on PCIe bus |
Table 55. Hardware Configuration microEnable 5 ironman (Source)
Let's look into the details:
To get the maximum performance out of each RAM module, we have to use the full data width:
128 bit
Using the full interface, meaning bit depth * parallelism == 128 bit, gives 4 GB/s of bandwidth.
4 GB/s = (1 + 1) * 2 GB/s, one part for writing the input and one for reading the output.
The bandwidth required here is 956 MPixel/s with each pixel having 16 bit:
2000 * 2000 * (16 bit) * (239 Hz) = 1912 MB / s
Explanation: the 16 bit consist of 2 intermediate values per pixel...
Two RAM blocks are used within the EDoF:
Each RAM in here uses 4 bit at parallelism 32 = 4 bit * 32 = 128 bit
Same choice for acquisition RAM modules.
So RAM interfacing in EDoF is fine.
Since your intended bandwidth is less than parallelism * system clock = 8 * 125 MHz = 1 GPixel/s, the selected parallelism of 16 is a safe choice. No problem with this either.
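The two checks made here (RAM interface width and pixel throughput versus parallelism) can be reproduced with a short Python sketch. The helper names are invented for this illustration and are not VA operators or parameters.

```python
SYSTEM_CLOCK_HZ = 125_000_000   # mE5 base design clock
RAM_DATA_WIDTH_BITS = 128       # data width per RAM from the hardware table


def link_width_bits(bit_depth: int, parallelism: int) -> int:
    """Width of a link in bits: bit depth times parallelism."""
    return bit_depth * parallelism


def max_pixel_rate(parallelism: int, clock_hz: int = SYSTEM_CLOCK_HZ) -> int:
    """Maximum pixels per second a link can carry at the given parallelism."""
    return parallelism * clock_hz


def required_pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels per second delivered by the camera."""
    return width * height * fps


if __name__ == "__main__":
    need = required_pixel_rate(2000, 2000, 239)
    print("required pixel rate:", need)
    print("parallelism 8 is enough:", need <= max_pixel_rate(8))
    print("parallelism 16 is enough:", need <= max_pixel_rate(16))
    print("uses full RAM width:", link_width_bits(4, 32) == RAM_DATA_WIDTH_BITS)
```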
From my point of view there is no issue within the VA design.
That is OK, but no solution or answer to your question.
There are two other details we have to look at now:
I assume and hope that the DMA performance of the ironman grabber in your system is not limited, but please double-check that.
Limited PCIe performance could be the cause, because it would propagate stops into the design's data flow.
The ironman provides PCIe x8 Gen2 with 4 GB/s in theory and 3.6 GB/s in practice, but it is possible that the mainboard's PCIe slot only supports Gen1 and/or fewer lanes than x8.
One PCIe lane at Gen1 delivers 250 MB/s in theory and close to 200 MB/s in practice.
One PCIe lane at Gen2 delivers 500 MB/s in theory and close to 400 MB/s in practice.
Your design is correctly configured for PCIe x8 Gen2, as shown in the Applet Properties operator.
You reported:
2000 * 2000 * (8 bit) * (96 Hz) = 384 MB / s
Adding 1 of every 50 images (2%) as the output bandwidth of the second DMA:
(2000 * 2000 * (8 bit) * (96 Hz)) * (1 + (1 / 50)) = 391.68 MB / s
That is pretty close to 400 MB/s, being an indicator for 1 PCIe lane at Gen2.
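The same estimate as a Python sketch: it computes the observed DMA load and divides it by the practical per-lane figures quoted in this post (close to 200 MB/s for Gen1 and close to 400 MB/s for Gen2). The function names and the dictionary are made up for this illustration, and the result is only a rough indicator, not a measurement.

```python
# Practical per-lane throughput in MB/s as quoted in this post (approximate).
PRACTICAL_MB_PER_LANE = {"Gen1": 200.0, "Gen2": 400.0}


def image_bandwidth_mb_s(width, height, bytes_per_pixel, fps):
    """Image data rate in MB/s (1 MB = 10**6 bytes)."""
    return width * height * bytes_per_pixel * fps / 1e6


def estimated_lanes(observed_mb_s, generation):
    """How many lanes of the given PCIe generation the observed rate fills."""
    return observed_mb_s / PRACTICAL_MB_PER_LANE[generation]


if __name__ == "__main__":
    main_dma = image_bandwidth_mb_s(2000, 2000, 1, 96)
    total = main_dma * (1 + 1 / 50)       # second DMA outputs 1 of 50 images
    print("observed DMA load in MB/s:", total)
    print("Gen2 lanes filled:", estimated_lanes(total, "Gen2"))
    print("Gen1 lanes filled:", estimated_lanes(total, "Gen1"))
```

A result very close to a whole number of lanes, as for Gen2 here, is a hint that the slot width is the limit.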
In microDiagnostics you can double-check the possible bandwidth of the grabber:
Output will look like:
The test is carried out for the applet that is currently flashed on the selected frame grabber.
Since VA designs are not fully supported by this test, please flash the related acquisition applet.
In your case that is: Acq_QuadCXP6x1AreaGray8.dll
In the performance diagram for the Acq_* applet at FG_GRAY = 8 bit per pixel, you need to see a peak at 2048 on the X axis that is at or above 1000 MB/s to reach your target bandwidth.
What to do if DMA performance is fine?
The camera may be a second external issue.
In practice you are using CXP and the used operator supports 4 CXP6 links.
That means up to 6.25 Gbit/s per link.
Due to 8b/10b encoding this represents per link:
(8 / 10) * (6.25 (Gbit / s)) = 625 MB / s
A quad link would be:
((4 * 8) / 10) * (6.25 (Gbit / s)) = 2500 MB / s
Since the interface is protocol based you will not get the full bandwidth for image data.
But at least it is the possible peak performance for CXP6.
There are more options being supported by the VA design:
CXP-1 | 1.25 Gbit/s | up to 212 m |
CXP-2 | 2.5 Gbit/s | up to 185 m |
CXP-3 | 3.125 Gbit/s | up to 169 m |
CXP-5 | 5 Gbit/s | up to 102 m |
CXP-6 | 6.25 Gbit/s | up to 60 m |
Some math again:
((4 * 8) / 10) * (1.25 (Gbit / s)) = 500 MB / s, where this peak bandwidth is very likely transporting 400 MB/s of image data.
Same for this configuration:
((1 * 8) / 10) * (5 (Gbit / s)) = 500 MB / s
To wrap up this already pretty long story:
Check the CXP configuration using the hardware dialog in GenICam Explorer or microDisplayX.
The link topology will tell you what you are using precisely for CXP.
Your VA design is correct, so you should be able to see the expected bandwidth.
From my perspective, the PCIe connection or the CXP topology may be causing the limit.
To me it is very likely that the PCIe Gen2 slot provides only a single lane.
Please let me know what the DMA performance test (microDiagnostics) shows...
If you need help interpreting your tests and relating them to the observed performance,
I and all the other people in the VA forum community will help.
Thanks and best regards,
Dear Fabio,
The operator is new and, being unknown to VA 3.0.1, it gets replaced by a dummy:
ExtractResult Box I get GetResult as Dummy
Solution:
Please download the most current VA version:
Best regards,
Dear Fabio,
Here are some explanations of the VA design related to auto exposure.
Download: ExposureControl_StateMachine_B.Rudde.va
A later post will give more explanation.
Full overview:
The points of interest are the 3 hierarchical boxes: RxExposure, BrightnessMismatch, NewExposure
Each listed with comments below:
BrightnessMismatch: for auto exposure to work, we need to know how "bright" the image is.
Here you can set the target value for the histogram mean in TargetHistMeanValue, preset = 127.
The function extracts the mean value from the full histogram:
NewExposure: based on the brightness mismatch to the target value and the last used exposure value, the new one is calculated by a simple closed-loop control with a P element only. More details are in the comments within the VA design:
Basics of loop:
Regulator details:
RxExposure: simply receives the new exposure value and forwards it to pulse generation:
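To give a feeling for how a closed loop with a P element behaves, here is a minimal Python simulation. It is not a translation of the VA design: the relative update rule, the gain of 0.5, the exposure limits and the linear camera model are all assumptions chosen for this illustration. Only the target mean of 127 is taken from the description above.

```python
TARGET_MEAN = 127.0   # preset of TargetHistMeanValue in the design


def camera_mean(exposure, sensitivity=0.05):
    """Hypothetical camera: mean gray value grows linearly, saturates at 255."""
    return min(255.0, sensitivity * exposure)


def next_exposure(exposure, mean, gain=0.5, lo=1.0, hi=20000.0):
    """One P-regulator step: scale exposure by the relative brightness error."""
    error = TARGET_MEAN - mean                    # brightness mismatch
    new_exposure = exposure * (1.0 + gain * error / TARGET_MEAN)
    return max(lo, min(hi, new_exposure))         # keep exposure in range


def run_loop(exposure, steps=30):
    """Iterate the loop and return the (exposure, mean) history."""
    history = []
    for _ in range(steps):
        mean = camera_mean(exposure)
        history.append((exposure, mean))
        exposure = next_exposure(exposure, mean)
    return history


if __name__ == "__main__":
    for start in (500.0, 10000.0):                # too dark / too bright start
        exposure, mean = run_loop(start)[-1]
        print(f"start={start:.0f} final exposure={exposure:.1f} mean={mean:.2f}")
```

With a smaller gain the loop converges more slowly but is less sensitive to noise in the measured mean; a larger gain reacts faster but can overshoot.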
I hope the VA design above shows what you are asking for. If you have questions, do not hesitate to contact me; if there are none, feel free to press the like button.
Best regards,
Dear Fabio,
Here you can find a complete "auto exposure control" based on a closed-loop P-regulator.
Download: ExposureControl_StateMachine_B.Rudde.va
In your initial post you asked for fixed control steps; you can treat the internal value P as a fixed value,
but please see the many comments within the design for more details.
Your design includes a loop, but it can still be simulated in VA, except for the real camera feedback.
You can use any kind of sequence to see how the P-regulator behaves.
A later post will give more explanation.
The question about the state machine has a shorter answer:
Rx- and TxSignalLink make it possible to build a signal based state machine,
or if you require values and simulation for it, please use:
RxImageLink and TxImageLink.
Best regards,
Dear Fabio,
Mode: CL BASE, 2 Taps, Monochrome, 8 bit Depth
Thank you for the quick answer; will use this as default configuration.
Best regards