Thank you for providing those details.
I am currently traveling, but I will have a look into it as soon as possible.
Best regards,
Hi, could you please have a look?
Thanks
Bingnan
Dear Bingnan Liu,
If you could post your VA design here, I could have a look at it and explain what is going "wrong" or what needs to be changed. The connected camera configuration would also be a welcome detail, since it may affect the bandwidth considerations.
Best regards,
Hi Bjoern,
Thank you for your reply. Attached are my design and the test images.
The camera I used is a CXP-12 one (Optronis Cyclone-1HS-3500), but the frame grabber side is CXP-6 (mE5 VCX QP).
Best
Bingnan
For the targeted bandwidth: I would like to cut the ROI to 1280x300 (0.384 megapixels), and the processing should ideally reach 1000 frames per second. So the required bandwidth would be 384 MPixel/s.
As for the current frame rate: theoretically, with parallelism 1 it should be 125 MHz, i.e. 113 fps; but when I adjusted the ROI in GenICam to 1280x300, even though I set 113 fps, an overflow still occurred (detected by the Overflow operator, as explained in the forum post). So I guess there is something wrong with my design.
Could anyone help with my design?
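The bandwidth arithmetic above can be sanity-checked with a few lines of plain Python. This is only a sketch: it assumes one 8-bit pixel per parallel lane per clock at a 125 MHz design clock and ignores line overhead; the helper names are made up for illustration.

```python
# Back-of-the-envelope bandwidth check for the ROI discussed above.
# Assumption: one pixel per clock per parallel lane at a 125 MHz design clock.

CLOCK_HZ = 125_000_000  # 125 MHz design clock (assumed)

def required_parallelism(width, height, fps, clock_hz=CLOCK_HZ):
    """Smallest parallelism whose throughput covers width*height*fps pixels/s."""
    pixels_per_second = width * height * fps
    return -(-pixels_per_second // clock_hz)  # ceiling division

def max_fps(width, height, parallelism, clock_hz=CLOCK_HZ):
    """Upper bound on frame rate for a given parallelism (no overhead)."""
    return (clock_hz * parallelism) / (width * height)

# A 1280x300 ROI at 1000 fps needs 384 MPixel/s, i.e. parallelism >= 4.
print(required_parallelism(1280, 300, 1000))  # -> 4
# At parallelism 1, the same ROI tops out at roughly 325 fps.
print(round(max_fps(1280, 300, 1)))           # -> 326
```

So even for the cut-down ROI, a parallelism of 1 cannot deliver 1000 fps; at least 4 pixels per clock would be needed.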
Hi, thank you for your reply. I tried to increase the parallelism, but it causes the computational resources to overflow.
Hello Bingnan,
I am currently quite busy, so answers may take a while or won't cover the full detail.
To your question: "1. The trigger example seems to be receiving an external trigger signal, then triggering the camera. How can I make the output-triggering signal controllable?"
If you open the "Hierarchical Box" you will find the following layout:
[screenshot]
In the "Select" operator you can choose the trigger mode; in your case, the setting you are looking for is "1", so that the signal from "Generator" gets used as the trigger.
If you go deeper into the "Generator" you can see the following:
[screenshot]
The "Period" operator can then adjust your signal. Please read the text boxes for further instructions on how to adjust the trigger signal.
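As a rough illustration of what the Period setting amounts to (this is not the actual operator API; `period_ticks` is a made-up helper, and the 125 MHz design clock is an assumption), the period for a desired trigger rate is simply the clock frequency divided by the trigger frequency:

```python
# Hypothetical helper: convert a desired trigger frequency into a tick count
# for a "Generator"/"Period" style operator. The 125 MHz design clock is an
# assumption; check the operator documentation for the real parameter units.

DESIGN_CLOCK_HZ = 125_000_000

def period_ticks(trigger_hz, clock_hz=DESIGN_CLOCK_HZ):
    """Clock ticks per trigger period for a given trigger frequency."""
    if trigger_hz <= 0:
        raise ValueError("trigger frequency must be positive")
    return round(clock_hz / trigger_hz)

# A 1 kHz trigger at a 125 MHz clock means one pulse every 125000 ticks.
print(period_ticks(1000))  # -> 125000
```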
To your second question:
"2. About the "Overflow": does that mean I can use a larger template? Or is it limited by the resources, i.e. does "Overflow" only control the transferred bits and have no effect on this?"
The Overflow operator CUTS off an image when a memory element is not capable of holding it.
As you can see in your screenshot from MicroDisplay, you have a "1" on Overflow. This means that the memory is not capable of saving the required data at the chosen bandwidth.
The following screenshots show the areas where you have bottlenecks in your design:
[screenshot]
This will result in your processing pipeline only being able to handle 125 MPixel/s, since the SYNC operator creates dummy pixels.
Also, after this HBox you have a ParallelDN that will limit your bandwidth.
This is all the help I can give you at the moment; please try to remove those bottlenecks, with operators such as ParallelUp before the SYNC operator, etc.
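The bottleneck argument can be modeled in a few lines of Python. This is a toy model, not VisualApplets code: it treats each stage's throughput as clock rate times parallelism, and the stage values are illustrative, not the actual operators in the design.

```python
# Toy model: a pipeline's pixel throughput is capped by its slowest stage
# (clock rate x parallelism). Stage parallelisms below are illustrative.

CLOCK_HZ = 125_000_000  # assumed 125 MHz design clock

def pipeline_throughput(parallelisms, clock_hz=CLOCK_HZ):
    """Pixels/s the whole chain can sustain: the minimum stage throughput."""
    return min(p * clock_hz for p in parallelisms)

# Camera side at parallelism 8, then a ParallelDN to 1 before the SYNC:
print(pipeline_throughput([8, 1, 8]) / 1e6)  # -> 125.0 (MPixel/s cap)

# Raising the narrow stage (e.g. with ParallelUp) removes the bottleneck:
print(pipeline_throughput([8, 8, 8]) / 1e6)  # -> 1000.0
```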
Best Regards
Kevin
Hi
I tried your design, and no matter how I adjusted the frame rate and ROI, the "black zone" always appeared, as the first screenshot shows. I checked the transfer speed: [screenshot]
Do you have any idea why this is happening?
As for the output signal, my last design seems to be able to adjust it in the software window.
Best
Bingnan
Hi Kevin,
1. The trigger example seems to be receiving an external trigger signal, then triggering the camera. How can I make the output-triggering signal controllable?
Attached is my revised design - I added a new box "SignalProcessing" before the "GPO" to let the user control the output signal's form in MicroDisplay. Am I doing this right?
2. About the "Overflow": does that mean I can use a larger template? Or is it limited by the resources, i.e. does "Overflow" only control the transferred bits and have no effect on this?
Thank you in advance.
Best
Bingnan
Hello Bingnan,
I looked at your design, and the first thing I noticed was the ParallelDN right after the ImageBuffer at the beginning. I do not know how fast your camera is set; however, if you go down to a parallelism of 1, only 1 pixel is transferred per clock at a rate of 125 MHz.
If your camera has a resolution of 1280x860, you are transferring 1.101 megapixels per frame. Thus you will only be able to reach a frame rate of 113 FPS.
I updated your design and removed the ParallelDN operator. Furthermore, I added an Overflow operator. This operator has a property called "OverflowOccured". You can look at this parameter in MicroDisplay(X). If it is "1", an overflow occurred, meaning that your buffers are not large enough or that some other bottleneck exists in your design.
Speaking of bottlenecks, one that is often overlooked is the DMA transfer rate, which is 1800 MB/s for the mE5 VCX QP. So the maximum you can achieve with your resolution is 1635 Hz (1800 MB/s / (1280*860 bytes per frame)), assuming you are using the full DMA bandwidth. The PCIe interface can transport up to 128 bits in parallel; thus, with an 8-bit image, you can use a parallelism of 16. In your design it was still 1, due to the ParallelDN. This means that only 102 FPS could be achieved.
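Those two figures can be reproduced with a quick sketch in plain Python. The 1800 MB/s DMA rate and the 128-bit transfer width are the values quoted above; one byte per pixel is assumed.

```python
# Reproducing the DMA-limit numbers from the post above.

DMA_BYTES_PER_S = 1_800_000_000   # 1800 MB/s DMA rate of the mE5 VCX QP
TRANSFER_BITS = 128               # bits transported in parallel

def dma_limited_fps(width, height, bytes_per_pixel=1):
    """Maximum frame rate the DMA can sustain for a given frame size."""
    return DMA_BYTES_PER_S / (width * height * bytes_per_pixel)

def max_parallelism(bits_per_pixel):
    """How many pixels fit into one 128-bit transfer word."""
    return TRANSFER_BITS // bits_per_pixel

print(int(dma_limited_fps(1280, 860)))  # -> 1635 (the 1635 Hz ceiling)
print(max_parallelism(8))               # -> 16 (parallelism for 8-bit pixels)
```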
Regarding your question to the triggers:
The standard acquisition applet has these functionalities built in; when you are using your own custom applet, you are responsible for these implementations. You can find an example for the mE5 VCX QP in your VisualApplets installation folder:
<VASINSTALLDIR>\Examples\Processing\Trigger\me5-MA-VCX-QP\Area
You can copy the trigger functionality and the camera from there.
Best Regards and a nice weekend
Kevin
Thank you, Kevin, for your quick reply. I will test it on the board ASAP.
Dears,
I want to ask some questions that came up while implementing the program on the FPGA:
My basic idea is to let the program do the "detection, then trigger", and at the same time let the operator monitor what is happening. The frame grabber model is the mE5 VCX QP. I would appreciate it if you could check my designs attached.
Best wishes
Bingnan
Hi
is there any idea for designing a delayed signal that is triggered by an event?
I am looking into these operators:
And I am planning to add a design between the Branch and the GPO: [screenshot]
Hi,
may I ask if there is a way (ideally one that doesn't cost many computational resources) to improve the brightness and contrast of an image?
My images are taken with a low exposure time (labeled as ET in the attached images) and are very dark. I am aiming to do a lighting-adjustment preprocessing step and then extract the morphological features.
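For reference, this kind of adjustment is often done with a lookup table, which is cheap in hardware. Below is a software sketch (NumPy, not VisualApplets) of a 256-entry gain-plus-gamma LUT; the gain and gamma values are arbitrary examples, not tuned settings.

```python
# Software sketch of a LUT-based lighting adjustment for dark 8-bit images.
# Gain 4.0 and gamma 0.5 are example values only.

import numpy as np

def build_lut(gain=4.0, gamma=0.5):
    """256-entry 8-bit LUT: scale, gamma-correct, clamp to [0, 255]."""
    x = np.arange(256, dtype=np.float64) / 255.0
    y = np.clip((x * gain) ** gamma, 0.0, 1.0)
    return (y * 255.0 + 0.5).astype(np.uint8)

def apply_lut(image, lut):
    """Per-pixel table lookup; in hardware this is one small memory, no ALUs."""
    return lut[image]

lut = build_lut(gain=4.0, gamma=0.5)
dark = np.array([[10, 20], [40, 80]], dtype=np.uint8)
print(apply_lut(dark, lut))
```

On the frame grabber, the same table could be loaded into a LUT operator, so the brightening costs almost no logic resources before the feature extraction.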
Thank you in advance.
Best
Dear Bingnan,
thank you for the report of this bug.
I will inform our VisualApplets team about this error.
Which frame grabber type do you use? Is this error dependent on the frame grabber platform?
Hi Carmen,
thank you for your message. My frame grabber is the mE5-VCX-QP. I just solved this problem by reinstalling the software with the option "Hardware - all platforms" selected. Some operators do not seem to be as platform-independent as I imagined.
Hi
I found that after I updated VA to the latest version, 3.3.2, the operators' connections were lost. The DMA operator from the library also has no ports; may I ask how I should solve this?
Best
BL
Dear Bingnan,
please find attached the updated design. In this design the small circle is detected.
The following improvements are implemented:
1. The number of fractional bits for the division by the number "K" is increased to 10 fractional bits.
2. In module "SigmaI", a module to avoid division by zero is added.
The changes are marked in yellow in the following screenshot.
Concerning comment #15: the size of the template is only limited by the FPGA resources. You may change the upper limit of the IntParamTranslator operator "TranslateK" if the FPGA resources allow a bigger template.
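As an aside on point 1, a small Python sketch shows why the fractional bits matter for the division by "K". The value K = 484 below is only an example (the pixel count of a 22x22 template); the real K in the design may differ.

```python
# Why more fractional bits help: fixed-point arithmetic keeps only a limited
# number of bits after the binary point, so small values like 1/K can be
# rounded away entirely. K = 484 (a 22x22 template) is an example value.

def fixed_point(value, frac_bits):
    """Quantize a real value to the nearest multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale

K = 484
exact = 1.0 / K
print(fixed_point(exact, 2))   # -> 0.0: with 2 fractional bits, 1/K vanishes
print(fixed_point(exact, 10))  # -> 0.001953125 (2/1024), a usable estimate
```

With only 2 fractional bits, any 1/K smaller than 1/8 rounds to zero, so the normalization step would lose the signal entirely; 10 bits keep enough resolution.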
Dear Carmen,
May I ask what the purpose of increasing the fractional bits from 2 to 10 was? Was it to reserve more fractional bits to increase the accuracy?
Best
Bingnan
Hi Bingnan,
In the netlist generation step you can see that the hardware resources on the FPGA are exhausted.
[screenshot]
None of the resources should be over 100%.
Below that, you also find a list of operators and elements that require too many resources.
Additionally, you can check in the top bar:
[screenshot]
There you will find a tabular view of the required resources, which can also be sorted.
You can also right-click on each operator or H-Box and click "FPGA Resources" to see how many resources that operator or H-Box needs. If the entry is grayed out, you need to run DRC2 first.
If we take a look at the first resource, we can see that a lot of it is spent in NCC/Sigma_R:
[screenshot]
The kernel size is 22x22, which is quite large. Would it be possible to downsample the input images, and therefore downsample the mask to 11x11?
Furthermore, the input parallelism is 8. With this kernel size, 22x22x8 pixels are computed in parallel. If you put a ParallelDN operator before the NCC H-Box, you can save a lot of resources.
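To put rough numbers on that (a proportionality sketch in Python, not an exact LUT count; sliding-window logic scales roughly with kernel area times parallelism):

```python
# Rough proportionality for a sliding-window operator's resource cost:
# pixels processed per clock = kernel_w * kernel_h * parallelism.

def relative_window_cost(kernel_w, kernel_h, parallelism):
    """Pixels a window operator must touch per clock (relative cost)."""
    return kernel_w * kernel_h * parallelism

full = relative_window_cost(22, 22, 8)    # original design
small = relative_window_cost(11, 11, 1)   # 11x11 mask, ParallelDN to 1
print(full, small, full // small)         # -> 3872 121 32
```

So halving the kernel in both dimensions and dropping to parallelism 1 shrinks the window cost by a factor of about 32, at the price of bandwidth.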
Lastly, there are some other things that could be done to distribute the resource usage: in some arithmetic operators you can choose the ImplementationType.
[screenshot]
Here you can see that there is still some EmbeddedALU left, so it would be helpful not to use LUTs here.
By applying all the proposed ideas, I was able to reduce the FPGA resource usage from the previous 766% to 105% on the LUTs, and below 100% on everything else.
[screenshot]
The applet is attached; however, it won't produce the desired results, since the kernels are 11x11 and I did not adapt any of them.
You may take this as a reference to further improve your design.
Best Regards
Kevin
Hi
Is there a way to use a big template without downsampling the mask? I can lower the speed as a trade-off.
Best
Bingnan
Dears,
when I run my design on the frame grabber, MicroDisplay shows this: [screenshot]
I removed the "NCC detection" part so that it only grabs images; it ran for a few seconds and then reported "Timeout Error code: -2120".
I also tried to build the .hap from /Example/VCX-QP/SingleCXP6x4AreaGray8; the result is this: [screenshot]
The hardware I am using is an Optronis Cyclone-1HS-3500 camera and an mE5 VCX-QP frame grabber.
Attached are my programs. Could you help me check them out?
Thanks
Bingnan
Hi Kevin,
Thanks so much. Just one question: why can some operators specify the resource type (ImplementationType), while others cannot?
And when ImplementationType = AUTO, what is the principle VA uses to allocate resources?
Best
Bingnan
Hi Bingnan,
No, the parallelism usually applies in the "width" direction. This means that with a parallelism of 8, 8 pixels are transported in parallel. These pixels are usually the ones next to each other within one line.
If you set the Height in the module properties of the "SplitImage" operator to 1, it will divide an image with N lines into N images.
The parallelism will stay; so if you have a parallelism of 8, you will need to split it with a SplitParallel of 8 and then combine the logic with an OR operator at the end.
This is done similarly in the example design that I attached before.
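In software terms, the split-and-OR scheme looks roughly like this (a plain-Python sketch, not applet code; parallelism 8 and a threshold comparison stand in for whatever per-lane logic the design actually uses):

```python
# Toy model of SplitParallel + per-lane logic + OR:
# split a parallel-8 line into 8 single-pixel streams, threshold each,
# and OR the lane results into one trigger decision.

def line_trigger(line, threshold):
    """True if any pixel in the line exceeds the threshold."""
    lanes = [line[i::8] for i in range(8)]         # SplitParallel into 8 lanes
    lane_hits = [any(p > threshold for p in lane)  # per-lane comparison
                 for lane in lanes]
    return any(lane_hits)                          # OR operator at the end

dark_line = [10, 12, 9, 11] * 8
bright_line = dark_line[:16] + [200] + dark_line[17:]
print(line_trigger(dark_line, 128))    # -> False
print(line_trigger(bright_line, 128))  # -> True
```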
Best Regards,
Kevin
Dear Kevin,
When I build it, the DRC reports no errors, but the build into a .hap fails. Attached are my design and the log of the build process. Could you please check what the problem is?
Thanks,
Bingnan
Hi
Attached is my design; during the design, I had some questions:
1. The operator IS_GreaterThan in the box "Trigger" has the properties Bit Width and Parallelism 8, so I had to split the parallelism into 8. Is that because 8 bits is the size of the integer data?
2. After Split_Img, should I add a ParallelDn?
3. I did not add a SelectROI operator, because I want to adjust the ROI manually in real time while monitoring the camera. Would that cause any problems, such as with resource allocation?
I would be grateful if you could give any advice on my design. Thank you in advance.
Best
Bingnan
Hi Bingnan,
I think the operator you are searching for is called "SplitImage". In the module properties you can set the Height to "1"; the image will then get split into its individual lines.
I will attach an example design. Note that you need to adapt it to your input image stream and to your parallelism (it currently works with parallelism 2).
Best Regards
Kevin
Note: This was not tested on any hardware.
Hi
In this case, if for example the height of the image is N, do I then need to make N branches (parallelism)?
Is there a way to do it like "for" logic: for (i = 0; i < N; i++)?
Best,
B
Hi Kevin,
thank you for your information. Attached is the output picture, which I want to convert into a trigger signal. My idea is to use a "line-scan" approach: the white dots (detected objects) move from left to right, and when a white dot passes, I output a signal using a thresholding operator (for example GreaterThan). How should I do this?
Thank you in advance.
Best
Bingnan
Display MoreHi Bingnan,
The Y-Coordinate operator will give you the Y coordinate of each pixel; as you correctly mentioned, it only counts in the Y direction. There is also an X-Coordinate operator, which counts in the X direction.
Please notice that in your current design you are overwriting the image data that comes from the camera, because the Y-Coordinate operator only gives you the information about the pixel index. If you want to use both, I suggest using a Branch operator before it.
Regarding your question about the parallelism: if you check the documentation of PixelToSignal, you will see the following table:
[screenshot] Here you can see that the input link I requires a parallelism of 1.
Here are some pointers that may help you:
ParallelDN -> set it to 1.
SplitParallel -> here you could split the parallel pixels into links that each have a parallelism of 1 and then process them further.
It all depends on what your use case exactly is.
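As a plain-Python illustration of the two options (a toy model of the data movement only, not of the operators' real implementation; `words` below stands for the parallel-8 pixel words of one line):

```python
# Two ways to get a parallelism-1 view of parallel-8 data:
# ParallelDN serializes the 8 parallel pixels into one stream;
# SplitParallel fans them out into 8 parallelism-1 links.

def parallel_dn(words):
    """Serialize parallel words (lists of 8 pixels) into one pixel stream."""
    return [p for word in words for p in word]

def split_parallel(words):
    """Fan out: lane i receives pixel i of every word."""
    return [[word[i] for word in words] for i in range(8)]

words = [[0, 0, 0, 200, 0, 0, 0, 0], [0] * 8]
stream = parallel_dn(words)
print(len(stream))                         # -> 16 pixels, one per clock
lanes = split_parallel(words)
print(any(200 in lane for lane in lanes))  # -> True (lane 3 carries the hit)
```

Either way, PixelToSignal then sees one pixel per clock; ParallelDN trades bandwidth for simplicity, while SplitParallel keeps the bandwidth but needs the per-lane logic plus an OR at the end.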
Best Regards
Kevin