Posts by B.Ru

    Hello everyone,

    What are 'ticks' in relation to the system clock and the signal library?


    The word 'ticks' is mentioned quite often, especially in the signal library. Here is an explanation:


    The system clock is used by the FPGA within each VA platform.
    All current microEnable5 (mE5) boards use 125 MHz, while the older mE4 runs at 62.5 MHz.

    A single tick is a single clock pulse, and its duration is the period of the system clock in use:

    Example for single tick duration:

    system clock = S = 125 MHz

    tick duration = T = 1 / S


    1 / (125 MHz) = 8 nanoseconds


    1 second in ticks:


    S = 125 MHz


    1 second is represented by 125,000,000 = 125 million ticks


    8 ns * 125,000,000 = 1 second

    How to represent 1 ms in ticks?

    duration = D = 1 ms (example value)

    system clock = S = 125 MHz

    tick duration = T = 1 / S = 1 / 125 MHz = 8 ns (see above)


    D / T = amount of ticks representing D:


    (1 millisecond) / (8 nanoseconds) = 125,000 ticks
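
    The conversion above can be sketched in a few lines of Python. This is only an illustration of the arithmetic D / T = D * S; the function names are made up for this example and are not part of any VisualApplets API.

```python
from fractions import Fraction

# System clock from the post: 125 MHz for mE5 (the mE4 would be 62_500_000).
SYSTEM_CLOCK_HZ = 125_000_000


def duration_to_ticks(duration_s, clock_hz=SYSTEM_CLOCK_HZ):
    """Return the number of ticks representing duration_s (D / T = D * S)."""
    # Going through str() keeps decimal inputs such as 0.001 exact.
    return round(Fraction(str(duration_s)) * clock_hz)


def ticks_to_duration_ns(ticks, clock_hz=SYSTEM_CLOCK_HZ):
    """Return the duration of a tick count in nanoseconds (ticks * T)."""
    return float(Fraction(ticks) * 1_000_000_000 / clock_hz)


if __name__ == "__main__":
    print("1 s   =", duration_to_ticks(1), "ticks")
    print("1 ms  =", duration_to_ticks(0.001), "ticks")
    print("1 tick =", ticks_to_duration_ns(1), "ns")
```

    The second function is the reverse direction, which is handy when a measured tick count needs to be read as a time.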


    If you have questions about this, do not hesitate to post them below.
    The community will answer as soon as possible...


    Best regards,

    Dear Community,


    The design below MeasureDurationLatency_v001.va will give you exact timing and latency of the VA design and its acquisition:

    pasted-from-clipboard.png


    An interesting part can be found within H-Box GetDataImmediately:


    pasted-from-clipboard.png


    This function extracts the relevant image data from the CXP data stream and gives minimum latency for CXP related data acquisition. This is only required for CoaXPress interfaces.


    For all measurements in here the operator SignalToDelay is used:


    pasted-from-clipboard.png


    The screenshot above shows how to measure the time between End of Acquisition and End of Transfer within H-Box Added_Latency.


    How SignalToDelay works:

    SignalToDelay1.png


    Best regards,

    Hello Theo,


    That is an interesting point:

    You know, I am thinking about documentation for people who are not developing new VAs but still need to know what is possible, because they want to use a VA in their application and are responsible for the specification.

    Documentation that describes in detail which applications or processing options are possible or beneficial using VA, or even a combination of VA + CPU + GPU ideas. I will forward this idea to our product management and marketing department.


    Here you can find a simple PDF showing some slides VAoptions_SiliconSoftware_BjoernRudde_V1.01_EN.pdf on VA options in general.

    It is a sub-set of slides I have shown on a series of distributors technology forums.

    These slides may still serve as a good introduction while this kind of documentation is missing.


    During VACC and VADC trainings some presentation material is given to all participants.


    Since your question started with Parameter Libraries, you can have a detailed look into the options here, but it is focused on VA developers.


    Best regards,

    Dear Community,


    A detailed approach into bandwidth considerations including additional calculations can be found here:

    VQ8-CXP6D-DepthFromFocus


    The linked thread is mainly related to bandwidth details of a CXP system.


    In case you are using a CXP camera the following formulas may be of interest.


    That means up to 12.5 or 6.25 Gbit/s per link depending on the used grabber.


    Due to 8b/10b encoding this represents per link at CXP6:

    (8 / 10) * (6.25 (Gbit / s)) = 625 MB / s


    quad link will end up at:

    ((4 * 8) / 10) * (6.25 (Gbit / s)) = 2500 MB / s


    Since the interface is protocol based you will not get the full bandwidth for image data.

    But at least it is the possible peak performance of the CXP link itself.

    The possible camera sensor data output will always be below these limits.


    There are more options being supported by the VA design, but depending on the used platform:

    CXP-1 1.25 Gbit/s up to 212 m
    CXP-2 2.5 Gbit/s up to 185 m
    CXP-3 3.125 Gbit/s up to 169 m
    CXP-5 5 Gbit/s up to 102 m
    CXP-6 6.25 Gbit/s up to 60 m
    CXP-10 10 Gbit/s up to 40 m
    CXP-12 12.5 Gbit/s up to 30 m

    Source of table: Wikipedia


    Max link speed for mE5-MA-VCX-QP and mE5-VQ8-CXP6D is CXP6.

    Status: 2nd of July 2020, current overview


    Some math again for CXP1 speed with 4 links:

    ((4 * 8) / 10) * (1.25 (Gbit / s)) = 500 MB / s, where this peak bandwidth is very likely transporting 400 MB/s of image data.

    Similar for CXP5 configuration with one link:

    ((1 * 8) / 10) * (5 (Gbit / s)) = 500 MB / s
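
    The same calculation for any speed grade and link count can be sketched as follows. The line rates come from the table above; applying the 8b/10b factor to every entry is an assumption of this sketch, and the function name is made up for illustration. The result is the peak link bandwidth, not the usable image bandwidth.

```python
# Line rate per link in Gbit/s, copied from the table in the post.
CXP_LINE_RATE_GBIT = {
    "CXP-1": 1.25,
    "CXP-2": 2.5,
    "CXP-3": 3.125,
    "CXP-5": 5.0,
    "CXP-6": 6.25,
    "CXP-10": 10.0,
    "CXP-12": 12.5,
}


def cxp_peak_mb_per_s(speed, links=1):
    """Peak bandwidth in MB/s after 8b/10b encoding (8 data bits per 10 line bits)."""
    line_rate_bit = CXP_LINE_RATE_GBIT[speed] * 1e9
    data_bit = line_rate_bit * 8 / 10 * links
    return data_bit / 8 / 1e6  # bit/s -> byte/s -> MB/s


if __name__ == "__main__":
    for speed in CXP_LINE_RATE_GBIT:
        for links in (1, 2, 4):
            print(f"{speed:>6} x{links}: {cxp_peak_mb_per_s(speed, links):7.1f} MB/s peak")
```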


    Always check the CXP configuration using the hardware dialog in GenICam Explorer or microDisplayX.

    The link topology will tell you what you are using precisely for CXP.

    This links to the documentation of the link topology dialog; the screenshots there are for GEV, but the same steps apply to CXP.


    Best regards

    Dear IhShin,


    Your question asks for a bandwidth of:

    2000 * 2000 * (8 bit) * (239 Hz) = 956 MB / s

    coming from the camera, while reporting a bandwidth limit at:

    2000 * 2000 * (8 bit) * (96 Hz) = 384 MB / s
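
    As a small sketch of the two calculations above (the function name is chosen for this example only), the image bandwidth follows from width, height, bit depth and frame rate:

```python
def image_bandwidth_mb_per_s(width, height, bits_per_pixel, frame_rate_hz):
    """Return the image data rate in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_frame = width * height * bits_per_pixel / 8
    return bytes_per_frame * frame_rate_hz / 1e6


if __name__ == "__main__":
    # Values from the post: 2000 x 2000 pixels, 8 bit per pixel.
    print("target  :", image_bandwidth_mb_per_s(2000, 2000, 8, 239), "MB/s")
    print("observed:", image_bandwidth_mb_per_s(2000, 2000, 8, 96), "MB/s")
```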


    While your initial buffer is using 2 RAMs for 8 bit at parallelism 32 (receiving up to 20), the buffer approach inside the EDoF is receiving 16 bit at parallelism 32.


    Looking at the RAM


    Hardware Configuration microEnable 5 ironman:

    Resource                                                   mE5VQ8-CXP6B/mE5VQ8-CXP6D
    Vision Processor                                           Xilinx Virtex6 XC6VLX240T FPGA
    LUT                                                        150720
    FlipFlop                                                   301440
    Block RAM                                                  832 x 18432 bit
    Embedded Arithmetic Logic Unit (DSP48)                     768
    RAM                                                        4 x 256 MiB DDR3
    Data Width per RAM                                         128 bit
    Bandwidth per RAM                                          4 GB/s
    Base Design Clock                                          125 MHz
    Host Interface                                             PCIe x8 Gen2
    Host Interface (PCIe x8 Gen2) Bandwidth (theor.)           4 GByte/s per direction on PCIe bus
    Host Interface (PCIe x8 Gen2) Bandwidth (typ./max.)        up to 3.6 GByte/s on PCIe bus

    Table 55. Hardware Configuration microEnable 5 ironman (Source)


    Let's look into the details:


    To get the maximum performance of each RAM module, we have to use the full data width:

    128 bit

    Using the full interface means bit depth * parallelism == 128 bit, which gives 4 GB/s of bandwidth.

    4 GB/s = (1 + 1) * 2 GB/s: one part for writing input and one for reading output.
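
    A minimal sketch of this RAM arithmetic, assuming that one RAM word of bit depth * parallelism is moved per cycle of the 125 MHz design clock (this assumption reproduces the 2 GB/s per direction quoted above, but it is not taken from the hardware documentation). The function names are illustrative only.

```python
RAM_DATA_WIDTH_BIT = 128        # data width per RAM from the hardware table
DESIGN_CLOCK_HZ = 125_000_000   # base design clock from the hardware table


def uses_full_ram_width(bit_depth, parallelism):
    """True if bit depth * parallelism fills the whole 128 bit RAM interface."""
    return bit_depth * parallelism == RAM_DATA_WIDTH_BIT


def ram_bandwidth_mb_per_s(bit_depth, parallelism, clock_hz=DESIGN_CLOCK_HZ):
    """Bandwidth per direction in MB/s, assuming one word per design clock cycle."""
    word_bits = bit_depth * parallelism
    if word_bits > RAM_DATA_WIDTH_BIT:
        raise ValueError("word is wider than the RAM interface")
    return word_bits * clock_hz / 8 / 1e6


if __name__ == "__main__":
    for bit_depth, parallelism in ((4, 32), (8, 8)):
        print(
            f"{bit_depth} bit x parallelism {parallelism}:",
            f"full width = {uses_full_ram_width(bit_depth, parallelism)},",
            f"{ram_bandwidth_mb_per_s(bit_depth, parallelism):.0f} MB/s per direction",
        )
```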


    Above we ended up at a bandwidth of

    956 MB / s

    which here equals 956 MPixel/s. Inside the EDoF each pixel has 16 bit: 2000 * 2000 * (16 bit) * (239 Hz) = 1912 MB / s

    Explanation: the 16 bit consist of 2 intermediate values per pixel...


    Two RAM blocks are used within the EDoF:

    pasted-from-clipboard.png


    Each RAM in here uses 4 bit at parallelism 32 = 4 bit * 32 = 128 bit
    Same choice for acquisition RAM modules.

    So RAM interfacing in EDoF is fine.


    Since your intended bandwidth (956 MPixel/s) is below parallelism * system clock = 8 * 125 MHz = 1 GPixel/s, the selected parallelism of 16 is a safe choice. No problem with this either.


    From my point of view there is no issue within the VA design.

    That is good, but it is not a solution or an answer to your question.


    There are two other details we have to look at now:

    DMA-performance and camera interface CXP.

    I guess and hope that the DMA performance of the ironman grabber in your system is not limited, but please double check that.

    A limited PCIe performance could be a reason for that, because it would propagate stops into the design's data flow.

    The ironman provides PCIe x8 Gen2 with 4 GB/s in theory and 3.6 GB/s in practice, but it is possible that the mainboard's PCIe slot only supports Gen1 and/or fewer than 8 lanes.

    One PCIe lane at Gen1 delivers about 250 MB/s in theory and close to 200 MB/s in practice.

    One PCIe lane at Gen2 delivers about 500 MB/s in theory and close to 400 MB/s in practice.
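
    To compare slot configurations, the theoretical numbers can be sketched as below. The per-lane line rates (2.5 GT/s for Gen1, 5 GT/s for Gen2, both with 8b/10b encoding) are general PCIe facts rather than values from this thread, and the function name is made up for this example. Practical throughput is lower, as noted above.

```python
# Line rate per lane in GT/s; Gen1 and Gen2 both use 8b/10b encoding.
PCIE_LINE_RATE_GT = {1: 2.5, 2: 5.0}


def pcie_theoretical_mb_per_s(gen, lanes):
    """Theoretical bandwidth per direction in MB/s for a Gen1/Gen2 link."""
    per_lane = PCIE_LINE_RATE_GT[gen] * 1e9 * 8 / 10 / 8 / 1e6
    return per_lane * lanes


if __name__ == "__main__":
    required = 956.0  # MB/s, target camera bandwidth from the post
    for gen in (1, 2):
        for lanes in (1, 4, 8):
            bandwidth = pcie_theoretical_mb_per_s(gen, lanes)
            verdict = "enough" if bandwidth > required else "too slow"
            print(f"Gen{gen} x{lanes}: {bandwidth:6.0f} MB/s theoretical -> {verdict}")
```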


    Your design is correctly configured for PCIe x8 Gen2:
    pasted-from-clipboard.png

    Shown in Applet Properties operator.


    You reported:

    2000 * 2000 * (8 bit) * (96 Hz) = 384 MB / s

    plus 1 of every 50 images (2%) as output bandwidth of the second DMA:


    (2000 * 2000 * (8 bit) * (96 Hz)) * (1 + (1 / 50)) = 391.68 MB / s
    That is pretty close to 400 MB/s, being an indicator for 1 PCIe lane at Gen2.
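
    The total DMA load including the second DMA can be sketched like this. The function and parameter names are illustrative; the target value for 239 Hz is simply the same formula applied to the intended frame rate and is not a figure reported in this thread.

```python
def dma_load_mb_per_s(width, height, bits_per_pixel, frame_rate_hz, second_dma_every_n=50):
    """Total DMA load in MB/s: every image on DMA 0 plus 1 of n images on DMA 1."""
    main = width * height * bits_per_pixel / 8 * frame_rate_hz / 1e6
    return main * (1 + 1 / second_dma_every_n)


if __name__ == "__main__":
    observed = dma_load_mb_per_s(2000, 2000, 8, 96)
    target = dma_load_mb_per_s(2000, 2000, 8, 239)
    print(f"observed total DMA load: {observed:.2f} MB/s")
    print(f"target total DMA load  : {target:.2f} MB/s")
```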

    In microDiagnostics you can double-check the possible bandwidth of the grabber:

    Output will look like:

    OhneDMATurb_650x315.png


    Test is carried out for the applet that is available (flashed) on the selected frame grabber.

    Since VA designs are not fully supported, please flash the related acquisition applet.

    In your case it will be: Acq_QuadCXP6x1AreaGray8.dll


    In the performance diagram for the Acq_* applet at FG_GRAY = 8 bit per pixel, the value at 2048 on the X axis needs to be at or above 1000 MB/s to reach your target bandwidth.


    What to do if DMA performance is fine?


    The camera may be a second external issue.

    In practice you are using CXP and the used operator supports 4 CXP6 links.


    That means up to 6.25 Gbit/s per link.

    Due to 8b/10b encoding this represents per link:

    (8 / 10) * (6.25 (Gbit / s)) = 625 MB / s

    A quad link would be:

    ((4 * 8) / 10) * (6.25 (Gbit / s)) = 2500 MB / s

    Since the interface is protocol based you will not get the full bandwidth for image data.

    But at least it is the possible peak performance for CXP6.


    There are more options being supported by the VA design:


    CXP-1 1.25 Gbit/s up to 212 m
    CXP-2 2.5 Gbit/s up to 185 m
    CXP-3 3.125 Gbit/s up to 169 m
    CXP-5 5 Gbit/s up to 102 m
    CXP-6 6.25 Gbit/s up to 60 m
    CXP-10 10 Gbit/s up to 40 m
    CXP-12 12.5 Gbit/s up to 30 m

    Source of table: Wikipedia


    Some math again:

    ((4 * 8) / 10) * (1.25 (Gbit / s)) = 500 MB / s, where this peak bandwidth is very likely transporting 400 MB/s of image data.

    Same for this configuration:

    ((1 * 8) / 10) * (5 (Gbit / s)) = 500 MB / s


    To end this already pretty long story :

    Check the CXP configuration using the hardware dialog in GenICam Explorer or microDisplayX.

    The link topology will tell you what you are using precisely for CXP.

    This links to the documentation of the link topology dialog; the screenshots there are for GEV, but the same steps apply to CXP.


    The End is a Summary

    Your VA design is correct, you should see the expected bandwidth.

    From my perspective it is possible that the PCIe connection or CXP topology is causing this.


    To me it is very likely that the PCIe Gen2 slot provides a single lane only.
    Please let me know what the DMA performance test (microDiagnostics) shows...


    If you need help interpreting your test results and relating them to the observed performance:
    I and all the other people in the VA forum community will help.


    Thanks and best regards,

    Dear Fabio,


    Here are some explanations of the VA design related to auto exposure: ExposureControl_StateMachine_B.Rudde.va

    Download: ExposureControl_StateMachine_B.Rudde.va

    A later post will give more explanation.

    Full overview:


    pasted-from-clipboard.png


    The points of interest are the 3 hierarchical boxes: RxExposure, BrightnessMismatch, NewExposure

    Each listed with comments below:


    BrightnessMismatch: for auto exposure to work, we need to know how "bright" the image is.
    Here you can set the target value for histogram mean into TargetHistMeanValue, preset = 127.

    The function extracts the mean value from full histogram:

    pasted-from-clipboard.png


    NewExposure: based on the brightness mismatch to the target value and the last used exposure value, the new exposure is calculated with a simple closed-loop control using a P element only. More details are in the comments within the VA design:


    Basics of loop:

    pasted-from-clipboard.png


    Regulator details:

    pasted-from-clipboard.png


    RxExposure: simply receives the new exposure value and forwards it to the pulse generation:


    pasted-from-clipboard.png
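
    For readers without the VA design at hand, the brightness measurement can be sketched in software. This is only an illustration of "mean value from the full histogram" compared with a target of 127; the function names and the sign convention (target minus mean) are assumptions of this sketch, not taken from the design.

```python
def histogram_mean(hist):
    """Mean gray value of an image given its histogram (hist[v] = pixel count of value v)."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    return sum(value * count for value, count in enumerate(hist)) / total


def brightness_mismatch(hist, target_hist_mean_value=127):
    """Positive result: image darker than the target; negative: brighter."""
    return target_hist_mean_value - histogram_mean(hist)


if __name__ == "__main__":
    # Toy 8 bit histogram: 4 pixels, all with gray value 100.
    hist = [0] * 256
    hist[100] = 4
    print("mean    :", histogram_mean(hist))
    print("mismatch:", brightness_mismatch(hist))
```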



    I hope the listed VA design shows what you are asking for. In case of questions do not hesitate to contact me or press the like button ;) in case there are no questions to this VA design sketch.


    Best regards,

    Dear Fabio,


    Here you can find a complete "auto exposure control" on basis of closed loop P-regulator.


    Download: ExposureControl_StateMachine_B.Rudde.va


    In the initial post you asked for fixed steps for the control. You can treat the internal value P as a fixed value,

    but please see the large number of comments within the design for more details.


    Your design includes a loop, but it can still be simulated in VA, except for the real camera feedback.

    You can use any kind of input sequence to see how the P-regulator acts.
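
    To get a feeling for how a P-only regulator behaves, here is a small software sketch. It is not the logic of the VA design itself: the update rule, the exposure limits, the gain and the linear camera model are all assumptions made for this illustration.

```python
def next_exposure(last_exposure, mismatch, p_gain, min_exposure=1.0, max_exposure=10_000.0):
    """P-only step: new exposure = last exposure + P * (target - measured mean)."""
    new_exposure = last_exposure + p_gain * mismatch
    # Clamp to a plausible exposure range (hypothetical limits).
    return max(min_exposure, min(max_exposure, new_exposure))


def fake_camera_mean(exposure, gray_per_us=0.5):
    """Hypothetical camera: mean gray value grows linearly with exposure, saturating at 255."""
    return min(255.0, gray_per_us * exposure)


def simulate(start_exposure, target_mean=127.0, p_gain=1.0, steps=20):
    """Run the closed loop for a number of frames and return the exposure history."""
    exposure = start_exposure
    history = [exposure]
    for _ in range(steps):
        mismatch = target_mean - fake_camera_mean(exposure)
        exposure = next_exposure(exposure, mismatch, p_gain)
        history.append(exposure)
    return history


if __name__ == "__main__":
    for frame, exposure in enumerate(simulate(100.0)):
        print(f"frame {frame:2d}: exposure = {exposure:8.3f} us")
```

    With this toy camera model the exposure settles where the mean gray value equals the target. A larger P reaches the target faster but can overshoot or oscillate, which is the usual trade-off of a P-only regulator.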


    A later post will give more explanation.


    The state machine question has a shorter answer:
    Rx- and TxSignalLink make it possible to build a signal based state machine,

    or if you require values and simulation for it, please use:

    RxImageLink and TxImageLink.


    Best regards,