Deadlock somewhere

  • Dear all,


    I believe we have a deadlock that does not show up in simulation. We have tried adding buffers in various places but have not been able to resolve it. Could someone help locate the deadlock and explain why it is happening?


    The design extracts 18000 pixels from each 1280x1024 frame and reduces the image to 18000x1. Every 4 frames are appended for background subtraction, where the 1st image is the background. We think the deadlock is somewhere in the subtract_background part.


    siso_forum_help.va

  • Dear,


    you are appending 4x4 frames, which results in an image height of 16. The simulation then stops with:

    Quote

    (BRANCH) Process0\subtract_bkgnd\ImageSequence\CopySequence\I: Image height(16) exceeds the maximal link height(4)!

    At runtime the behaviour of exceeded link dimensions is undefined, so that may be your main problem.
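    The height arithmetic behind that message can be sketched in a few lines of Python. This is only an illustration, not VisualApplets code: the function names are hypothetical, and it assumes each append stage multiplies the image height by the number of appended frames, starting from the 18000x1 image (height 1).

```python
MAX_LINK_HEIGHT = 4  # maximal link height reported in the simulation message


def appended_height(frame_height: int, frames_per_append: int, append_stages: int) -> int:
    """Image height after a chain of append stages (assumed model:
    every stage multiplies the height by the number of appended frames)."""
    return frame_height * frames_per_append ** append_stages


def fits_link(height: int, max_height: int = MAX_LINK_HEIGHT) -> bool:
    """True if the image height stays within the maximal link height."""
    return height <= max_height


# One append stage of 4 frames: height 4, which fits the link.
print(appended_height(1, 4, 1), fits_link(appended_height(1, 4, 1)))
# Two append stages (4x4 frames): height 16, which exceeds the link.
print(appended_height(1, 4, 2), fits_link(appended_height(1, 4, 2)))
```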


    Aside from this I don't see an obvious deadlock problem. All of your FIFOs are big enough.

    I see a bandwidth problem, as you are using a DRAM-operator at parallelism 1.

    What is the frame rate of your camera?

    On ME5 platforms the maximum possible parallelism should be used, as stated in the manual (Appendix. Device Resources/Shared Memory Concept):

    Quote

    Due to the shared bandwidth architecture, the applet developer should utilize all 256 bits of the operator’s memory interface (RAM Data Width) to achieve maximal throughput through the memory interface when using multiple RAM based operators even though the single RAM operator needs less bandwidth on its input.
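    To make the quoted recommendation concrete, here is a rough sketch of how much of the 256-bit interface a link would occupy. The 8-bit pixel width is an assumption (the thread does not state it), the helper names are hypothetical, and the real mapping between link parallelism and RAM data width is defined by the operator documentation, not by this arithmetic.

```python
RAM_DATA_WIDTH_BITS = 256  # width of the memory interface named in the manual quote


def interface_utilization(parallelism: int, bits_per_pixel: int) -> float:
    """Fraction of the RAM data width occupied by one transfer."""
    used_bits = parallelism * bits_per_pixel
    if used_bits > RAM_DATA_WIDTH_BITS:
        raise ValueError("link is wider than the RAM data width")
    return used_bits / RAM_DATA_WIDTH_BITS


def max_parallelism(bits_per_pixel: int) -> int:
    """Largest parallelism that still fits into the RAM data width."""
    return RAM_DATA_WIDTH_BITS // bits_per_pixel


# Assumed 8-bit pixels: parallelism 1 uses only a small part of the interface.
print(interface_utilization(1, 8))
print(max_parallelism(8))
```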



    Best regards,

    Simon

  • Dear Simon,


    Thank you for pointing that out. I made a mistake in duplicating the Append operator.

    Thanks for confirming that the buffers are big enough for this design. After more investigation, we found that there was another deadlock in later parts of the design which had caused the problem.

    As for the bandwidth, it is certainly a problem we are facing. The current design can process an input frame rate of 95 fps for 1280x1024 images, and the output is 23 fps. Any suggestions on how we might optimize the bandwidth?
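    For reference, the pixel rates implied by these numbers can be worked out as below. This is plain arithmetic on the figures in this thread; the assumption that one output image is produced per 4-frame sequence is mine, and it would be consistent with the reported output of about 23 fps.

```python
WIDTH, HEIGHT = 1280, 1024      # input frame size
INPUT_FPS = 95                  # input frame rate
EXTRACTED_PIXELS = 18000        # pixels kept per frame
FRAMES_PER_SEQUENCE = 4         # frames appended per sequence

# Pixels per second arriving from the camera.
input_pixel_rate = WIDTH * HEIGHT * INPUT_FPS           # 124_518_400

# Pixels per second remaining after the extraction step.
extracted_pixel_rate = EXTRACTED_PIXELS * INPUT_FPS     # 1_710_000

# Sequences per second, assuming one output per 4 input frames.
sequence_rate = INPUT_FPS / FRAMES_PER_SEQUENCE         # 23.75

print(input_pixel_rate, extracted_pixel_rate, sequence_rate)
```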


    Best,

    Lucy

  • Dear Simon,


    Thank you so much. Please let me know if there is any information you need about the design for bandwidth optimization.

    The deadlock was actually in a part that is not present in the design I uploaded (this is a simplified version). I had not added buffers before the synchronization in front of the add.

    debug_lifetime.PNG


    Best,

    Lucy

  • Dear Lucy,


    please have a look at the modified version.

    CarmenZ and I modified it to achieve more bandwidth.

    Most of the magic is done in "Par8_ROI_Extract". There are 2 LUTs with the pixel coordinates to extract from the image. The box "FrameBufferRandomRd_Par8" is a special approach to get more bandwidth while still being able to do random reads in DRAM. You can find this in the examples folder of VA: "Examples/Processing/Geometry/GeometricTransformation/GeometricTransformation_PixelReplicator.va". It's just a little bit stripped down. It is also really expensive; I hope the rest of your design still fits on the FPGA. If not, please tell me.
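    As a software illustration of the two-LUT idea (not the hardware implementation in "Par8_ROI_Extract"), one table can hold the x coordinates and the other the y coordinates of the pixels to extract. The function name and the small test image are hypothetical; in the real design the tables would hold 18000 entries.

```python
from typing import List, Sequence


def extract_roi(image: Sequence[Sequence[int]],
                lut_x: Sequence[int],
                lut_y: Sequence[int]) -> List[int]:
    """Read image[y][x] for every coordinate pair and return an N x 1 image."""
    if len(lut_x) != len(lut_y):
        raise ValueError("both LUTs must have the same number of entries")
    return [image[y][x] for x, y in zip(lut_x, lut_y)]


# Tiny 3x4 example image whose pixel value encodes row and column.
image = [[10 * row + col for col in range(4)] for row in range(3)]
lut_x = [0, 3, 1]
lut_y = [0, 2, 1]
extracted = extract_roi(image, lut_x, lut_y)
print(extracted)
```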


    Some changes are in "subtract_bkgnd/ImageSequence", where the three ImageFIFOs and the InsertImage are replaced by a FrameMemoryRandomRd with an address generator which repeats the input image three times.
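    The behaviour of such an address generator can be modelled roughly as follows. This sketch assumes simple linear addressing from 0 to N-1 per image; the actual address format expected by FrameMemoryRandomRd may differ, and the function name is hypothetical.

```python
from typing import Iterator


def repeat_image_addresses(pixels_per_image: int, repeats: int = 3) -> Iterator[int]:
    """Yield the read addresses of one stored image, 'repeats' times in a row."""
    for _ in range(repeats):
        for address in range(pixels_per_image):
            yield address


# Small example: a 4-pixel image read out three times.
addresses = list(repeat_image_addresses(4))
print(addresses)
```

    For an 18000x1 image this would produce 54000 read addresses per stored image.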


    I hope this helps.


    Best regards,

    Simon