Question on using CoefficientBuffer with multiple outputs for 16 bit image data

Sangrae_Kim · Mar 11th 2021

Hello,

We would like advice on how to increase the frame rate under the conditions listed below.

Target H/W = microEnable 5 VCL

Target FPS = 30fps

Image Size = 3104 x 3088

Image bit depth = 16bit

My design runs into a buffer overflow at the target frame rate. (It reaches roughly 20 to 23 fps without a buffer overflow.)

Can I get any advice on using CoefficientBuffer (multiple outputs) to prevent the overflow?

Sorry, I cannot upload the design file, so I have uploaded a sketch of the design instead.

If we don't use CoefficientBuffer 1 and 2, we can get 30 fps.

Each CoefficientBuffer link: 16 bit, parallelism 4 (16bit-P4); image width x height = 776 x 3088.

pasted-from-clipboard.png

Thanks.

Best regards.

Sangrae Kim.

Johannes Trein · Mar 11th 2021

Hello Sangrae Kim,

You will need the following memory bandwidth:

3104 x 3088 * 30fps = 288 MPixel/s.

To buffer the camera image (assuming 16 bit per pixel) you need 288 MPixel/s * 2 byte * 2 (write and read) = 1150 MB/s

For CoefficientBuffer you need 288 MPixel/s * 2 images * 2 byte = 1150 MB/s

So in total your design requires a memory bandwidth of about 2.3 GByte/s. The marathon VCL has a theoretical total of 6400 MB/s, so we are good here.
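For anyone who wants to re-run the estimate with a different resolution or frame rate, here is a minimal sketch of the same arithmetic in Python. The variable names are only illustrative; the numbers are the ones quoted in this thread, and 1 MB is taken as 10^6 bytes.

```python
# Memory bandwidth estimate using the figures quoted in this thread.
WIDTH, HEIGHT = 3104, 3088       # image size in pixels
FPS = 30                         # target frame rate
BYTES_PER_PIXEL = 2              # 16 bit per pixel
COEFF_IMAGES = 2                 # two CoefficientBuffer images
THEORETICAL_MB_S = 6400          # theoretical total quoted for the marathon VCL

pixel_rate = WIDTH * HEIGHT * FPS                               # pixels per second
image_buffer_bw = pixel_rate * BYTES_PER_PIXEL * 2              # write + read
coeff_buffer_bw = pixel_rate * COEFF_IMAGES * BYTES_PER_PIXEL   # read only
total_bw = image_buffer_bw + coeff_buffer_bw                    # bytes per second

print(f"pixel rate:        {pixel_rate / 1e6:.0f} MPixel/s")
print(f"ImageBuffer:       {image_buffer_bw / 1e6:.0f} MB/s")
print(f"CoefficientBuffer: {coeff_buffer_bw / 1e6:.0f} MB/s")
print(f"total:             {total_bw / 1e6:.0f} MB/s of {THEORETICAL_MB_S} MB/s")
```

The total stays well below the theoretical limit, so the raw memory bandwidth is not the bottleneck; what matters is how the operators are parameterized.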

Operator ImageBuffer needs to be used at parallelism 16 to get the full performance. Otherwise you will not use all memory cells.

CoefficientBuffer is a little tricky. In this thread you can see how CoefficientBuffer needs to be parameterized to get the maximum performance. LINK

From the table we can figure out that the operator will only run in full performance when using four output links in parallel at a parallelism of two and a bit width of 64.

pasted-from-clipboard.png

See attached file (untested).

Let me know if you have further questions.

Johannes

Sangrae_Kim · Mar 12th 2021

Hello Johannes,

Thank you for your quick update.

I will check again with your advice.

I tested it first and ran into one issue, described below.

In simulation, we cannot load the image files into CoefficientBuffer, as shown below.

pasted-from-clipboard.png

d00.tif: image size 1552 x 3088 (pixels 0, 2, 4, 6, 8, ...)

d01.tif: image size 1552 x 3088 (pixels 1, 3, 5, 7, 9, ...)

w00.tif: image size 1552 x 3088 (pixels 0, 2, 4, 6, 8, ...)

w01.tif: image size 1552 x 3088 (pixels 1, 3, 5, 7, 9, ...)
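To make the even/odd split concrete, here is a minimal sketch in plain Python of how one full-width image is divided into two half-width images. The function name and the small stand-in image are hypothetical; reading and writing the actual TIFF files is left out.

```python
def split_even_odd_columns(image):
    """Split each row into its even-indexed and odd-indexed pixels."""
    even = [row[0::2] for row in image]   # pixels 0, 2, 4, ...
    odd = [row[1::2] for row in image]    # pixels 1, 3, 5, ...
    return even, odd

# Tiny stand-in image: 2 rows x 8 columns instead of 3088 rows x 3104 columns.
image = [list(range(8)), list(range(10, 18))]
even, odd = split_even_odd_columns(image)

print(even)
print(odd)
```

Applied to a 3104 pixel wide image, each half is 1552 pixels wide, which matches the file sizes listed above.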

First of all, I tried changing CoefficientBuffer the way you suggested, but the image file still cannot be loaded in the simulation.

pasted-from-clipboard.png

Thanks.

Best regards.

Johannes Trein · Mar 12th 2021

Hello Sangrae Kim

Note that the images need to be 8 bit per pixel to fit the 64 bit of CoefficientBuffer. A 16 bit pixel image cannot be uploaded directly.
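As a rough illustration of what "8 bit per pixel" means for 16 bit data, the sketch below splits every 16 bit value into two 8 bit values, so a 64 bit word carries eight 8 bit values, i.e. four 16 bit pixels. The function name is hypothetical and the byte order (low byte first) is an assumption, not something confirmed for CoefficientBuffer; images generated by the design itself in simulation are the reliable reference.

```python
import struct

def split_16bit_row(row, little_endian=True):
    """Return a row of 8 bit values, two per 16 bit input pixel.

    The byte order is an assumption; check it against images
    produced by the design in simulation.
    """
    fmt = ("<" if little_endian else ">") + f"{len(row)}H"
    return list(struct.pack(fmt, *row))

row16 = [0x1234, 0xABCD, 0x0001]   # three 16 bit pixels
row8 = split_16bit_row(row16)      # six 8 bit values

print([hex(v) for v in row8])
```

The resulting image is twice as wide as the 16 bit original and can be stored as an ordinary 8 bit grayscale file.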

I added a simulation only H-Box to the design to let you know how to generate the images.

pasted-from-clipboard.png

Open to view:

pasted-from-clipboard.png

Load any image to the SimulationSource module.

Set the pixel alignment if your image is not a 16 bit image.

Start the simulation and save the results of the two probes to image files.

Set these image files in CoefficientBuffer and start the simulation again. The CoefficientBuffer will then use the images. Check the simulation probe of the design to verify your results.

I've attached two sample images which can be loaded directly to the CoefficientBuffers.

Let me know your feedback.

See attached ZIP

Best regards

Johannes

Sangrae_Kim · Mar 15th 2021

Hello Johannes,

Thank you for your support.

It works fine with the method you suggested.

We can reach the target frame rate now.

Thanks.

Best regards.
