Hello,

I am fairly new to VA and was wondering how to work with floating-point constants that I need in my computation.

A very simple task could be:

I_{1} = I_{0} + b

where b could be a floating-point value between -1 and 1.

I could not enter a floating-point number into the CONST operator, nor could I find an appropriate operator for using float values.

So I thought about possible implementation techniques, mainly quantizing the value beforehand to an appropriate bit width with fractional bits and multiplying it by the scale factor, so that I can enter an integer in the CONST operator. I would then adapt my formula accordingly, so it would look something like this:

128*I_{1} = 128*I_{0} + 128*b_{quantized}
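
To make the idea concrete, here is a minimal Python sketch of the arithmetic I have in mind. It is not VA code, only the integer math I would try to rebuild with operators. The 7 fractional bits (factor 128) and the helper names `quantize`, `add_offset_fixed` and `to_real` are my own assumptions for illustration.

```python
FRAC_BITS = 7            # assumed number of fractional bits
SCALE = 1 << FRAC_BITS   # 2**7 = 128

def quantize(b: float) -> int:
    """Round b to the nearest multiple of 1/SCALE and return it as an integer."""
    return round(b * SCALE)

def add_offset_fixed(i0: int, b: float) -> int:
    """Return the result scaled by SCALE: SCALE*I_0 + round(SCALE*b)."""
    return i0 * SCALE + quantize(b)

def to_real(fixed: int) -> float:
    """Interpret a scaled integer as a real number again."""
    return fixed / SCALE

if __name__ == "__main__":
    scaled = add_offset_fixed(100, 0.5)
    print(scaled, to_real(scaled))
```

The integer returned by `quantize` would be the value entered in the CONST operator. The result carries 7 fractional bits, so the following stages either need to know about the scaling or the value has to be shifted right by 7 bits again.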

Since this is the first thing that came to mind, I guess there might be better ways to handle fractional numbers.

Of course, this is just a very simple example. Will the techniques differ when I have a constant matrix filled with floats, or a LUT?
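
For the matrix and LUT case, my naive assumption would be to quantize every entry with the same scale factor. Here is a small Python sketch of that idea (again not VA code). The 7 fractional bits, the example kernel and the helper names `quantize_all`, `filter_pixel` and `build_lut` are hypothetical and only meant for illustration.

```python
FRAC_BITS = 7            # assumed number of fractional bits
SCALE = 1 << FRAC_BITS   # 128

def quantize_all(values):
    """Quantize a list of float coefficients to scaled integers."""
    return [round(v * SCALE) for v in values]

def filter_pixel(window, kernel_q):
    """Multiply-accumulate with integer coefficients, then remove the scaling once."""
    acc = sum(p * k for p, k in zip(window, kernel_q))
    return acc >> FRAC_BITS  # arithmetic right shift, rounds toward minus infinity

def build_lut(func, size):
    """Precompute func(i) for every input value i and store it as a scaled integer."""
    return [round(func(i) * SCALE) for i in range(size)]

if __name__ == "__main__":
    kernel_q = quantize_all([0.25, 0.5, 0.25])
    print(kernel_q, filter_pixel([10, 20, 30], kernel_q))
    lut = build_lut(lambda x: x / 255, 256)
    print(lut[0], lut[255])
```

In this sketch the scaling is removed only once after the accumulation, so the rounding error of the shift is not repeated for every coefficient. The accumulator then needs enough extra bits to hold the scaled sum.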

I am hoping to get some feedback and insights on how to work with floating-point numbers in VA this way, and maybe this will help others facing the same problem.

I was not able to find an appropriate entry here; if there is one, please let me know.

Greetings,

Kevin