I am fairly new to VA and was wondering how to work with floating-point constants that I need in my computation.
A very simple task could be:
I1 = I0 + b
where b could be a floating-point value between -1 and 1.
I could not enter a floating-point number into the CONST operator, nor could I find an appropriate operator for using float values.
So I thought about possible implementation techniques, mainly this one: first quantize the value to an appropriate bit width so that it uses fractional bits, then multiply it by the corresponding power of two so that I can enter an integer in the CONST operator, and adapt my formula accordingly. It would look something like this:
I1 = 128*I0 + 128*bquantized
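To make the idea concrete, here is a minimal Python sketch of the arithmetic I have in mind. It is not VA code; the 7 fractional bits (scale factor 128) and all function names are just my own assumptions for illustration. The integer returned by scaled_constant corresponds to 128*bquantized, i.e. the value I would enter in the CONST operator.

```python
# Fixed-point sketch of I1 = I0 + b. Plain Python, not VisualApplets.
# Assumption: 7 fractional bits, so the scale factor is 2**7 = 128.
FRACTIONAL_BITS = 7
SCALE = 1 << FRACTIONAL_BITS


def scaled_constant(b: float) -> int:
    """Quantize b to the nearest multiple of 1/SCALE and return it as an integer."""
    return int(round(b * SCALE))


def add_constant_fixed(i0: int, b: float) -> int:
    """Return I0 + b as an integer that carries FRACTIONAL_BITS fractional bits."""
    return i0 * SCALE + scaled_constant(b)


def to_float(fixed: int) -> float:
    """Interpret a fixed-point integer as a real value again."""
    return fixed / SCALE


if __name__ == "__main__":
    result = add_constant_fixed(100, 0.5)
    print(result, to_float(result))
```

The result stays scaled by 128, so later stages either have to keep track of the 7 fractional bits or shift right by 7 to get back to integer pixel values. The quantization error of b is at most half a step, i.e. 1/256 with this choice.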
Since this is the first thing that came to mind, I guess there might be better ways to use fixed-point numbers.
Of course, this is just a very simple example. Will the techniques differ when I have a constant matrix filled with floats, or a LUT?
I am hoping to get some feedback and insights on how to work with floating-point numbers in VA this way, and maybe help others facing the same problem.
I was not able to find an appropriate entry here; if there is one, please let me know.