Tutorial 4 of 5: fixing lower order parameters

In the previous section, we took advantage of the symmetry of $\sin(x)$ to build the following minimax expression:

$$\max_{y \in [0, \pi^2/4]}{\dfrac{\bigg\lvert\dfrac{\sin(\sqrt{y})}{\sqrt{y}} - Q(y)\bigg\rvert}{\dfrac{1}{|\sqrt{y}|}}} = E$$

Leading to the following coefficient, amongst others:

    return u * x + 9.9999788400553332261e-1f;

This is an interesting value, because it is very close to 1. Many CPUs can load the value 1 very quickly, which can be a potential runtime gain.

We may wonder: what if we used 1 directly here?

The brutal way

   return u * x + 1.f;

Let’s see the error value:

Duh. Pretty bad, actually. Maximum error is about 10 times worse.

The clever way

The clever way involves some more maths. Instead of looking for polynomial $Q(y)$ and setting $Q(0) = 1$ manually, we rewrite $Q$ as $Q(y) = 1 + yR(y)$ and search instead for $R$:

$$\max_{y \in [0, \pi^2/4]}{\dfrac{\bigg\lvert\dfrac{\sin(\sqrt{y})}{\sqrt{y}} - 1 - yR(y)\bigg\rvert}{\dfrac{1}{|\sqrt{y}|}}} = E$$

Dividing numerator and denominator by $y$ gives:

$$\max_{y \in [0, \pi^2/4]}{\dfrac{\bigg\lvert\dfrac{\sin(\sqrt{y})-\sqrt{y}}{y\sqrt{y}} - R(y)\bigg\rvert}{\dfrac{1}{|y\sqrt{y}|}}} = E$$

Once again, we get a form suitable for the Remez algorithm.

Invocation

lolremez --degree 3 --range "1e-50:pi*pi/4" "(sin(sqrt(x))-sqrt(x))/(x*sqrt(x))" "1/(x*sqrt(x))"

Only $f$ and $g$ changed here, as well as the polynomial degree. The rest is the same as in the previous section.

After all the iterations the output should be as follows:

/* Approximation of f(x) = (sin(sqrt(x))-sqrt(x))/(x*sqrt(x))
 * with weight function g(x) = 1/(x*sqrt(x))
 * on interval [ 9.9999999999999999999e-51, 2.4674011002723396547 ]
 * with a polynomial of degree 3. */
float f(float x)
{
    float u = 2.5180282371807974667e-6f;
    u = u * x + -1.9754457746087556056e-4f;
    u = u * x + 8.3319148411123829805e-3f;
    return u * x + -1.6666579517227416457e-1f;
}

We can therefore write the corresponding C++ function:

float fastsin2(float x)
{
    return x + x * f(x * x);
}

Analysing the results

Let’s see the new error curve:

Excellent! The loss of precision is clearly not as bad as before, and we need to load 4 constants instead of 5 previously.

Conclusion

You should now be able to fix lower-order coefficients in the minimax polynomial for possible performance improvements.

Please report any trouble you may have had with this document to [email protected]. You may then carry on to the next section: additional tips.

Provide feedback

Saved searches