Sweet memories, it was fun to derive it.
Faster here however must be taken with a grain of salt as the new code is not always guaranteed to be better pipelined.
And of course, it’s trivial to beat this generic C code with architecture-specific hand-rolled assembly.
git show cbc22908
commit cbc229081a9df67a577b4bea61ad6aac52d470cb
Author: Ariya Hidayat <ariya .hidayat>
Date: Tue Jun 30 11:18:03 2009 +0200
Faster quaternion multiplications.
Use the known factorization trick to speed-up quaternion multiplication.
Now we need only 9 floating-point multiplications, instead of 16 (but
at the cost of extra additions and subtractions).
Callgrind shows that the function now takes 299 instructions instead of
318 instructions, which is not a big win. However I assume the speed-up
has a better effect for mobile CPU, where multiplications are more
expensive.
Reviewed-by: Rhys Weatherley
diff --git a/src/gui/math3d/qquaternion.h b/src/gui/math3d/qquaternion.h
index 55c871d..9a1b590 100644
--- a/src/gui/math3d/qquaternion.h
+++ b/src/gui/math3d/qquaternion.h
@@ -198,24 +198,17 @@ inline QQuaternion &QQuaternion;::operator*=(qreal factor)
inline const QQuaternion operator*(const QQuaternion &q1;, const QQuaternion& q2)
{
- // Algorithm from:
- // http://www.j3d.org/matrix_faq/matrfaq_latest.html#Q53
- float x = q1.wp * q2.xp +
- q1.xp * q2.wp +
- q1.yp * q2.zp -
- q1.zp * q2.yp;
- float y = q1.wp * q2.yp +
- q1.yp * q2.wp +
- q1.zp * q2.xp -
- q1.xp * q2.zp;
- float z = q1.wp * q2.zp +
- q1.zp * q2.wp +
- q1.xp * q2.yp -
- q1.yp * q2.xp;
- float w = q1.wp * q2.wp -
- q1.xp * q2.xp -
- q1.yp * q2.yp -
- q1.zp * q2.zp;
+ float ww = (q1.zp + q1.xp) * (q2.xp + q2.yp);
+ float yy = (q1.wp - q1.yp) * (q2.wp + q2.zp);
+ float zz = (q1.wp + q1.yp) * (q2.wp - q2.zp);
+ float xx = ww + yy + zz;
+ float qq = 0.5 * (xx + (q1.zp - q1.xp) * (q2.xp - q2.yp));
+
+ float w = qq - ww + (q1.zp - q1.yp) * (q2.yp - q2.zp);
+ float x = qq - xx + (q1.xp + q1.wp) * (q2.xp + q2.wp);
+ float y = qq - yy + (q1.wp - q1.xp) * (q2.yp + q2.zp);
+ float z = qq - zz + (q1.zp + q1.yp) * (q2.wp - q2.xp);
+
return QQuaternion(w, x, y, z, 1);
}
</ariya>