faster quaternion multiplication

Sweet memories: this one was fun to derive.

"Faster" here must be taken with a grain of salt, however: the new code trades multiplications for extra additions and subtractions, and it is not guaranteed to pipeline better.

And of course, it’s trivial to beat this generic C++ code with architecture-specific, hand-rolled assembly.

git show cbc22908
commit cbc229081a9df67a577b4bea61ad6aac52d470cb
Author: Ariya Hidayat <ariya .hidayat>
Date:   Tue Jun 30 11:18:03 2009 +0200

    Faster quaternion multiplications.
    
    Use the known factorization trick to speed-up quaternion multiplication.
    Now we need only 9 floating-point multiplications, instead of 16 (but
    at the cost of extra additions and subtractions).
    
    Callgrind shows that the function now takes 299 instructions instead of
    318 instructions, which is not a big win. However I assume the speed-up
    has a better effect for mobile CPU, where multiplications are more
    expensive.
    
    Reviewed-by: Rhys Weatherley

diff --git a/src/gui/math3d/qquaternion.h b/src/gui/math3d/qquaternion.h
index 55c871d..9a1b590 100644
--- a/src/gui/math3d/qquaternion.h
+++ b/src/gui/math3d/qquaternion.h
@@ -198,24 +198,17 @@ inline QQuaternion &QQuaternion::operator*=(qreal factor)
 
 inline const QQuaternion operator*(const QQuaternion &q1, const QQuaternion& q2)
 {
-    // Algorithm from:
-    // http://www.j3d.org/matrix_faq/matrfaq_latest.html#Q53
-    float x = q1.wp * q2.xp +
-                    q1.xp * q2.wp +
-                    q1.yp * q2.zp -
-                    q1.zp * q2.yp;
-    float y = q1.wp * q2.yp +
-                    q1.yp * q2.wp +
-                    q1.zp * q2.xp -
-                    q1.xp * q2.zp;
-    float z = q1.wp * q2.zp +
-                    q1.zp * q2.wp +
-                    q1.xp * q2.yp -
-                    q1.yp * q2.xp;
-    float w = q1.wp * q2.wp -
-                    q1.xp * q2.xp -
-                    q1.yp * q2.yp -
-                    q1.zp * q2.zp;
+    float ww = (q1.zp + q1.xp) * (q2.xp + q2.yp);
+    float yy = (q1.wp - q1.yp) * (q2.wp + q2.zp);
+    float zz = (q1.wp + q1.yp) * (q2.wp - q2.zp);
+    float xx = ww + yy + zz;
+    float qq = 0.5 * (xx + (q1.zp - q1.xp) * (q2.xp - q2.yp));
+
+    float w = qq - ww + (q1.zp - q1.yp) * (q2.yp - q2.zp);
+    float x = qq - xx + (q1.xp + q1.wp) * (q2.xp + q2.wp);
+    float y = qq - yy + (q1.wp - q1.xp) * (q2.yp + q2.zp);
+    float z = qq - zz + (q1.zp + q1.yp) * (q2.wp - q2.xp);
+
     return QQuaternion(w, x, y, z, 1);
 }
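
To see that the factored form really computes the same product, here is a minimal standalone sketch that puts both versions from the diff side by side. It does not use Qt: `Quat`, `mulReference`, `mulFactored`, `nearlyEqual` and `checkFactoredProduct` are hypothetical names chosen for this example, and the plain struct only stands in for the `wp`/`xp`/`yp`/`zp` members of `QQuaternion`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for QQuaternion: scalar part w, vector part (x, y, z).
struct Quat {
    float w, x, y, z;
};

// Textbook Hamilton product (the removed lines of the diff): 16 multiplications.
inline Quat mulReference(const Quat &a, const Quat &b)
{
    return {
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y + a.y * b.w + a.z * b.x - a.x * b.z,
        a.w * b.z + a.z * b.w + a.x * b.y - a.y * b.x
    };
}

// Factored product (the added lines of the diff): 8 products of sums
// plus one multiplication by 0.5f, i.e. 9 multiplications in total.
inline Quat mulFactored(const Quat &a, const Quat &b)
{
    float ww = (a.z + a.x) * (b.x + b.y);
    float yy = (a.w - a.y) * (b.w + b.z);
    float zz = (a.w + a.y) * (b.w - b.z);
    float xx = ww + yy + zz;
    // Expands to a.w*b.w + a.x*b.y - a.y*b.z + a.z*b.x.
    float qq = 0.5f * (xx + (a.z - a.x) * (b.x - b.y));

    return {
        qq - ww + (a.z - a.y) * (b.y - b.z),
        qq - xx + (a.x + a.w) * (b.x + b.w),
        qq - yy + (a.w - a.x) * (b.y + b.z),
        qq - zz + (a.z + a.y) * (b.w - b.x)
    };
}

// True when every component of p and q differs by at most eps.
inline bool nearlyEqual(const Quat &p, const Quat &q, float eps = 1e-4f)
{
    return std::fabs(p.w - q.w) <= eps && std::fabs(p.x - q.x) <= eps
        && std::fabs(p.y - q.y) <= eps && std::fabs(p.z - q.z) <= eps;
}

// Spot check on one pair of inputs; an illustration, not a proof.
inline void checkFactoredProduct()
{
    const Quat a{1.0f, 2.0f, 3.0f, 4.0f};
    const Quat b{5.0f, 6.0f, 7.0f, 8.0f};
    assert(nearlyEqual(mulFactored(a, b), mulReference(a, b)));
}
```

Calling `checkFactoredProduct()` from a test or from `main` exercises one input pair. The comparison uses a tolerance because the two forms round differently in floating point, so for general inputs the results agree only approximately, not bit for bit.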
