Metaprogramming for math library Jiří Chmelík, Marek Trtík PA199 ‹#› uMetaprogram is a code whose compilation generates a new code whose functionality we really want. u uFirst metaprogram: uErwin Unruh, 1994. uDuring compilation of his program the compiler computed prime numbers. u uOur first metaprogram will be much simpler… Metaprogramming ‹#› Our first metaprogram ‹#› uLet’s consider this implementation of the dot product: u float dot_product(int dim, float* u, float* v) { u float result = 0.0f; u for (int i=0; i struct DotProduct { // Recursion rule u static inline float result(float* u, float* v) { u return (*u) * (*v) + DotProduct::result(u+1, v+1); u } u}; u utemplate<> struct DotProduct<1> { // Base case u static inline float result(float* u, float* v) { u return (*u) * (*v); u } u}; Unrolling loops: Solution ‹#› u template inline float dot_product(float* u, float* v) { u return DotProduct::result(u, v); u } uThe compiler then converts the call u dot_product<3>(u,v) u into code: uDotProduct<3>::result(u,v); u(*u) * (*v) + DotProduct<2>::result(u+1,v+1); u(*u) * (*v) + ((*(u+1)) * (*(v+1)) + DotProduct<1>::result(u+2,v+2)); u(*u) * (*v) + ((*(u+1)) * (*(v+1)) + (*(u+2)) * (*(v+2))); Unrolling loops: Solution ‹#› uWhen designing a math library, it is desirable to provide this syntax: u Vector u(1000), v(1000); // Define vectors u … // Initialise them u u = 1.2f * u + u * v; // Perform operations. u uIn C++ we can achieve this syntax via operator overloading. u uQuestion: Could there be a performance issue with such approach? u uLet’s start with a naïve solution… Expression templates: Motivation ‹#› uclass SVector { // “Simple” Vector of any dimension. u int size_; u float* data_; u void copy(SVector const& o) { for (int i=0; i Mul Add, Mul > ‹#› u class Scalar { u float const& s_; u public: u Scalar(float const& s) : s_(s) {} u int size() const { return 0; } u float operator[](int const) const { return s_; } u }; Expression templates: Scalar ‹#› uInside operator classes Add and Mul we must distinguish how we reference operands: uScalar operand – by value. uOther operands – by reference. u u template struct ArgType { u typedef T const& result; u }; u template<> struct ArgType { u typedef Scalar result; u }; Expression templates: ArgType ‹#› u template u class Mul { u typename ArgType::result op1_; u typename ArgType::result op2_; u public: u Mul(OP1 const& op1, OP2 const& op2) : op1_(op1), op2_(op2) {} u int size() const { return op1.size()!=0? op1.size() : op2.size(); } u float operator[](int const i) const { return op1_[i] * op2_[i]; } u }; Expression templates: Mul ‹#› u template u class Add { u typename ArgType::result op1_; u typename ArgType::result op2_; u public: u Add(OP1 const& op1, OP2 const& op2) : op1_(op1), op2_(op2) {} u int size() const { return op1.size()!=0? op1.size() : op2.size(); } u float operator[](int const i) const { return op1_[i] + op2_[i]; } u }; Expression templates: Add ‹#› uHow do we define operators? How about this: u template u Mul operator*(T1 const& u, T2 const& v) { … } // (1) uNot good, parameters are too general, e.g.: u struct A {} a; u struct B : public A {} b; u template u A operator*(T const& s, A const& a) { … } // (2) u 3 * a; // Always calls (2) u 3 * b; // Calls (2) only when (1) is not defined. Expression templates: Expression class ‹#› uWe also do not want define all possible versions (too much work): u Mul< SVector, SVector > operator*(SVector const& u, SVector const& v); u template Mul > operator*(SVector const& u, Mul const& v); u template Mul, SVector> operator*(Mul const& u, SVector const& v); u template Mul< SVector,Add > operator*(SVector const& u, Add const& v); u … uSolution: We introduce a class “Expr” for wrapping our expressions: u template class Expr; uWe can safely declare the operator as: u template u Expr > operator*( u Expr const& u, Expr const& v); u Expression templates: Expression class ‹#› u template u class Expr { // Expression u Rep rep_; u public: u explicit Expr(int size) : rep_(size) {} u Expr(Rep const& rep) : rep_(rep) {} u Rep const& rep() const { return rep_; } // Unwrap the instance. u int size() const { return rep_.size(); } u float operator[](int const i) const { return rep_[i]; } u // Defined later: u template Expr& operator=(Expr const& o); u }; Expression templates: Expression ‹#› u template u Expr > operator*(Expr const& u, Expr const& v) { u return Expr >(Mul(u.rep(), v.rep())); u } u u template u Expr > operator*(float const& s, Expr const& u) { u return Expr >(Mul(Scalar(s), u.rep())); u } u u template u Expr > operator+(Expr const& u, Expr const& v) { u return Expr >(Add(u.rep(), v.rep())); u } Expression templates: Operators Wrap the result Actual operation Unwrap operands ‹#› utemplate utemplate uExpr& Expr::operator=(Expr const& o) { u for (int i=0; i Add[i] (Expr > forwards operator[] to ‘rep’) u ==> Add::op1[i] + Add::op2[i] u ==> Mul::op1[i] * Mul::op2[i] + Mul::op1[i] * Mul::op2[i] u ==> Scalar[i] * SVector[i] + SVector[i] * SVector[i] u ==> 1.2f * u[i] + u[i] * v[i] u Expression templates: Expr::operator= Add, Mul > Compiler converts ‘o[i]’ to ‘1.2f*u[i] + u[i]*v[i]’ => we get the efficient version! ‹#› uWe must declare our vector variables as: u Expr u(1000), v(1000); // Not nice! uTherefore, we usually define: u typedef Expr Vector; u uLimitation: Does not work for operations where RAM temporaries are really needed, e.g., ‘x = A*x’, where x is a vector and A is a matrix. uBut ‘y = A*x’ would work just fine (assuming x,y are different vectors). u uUsage rule of thumb: uFor small fixed-size vectors use metaprogramming, e.g., unrolling loops. uFor large dynamic-size vectors use expression templates. Expression templates: Notes & Limitations ‹#› u[1] D.Vandevoorde, N.M.Josuttis; C++ Temlates, The Complete Guide; Addison-Wesley,2006. References