Problem:
Using templates can lead to code bloat: binaries with replicated (or almost replicated) code, data, or both.
The result can be source code that looks fit and trim, yet object code that’s fat and flabby. Fat and flabby is rarely fashionable, so you need to know how to avoid such binary bombast.
Solution:
Your primary tool has the imposing name commonality and variability analysis.
Non-template code
When you’re writing a function and you realize that some part of the function’s implementation is essentially the same as another function’s implementation, do you just replicate the code?
Of course not. You factor the common code out of the two functions, put it into a third function, and have both of the other functions call the new one.
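As a minimal sketch of that refactoring (the function names here are hypothetical, chosen only for illustration), two functions that each need the sum of a sequence can call one shared helper instead of each carrying its own copy of the loop:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: the loop both callers would otherwise duplicate.
double sumOf(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) total += v;
    return total;
}

// First caller: the arithmetic mean (0 for an empty sequence).
double mean(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    return sumOf(values) / static_cast<double>(values.size());
}

// Second caller: the sum of (v - center) over all values.
double sumOfDeviationsFrom(const std::vector<double>& values, double center) {
    return sumOf(values) - center * static_cast<double>(values.size());
}
```

The loop exists once in the source and once in the object code, and a fix to it reaches both callers.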
Similarly, if you’re writing a class and you realize that some parts of the class are the same as parts of another class, you don’t replicate the common parts. Instead, you move the common parts to a new class, then you use inheritance or composition (see Items 32, 38, and 39) to give the original classes access to the common features.
Template code
In non-template code, replication is explicit: you can see that there’s duplication between two functions or two classes. In template code, replication is implicit: there’s only one copy of the template source code, so you have to train yourself to sense the replication that may take place when a template is instantiated multiple times.
Example:
template<typename T, // template for n x n matrices of
std::size_t n> // objects of type T; see below for info
class SquareMatrix { // on the size_t parameter
public:
...
void invert(); // invert the matrix in place
};
SquareMatrix<double, 5> sm1;
...
sm1.invert(); // call SquareMatrix<double, 5>::invert
SquareMatrix<double, 10> sm2;
...
sm2.invert(); // call SquareMatrix<double, 10>::invert
Two copies of invert will be instantiated here. The functions won’t be identical, because one will work on 5 × 5 matrices and one will work on 10 × 10 matrices, but other than the constants 5 and 10, the two functions will be the same. This is a classic way for template-induced code bloat to arise.
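One way to see that each size really does get its own function is to take the addresses of two instantiations. The sketch below uses a hypothetical function template, scaleAll, as a stand-in for a size-specific member function such as invert; it is an illustration, not part of the SquareMatrix class above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a size-specific member function: scales the
// n*n values of a matrix in place. n is a compile-time constant.
template<std::size_t n>
void scaleAll(double* values, double factor) {
    for (std::size_t i = 0; i < n * n; ++i) values[i] *= factor;
}

using ScaleFn = void (*)(double*, double);

// Each instantiation is a separate function with its own address,
// even though the two bodies differ only in the constant n.
inline bool instantiationsAreDistinct() {
    ScaleFn five = &scaleAll<5>;
    ScaleFn ten = &scaleAll<10>;
    return five != ten;
}
```

Both instantiations have the same signature, which is why the pointers can be compared directly.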
Solution
template<typename T>
class SquareMatrixBase { // size-independent base class for
protected: // square matrices
...
void invert(std::size_t matrixSize); // invert matrix of the given size
...
};
template<typename T, std::size_t n>
class SquareMatrix: private SquareMatrixBase<T> {
private:
using SquareMatrixBase<T>::invert; // make base class version of invert
// visible in this class; see Items 33
// and 43
public:
...
void invert() { invert(n); } // make implicit inline call to
// base class version of invert
};
SquareMatrixBase is templatized only on the type of objects in the matrix, not on its size, so all matrices holding a given type of object share a single SquareMatrixBase class. They will thus share a single copy of that class’s version of invert. (Provided, of course, you refrain from declaring that function inline. If it’s inlined, each instantiation of SquareMatrix::invert will get a copy of SquareMatrixBase::invert’s code, and you’ll find yourself back in the land of object code replication.) The additional cost of calling it should be zero, because the derived classes’ inverts call the base class version using inline functions. (The inline is implicit; see Item 30.)
How does SquareMatrixBase::invert know what data to operate on?
- One possibility would be to add another parameter to SquareMatrixBase::invert, such as a pointer to the memory holding the matrix’s data. That would work, but in all likelihood invert is not the only function in SquareMatrix that can be written in a size-independent manner and moved into SquareMatrixBase. We could add an extra parameter to all of them, but then we would be telling SquareMatrixBase the same information repeatedly, which seems wrong.
- An alternative is to have SquareMatrixBase store a pointer to the memory for the matrix values:
template<typename T>
class SquareMatrixBase {
protected:
SquareMatrixBase(std::size_t n, T *pMem) // store matrix size and a
: size(n), pData(pMem) {} // ptr to matrix values
void setDataPtr(T *ptr) { pData = ptr; } // reassign pData
...
private:
std::size_t size; // size of matrix
T *pData; // pointer to matrix values
};
template<typename T, std::size_t n>
class SquareMatrix: private SquareMatrixBase<T> {
public:
SquareMatrix() // send matrix size and
: SquareMatrixBase<T>(n, data) {} // data ptr to base class
...
private:
T data[n*n];
};
Objects of such types have no need for dynamic memory allocation, but the objects themselves could be very large. An alternative would be to put the data for each matrix on the heap:
template<typename T, std::size_t n>
class SquareMatrix: private SquareMatrixBase<T> {
public:
SquareMatrix() // set base class data ptr to null,
: SquareMatrixBase<T>(n, 0), // allocate memory for matrix
pData(new T[n*n]) // values, save a ptr to the
{ this->setDataPtr(pData.get()); } // memory, and give a copy of it
... // to the base class
private:
boost::scoped_array<T> pData; // see Item 13 for info on
}; // boost::scoped_array
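To make the first (in-object array) design concrete, here is a compilable sketch of it. Two things in it are assumptions made for illustration rather than part of the original code: transposition stands in for inversion to keep the shared function short, and the at accessor and the deleted copy operations are added so the sketch can be exercised safely.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

template<typename T>
class SquareMatrixBase {
protected:
    SquareMatrixBase(std::size_t n, T* pMem) : size(n), pData(pMem) {}

    // Size-independent code: instantiated once per T, whatever n is.
    // (Transposition is a stand-in for inversion in this sketch.)
    void transpose() {
        for (std::size_t r = 0; r < size; ++r)
            for (std::size_t c = r + 1; c < size; ++c)
                std::swap(pData[r * size + c], pData[c * size + r]);
    }

private:
    std::size_t size;  // size of matrix
    T* pData;          // pointer to matrix values
};

template<typename T, std::size_t n>
class SquareMatrix : private SquareMatrixBase<T> {
public:
    // Hand the base class the size and the address of the array,
    // then value-initialize the array itself.
    SquareMatrix() : SquareMatrixBase<T>(n, data), data() {}

    // A copy's base pointer would still refer to the source object's
    // array, so this sketch simply forbids copying.
    SquareMatrix(const SquareMatrix&) = delete;
    SquareMatrix& operator=(const SquareMatrix&) = delete;

    // Hypothetical accessor, added for illustration.
    T& at(std::size_t r, std::size_t c) { return data[r * n + c]; }

    // Implicitly inline forwarder to the shared base class code.
    void transpose() { SquareMatrixBase<T>::transpose(); }

private:
    T data[n * n];
};
```

SquareMatrix<int, 2> and SquareMatrix<int, 3> then share the single SquareMatrixBase<int>::transpose; only the one-line forwarders are instantiated per size.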
Tradeoff
- Size-specific optimizations vs. a smaller executable:
  - The size-specific versions of invert, with the matrix sizes hardwired into them, are likely to generate better code than the shared version, where the size is passed as a function parameter or stored in the object. In the size-specific versions, the sizes are compile-time constants, hence eligible for optimizations such as constant propagation, including being folded into the generated instructions as immediate operands. That can’t be done in the size-independent version.
  - On the other hand, having only one version of invert for multiple matrix sizes decreases the size of the executable, and that could reduce the program’s working set size and improve locality of reference in the instruction cache.
- Object size: if you’re not careful, moving size-independent versions of functions up into a base class can increase the overall size of each object. For example, each SquareMatrix object has a pointer to its data in the SquareMatrixBase class, even though each derived class already has a way to get to the data. This increases the size of each SquareMatrix object by at least the size of a pointer. It’s possible to modify the design so that these pointers are unnecessary, but that brings trade-offs of its own.
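One possible way to drop the stored pointer, sketched here under assumptions of my own rather than taken from the text, is to go back to passing the data pointer and size as arguments to the shared code. The type and function names (traceOf, MatrixWithPointer, LeanMatrix) are hypothetical, and MatrixWithPointer exists only to make the per-object overhead visible; copying it would leave its pointer referring to the source object.

```cpp
#include <cassert>
#include <cstddef>

// Shared, size-independent code: one instantiation per T. The data
// pointer and size arrive as arguments instead of living in the object.
template<typename T>
T traceOf(const T* values, std::size_t size) {
    T sum = T();
    for (std::size_t i = 0; i < size; ++i) sum += values[i * size + i];
    return sum;
}

// Mirrors the base-class design: a size and a pointer stored next to
// the data they describe.
template<typename T, std::size_t n>
struct MatrixWithPointer {
    std::size_t size = n;
    T* pData = data;     // redundant: the object already contains data
    T data[n * n] = {};
    T trace() const { return traceOf(pData, size); }
};

// No stored pointer or size: each forwarding call supplies them.
template<typename T, std::size_t n>
struct LeanMatrix {
    T data[n * n] = {};
    T trace() const { return traceOf(data, n); }
};
```

The price is the one noted earlier: every size-independent function needs the extra arguments. In exchange, a LeanMatrix holds nothing but its values.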
Bloat on type parameters
This Item has discussed only bloat due to non-type template parameters, but type parameters can lead to bloat, too.
For example, on many platforms, int and long have the same binary representation, so the member functions for, say, vector<int> and vector<long> would likely be identical, which is the very definition of bloat. Some linkers will merge identical function implementations, but some will not, and that means that some templates instantiated on both int and long could cause code bloat in some environments.
For pointers
Similarly, on most platforms, all pointer types have the same binary representation, so templates holding pointer types (e.g., list<int*>, list<const int*>, list<SquareMatrix<long, 3>*>, etc.) should often be able to use a single underlying implementation for each member function.
Typically, this means implementing member functions that work with strongly typed pointers (i.e., T* pointers) by having them call functions that work with untyped pointers (i.e., void* pointers).
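A minimal sketch of that technique follows, using a stack rather than a list to keep it short. The class names VoidPtrStack and PtrStack are hypothetical; the point is only that the real work is compiled once, for void*, while the template contributes thin inline wrappers that add and remove the type.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Untyped implementation: ordinary non-template code, compiled once
// and shared by every pointer type.
class VoidPtrStack {
public:
    void push(void* p);
    void* pop();               // precondition: the stack is not empty
    std::size_t size() const;
private:
    std::vector<void*> items;
};

void VoidPtrStack::push(void* p) { items.push_back(p); }

void* VoidPtrStack::pop() {
    void* top = items.back();
    items.pop_back();
    return top;
}

std::size_t VoidPtrStack::size() const { return items.size(); }

// Strongly typed front end: a stack of T* whose member functions are
// one-line inline forwarders to the shared void* code.
template<typename T>
class PtrStack : private VoidPtrStack {
public:
    void push(T* p) {
        // Strip const/volatile so the pointer converts to void*;
        // pop() restores the qualifiers with its cast back to T*.
        VoidPtrStack::push(
            const_cast<typename std::remove_cv<T>::type*>(p));
    }
    T* pop() { return static_cast<T*>(VoidPtrStack::pop()); }
    using VoidPtrStack::size;
};
```

PtrStack<int> and PtrStack<const int> both run on the same VoidPtrStack member functions, so adding another pointee type adds only forwarders.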
Things to Remember
- Templates generate multiple classes and multiple functions, so any template code not dependent on a template parameter causes bloat.
- Bloat due to non-type template parameters can often be eliminated by replacing template parameters with function parameters or class data members.
- Bloat due to type parameters can be reduced by sharing implementations for instantiation types with identical binary representations.