Scotts Test Case
His test case involves calling the std::sort() function on a vector of 1 million random doubles. The comparison function used for the sort method is greater
#include <stdio.h> //printf #include <iostream> //cout, endl #include <time.h> //timespec, clock_gettime(), CLOCK_PROCESS_CPUTIME_ID #include <vector> //vector #include <algorithm> //generate, sort #include <functional> //binary_function, greater using namespace std; class RandomDoubleGenerator { public: double operator()() { return (double)rand()/(double)rand(); } }; class TimespecAverage : public unary_function<struct timespec, void> { public: TimespecAverage() : numTimespecs(0) { } void operator()(struct timespec t) { ++numTimespecs; sum += t.tv_sec; sum += t.tv_nsec * 0.000000001; // nano is 1x10^-9 seconds } double average() const { return sum / numTimespecs; } private: double numTimespecs; double sum; }; timespec diff(struct timespec start, struct timespec end) { struct timespec diff; if ((end.tv_nsec - start.tv_nsec) < 0) { //compensate for something like: start - 12.9, end 13.1 diff.tv_sec = end.tv_sec - start.tv_sec - 1; diff.tv_nsec = 1000000000 + end.tv_nsec - start.tv_nsec; } else { diff.tv_sec = end.tv_sec - start.tv_sec; diff.tv_nsec = end.tv_nsec - start.tv_nsec; } return diff; } timespec time(void (*functionToTime)(vector<double>&), vector<double>& data) { struct timespec start; struct timespec stop; clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start); functionToTime(data); clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &stop); return diff(start, stop); } void printElapsed(struct timespec functor, struct timespec function) { printf("%li.%.9li %li.%.9li \n", functor.tv_sec, functor.tv_nsec, function.tv_sec, function.tv_nsec); } void printAverages(vector<timespec>& functorTimes, vector<timespec>& functionTimes) { TimespecAverage functorAverage = for_each(functorTimes.begin(), functorTimes.end(), TimespecAverage()); TimespecAverage functionAverage = for_each(functionTimes.begin(), functionTimes.end(), TimespecAverage()); cout << endl << "Average" << endl; printf("%1.9f %1.9f \n", functorAverage.average(), functionAverage.average()); } bool greaterThan(double left, double right) { return left > right; } void sortFunctor(vector<double>& data) { sort(data.begin(), data.end(), greater<double>()); } void sortFunction(vector<double>& data) { sort(data.begin(), data.end(), greaterThan); } int main(int argc, char** argv) { const int LOOPS = 10; const int NUM_DATA = 10000000; //10million vector<timespec> functorTimes; vector<timespec> functionTimes; functorTimes.reserve(LOOPS); functionTimes.reserve(LOOPS); /* timer */ cout << "Functor Function" << endl; for (int i = 0; i < LOOPS; ++i) { vector<double> functorData(NUM_DATA); generate(functorData.begin(), functorData.end(), RandomDoubleGenerator()); //create a copy of our random data vector for use with our function sort() //this way both methods are using the exact same data set vector<double> functionData(functorData); /* functor */ struct timespec functorElapsed = time(sortFunctor, functorData); functorTimes.push_back(functorElapsed); /* function */ struct timespec functionElapsed = time(sortFunction, functionData); functionTimes.push_back(functionElapsed); printElapsed(functorElapsed, functionElapsed); } printAverages(functorTimes, functionTimes); return 0; }
Worst Case
I compiled this code using g++ 4.5.2 and the following compiler options (no optimizations):
g++ -lrt -ansi -pedantic -g -Wall
Below are my results (in seconds):
Functor Function
7.439764495 7.755845935
7.735696621 7.713312266
7.587360216 7.501062211
7.576613316 8.058204506
7.539563825 7.602603278
7.471405721 7.776704292
7.750183958 7.940997362
7.721883637 7.679431332
7.470727771 7.681166662
7.588803904 7.776486039
Average
7.588200346 7.748581388
As you can see the function and functor sort times are almost equal (functors are only 2% faster) and do not line up with the 50% worst case that Scott claims.
Best Case
I compiled the code again this time turning on full optimizations:
g++ -lrt -ansi -pedantic -O3 -Wall
Below are my results (in seconds):
Functor Function
1.530773436 2.230420939
1.487986836 2.240578317
1.576760542 2.280175380
1.521323717 2.334807950
1.499050688 2.203657561
1.518878620 2.272559906
1.581513080 2.212788678
1.563577081 2.393493630
1.616584657 2.313904654
1.581532295 2.291148088
Average
1.547798095 2.277353510
As you can see the call to std::sort() where you pass a regular function as the comparator is on average 0.729555415 seconds (47.1%) slower than the call to std::sort() using the functor as the comparator. This fairly close to what Scott claimed in his book as the worst case.
As Scott explains in his book, the reason for the performance increase is due to inlining. The compiler is able to inline the comparator functor but it is unable to inline the comparator function. This is because when you give the comparator function to sort() you are really giving a pointer to that function and the compiler is unable to inline the invokation of the function pointer. Where in the functor case the compiler will happily inline the operater() function in the functor and since you are passing the functor by value to sort() there is no pointer dereferencing.
Conclusion
I was unable to produce Scott's best case where functors performed 160% faster than their function counterparts, however i was able to produce an increase of 50% by simply using a functor over a function. Either way, there is no doubt that functors do produce more performant code than their function counterparts, but compiler optimizations need to be turned on in order to notice any real difference.
No comments:
Post a Comment