The C version of IMSL apparently does not optimize memory access when multiplying matrices with imsl_{d,f}_mat_mul_rect(). C uses row-major ordering, i.e., matrices are stored by row, so a matrix multiplication is fastest when the inner loop moves along the rows of the matrices involved, reading consecutive memory locations. In a plain A*B product the inner loop instead has to walk down a column of B, which jumps through memory, and IMSL does not appear to avoid this unless you use the ``A*trans(B)'' option. To get around this, create a new matrix that is the transpose of your B matrix and call the IMSL function on A and that new matrix with the ``A*trans(B)'' option. The result is still A*B, but the computation should be much faster.
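
The sketch below illustrates the idea in plain C rather than with IMSL itself. The helper names transpose_copy() and mul_a_transb() are hypothetical and are not part of IMSL; mul_a_transb() only stands in for what the ``A*trans(B)'' option computes. With IMSL you would pass the transposed copy to imsl_d_mat_mul_rect() with the ``A*trans(B)'' string, and the exact argument list should be checked against the IMSL documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not an IMSL function): return a newly allocated
   row-major copy of trans(b), where b is m x n. The result is n x m.
   The caller must free() it. Returns NULL if allocation fails. */
double *transpose_copy(const double *b, size_t m, size_t n)
{
    assert(b != NULL);
    double *bt = malloc(m * n * sizeof *bt);
    if (bt == NULL)
        return NULL;
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            bt[j * m + i] = b[i * n + j];
    return bt;
}

/* Hypothetical stand-in for the "A*trans(B)" computation:
   c = a * trans(bt), with a of size n x k, bt of size p x k, and c of
   size n x p, all row-major. The inner loop moves along row i of a and
   row j of bt, so both are read from consecutive memory locations. */
void mul_a_transb(const double *a, const double *bt, double *c,
                  size_t n, size_t k, size_t p)
{
    assert(a != NULL && bt != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < p; j++) {
            double sum = 0.0;
            for (size_t l = 0; l < k; l++)
                sum += a[i * k + l] * bt[j * k + l];
            c[i * p + j] = sum;
        }
    }
}
```

To form A*B for an n x k matrix A and a k x p matrix B, call transpose_copy(B, k, p) once and pass the result as bt. The transpose is a one-time pass over B, which is small compared with the multiplication itself for large matrices.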

Chris Paciorek 2012-01-21