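These are the profiling data from my 1.4 GHz Athlon. Notice the *huge*
speedups when multiplying large non-integer matrices: for the 1000x1000
A*B product, dotblas.dot is roughly 20 to 60 times faster than Numeric.dot,
depending on the typecode. For integer matrices (typecode l) there is no
gain, and dotblas.dot is slower on short vectors.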
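
The size of those speedups can be read off the 1000x1000 A*B column of the
tables in this file. The short sketch below only does that arithmetic; the
numbers are copied from the tables, while the names `timings` and `speedup`
are illustrative and not part of profileDot.py.

```python
# A*B timings for 1000x1000 matrices, copied from the tables in this file.
# Mapping: typecode -> (dotblas.dot time, Numeric.dot time)
timings = {
    "D": (40.56162, 895.69399),
    "l": (453.25659, 452.55097),
    "d": (10.12871, 595.59559),
    "f": (6.53564, 415.52507),
}


def speedup(blas_time, numeric_time):
    """How many times faster dotblas.dot is than Numeric.dot."""
    return numeric_time / blas_time


speedups = {code: speedup(*pair) for code, pair in timings.items()}

for code, ratio in speedups.items():
    print(f"typecode {code}: Numeric.dot / dotblas.dot = {ratio:.1f}")
```

A ratio near 1 (as for typecode l) means the two functions take the same
time; larger ratios mean dotblas.dot is faster.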


Explanation:
------------

dotblas.dot          == BLAS dot with a Python wrapper
Numeric.dot          == matrix product with its Numeric wrapper

Interpretation of the timings:

Each entry is the time in seconds for the following number of calls:

a*b is called 1000 times
A*a is called 100 times
A*B is called 10 times
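
The call counts above suggest a simple timing loop. profileDot.py itself is
not reproduced here, so the following is only a minimal sketch of how such a
measurement could be made, assuming wall-clock timing. The helpers
`naive_dot`, `time_calls`, and `make_matrix` are hypothetical, and the
pure-Python `naive_dot` merely stands in for dotblas.dot or Numeric.dot.

```python
import time


def naive_dot(a, b):
    """Pure-Python matrix product of two lists of rows (stand-in for a real dot)."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def time_calls(func, args, repeat):
    """Return the total wall-clock seconds taken by `repeat` calls of func(*args)."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(*args)
    return time.perf_counter() - start


def make_matrix(rows, cols, value=1.0):
    """Build a rows x cols matrix filled with `value`."""
    return [[value] * cols for _ in range(rows)]


n = 10
row_vec = make_matrix(1, n)   # in this sketch a*b is written as (1 x n) * (n x 1)
col_vec = make_matrix(n, 1)
A = make_matrix(n, n)

# Same call counts as listed above: 1000, 100, and 10 calls.
cases = {
    "a*b": ((row_vec, col_vec), 1000),
    "A*a": ((A, col_vec), 100),
    "A*B": ((A, A), 10),
}

results = {name: time_calls(naive_dot, args, repeat)
           for name, (args, repeat) in cases.items()}

for name, seconds in results.items():
    print(f"{name}: {seconds:.5f}")
```

To compare two implementations, the same `cases` would be timed once per
function and once per matrix size (10, 100, 1000), giving one table row each.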

TYPECODE: D (double-precision complex)
======================================


Function   | a*b (10x1 * 10x1)  | A*a (10x10 * 10x1) | A*B (10x10 * 10x10)
-----------+--------------------+--------------------+--------------------
dotblas.dot|    0.00631         |    0.00084         |    0.00017
-----------+--------------------+--------------------+--------------------
Numeric.dot|    0.00604         |    0.00073         |    0.00019
-----------+--------------------+--------------------+--------------------

Function   | a*b (100x1 * 100x1)    | A*a (100x100 * 100x1)  | A*B (100x100 * 100x100)
-----------+------------------------+------------------------+------------------------
dotblas.dot|    0.00712             |    0.00635             |    0.06226
-----------+------------------------+------------------------+------------------------
Numeric.dot|    0.00703             |    0.00912             |    0.11361
-----------+------------------------+------------------------+------------------------

Function   | a*b (1000x1 * 1000x1)      | A*a (1000x1000 * 1000x1)   | A*B (1000x1000 * 1000x1000)
-----------+----------------------------+----------------------------+----------------------------
dotblas.dot|    0.01357                 |    2.99075                 |   40.56162
-----------+----------------------------+----------------------------+----------------------------
Numeric.dot|    0.01310                 |    4.70348                 |  895.69399
-----------+----------------------------+----------------------------+----------------------------

TYPECODE: l (long integer)
==========================


Function   | a*b (10x1 * 10x1)  | A*a (10x10 * 10x1) | A*B (10x10 * 10x10)
-----------+--------------------+--------------------+--------------------
dotblas.dot|    0.02385         |    0.00244         |    0.00032
-----------+--------------------+--------------------+--------------------
Numeric.dot|    0.00551         |    0.00069         |    0.00013
-----------+--------------------+--------------------+--------------------

Function   | a*b (100x1 * 100x1)    | A*a (100x100 * 100x1)  | A*B (100x100 * 100x100)
-----------+------------------------+------------------------+------------------------
dotblas.dot|    0.02398             |    0.00678             |    0.04120
-----------+------------------------+------------------------+------------------------
Numeric.dot|    0.00605             |    0.00475             |    0.04067
-----------+------------------------+------------------------+------------------------

Function   | a*b (1000x1 * 1000x1)      | A*a (1000x1000 * 1000x1)   | A*B (1000x1000 * 1000x1000)
-----------+----------------------------+----------------------------+----------------------------
dotblas.dot|    0.02854                 |    1.38042                 |  453.25659
-----------+----------------------------+----------------------------+----------------------------
Numeric.dot|    0.00958                 |    1.46957                 |  452.55097
-----------+----------------------------+----------------------------+----------------------------

TYPECODE: d (double-precision float)
====================================


Function   | a*b (10x1 * 10x1)  | A*a (10x10 * 10x1) | A*B (10x10 * 10x10)
-----------+--------------------+--------------------+--------------------
dotblas.dot|    0.00570         |    0.00080         |    0.00012
-----------+--------------------+--------------------+--------------------
Numeric.dot|    0.00546         |    0.00070         |    0.00015
-----------+--------------------+--------------------+--------------------

Function   | a*b (100x1 * 100x1)    | A*a (100x100 * 100x1)  | A*B (100x100 * 100x100)
-----------+------------------------+------------------------+------------------------
dotblas.dot|    0.00567             |    0.00327             |    0.03053
-----------+------------------------+------------------------+------------------------
Numeric.dot|    0.00578             |    0.00467             |    0.04665
-----------+------------------------+------------------------+------------------------

Function   | a*b (1000x1 * 1000x1)      | A*a (1000x1000 * 1000x1)   | A*B (1000x1000 * 1000x1000)
-----------+----------------------------+----------------------------+----------------------------
dotblas.dot|    0.00719                 |    1.46835                 |   10.12871
-----------+----------------------------+----------------------------+----------------------------
Numeric.dot|    0.00840                 |    2.17318                 |  595.59559
-----------+----------------------------+----------------------------+----------------------------

TYPECODE: f (single-precision float)
====================================


Function   | a*b (10x1 * 10x1)  | A*a (10x10 * 10x1) | A*B (10x10 * 10x10)
-----------+--------------------+--------------------+--------------------
dotblas.dot|    0.00728         |    0.00068         |    0.00012
-----------+--------------------+--------------------+--------------------
Numeric.dot|    0.00679         |    0.00068         |    0.00015
-----------+--------------------+--------------------+--------------------

Function   | a*b (100x1 * 100x1)    | A*a (100x100 * 100x1)  | A*B (100x100 * 100x100)
-----------+------------------------+------------------------+------------------------
dotblas.dot|    0.00720             |    0.00256             |    0.00534
-----------+------------------------+------------------------+------------------------
Numeric.dot|    0.00705             |    0.00411             |    0.03423
-----------+------------------------+------------------------+------------------------

Function   | a*b (1000x1 * 1000x1)      | A*a (1000x1000 * 1000x1)   | A*B (1000x1000 * 1000x1000)
-----------+----------------------------+----------------------------+----------------------------
dotblas.dot|    0.01009                 |    0.74370                 |    6.53564
-----------+----------------------------+----------------------------+----------------------------
Numeric.dot|    0.00969                 |    1.18666                 |  415.52507
-----------+----------------------------+----------------------------+----------------------------

Total run time of the profiling script:

python profileDot.py  2638.80s user 73.46s system 93% cpu 48:19.31 total

