pkgsrc-Users archive


Re: BLAS: Does cblas interface work with BLAS types other than netlib



On Sat, 1 Oct 2022 08:45:22 +0530,
Mayuresh <mayuresh%acm.org@localhost> wrote:

> So
>     - pthread openblas seems to be performing better than openmp for MT
>       but spending much higher sys time
> 
>     - single threaded stock blas of dlib seems to be performing better
>       than either single or multi-threaded openblas in either mode
> 
> Workload details:
> 
>     - A convolutional neural network with window size 5x5, 1.5 million
>       samples, 6 dense layers, 10000 iterations

I am no ML expert, so I cannot judge which kinds of linear algebra
operations this maps to, but the numbers suggest that there were not
many matrix operations that would benefit from BLAS optimization at
all. You got some modest scaling with thread count, but the plain
netlib BLAS being faster usually means that you just have small
matrices and vectors (which would be best served by writing the loops
directly instead of calling out to BLAS).
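
To put a rough number on the small-matrix case, here is a minimal
Python sketch. The shapes are assumptions (a 5x5 product borrowed from
the window size, and a 1000x1000 product as a contrast), and the helper
names gemm_flops and matmul_loops are made up for illustration; none of
this is taken from the actual workload or from dlib.

```python
# Sketch only: the shapes used below are assumptions, not measurements
# of the real workload.

def gemm_flops(m, n, k):
    """Floating-point operations for C(m x n) += A(m x k) * B(k x n)."""
    # One multiply and one add per (i, j, p) triple.
    return 2 * m * n * k


def matmul_loops(a, b):
    """Plain triple loop over nested lists, without any BLAS call."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            aip = a[i][p]  # reuse this element across the inner loop
            for j in range(n):
                c[i][j] += aip * b[p][j]
    return c


if __name__ == "__main__":
    # Hypothetical shapes: a tiny window-sized product and a large one.
    for m, n, k in [(5, 5, 5), (1000, 1000, 1000)]:
        print(f"{m}x{n}x{k}: {gemm_flops(m, n, k)} flops per call")
```

A 5x5x5 product comes to 250 floating-point operations, so any fixed
cost per library call or per thread wake-up is likely to weigh far more
there than in the large case.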

Do you happen to know more about the structure and size of your algebra
operations?


Alrighty then,

Thomas

-- 
Dr. Thomas Orgis
HPC @ Universität Hamburg

