added Intel Core i5-6500;
changed results view of gcbd benchmark to relative performance gain; changed reference CPU and GPU; target architecture detection fixes
430
README.md
@ -1,9 +1,13 @@
|
|||||||
# BLAS libraries benchmarks
|
# BLAS libraries benchmarks
|
||||||
Andrzej Wójtowicz
|
Andrzej Wójtowicz
|
||||||
|
|
||||||
[![DOI](https://zenodo.org/badge/22705/andre-wojtowicz/blas-benchmarks.svg)](https://dx.doi.org/10.5281/zenodo.55662)
|
Document generation date: 2016-07-14 17:20:41
|
||||||
|
|
||||||
Document generation date: 2016-06-06 22:40:18
|
This document presents timing results for BLAS ([Basic Linear Algebra Subprograms](https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms)) libraries in [R](https://en.wikipedia.org/wiki/R_(programming_language)) on diverse CPUs and GPUs.
|
||||||
|
|
||||||
|
### Changelog
|
||||||
|
|
||||||
|
* 2016-07-14: **results:** added Intel Core i5-6500; changed results view of gcbd benchmark to relative performance gain; changed reference CPU (Intel Pentium Dual-Core E5300) and GPU (NVIDIA GeForce GT 630M); **code:** fixed target architecture detection for Intel Core i5-6500-like CPUs in multi-threaded ATLAS library; added info on how to force target architecture in GotoBLAS2 and BLIS libraries.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -14,6 +18,7 @@ Document generation date: 2016-06-06 22:40:18
|
|||||||
* [Intel Core i7-4790K + MSI GeForce GTX 980 Ti Lightning](#intel-core-i7-4790k--msi-geforce-gtx-980-ti-lightning)
|
* [Intel Core i7-4790K + MSI GeForce GTX 980 Ti Lightning](#intel-core-i7-4790k--msi-geforce-gtx-980-ti-lightning)
|
||||||
* [Intel Core i5-4590 + NVIDIA GeForce GT 430](#intel-core-i5-4590--nvidia-geforce-gt-430)
|
* [Intel Core i5-4590 + NVIDIA GeForce GT 430](#intel-core-i5-4590--nvidia-geforce-gt-430)
|
||||||
* [Intel Core i5-4590 + NVIDIA GeForce GTX 750 Ti](#intel-core-i5-4590--nvidia-geforce-gtx-750-ti)
|
* [Intel Core i5-4590 + NVIDIA GeForce GTX 750 Ti](#intel-core-i5-4590--nvidia-geforce-gtx-750-ti)
|
||||||
|
* [Intel Core i5-6500](#intel-core-i5-6500)
|
||||||
* [Intel Core i5-3570](#intel-core-i5-3570)
|
* [Intel Core i5-3570](#intel-core-i5-3570)
|
||||||
* [Intel Core i3-2120](#intel-core-i3-2120)
|
* [Intel Core i3-2120](#intel-core-i3-2120)
|
||||||
* [Intel Core i3-3120M](#intel-core-i3-3120m)
|
* [Intel Core i3-3120M](#intel-core-i3-3120m)
|
||||||
@ -33,7 +38,7 @@ Document generation date: 2016-06-06 22:40:18
|
|||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
|
|
||||||
**OS**: Debian Jessie, kernel 4.4
|
**OS**: [Debian](https://www.debian.org/) Jessie, kernel 4.4
|
||||||
|
|
||||||
**R software**: [Microsoft R Open](https://mran.microsoft.com/open/) (3.2.4)
|
**R software**: [Microsoft R Open](https://mran.microsoft.com/open/) (3.2.4)
|
||||||
|
|
||||||
@ -54,11 +59,12 @@ Document generation date: 2016-06-06 22:40:18
|
|||||||
|1.|[Intel Core i7-4790K](http://ark.intel.com/products/80807/Intel-Core-i7-4790K-Processor-8M-Cache-up-to-4_40-GHz) (OC 4.5 GHz)|[MSI GeForce GTX 980 Ti Lightning](https://us.msi.com/Graphics-card/GTX-980-Ti-LIGHTNING.html#hero-specification)|
|
|1.|[Intel Core i7-4790K](http://ark.intel.com/products/80807/Intel-Core-i7-4790K-Processor-8M-Cache-up-to-4_40-GHz) (OC 4.5 GHz)|[MSI GeForce GTX 980 Ti Lightning](https://us.msi.com/Graphics-card/GTX-980-Ti-LIGHTNING.html#hero-specification)|
|
||||||
|2.|[Intel Core i5-4590](http://ark.intel.com/products/80815/Intel-Core-i5-4590-Processor-6M-Cache-up-to-3_70-GHz)|[NVIDIA GeForce GT 430](http://www.geforce.com/hardware/desktop-gpus/geforce-gt-430/specifications)|
|
|2.|[Intel Core i5-4590](http://ark.intel.com/products/80815/Intel-Core-i5-4590-Processor-6M-Cache-up-to-3_70-GHz)|[NVIDIA GeForce GT 430](http://www.geforce.com/hardware/desktop-gpus/geforce-gt-430/specifications)|
|
||||||
|3.|[Intel Core i5-4590](http://ark.intel.com/products/80815/Intel-Core-i5-4590-Processor-6M-Cache-up-to-3_70-GHz)|[NVIDIA GeForce GTX 750 Ti](http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-750-ti/specifications)|
|
|3.|[Intel Core i5-4590](http://ark.intel.com/products/80815/Intel-Core-i5-4590-Processor-6M-Cache-up-to-3_70-GHz)|[NVIDIA GeForce GTX 750 Ti](http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-750-ti/specifications)|
|
||||||
|4.|[Intel Core i5-3570](http://ark.intel.com/products/65702/Intel-Core-i5-3570-Processor-6M-Cache-up-to-3_80-GHz)| - |
|
|4.|[Intel Core i5-6500](http://ark.intel.com/products/88184/Intel-Core-i5-6500-Processor-6M-Cache-up-to-3_60-GHz)| - |
|
||||||
|5.|[Intel Core i3-2120](http://ark.intel.com/products/53426/Intel-Core-i3-2120-Processor-3M-Cache-3_30-GHz)| - |
|
|5.|[Intel Core i5-3570](http://ark.intel.com/products/65702/Intel-Core-i5-3570-Processor-6M-Cache-up-to-3_80-GHz)| - |
|
||||||
|6.|[Intel Core i3-3120M](http://ark.intel.com/products/71465/Intel-Core-i3-3120M-Processor-3M-Cache-2_50-GHz)| - |
|
|6.|[Intel Core i3-2120](http://ark.intel.com/products/53426/Intel-Core-i3-2120-Processor-3M-Cache-3_30-GHz)| - |
|
||||||
|7.|[Intel Core i5-3317U](http://ark.intel.com/products/65707/Intel-Core-i5-3317U-Processor-3M-Cache-up-to-2_60-GHz)|[NVIDIA GeForce GT 630M](http://www.geforce.com/hardware/notebook-gpus/geforce-gt-630m/specifications)|
|
|7.|[Intel Core i3-3120M](http://ark.intel.com/products/71465/Intel-Core-i3-3120M-Processor-3M-Cache-2_50-GHz)| - |
|
||||||
|8.|[Intel Pentium Dual-Core E5300](http://ark.intel.com/products/35300/Intel-Pentium-Processor-E5300-2M-Cache-2_60-GHz-800-MHz-FSB)| - |
|
|8.|[Intel Core i5-3317U](http://ark.intel.com/products/65707/Intel-Core-i5-3317U-Processor-3M-Cache-up-to-2_60-GHz)|[NVIDIA GeForce GT 630M](http://www.geforce.com/hardware/notebook-gpus/geforce-gt-630m/specifications)|
|
||||||
|
|9.|[Intel Pentium Dual-Core E5300](http://ark.intel.com/products/35300/Intel-Pentium-Processor-E5300-2M-Cache-2_60-GHz-800-MHz-FSB)| - |
|
||||||
|
|
||||||
**Benchmarks**: [R-benchmark-25](http://r.research.att.com/benchmarks/R-benchmark-25.R), [Revolution](https://gist.github.com/andrie/24c9672f1ea39af89c66#file-rro-mkl-benchmark-r), [Gcbd](https://cran.r-project.org/web/packages/gcbd/vignettes/gcbd.pdf).
|
**Benchmarks**: [R-benchmark-25](http://r.research.att.com/benchmarks/R-benchmark-25.R), [Revolution](https://gist.github.com/andrie/24c9672f1ea39af89c66#file-rro-mkl-benchmark-r), [Gcbd](https://cran.r-project.org/web/packages/gcbd/vignettes/gcbd.pdf).
|
||||||
|
|
||||||
@ -176,33 +182,33 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h1_b3_t1.png)![](gen/img/img_ph_h1_b3_t1b.png)
|
![](gen/img/img_ph_h1_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h1_b3_t2.png)![](gen/img/img_ph_h1_b3_t2b.png)
|
![](gen/img/img_ph_h1_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h1_b3_t3.png)![](gen/img/img_ph_h1_b3_t3b.png)
|
![](gen/img/img_ph_h1_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h1_b3_t4.png)![](gen/img/img_ph_h1_b3_t4b.png)
|
![](gen/img/img_ph_h1_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -314,33 +320,33 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h2_b3_t1.png)![](gen/img/img_ph_h2_b3_t1b.png)
|
![](gen/img/img_ph_h2_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h2_b3_t2.png)![](gen/img/img_ph_h2_b3_t2b.png)
|
![](gen/img/img_ph_h2_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h2_b3_t3.png)![](gen/img/img_ph_h2_b3_t3b.png)
|
![](gen/img/img_ph_h2_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h2_b3_t4.png)![](gen/img/img_ph_h2_b3_t4b.png)
|
![](gen/img/img_ph_h2_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -452,37 +458,37 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h3_b3_t1.png)![](gen/img/img_ph_h3_b3_t1b.png)
|
![](gen/img/img_ph_h3_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h3_b3_t2.png)![](gen/img/img_ph_h3_b3_t2b.png)
|
![](gen/img/img_ph_h3_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h3_b3_t3.png)![](gen/img/img_ph_h3_b3_t3b.png)
|
![](gen/img/img_ph_h3_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h3_b3_t4.png)![](gen/img/img_ph_h3_b3_t4b.png)
|
![](gen/img/img_ph_h3_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
## Intel Core i5-3570
|
## Intel Core i5-6500
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -590,37 +596,37 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h4_b3_t1.png)![](gen/img/img_ph_h4_b3_t1b.png)
|
![](gen/img/img_ph_h4_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h4_b3_t2.png)![](gen/img/img_ph_h4_b3_t2b.png)
|
![](gen/img/img_ph_h4_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h4_b3_t3.png)![](gen/img/img_ph_h4_b3_t3b.png)
|
![](gen/img/img_ph_h4_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h4_b3_t4.png)![](gen/img/img_ph_h4_b3_t4b.png)
|
![](gen/img/img_ph_h4_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
## Intel Core i3-2120
|
## Intel Core i5-3570
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -728,37 +734,37 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h5_b3_t1.png)![](gen/img/img_ph_h5_b3_t1b.png)
|
![](gen/img/img_ph_h5_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h5_b3_t2.png)![](gen/img/img_ph_h5_b3_t2b.png)
|
![](gen/img/img_ph_h5_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h5_b3_t3.png)![](gen/img/img_ph_h5_b3_t3b.png)
|
![](gen/img/img_ph_h5_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h5_b3_t4.png)![](gen/img/img_ph_h5_b3_t4b.png)
|
![](gen/img/img_ph_h5_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
## Intel Core i3-3120M
|
## Intel Core i3-2120
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -866,37 +872,37 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h6_b3_t1.png)![](gen/img/img_ph_h6_b3_t1b.png)
|
![](gen/img/img_ph_h6_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h6_b3_t2.png)![](gen/img/img_ph_h6_b3_t2b.png)
|
![](gen/img/img_ph_h6_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h6_b3_t3.png)![](gen/img/img_ph_h6_b3_t3b.png)
|
![](gen/img/img_ph_h6_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h6_b3_t4.png)![](gen/img/img_ph_h6_b3_t4b.png)
|
![](gen/img/img_ph_h6_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
## Intel Core i5-3317U + NVIDIA GeForce GT 630M
|
## Intel Core i3-3120M
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -1004,37 +1010,37 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h7_b3_t1.png)![](gen/img/img_ph_h7_b3_t1b.png)
|
![](gen/img/img_ph_h7_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h7_b3_t2.png)![](gen/img/img_ph_h7_b3_t2b.png)
|
![](gen/img/img_ph_h7_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h7_b3_t3.png)![](gen/img/img_ph_h7_b3_t3b.png)
|
![](gen/img/img_ph_h7_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h7_b3_t4.png)![](gen/img/img_ph_h7_b3_t4b.png)
|
![](gen/img/img_ph_h7_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
## Intel Pentium Dual-Core E5300
|
## Intel Core i5-3317U + NVIDIA GeForce GT 630M
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -1058,8 +1064,6 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Eigenvalues of a 600x600 random matrix
|
#### Eigenvalues of a 600x600 random matrix
|
||||||
|
|
||||||
BLIS hangs in this test
|
|
||||||
|
|
||||||
Time in seconds - 10 runs - lower is better
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h8_b1_t3.png)
|
![](gen/img/img_ph_h8_b1_t3.png)
|
||||||
@ -1144,33 +1148,173 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h8_b3_t1.png)![](gen/img/img_ph_h8_b3_t1b.png)
|
![](gen/img/img_ph_h8_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h8_b3_t2.png)![](gen/img/img_ph_h8_b3_t2b.png)
|
![](gen/img/img_ph_h8_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h8_b3_t3.png)![](gen/img/img_ph_h8_b3_t3b.png)
|
![](gen/img/img_ph_h8_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_ph_h8_b3_t4.png)![](gen/img/img_ph_h8_b3_t4b.png)
|
![](gen/img/img_ph_h8_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
## Intel Pentium Dual-Core E5300
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
### R-benchmark-25
|
||||||
|
|
||||||
|
#### 2800x2800 cross-product matrix
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b1_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Linear regr. over a 2000x2000 matrix
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b1_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Eigenvalues of a 600x600 random matrix
|
||||||
|
|
||||||
|
BLIS hangs in this test
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b1_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Determinant of a 2500x2500 random matrix
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b1_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Cholesky decomposition of a 3000x3000 matrix
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b1_t5.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Inverse of a 1600x1600 random matrix
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b1_t6.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Escoufier's method on a 45x45 matrix
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b1_t7.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
### Revolution benchmark
|
||||||
|
|
||||||
|
#### Matrix Multiply
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b2_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Cholesky Factorization
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b2_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b2_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Principal Components Analysis
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b2_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Linear Discriminant Analysis
|
||||||
|
|
||||||
|
Time in seconds - 10 runs - lower is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b2_t5.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
### Gcbd benchmark
|
||||||
|
|
||||||
|
#### Matrix Multiply
|
||||||
|
|
||||||
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### QR Decomposition
|
||||||
|
|
||||||
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
#### Triangular Decomposition
|
||||||
|
|
||||||
|
Performance gain regarding matrix size - reference: Netlib - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
|
![](gen/img/img_ph_h9_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -1285,33 +1429,33 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l1_b3_t1.png)![](gen/img/img_pl_l1_b3_t1b.png)
|
![](gen/img/img_pl_l1_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l1_b3_t2.png)![](gen/img/img_pl_l1_b3_t2b.png)
|
![](gen/img/img_pl_l1_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l1_b3_t3.png)![](gen/img/img_pl_l1_b3_t3b.png)
|
![](gen/img/img_pl_l1_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l1_b3_t4.png)![](gen/img/img_pl_l1_b3_t4b.png)
|
![](gen/img/img_pl_l1_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -1423,33 +1567,33 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l2_b3_t1.png)![](gen/img/img_pl_l2_b3_t1b.png)
|
![](gen/img/img_pl_l2_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l2_b3_t2.png)![](gen/img/img_pl_l2_b3_t2b.png)
|
![](gen/img/img_pl_l2_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l2_b3_t3.png)![](gen/img/img_pl_l2_b3_t3b.png)
|
![](gen/img/img_pl_l2_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l2_b3_t4.png)![](gen/img/img_pl_l2_b3_t4b.png)
|
![](gen/img/img_pl_l2_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -1561,33 +1705,33 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l3_b3_t1.png)![](gen/img/img_pl_l3_b3_t1b.png)
|
![](gen/img/img_pl_l3_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l3_b3_t2.png)![](gen/img/img_pl_l3_b3_t2b.png)
|
![](gen/img/img_pl_l3_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l3_b3_t3.png)![](gen/img/img_pl_l3_b3_t3b.png)
|
![](gen/img/img_pl_l3_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l3_b3_t4.png)![](gen/img/img_pl_l3_b3_t4b.png)
|
![](gen/img/img_pl_l3_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -1699,33 +1843,33 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l4_b3_t1.png)![](gen/img/img_pl_l4_b3_t1b.png)
|
![](gen/img/img_pl_l4_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l4_b3_t2.png)![](gen/img/img_pl_l4_b3_t2b.png)
|
![](gen/img/img_pl_l4_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l4_b3_t3.png)![](gen/img/img_pl_l4_b3_t3b.png)
|
![](gen/img/img_pl_l4_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l4_b3_t4.png)![](gen/img/img_pl_l4_b3_t4b.png)
|
![](gen/img/img_pl_l4_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -1837,33 +1981,33 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l5_b3_t1.png)![](gen/img/img_pl_l5_b3_t1b.png)
|
![](gen/img/img_pl_l5_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l5_b3_t2.png)![](gen/img/img_pl_l5_b3_t2b.png)
|
![](gen/img/img_pl_l5_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l5_b3_t3.png)![](gen/img/img_pl_l5_b3_t3b.png)
|
![](gen/img/img_pl_l5_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l5_b3_t4.png)![](gen/img/img_pl_l5_b3_t4b.png)
|
![](gen/img/img_pl_l5_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -1975,33 +2119,33 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l6_b3_t1.png)![](gen/img/img_pl_l6_b3_t1b.png)
|
![](gen/img/img_pl_l6_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l6_b3_t2.png)![](gen/img/img_pl_l6_b3_t2b.png)
|
![](gen/img/img_pl_l6_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l6_b3_t3.png)![](gen/img/img_pl_l6_b3_t3b.png)
|
![](gen/img/img_pl_l6_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l6_b3_t4.png)![](gen/img/img_pl_l6_b3_t4b.png)
|
![](gen/img/img_pl_l6_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -2115,33 +2259,33 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l7_b3_t1.png)![](gen/img/img_pl_l7_b3_t1b.png)
|
![](gen/img/img_pl_l7_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l7_b3_t2.png)![](gen/img/img_pl_l7_b3_t2b.png)
|
![](gen/img/img_pl_l7_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l7_b3_t3.png)![](gen/img/img_pl_l7_b3_t3b.png)
|
![](gen/img/img_pl_l7_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: Intel Pentium Dual-Core E5300 - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l7_b3_t4.png)![](gen/img/img_pl_l7_b3_t4b.png)
|
![](gen/img/img_pl_l7_b3_t4.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -2253,31 +2397,31 @@ Time in seconds - 10 runs - lower is better
|
|||||||
|
|
||||||
#### Matrix Multiply
|
#### Matrix Multiply
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: NVIDIA GeForce GT 630M - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l8_b3_t1.png)![](gen/img/img_pl_l8_b3_t1b.png)
|
![](gen/img/img_pl_l8_b3_t1.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### QR Decomposition
|
#### QR Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: NVIDIA GeForce GT 630M - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l8_b3_t2.png)![](gen/img/img_pl_l8_b3_t2b.png)
|
![](gen/img/img_pl_l8_b3_t2.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Singular Value Decomposition
|
#### Singular Value Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: NVIDIA GeForce GT 630M - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l8_b3_t3.png)![](gen/img/img_pl_l8_b3_t3b.png)
|
![](gen/img/img_pl_l8_b3_t3.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#### Triangular Decomposition
|
#### Triangular Decomposition
|
||||||
|
|
||||||
Time in seconds regarding matrix size - right panel on log scale - from 50 to 5 runs - lower is better
|
Performance gain regarding matrix size - reference: NVIDIA GeForce GT 630M - from 50 to 5 runs - higher is better
|
||||||
|
|
||||||
![](gen/img/img_pl_l8_b3_t4.png)![](gen/img/img_pl_l8_b3_t4b.png)
|
![](gen/img/img_pl_l8_b3_t4.png)
|
||||||
|
|
||||||
|