Development Linpack
From i2basque
Linpack is a set of benchmarks used to evaluate the speed of a machine, or of a cluster of machines working together to solve one complex problem.
For the development infrastructure we have tested several configurations.
The first cluster configuration we tried used 42 processes, combining the 9 dual Xeon HT computing nodes with the 4-way Xeon HT SMP. The result was not very good because the 4-way SMP machine suffers from memory contention: when more than two processes use the memory system intensively, the memory bus becomes saturated, so there is no benefit in using all 4 processors at the same time for a memory-intensive task. Furthermore, this machine has HT (Hyper-Threading), so it exposes 8 virtual processors; when all of them are used at the same time, the whole machine becomes saturated.
fe:/home/ridruejo/linpack_9.1/benchmarks/mp_linpack/bin_intel/ia32# /opt/mpich-intel32/bin/mpirun -np 42 -nolocal -machinefile machines.LINUX xhpl
============================================================================
HPLinpack 1.0a -- High-Performance Linpack benchmark -- January 20, 2004
Written by A. Petitet and R. Clint Whaley, Innovative Computing Labs., UTK
============================================================================
An explanation of the input/output parameters follows:
T/V : Wall time / encoded variant.
N : The order of the coefficient matrix A.
NB : The partitioning blocking factor.
P : The number of process rows.
Q : The number of process columns.
Time : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.
The following parameter values will be used:
N : 1000
NB : 112 120
PMAP : Row-major process mapping
P : 1 2 1 4
Q : 1 2 4 1
PFACT : Left
NBMIN : 4 2
NDIV : 2
RFACT : Crout
BCAST : 1ring
DEPTH : 0
SWAP : Mix (threshold = 256)
L1 : no-transposed form
U : no-transposed form
EQUIL : no
ALIGN : 8 double precision words
----------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
1) ||Ax-b||_oo / ( eps * ||A||_1 * N )
2) ||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 )
3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
Column=000112 Fraction=0.005 Mflops= 959.38
Column=000224 Fraction=0.010 Mflops= 1184.56
Column=000336 Fraction=0.015 Mflops= 1290.66
Column=000448 Fraction=0.020 Mflops= 1340.17
Column=000560 Fraction=0.025 Mflops= 1362.25
Column=000672 Fraction=0.030 Mflops= 1368.36
Column=000784 Fraction=0.035 Mflops= 1367.02
Column=000896 Fraction=0.040 Mflops= 1337.38
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 112 1 1 1.51 4.434e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0995348 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0266872 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0064474 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1864.77
Column=000224 Fraction=0.010 Mflops= 1858.83
Column=000336 Fraction=0.015 Mflops= 1840.08
Column=000448 Fraction=0.020 Mflops= 1819.79
Column=000560 Fraction=0.025 Mflops= 1800.01
Column=000672 Fraction=0.030 Mflops= 1781.96
Column=000784 Fraction=0.035 Mflops= 1766.51
Column=000896 Fraction=0.040 Mflops= 1755.79
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 112 1 1 0.38 1.740e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.2850990 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0311911 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0075355 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1872.84
Column=000240 Fraction=0.010 Mflops= 1851.93
Column=000360 Fraction=0.015 Mflops= 1835.65
Column=000480 Fraction=0.020 Mflops= 1817.57
Column=000600 Fraction=0.025 Mflops= 1797.45
Column=000720 Fraction=0.030 Mflops= 1777.35
Column=000840 Fraction=0.035 Mflops= 1763.95
Column=000960 Fraction=0.040 Mflops= 1753.87
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 120 1 1 0.38 1.743e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.1472619 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0278456 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0067273 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1850.80
Column=000240 Fraction=0.010 Mflops= 1828.61
Column=000360 Fraction=0.015 Mflops= 1820.65
Column=000480 Fraction=0.020 Mflops= 1805.14
Column=000600 Fraction=0.025 Mflops= 1787.19
Column=000720 Fraction=0.030 Mflops= 1767.53
Column=000840 Fraction=0.035 Mflops= 1754.48
Column=000960 Fraction=0.040 Mflops= 1744.07
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 120 1 1 0.39 1.733e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.1131697 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0270181 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0065274 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 185.51
Column=000224 Fraction=0.010 Mflops= 298.62
Column=000336 Fraction=0.015 Mflops= 377.33
Column=000448 Fraction=0.020 Mflops= 414.15
Column=000560 Fraction=0.025 Mflops= 436.53
Column=000672 Fraction=0.030 Mflops= 438.26
Column=000784 Fraction=0.035 Mflops= 436.83
Column=000896 Fraction=0.040 Mflops= 426.02
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 112 2 2 1.60 4.174e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0518234 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0255292 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0061677 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 2173.27
Column=000224 Fraction=0.010 Mflops= 1671.83
Column=000336 Fraction=0.015 Mflops= 1643.76
Column=000448 Fraction=0.020 Mflops= 1423.56
Column=000560 Fraction=0.025 Mflops= 1363.76
Column=000672 Fraction=0.030 Mflops= 1212.45
Column=000784 Fraction=0.035 Mflops= 1148.64
Column=000896 Fraction=0.040 Mflops= 1038.15
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 112 2 2 0.68 9.828e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.3327751 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0323482 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0078151 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 2206.44
Column=000240 Fraction=0.010 Mflops= 1603.51
Column=000360 Fraction=0.015 Mflops= 1588.95
Column=000480 Fraction=0.020 Mflops= 1354.08
Column=000600 Fraction=0.025 Mflops= 1298.48
Column=000720 Fraction=0.030 Mflops= 1165.45
Column=000840 Fraction=0.035 Mflops= 1103.29
Column=000960 Fraction=0.040 Mflops= 1014.14
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 120 2 2 0.67 9.908e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0718054 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0260142 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0062848 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 2278.31
Column=000240 Fraction=0.010 Mflops= 1699.55
Column=000360 Fraction=0.015 Mflops= 1666.59
Column=000480 Fraction=0.020 Mflops= 1445.90
Column=000600 Fraction=0.025 Mflops= 1382.41
Column=000720 Fraction=0.030 Mflops= 1232.95
Column=000840 Fraction=0.035 Mflops= 1157.03
Column=000960 Fraction=0.040 Mflops= 1059.86
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 120 2 2 0.65 1.035e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.2152906 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0294967 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0071262 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 3838.12
Column=000224 Fraction=0.010 Mflops= 1803.69
Column=000336 Fraction=0.015 Mflops= 1804.51
Column=000448 Fraction=0.020 Mflops= 1851.96
Column=000560 Fraction=0.025 Mflops= 1899.61
Column=000672 Fraction=0.030 Mflops= 1657.94
Column=000784 Fraction=0.035 Mflops= 1608.02
Column=000896 Fraction=0.040 Mflops= 1584.26
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 112 1 4 0.44 1.536e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0265450 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0249156 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0060194 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 4038.58
Column=000224 Fraction=0.010 Mflops= 1963.75
Column=000336 Fraction=0.015 Mflops= 1911.39
Column=000448 Fraction=0.020 Mflops= 1952.14
Column=000560 Fraction=0.025 Mflops= 1983.11
Column=000672 Fraction=0.030 Mflops= 1717.73
Column=000784 Fraction=0.035 Mflops= 1661.82
Column=000896 Fraction=0.040 Mflops= 1641.31
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 112 1 4 0.42 1.588e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.9533579 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0231393 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0055903 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 4420.85
Column=000240 Fraction=0.010 Mflops= 1957.70
Column=000360 Fraction=0.015 Mflops= 1912.77
Column=000480 Fraction=0.020 Mflops= 1932.53
Column=000600 Fraction=0.025 Mflops= 1950.90
Column=000720 Fraction=0.030 Mflops= 1679.96
Column=000840 Fraction=0.035 Mflops= 1621.81
Column=000960 Fraction=0.040 Mflops= 1594.93
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 120 1 4 0.43 1.558e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.9148384 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0222044 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0053644 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 4610.85
Column=000240 Fraction=0.010 Mflops= 1971.22
Column=000360 Fraction=0.015 Mflops= 1938.56
Column=000480 Fraction=0.020 Mflops= 1941.98
Column=000600 Fraction=0.025 Mflops= 1957.34
Column=000720 Fraction=0.030 Mflops= 1687.20
Column=000840 Fraction=0.035 Mflops= 1627.07
Column=000960 Fraction=0.040 Mflops= 1606.71
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 120 1 4 0.43 1.568e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.8638001 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0209656 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0050651 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 721.61
Column=000224 Fraction=0.010 Mflops= 795.85
Column=000336 Fraction=0.015 Mflops= 780.68
Column=000448 Fraction=0.020 Mflops= 749.05
Column=000560 Fraction=0.025 Mflops= 711.57
Column=000672 Fraction=0.030 Mflops= 662.26
Column=000784 Fraction=0.035 Mflops= 614.71
Column=000896 Fraction=0.040 Mflops= 570.99
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 112 4 1 1.25 5.339e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.9518852 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0231035 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0055816 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1157.45
Column=000224 Fraction=0.010 Mflops= 1044.79
Column=000336 Fraction=0.015 Mflops= 941.78
Column=000448 Fraction=0.020 Mflops= 871.08
Column=000560 Fraction=0.025 Mflops= 806.10
Column=000672 Fraction=0.030 Mflops= 734.65
Column=000784 Fraction=0.035 Mflops= 675.91
Column=000896 Fraction=0.040 Mflops= 621.38
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 112 4 1 1.16 5.762e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.1032506 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0267774 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0064692 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1165.59
Column=000240 Fraction=0.010 Mflops= 1022.97
Column=000360 Fraction=0.015 Mflops= 917.61
Column=000480 Fraction=0.020 Mflops= 841.15
Column=000600 Fraction=0.025 Mflops= 771.05
Column=000720 Fraction=0.030 Mflops= 698.46
Column=000840 Fraction=0.035 Mflops= 637.09
Column=000960 Fraction=0.040 Mflops= 583.12
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 120 4 1 1.18 5.673e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0821954 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0262663 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0063458 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1163.70
Column=000240 Fraction=0.010 Mflops= 1006.83
Column=000360 Fraction=0.015 Mflops= 906.14
Column=000480 Fraction=0.020 Mflops= 823.68
Column=000600 Fraction=0.025 Mflops= 757.82
Column=000720 Fraction=0.030 Mflops= 686.89
Column=000840 Fraction=0.035 Mflops= 627.65
Column=000960 Fraction=0.040 Mflops= 575.96
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 120 4 1 1.19 5.606e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.1617574 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0281974 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0068123 ...... PASSED
============================================================================
Finished 16 tests with the following results:
16 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.
----------------------------------------------------------------------------
End of Tests
Under the above conditions, the first test in the run (W00C2L4, NB = 112, P = Q = 1) reported 4.434e-01 Gflops, that is, 0.4434 Gflops. We repeated the tests without the 4-way SMP in the pool, using only the 9 dual Xeon HT computing nodes, and the same test improved to 4.810e-01 Gflops (0.4810 Gflops).
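The Gflops column can be cross-checked against the N and Time columns. The following is a minimal sketch, assuming HPL's usual operation count of 2/3 N^3 + 3/2 N^2 floating-point operations for an N x N solve; the function name `hpl_gflops` is hypothetical and used only for illustration.

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """Rate in Gflops implied by solving an n x n system in `seconds`.

    Assumes the usual HPL operation count: 2/3 n^3 + 3/2 n^2.
    """
    flops = (2.0 / 3.0) * n**3 + (3.0 / 2.0) * n**2
    return flops / seconds / 1e9


# First test of the 42-process run: N = 1000, Time = 1.51 s.
rate = hpl_gflops(1000, 1.51)
print(f"{rate:.4f} Gflops")  # about 0.4425
```

The value is close to the 4.434e-01 printed in the log; the small difference is expected because the Time column is rounded to two decimals.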
fe:/home/ridruejo/linpack_9.1/benchmarks/mp_linpack/bin_intel/ia32# /opt/mpich-intel32/bin/mpirun -np 36 -nolocal -machinefile machines.LINUX xhpl
========================================================================
HPLinpack 1.0a -- High-Performance Linpack benchmark -- January 20, 2004
Written by A. Petitet and R. Clint Whaley, Innovative Computing Labs., UTK
========================================================================
An explanation of the input/output parameters follows:
T/V : Wall time / encoded variant.
N : The order of the coefficient matrix A.
NB : The partitioning blocking factor.
P : The number of process rows.
Q : The number of process columns.
Time : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.
The following parameter values will be used:
N : 1000
NB : 112 120
PMAP : Row-major process mapping
P : 1 2 1 4
Q : 1 2 4 1
PFACT : Left
NBMIN : 4 2
NDIV : 2
RFACT : Crout
BCAST : 1ring
DEPTH : 0
SWAP : Mix (threshold = 256)
L1 : no-transposed form
U : no-transposed form
EQUIL : no
ALIGN : 8 double precision words
----------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
1) ||Ax-b||_oo / ( eps * ||A||_1 * N )
2) ||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 )
3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
Column=000112 Fraction=0.005 Mflops= 1839.83
Column=000224 Fraction=0.010 Mflops= 1819.24
Column=000336 Fraction=0.015 Mflops= 1814.08
Column=000448 Fraction=0.020 Mflops= 1799.30
Column=000560 Fraction=0.025 Mflops= 1786.27
Column=000672 Fraction=0.030 Mflops= 1769.99
Column=000784 Fraction=0.035 Mflops= 1756.98
Column=000896 Fraction=0.040 Mflops= 1747.02
========================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 112 1 1 1.39 4.810e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0995348 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0266872 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0064474 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1894.07
Column=000224 Fraction=0.010 Mflops= 1854.22
Column=000336 Fraction=0.015 Mflops= 1841.04
Column=000448 Fraction=0.020 Mflops= 1824.62
Column=000560 Fraction=0.025 Mflops= 1808.90
Column=000672 Fraction=0.030 Mflops= 1789.07
Column=000784 Fraction=0.035 Mflops= 1774.98
Column=000896 Fraction=0.040 Mflops= 1764.50
========================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 112 1 1 0.38 1.749e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.2850990 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0311911 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0075355 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1914.38
Column=000240 Fraction=0.010 Mflops= 1902.15
Column=000360 Fraction=0.015 Mflops= 1882.46
Column=000480 Fraction=0.020 Mflops= 1860.25
Column=000600 Fraction=0.025 Mflops= 1835.39
Column=000720 Fraction=0.030 Mflops= 1815.40
Column=000840 Fraction=0.035 Mflops= 1801.63
Column=000960 Fraction=0.040 Mflops= 1791.12
========================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 120 1 1 0.38 1.780e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.1472619 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0278456 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0067273 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1889.17
Column=000240 Fraction=0.010 Mflops= 1884.41
Column=000360 Fraction=0.015 Mflops= 1869.57
Column=000480 Fraction=0.020 Mflops= 1850.77
Column=000600 Fraction=0.025 Mflops= 1828.47
Column=000720 Fraction=0.030 Mflops= 1808.17
Column=000840 Fraction=0.035 Mflops= 1794.31
Column=000960 Fraction=0.040 Mflops= 1783.40
========================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 120 1 1 0.38 1.772e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.1131697 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0270181 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0065274 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 183.23
Column=000224 Fraction=0.010 Mflops= 302.50
Column=000336 Fraction=0.015 Mflops= 378.74
Column=000448 Fraction=0.020 Mflops= 424.70
Column=000560 Fraction=0.025 Mflops= 449.09
Column=000672 Fraction=0.030 Mflops= 459.16
Column=000784 Fraction=0.035 Mflops= 457.91
Column=000896 Fraction=0.040 Mflops= 453.54
========================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 112 2 2 1.50 4.447e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0518234 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0255292 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0061677 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 2384.88
Column=000224 Fraction=0.010 Mflops= 2089.61
Column=000336 Fraction=0.015 Mflops= 1935.01
Column=000448 Fraction=0.020 Mflops= 1684.33
Column=000560 Fraction=0.025 Mflops= 1598.91
Column=000672 Fraction=0.030 Mflops= 1440.14
Column=000784 Fraction=0.035 Mflops= 1344.41
Column=000896 Fraction=0.040 Mflops= 1231.25
========================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 112 2 2 0.58 1.161e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.3327751 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0323482 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0078151 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 2299.74
Column=000240 Fraction=0.010 Mflops= 1750.52
Column=000360 Fraction=0.015 Mflops= 1716.78
Column=000480 Fraction=0.020 Mflops= 1501.81
Column=000600 Fraction=0.025 Mflops= 1437.86
Column=000720 Fraction=0.030 Mflops= 1280.47
Column=000840 Fraction=0.035 Mflops= 1205.90
Column=000960 Fraction=0.040 Mflops= 1106.53
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 120 2 2 0.62 1.080e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0718054 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0260142 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0062848 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 2311.49
Column=000240 Fraction=0.010 Mflops= 1515.25
Column=000360 Fraction=0.015 Mflops= 1547.40
Column=000480 Fraction=0.020 Mflops= 1379.84
Column=000600 Fraction=0.025 Mflops= 1336.41
Column=000720 Fraction=0.030 Mflops= 1205.81
Column=000840 Fraction=0.035 Mflops= 1138.69
Column=000960 Fraction=0.040 Mflops= 1048.90
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 120 2 2 0.65 1.025e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.2152906 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0294967 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0071262 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 3822.39
Column=000224 Fraction=0.010 Mflops= 1753.41
Column=000336 Fraction=0.015 Mflops= 1737.43
Column=000448 Fraction=0.020 Mflops= 1799.79
Column=000560 Fraction=0.025 Mflops= 1837.37
Column=000672 Fraction=0.030 Mflops= 1596.26
Column=000784 Fraction=0.035 Mflops= 1537.06
Column=000896 Fraction=0.040 Mflops= 1519.65
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 112 1 4 0.46 1.465e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0265450 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0249156 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0060194 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 4091.81
Column=000224 Fraction=0.010 Mflops= 1818.57
Column=000336 Fraction=0.015 Mflops= 1861.06
Column=000448 Fraction=0.020 Mflops= 1904.74
Column=000560 Fraction=0.025 Mflops= 1940.08
Column=000672 Fraction=0.030 Mflops= 1653.13
Column=000784 Fraction=0.035 Mflops= 1606.71
Column=000896 Fraction=0.040 Mflops= 1584.39
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 112 1 4 0.44 1.530e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.9533579 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0231393 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0055903 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 4446.43
Column=000240 Fraction=0.010 Mflops= 1822.68
Column=000360 Fraction=0.015 Mflops= 1841.16
Column=000480 Fraction=0.020 Mflops= 1869.72
Column=000600 Fraction=0.025 Mflops= 1878.97
Column=000720 Fraction=0.030 Mflops= 1611.65
Column=000840 Fraction=0.035 Mflops= 1562.41
Column=000960 Fraction=0.040 Mflops= 1541.94
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 120 1 4 0.44 1.504e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.9148384 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0222044 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0053644 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 4510.40
Column=000240 Fraction=0.010 Mflops= 1800.41
Column=000360 Fraction=0.015 Mflops= 1818.77
Column=000480 Fraction=0.020 Mflops= 1840.66
Column=000600 Fraction=0.025 Mflops= 1863.08
Column=000720 Fraction=0.030 Mflops= 1597.05
Column=000840 Fraction=0.035 Mflops= 1549.60
Column=000960 Fraction=0.040 Mflops= 1528.19
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 120 1 4 0.45 1.496e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.8638001 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0209656 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0050651 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1165.40
Column=000224 Fraction=0.010 Mflops= 1086.39
Column=000336 Fraction=0.015 Mflops= 986.69
Column=000448 Fraction=0.020 Mflops= 915.54
Column=000560 Fraction=0.025 Mflops= 855.79
Column=000672 Fraction=0.030 Mflops= 785.28
Column=000784 Fraction=0.035 Mflops= 723.05
Column=000896 Fraction=0.040 Mflops= 668.52
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 112 4 1 1.07 6.237e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.9518852 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0231035 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0055816 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1201.77
Column=000224 Fraction=0.010 Mflops= 1102.63
Column=000336 Fraction=0.015 Mflops= 999.33
Column=000448 Fraction=0.020 Mflops= 926.32
Column=000560 Fraction=0.025 Mflops= 859.31
Column=000672 Fraction=0.030 Mflops= 788.35
Column=000784 Fraction=0.035 Mflops= 724.79
Column=000896 Fraction=0.040 Mflops= 670.24
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 112 4 1 1.07 6.251e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.1032506 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0267774 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0064692 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1176.97
Column=000240 Fraction=0.010 Mflops= 1032.57
Column=000360 Fraction=0.015 Mflops= 945.30
Column=000480 Fraction=0.020 Mflops= 867.74
Column=000600 Fraction=0.025 Mflops= 800.40
Column=000720 Fraction=0.030 Mflops= 728.25
Column=000840 Fraction=0.035 Mflops= 668.08
Column=000960 Fraction=0.040 Mflops= 612.96
============================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L4 1000 120 4 1 1.12 5.958e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.0821954 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0262663 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0063458 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1194.85
Column=000240 Fraction=0.010 Mflops= 1033.06
Column=000360 Fraction=0.015 Mflops= 944.66
Column=000480 Fraction=0.020 Mflops= 866.33
Column=000600 Fraction=0.025 Mflops= 799.54
Column=000720 Fraction=0.030 Mflops= 730.94
Column=000840 Fraction=0.035 Mflops= 668.44
Column=000960 Fraction=0.040 Mflops= 616.51
========================================================================
T/V N NB P Q Time Gflops
----------------------------------------------------------------------------
W00C2L2 1000 120 4 1 1.11 6.000e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 1.1617574 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0281974 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0068123 ...... PASSED
========================================================================
Finished 16 tests with the following results:
16 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.
----------------------------------------------------------------------------
End of Tests.
========================================================================
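With 16 result rows per run, it can help to extract them programmatically before comparing configurations. The sketch below is one possible way to do that and is not part of the original tests; the names `HplResult`, `ROW` and `parse_results` are hypothetical. It also reports P x Q, the size of the process grid each test ran on. The sample rows are copied from the 36-process log above.

```python
import re
from typing import List, NamedTuple


class HplResult(NamedTuple):
    variant: str
    n: int
    nb: int
    p: int
    q: int
    seconds: float
    gflops: float

    @property
    def processes(self) -> int:
        # Size of the P x Q process grid used by this test.
        return self.p * self.q


# A result row looks like: "W00C2L4 1000 112 1 1 1.39 4.810e-01"
ROW = re.compile(
    r"^(W\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.]+e[+-]\d+)$"
)


def parse_results(log_text: str) -> List[HplResult]:
    """Collect every result row found in an HPL log; other lines are ignored."""
    results = []
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            v, n, nb, p, q, t, g = match.groups()
            results.append(
                HplResult(v, int(n), int(nb), int(p), int(q), float(t), float(g))
            )
    return results


sample = """\
T/V N NB P Q Time Gflops
W00C2L4 1000 112 1 1 1.39 4.810e-01
W00C2L2 1000 112 1 1 0.38 1.749e+00
W00C2L4 1000 120 1 1 0.38 1.780e+00
W00C2L2 1000 112 2 2 0.58 1.161e+00
W00C2L2 1000 112 4 1 1.07 6.251e-01
"""

results = parse_results(sample)
best = max(results, key=lambda r: r.gflops)
for r in results:
    print(f"{r.variant} NB={r.nb} grid={r.p}x{r.q} "
          f"procs={r.processes} {r.gflops:.4f} Gflops")
print("best:", best.variant, best.gflops)
```

In this sample the largest grid has only 4 processes, which is worth keeping in mind when reading the results of runs launched with -np 42 or -np 36.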
Each computing node achieves a performance of 5.0581 when evaluated using its 4 virtual threads, although each machine has only 2 physical processors. If we use only 2 threads, the performance achieved is 6.9658. The gain comes from allocating a single thread to each physical processor.
We analyzed the performance of the 4-way SMP separately. Using one thread per physical processor (4 threads), the performance achieved is 2.2893 Gflops, while using only 2 threads it reaches 6.8581 Gflops. This clearly shows the contention problem when accessing RAM.
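The effect of using fewer threads can be expressed as a ratio between the two measurements. This is a small illustrative calculation using only the figures quoted above; the helper name `gain` is hypothetical.

```python
def gain(more_threads: float, fewer_threads: float) -> float:
    """Ratio of the rate with fewer threads to the rate with more threads."""
    return fewer_threads / more_threads


# Dual Xeon HT node: 4 virtual threads vs. 2 threads.
node_gain = gain(5.0581, 6.9658)
# 4-way SMP: 4 threads vs. 2 threads.
smp_gain = gain(2.2893, 6.8581)

print(f"computing node: x{node_gain:.2f}")
print(f"4-way SMP:      x{smp_gain:.2f}")
```

A ratio above 1 means the machine is faster with fewer threads. The ratio is noticeably larger for the 4-way SMP, which is consistent with the memory contention described above.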
