Development Linpack

From i2basque

Linpack is a set of benchmarks used to measure the speed of a single machine, or of a cluster of machines working together, when solving one large problem (a dense system of linear equations).

For the development infrastructure we have tested several configurations.

The first cluster configuration we tried ran 42 processes across the 9 dual Xeon HT computing nodes together with the 4-way Xeon HT SMP machine. The result was not very good because of the 4-way SMP machine, which suffers from memory contention: when more than two processes use the memory system intensively, the memory bus becomes saturated, so there is no benefit in using all 4 processors at once for a memory-intensive task. Furthermore, this machine has HT (Hyper-Threading) enabled, so it exposes 8 virtual processors, and using all of them at the same time saturates the whole machine.

fe:/home/ridruejo/linpack_9.1/benchmarks/mp_linpack/bin_intel/ia32# /opt/mpich-intel32/bin/mpirun -np 42 -nolocal -machinefile machines.LINUX xhpl

============================================================================
HPLinpack 1.0a  --  High-Performance Linpack benchmark  --   January 20, 2004
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Labs.,  UTK
============================================================================
An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.


The following parameter values will be used:
 N      :    1000 
 NB     :     112      120 
 PMAP   : Row-major process mapping
 P      :       1        2        1        4 
 Q      :       1        2        4        1 
 PFACT  :    Left 
 NBMIN  :       4        2 
 NDIV   :       2 
 RFACT  :   Crout 
 BCAST  :   1ring 
 DEPTH  :       0 
 SWAP   : Mix (threshold = 256)
 L1     : no-transposed form
 U      : no-transposed form
 EQUIL  : no
 ALIGN  : 8 double precision words
 
 ----------------------------------------------------------------------------
 
 - The matrix A is randomly generated for each test.
 - The following scaled residual checks will be computed:
   1) ||Ax-b||_oo / ( eps * ||A||_1  * N        )
   2) ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  )
   3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
 - The relative machine precision (eps) is taken to be          1.110223e-16
 - Computational tests pass if scaled residuals are less than           16.0
  
 Column=000112 Fraction=0.005 Mflops=  959.38
 Column=000224 Fraction=0.010 Mflops= 1184.56
 Column=000336 Fraction=0.015 Mflops= 1290.66
 Column=000448 Fraction=0.020 Mflops= 1340.17
 Column=000560 Fraction=0.025 Mflops= 1362.25
 Column=000672 Fraction=0.030 Mflops= 1368.36
 Column=000784 Fraction=0.035 Mflops= 1367.02
 Column=000896 Fraction=0.040 Mflops= 1337.38
============================================================================
 T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   112     1     1               1.51          4.434e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0995348 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0266872 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0064474 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1864.77
Column=000224 Fraction=0.010 Mflops= 1858.83
Column=000336 Fraction=0.015 Mflops= 1840.08
Column=000448 Fraction=0.020 Mflops= 1819.79
Column=000560 Fraction=0.025 Mflops= 1800.01
Column=000672 Fraction=0.030 Mflops= 1781.96
Column=000784 Fraction=0.035 Mflops= 1766.51
Column=000896 Fraction=0.040 Mflops= 1755.79
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   112     1     1               0.38          1.740e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.2850990 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0311911 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0075355 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1872.84
Column=000240 Fraction=0.010 Mflops= 1851.93
Column=000360 Fraction=0.015 Mflops= 1835.65
Column=000480 Fraction=0.020 Mflops= 1817.57
Column=000600 Fraction=0.025 Mflops= 1797.45
Column=000720 Fraction=0.030 Mflops= 1777.35
Column=000840 Fraction=0.035 Mflops= 1763.95
Column=000960 Fraction=0.040 Mflops= 1753.87
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   120     1     1               0.38          1.743e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.1472619 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0278456 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0067273 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1850.80
Column=000240 Fraction=0.010 Mflops= 1828.61
Column=000360 Fraction=0.015 Mflops= 1820.65
Column=000480 Fraction=0.020 Mflops= 1805.14
Column=000600 Fraction=0.025 Mflops= 1787.19
Column=000720 Fraction=0.030 Mflops= 1767.53
Column=000840 Fraction=0.035 Mflops= 1754.48
Column=000960 Fraction=0.040 Mflops= 1744.07
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   120     1     1               0.39          1.733e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.1131697 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0270181 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0065274 ...... PASSED
Column=000112 Fraction=0.005 Mflops=  185.51
Column=000224 Fraction=0.010 Mflops=  298.62
Column=000336 Fraction=0.015 Mflops=  377.33
Column=000448 Fraction=0.020 Mflops=  414.15
Column=000560 Fraction=0.025 Mflops=  436.53
Column=000672 Fraction=0.030 Mflops=  438.26
Column=000784 Fraction=0.035 Mflops=  436.83
Column=000896 Fraction=0.040 Mflops=  426.02
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   112     2     2               1.60          4.174e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0518234 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0255292 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0061677 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 2173.27
Column=000224 Fraction=0.010 Mflops= 1671.83
Column=000336 Fraction=0.015 Mflops= 1643.76
Column=000448 Fraction=0.020 Mflops= 1423.56
Column=000560 Fraction=0.025 Mflops= 1363.76
Column=000672 Fraction=0.030 Mflops= 1212.45
Column=000784 Fraction=0.035 Mflops= 1148.64
Column=000896 Fraction=0.040 Mflops= 1038.15
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   112     2     2               0.68          9.828e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.3327751 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0323482 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0078151 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 2206.44
Column=000240 Fraction=0.010 Mflops= 1603.51
Column=000360 Fraction=0.015 Mflops= 1588.95
Column=000480 Fraction=0.020 Mflops= 1354.08
Column=000600 Fraction=0.025 Mflops= 1298.48
Column=000720 Fraction=0.030 Mflops= 1165.45
Column=000840 Fraction=0.035 Mflops= 1103.29
Column=000960 Fraction=0.040 Mflops= 1014.14
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   120     2     2               0.67          9.908e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0718054 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0260142 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0062848 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 2278.31
Column=000240 Fraction=0.010 Mflops= 1699.55
Column=000360 Fraction=0.015 Mflops= 1666.59
Column=000480 Fraction=0.020 Mflops= 1445.90
Column=000600 Fraction=0.025 Mflops= 1382.41
Column=000720 Fraction=0.030 Mflops= 1232.95
Column=000840 Fraction=0.035 Mflops= 1157.03
Column=000960 Fraction=0.040 Mflops= 1059.86
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   120     2     2               0.65          1.035e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.2152906 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0294967 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0071262 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 3838.12
Column=000224 Fraction=0.010 Mflops= 1803.69
Column=000336 Fraction=0.015 Mflops= 1804.51
Column=000448 Fraction=0.020 Mflops= 1851.96
Column=000560 Fraction=0.025 Mflops= 1899.61
Column=000672 Fraction=0.030 Mflops= 1657.94
Column=000784 Fraction=0.035 Mflops= 1608.02
Column=000896 Fraction=0.040 Mflops= 1584.26
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   112     1     4               0.44          1.536e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0265450 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0249156 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0060194 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 4038.58
Column=000224 Fraction=0.010 Mflops= 1963.75
Column=000336 Fraction=0.015 Mflops= 1911.39
Column=000448 Fraction=0.020 Mflops= 1952.14
Column=000560 Fraction=0.025 Mflops= 1983.11
Column=000672 Fraction=0.030 Mflops= 1717.73
Column=000784 Fraction=0.035 Mflops= 1661.82
Column=000896 Fraction=0.040 Mflops= 1641.31
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   112     1     4               0.42          1.588e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.9533579 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0231393 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0055903 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 4420.85
Column=000240 Fraction=0.010 Mflops= 1957.70
Column=000360 Fraction=0.015 Mflops= 1912.77
Column=000480 Fraction=0.020 Mflops= 1932.53
Column=000600 Fraction=0.025 Mflops= 1950.90
Column=000720 Fraction=0.030 Mflops= 1679.96
Column=000840 Fraction=0.035 Mflops= 1621.81
Column=000960 Fraction=0.040 Mflops= 1594.93
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   120     1     4               0.43          1.558e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.9148384 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0222044 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0053644 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 4610.85
Column=000240 Fraction=0.010 Mflops= 1971.22
Column=000360 Fraction=0.015 Mflops= 1938.56
Column=000480 Fraction=0.020 Mflops= 1941.98
Column=000600 Fraction=0.025 Mflops= 1957.34
Column=000720 Fraction=0.030 Mflops= 1687.20
Column=000840 Fraction=0.035 Mflops= 1627.07
Column=000960 Fraction=0.040 Mflops= 1606.71
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   120     1     4               0.43          1.568e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.8638001 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0209656 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0050651 ...... PASSED
Column=000112 Fraction=0.005 Mflops=  721.61
Column=000224 Fraction=0.010 Mflops=  795.85
Column=000336 Fraction=0.015 Mflops=  780.68
Column=000448 Fraction=0.020 Mflops=  749.05
Column=000560 Fraction=0.025 Mflops=  711.57
Column=000672 Fraction=0.030 Mflops=  662.26
Column=000784 Fraction=0.035 Mflops=  614.71
Column=000896 Fraction=0.040 Mflops=  570.99
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   112     4     1               1.25          5.339e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.9518852 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0231035 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0055816 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1157.45
Column=000224 Fraction=0.010 Mflops= 1044.79
Column=000336 Fraction=0.015 Mflops=  941.78
Column=000448 Fraction=0.020 Mflops=  871.08
Column=000560 Fraction=0.025 Mflops=  806.10
Column=000672 Fraction=0.030 Mflops=  734.65
Column=000784 Fraction=0.035 Mflops=  675.91
Column=000896 Fraction=0.040 Mflops=  621.38
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   112     4     1               1.16          5.762e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.1032506 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0267774 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0064692 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1165.59
Column=000240 Fraction=0.010 Mflops= 1022.97
Column=000360 Fraction=0.015 Mflops=  917.61
Column=000480 Fraction=0.020 Mflops=  841.15
Column=000600 Fraction=0.025 Mflops=  771.05
Column=000720 Fraction=0.030 Mflops=  698.46
Column=000840 Fraction=0.035 Mflops=  637.09
Column=000960 Fraction=0.040 Mflops=  583.12
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   120     4     1               1.18          5.673e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0821954 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0262663 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0063458 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1163.70
Column=000240 Fraction=0.010 Mflops= 1006.83
Column=000360 Fraction=0.015 Mflops=  906.14
Column=000480 Fraction=0.020 Mflops=  823.68
Column=000600 Fraction=0.025 Mflops=  757.82
Column=000720 Fraction=0.030 Mflops=  686.89
Column=000840 Fraction=0.035 Mflops=  627.65
Column=000960 Fraction=0.040 Mflops=  575.96
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   120     4     1               1.19          5.606e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.1617574 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0281974 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0068123 ...... PASSED
============================================================================

Finished     16 tests with the following results:
             16 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
----------------------------------------------------------------------------

End of Tests
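
The Gflops column in the result rows above can be cross-checked from N and Time. The sketch below assumes the operation count usually quoted for HPL, 2/3*N^3 + 3/2*N^2 floating-point operations for a system of order N; the function name hpl_gflops is only illustrative. Because the log prints Time rounded to two decimals, the estimate is expected to be close to the reported value rather than identical.

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """Estimate Gflops for an order-n solve, assuming 2/3*n^3 + 3/2*n^2 operations."""
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


# First result row of the log above: N=1000, Time=1.51 s, reported 4.434e-01 Gflops.
estimate = hpl_gflops(1000, 1.51)
reported = 4.434e-01

print(f"estimated {estimate:.4f} Gflops, reported {reported:.4f} Gflops")
```

For N=1000 the estimate lands within about one percent of the reported figure, which suggests the Gflops column is in plain Gflops and that 4.434e-01 means 0.4434 Gflops.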

So under the above conditions, the first test of the benchmark (N=1000, NB=112, P=1, Q=1) reported a speed of 0.4434 Gflops (4.434e-01 in the log). We repeated the tests after removing the 4-way SMP machine from the pool, using only the 9 dual Xeon HT computing nodes, and the result for the same test improved to 0.4810 Gflops (4.810e-01 in the log).
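
Reading the figures off a long log by hand is error-prone, so a small script can help. The following is a minimal sketch, not part of the Linpack distribution: it assumes result rows have the layout shown in the logs on this page (T/V, N, NB, P, Q, Time, Gflops), and the names ROW and parse_results are hypothetical. The sample text is three rows copied from the first run.

```python
import re

# Matches an HPL result row such as:
# W00C2L4         1000   112     1     1               1.51          4.434e-01
ROW = re.compile(
    r"^(W\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.]+e[+-]\d+)$"
)


def parse_results(log_text):
    """Return one dict per result row found in an HPL log."""
    rows = []
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            tv, n, nb, p, q, secs, gflops = match.groups()
            rows.append({
                "tv": tv, "n": int(n), "nb": int(nb),
                "p": int(p), "q": int(q),
                "time": float(secs), "gflops": float(gflops),
            })
    return rows


sample = """\
W00C2L4         1000   112     1     1               1.51          4.434e-01
W00C2L2         1000   112     1     1               0.38          1.740e+00
W00C2L4         1000   112     2     2               1.60          4.174e-01
"""

results = parse_results(sample)
best = max(results, key=lambda r: r["gflops"])

for r in results:
    print(f'{r["tv"]} P={r["p"]} Q={r["q"]} NB={r["nb"]}: {r["gflops"]:.4f} Gflops')
print(f'best: {best["tv"]} at {best["gflops"]:.4f} Gflops')
```

Converting the scientific notation with float() avoids misplacing the decimal point, and comparing rows with the same N, NB, P and Q across two runs gives a like-for-like comparison between configurations.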

fe:/home/ridruejo/linpack_9.1/benchmarks/mp_linpack/bin_intel/ia32# /opt/mpich-intel32/bin/mpirun -np 36 -nolocal -machinefile machines.LINUX xhpl
========================================================================
HPLinpack 1.0a  --  High-Performance Linpack benchmark  --   January 20, 2004
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Labs.,  UTK
========================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      :    1000 
NB     :     112      120 
PMAP   : Row-major process mapping
P      :       1        2        1        4 
Q      :       1        2        4        1 
PFACT  :    Left 
NBMIN  :       4        2 
NDIV   :       2 
RFACT  :   Crout 
BCAST  :   1ring 
DEPTH  :       0 
SWAP   : Mix (threshold = 256)
L1     : no-transposed form
U      : no-transposed form
EQUIL  : no
ALIGN  : 8 double precision words 

----------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
  1) ||Ax-b||_oo / ( eps * ||A||_1  * N        )
  2) ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  )
  3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be          1.110223e-16
- Computational tests pass if scaled residuals are less than           16.0

Column=000112 Fraction=0.005 Mflops= 1839.83
Column=000224 Fraction=0.010 Mflops= 1819.24
Column=000336 Fraction=0.015 Mflops= 1814.08
Column=000448 Fraction=0.020 Mflops= 1799.30
Column=000560 Fraction=0.025 Mflops= 1786.27
Column=000672 Fraction=0.030 Mflops= 1769.99
Column=000784 Fraction=0.035 Mflops= 1756.98
Column=000896 Fraction=0.040 Mflops= 1747.02
========================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   112     1     1               1.39          4.810e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0995348 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0266872 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0064474 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1894.07
Column=000224 Fraction=0.010 Mflops= 1854.22 
Column=000336 Fraction=0.015 Mflops= 1841.04
Column=000448 Fraction=0.020 Mflops= 1824.62
Column=000560 Fraction=0.025 Mflops= 1808.90
Column=000672 Fraction=0.030 Mflops= 1789.07
Column=000784 Fraction=0.035 Mflops= 1774.98
Column=000896 Fraction=0.040 Mflops= 1764.50
========================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   112     1     1               0.38          1.749e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.2850990 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0311911 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0075355 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1914.38
Column=000240 Fraction=0.010 Mflops= 1902.15
Column=000360 Fraction=0.015 Mflops= 1882.46
Column=000480 Fraction=0.020 Mflops= 1860.25
Column=000600 Fraction=0.025 Mflops= 1835.39
Column=000720 Fraction=0.030 Mflops= 1815.40
Column=000840 Fraction=0.035 Mflops= 1801.63 
Column=000960 Fraction=0.040 Mflops= 1791.12
========================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   120     1     1               0.38          1.780e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.1472619 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0278456 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0067273 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1889.17
Column=000240 Fraction=0.010 Mflops= 1884.41
Column=000360 Fraction=0.015 Mflops= 1869.57
Column=000480 Fraction=0.020 Mflops= 1850.77
Column=000600 Fraction=0.025 Mflops= 1828.47
Column=000720 Fraction=0.030 Mflops= 1808.17
Column=000840 Fraction=0.035 Mflops= 1794.31
Column=000960 Fraction=0.040 Mflops= 1783.40
========================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   120     1     1               0.38          1.772e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.1131697 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0270181 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0065274 ...... PASSED
Column=000112 Fraction=0.005 Mflops=  183.23
Column=000224 Fraction=0.010 Mflops=  302.50
Column=000336 Fraction=0.015 Mflops=  378.74
Column=000448 Fraction=0.020 Mflops=  424.70
Column=000560 Fraction=0.025 Mflops=  449.09
Column=000672 Fraction=0.030 Mflops=  459.16
Column=000784 Fraction=0.035 Mflops=  457.91
Column=000896 Fraction=0.040 Mflops=  453.54
========================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   112     2     2               1.50          4.447e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0518234 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0255292 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0061677 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 2384.88
Column=000224 Fraction=0.010 Mflops= 2089.61
Column=000336 Fraction=0.015 Mflops= 1935.01
Column=000448 Fraction=0.020 Mflops= 1684.33
Column=000560 Fraction=0.025 Mflops= 1598.91
Column=000672 Fraction=0.030 Mflops= 1440.14
Column=000784 Fraction=0.035 Mflops= 1344.41
Column=000896 Fraction=0.040 Mflops= 1231.25
========================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   112     2     2               0.58          1.161e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.3327751 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0323482 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0078151 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 2299.74
Column=000240 Fraction=0.010 Mflops= 1750.52
Column=000360 Fraction=0.015 Mflops= 1716.78
Column=000480 Fraction=0.020 Mflops= 1501.81
Column=000600 Fraction=0.025 Mflops= 1437.86
Column=000720 Fraction=0.030 Mflops= 1280.47
Column=000840 Fraction=0.035 Mflops= 1205.90
Column=000960 Fraction=0.040 Mflops= 1106.53
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   120     2     2               0.62          1.080e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0718054 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0260142 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0062848 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 2311.49
Column=000240 Fraction=0.010 Mflops= 1515.25
Column=000360 Fraction=0.015 Mflops= 1547.40
Column=000480 Fraction=0.020 Mflops= 1379.84
Column=000600 Fraction=0.025 Mflops= 1336.41
Column=000720 Fraction=0.030 Mflops= 1205.81
Column=000840 Fraction=0.035 Mflops= 1138.69
Column=000960 Fraction=0.040 Mflops= 1048.90
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   120     2     2               0.65          1.025e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.2152906 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0294967 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0071262 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 3822.39
Column=000224 Fraction=0.010 Mflops= 1753.41
Column=000336 Fraction=0.015 Mflops= 1737.43
Column=000448 Fraction=0.020 Mflops= 1799.79
Column=000560 Fraction=0.025 Mflops= 1837.37
Column=000672 Fraction=0.030 Mflops= 1596.26
Column=000784 Fraction=0.035 Mflops= 1537.06
Column=000896 Fraction=0.040 Mflops= 1519.65
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   112     1     4               0.46          1.465e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0265450 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0249156 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0060194 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 4091.81
Column=000224 Fraction=0.010 Mflops= 1818.57
Column=000336 Fraction=0.015 Mflops= 1861.06
Column=000448 Fraction=0.020 Mflops= 1904.74
Column=000560 Fraction=0.025 Mflops= 1940.08
Column=000672 Fraction=0.030 Mflops= 1653.13
Column=000784 Fraction=0.035 Mflops= 1606.71
Column=000896 Fraction=0.040 Mflops= 1584.39
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   112     1     4               0.44          1.530e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.9533579 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0231393 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0055903 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 4446.43
Column=000240 Fraction=0.010 Mflops= 1822.68
Column=000360 Fraction=0.015 Mflops= 1841.16
Column=000480 Fraction=0.020 Mflops= 1869.72
Column=000600 Fraction=0.025 Mflops= 1878.97
Column=000720 Fraction=0.030 Mflops= 1611.65
Column=000840 Fraction=0.035 Mflops= 1562.41
Column=000960 Fraction=0.040 Mflops= 1541.94
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   120     1     4               0.44          1.504e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.9148384 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0222044 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0053644 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 4510.40
Column=000240 Fraction=0.010 Mflops= 1800.41
Column=000360 Fraction=0.015 Mflops= 1818.77
Column=000480 Fraction=0.020 Mflops= 1840.66
Column=000600 Fraction=0.025 Mflops= 1863.08
Column=000720 Fraction=0.030 Mflops= 1597.05
Column=000840 Fraction=0.035 Mflops= 1549.60
Column=000960 Fraction=0.040 Mflops= 1528.19
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   120     1     4               0.45          1.496e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.8638001 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0209656 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0050651 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1165.40
Column=000224 Fraction=0.010 Mflops= 1086.39
Column=000336 Fraction=0.015 Mflops=  986.69
Column=000448 Fraction=0.020 Mflops=  915.54
Column=000560 Fraction=0.025 Mflops=  855.79
Column=000672 Fraction=0.030 Mflops=  785.28
Column=000784 Fraction=0.035 Mflops=  723.05
Column=000896 Fraction=0.040 Mflops=  668.52
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   112     4     1               1.07          6.237e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.9518852 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0231035 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0055816 ...... PASSED
Column=000112 Fraction=0.005 Mflops= 1201.77
Column=000224 Fraction=0.010 Mflops= 1102.63
Column=000336 Fraction=0.015 Mflops=  999.33
Column=000448 Fraction=0.020 Mflops=  926.32
Column=000560 Fraction=0.025 Mflops=  859.31
Column=000672 Fraction=0.030 Mflops=  788.35
Column=000784 Fraction=0.035 Mflops=  724.79
Column=000896 Fraction=0.040 Mflops=  670.24
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   112     4     1               1.07          6.251e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.1032506 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0267774 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0064692 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1176.97
Column=000240 Fraction=0.010 Mflops= 1032.57
Column=000360 Fraction=0.015 Mflops=  945.30
Column=000480 Fraction=0.020 Mflops=  867.74
Column=000600 Fraction=0.025 Mflops=  800.40
Column=000720 Fraction=0.030 Mflops=  728.25
Column=000840 Fraction=0.035 Mflops=  668.08
Column=000960 Fraction=0.040 Mflops=  612.96
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L4         1000   120     4     1               1.12          5.958e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.0821954 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0262663 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0063458 ...... PASSED
Column=000120 Fraction=0.005 Mflops= 1194.85
Column=000240 Fraction=0.010 Mflops= 1033.06
Column=000360 Fraction=0.015 Mflops=  944.66
Column=000480 Fraction=0.020 Mflops=  866.33
Column=000600 Fraction=0.025 Mflops=  799.54
Column=000720 Fraction=0.030 Mflops=  730.94
Column=000840 Fraction=0.035 Mflops=  668.44
Column=000960 Fraction=0.040 Mflops=  616.51
========================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
W00C2L2         1000   120     4     1               1.11          6.000e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        1.1617574 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0281974 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0068123 ...... PASSED
========================================================================

Finished     16 tests with the following results:
            16 tests completed and passed residual checks,
             0 tests completed and failed residual checks,
             0 tests skipped because of illegal input values.
----------------------------------------------------------------------------

End of Tests.
========================================================================
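
As a rough cross-check of the Gflops column in the log above, the rate can be recomputed from N and Time. HPL credits a solve of order N with 2/3 N^3 + 3/2 N^2 floating point operations and divides by the wall time. The sketch below applies that formula to three rows copied from the log; the function name hpl_gflops is our own, and because the Time column is rounded to two decimals the recomputed values should only be expected to agree with the reported ones to within a few percent.

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """Approximate HPL rate: (2/3*N^3 + 3/2*N^2) flops divided by wall time."""
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


# (N, Time in seconds, reported Gflops) copied from rows of the log above.
rows = [
    (1000, 0.44, 1.530),   # NB=112, P=1, Q=4
    (1000, 1.07, 0.6237),  # NB=112, P=4, Q=1
    (1000, 1.12, 0.5958),  # NB=120, P=4, Q=1
]

for n, seconds, reported in rows:
    estimate = hpl_gflops(n, seconds)
    print(f"N={n} time={seconds:.2f}s recomputed={estimate:.3f} reported={reported:.3f}")
```

For this small problem size (N=1000) the 1x4 process grid finishes in under half the time of the 4x1 grid, which is why the rows with P=1, Q=4 report roughly 1.5 Gflops against roughly 0.6 Gflops for P=4, Q=1.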


Each computing node reaches 5.0581 Gflops when evaluated with 4 threads (one per virtual processor), although each machine has only 2 physical processors. With only 2 threads the performance rises to 6.9658 Gflops. The gain comes from allocating a single thread to each physical processor.

We analyzed the performance of the 4-way SMP separately. With one thread per physical processor (4 threads) it achieves 2.2893 Gflops, while with only 2 threads it achieves 6.8581 Gflops. This clearly shows the contention problem when accessing RAM.
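
To make the comparison concrete, the following sketch computes how much faster each machine runs with 2 threads than with 4, using only the figures quoted in the two paragraphs above. It assumes the computing-node figures are in Gflops like the SMP ones; the dictionary keys and the function name speedup_with_fewer_threads are illustrative.

```python
# Figures from the text: (rate with 2 threads, rate with 4 threads).
# Assumption: all four values are Gflops.
measurements = {
    "dual Xeon HT node": (6.9658, 5.0581),
    "4-way Xeon HT SMP": (6.8581, 2.2893),
}


def speedup_with_fewer_threads(two_threads: float, four_threads: float) -> float:
    """Ratio of the 2-thread rate to the 4-thread rate."""
    return two_threads / four_threads


for machine, (two, four) in measurements.items():
    ratio = speedup_with_fewer_threads(two, four)
    print(f"{machine}: {ratio:.2f}x faster with 2 threads "
          f"({two / 2:.2f} vs {four / 4:.2f} Gflops per thread)")
```

With these numbers the computing node is about 1.4 times faster with 2 threads, while the 4-way SMP is about 3 times faster, consistent with the memory contention described above.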



