
HPC High-Performance Computing Project: Linpack Performance Test Report

2020-12-13 Source: 步旅网



Contents

1 Introduction to Linpack
2 HPC Cluster Test Environment
3 Single-Node Linpack Test
    3.1 Test Plan
    3.2 Test Results
    3.3 Result Analysis
4 Whole-Cluster Linpack Test
    4.1 Test Plan
    4.2 Test Results
    4.3 Result Analysis
5 Appendix
    5.1 Notes on Modifying HPL.dat
    5.2 Appendix 1: Original Input File for the Single-Node Test
    5.3 Appendix 2: Output File of the Single-Node Test
    5.4 Appendix 3: Output File of the Whole-Cluster Test

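1 Introduction to Linpack

Linpack is the most widely used benchmark internationally for measuring the floating-point performance of high-performance computer systems. It evaluates floating-point performance by having the machine solve a dense system of N linear algebraic equations in N unknowns using Gaussian elimination.

Linpack comprises three tests: Linpack100, Linpack1000, and HPL. Linpack100 solves a dense linear system of order 100; only compiler optimization options may be used, and the code may not be changed, not even its comments. Linpack1000 solves a linear system of order 1000 to a specified accuracy; the algorithm and code may be optimized as long as the amount of computation is unchanged. HPL (High Performance Linpack), also called the highly parallel computing benchmark, places no limit on the array size N; the problem size may vary, and any optimization may be used as long as the basic algorithm (the amount of computation) is unchanged. The first two tests run at small scales and are no longer well suited to modern computers.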

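HPL is the test designed for modern parallel computers. Without modifying the test program, users can adjust the problem size (matrix size), the number of CPUs used, the optimization methods applied, and so on, to obtain the best performance. HPL solves the linear system by Gaussian elimination. For problem size N, the floating-point operation count is (2/3 * N^3 - 2*N^2). Therefore, given the problem size N and the measured computing time T, measured performance = operation count (2/3 * N^3 - 2*N^2) / computing time T, and the result is reported in floating-point operations per second (Flops). HPL results are an important basis for the TOP500 ranking.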
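As a rough illustration of the formula above, the following sketch converts a problem size N and a solve time T into Gflops, using the operation count exactly as quoted in this section. The function name `hpl_gflops` is a hypothetical helper and not part of HPL; the sample values are taken from the single-node run in Appendix 2 (N = 79897, T = 503.60 s).

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """Return the execution rate in Gflops for problem size n.

    Uses the operation count quoted in the text: 2/3*N^3 - 2*N^2.
    At realistic problem sizes the N^2 term is only a small correction.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    flops = (2.0 / 3.0) * n**3 - 2.0 * n**2
    return flops / seconds / 1e9


if __name__ == "__main__":
    # N and wall time from the single-node run in Appendix 2.
    print(f"{hpl_gflops(79897, 503.60):.1f} Gflops")
```

For that run the result lands within about 0.1 Gflops of the 6.752e+02 figure that HPL itself reports, which suggests the cubic term dominates and the lower-order term barely matters at this scale.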

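An important measure of computer performance is the computing peak, or floating-point peak: the maximum number of floating-point operations the computer can complete per second. It comes in two forms, theoretical and measured. The theoretical floating-point peak is the maximum number of floating-point operations per second the computer can achieve in theory; it is determined mainly by the CPU clock frequency.

Theoretical floating-point peak = CPU clock frequency × floating-point operations per clock cycle × number of CPUs (cores) in the system.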

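2 HPC Cluster Test Environment

Note (delete when the report is complete): 1. Modify according to the actual hardware and software configuration.

The test cluster consists of the 60 blade compute nodes deployed for the project, with hostnames comput1 to comput60. The cluster's internal management network uses IP addresses 192.168.172.1-60 and the compute network uses 12.12.12.1-60; see /etc/hosts on each node for details. To log in, connect to the cluster management node login; from there every compute node can be reached via ssh.

The cluster hardware and software environment is as follows:

Hardware environment
    CPU       2 × Intel Xeon E5-2680 v3 (2.5 GHz), 12 cores each
    Memory    8 × 8 GB DDR4 ECC
    Disk      Dual hard disks
    Network   Infiniband FDR 56 Gbps

Software environment
    OS        CentOS release 6.6 (Final)
    Compiler  Intel Compiler XE Version 15.0 Build 20150121
    MPI       OpenMPI-1.8.5
    HPL       2.1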

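The theoretical double-precision floating-point peak of a single node is:

2.5 (clock frequency, GHz) × 16 (operations per clock cycle) × 24 (cores per node) = 960 GFlops

The theoretical double-precision floating-point peak of the whole cluster is:

2.5 (clock frequency, GHz) × 16 (operations per clock cycle) × 24 (cores per node) × 60 (number of nodes) = 57600 GFlops = 57.6 TFlops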
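The two peak figures can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of the formula from Section 1; the function name `theoretical_peak_gflops` is a hypothetical helper, and the inputs (2.5 GHz, 16 operations per cycle, 24 cores per node, 60 nodes) are the values stated in this section.

```python
def theoretical_peak_gflops(clock_ghz: float, flops_per_cycle: int,
                            cores_per_node: int, nodes: int = 1) -> float:
    """Theoretical peak = clock (GHz) x flops/cycle x cores/node x nodes."""
    return clock_ghz * flops_per_cycle * cores_per_node * nodes


if __name__ == "__main__":
    node_peak = theoretical_peak_gflops(2.5, 16, 24)
    cluster_peak = theoretical_peak_gflops(2.5, 16, 24, nodes=60)
    print(f"single node:   {node_peak:.0f} GFlops")
    print(f"whole cluster: {cluster_peak:.0f} GFlops "
          f"({cluster_peak / 1000:.1f} TFlops)")
```

The cluster value of 57600 GFlops is the same theoretical peak used in the results table of Section 4.2.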

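3 Single-Node Linpack Test

Note (delete when the report is complete):
1. The benchmark toolset from Sugon Clussoft must be installed in advance; see the clusbench tool in the [Clussoft User Manual] to run automated single-node and whole-cluster Linpack tests.
2. The tests can also be run and tuned manually.
3. For changes to the HPL.dat parameters, see Appendix 5.1.

3.1 Test Plan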

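3.1.1 Test objects:

All 60 blade compute nodes of the HPC cluster

3.1.2 Test objectives:

1. Verify that all nodes run normally and have no hardware or software faults;
2. Verify that the computing efficiency of each blade compute node is normal;
3. Verify that each blade compute node performs normally and stably under sustained high load;
4. Verify that temperature and cooling of each blade compute node are normal under sustained high load;
5. Verify that the power supply of each blade compute node is normal and stable under sustained high load.

3.1.3 Test steps: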

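1) Before testing, make sure the whole cluster environment is healthy: all nodes are up and idle, the Infiniband network has been set up and verified, fans are working, CPU temperatures are normal, the environment variables needed by the test have been loaded, and no abnormal processes or services are running.

2) Pick any compute node at random and, by repeatedly adjusting and tuning the test parameters, find the run parameters that give high single-node Linpack efficiency.

3) Using the run parameters obtained in 2), run the single-node Linpack test on all nodes at the same time.

4) Create a test directory, copy the input file HPL.dat and the test program xhpl into it, and run the single-node Linpack test command manually: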

nohup mpirun -np 24 /public/software/benchmark/hpl/2.1/intel/xhpl.Linux_Intel64 >& `hostname`_single.log &

3.2 Test Results

Single node (NB=168)

Hostname  Nodes  Cores  Theor. peak (Gflops)  Measured peak (Gflops)  Efficiency  N      NB   P  Q
node1     1      24     960                   6.82E+02                71.0%       79897  168  4  6
node2     1      24     960                   7.18E+02                74.8%       79897  168  4  6
node3     1      24     960                   7.10E+02                74.0%       79897  168  4  6
node4     1      24     960                   6.98E+02                72.7%       79897  168  4  6
node5     1      24     960                   6.78E+02                70.6%       79897  168  4  6
node6     1      24     960                   6.88E+02                71.7%       79897  168  4  6
node7     1      24     960                   6.74E+02                70.2%       79897  168  4  6
node8     1      24     960                   6.88E+02                71.7%       79897  168  4  6
node9     1      24     960                   6.79E+02                70.7%       79897  168  4  6
node10    1      24     960                   6.97E+02                72.6%       79897  168  4  6
node11    1      24     960                   6.98E+02                72.7%       79897  168  4  6
node12    1      24     960                   6.96E+02                72.5%       79897  168  4  6
node13    1      24     960                   6.88E+02                71.6%       79897  168  4  6
node14    1      24     960                   7.27E+02                75.8%       79897  168  4  6
node15    1      24     960                   6.83E+02                71.1%       79897  168  4  6
node16    1      24     960                   6.88E+02                71.6%       79897  168  4  6
node17    1      24     960                   6.85E+02                71.4%       79897  168  4  6
node18    1      24     960                   6.86E+02                71.4%       79897  168  4  6
node19    1      24     960                   7.19E+02                74.9%       79897  168  4  6
node20    1      24     960                   6.82E+02                71.0%       79897  168  4  6
node21    1      24     960                   6.84E+02                71.3%       79897  168  4  6
node22    1      24     960                   7.10E+02                74.0%       79897  168  4  6
node23    1      24     960                   6.85E+02                71.4%       79897  168  4  6
node24    1      24     960                   6.83E+02                71.2%       79897  168  4  6
node25    1      24     960                   7.16E+02                74.6%       79897  168  4  6
node26    1      24     960                   6.86E+02                71.4%       79897  168  4  6
node27    1      24     960                   7.19E+02                74.9%       79897  168  4  6
node28    1      24     960                   6.87E+02                71.6%       79897  168  4  6
node29    1      24     960                   6.92E+02                72.0%       79897  168  4  6
node30    1      24     960                   6.82E+02                71.0%       79897  168  4  6
node31    1      24     960                   6.89E+02                71.8%       79897  168  4  6
node32    1      24     960                   6.83E+02                71.2%       79897  168  4  6
node33    1      24     960                   7.18E+02                74.8%       79897  168  4  6
node34    1      24     960                   6.88E+02                71.6%       79897  168  4  6
node35    1      24     960                   6.82E+02                71.0%       79897  168  4  6
node36    1      24     960                   7.18E+02                74.8%       79897  168  4  6
node37    1      24     960                   7.10E+02                73.9%       79897  168  4  6
node38    1      24     960                   6.93E+02                72.2%       79897  168  4  6
node39    1      24     960                   7.18E+02                74.8%       79897  168  4  6
node40    1      24     960                   7.18E+02                74.8%       79897  168  4  6
node41    1      24     960                   7.15E+02                74.5%       79897  168  4  6
node42    1      24     960                   6.88E+02                71.7%       79897  168  4  6
node43    1      24     960                   6.98E+02                72.7%       79897  168  4  6
node44    1      24     960                   7.27E+02                75.7%       79897  168  4  6
node45    1      24     960                   7.22E+02                75.2%       79897  168  4  6
node46    1      24     960                   6.85E+02                71.3%       79897  168  4  6
node47    1      24     960                   7.02E+02                73.1%       79897  168  4  6
node48    1      24     960                   6.84E+02                71.3%       79897  168  4  6
node49    1      24     960                   6.89E+02                71.8%       79897  168  4  6
node50    1      24     960                   6.96E+02                72.5%       79897  168  4  6
node51    1      24     960                   6.86E+02                71.4%       79897  168  4  6
node52    1      24     960                   7.07E+02                73.7%       79897  168  4  6
node53    1      24     960                   7.15E+02                74.5%       79897  168  4  6
node54    1      24     960                   6.91E+02                72.0%       79897  168  4  6
node55    1      24     960                   7.18E+02                74.8%       79897  168  4  6
node56    1      24     960                   7.06E+02                73.5%       79897  168  4  6
node57    1      24     960                   6.75E+02                70.5%       79897  168  4  6
node58    1      24     960                   6.87E+02                71.5%       79897  168  4  6
node59    1      24     960                   7.19E+02                74.9%       79897  168  4  6
node60    1      24     960                   6.93E+02                72.2%       79897  168  4  6
Mean                                                                  72.7%

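3.3 Result Analysis

As the table above shows, the highest measured single-node Linpack efficiency is 75.8% and the lowest is 70.2%; the mean single-node efficiency over the 60 compute nodes is 72.7% (NB=168). Every node runs at normal efficiency and performs stably.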
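The efficiency column is simply measured performance divided by the theoretical node peak. A minimal sketch of that calculation follows; the helper names `efficiency_percent` and `summarize` are hypothetical, and the sample dictionary holds only a four-node subset of the table in Section 3.2, not the full data set.

```python
from statistics import mean
from typing import Dict, Tuple

NODE_PEAK_GFLOPS = 960.0  # theoretical single-node peak from Section 2


def efficiency_percent(measured_gflops: float,
                       peak_gflops: float = NODE_PEAK_GFLOPS) -> float:
    """Efficiency = measured performance / theoretical peak, in percent."""
    return 100.0 * measured_gflops / peak_gflops


def summarize(measured: Dict[str, float]) -> Tuple[float, float, float]:
    """Return (lowest, highest, mean) efficiency over the given nodes."""
    effs = [efficiency_percent(v) for v in measured.values()]
    return min(effs), max(effs), mean(effs)


if __name__ == "__main__":
    # Subset of the Section 3.2 table (measured Gflops), for illustration.
    sample = {"node1": 682.0, "node7": 674.0, "node14": 727.0, "node29": 692.0}
    lo, hi, avg = summarize(sample)
    print(f"min {lo:.1f}%  max {hi:.1f}%  mean {avg:.1f}%")
```

Because the table lists measured values rounded to three significant digits, efficiencies recomputed this way may differ from the table in the last digit.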

4 Whole-Cluster Linpack Test

4.1 Test Plan

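4.1.1 Test objects:

All 60 healthy nodes of the HPC cluster

4.1.2 Test objectives:

1. Verify that all nodes run normally and have no hardware or software faults;
2. Verify that the parallel environment and the compute network are in a normal state;
3. Verify that the cluster's computing efficiency is normal;
4. Verify that the cluster performs normally and stably under sustained high load;
5. Verify that temperature and cooling of the cluster are normal under sustained high load;
6. Verify that the power supply of the cluster is normal under sustained high load.

4.1.3 Test steps: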

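1) Before testing, make sure the whole cluster environment is healthy: all nodes are up and idle, the Infiniband network has been set up and verified, fans are working, CPU temperatures are normal, the environment variables needed by the test have been loaded, and no abnormal processes or services are running.

2) By repeatedly adjusting and tuning the test parameters, find the run parameters and result data that give high whole-cluster Linpack efficiency.

3) Run a 24-hour stress test with the parameters from 2).

4) Create a test directory, copy the input file HPL.dat and the test program xhpl into it, and run the whole-cluster Linpack test command manually: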

nohup mpirun -np 1440 -machinefile nodelist /public/software/benchmark/hpl/2.1/intel/xhpl.Linux_Intel64 >& total_nodes.log &
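The command above reads the host list from a file named nodelist, whose content is not shown in this report. The sketch below generates one plausible version, assuming Open MPI's hostfile syntax (`hostname slots=N`) and the hostnames comput1 to comput60 given in Section 2; the function name `build_hostfile` and its defaults are assumptions made for illustration.

```python
def build_hostfile(prefix: str = "comput", count: int = 60,
                   slots: int = 24) -> str:
    """Return hostfile text with one 'host slots=N' line per compute node.

    Assumes Open MPI hostfile syntax; hostnames follow Section 2 of the
    report, and 24 slots per node matches the 24 cores per node.
    """
    return "".join(f"{prefix}{i} slots={slots}\n" for i in range(1, count + 1))


if __name__ == "__main__":
    # Print the host list; redirect to a file named 'nodelist' to use it.
    print(build_hostfile(), end="")
```

With 60 hosts of 24 slots each, the list provides 1440 slots, matching the `-np 1440` argument.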

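4.2 Test Results

60-node whole-cluster Linpack

Nodes  CPU cores  Theoretical peak (Gflops)  Measured peak (Gflops)  Efficiency  N       NB   P   Q
60     1440       57600                      4.122e+03               71.50%      622119  168  36  40
60     1440       57600                      4.159e+03               72.27%      622119  168  36  40
60     1440       57600                      4.019e+03               69.77%      622119  168  36  40
60     1440       57600                      4.141e+03               71.89%      622119  168  36  40

4.3 Result Analysis

The whole-cluster Linpack efficiency on 60 nodes is 72.3% (the best of the four runs above), and computing performance is stable and good. During the test the cluster as a whole ran normally and stably, and hardware monitoring of power supplies, fans, and power consumption showed nothing abnormal.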

5 Appendix

5.1 Notes on Modifying HPL.dat

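The contents of the HPL input file are shown below. Three groups of parameters usually need to be adjusted when tuning the test:

1) The number and sizes of the problems. Several sizes may be given; a count of 1 means one problem, which needs one Ns value. The problem size is computed as sqrt(total memory * 1024 * 1024 * 1024 / 8) * 80%, with total memory in GB and 8 bytes per double-precision matrix element.

1 # of problems sizes (N)
40000 Ns

2) NB, the block size. Use empirical values; typical settings are 168, 192, 232, and 1024.

3 # of NBs
192 232 1024 NBs

3) P and Q (the process grid). Usually a single P×Q grid is set, following these rules:

P*Q = number of processes

P ≤ Q, with P and Q as close to each other as possible

For example, with 16 processes, P=Q=4; with 32 processes, P=4, Q=8.

1 # of process grids (P×Q)
4 Ps
4 Qs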
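The sizing rules above can be sketched in a few lines. This is only an illustration of the stated rules of thumb, not a tuning tool; the function names `problem_size` and `process_grid` are hypothetical, and the 80% fraction is the value quoted in this section.

```python
import math
from typing import Tuple


def problem_size(total_mem_gb: float, fraction: float = 0.80) -> int:
    """N = sqrt(total memory in bytes / 8 bytes per double) * fraction."""
    elements = total_mem_gb * 1024 ** 3 / 8
    return int(math.sqrt(elements) * fraction)


def process_grid(nprocs: int) -> Tuple[int, int]:
    """Return (P, Q) with P*Q == nprocs, P <= Q, and P as close to Q as possible."""
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    # Search downward from floor(sqrt(nprocs)) for the largest divisor.
    for p in range(math.isqrt(nprocs), 0, -1):
        if nprocs % p == 0:
            return p, nprocs // p
    raise AssertionError("unreachable: 1 always divides nprocs")


if __name__ == "__main__":
    print("N for a 64 GB node at 80%:", problem_size(64))
    for n in (16, 24, 32, 1440):
        print(f"{n} processes -> P x Q = {process_grid(n)}")
```

The grid helper reproduces the examples in the text (16 gives 4×4, 32 gives 4×8) as well as the grids used in this report (24 gives 4×6, 1440 gives 36×40). The runs in this report use N = 79897 on 64 GB nodes, somewhat above the 80% value, so the fraction appears to be a starting point for tuning rather than a fixed rule.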

Example of a modified HPL.dat (the modified items were marked in red in the original report):

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
79897        Ns   (= sqrt(total memory * 1024 * 1024 * 1024 / 8) * 80%)
1            # of NBs
168 192      NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
4            Ps
6            Qs
16.0         threshold
1            # of panel fact
0 1 2        PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
2 4          NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
0 1 2        RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
0            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
0            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)

5.2 Appendix 1: Original Input File for the Single-Node Test

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
79897 108800 108160 107520 106240 104960 1130240   Ns
2            # of NBs
168 192 448 384   NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
4 1 4        Ps
6 4 1        Qs
16.0         threshold
1            # of panel fact
0 1 2        PFACTs (0=left, 1=Crout, 2=Right)
2            # of recursive stopping criterium
2 4          NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
3            # of recursive panel fact.
0 1 2        RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
0            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
0            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)

5.3 Appendix 2: Output File of the Single-Node Test

The output of one randomly chosen compute node is shown; this report uses node comput57 as the example. Its raw single-node test output follows:

================================================================================
HPLinpack 2.1 -- High-Performance Linpack benchmark -- October 26, 2012
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 79897
NB     : 192
PMAP   : Row-major process mapping
P      : 4
Q      : 6
PFACT  : Left
NBMIN  : 2
NDIV   : 2
RFACT  : Left
BCAST  : 1ring
DEPTH  : 0
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
- The relative machine precision (eps) is taken to be 2.220446e-16
- Computational tests pass if scaled residuals are less than 16.0
Begin

The progress: 1.00%. Time passed: 8 s. Flops: 6.0989E+02. Eff:63.530189%
The progress: 2.00%. Time passed: 12 s. Flops: 6.0842E+02. Eff:63.377399%
The progress: 3.00%. Time passed: 19 s. Flops: 6.3736E+02. Eff:66.392166%
The progress: 4.00%. Time passed: 22 s. Flops: 6.5895E+02. Eff:68.640552%
The progress: 5.00%. Time passed: 29 s. Flops: 6.6331E+02. Eff:69.094775%
The progress: 6.00%. Time passed: 32 s. Flops: 6.7463E+02. Eff:70.274053%
The progress: 7.00%. Time passed: 35 s. Flops: 6.8368E+02. Eff:71.216864%
The progress: 8.00%. Time passed: 42 s. Flops: 6.8038E+02. Eff:70.872406%
The progress: 9.00%. Time passed: 45 s. Flops: 6.8627E+02. Eff:71.486160%
The progress: 10.00%. Time passed: 52 s. Flops: 6.8193E+02. Eff:71.034343%
The progress: 11.00%. Time passed: 55 s. Flops: 6.8604E+02. Eff:71.462985%
The progress: 12.00%. Time passed: 62 s. Flops: 6.8133E+02. Eff:70.972318%
The progress: 13.00%. Time passed: 65 s. Flops: 6.8432E+02. Eff:71.283550%
The progress: 14.00%. Time passed: 71 s. Flops: 6.8907E+02. Eff:71.777666%
The progress: 15.00%. Time passed: 74 s. Flops: 6.9092E+02. Eff:71.971035%
The progress: 16.00%. Time passed: 81 s. Flops: 6.8523E+02. Eff:71.378483%
The progress: 17.00%. Time passed: 87 s. Flops: 6.8776E+02. Eff:71.641746%
The progress: 18.00%. Time passed: 90 s. Flops: 6.8871E+02. Eff:71.741022%
The progress: 19.00%. Time passed: 96 s. Flops: 6.9010E+02. Eff:71.885127%
The progress: 20.00%. Time passed: 99 s. Flops: 6.9056E+02. Eff:71.933317%
The progress: 21.00%. Time passed: 105 s. Flops: 6.9109E+02. Eff:71.988926%
The progress: 22.00%. Time passed: 111 s. Flops: 6.9118E+02. Eff:71.997700%
The progress: 23.00%. Time passed: 114 s. Flops: 6.9107E+02. Eff:71.986899%
The progress: 24.00%. Time passed: 120 s. Flops: 6.9061E+02. Eff:71.938842%
The progress: 25.00%. Time passed: 126 s. Flops: 6.8985E+02. Eff:71.859867%
The progress: 26.00%. Time passed: 129 s. Flops: 6.8938E+02. Eff:71.810190%
The progress: 27.00%. Time passed: 135 s. Flops: 6.8825E+02. Eff:71.692831%
The progress: 28.00%. Time passed: 141 s. Flops: 6.8692E+02. Eff:71.554158%
The progress: 29.00%. Time passed: 144 s. Flops: 6.8619E+02. Eff:71.477714%
The progress: 30.00%. Time passed: 149 s. Flops: 6.8919E+02. Eff:71.790732%
The progress: 31.00%. Time passed: 155 s. Flops: 6.8727E+02. Eff:71.590276%
The progress: 32.00%. Time passed: 160 s. Flops: 6.8951E+02. Eff:71.823973%
The progress: 33.00%. Time passed: 166 s. Flops: 6.8720E+02. Eff:71.583562%
The progress: 34.00%. Time passed: 168 s. Flops: 6.9010E+02. Eff:71.885602%
The progress: 35.00%. Time passed: 174 s. Flops: 6.8753E+02. Eff:71.617242%
The progress: 36.00%. Time passed: 179 s. Flops: 6.8872E+02. Eff:71.741735%
The progress: 37.00%. Time passed: 184 s. Flops: 6.8963E+02. Eff:71.836502%
The progress: 38.00%. Time passed: 190 s. Flops: 6.8665E+02. Eff:71.525585%
The progress: 39.00%. Time passed: 195 s. Flops: 6.8714E+02. Eff:71.577581%
The progress: 40.00%. Time passed: 200 s. Flops: 6.8742E+02. Eff:71.606208%
The progress: 41.00%. Time passed: 205 s. Flops: 6.8749E+02. Eff:71.613290%
The progress: 42.00%. Time passed: 210 s. Flops: 6.8736E+02. Eff:71.600479%
The progress: 43.00%. Time passed: 215 s. Flops: 6.8707E+02. Eff:71.569272%
The progress: 44.00%. Time passed: 220 s. Flops: 6.8660E+02. Eff:71.521031%
The progress: 45.00%. Time passed: 225 s. Flops: 6.8599E+02. Eff:71.456997%
The progress: 46.00%. Time passed: 229 s. Flops: 6.8822E+02. Eff:71.689997%
The progress: 47.00%. Time passed: 234 s. Flops: 6.8727E+02. Eff:71.590622%
The progress: 48.00%. Time passed: 239 s. Flops: 6.8620E+02. Eff:71.478814%
The progress: 49.00%. Time passed: 243 s. Flops: 6.8783E+02. Eff:71.649078%
The progress: 50.00%. Time passed: 250 s. Flops: 6.8714E+02. Eff:71.577285%
The progress: 51.00%. Time passed: 255 s. Flops: 6.8562E+02. Eff:71.419058%
The progress: 52.00%. Time passed: 259 s. Flops: 6.8666E+02. Eff:71.527179%
The progress: 53.00%. Time passed: 263 s. Flops: 6.8753E+02. Eff:71.617428%
The progress: 54.00%. Time passed: 270 s. Flops: 6.8597E+02. Eff:71.455592%
The progress: 55.00%. Time passed: 274 s. Flops: 6.8648E+02. Eff:71.508645%
The progress: 56.00%. Time passed: 278 s. Flops: 6.8685E+02. Eff:71.546671%
The progress: 57.00%. Time passed: 284 s. Flops: 6.8714E+02. Eff:71.577107%
The progress: 58.00%. Time passed: 288 s. Flops: 6.8717E+02. Eff:71.580644%
The progress: 59.00%. Time passed: 294 s. Flops: 6.8700E+02. Eff:71.562386%
The progress: 60.00%. Time passed: 298 s. Flops: 6.8674E+02. Eff:71.535359%
The progress: 61.00%. Time passed: 304 s. Flops: 6.8615E+02. Eff:71.473903%
The progress: 62.00%. Time passed: 308 s. Flops: 6.8563E+02. Eff:71.419733%
The progress: 63.00%. Time passed: 313 s. Flops: 6.8686E+02. Eff:71.547738%
The progress: 64.00%. Time passed: 319 s. Flops: 6.8565E+02. Eff:71.422234%
The progress: 65.00%. Time passed: 324 s. Flops: 6.8637E+02. Eff:71.497254%
The progress: 66.00%. Time passed: 328 s. Flops: 6.8532E+02. Eff:71.387426%
The progress: 67.00%. Time passed: 333 s. Flops: 6.8565E+02. Eff:71.422265%
The progress: 68.00%. Time passed: 338 s. Flops: 6.8576E+02. Eff:71.433526%
The progress: 69.00%. Time passed: 343 s. Flops: 6.8566E+02. Eff:71.422471%
The progress: 70.00%. Time passed: 348 s. Flops: 6.8535E+02. Eff:71.390292%
The progress: 71.00%. Time passed: 355 s. Flops: 6.8400E+02. Eff:71.249495%
The progress: 72.00%. Time passed: 359 s. Flops: 6.8517E+02. Eff:71.371541%
The progress: 73.00%. Time passed: 364 s. Flops: 6.8424E+02. Eff:71.274724%
The progress: 74.00%. Time passed: 368 s. Flops: 6.8500E+02. Eff:71.354616%
The progress: 75.00%. Time passed: 374 s. Flops: 6.8450E+02. Eff:71.301867%
The progress: 76.00%. Time passed: 378 s. Flops: 6.8483E+02. Eff:71.336247%
The progress: 77.00%. Time passed: 384 s. Flops: 6.8380E+02. Eff:71.229326%
The progress: 78.00%. Time passed: 389 s. Flops: 6.8427E+02. Eff:71.277711%
The progress: 79.00%. Time passed: 394 s. Flops: 6.8443E+02. Eff:71.294912%
The progress: 80.00%. Time passed: 399 s. Flops: 6.8431E+02. Eff:71.282574%
The progress: 81.00%. Time passed: 404 s. Flops: 6.8393E+02. Eff:71.242261%
The progress: 82.00%. Time passed: 409 s. Flops: 6.8328E+02. Eff:71.175460%
The progress: 83.00%. Time passed: 415 s. Flops: 6.8256E+02. Eff:71.099666%
The progress: 84.00%. Time passed: 419 s. Flops: 6.8301E+02. Eff:71.147157%
The progress: 85.00%. Time passed: 424 s. Flops: 6.8323E+02. Eff:71.170080%
The progress: 86.00%. Time passed: 430 s. Flops: 6.8150E+02. Eff:70.989161%
The progress: 87.00%. Time passed: 434 s. Flops: 6.8259E+02. Eff:71.103002%
The progress: 88.00%. Time passed: 440 s. Flops: 6.8156E+02. Eff:70.996185%
The progress: 89.00%. Time passed: 445 s. Flops: 6.8163E+02. Eff:71.002848%
The progress: 90.00%. Time passed: 450 s. Flops: 6.8124E+02. Eff:70.962617%
The progress: 91.00%. Time passed: 455 s. Flops: 6.8151E+02. Eff:70.990116%
The progress: 92.00%. Time passed: 460 s. Flops: 6.8121E+02. Eff:70.958859%
The progress: 93.00%. Time passed: 466 s. Flops: 6.7980E+02. Eff:70.812959%
The progress: 94.00%. Time passed: 471 s. Flops: 6.8001E+02. Eff:70.834427%
The progress: 95.00%. Time passed: 476 s. Flops: 6.8012E+02. Eff:70.845807%
The progress: 96.00%. Time passed: 482 s. Flops: 6.7851E+02. Eff:70.677979%
The progress: 97.00%. Time passed: 487 s. Flops: 6.7830E+02. Eff:70.655815%
The progress: 98.00%. Time passed: 492 s. Flops: 6.7830E+02. Eff:70.656072%
The progress: 99.00%. Time passed: 498 s. Flops: 6.7694E+02. Eff:70.514278%
Finished. Time passed: 504 seconds.

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR00L2L2       79897   192     4     6             503.60              6.752e+02
HPL_pdgesv() start time Fri Apr 22 15:19:54 2016

HPL_pdgesv() end time   Fri Apr 22 15:28:18 2016

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0013370 ...... PASSED
================================================================================

Finished 1 tests with the following results:
    1 tests completed and passed residual checks,
    0 tests completed and failed residual checks,
    0 tests skipped because of illegal input values.

--------------------------------------------------------------------------------

End of Tests.

==============================================================================
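When many nodes each produce a log like the one above, the result row is usually the only line that needs to be collected. The sketch below pulls such rows out of HPL output text with a regular expression; the names `HplResult` and `parse_hpl_results` are hypothetical helpers, and the pattern assumes each result row sits on its own line in the layout shown above (variant, N, NB, P, Q, Time, Gflops).

```python
import re
from typing import List, NamedTuple


class HplResult(NamedTuple):
    variant: str
    n: int
    nb: int
    p: int
    q: int
    seconds: float
    gflops: float


# Matches a result row such as:
#   WR00L2L2  79897  192  4  6  503.60  6.752e+02
_ROW = re.compile(
    r"^\s*(W\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?[eE][+-]?\d+)\s*$"
)


def parse_hpl_results(text: str) -> List[HplResult]:
    """Return every result row found in the given HPL output text."""
    results = []
    for line in text.splitlines():
        m = _ROW.match(line)
        if m:
            v, n, nb, p, q, t, g = m.groups()
            results.append(
                HplResult(v, int(n), int(nb), int(p), int(q), float(t), float(g))
            )
    return results


if __name__ == "__main__":
    sample = "WR00L2L2       79897   192     4     6     503.60     6.752e+02\n"
    for r in parse_hpl_results(sample):
        print(r.n, r.nb, r.p, r.q, r.seconds, r.gflops)
```

The same function can be applied to the whole-cluster log in Appendix 3, since its result row uses the same layout.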

5.4 Appendix 3: Output File of the Whole-Cluster Test

================================================================================
HPLinpack 2.1 -- High-Performance Linpack benchmark -- October 26, 2012
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 622119
NB     : 192
PMAP   : Row-major process mapping
P      : 36
Q      : 40
PFACT  : Left
NBMIN  : 2
NDIV   : 2
RFACT  : Left
BCAST  : 1ring
DEPTH  : 0
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
- The relative machine precision (eps) is taken to be 2.220446e-16
- Computational tests pass if scaled residuals are less than 16.0
Begin

The progress: 1.00%. Time passed: 386 s. Flops: 4.2209E+03. Eff:73.280243%
The progress: 2.00%. Time passed: 723 s. Flops: 4.4917E+03. Eff:77.981043%
The progress: 3.00%. Time passed: 957 s. Flops: 5.0728E+03. Eff:88.070026%
The progress: 4.00%. Time passed: 1590 s. Flops: 4.0572E+03. Eff:70.437101%
The progress: 5.00%. Time passed: 1985 s. Flops: 4.0484E+03. Eff:70.285580%
The progress: 6.00%. Time passed: 2201 s. Flops: 4.4312E+03. Eff:76.930814%
The progress: 7.00%. Time passed: 2574 s. Flops: 4.3961E+03. Eff:76.321390%
The progress: 8.00%. Time passed: 3189 s. Flops: 4.0349E+03. Eff:70.050086%
The progress: 9.00%. Time passed: 3557 s. Flops: 4.0899E+03. Eff:71.004616%
The progress: 10.00%. Time passed: 3762 s. Flops: 4.2735E+03. Eff:74.192123%
The progress: 11.00%. Time passed: 4395 s. Flops: 4.0347E+03. Eff:70.047632%
The progress: 12.00%. Time passed: 4799 s. Flops: 4.0375E+03. Eff:70.095303%
The progress: 13.00%. Time passed: 4988 s. Flops: 4.2114E+03. Eff:73.114608%
The progress: 14.00%. Time passed: 5356 s. Flops: 4.1990E+03. Eff:72.900048%
The progress: 15.00%. Time passed: 5968 s. Flops: 4.0376E+03. Eff:70.097557%
The progress: 16.00%. Time passed: 6278 s. Flops: 4.0921E+03. Eff:71.044051%
The progress: 17.00%. Time passed: 6580 s. Flops: 4.1646E+03. Eff:72.302241%
The progress: 18.00%. Time passed: 7178 s. Flops: 4.0361E+03. Eff:70.071222%
The progress: 19.00%. Time passed: 7556 s. Flops: 4.0401E+03. Eff:70.140166%
The progress: 20.00%. Time passed: 7770 s. Flops: 4.1439E+03. Eff:71.942911%
The progress: 21.00%. Time passed: 8350 s. Flops: 4.0393E+03. Eff:70.126933%
The progress: 22.00%. Time passed: 8750 s. Flops: 4.0425E+03. Eff:70.182604%
The progress: 23.00%. Time passed: 8959 s. Flops: 4.1301E+03. Eff:71.703087%
The progress: 24.00%. Time passed: 9550 s. Flops: 4.0436E+03. Eff:70.202250%
The progress: 25.00%. Time passed: 9943 s. Flops: 4.0449E+03. Eff:70.223218%
The progress: 26.00%. Time passed: 10142 s. Flops: 4.1220E+03. Eff:71.562112%
The progress: 27.00%. Time passed: 10718 s. Flops: 4.0472E+03. Eff:70.264323%
The progress: 28.00%. Time passed: 11133 s. Flops: 4.0471E+03. Eff:70.262544%
The progress: 29.00%. Time passed: 11323 s. Flops: 4.1156E+03. Eff:71.451109%
The progress: 30.00%. Time passed: 11905 s. Flops: 4.0528E+03. Eff:70.360571%
The progress: 31.00%. Time passed: 12300 s. Flops: 4.0553E+03. Eff:70.403838%
The progress: 32.00%. Time passed: 12485 s. Flops: 4.1154E+03. Eff:71.447473%
The progress: 33.00%. Time passed: 13079 s. Flops: 4.0596E+03. Eff:70.478672%
The progress: 34.00%. Time passed: 13460 s. Flops: 4.0623E+03. Eff:70.526893%
The progress: 35.00%. Time passed: 13665 s. Flops: 4.1162E+03. Eff:71.461524%
The progress: 36.00%. Time passed: 14215 s. Flops: 4.0662E+03. Eff:70.593114%
The progress: 37.00%. Time passed: 14546 s. Flops: 4.0868E+03. Eff:70.951995%
The progress: 38.00%. Time passed: 14833 s. Flops: 4.1176E+03. Eff:71.485686%
The progress: 39.00%. Time passed: 15389 s. Flops: 4.0735E+03. Eff:70.720721%
The progress: 40.00%. Time passed: 15609 s. Flops: 4.1182E+03. Eff:71.496658%
The progress: 41.00%. Time passed: 16151 s. Flops: 4.0776E+03. Eff:70.791861%
The progress: 42.00%. Time passed: 16547 s. Flops: 4.0805E+03. Eff:70.841887%
The progress: 43.00%. Time passed: 16766 s. Flops: 4.1191E+03. Eff:71.511959%
The progress: 44.00%. Time passed: 17319 s. Flops: 4.0814E+03. Eff:70.857103%
The progress: 45.00%. Time passed: 17586 s. Flops: 4.1107E+03. Eff:71.366020%
The progress: 46.00%. Time passed: 18092 s. Flops: 4.0834E+03. Eff:70.891970%
The progress: 47.00%. Time passed: 18488 s. Flops: 4.0859E+03. Eff:70.936252%
The progress: 48.00%. Time passed: 18721 s. Flops: 4.1177E+03. Eff:71.487722%
The progress: 49.00%. Time passed: 19264 s. Flops: 4.0859E+03. Eff:70.935060%
The progress: 50.00%. Time passed: 19508 s. Flops: 4.1168E+03. Eff:71.473072%
The progress: 51.00%. Time passed: 20036 s. Flops: 4.0872E+03. Eff:70.959031%
The progress: 52.00%. Time passed: 20282 s. Flops: 4.1191E+03. Eff:71.511301%
The progress: 53.00%. Time passed: 20515 s. Flops: 4.1516E+03. Eff:72.076304%
The progress: 54.00%. Time passed: 20887 s. Flops: 4.1545E+03. Eff:72.126084%
The progress: 55.00%. Time passed: 21151 s. Flops: 4.1774E+03. Eff:72.523828%
The progress: 56.00%. Time passed: 21668 s. Flops: 4.1496E+03. Eff:72.042190%
The progress: 57.00%. Time passed: 21925 s. Flops: 4.1749E+03. Eff:72.481023%
The progress: 58.00%. Time passed: 22437 s. Flops: 4.1508E+03. Eff:72.061720%
The progress: 59.00%. Time passed: 22708 s. Flops: 4.1740E+03. Eff:72.465395%
The progress: 60.00%. Time passed: 23225 s. Flops: 4.1511E+03. Eff:72.067448%
The progress: 61.00%. Time passed: 23485 s. Flops: 4.1732E+03. Eff:72.451336%
The progress: 62.00%. Time passed: 23985 s. Flops: 4.1517E+03. Eff:72.078698%
The progress: 63.00%. Time passed: 24258 s. Flops: 4.1719E+03. Eff:72.428210%
The progress: 64.00%. Time passed: 24765 s. Flops: 4.1508E+03. Eff:72.062001%
The progress: 65.00%. Time passed: 25047 s. Flops: 4.1694E+03. Eff:72.385856%
The progress: 66.00%. Time passed: 25534 s. Flops: 4.1500E+03. Eff:72.048353%
The progress: 67.00%. Time passed: 25934 s. Flops: 4.1495E+03. Eff:72.040222%
The progress: 68.00%. Time passed: 26260 s. Flops: 4.1595E+03. Eff:72.213162%
The progress: 69.00%. Time passed: 26711 s. Flops: 4.1484E+03. Eff:72.021570%
The progress: 70.00%. Time passed: 27000 s. Flops: 4.1638E+03. Eff:72.289058%
The progress: 71.00%. Time passed: 27490 s. Flops: 4.1471E+03. Eff:71.998076%
The progress: 72.00%. Time passed: 27783 s. Flops: 4.1612E+03. Eff:72.243704%
The progress: 73.00%. Time passed: 28199 s. Flops: 4.1577E+03. Eff:72.183158%
The progress: 74.00%. Time passed: 28675 s. Flops: 4.1443E+03. Eff:71.949203%
The progress: 75.00%. Time passed: 28979 s. Flops: 4.1564E+03. Eff:72.160202%
The progress: 76.00%. Time passed: 29373 s. Flops: 4.1541E+03. Eff:72.120198%
The progress: 77.00%. Time passed: 29851 s. Flops: 4.1425E+03. Eff:71.919145%
The progress: 78.00%. Time passed: 30239 s. Flops: 4.1421E+03. Eff:71.910777%
The progress: 79.00%. Time passed: 30551 s. Flops: 4.1521E+03. Eff:72.084866%
The progress: 80.00%. Time passed: 30961 s. Flops: 4.1488E+03. Eff:72.027715%
The progress: 81.00%. Time passed: 31384 s. Flops: 4.1438E+03. Eff:71.940907%
The progress: 82.00%. Time passed: 31692 s. Flops: 4.1552E+03. Eff:72.139471%
The progress: 83.00%. Time passed: 32076 s. Flops: 4.1547E+03. Eff:72.130440%
The progress: 84.00%. Time passed: 32465 s. Flops: 4.1545E+03. Eff:72.125969%
The progress: 85.00%. Time passed: 32794 s. Flops: 4.1624E+03. Eff:72.263809%
The progress: 86.00%. Time passed: 33179 s. Flops: 4.1622E+03. Eff:72.260864%
The progress: 87.00%. Time passed: 33566 s. Flops: 4.1619E+03. Eff:72.255680%
The progress: 88.00%. Time passed: 34004 s. Flops: 4.1552E+03. Eff:72.139317%
The progress: 89.00%. Time passed: 34390 s. Flops: 4.1555E+03. Eff:72.144300%
The progress: 90.00%. Time passed: 34770 s. Flops: 4.1557E+03. Eff:72.147207%
The progress: 91.00%. Time passed: 35109 s. Flops: 4.1613E+03. Eff:72.244951%
The progress: 92.00%. Time passed: 35534 s. Flops: 4.1567E+03. Eff:72.165472%
The progress: 93.00%. Time passed: 35887 s. Flops: 4.1607E+03. Eff:72.233757%
The progress: 94.00%. Time passed: 36296 s. Flops: 4.1580E+03. Eff:72.187477%
The progress: 95.00%. Time passed: 36675 s. Flops: 4.1588E+03. Eff:72.201712%
The progress: 96.00%. Time passed: 37056 s. Flops: 4.1595E+03. Eff:72.212970%
The progress: 97.00%. Time passed: 37412 s. Flops: 4.1629E+03. Eff:72.272460%
The progress: 98.00%. Time passed: 37805 s. Flops: 4.1618E+03. Eff:72.252856%
The progress: 99.00%. Time passed: 38179 s. Flops: 4.1631E+03. Eff:72.275566%
Finished. Time passed: 38580 seconds.

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR00L2L2      622119   192    36    40           38596.64              4.159e+03
HPL_pdgesv() start time Fri Apr 22 16:49:43 2016

HPL_pdgesv() end time   Sat Apr 23 03:33:00 2016

--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0005241 ...... PASSED
================================================================================

Finished 1 tests with the following results:
    1 tests completed and passed residual checks,
    0 tests completed and failed residual checks,
    0 tests skipped because of illegal input values.

--------------------------------------------------------------------------------

End of Tests.

==============================================================================
