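python – Using an operator to reduce lists in mpi4py, adding them element by element

I am writing an MPI program in Python. For example, four processes hold the following data: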

data on procs0: [1, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0]
data on procs1: [0, 0, 0, 4, 5, 6, 0, 0, 0, 0, 0, 0]
data on procs2: [0, 0, 0, 0, 0, 0, 7, 8, 9, 0, 0, 0]
data on procs3: [0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 11, 12]

I want to use the reduce function from the mpi4py library to reduce the data onto procs0, with the following result:

result on procs0: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

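How can I do this with the mpi4py library functions?

Edit:
The above is just a simple special case, and a set cannot be used here. Please see another case below: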

data on procs0: [1,0,0,0,0,0]
data on procs1: [0,2,0,0,0,0]
data on procs2: [0,0,0,3,0,0]
data on procs3: [0,0,0,0,4,5]

The desired result must be:

result on procs0: [1,2,0,3,4,5]

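Solution:

I'm not sure whether your problem needs the sum of the data or the maximum of the data. I wrote a simple example using the MPI Reduce function that computes the sum.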

#!/usr/bin/env python
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD

comm.Barrier()
t_start = MPI.Wtime()

# this array lives on each processor
data = np.zeros(5)
for i in range(comm.rank, len(data), comm.size):
    # set data in each array that is different for each processor
    data[i] = i

# print out the data arrays for each processor
print('[%i]' % comm.rank, data)
comm.Barrier()

# the 'totals' array will hold the sum of each 'data' array
if comm.rank == 0:
    # only processor 0 will actually get the data
    totals = np.zeros_like(data)
else:
    totals = None

# use MPI to get the totals
comm.Reduce(
    [data, MPI.DOUBLE],
    [totals, MPI.DOUBLE],
    op = MPI.SUM,
    root = 0
)

# print out the 'totals'
# only processor 0 actually has the data
print('[%i]' % comm.rank, totals)

comm.Barrier()
t_diff = MPI.Wtime() - t_start
if comm.rank == 0:
    print(t_diff)

Save this code in a file named reduce_test.py and run it with the command mpirun -np 3 ./reduce_test.py. On my machine it prints the following:

[0] [ 0.  0.  0.  3.  0.]
[1] [ 0.  1.  0.  0.  4.]
[2] [ 0.  0.  2.  0.  0.]
[1] None
[2] None
[0] [ 0.  1.  2.  3.  4.]
0.00260496139526

Note that changing the argument op = MPI.SUM to op = MPI.MAX in the call to comm.Reduce computes the element-wise maximum instead of the sum.
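To see what an element-wise reduction does to the data from the question's edit, here is a minimal sketch that imitates the reduction in plain Python, without MPI. It is only an illustration: the helper names elementwise and simulate_reduce are hypothetical and not part of mpi4py, and each row of per_rank_data stands in for the list held by one process.

```python
from functools import reduce
import operator

# One row per process, taken from the second case in the question.
per_rank_data = [
    [1, 0, 0, 0, 0, 0],  # procs0
    [0, 2, 0, 0, 0, 0],  # procs1
    [0, 0, 0, 3, 0, 0],  # procs2
    [0, 0, 0, 0, 4, 5],  # procs3
]


def elementwise(op, a, b):
    """Combine two equal-length lists position by position (hypothetical helper)."""
    return [op(x, y) for x, y in zip(a, b)]


def simulate_reduce(rows, op):
    """Fold all per-process lists into one list, as the root would see it."""
    return reduce(lambda a, b: elementwise(op, a, b), rows)


summed = simulate_reduce(per_rank_data, operator.add)  # plays the role of MPI.SUM
largest = simulate_reduce(per_rank_data, max)          # plays the role of MPI.MAX

print(summed)
print(largest)
```

For this particular data both operations give [1, 2, 0, 3, 4, 5], because every position is non-zero on at most one process and no value is negative. If the non-zero entries overlapped between processes, the sum and the maximum would differ.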
