
Parallel Summation using MPI in Python with mpi4py
Parallel summation distributes the task of summing a large set of numbers across multiple processes or computing nodes, so the partial results are computed and aggregated simultaneously. Each process handles a portion of the data, performs a local summation, and then communicates its partial sum to a designated root process. The root process collects and combines these partial sums to compute the global sum, leveraging parallelism to accelerate the computation and handle large data sets efficiently. In this tutorial, we implement parallel summation using MPI in Python with mpi4py, demonstrating how MPI coordinates communication among processes to achieve efficient parallel computation.
Code Example
from mpi4py import MPI
comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()
# Define the total number of elements
N = 100
# Calculate the number of elements to be handled by each process
chunk_size = N // size
start = rank * chunk_size
end = start + chunk_size
# Perform local summation for each process
local_sum = sum(range(start + 1, end + 1))
print("Process", rank, "local sum is:", local_sum)
# Gather all local sums to the root process (rank 0)
global_sum = comm.reduce(local_sum, op=MPI.SUM, root=0)
# Print the result at the root process
if rank == 0:
    print("Global sum:", global_sum)
Explanation
Import MPI Module and Initialize MPI Environment
from mpi4py import MPI

This line imports the MPI module from the mpi4py package, enabling the use of MPI functionalities.
comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

These lines initialize the MPI environment. MPI.COMM_WORLD is a predefined communicator object representing all processes in the MPI world. comm.Get_size() returns the number of processes in the communicator, and comm.Get_rank() returns the rank of the current process within it.
Define the Total Number of Elements
N = 100

This line defines the total number of elements to be processed.
Calculate Chunk Size and Local Range
chunk_size = N // size
start = rank * chunk_size
end = start + chunk_size

These lines calculate the number of elements to be handled by each process (chunk_size, using integer division so the result is a valid range bound) and determine the start and end indices of the local range of elements to be processed by the current process.
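To see what these index calculations produce, here is a plain-Python sketch (no MPI needed) that lists each rank's range, assuming the tutorial's N = 100 and an example run with 4 processes:

```python
# Illustrative only: the per-rank ranges for N = 100 split across
# size = 4 processes (size is assumed for this example).
N = 100
size = 4

ranges = []
for rank in range(size):
    chunk_size = N // size       # 25 elements per process
    start = rank * chunk_size    # 0, 25, 50, 75
    end = start + chunk_size     # 25, 50, 75, 100
    ranges.append((start, end))

print(ranges)  # [(0, 25), (25, 50), (50, 75), (75, 100)]
```

Each rank gets a disjoint, contiguous block, and the blocks together cover the full index range 0..N.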
Perform Local Summation
local_sum = sum(range(start + 1, end + 1))

This line performs the local summation of elements for each process. It creates a range of numbers from start + 1 through end (Python's range excludes its upper bound, so we pass end + 1) and sums these numbers with the built-in sum function. Across all ranks, these ranges together cover the numbers 1 through N exactly once.
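As a sanity check, the partial sums from all ranks should add up to the closed-form total 1 + 2 + ... + N = N * (N + 1) / 2. This serial sketch (again assuming 4 processes) reproduces each rank's local sum:

```python
# Sketch: the per-rank local sums for N = 100 and size = 4
# (size is assumed), verified against the closed-form total.
N, size = 100, 4
chunk_size = N // size

local_sums = []
for rank in range(size):
    start = rank * chunk_size
    end = start + chunk_size
    local_sums.append(sum(range(start + 1, end + 1)))

print(local_sums)           # [325, 950, 1575, 2200]
print(sum(local_sums))      # 5050, i.e. 100 * 101 // 2
```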
Gather Local Sums to Root Process
global_sum = comm.reduce(local_sum, op=MPI.SUM, root=0)

This line gathers the local sums from every process and performs a reduction operation, combining them with the MPI sum operation (MPI.SUM). The resulting global sum is only available at the root process (rank 0); on all other ranks, comm.reduce returns None.
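Conceptually, what MPI.SUM does at the root is fold all the partial sums together with addition. This can be mimicked serially with Python's functools.reduce (the partial sums below are the assumed values from a 4-process run):

```python
from functools import reduce
import operator

# Conceptual model of comm.reduce(local_sum, op=MPI.SUM, root=0):
# the root combines every rank's partial sum with addition.
partial_sums = [325, 950, 1575, 2200]  # assumed values from 4 ranks
global_sum = reduce(operator.add, partial_sums)
print(global_sum)  # 5050
```

In a real run, MPI performs this combination across processes (often in a tree pattern) rather than in a single loop, but the result is the same.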
Print Result at Root Process
if rank == 0:
    print("Global sum:", global_sum)

This conditional statement ensures that only the root process (rank 0) prints the final result; the other processes contribute their local sums but do not print anything.
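One limitation of the chunking above is that integer division drops elements whenever N is not a multiple of the process count. A common fix, sketched here as plain Python (this helper is not part of the tutorial's code), is to give the first N % size ranks one extra element:

```python
# Sketch: a load-balanced split that loses no elements when
# N is not divisible by size. local_range is a hypothetical helper.
def local_range(rank, size, n):
    base, extra = divmod(n, size)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return start, end

N, size = 10, 4
ranges = [local_range(r, size, N) for r in range(size)]
print(ranges)  # [(0, 3), (3, 6), (6, 8), (8, 10)]

# Every element is covered exactly once.
assert sum(end - start for start, end in ranges) == N
```

Each rank would then sum range(start + 1, end + 1) over its own range exactly as before.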