{"id":3169,"date":"2024-05-07T18:41:44","date_gmt":"2024-05-07T13:41:44","guid":{"rendered":"https:\/\/afzalbadshah.com\/?p=3169"},"modified":"2024-05-08T22:14:46","modified_gmt":"2024-05-08T17:14:46","slug":"parallel-summation-using-mpi-in-python-with-mpi4py","status":"publish","type":"post","link":"https:\/\/afzalbadshah.com\/index.php\/2024\/05\/07\/parallel-summation-using-mpi-in-python-with-mpi4py\/","title":{"rendered":"Parallel Summation using MPI in Python with mpi4py"},"content":{"rendered":"\n<p>Parallel summation involves distributing the task of summing a large set of numbers across multiple processors or computing nodes, enabling simultaneous computation and aggregation of partial results. Each processor handles a portion of the data, performs local summation, and then communicates its partial sum to a designated root processor. The root processor collects and combines these partial sums to compute the global sum, thereby leveraging parallelism to accelerate the computation process and efficiently handle large-scale data sets. In this tutorial, we explore the implementation of parallel summation using MPI in Python, demonstrating how MPI facilitates communication and coordination among processes to achieve efficient parallel computation. <a href=\"https:\/\/afzalbadshah.com\/index.php\/category\/courses\/mpi-with-python\/\" target=\"_blank\" rel=\"noopener\" title=\"\">Visit the detailed tutorial on MPI in Python here. 
<\/a><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Code Example<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>from mpi4py import MPI\n\ncomm = MPI.COMM_WORLD\nsize = comm.Get_size()\nrank = comm.Get_rank()\n\n# Define the total number of elements\nN = 100\n\n# Calculate the number of elements to be handled by each process\nchunk_size = N \/\/ size\nstart = rank * chunk_size\nend = start + chunk_size\n\n# Perform local summation for each process\nlocal_sum = sum(range(start + 1, end + 1))\n\nprint(\"The processor\", rank, \"local sum is:\", local_sum)\n\n# Reduce all local sums to the global sum at the root process (rank 0)\nglobal_sum = comm.reduce(local_sum, op=MPI.SUM, root=0)\n\n# Print the result at the root process\nif rank == 0:\n    print(\"Global sum:\", global_sum)<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" width=\"640\" height=\"189\" src=\"https:\/\/i0.wp.com\/afzalbadshah.com\/wp-content\/uploads\/2024\/05\/image.png?resize=640%2C189&#038;ssl=1\" alt=\"\" class=\"wp-image-3196\" srcset=\"https:\/\/i0.wp.com\/afzalbadshah.com\/wp-content\/uploads\/2024\/05\/image.png?w=667&amp;ssl=1 667w, https:\/\/i0.wp.com\/afzalbadshah.com\/wp-content\/uploads\/2024\/05\/image.png?resize=300%2C89&amp;ssl=1 300w, https:\/\/i0.wp.com\/afzalbadshah.com\/wp-content\/uploads\/2024\/05\/image.png?resize=604%2C178&amp;ssl=1 604w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><figcaption class=\"wp-element-caption\">Output of the program<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Explanation<\/h3>\n\n\n\n<p><strong>Import MPI Module and Initialize MPI Environment<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from mpi4py import MPI<\/code><\/pre>\n\n\n\n<p>This line imports the MPI module from the mpi4py package, enabling the use of MPI functionalities.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>comm = MPI.COMM_WORLD\nsize = comm.Get_size()\nrank = comm.Get_rank()<\/code><\/pre>\n\n\n\n<p>These 
lines initialize the MPI environment. <code>MPI.COMM_WORLD<\/code> is the predefined communicator that contains all processes launched with the program. <code>comm.Get_size()<\/code> returns the number of processes in the communicator, and <code>comm.Get_rank()<\/code> returns the rank of the current process in the communicator.<\/p>\n\n\n\n<p><strong>Define the Total Number of Elements<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>N = 100<\/code><\/pre>\n\n\n\n<p>This line defines the total number of elements to be processed.<\/p>\n\n\n\n<p><strong>Calculate Chunk Size and Local Range<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>chunk_size = N \/\/ size\nstart = rank * chunk_size\nend = start + chunk_size<\/code><\/pre>\n\n\n\n<p>These lines calculate the number of elements to be handled by each process (<code>chunk_size<\/code>) and determine the start and end indices of the local range of elements to be processed by the current process. Note that integer (floor) division <code>\/\/<\/code> is used, matching the full program above, so the indices stay whole numbers; in this simple scheme, any remainder left when <code>N<\/code> is not evenly divisible by <code>size<\/code> goes unprocessed.<\/p>\n\n\n\n<p><strong>Perform Local Summation<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>local_sum = sum(range(start + 1, end + 1))<\/code><\/pre>\n\n\n\n<p>This line performs the local summation of elements for each process. It creates a range of numbers from <code>start + 1<\/code> up to and including <code>end<\/code> (<code>start + 1<\/code> because the values being summed run from 1 to <code>N<\/code> while the indices are zero-based, and <code>end + 1<\/code> because <code>range<\/code> excludes its upper bound), and then sums these numbers using the built-in <code>sum<\/code> function. Taken together, the ranks cover the numbers 1 through <code>N<\/code> exactly once.<\/p>\n\n\n\n<p><strong>Reduce Local Sums to the Root Process<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>global_sum = comm.reduce(local_sum, op=MPI.SUM, root=0)<\/code><\/pre>\n\n\n\n<p>This line combines the local sums from all processes in a single reduction operation. The local sums are reduced using the MPI sum operation (<code>MPI.SUM<\/code>). 
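<\/p>\n\n\n\n<p>As a sketch of an equivalent approach (not part of the original program), the partial sums can instead be collected with <code>comm.gather<\/code>, which returns a list of all local sums on the root process, and added there explicitly:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Alternative: gather the local sums into a list on the root process\npartial_sums = comm.gather(local_sum, root=0)\n\n# Only the root process receives the list; all other ranks get None\nif rank == 0:\n    print(\"Global sum:\", sum(partial_sums))<\/code><\/pre>\n\n\n\n<p><code>comm.reduce<\/code> is generally preferable here because the addition happens inside the MPI library, but <code>gather<\/code> makes the combination step visible.<\/p>\n\n\n\n<p>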
The resulting global sum is stored in the variable <code>global_sum<\/code> and is only computed by the root process (rank 0).<\/p>\n\n\n\n<p><strong>Print Result at Root Process<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>if rank == 0:\n    print(\"Global sum:\", global_sum)<\/code><\/pre>\n\n\n\n<p>This conditional statement ensures that only the root process (rank 0) prints the final result. Other processes contribute their local sums but do not print the result.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"MPI (MPI4PY) on Jupyter Tutorial\" width=\"640\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/videoseries?list=PLGiqyN7d0mypGRrq2Bv9mXLGoeZPXTcZA\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Material<\/h2>\n\n\n\n<p><a href=\"https:\/\/drive.google.com\/drive\/folders\/1vKaZIsBGLhzew2DKmmYfMCSrPAo4jXe-?usp=sharing\" target=\"_blank\" rel=\"noopener\" title=\"\">Download the example programs (code) covering mpi4py.<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Parallel summation involves distributing the task of summing a large set of numbers across multiple processors or computing nodes, enabling simultaneous computation and aggregation of partial results. Each processor handles a portion of the data, performs local summation, and then communicates its partial sum to a designated root processor. The root processor collects and combines these partial sums to compute the global sum, thereby leveraging parallelism to accelerate the computation process and efficiently handle large-scale data sets. 
In this tutorial,&#8230;<\/p>\n<p class=\"read-more\"><a class=\"btn btn-default\" href=\"https:\/\/afzalbadshah.com\/index.php\/2024\/05\/07\/parallel-summation-using-mpi-in-python-with-mpi4py\/\"> Read More<span class=\"screen-reader-text\">  Read More<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":3172,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"enabled":false},"version":2}},"categories":[506],"tags":[540,543,576],"class_list":["post-3169","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mpi-with-python","tag-mpi","tag-mpi4py","tag-parallel-sumission"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/afzalbadshah.com\/wp-content\/uploads\/2024\/05\/MPI-Python-4-jpg.webp?fit=1280%2C720&ssl=1","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/pf3emP-P7","jetpack-related-posts":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/posts\/3169","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/afzalbadshah.com\/i
ndex.php\/wp-json\/wp\/v2\/comments?post=3169"}],"version-history":[{"count":7,"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/posts\/3169\/revisions"}],"predecessor-version":[{"id":3198,"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/posts\/3169\/revisions\/3198"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/media\/3172"}],"wp:attachment":[{"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/media?parent=3169"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/categories?post=3169"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/afzalbadshah.com\/index.php\/wp-json\/wp\/v2\/tags?post=3169"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}