
How to allow MPI code to run in serial mode with a single CPU core?


#1 | May 28, 2022, 14:27
aerosayan (Sayan Bhattacharjee), Senior Member
Hello everyone,

Looking at NASA's MPI codes like CFL3D, I'm repulsed by how much code is conditionally excluded when the code is compiled without MPI. They did this so the same source can be built to run on a single core (in serial mode) or on multiple cores with MPI.

Here's an example: https://github.com/nasa/CFL3D/blob/9...lk.F#L907-L915

Here's another example: https://github.com/nasa/CFL3D/blob/9...lk.F#L737-L748
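
The pattern looks roughly like this (my own sketch of the idiom, not CFL3D's actual lines; the DIST_MPI macro and variable names are illustrative):

Code:
#ifdef DIST_MPI
      ! this halo exchange exists only in the MPI build
      call MPI_Send(qbuf, nbuf, MPI_DOUBLE_PRECISION, idest, itag, &
                    mycomm, ierr)
#endif

Every such guard means the serial and MPI builds execute different source.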


I think this is a horrible coding practice: compiling certain sections of code only when using MPI, and leaving them out when building for serial execution.

This invites nasty bugs, and it increases the complexity for anyone trying to understand the code and maintain it in the future.

I would like to write my code in a way that would allow me to compile with MPI, but if necessary, only use a single core for executing the whole code. This is to help in debugging, as debugging serial code is far easier than debugging distributed code.

Is there a safe and easy way to do this? It's absolutely necessary.

Stack Overflow recommends wrapping code in if (rank == 0) so it only executes on the first rank, but that doesn't work for me.

I've thought of one solution, but don't know if it will work correctly. Kindly let me know if it will work.

Potential Solution: run the code in pseudo-serial mode. Launch the MPI code with 2 cores. Use core-1 only to manage MPI calls, and push all the work to core-2. From core-2's point of view everything executes serially, so it will be easier to debug.
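
Something like this minimal sketch is what I have in mind (assuming Fortran + MPI; the manager/worker split and names are mine):

Code:
program pseudo_serial
   use mpi
   implicit none
   integer :: ierr, rank
   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   if (rank == 0) then
      ! core-1: only coordinates MPI-level bookkeeping, no numerics
      print *, 'rank 0: manager'
   else
      ! core-2: all solver work runs here, serially from its own view
      print *, 'rank 1: worker, attach the debugger here'
   end if
   call MPI_Finalize(ierr)
end program pseudo_serial

Launched with mpirun -np 2, a debugger could then attach to the worker process alone.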

Will this work?

Thanks
~sayan

#2 | May 28, 2022, 18:31
sbaffini (Paolo Lampitella), Senior Member
When you find codes written that way, the reason is not running on a single core, but being able to compile without MPI at all. There might be a number of reasons for wanting that, none of which I fully understand, so I don't do it in my codes, also considering how little relevance the non-parallel executable would have.

MPI has a standard, and it requires all the function calls to do the obvious thing when they are called by a single process.

In the vast majority of cases, things can be written to fit both the serial and the parallel case. For a few things it really makes more sense to do otherwise. For example, if you are running serially, you don't really want to call a mesh partitioner at all, but you might still do it if you prefer ultra-clean code over micro-optimizations.

Of course, it also depends on the code, the algorithm, etc., but I tend to believe that there is always a way to rearrange things to work for any number of processes.
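
To make the single-process point concrete: a residual reduction written once does the obvious thing for any number of processes; with one process the Allreduce just copies the local value (a minimal sketch, names are mine):

Code:
program np1_demo
   use mpi
   implicit none
   integer :: ierr, nprocs
   real(8) :: res_local, res_global
   call MPI_Init(ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
   res_local = 1.0d-6
   ! with nprocs == 1 the sum runs over one value: res_global == res_local
   call MPI_Allreduce(res_local, res_global, 1, MPI_DOUBLE_PRECISION, &
                      MPI_SUM, MPI_COMM_WORLD, ierr)
   print *, 'nprocs =', nprocs, ', residual =', res_global
   call MPI_Finalize(ierr)
end program np1_demo

The same executable runs under mpirun -np 1 or -np 100 without any #ifdef.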
aerosayan likes this.

#3 | May 28, 2022, 18:42
sbaffini (Paolo Lampitella), Senior Member
On second thought, I simply have no idea why some codes have conditional MPI compilation, and I know nothing about CFL3D. What I know is that it is not required at all; MPI just works consistently with 1 process as well.
aerosayan likes this.

#4 | May 29, 2022, 07:37
arjun (Arjun), Senior Member
Quote:
Originally Posted by sbaffini
On second thought, I simply have no idea why some codes have conditional MPI compilation, and I know nothing about CFL3D. What I know is that it is not required at all; MPI just works consistently with 1 process as well.

Sometimes MPI is not available; in that case, one should still be able to run without mpirun. That is the only possible explanation.
sbaffini likes this.

#5 | May 29, 2022, 07:44
sbaffini (Paolo Lampitella), Senior Member
Quote:
Originally Posted by arjun
Sometimes MPI is not available; in that case, one should still be able to run without mpirun. That is the only possible explanation.
I used to think so as well, and that's what I wrote in my first answer. But I was wondering: is it really relevant for a CFD code today to exist without MPI? I mean, the flexibility of doing this comes, in my opinion, with very little advantage (if any at all) and a rather unreasonable cost in code complexity.

In the case of CFL3D, we are certainly speaking of old code, so it is reasonable to think that back then MPI availability might have been an issue. But I see this also in modern code, and I have no reasonable explanation for that.
arjun and aerosayan like this.

#6 | May 29, 2022, 09:47
andy_ (andy), Senior Member
I have not looked at the NASA code you cite, but I have written and maintained serial and parallel CFD codes.

In order to maximise efficiency, a serial code and a parallel code will often use different data structures and different algorithms. Likewise in parallel for shared memory and distributed memory, and even in some cases for small numbers of processors versus large numbers of processors with distributed memory.

If you are writing a research code where the code is being changed a lot and serial code efficiency isn't vital, then there can be a case for uniform data structures and algorithms, using the distributed data structures and algorithms on a single processor. I have done this in the past by providing a dummy MPI library to link against, with routines and functions throwing an error or warning if called.
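
Something along these lines (a minimal sketch; the routine set and the loose buffer typing are illustrative, one would stub out whatever the code actually calls):

Code:
! mpistub.f90: dummy "MPI" routines for a serial build; link this file
! instead of the real library and run without mpirun.
subroutine MPI_Init(ierr)
   implicit none
   integer, intent(out) :: ierr
   ierr = 0
end subroutine

subroutine MPI_Finalize(ierr)
   implicit none
   integer, intent(out) :: ierr
   ierr = 0
end subroutine

subroutine MPI_Comm_rank(comm, rank, ierr)
   implicit none
   integer, intent(in)  :: comm
   integer, intent(out) :: rank, ierr
   rank = 0; ierr = 0
end subroutine

subroutine MPI_Comm_size(comm, nprocs, ierr)
   implicit none
   integer, intent(in)  :: comm
   integer, intent(out) :: nprocs, ierr
   nprocs = 1; ierr = 0
end subroutine

! anything implying a second process must never be reached in a serial run
subroutine MPI_Send(buf, n, datatype, dest, tag, comm, ierr)
   implicit none
   integer :: buf(*), n, datatype, dest, tag, comm, ierr
   stop 'dummy MPI_Send called: this path should not execute serially'
end subroutine

The constants (MPI_COMM_WORLD and friends) would come from a matching stub mpif.h.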
aerosayan likes this.

#7 | May 29, 2022, 14:15
aerosayan (Sayan Bhattacharjee), Senior Member
Quote:
Originally Posted by sbaffini
I used to think so as well, and that's what I wrote in my first answer. But I was wondering: is it really relevant for a CFD code today to exist without MPI? I mean, the flexibility of doing this comes, in my opinion, with very little advantage (if any at all) and a rather unreasonable cost in code complexity.

Exactly what I think too. Serial code doesn't have any realistic use for industrial 3D CFD solvers. The only benefit of serial code is that it helps us develop and debug more easily. Even then, it's debatable ... we can definitely debug MPI codes too, so it might be better to just bite the bullet and debug the whole parallelized code, instead of increasing complexity by adding a single-threaded execution mode.
sbaffini likes this.

#8 | May 29, 2022, 17:42
sbaffini (Paolo Lampitella), Senior Member
Quote:
Originally Posted by andy_
I have not looked at the NASA code you cite, but I have written and maintained serial and parallel CFD codes.

In order to maximise efficiency, a serial code and a parallel code will often use different data structures and different algorithms. Likewise in parallel for shared memory and distributed memory, and even in some cases for small numbers of processors versus large numbers of processors with distributed memory.

If you are writing a research code where the code is being changed a lot and serial code efficiency isn't vital, then there can be a case for uniform data structures and algorithms, using the distributed data structures and algorithms on a single processor. I have done this in the past by providing a dummy MPI library to link against, with routines and functions throwing an error or warning if called.
I see what you mean, but have you ever been in a situation where you thought, "wow, thank god I have this serial algorithm, otherwise I couldn't solve this"?

Also, what kind of systems don't have MPI but still need to run a CFD code?

#9 | May 30, 2022, 04:30
LuckyTran (Lucky), Senior Member
Fluent, OpenFOAM, and STAR-CCM+ all use this approach of launching a master process and submitting slave processes via MPI, and it requires absolutely no change to your already-parallel code.

We don't even make single-core hardware anymore; even the dumbest smartphone has a quad-core or hexa-core processor. It would be asinine in this day and age to write truly serial code that can only be compiled for a single-core environment. Let the job scheduler do the job scheduling; you just write your parallel code and stop optimizing for an environment that doesn't need optimizing. If you are running CFD (or any code, for that matter) in a serial environment, speed is obviously not the issue. "If you judge a fish by its ability to climb a tree..."
sbaffini and aerosayan like this.

#10 | May 30, 2022, 13:16
andy_ (andy), Senior Member
Quote:
Originally Posted by sbaffini
I see what you mean, but have you ever been in a situation where you thought, "wow, thank god I have this serial algorithm, otherwise I couldn't solve this"?

Also, what kind of systems don't have MPI but still need to run a CFD code?
Not depending on particular parallel libraries was good practice in the 80s and 90s, when there were various libraries around and some hardware worked best with its own proprietary library or libraries (and even language, in the case of the transputer and Occam). They all worked pretty much the same at the lowest level. I think one of my research CFD codes supported 5 different distributed parallel libraries at one time. The NASA code may date from this time.

It depends what you classify as serial code. CFD codes designed to run on local workstations rather than remote clusters are likely to be using few enough cores that a serial/shared-memory model may be preferable. Also, industry tends to perform quite a lot of modest-sized parametric studies, often in 2D, which are normally run most efficiently as one task per processor with no inter-process communication.

But if one opts to support only one set of data structures and algorithms for serial, shared memory, or distributed memory for a 3D CFD code, then I agree it would obviously be distributed memory. The overhead when running on local workstations or running sets of independent jobs one per processor is unlikely to be large. The loss of shared memory is unlikely to matter much in most circumstances, though it can be significantly quicker to write and develop a complicated code for shared memory on a workstation.
sbaffini and aerosayan like this.

#11 | May 30, 2022, 13:25
sbaffini (Paolo Lampitella), Senior Member
Quote:
Originally Posted by andy_
Not depending on particular parallel libraries was good practice in the 80s and 90s, when there were various libraries around and some hardware worked best with its own proprietary library or libraries (and even language, in the case of the transputer and Occam). They all worked pretty much the same at the lowest level. I think one of my research CFD codes supported 5 different distributed parallel libraries at one time. The NASA code may date from this time.

It depends what you classify as serial code. CFD codes designed to run on local workstations rather than remote clusters are likely to be using few enough cores that a serial/shared-memory model may be preferable. Also, industry tends to perform quite a lot of modest-sized parametric studies, often in 2D, which are normally run most efficiently as one task per processor with no inter-process communication.

But if one opts to support only one set of data structures and algorithms for serial, shared memory, or distributed memory for a 3D CFD code, then I agree it would obviously be distributed memory. The overhead when running on local workstations or running sets of independent jobs one per processor is unlikely to be large. The loss of shared memory is unlikely to matter much in most circumstances, though it can be significantly quicker to write and develop a complicated code for shared memory on a workstation.
Ok, I see, it makes sense. I was indeed implicitly reasoning under the very important assumption that one would use MPI for shared memory as well, which is what I do because I am bad at OpenMP and don't have time for both programming paradigms; but of course that is not the only way.
aerosayan likes this.

#12 | May 30, 2022, 13:34
aerosayan (Sayan Bhattacharjee), Senior Member
Quote:
Originally Posted by sbaffini
Ok, I see, it makes sense. I was indeed implicitly reasoning under the very important assumption that one would use MPI for shared memory as well, which is what I do because I am bad at OpenMP and don't have time for both programming paradigms; but of course that is not the only way.
How do I do shared-memory operations with MPI? I need this, and I don't want to mix OpenMP into my MPI code.

Resources appreciated. Thanks!

#13 | May 30, 2022, 13:49
sbaffini (Paolo Lampitella), Senior Member
Quote:
Originally Posted by aerosayan
How do I do shared-memory operations with MPI? I need this, and I don't want to mix OpenMP into my MPI code.

Resources appreciated. Thanks!
You don't; it is MPI that just works in the shared-memory case as well. You may or may not have a dedicated implementation in the MPI library that handles the case more efficiently for you, but that's not required. Honestly, considering how difficult well-done OpenMP is, for me it was a no-brainer.

What MPI does have, though you should use them with care, are one-sided operations, where one process accesses the memory of another (provided it was previously exposed for the task) without the latter being actively involved.
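
A minimal fence-synchronized sketch of the idea (names are mine; run it with 2 processes):

Code:
program rma_demo
   use mpi
   implicit none
   integer :: ierr, rank, win
   integer(kind=MPI_ADDRESS_KIND) :: winsize, disp
   real(8) :: expose(1), got(1)

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

   expose(1) = 100.0d0 + rank        ! each rank exposes one number
   winsize = 8                       ! bytes in the window
   call MPI_Win_create(expose, winsize, 8, MPI_INFO_NULL, &
                       MPI_COMM_WORLD, win, ierr)

   call MPI_Win_fence(0, win, ierr)
   if (rank == 0) then
      disp = 0
      ! rank 0 reads rank 1's value; rank 1 only takes part in the fences
      call MPI_Get(got, 1, MPI_DOUBLE_PRECISION, 1, disp, 1, &
                   MPI_DOUBLE_PRECISION, win, ierr)
   end if
   call MPI_Win_fence(0, win, ierr)

   if (rank == 0) print *, 'rank 0 got', got(1), 'from rank 1'
   call MPI_Win_free(win, ierr)
   call MPI_Finalize(ierr)
end program rma_demo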
aerosayan likes this.

#14 | May 30, 2022, 13:56
aerosayan (Sayan Bhattacharjee), Senior Member
Quote:
Originally Posted by sbaffini
What MPI does have, though you should use them with care, are one-sided operations, where one process accesses the memory of another (provided it was previously exposed for the task) without the latter being actively involved.
Awesome! Could you please give the names of these operations and functions? I can't find them without a "googleable" name.

#15 | May 30, 2022, 14:00
sbaffini (Paolo Lampitella), Senior Member
"MPI one-sided communication" or "MPI remote memory access" are the keywords you want. But let me suggest you the two books "Using MPI" and "Using Advanced MPI" both by MIT press, you want to read them before actively working with MPI.
aerosayan likes this.

#16 | May 30, 2022, 14:05
aerosayan (Sayan Bhattacharjee), Senior Member
Thanks! It's going to be fun with MPI!!!!

Tags
mpi, mpi parallel

