[cfMesh] cfmesh in parallel (MPI)

May 27, 2016, 15:43   #1
Q.E.D. (New Member, Join Date: Dec 2015, Posts: 16)
Hello everyone!

I'm currently trying to run cfMesh (v1.1.1, cartesianMesh) in parallel with MPI, but an error occurs that I haven't managed to resolve so far. Maybe someone can help me?

The error I get is the following:

[node033:18375] *** An error occurred in MPI_Bsend
[node033:18375] *** reported by process [46912131891201,11]
[node033:18375] *** on communicator MPI_COMM_WORLD
[node033:18375] *** MPI_ERR_BUFFER: invalid buffer pointer
[node033:18375] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[node033:18375] *** and potentially your MPI job)

A similar error has already been reported (see https://sourceforge.net/p/cfmesh/tickets/2/), but I cannot find a solution there.

Thank you in advance for your support.
Arthur

June 26, 2016, 12:39   #2
franjo_j - Franjo Juretic (Senior Member, Join Date: Aug 2011, Location: Velika Gorica, Croatia, Posts: 124)
Quote:
Originally Posted by Q.E.D. (see post #1)
Hi,

cfMesh runs in parallel all the time. I guess you are referring to MPI parallelisation.

It is not possible to understand much from what you posted here. Please provide a log file and an example that reproduces the problem (http://sscce.org/).

Regards,

Franjo
__________________
Principal Developer of cfMesh and CF-MESH+
www.cfmesh.com

January 27, 2020, 07:26   #3
tom.opt - Tom (Member, Join Date: Apr 2017, Posts: 50)
Hi,
I'm having a similar issue.
I am using OpenFOAM v1912 and trying to generate an aircraft mesh.
I'm working on a cluster, so I need to run it in parallel using MPI.
When I add a small number of refinement levels it works, but when I increase the refinement levels MPI crashes.

Here is a working meshDict:

surfaceFile "ac.stl";

maxCellSize 0.2;

objectRefinements
{
    ac3
    {
        type     box;
        cellSize 0.1;
        centre   (3.93106 0.998578 -0.613427);
        lengthX  14;
        lengthY  6;
        lengthZ  6;
    }
}

surfaceMeshRefinement
{
    TE
    {
        additionalRefinementLevels 4;
        surfaceFile "TE.stl";
    }
    nose
    {
        additionalRefinementLevels 3;
        surfaceFile "nose.stl";
    }
    tails
    {
        additionalRefinementLevels 3;
        surfaceFile "tails.stl";
    }
    wing
    {
        additionalRefinementLevels 2;
        surfaceFile "wing.stl";
    }
}

Here is a crashing meshDict:

surfaceFile "ac.stl";

maxCellSize 1;

objectRefinements
{
    ac1
    {
        type     box;
        cellSize 0.5;
        centre   (20 0.998578 -0.613427);
        lengthX  80;
        lengthY  30;
        lengthZ  30;
    }
    ac2
    {
        type     box;
        cellSize 0.2;
        centre   (10 0.998578 -0.613427);
        lengthX  50;
        lengthY  15;
        lengthZ  15;
    }
    ac3
    {
        type     box;
        cellSize 0.1;
        centre   (3.93106 0.998578 -0.613427);
        lengthX  14;
        lengthY  6;
        lengthZ  6;
    }
    ac4
    {
        type     box;
        cellSize 0.05;
        centre   (4.2 0.998578 -0.613427);
        lengthX  8.9;
        lengthY  2.5;
        lengthZ  3;
    }
}

Has there been a fix?

January 27, 2020, 08:33   #4
franjo_j - Franjo Juretic (Senior Member, Join Date: Aug 2011, Location: Velika Gorica, Croatia, Posts: 124)
Quote:
Originally Posted by tom.opt (see post #3)

If the problem were due to your meshDict, the mesher would not work even without MPI.


I assume the problem comes from an MPI buffer size that is not large enough to handle all the messages. You can increase the buffer size by setting the environment variable MPI_BUFFER_SIZE, and keep increasing it until the run starts working.


Alternatively, you may adjust the buffer size by setting a variable in your $WM_PROJECT_DIR/etc/controlDict. Have a look here: https://www.openfoam.com/releases/op.../usability.php
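
Roughly, the two routes look like this. This is only a sketch: it assumes a bash shell, the buffer value (in bytes) is just an example, and the OptimisationSwitches key name mpiBufferSize is from memory, so check your own etc/controlDict:

# Route 1: environment variable read by OpenFOAM's Pstream layer (bytes, example value)
export MPI_BUFFER_SIZE=200000000
mpirun -np 4 cartesianMesh -parallel > log.cartesianMesh 2>&1

# Route 2: edit $WM_PROJECT_DIR/etc/controlDict instead and set the matching
# OptimisationSwitches entry to a similar value, e.g.
#     OptimisationSwitches { mpiBufferSize 200000000; }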


Franjo
__________________
Principal Developer of cfMesh and CF-MESH+
www.cfmesh.com

January 27, 2020, 10:26   #5
tom.opt - Tom (Member, Join Date: Apr 2017, Posts: 50)
Quote:
Originally Posted by franjo_j (see post #4)



Thanks.
I updated the buffer value in /OpenFOAM/OpenFOAM-v1912/etc/controlDict.

I went up to 900 000 000.

I reran the program, but it still seems to crash.

Is there a recommended number of cores that I should use per million cells of mesh?
I'm planning to generate something on the order of 100 million cells and I'm using 64 cores.

January 27, 2020, 10:41   #6
franjo_j - Franjo Juretic (Senior Member, Join Date: Aug 2011, Location: Velika Gorica, Croatia, Posts: 124)
Quote:
Originally Posted by tom.opt (see post #5)

cfMesh uses shared-memory parallelisation (SMP) by default, and MPI is used optionally. MPI is available for cartesianMesh only.

For example, if there are 64 cores available on a single node, there is no need to use MPI. The code will not run any faster, because it already uses all cores by default.

Using MPI makes sense in two cases:
1. The desired number of cores is distributed over a number of nodes.
2. There is not enough memory on a single node.

When using MPI, the number of MPI processes should be equal to the number of nodes, not the number of cores. All cores on each node are used by default.
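
For example, on a cluster with four 16-core nodes, the launch would look roughly like this. This is only a sketch: it assumes Open MPI, the preparePar utility shipped with cfMesh, and that the thread count is controlled through OMP_NUM_THREADS, so adapt it to your own setup:

export OMP_NUM_THREADS=16    # SMP threads per MPI rank = cores per node (assumption)
preparePar                   # creates the processor* directories from decomposeParDict
mpirun -np 4 --npernode 1 --bind-to none cartesianMesh -parallel > log.cartesianMesh 2>&1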
__________________
Principal Developer of cfMesh and CF-MESH+
www.cfmesh.com

January 28, 2020, 05:21   #7
tom.opt - Tom (Member, Join Date: Apr 2017, Posts: 50)
Quote:
Originally Posted by franjo_j (see post #6)
Thank you very much!

My HPC architecture has 16 cores per node, so once I adjusted for that (i.e. set the number of domains to 4 in decomposeParDict) and increased the buffer size as previously advised, I managed to get it to run smoothly.
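
For anyone who lands here later, the relevant part of system/decomposeParDict was essentially just the subdomain count. This is a sketch with the FoamFile header omitted, and the method entry is kept only for completeness; as far as I understand, cfMesh performs its own decomposition:

numberOfSubdomains 4;    // one MPI rank per 16-core node

method scotch;           // assumption: present for completeness, not used by cfMesh itself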