[OpenFOAM.org] Trouble Compiling OpenFOAM-dev using Intel Compiler 15 for use on Xeon Phi |
August 18, 2015, 15:32 |
|
#1 |
New Member
A
Join Date: Aug 2015
Posts: 6
Rep Power: 11 |
Hi All,
As the title states, I've been having trouble getting OpenFOAM-dev to run on the Xeon Phi's MIC architecture. I have previously compiled and used OpenFOAM-dev with Intel Compiler 15 on the host machine (CentOS 7, x86_64) without problems, but altering the setup for the Phi ("-mmic" options, etc.) has proven difficult. The issues below appear whether I compile with "icpc" or "mpiicpc"; it doesn't seem to make a difference.
With the setup laid out below and in the attached files, compilation is mostly successful. The "src" directory compiles with no errors. The only problem in the "applications/utilities" directory is with "setSet". The "applications/solvers" directory successfully compiles the solvers that are of use to me (there are errors with some of the "reacting foams", "solid foams", etc., but these don't matter to me).
After compilation, the host machine cannot execute the binaries, as expected, while the Xeon Phi can. Unfortunately, when executing any command (checkMesh, blockMesh, icoFoam, etc.) the only output produced is "Segmentation Fault". There is no other output, so I have had trouble troubleshooting the cause. Hopefully somebody on this forum can provide some insight.
I've attached all the files in my setup that I believe are of interest. "compileSourceMe" is sourced before compilation. "micSourceMe" is sourced to set up the environment on the Phi after compilation. The long list of linked libraries under "LINKEXE" in "wmake/rules/linux64Icc/c++" appears to be necessary: even though the linker searches places like "FOAM_LIBBIN" anyway, I would get "cannot find library" errors unless I listed the libraries explicitly like this (this was not necessary when compiling for the host machine).
I also used this website (http://machls.cc.oita-u.ac.jp/kenkyu...-0-on-xeon-phi) as a guide for setting up the Third Party applications. It was written for Intel Compiler 14 on CentOS 6.5, so it is slightly out of date and I had to make some tweaks. I'm not sure the "CGAL/boost" workaround is still necessary, but this shouldn't be an issue because CGAL is optional, correct? I did not receive any errors relating to boost libraries using this method.
One aside: the current attempts have been with 32-bit labels, as I have had some issues with scotch when trying to compile with 64-bit labels (I will most likely need 64-bit labels in the future but, one step at a time, right?). I'm not sure whether this is relevant to my current issue, but figured it was worth mentioning.
Thanks for the help. I put all my various "wmake rules" files into a single text file, separated with headers, due to the 5-file upload limit. I can upload my Allwmake logs in a separate post as well if needed. If there's anything else I can provide, just let me know. |
|
August 18, 2015, 15:39 |
|
#2 |
New Member
A
Join Date: Aug 2015
Posts: 6
Rep Power: 11 |
And here are the Allwmake logs.
|
|
August 18, 2015, 16:06 |
|
#3 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Greetings foamer123 and welcome to the forum!
I don't have access to a Xeon Phi, so I can't help with that part. Nonetheless, I do understand the reason for the issues you attached on the second post.
Bruno
|
|
August 18, 2015, 17:54 |
|
#4 |
New Member
A
Join Date: Aug 2015
Posts: 6
Rep Power: 11 |
Bruno,
Thanks for the help.
OpenFOAM now compiles without throwing any visible errors. Unfortunately the segmentation fault issue continues, albeit now with more commands to fail on. You would not happen to have any idea what could cause this on some other architecture you are more familiar with, would you? "Segmentation Fault" is the only output produced (not even the standard OpenFOAM header is written to stdout). Thanks again! -A |
|
August 18, 2015, 18:13 |
|
#5 | ||
New Member
A
Join Date: Aug 2015
Posts: 6
Rep Power: 11 |
Some additional information for anyone who stumbles upon this thread. The "dmesg" command when run on the Xeon Phi shows that the "segmentation fault" output is created by a "general protection" error, see below:
Quote:
Quote:
-A |
|||
August 18, 2015, 20:17 |
|
#6 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Mmm... I re-read your first post and took a look at the environment you load after compiling... my guess is that there might be a library that is being loaded from the main system.
Try running: Code:
ldd $(which icoFoam)
Beyond this, my suggestion would be to take a step back to OpenFOAM 2.3.0 and try out the instructions you mentioned, to see whether the changes you made are being applied properly. Hopefully this will help you isolate the origin of the problem. As for adapting to ICC 15... ah, found it, I believe this commit has most of the changes needed: https://github.com/OpenFOAM/OpenFOAM...1957a2c7fb7f08 |
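A complementary check to ldd, as a minimal sketch: the ELF header of each binary or shared library records the architecture it was built for, so a host (x86_64) library that slipped into the Phi environment can be spotted by reading that field. The function names below are hypothetical, and the value 181 for the first-generation Xeon Phi (k1om) is taken from the ELF headers shipped with binutils/glibc as far as I know, so treat it as an assumption to verify on your system.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// e_machine values of interest (assumed from the common ELF headers):
// 62 is x86_64, 181 is the first-generation Xeon Phi (k1om).
constexpr std::uint16_t kMachineX86_64 = 62;
constexpr std::uint16_t kMachineK1OM = 181;

// Returns the e_machine value of an ELF file, or -1 if the file cannot be
// read or does not start with the ELF magic bytes.
// Assumes a little-endian ELF file, which holds for both targets above.
int elfMachine(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    unsigned char header[20] = {0};
    if (!in.read(reinterpret_cast<char*>(header), sizeof(header)))
    {
        return -1;
    }
    if (header[0] != 0x7f || header[1] != 'E' || header[2] != 'L' || header[3] != 'F')
    {
        return -1;
    }
    // e_machine is the 16-bit field at byte offset 18 of the ELF header.
    return header[18] | (header[19] << 8);
}

// Human-readable label for the two architectures discussed in this thread.
std::string describeMachine(int machine)
{
    if (machine == kMachineK1OM) return "k1om (Xeon Phi)";
    if (machine == kMachineX86_64) return "x86_64 (host)";
    if (machine < 0) return "not an ELF file";
    return "other (" + std::to_string(machine) + ")";
}
```

Calling describeMachine(elfMachine(path)) on every path that ldd reports would flag any library that was built for the host instead of the Phi.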
|
August 19, 2015, 15:34 |
|
#7 |
New Member
A
Join Date: Aug 2015
Posts: 6
Rep Power: 11 |
Bruno,
I followed your suggestion and have attached the output of Code:
ldd $(which icoFoam)
I've also attached a log from running Code:
strace icoFoam
The last call in that log appears to be opening Code:
etc/cellModels
If this information tells you nothing new, then I believe you are correct that my next step is to go back to v2.3.0 and apply the commit changes to make it compatible with the v15 compiler. If that is the case, it may take me some time, but I will be sure to update the thread for anyone who finds it if I am successful. Thank you again for the help Bruno, you've been very informative. -A |
|
August 19, 2015, 16:49 |
|
#8 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
You're welcome for the information, but it's too bad the Xeon Phi is so freaking expensive otherwise I would have probably already written build instructions for that thingamabob ... I gotta find out how to sign up for "free/borrowed stuff for open source development"... assuming I can then find the time to use it
Anyway, I can't find anything suspicious in the ldd output. However, in the strace output there are way too many "No such file or directory" messages for my taste. One of them worries me: it's looking for the folder "/root/.OpenFOAM", which implies you're trying to run the solver as root. That might be why the system is blocking you from using the Phi as root... since that's a risky step which could lead to a crash or critical damage to files needed for the main system to work... but this is just a guess.
The "cellModels" file is a reference file describing how cells are structured, e.g. how points are ordered/related in a hexahedral cell. Weird thing is that this is the only file that doesn't give the message "No such file or directory"... oh, OK, now I get it: the files that weren't read were those that the library loader was looking for from the "LD_LIBRARY_PATH". Either way, it was able to load the main "controlDict" and "cellModels" files, but it then ended up crashing before being able to open any more files.
I ran strace on my side (normal machine, not a Phi) and these few lines come after the point where yours crashed: Code:
open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=3439, ...}) = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=3439, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdd136bd000
read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\v\0\0\0\v\0\0\0\0"..., 4096) = 3439
lseek(4, -2175, SEEK_CUR) = 1264
read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\v\0\0\0\v\0\0\0\0"..., 4096) = 2175
close(4) = 0
munmap(0x7fdd136bd000, 4096) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3439, ...}) = 0
uname({sys="Linux", node="myMachineName", ...}) = 0
fstat(1, {st_mode=S_IFREG|0664, st_size=55181, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdd136bd000
Therefore, either the crash occurs somewhere in between the two points or it happens when it tries to load the local time settings... although this is probably not done directly by OpenFOAM; more likely the time function it calls needs to open this file. Searching Google for "Xeon Phi localtime" does give some hits about a weird requirement for making the Phi work as if it's an independent machine... not sure if this only applies to some specific use scenarios.
It might be necessary for you to build the Debug build of OpenFOAM to properly get to the bottom of the problem (in case everything works fine with 2.3.0), but my question would be this: have you tried building and using some other software that is known to work with the Xeon Phi? |
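One way to test the local time hypothesis in isolation is a tiny standalone probe built with the same compiler and "-mmic" flags as OpenFOAM, exercising only the C library time calls. The sketch below is an illustration under that assumption, not OpenFOAM code; the names TimeProbe and probeLocalTime are made up for this example.

```cpp
#include <ctime>
#include <string>
#include <time.h>

// Outcome of each step, so a failure is reported instead of crashing.
enum class TimeProbe { ok, timeFailed, localtimeFailed, formatFailed };

// Exercises the C library calls a date banner needs, one step at a time.
// On success, 'formatted' holds the local date and time.
TimeProbe probeLocalTime(std::string& formatted)
{
    const std::time_t now = std::time(nullptr);
    if (now == static_cast<std::time_t>(-1))
    {
        return TimeProbe::timeFailed;
    }

    // localtime_r is the re-entrant POSIX variant. It returns a null pointer
    // on failure, which must be checked before the result is used.
    std::tm parts{};
    if (localtime_r(&now, &parts) == nullptr)
    {
        return TimeProbe::localtimeFailed;
    }

    char buffer[64];
    if (std::strftime(buffer, sizeof(buffer), "%b %d %Y %H:%M:%S", &parts) == 0)
    {
        return TimeProbe::formatFailed;
    }

    formatted = buffer;
    return TimeProbe::ok;
}
```

If a program that only calls probeLocalTime and prints the result also crashes on the Phi, the problem lies in the toolchain or the runtime environment and not in OpenFOAM itself.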
|
August 20, 2015, 11:55 |
|
#9 |
New Member
A
Join Date: Aug 2015
Posts: 6
Rep Power: 11 |
Bruno,
This is actually my first experience using a Xeon Phi, so it's a bit of a trial by fire.
I do not think the issue is with the /etc/localtime file. It looks like the commands all fail when calling the times function in the previous step (I was mistaken when I said the last call was to open ../OpenFOAM-dev/etc/cellModels. I had missed that times call right at the very end of my strace). Just to double check, I configured the zoneinfo data referenced by localtime (which, admittedly, was not set up properly on the Phi). It is now configured and produces the same time zone info as the host machine.
I believe the times command is used to monitor system time devoted to processes (e.g. used by OpenFOAM to determine ExecutionTime and ClockTime when running simulations). Unfortunately, this appears to be a bash built-in, which means I cannot strace it to see what it could be accessing on the Phi that could be causing issues. That said, running times by itself outside of OpenFOAM produces output and does not fail.
It looks like I will have to try compiling v2.3.0 with only those Intel 15 commit changes from the dev version implemented and see whether this is still an issue. The only reason I was using the dev version was that it had been modified for Intel Compiler 15 compatibility, so if a modified 2.3.0 compiles I wouldn't need to go back to the dev version.
Would the debug version of OpenFOAM still be useful if the problem looks to be with a bash built-in? I have not used it before, but I know the compile option I would need to set before building it. |
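For context: strace reports system calls made by the traced program, so a "times" entry in a trace is the POSIX times() call from the C library, which merely shares its name with the shell built-in. A minimal sketch of how a program can measure its own CPU time with that call follows; cpuSecondsUsed is a hypothetical name for this example and this is not OpenFOAM's implementation.

```cpp
#include <sys/times.h>
#include <unistd.h>

// CPU time (user + system) consumed so far by this process, in seconds.
// Returns a negative value if either underlying call fails.
double cpuSecondsUsed()
{
    struct tms usage;
    if (::times(&usage) == static_cast<clock_t>(-1))
    {
        return -1.0;
    }

    // times() counts in clock ticks; sysconf reports how many make a second.
    const long ticksPerSecond = ::sysconf(_SC_CLK_TCK);
    if (ticksPerSecond <= 0)
    {
        return -1.0;
    }

    return static_cast<double>(usage.tms_utime + usage.tms_stime)
         / static_cast<double>(ticksPerSecond);
}
```

Building only this function into a small test program with the Phi toolchain and running it on the card would show whether the call itself misbehaves there.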
|
August 20, 2015, 15:03 |
|
#10 | |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Quote:
The "times" line probably refers to this: https://github.com/OpenFOAM/OpenFOAM...argList.C#L534 Code:
if (initialise)
{
    string dateString = clock::date();
    string timeString = clock::clockTime();

    // Print the banner once only for parallel runs
    if (Pstream::master() && bannerEnabled)
    {
        IOobject::writeBanner(Info, true)
            << "Build : " << Foam::FOAMbuild << nl
            << "Exec : " << argListStr_.c_str() << nl
            << "Date : " << dateString.c_str() << nl
            << "Time : " << timeString.c_str() << nl
            << "Host : " << hostName() << nl
            << "PID : " << pid() << endl;
    }
Code:
Foam::string Foam::clock::date()
{
    std::ostringstream osBuffer;

    time_t t = getTime();
    struct tm *timeStruct = localtime(&t);

    osBuffer
        << monthNames[timeStruct->tm_mon]
        << ' ' << std::setw(2) << std::setfill('0') << timeStruct->tm_mday
        << ' ' << std::setw(4) << timeStruct->tm_year + 1900;

    return osBuffer.str();
}
If you comment out most of the code within this method "date()" and run Allwmake again, the crash will likely occur later on, after this call, and you should be able to see the famous descriptive header for OpenFOAM applications when you run it. A few strategic "cout" calls might also help isolate the point where it breaks, e.g.: Code:
std::cout << "got here 00" << std::endl; |
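If many such markers are needed, a small helper that also records the source location saves some typing. This is a sketch with hypothetical names (checkpointText, CHECKPOINT), not anything provided by OpenFOAM. It writes to std::cerr and ends each marker with std::endl so the text is flushed before a crash can swallow it.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Builds a marker such as "got here 3 (clock.C:42)".
std::string checkpointText(int id, const char* file, int line)
{
    std::ostringstream text;
    text << "got here " << id << " (" << file << ':' << line << ')';
    return text.str();
}

// Hypothetical convenience macro: prints the marker for the current source
// location to stderr and flushes it immediately.
#define CHECKPOINT(id) \
    (std::cerr << checkpointText((id), __FILE__, __LINE__) << std::endl)
```

Placing CHECKPOINT(0), CHECKPOINT(1), and so on around the suspect calls narrows the crash down to the region after the last marker that was printed.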
||