Binary gives significant performance advantage (Mesh & Solve) |
June 8, 2014, 09:17 |
|
#1 |
New Member
Jason Moller
Join Date: Sep 2013
Location: Hampshire, UK
Posts: 14
Rep Power: 13 |
I've run a handful of different external aero cases, and I am very surprised to see meshing and solving run significantly faster when the output is set to binary rather than ASCII. The only difference between the runs of each case is the 'writeFormat' setting in the controlDict. Times stated below are User time. (Please note that each case has an entirely different geometry, block mesh and type.)

Case 1 - snappyHexMesh - 4.4 million cells
Binary: 1 hr 24 min
ASCII: 3 hr 0 min
Percentage decrease: 53%

Case 2 - snappyHexMesh - 1.5 million cells
Binary: 20 min
ASCII: 23 min
Percentage decrease: 10%

Case 2 - simpleFoam - 1700 steps
Binary: 9 hr 04 min
ASCII: 11 hr 52 min
Percentage decrease: 24%

Case 3 - snappyHexMesh - 4.1 million cells
Binary: 9 hr 18 min
ASCII: 10 hr 15 min
Percentage decrease: 10%

I appreciate that writing ASCII might slow the system, perhaps by a minute or so over a long run, but nothing like the significant and repeatable amounts I've encountered. Surely OpenFOAM can only 'understand' binary, thus even when running 'ASCII' these files are read into the system memory as binary? There shouldn't be any major performance difference, but there is.

Has anybody else experienced this? Any explanations or solutions other than running foamFormatConvert before and after each run would be greatly appreciated. |
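Editor's sketch: the percentage decreases can be recomputed from the quoted run times as (ASCII - binary) / ASCII. The snippet below is only an arithmetic check on the figures in the post; the dictionary and function names are made up for illustration, and the recomputed values can differ by a few points from the rounded percentages quoted above.

```python
# Run times as quoted in the post, converted to minutes: (binary, ascii).
RUNS = {
    "Case 1 snappyHexMesh": (1 * 60 + 24, 3 * 60 + 0),
    "Case 2 snappyHexMesh": (20, 23),
    "Case 2 simpleFoam": (9 * 60 + 4, 11 * 60 + 52),
    "Case 3 snappyHexMesh": (9 * 60 + 18, 10 * 60 + 15),
}


def percent_decrease(binary_min, ascii_min):
    """Reduction in run time when switching from ASCII to binary, in percent."""
    return 100.0 * (ascii_min - binary_min) / ascii_min


for name, (binary_min, ascii_min) in RUNS.items():
    print(f"{name}: {percent_decrease(binary_min, ascii_min):.1f}% decrease")
```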
|
June 8, 2014, 09:54 |
|
#2 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Greetings Jason,
There are a few important details that aren't clear in your description:
As for the frequency, the following details come to mind:
Bruno |
|
June 8, 2014, 13:35 |
|
#3 |
New Member
Jason Moller
Join Date: Sep 2013
Location: Hampshire, UK
Posts: 14
Rep Power: 13 |
Bruno,
In response to your questions, this was performed with OpenFOAM 2.3 on both CentOS 6.3 and Ubuntu 12.04, using the OpenFOAM repositories on both. Of the three cases there is a mixture of serial and parallel meshing; the flow solution was run in parallel. I varied the write intervals between the cases, from infrequent to frequent, with little impact on the time.

Having run this on desktop and cluster, in parallel and serial, on different OSes and thus different binaries, with different geometries and meshes... I don't believe it's an anomaly isolated to me or to the way I have set up OpenFOAM or my cases. In fact, after I shared my results, someone else ran their own test on yet another system. That case had nothing to do with me, and he found the same performance increase with binary over ASCII.

Of course I'm happy to share the values from the files if you wish, but I believe that anyone who tries switching between binary and ASCII on their own cases will see an unexpected and disproportionate performance increase.

Many thanks, Jason |
|
June 18, 2014, 12:01 |
Expected
|
#4 |
Member
Bruno Blais
Join Date: Sep 2013
Location: Canada
Posts: 64
Rep Power: 13 |
This is to be expected. Writing binary files is noticeably faster than writing ASCII files, which should be why you see this increase in speed.
This is due to two main factors:
-> Binary files are much more compact; the size difference between an ASCII file and a binary file is significant.
-> C++ file streams flush out binary data much faster than ASCII. This is the same in any language anyway.
Therefore, I am not sure I understand why this would be surprising or a problem? |
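Editor's sketch: a rough way to see the formatting cost referred to above is to serialise the same values once as text and once as raw 8-byte doubles, in memory. This is Python rather than OpenFOAM's C++ streams, so the absolute numbers say nothing about OpenFOAM itself; the function names and the value count are assumptions made for the illustration.

```python
import io
import struct
import time


def write_ascii(values, stream, precision=8):
    """Write one value per line as text, roughly like an ASCII field file."""
    for v in values:
        stream.write(f"{v:.{precision}g}\n".encode("ascii"))


def write_binary(values, stream):
    """Write the raw 8-byte IEEE doubles back to back."""
    stream.write(struct.pack(f"<{len(values)}d", *values))


def timed(writer, values):
    """Return (seconds, bytes_written) for one in-memory write."""
    stream = io.BytesIO()
    start = time.perf_counter()
    writer(values, stream)
    return time.perf_counter() - start, stream.tell()


# Arbitrary test data; no disk is involved, only the conversion cost.
values = [i / 7.0 for i in range(200_000)]
for label, writer in (("ascii", write_ascii), ("binary", write_binary)):
    seconds, size = timed(writer, values)
    print(f"{label:6s}: {size:9d} bytes in {seconds:.4f} s")
```

Because the data never touches the disk here, any difference comes from converting numbers to text, not from the storage hardware.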
|
June 18, 2014, 12:15 |
|
#5 |
New Member
Jason Moller
Join Date: Sep 2013
Location: Hampshire, UK
Posts: 14
Rep Power: 13 |
It is to be expected? In some cases, three hours extra to write a handful of files in ASCII rather than binary?? I'm sorry, but I don't believe it is! I'd expect the read-write overhead of ASCII to be on the order of seconds or minutes for a typical 3D aero CFD solution.
As I stated in my initial post, an overhead is to be expected and I am not surprised to see one. What I am surprised by is the magnitude of the performance difference. It's massive, and surely can't be attributed solely to the simple writing of files?? |
|
June 18, 2014, 12:30 |
|
#6 |
Member
Bruno Blais
Join Date: Sep 2013
Location: Canada
Posts: 64
Rep Power: 13 |
It's true that the overhead seems very large. On what hardware did you run those tests? What kind of hard drive did you use? How much RAM does the computer/cluster have? You mentioned Ubuntu, so I presume this is a personal computer/workstation, right?
I cannot comment much on snappyHexMesh because I do not know how often it writes to disk, but for a simpleFoam simulation the 24% is plausible, depending on the number of writes you do and whether it runs in parallel or serial. You also did not mention the size of the simpleFoam case. Is it also in the range of a couple of million cells?

Just an addition: ASCII wastes a lot of bits, especially if you are using a large numerical precision (let's say trying to output 8 digits) or if you are not using ASCII but UTF8 as an encoding format. For a large case (let's say millions of cells) and for the same precision, you could easily expect that writing each file would take twice the amount of time or more (10 s binary vs 20 s or more ASCII). It's a relative problem: the bigger the file you are writing, the more you will notice the difference. |
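Editor's sketch: the size argument can be checked by counting how many bytes one value needs as text at a given precision, against the fixed 8 bytes of a binary double. The value range, sample size and precisions below are arbitrary assumptions, and the sketch only measures size, not write time.

```python
import random
import struct

random.seed(0)
# Arbitrary sample values; real field data will have its own distribution.
values = [random.uniform(-1e3, 1e3) for _ in range(10_000)]


def ascii_bytes_per_value(values, precision):
    """Average bytes per value as '%.<precision>g' text plus one separator."""
    total = sum(len(f"{v:.{precision}g}") + 1 for v in values)
    return total / len(values)


BINARY_BYTES = struct.calcsize("<d")  # a double always takes 8 bytes

for precision in (6, 8, 12):
    per_value = ascii_bytes_per_value(values, precision)
    print(f"precision {precision:2d}: {per_value:.1f} bytes/value as text, "
          f"{BINARY_BYTES} bytes/value as binary")
```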
|
June 18, 2014, 13:18 |
|
#7 |
New Member
Jason Moller
Join Date: Sep 2013
Location: Hampshire, UK
Posts: 14
Rep Power: 13 |
The answers to those questions are in my reply to Bruno. I ran this across a number of desktops (Ubuntu) and on a cluster (CentOS). The hardware is relatively powerful: the desktops are 6-core (12-thread) Intel i7 Extreme machines with 64 GB RAM and SSDs, and the cluster nodes are quad-core Core i7 machines with 16 GB each. I ran across two nodes, hence 16 threads and 32 GB. I kept a close eye on RAM utilisation; as expected for cases of this size (1-4 million cells), RAM usage was in the 2-6 GB range. All with high-end motherboards and good HDDs and SSDs.
The simpleFoam case was 'Case 2' in my original post, hence the stated 1.5 million cells. I have tried, and other people too, with completely different cases, yet all show similarly significant advantages for binary over ASCII.

Let me phrase my confusion in a different way... there is a command in OpenFOAM called 'foamFormatConvert' that reads the settings in system/controlDict and converts the relevant files in the case to that format. On the 1.5 million cell simpleFoam case this conversion from binary to ASCII takes, in CPU time, just a couple of minutes for every time step (there are just 10). I would imagine this is exactly the same process OpenFOAM uses within simpleFoam, thus I would expect running simpleFoam in ASCII rather than binary to carry this overhead. This is also in line with common sense and my own computing experience. The reality is, however, that the ASCII penalty is orders of magnitude higher. |
|
June 18, 2014, 14:29 |
|
#8 |
Member
Bruno Blais
Join Date: Sep 2013
Location: Canada
Posts: 64
Rep Power: 13 |
Then I am at a loss here; this is indeed abnormal. The cost difference of writing ASCII instead of binary should not exceed that of the foamFormatConvert utility, because obviously the utility has to re-read the file from scratch AND write it...
I have no idea. Do you know if the same writing method is used in both the ASCII and the binary case (meaning, is MPI-IO functionality used in the binary case that would not be used in the ASCII case)? In any case, this is very surprising and a bit troubling... |
|
June 19, 2014, 04:40 |
|
#9 |
Senior Member
Olivier
Join Date: Jun 2009
Location: France, grenoble
Posts: 272
Rep Power: 18 |
hello,
I did this test a long time ago with OF 1.6 and concluded mostly the same (and with the same order of magnitude): ascii compressed files faster than << binary compressed << binary << ascii. So also try compressed files (ascii and binary) to see if you get the same. regards, olivier |
|
June 23, 2014, 03:52 |
|
#10 |
Senior Member
Karl-Johan Nogenmyr
Join Date: Mar 2009
Location: Linköping
Posts: 279
Rep Power: 21 |
Hi!
Do you have the standard output from the simpleFoam runs? One could maybe do some statistical analysis of the execution and wall clock times to try to point out where the difference occurs. Kalle |
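Editor's sketch: one way to do the comparison suggested here is to pull the cumulative `ExecutionTime` / `ClockTime` values out of two solver logs and difference them step by step. The regular expression assumes the usual `ExecutionTime = ... s  ClockTime = ... s` log lines, and the two sample logs are invented numbers for illustration, not data from this thread.

```python
import re

# Assumed log line format, e.g. "ExecutionTime = 12.3 s  ClockTime = 13 s".
TIMING = re.compile(
    r"ExecutionTime\s*=\s*([\d.eE+-]+)\s*s\s+ClockTime\s*=\s*([\d.eE+-]+)\s*s"
)


def cumulative_times(log_text):
    """Return one (execution_time, clock_time) tuple per logged step."""
    return [(float(e), float(c)) for e, c in TIMING.findall(log_text)]


def per_step(cumulative):
    """Turn cumulative times into the time spent in each individual step."""
    return [now - before for now, before in zip(cumulative, [0.0] + cumulative[:-1])]


def per_step_gap(log_a, log_b):
    """Per-step execution time of log_b minus log_a, for the steps both share."""
    a = per_step([t[0] for t in cumulative_times(log_a)])
    b = per_step([t[0] for t in cumulative_times(log_b)])
    return [sb - sa for sa, sb in zip(a, b)]


# Invented sample logs, for illustration only.
binary_log = (
    "Time = 1\nExecutionTime = 2 s  ClockTime = 2 s\n"
    "Time = 2\nExecutionTime = 4 s  ClockTime = 4 s\n"
)
ascii_log = (
    "Time = 1\nExecutionTime = 3 s  ClockTime = 3 s\n"
    "Time = 2\nExecutionTime = 6 s  ClockTime = 6 s\n"
)

print(per_step_gap(binary_log, ascii_log))
```

For real runs the two strings would come from reading the log files; a gap that is already non-zero in the first steps points away from the write-out itself.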
|
June 23, 2014, 10:55 |
|
#11 |
New Member
Jason Moller
Join Date: Sep 2013
Location: Hampshire, UK
Posts: 14
Rep Power: 13 |
Olivier, I haven't looked at the difference between compressed and uncompressed. Thank you for the suggestion, I'll run some cases now to see if the same is true with OF 2.3.
Kalle, I used the Linux 'time' facility, which reports real time (wall clock), user time (CPU time) and system time (overhead). The times I reported above are User times. I still have the output files, but have not done a comparison to see at which point they start to diverge. This is a good suggestion, thank you.

I'll perform additional runs based on Olivier's suggestion of write compression and then I'll report back here with the findings, and an analysis of when they diverge. In the meantime I would suggest OpenFOAM users run in binary and use the foamFormatConvert utility to convert to ASCII if they require it. Many thanks, Jason |
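Editor's sketch: the three figures reported by `time` can also be collected from a script, which is handy when many runs have to be compared. The helper below uses Python's `resource` module (Unix only) and a trivial stand-in command; `run_and_time` is a made-up name, and the stand-in would have to be replaced by the actual solver command line.

```python
import resource
import subprocess
import sys
import time


def run_and_time(command):
    """Run a command and return (real, user, sys) seconds, similar to `time`."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(command, check=True)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return real, user, system


# Stand-in workload for illustration only.
real, user, system = run_and_time([sys.executable, "-c", "sum(range(10**6))"])
print(f"real {real:.2f} s, user {user:.2f} s, sys {system:.2f} s")
```

User and system time are summed over the child processes that were waited for, so for a parallel run they can be several times larger than the real (wall-clock) time.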
|
June 23, 2014, 15:58 |
|
#12 |
Senior Member
Karl-Johan Nogenmyr
Join Date: Mar 2009
Location: Linköping
Posts: 279
Rep Power: 21 |
Could you share the log files? I guess they are a few hundred megabytes in total, but if you could, I would be interested in looking at them!
Regards, Kalle |
|
June 25, 2014, 11:08 |
|
#13 |
New Member
Jason Moller
Join Date: Sep 2013
Location: Hampshire, UK
Posts: 14
Rep Power: 13 |
In case anybody is still interested, following Olivier's comment I ran the 1.5 million cell simpleFoam case as mentioned in my first post (Case 2) with the following results:
(To refresh memory, this is with OpenFOAM 2.3. Exactly the same case, other than the ASCII/Binary and Compression settings in system/controlDict.)

Binary Uncompressed: 9 hr 04 min
Binary Compressed: 19 hr 24 min
ASCII Uncompressed: 11 hr 52 min
ASCII Compressed: 11 hr 04 min

Based on Kalle's suggestion I compared the log files. These cases diverge in time from the very first time step, long before writing, and continue to diverge with each time step. Very odd!! Compressing binary would be very expensive, but why is it slower between steps which aren't being written or read?!

In summary, I've found it's significantly beneficial to run OpenFOAM in Binary Uncompressed and convert using foamFormatConvert after the run. In terms of runtime:

Binary Uncompressed < ASCII Compressed < ASCII Uncompressed << Binary Compressed

Kind Regards, Jason |
|
June 26, 2014, 02:47 |
|
#14 |
Senior Member
Karl-Johan Nogenmyr
Join Date: Mar 2009
Location: Linköping
Posts: 279
Rep Power: 21 |
Quote:
Indeed a strange thing you've run into here. What if you disable write-out altogether? Do you start both simulations from identical cases, i.e. both cases start from either ASCII data or Binary data... or are maybe your fields uniform, and there is no difference in ASCII or Binary at start? Kalle |
|
June 26, 2014, 05:32 |
|
#15 |
New Member
Jason Moller
Join Date: Sep 2013
Location: Hampshire, UK
Posts: 14
Rep Power: 13 |
It's difficult to explain, but it's not a rogue case: others have experienced this and, judging by Olivier's reply, it dates back to early OpenFOAM. The time I stated is correct in that, consistent with the rest of my data, it is the recorded Linux User time. This is a parallel case (8 processors), thus the time difference is exaggerated.
Uniform fields, to make it fairer, and as I say the times diverge long before the first write. I've not tried a single write or no write; good suggestion, that would indeed be an interesting experiment. Thank you, I will give this a go.

p.s. if you prefer real time (aka wall-clock time):

Binary Uncompressed: 1 hr 14 min
Binary Compressed: 2 hr 37 min
ASCII Uncompressed: 1 hr 36 min
ASCII Compressed: 1 hr 27 min

Different numbers, but the same improvement. |
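Editor's sketch: since user time is summed over all processes, the user times quoted earlier should be several times the wall-clock times given here. The snippet divides one by the other for each of the four runs; the figures are copied from the two posts, and the names are made up for the illustration.

```python
PROCESSORS = 8  # as stated in the post

# (user time, wall-clock time) in minutes, copied from the two posts.
RUNS = {
    "Binary Uncompressed": (9 * 60 + 4, 1 * 60 + 14),
    "Binary Compressed": (19 * 60 + 24, 2 * 60 + 37),
    "ASCII Uncompressed": (11 * 60 + 52, 1 * 60 + 36),
    "ASCII Compressed": (11 * 60 + 4, 1 * 60 + 27),
}


def user_to_wall(user_min, wall_min):
    """Ratio of summed CPU time to elapsed time."""
    return user_min / wall_min


for name, (user_min, wall_min) in RUNS.items():
    ratio = user_to_wall(user_min, wall_min)
    print(f"{name}: user/wall = {ratio:.2f} with {PROCESSORS} processes")
```

A ratio close to the process count means the two sets of timings describe the same runs and rank them in the same order.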
|
December 30, 2014, 15:51 |
|
#16 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Greetings to all!
OK, only after 6 months did I manage to find enough time to run some tests of my own on a stable machine (mine isn't as stable). The results are now available here: https://github.com/wyldckat/wyldckat...nce_Analysis_2 The summary results:
Want more performance out of OpenFOAM? Then please suggest proven file storage software/technology, instead of simply complaining about it ... And I'll go first: LZO and LZ4 are high-speed compression algorithms that offer impressive compression/decompression speeds, where the decompression is almost as fast as memcpy. These algorithms don't offer compression ratios as high as gzip and bz2, but in most cases they could offer improved throughput to disk when writing such files with the data in binary format. Best regards, Bruno
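Editor's sketch: how much a compressor can gain depends on the representation being compressed. The snippet compares the zlib compression ratio of the same values stored as raw doubles and as text. zlib (the deflate library behind gzip) is used because it ships with Python; LZ4 and LZO are not in the standard library and are not shown. The random test values are an assumption and will not behave like real CFD fields.

```python
import random
import struct
import zlib

random.seed(1)
# Arbitrary sample values; real fields are usually smoother than this.
values = [random.uniform(-1.0, 1.0) for _ in range(50_000)]

binary_payload = struct.pack(f"<{len(values)}d", *values)
ascii_payload = "\n".join(f"{v:.8g}" for v in values).encode("ascii")


def ratio(payload, level):
    """Uncompressed size divided by compressed size at the given zlib level."""
    return len(payload) / len(zlib.compress(payload, level))


for level in (1, 6, 9):
    print(
        f"level {level}: binary ratio {ratio(binary_payload, level):.2f}, "
        f"ascii ratio {ratio(ascii_payload, level):.2f}"
    )
```

This only measures compression ratio; throughput, which is the point of LZ4 and LZO, would need a timing comparison with those libraries installed.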
|
|
January 1, 2015, 12:25 |
|
#17 |
New Member
Jason Moller
Join Date: Sep 2013
Location: Hampshire, UK
Posts: 14
Rep Power: 13 |
Very interesting tests. I'm glad that they explain why, in my previously highlighted cases, binary is significantly faster than ASCII. I tend to use OpenFOAM with bespoke utilities that require ASCII output. Since my original 'discovery' 6 months ago I've been running all of my OpenFOAM calculations in binary and later running foamFormatConvert if I need any output in ASCII. It may seem a roundabout way of going about the problem, but it's very fast.
Perhaps I'm misinterpreting what I've read on your github page, but your experiments don't explain why running ASCII compressed is faster than running ASCII uncompressed? My findings, which are repeatable: Binary Uncompressed - Fastest |
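Editor's sketch: the workflow described here (run in binary, convert to ASCII afterwards) amounts to changing one entry in system/controlDict before the run. The helper below edits such an entry in controlDict-style text; `set_entry` and the sample excerpt are hypothetical and written only for this illustration, and the helper is not an OpenFOAM tool.

```python
import re


def set_entry(control_dict_text, keyword, value):
    """Replace the value of a `keyword value;` entry in controlDict-style text."""
    pattern = re.compile(rf"^(\s*{re.escape(keyword)}\s+)[^;]+;", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value};", control_dict_text
    )
    if count == 0:
        raise KeyError(f"{keyword} not found")
    return new_text


# Hypothetical excerpt of system/controlDict, for illustration only.
sample = """\
writeFormat     ascii;
writePrecision  8;
writeCompression uncompressed;
"""

print(set_entry(sample, "writeFormat", "binary"))
```

After a run in binary, switching `writeFormat` back to `ascii` and running foamFormatConvert, as described in the post, would produce the ASCII files needed by the downstream utilities.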
|
January 1, 2015, 13:36 |
|
#18 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Hi Jason,
Quote:
Quote:
If we look at my artificial results from the 6th approach: Quote:
Now if we sort the timings, and ignoring the "ascii_sprintf_f" test (which is really bad):
Quote:
The explanation - based on these results and my experience on this topic (data storage and compression) - is as follows:
I hope this is now clearer? If not, I can try and do some theoretical graphs, to demonstrate how much time is spent on each operation. Best regards, Bruno |
|
January 1, 2015, 14:41 |
|
#19 |
New Member
Jason Moller
Join Date: Sep 2013
Location: Hampshire, UK
Posts: 14
Rep Power: 13 |
Bruno,
This is now crystal clear, thank you for taking the time to explain. This kind of information is very interesting. It's incredible to see that the time spent on interpretation can change the calculation time so notably. It is also pleasing to learn that this isn't a problem specific to OpenFOAM and is instead a computing phenomenon that is repeatable outside of OpenFOAM. Many thanks for your investigation. I hope many others read this thread; it is very useful knowledge to have for speeding up calculation time. Happy New Year, Jason |
|