Merge pull request #6104 from bangerth/stripe
Allow setting LFS stripe count.
tjhei authored Oct 23, 2024
2 parents 528b304 + 1ade035 commit af19ae9
Showing 6 changed files with 1,546 additions and 1,429 deletions.
4 changes: 4 additions & 0 deletions doc/modules/changes/20241022_bangerth
@@ -0,0 +1,4 @@
New: The new input parameter "Output directory LFS stripe count" allows for configuring
the ASPECT output directory for better performance on Lustre file systems.
<br>
(Wolfgang Bangerth, 2024/10/22)
2,875 changes: 1,448 additions & 1,427 deletions doc/parameter_view/parameters.xml

Large diffs are not rendered by default.

14 changes: 13 additions & 1 deletion doc/sphinx/parameters/global.md
@@ -128,7 +128,19 @@ Units: \%.

**Pattern:** [DirectoryName]

**Documentation:** The name of the directory into which all output files should be placed. This may be an absolute or a relative path.
**Documentation:** The name of the directory into which all output files should be placed. This may be an absolute or a relative path. ASPECT will write output such as statistics files or visualization files into this directory or into directories further nested within.

(parameters:Output_20directory_20LFS_20stripe_20count)=
### __Parameter name:__ Output directory LFS stripe count
**Default value:** 0

**Pattern:** [Integer range 0...2147483647 (inclusive)]

**Documentation:** Many large clusters use the Lustre file system (LFS), which can &rsquo;stripe&rsquo; files, i.e., use multiple file servers to store a single file. This is useful when writing very large files from multiple MPI processes, such as when creating graphical output or creating checkpoints. In those cases, if all MPI processes try to route their data to a single file server, that file server and the disks it manages may be saturated by data and everything slows down. File striping instead ensures that the data is sent to several file servers, improving performance. A description of how Lustre manages file striping can be found at https://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace . How file striping can be configured is discussed at https://wiki.lustre.org/Configuring_Lustre_File_Striping .

When this parameter is set to anything other than zero, ASPECT will call the Lustre support tool, &lsquo;lfs&lsquo;, as follows: &lsquo;lfs setstripe -c N OUTPUT_DIR&lsquo;, where &lsquo;N&lsquo; is the value of the input parameter discussed here, and &lsquo;OUTPUT_DIR&lsquo; is the directory into which ASPECT writes its output. The file striping so set on the output directory is also inherited by the sub-directories ASPECT creates within it.

To use this parameter, your cluster must use the Lustre file system. The appropriate stripe count depends on the physical details and configuration of the file servers attached to the cluster; consult your cluster&rsquo;s local documentation or your cluster administrator to find a suitable value.
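
As a rough illustration of the command described above, the following sketch assembles the `lfs setstripe -c N OUTPUT_DIR` string from a stripe count and a directory name. The helper names `shell_quote` and `make_setstripe_command` are hypothetical and not part of ASPECT. Quoting the directory is an addition made in this sketch so that paths containing spaces reach the shell as a single argument; the code in this commit passes the directory unquoted.

```cpp
#include <string>

// Hypothetical helper (not part of ASPECT): wrap a path in single quotes so
// that a POSIX shell treats it as one word, even if it contains spaces.
std::string shell_quote(const std::string &path)
{
  std::string quoted = "'";
  for (const char c : path)
    {
      if (c == '\'')
        quoted += "'\\''";   // close the quote, add an escaped quote, reopen
      else
        quoted += c;
    }
  quoted += "'";
  return quoted;
}

// Hypothetical helper: build the 'lfs setstripe' command line for a given
// stripe count N and output directory.
std::string make_setstripe_command(const unsigned int stripe_count,
                                   const std::string &output_directory)
{
  return "lfs setstripe -c " + std::to_string(stripe_count) + " "
         + shell_quote(output_directory);
}
```

With a stripe count of 4 and an output directory `output/`, this yields `lfs setstripe -c 4 'output/'`.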

(parameters:Pressure_20normalization)=
### __Parameter name:__ Pressure normalization
16 changes: 16 additions & 0 deletions doc/sphinx/user/run-aspect/run-faster/file-system-io.md
@@ -0,0 +1,16 @@
(sec:run-aspect:run-faster:file-system-io)=
# File system I/O

Depending on how exactly you run ASPECT and how large your computations are,
ASPECT can create output that can be hundreds of gigabytes or more. This output
also includes individual files that can be many gigabytes or more, for example
for graphical output and, in particular, for checkpointing the simulation.
For computations that run on thousands of individual MPI processes, writing
all of this information to disk can be a bottleneck.

Most large clusters have extensive documentation on how to tune file storage
for optimal performance, and if you are doing large computations, it is worth
reading through this documentation. At least for Lustre file systems, you can
also employ the {ref}`parameters:Output_20directory_20LFS_20stripe_20count`
parameter to set LFS stripe counts -- see the documentation of that parameter
for more information.
1 change: 1 addition & 0 deletions doc/sphinx/user/run-aspect/run-faster/index.md
@@ -24,4 +24,5 @@ limiting-postprocessing.md
pressure-norm-off.md
regularize.md
multithreading.md
file-system-io.md
:::
65 changes: 64 additions & 1 deletion source/simulator/parameters.cc
@@ -330,7 +330,39 @@ namespace aspect
prm.declare_entry ("Output directory", "output",
Patterns::DirectoryName(),
"The name of the directory into which all output files should be "
"placed. This may be an absolute or a relative path.");
"placed. This may be an absolute or a relative path. ASPECT will "
"write output such as statistics files or visualization files "
"into this directory or into directories further nested within.");

prm.declare_entry ("Output directory LFS stripe count", "0",
Patterns::Integer(0),
"Many large clusters use the Lustre file system (LFS), which can 'stripe' "
"files, i.e., use multiple file servers to store a single file. This is "
"useful when writing very large files from multiple MPI processes, such "
"as when creating graphical output or creating checkpoints. In those "
"cases, if all MPI processes try to route their data to a single file "
"server, that file server and the disks it manages may be saturated by "
"data and everything slows down. File striping instead ensures that the "
"data is sent to several file servers, improving performance. A "
"description of how Lustre manages file striping can be found at "
"https://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace . "
"How file striping can be configured is discussed at "
"https://wiki.lustre.org/Configuring_Lustre_File_Striping ."
"\n\n"
"When this parameter is set to anything other than zero, "
"ASPECT will call the Lustre support tool, `lfs`, as follows: "
"`lfs setstripe -c N OUTPUT_DIR`, where `N` is the value of the "
"input parameter discussed here, and `OUTPUT_DIR` is the directory "
"into which ASPECT writes its output. The file striping so set on "
"the output directory is also inherited by the sub-directories "
"ASPECT creates within it."
"\n\n"
"To use this parameter, your cluster must use the Lustre file "
"system. The appropriate stripe count depends on the physical "
"details and configuration of the file servers attached to the "
"cluster; consult your cluster's local documentation or your "
"cluster administrator to find a suitable value.");

prm.declare_entry ("Use operator splitting", "false",
Patterns::Bool(),
@@ -1568,9 +1600,40 @@ namespace aspect
else if (output_directory[output_directory.size()-1] != '/')
output_directory += "/";

// Ensure that the output directory exists. If asked for in the input file,
// set LFS striping as well to improve performance.
Utilities::create_directory (output_directory,
mpi_communicator,
false);
{
  const unsigned int lfs_stripe_count = prm.get_integer("Output directory LFS stripe count");
  if (lfs_stripe_count != 0)
    {
      if (Utilities::MPI::this_mpi_process(mpi_communicator) == 0)
        {
          // Only process 0 runs the Lustre 'lfs' tool on the output directory.
          const std::string command = "lfs setstripe -c " + std::to_string(lfs_stripe_count)
                                      + ' ' + output_directory;
          const int error_code = system (command.c_str());

          // Tell all other processes whether the command succeeded.
          Utilities::MPI::broadcast(mpi_communicator, error_code, 0);

          AssertThrow (error_code == 0,
                       ExcMessage ("Could not successfully execute the LFS file striping "
                                   "command '" + command + "'. The error code of the "
                                   "system() command was " +
                                   std::to_string(error_code)));
        }
      else
        {
          // Receive the result from process 0. Initialize the variable so that
          // no indeterminate value is passed to broadcast().
          int error_code = 0;
          error_code = Utilities::MPI::broadcast(mpi_communicator, error_code, 0);

          if (error_code != 0)
            throw QuietException();
        }
    }
}


if (prm.get ("Resume computation") == "true")
resume_computation = true;
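
A note on the error message in the hunk above: on POSIX systems, the value returned by `system()` is a wait status rather than the plain exit code of the command, so the number printed in the exception may not match what the command itself reported. The following is a minimal sketch of how such a status could be decoded, assuming a POSIX platform; the helper `describe_system_status` is hypothetical and not part of ASPECT or this commit.

```cpp
#include <cstdlib>
#include <string>
#include <sys/wait.h>   // POSIX: WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG

// Hypothetical helper (not part of ASPECT): turn the value returned by
// std::system() on a POSIX system into a human-readable description.
std::string describe_system_status(const int status)
{
  if (status == -1)
    return "the child process could not be created";
  if (WIFEXITED(status))
    return "exit code " + std::to_string(WEXITSTATUS(status));
  if (WIFSIGNALED(status))
    return "terminated by signal " + std::to_string(WTERMSIG(status));
  return "unknown status " + std::to_string(status);
}
```

Under these assumptions, a shell that cannot find the requested command conventionally reports exit code 127, which would help distinguish a missing `lfs` binary from a failing `setstripe` call.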