Commit

Merge pull request #693 from galacticusorg/featParametersTools
Add new tools for working with parameter files
abensonca authored Sep 10, 2024
2 parents eb2baad + eb85379 commit bcc0a68
Showing 12 changed files with 1,158 additions and 154 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/cicd.yml
@@ -1071,6 +1071,8 @@ jobs:
test-perl-modules.pl,
test-reproducibility.pl,
test-splitForests.pl,
test-parameters-diff.py,
test-parameters-extract.py,
test-output-datasets-suffixes.py,
test-star-formation-histories.py,
test-star-formation-histories-adaptive.py,
2 changes: 1 addition & 1 deletion .github/workflows/testModel.yml
@@ -155,7 +155,7 @@ jobs:
false
fi
apt -y update && apt -y upgrade
apt -y install python3-numpy python3-h5py
apt -y install python3-numpy python3-h5py python3-lxml python3-blessings
cd ${{ inputs.runPath }}
chmod u=wrx ./${{ inputs.file }}
./${{ inputs.file }} ${{ inputs.options }} 2>&1 | tee test.log
42 changes: 41 additions & 1 deletion doc/Advanced.tex
@@ -65,6 +65,46 @@ \section{Parameter Files}\label{sec:ParameterFiles}

All parameter values (both those specified in this file and those set to default) used during a \glc\ run are output to the {\normalfont \ttfamily Parameters} group within the \glc\ output file. If parameters are present in the parameter file which do not match any known parameter in \glc\ then a warning message, listing all unknown parameters, will be given when \glc\ is run. Note that this will \emph{not} prevent \glc\ from running---sometimes it is convenient to include parameters which are not used by \glc, but which might be used by some other code.

\subsection{Extracting Parameter Files From Outputs}\index{parameters!extracting from outputs}

The values of all parameters (including those set to defaults) are stored to the Galacticus output file in the {\normalfont \ttfamily Parameters} group (see \S\ref{sec:outputFile:parametersGroup}). These parameter settings can be extracted back to an XML file using the {\normalfont \ttfamily parametersExtract.py} script:
\begin{verbatim}
./scripts/parameters/parametersExtract.py galacticus.hdf5 extractedParameters.xml
\end{verbatim}
In this example, all parameters that were used to run the {\normalfont \ttfamily galacticus.hdf5} model and that were stored in that file will be extracted and output to the {\normalfont \ttfamily extractedParameters.xml} file.

\subsection{Differencing Parameter Files}\index{parameters!differences}

The differences between two parameter files can be shown using the {\normalfont \ttfamily parametersDiff.py} script. For example:
\begin{verbatim}
./scripts/parameters/parametersDiff.py parameters1.xml parameters2.xml
\end{verbatim}
will show the differences between {\normalfont \ttfamily parameters1.xml} and {\normalfont \ttfamily parameters2.xml}.

By default, the \emph{order} of differently-named parameters is ignored when looking for differences, since that order makes no difference to \glc. However, ignoring order requires sorting the parameters alphabetically by name before comparing them, which can make it harder to locate the reported differences in the original files. Adding the option {\normalfont \ttfamily --respectOrder} preserves the order of parameters. This may result in more differences being shown, but with more useful context for finding them in the parameter files.
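
The effect of ignoring order can be illustrated with a minimal sketch, assuming only the standard-library {\normalfont \ttfamily xml.etree.ElementTree} module. The function name {\normalfont \ttfamily sortedByName} and the two sample parameter files are hypothetical and are used purely for illustration:

```python
import xml.etree.ElementTree as ET

def sortedByName(xmlText):
    """Return the XML text with the children of every element sorted by tag name."""
    root = ET.fromstring(xmlText)
    # Take a snapshot of all elements first, then reorder the children of each one.
    for parent in list(root.iter()):
        # sorted() is stable, so same-named siblings keep their relative order.
        parent[:] = sorted(parent, key=lambda child: child.tag)
    return ET.tostring(root, encoding="unicode")

# Two hypothetical parameter files that differ only in the order of their parameters.
file1 = '<parameters><b value="1"/><a value="2"/></parameters>'
file2 = '<parameters><a value="2"/><b value="1"/></parameters>'

# After sorting, the two files compare as identical text.
print(sortedByName(file1) == sortedByName(file2))
```

Because the sort is stable, only differently-named parameters are reordered; repeated parameters with the same name keep their original sequence.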

Differences are detected via a textual comparison. Consequently, parameters:
\begin{verbatim}
<myParameter value="0.001"/>
\end{verbatim}
and
\begin{verbatim}
<myParameter value="1.0e-3"/>
\end{verbatim}
will be identified as a difference, even though they are numerically identical. To avoid such false differences, numerical values in parameters can be put into a canonical form using the {\normalfont \ttfamily --canonicalizeValues} option, which takes a standard Python \href{https://docs.python.org/3/library/string.html#formatstrings}{format specification} (the part of a format string that follows the colon). For example:
\begin{verbatim}
./scripts/parameters/parametersDiff.py --canonicalizeValues .4f parameters1.xml parameters2.xml
\end{verbatim}
will convert all numerical values into fixed-point numbers with 4 digits after the decimal point. So, in the above example the parameters would be rewritten as:
\begin{verbatim}
<myParameter value="0.0010"/>
\end{verbatim}
and
\begin{verbatim}
<myParameter value="0.0010"/>
\end{verbatim}
and so would be seen as identical.
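
The canonicalization step can be sketched in a few lines of Python. This is an illustration only: the function name {\normalfont \ttfamily canonicalize} is hypothetical, and the sketch assumes that values are whitespace-separated tokens, with non-numeric tokens left unchanged:

```python
def canonicalize(value, spec):
    """Format each whitespace-separated numeric token of `value` using `spec`."""
    tokens = []
    for token in value.split():
        try:
            tokens.append(format(float(token), spec))
        except ValueError:
            # Non-numeric tokens (e.g. names) are left untouched.
            tokens.append(token)
    return " ".join(tokens)

print(canonicalize("0.001", ".4f"))   # 0.0010
print(canonicalize("1.0e-3", ".4f"))  # 0.0010
```

Note that a fixed-point specification such as {\normalfont \ttfamily .4f} maps both {\normalfont \ttfamily 1.0e-6} and {\normalfont \ttfamily 2.0e-6} to {\normalfont \ttfamily 0.0000}, hiding a real difference. For parameters spanning many orders of magnitude, an exponent specification such as {\normalfont \ttfamily .6e} may be a safer choice.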

\subsection{Validating Parameter Files}\index{parameters!validating}

A script, {\normalfont \ttfamily scripts/aux/validateParameters.pl}, is provided to validate parameter files and thereby ensure that they are consistent with \glc's expectations and requirements. To use simply execute:
@@ -225,7 +265,7 @@ \subsection{Filters}
\item[{\normalfont \ttfamily wavelengthEffective}] The effective wavelength, $\lambda_\mathrm{eff}$ (defined as $\lambda_\mathrm{eff}=\left. \int_0^\infty \lambda R(\lambda) \mathrm{d}\lambda \right/ \int_0^\infty R(\lambda) \mathrm{d}\lambda$, where $R(\lambda)$ is the filter response) of the filter in \AA.
\end{description}

\subsection{Parameters}
\subsection{Parameters}\label{sec:outputFile:parametersGroup}

The {\normalfont \ttfamily Parameters} group contains a record of all parameter values (either input or default) that were used for this \glc\ run. The group contains a long list of attributes, each attribute named for the corresponding parameter and with a single entry giving the value of that parameter. If a parameter has subparameters, a group is created having the same name as the parameter, which will contain attributes corresponding to each subparameter. In cases where a parameter appears more than once in a given node of the parameter tree, it will be output with ``{\normalfont \ttfamily [N]}'' appended to its name, where ``{\normalfont \ttfamily N}'' is an integer indicating the instance of the parameter.

50 changes: 35 additions & 15 deletions scripts/aux/archive.pl
@@ -32,9 +32,10 @@
close($dependenciesFile);

# Scan the source directory for files
my @directories = ($galacticusPath."/source");
my @directories = ($galacticusPath."/source",$galacticusPath."/scripts");
my @links;
find(\&linkFinder,@directories);
$_ = "Makefile"; &linkFinder();

# Retrieve links.
my $report;
@@ -64,23 +65,42 @@
sub linkFinder {
# Find links that may be downloaded at run-time.
my $fileName = $_;
my $fullName = $File::Find::name;
return
unless ( $fileName =~ m/\.(F90|Inc)$/ );
unless ( $fileName =~ m/\.(F90|Inc|py)$/ || $fileName =~ m/^Makefile/ );
open(my $file,$fileName);
while ( my $line = <$file> ) {
if ( $line =~ m/^\s*call\s+download\s*\(\s*(["'][^,]+)/ ) {
my $link = $1;
$link =~ s/["']//g;
# Replace dependencies with the actual version number here. Handle the "cloudyVersion" case as a special instance as
# we have to insert a "c" prefix.
$link =~ s/\/\/char\(([a-zA-Z]+)VersionMajor\)\/\//$dependencies->{$1}->{'versionMajor'}/g;
$link =~ s/\/\/char\(cloudyVersion\)\/\//c$dependencies->{'cloudy'}->{'version'}/g;
$link =~ s/\/\/char\(([a-zA-Z]+)Version\)\/\//$dependencies->{$1}->{'version'}/g;
# Add the link to the list to retrieve. We skip the "backup" ("old") Cloudy path here.
push(@links,$link)
unless ( $link =~ m/cloudy_releases\/c\d+\/old\// );
}
# Fortran source.
if ( $fileName =~ m/\.(F90|Inc)$/ ) {
if ( $line =~ m/^\s*call\s+download\s*\(\s*(["'][^,]+)/ ) {
my $link = $1;
$link =~ s/["']//g;
# Replace dependencies with the actual version number here. Handle the "cloudyVersion" case as a special instance as
# we have to insert a "c" prefix.
$link =~ s/\/\/char\(([a-zA-Z]+)VersionMajor\)\/\//$dependencies->{$1}->{'versionMajor'}/g;
$link =~ s/\/\/char\(cloudyVersion\)\/\//c$dependencies->{'cloudy'}->{'version'}/g;
$link =~ s/\/\/char\(([a-zA-Z]+)Version\)\/\//$dependencies->{$1}->{'version'}/g;
# Add the link to the list to retrieve. We skip the "backup" ("old") Cloudy path here.
push(@links,$link)
unless ( $link =~ m/cloudy_releases\/c\d+\/old\// );
}
}
# Python scripts.
if ( $fileName =~ m/\.py$/ ) {
if ( $line =~ m/^\s*urllib\.request\.urlretrieve\s*\(\s*(["'][^,]+)/ ) {
my $link = $1;
$link =~ s/["']//g;
# Add the link to the list to retrieve.
push(@links,$link);
}
}
# Makefiles
if ( $fileName =~ m/^Makefile*/ ) {
if ( $line =~ m/^\s*wget\s+(\-\-\S+\s+)*(\S+)/ ) {
my $link = $2;
# Add the link to the list to retrieve.
push(@links,$link);
}
}
}
close($file);
}
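
The three link patterns used by linkFinder can be tried out in isolation. The following Python sketch is not part of the commit; it transliterates the regular expressions above so that they can be tested against sample lines. The function names findLink and unquote and the example.com URLs in the usage note are hypothetical, and the sketch omits the version-number substitution and the Cloudy "old" path exclusion performed by the Fortran branch.

```python
import re

# Transliterations of the three Perl patterns in linkFinder.
fortranPattern = re.compile(r"""^\s*call\s+download\s*\(\s*(["'][^,]+)""")
pythonPattern = re.compile(r"""^\s*urllib\.request\.urlretrieve\s*\(\s*(["'][^,]+)""")
makefilePattern = re.compile(r"^\s*wget\s+(--\S+\s+)*(\S+)")

def unquote(text):
    """Remove all single and double quote characters."""
    return text.replace('"', "").replace("'", "")

def findLink(fileName, line):
    """Return the download link found in `line`, or None if there is none."""
    if re.search(r"\.(F90|Inc)$", fileName):
        match = fortranPattern.search(line)
        return unquote(match.group(1)) if match else None
    if fileName.endswith(".py"):
        match = pythonPattern.search(line)
        return unquote(match.group(1)) if match else None
    if fileName.startswith("Makefile"):
        # Group 1 holds the last "--option" (if any); group 2 is the link itself.
        match = makefilePattern.search(line)
        return match.group(2) if match else None
    return None
```

For example, findLink("Makefile", "\twget --no-check-certificate https://example.com/b.tar.gz") returns the URL with the leading option skipped, while a line in a .py file that does not begin with urllib.request.urlretrieve yields None.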
137 changes: 0 additions & 137 deletions scripts/aux/parametersDiff.pl

This file was deleted.

91 changes: 91 additions & 0 deletions scripts/parameters/parametersDiff.py
@@ -0,0 +1,91 @@
#!/usr/bin/env python3
import os
import xml.etree.ElementTree as ET
import argparse
from pathlib import Path
import urllib.request
import tarfile
import subprocess
import sys
import tempfile
import re

# Show differences between two Galacticus parameter files.
# Andrew Benson (07-September-2024)

# Parse command line arguments.
parser = argparse.ArgumentParser(prog='parametersDiff.py',description='Show differences between two Galacticus parameter files.')
parser.add_argument('parameterFile1')
parser.add_argument('parameterFile2')
parser.add_argument('--respectOrder', action='store_true',help='respect the order of elements when comparing files')
parser.add_argument('--canonicalizeValues', action='store',help='canonicalize all numerical values before comparing')
args = parser.parse_args()

# Install xdiff if necessary.
dynamicPath = os.environ['GALACTICUS_DATA_PATH']+"/dynamic"
xdiffPath = dynamicPath+"/xdiff-2.4"
xdiff = Path(xdiffPath+"/xdiff.py")
if not xdiff.is_file():
    tarFile = xdiffPath+".tar.gz"
    urllib.request.urlretrieve("https://hg.sr.ht/~nolda/xdiff/archive/2.4.tar.gz", tarFile)
    tarball = tarfile.open(tarFile)
    tarball.extractall(dynamicPath)
    tarball.close()

# Create list of filenames to compare.
fileNames = [ args.parameterFile1, args.parameterFile2 ]
fileNamesTmp = [ ]

# If parameter order is not to be respected, create copies of our files with parameters sorted by name.
if not args.respectOrder:
    for i in range(2):
        parametersDoc = ET.parse(fileNames[i])
        parameters = parametersDoc.getroot()
        for parent in parameters.iter():
            parent[:] = sorted(parent,key=lambda x: x.tag)
        ET.indent(parametersDoc, space=" ", level=0)
        fileOut = tempfile.NamedTemporaryFile(mode="w",encoding="utf8",delete=False)
        fileOut.write(ET.tostring(parameters, encoding="unicode"))
        fileOut.close()
        # Replace the file name with that of our temporary file.
        fileNames[i] = fileOut.name
        fileNamesTmp.append(fileOut.name)

# If values are to be canonicalized, do so.
if args.canonicalizeValues:
    formatCanonical = "{:"+args.canonicalizeValues+"}"
    for i in range(2):
        parametersDoc = ET.parse(fileNames[i])
        parameters = parametersDoc.getroot()
        for parent in parameters.iter():
            if "value" not in parent.attrib:
                continue
            values = re.split(r'\s+',parent.attrib['value'].strip())
            valuesCanonical = []
            for value in values:
                try:
                    valueNumeric = float(value)
                    valuesCanonical.append(formatCanonical.format(valueNumeric))
                except ValueError:
                    valuesCanonical.append(value)
            parent.attrib['value'] = " ".join(valuesCanonical)

        ET.indent(parametersDoc, space=" ", level=0)
        fileOut = tempfile.NamedTemporaryFile(mode="w",encoding="utf8",delete=False)
        fileOut.write(ET.tostring(parameters, encoding="unicode"))
        fileOut.close()
        # Replace the file name with that of our temporary file.
        fileNames[i] = fileOut.name
        fileNamesTmp.append(fileOut.name)

# Run `xdiff` to compare the files.
status = subprocess.run("python3 "+str(xdiff)+" "+" ".join(fileNames),shell=True)

# Remove any temporary files. These may have been created by either the sorting or the canonicalization step, so
# clean up regardless of which options were given.
for fileNameTmp in fileNamesTmp:
    os.unlink(fileNameTmp)

# Return diff status.
sys.exit(status.returncode)
