diff --git a/manual/main/en/html/_images/task_view.png b/manual/main/en/html/_images/task_view.png
index 0ca4be0..44d038a 100644
Binary files a/manual/main/en/html/_images/task_view.png and b/manual/main/en/html/_images/task_view.png differ
diff --git a/manual/main/en/html/_sources/moller/command/index.rst.txt b/manual/main/en/html/_sources/moller/command/index.rst.txt
index 1b846c6..7069b0f 100644
--- a/manual/main/en/html/_sources/moller/command/index.rst.txt
+++ b/manual/main/en/html/_sources/moller/command/index.rst.txt
@@ -57,7 +57,7 @@ DESCRIPTION:
 
 - ``list_file``
 
-  specifies the file that contains list of job directories. If this file is not specified, the list will be obtained from the logfile of the batch job ``log_{task}.dat``.
+  specifies the file that contains list of job directories. If this file is not specified, the list will be obtained from the logfile of the batch job ``stat_{task}.dat``.
 
 - ``-o``, ``--output`` ``output_file``
 
@@ -91,5 +91,5 @@ DESCRIPTION:
 
 FILES:
 
-  When the programs are executed concurrently using the job script generated by ``moller``, the status of the tasks are written in log files ``log_{task}.dat``. ``moller_status`` reads these log files and makes a summary.
+  When the programs are executed concurrently using the job script generated by ``moller``, the status of the tasks are written in log files ``stat_{task}.dat``. ``moller_status`` reads these log files and makes a summary.
 
diff --git a/manual/main/en/html/_sources/moller/tutorial/basic.rst.txt b/manual/main/en/html/_sources/moller/tutorial/basic.rst.txt
index d1bd8b4..c2e0f70 100644
--- a/manual/main/en/html/_sources/moller/tutorial/basic.rst.txt
+++ b/manual/main/en/html/_sources/moller/tutorial/basic.rst.txt
@@ -78,7 +78,7 @@ A list of jobs is to be created. ``moller`` is designed so that each job is exec
 
 .. code-block:: bash
 
-  $ /usr/bin/ls -1d > list.dat
+  $ /usr/bin/ls -1d * > list.dat
 
 In this tutorial, an utility script ``make_inputs.sh`` is enclosed which generates datasets and a list file.
 
@@ -128,3 +128,34 @@ An example of the output is shown below:
 
 where "o" corresponds to a task that has been completed successfully, "x" corresponds to a failed task, "-" corresponds to a skipped task because the previous task has been terminated with errors, and "." corresponds to a task yet unexecuted.
 In the above example, the all tasks have been completed successfully.
+
+
+Rerun failed tasks
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If a task fails, the subsequent tasks within the job will not be executed.
+The following is an example of job status in which each task fails by 10% change.
+
+.. literalinclude:: ../../../../tutorial/moller/reference/status_failed.txt
+
+There, the jobs of dataset_0003 and dataset_0004 failed at task1, and the subsequent task2 and task3 were not executed. The other jobs were successful at task1, and proceeded to task2.
+In this way, each job is executed independently of other jobs.
+
+Users can rerun the failed tasks by submitting the batch job with the retry option.
+For SLURM job scheduler (e.g. used in ISSP system B), resubmit the job as follows:
+
+.. code-block:: bash
+
+  $ sbatch job.sh --retry list.dat
+
+For PBS job scheduler (e.g. used in ISSP system C), edit the job script so that the line ``retry=0`` is replaced by ``retry=1``, and resubmit the job.
+
+.. literalinclude:: ../../../../tutorial/moller/reference/status_retry.txt
+
+The tasks that have failed will be executed in the second run.
+In the above example, the task1 for dataset_0003 was successful, but the task2 failed.
+For dataset_0004, task1, task2, and task3 were successfully executed.
+For the jobs of datasets whose tasks have already finished successfully, the second run will not do anything.
+
+N.B. the list file must not be modified on the rerun. The jobs are managed according to the order of entries in the list file, and therefore, if the order is changed, the jobs will not be executed properly.
+
diff --git a/manual/main/en/html/_static/task_view.pdf b/manual/main/en/html/_static/task_view.pdf
index e85cbb6..3af0603 100644
Binary files a/manual/main/en/html/_static/task_view.pdf and b/manual/main/en/html/_static/task_view.pdf differ
diff --git a/manual/main/en/html/_static/task_view.png b/manual/main/en/html/_static/task_view.png
index 0ca4be0..44d038a 100644
Binary files a/manual/main/en/html/_static/task_view.png and b/manual/main/en/html/_static/task_view.png differ
diff --git a/manual/main/en/html/genindex.html b/manual/main/en/html/genindex.html
index c598ec1..00dd17b 100644
--- a/manual/main/en/html/genindex.html
+++ b/manual/main/en/html/genindex.html
@@ -93,7 +93,7 @@
 moller.
 list_file
-specifies the file that contains list of job directories. If this file is not specified, the list will be obtained from the logfile of the batch job log_{task}.dat.
+specifies the file that contains list of job directories. If this file is not specified, the list will be obtained from the logfile of the batch job stat_{task}.dat.
 -o, --output output_file
 specifies the output file name. If it is omitted, the result is written to the standard output.
@@ -119,7 +119,7 @@ FILES:
-When the programs are executed concurrently using the job script generated by moller, the status of the tasks are written in log files log_{task}.dat. moller_status reads these log files and makes a summary.
+When the programs are executed concurrently using the job script generated by moller, the status of the tasks are written in log files stat_{task}.dat. moller_status reads these log files and makes a summary.
@@ -192,7 +192,7 @@ Quick search
 | Powered by Sphinx 7.2.6
- & Alabaster 0.7.15
+ & Alabaster 0.7.16
 | Quick search
 | Powered by Sphinx 7.2.6
- & Alabaster 0.7.15
+ & Alabaster 0.7.16
 | Quick search
 | Powered by Sphinx 7.2.6
- & Alabaster 0.7.15
+ & Alabaster 0.7.16
 | Generate batch job script
 Create list file¶
A list of jobs is to be created.
-moller
is designed so that each job is executed within a directory prepared for the job with the job name. The job list can be created, for example, by the following command:diff --git a/manual/main/ja/html/index.html b/manual/main/ja/html/index.html index 1981a88..447763a 100644 --- a/manual/main/ja/html/index.html +++ b/manual/main/ja/html/index.html @@ -108,7 +108,7 @@diff --git a/manual/main/en/html/searchindex.js b/manual/main/en/html/searchindex.js index 42aeb90..06b97eb 100644 --- a/manual/main/en/html/searchindex.js +++ b/manual/main/en/html/searchindex.js @@ -1 +1 @@ -Search.setIndex({"docnames": ["index", "moller/about/index", "moller/appendix/index", "moller/basic-usage", "moller/command/index", "moller/filespec/index", "moller/index", "moller/tutorial/basic", "moller/tutorial/dsqss", "moller/tutorial/hphi", "moller/tutorial/index"], "filenames": ["index.rst", "moller/about/index.rst", "moller/appendix/index.rst", "moller/basic-usage.rst", "moller/command/index.rst", "moller/filespec/index.rst", "moller/index.rst", "moller/tutorial/basic.rst", "moller/tutorial/dsqss.rst", "moller/tutorial/hphi.rst", "moller/tutorial/index.rst"], "titles": ["Moller Users Guide", "1. Introduction", "6. Extension guide", "2. Installation and basic usage", "4. Command reference", "5. File format", "Comprehensive Calculation Utility (moller)", "3.1. Basic usage", "3.3. Example for moller calculation with DSQSS", "3.2. Example for moller calculation with HPhi", "3. 
Tutorial"], "terms": {"comprehens": [0, 3, 4], "calcul": [0, 2, 3, 4, 7, 10], "util": [0, 3, 7], "introduct": [0, 6], "instal": [0, 6, 8, 9], "basic": [0, 6, 10], "usag": [0, 2, 6, 10], "tutori": [0, 3, 6, 7, 8, 9], "command": [0, 2, 3, 5, 6, 7], "refer": [0, 2, 3, 6, 7], "file": [0, 2, 3, 4, 6, 8, 9, 10], "format": [0, 2, 3, 4, 6, 7], "extens": [0, 6], "In": [1, 2, 3, 5, 7, 8, 9], "recent": 1, "year": 1, "us": [1, 2, 3, 4, 5, 7, 8, 9], "machin": 1, "learn": 1, "predict": 1, "materi": 1, "properti": 1, "design": [1, 7], "substanc": 1, "known": [1, 2], "informat": 1, "ha": [1, 2, 3, 7, 8], "gain": 1, "consider": 1, "attent": 1, "The": [1, 2, 3, 4, 5, 7, 8, 9], "accuraci": 1, "depend": [1, 2, 3, 5, 8, 9], "heavili": 1, "prepar": [1, 3, 5, 10], "appropri": [1, 2], "train": 1, "data": [1, 9], "therefor": [1, 2, 8], "develop": [1, 2], "tool": [1, 2, 3, 7, 8, 9], "rapid": 1, "gener": [1, 2, 3, 4, 8, 9, 10], "expect": [1, 9], "contribut": 1, "significantli": 1, "advanc": 1, "research": 1, "provid": [1, 2, 4, 5, 7], "part": [1, 2, 5], "htp": [1, 3, 8, 9], "packag": [1, 8, 9], "support": 1, "high": 1, "throughput": 1, "comput": [1, 2, 3], "It": [1, 2, 3, 4, 5, 7], "batch": [1, 2, 3, 4, 5, 8, 9, 10], "job": [1, 3, 4, 6, 8, 9, 10], "script": [1, 3, 4, 5, 8, 9, 10], "supercomput": [1, 2, 3, 7, 8, 9], "cluster": 1, "allow": 1, "parallel": [1, 2, 3, 5, 7, 8, 9], "execut": [1, 3, 4, 5, 6, 7, 8, 9], "program": [1, 2, 3, 4, 5, 7, 8, 9], "under": [1, 8, 9], "seri": [1, 7], "condit": [1, 3, 8, 9], "paramet": [1, 2, 3, 4, 5, 7, 8], "current": [1, 7], "ohtaka": [1, 2, 3, 5, 7, 8, 9], "slurm": [1, 3, 5, 7], "schedul": [1, 3, 7], "kugui": [1, 2, 3, 5], "pb": [1, 3, 5], "institut": 1, "solid": 1, "state": 1, "physic": 1, "univers": 1, "tokyo": 1, "distribut": [1, 2], "sourc": [1, 3, 7, 8, 9], "code": [1, 2, 5, 7], "follow": [1, 2, 3, 4, 5, 7, 9], "gnu": [1, 2, 3], "public": 1, "version": [1, 2], "3": [1, 2, 3, 7], "gpl": 1, "v3": 1, "later": 1, "thi": [1, 2, 4, 5, 7, 10], "softwar": [1, 
8, 9], "wa": 1, "ver": 1, "1": [1, 2, 3, 7, 8, 9], "0": [1, 3, 7, 8, 9], "beta": 1, "releas": 1, "2023": [1, 7], "12": 1, "28": 1, "kazuyoshi": 1, "yoshimi": 1, "instutit": 1, "tatsumi": 1, "aoyama": 1, "yuichi": 1, "motoyama": 1, "masahiro": 1, "fukuda": 1, "kota": 1, "ido": 1, "tetsuya": 1, "fukushima": 1, "nation": 1, "industri": 1, "scienc": 1, "technologi": 1, "aist": 1, "shusuk": 1, "kasamatsu": 1, "yamagata": 1, "takashi": 1, "koretsun": 1, "tohoku": 1, "project": 1, "corrdin": 1, "taisuk": 1, "ozaki": 1, "all": [1, 2, 4, 5, 7], "right": 1, "reserv": 1, "usabl": 1, "test": 1, "platform": [1, 2, 3, 7], "ubuntu": 1, "linux": 1, "python3": [1, 3, 8], "n": 2, "b": [2, 3, 7, 9], "content": [2, 5, 7], "section": [2, 3, 5, 7, 9], "mai": [2, 3, 5, 7], "vari": 2, "A": [2, 5, 7, 8, 9], "mean": [2, 8], "set": [2, 3, 7], "small": [2, 8], "task": [2, 3, 4, 5, 7, 8, 9], "ar": [2, 3, 4, 5, 7, 8, 9], "within": [2, 5, 7, 8, 9], "singl": [2, 5, 7], "submit": [2, 3, 5, 7, 8, 9], "larg": 2, "queue": [2, 5, 7], "i": [2, 3, 4, 5, 6, 7, 8, 9], "schemat": 2, "shown": [2, 7], "which": [2, 3, 5, 7, 8, 9], "launch": 2, "background": 2, "process": [2, 5, 7], "wait": 2, "statement": 2, "invok": [2, 3, 7], "complet": [2, 3, 4, 5, 7], "param_1": 2, "param_2": 2, "param_n": 2, "To": [2, 3, 8, 9], "manag": 2, "requir": [2, 3, 5, 7], "node": [2, 3, 5, 7], "core": [2, 5], "alloc": [2, 5], "over": 2, "so": [2, 3, 7], "thei": [2, 5], "distinct": 2, "also": [2, 3], "need": [2, 3, 7], "arrang": 2, "where": [2, 3, 5, 7], "most": 2, "run": [2, 3, 5, 10], "simultan": 2, "accord": 2, "resourc": 2, "hereaft": 2, "denot": 2, "concurr": [2, 3, 4, 7], "control": 2, "take": [2, 4, 7], "list": [2, 3, 4, 6, 8, 9, 10], "hold": 2, "item": [2, 5], "each": [2, 3, 5, 7, 8], "an": [2, 3, 5, 7, 8, 9], "exampl": [2, 5, 6, 7, 10], "given": [2, 3, 5], "dat": [2, 3, 4, 7, 8, 9], "contain": [2, 3, 4, 5, 7], "line": [2, 3, 4, 5, 7], "cat": 2, "j": 2, "number": [2, 5], "determin": 2, "runtim": 2, "from": [2, 4, 7, 8, 9], 
"obtain": [2, 3, 4, 7, 9], "environ": [2, 5, 6], "degre": [2, 5], "thread": [2, 5], "specifi": [2, 3, 4, 5, 7], "wai": [2, 3], "assign": [2, 5], "For": [2, 3, 7], "call": [2, 7], "srun": [2, 5], "exploit": 2, "option": [2, 3, 4, 5, 7], "exclus": [2, 5], "explicit": 2, "On": [2, 7, 9], "hand": [2, 7, 9], "do": 2, "have": [2, 5, 7], "handl": [2, 7], "divid": 2, "slot": 2, "divis": 2, "kept": 2, "form": [2, 5], "tabl": [2, 5, 7], "variabl": [2, 5], "host": 2, "pin": 2, "through": [2, 4], "mpirun": [2, 5], "mpiexec": [2, 5], "mpi": [2, 5], "implement": 2, "o": [2, 3, 4, 7, 8, 9], "tang": [2, 3], "power": [2, 3], "login": [2, 3, 7], "usenix": [2, 3], "magazin": [2, 3], "februari": [2, 3], "2011": [2, 3], "42": [2, 3], "47": [2, 3], "read": [2, 4], "input": [2, 3, 7, 8, 9], "yaml": [2, 3, 5, 7, 8, 9], "describ": [2, 3, 5, 7], "header": 2, "prologu": [2, 7], "correspond": [2, 7, 8, 9], "block": 2, "written": [2, 4, 5, 7, 8, 9], "definit": 2, "next": [2, 7], "accept": [2, 5], "addit": [2, 3, 5, 7], "argument": [2, 3, 7, 8, 9], "submiss": 2, "sbatch": [2, 3, 5, 7, 8, 9], "pass": [2, 3, 7], "name": [2, 3, 4, 5, 7, 8, 9], "retri": [2, 3], "can": [2, 3, 7], "ignor": [2, 5], "fix": [2, 3], "default": [2, 3, 4, 5], "enabl": 2, "modifi": 2, "when": [2, 4, 5], "more": [2, 4, 5, 7], "than": [2, 4, 5], "one": [2, 4, 5, 7, 8], "procedur": [2, 7], "appli": 2, "fals": [2, 5, 7], "true": [2, 5, 7], "creat": [2, 3, 10], "task_": 2, "pre": [2, 7], "keyword": [2, 5], "substitut": 2, "epilogu": [2, 7], "main": [2, 3, 8], "briefli": 2, "below": [2, 7], "run_parallel": 2, "perform": [2, 4, 7, 8, 9], "statu": [2, 3, 4, 8, 9, 10], "_find_multipl": 2, "find": 2, "actual": [2, 3, 5], "wrap": 2, "_run_parallel_task": 2, "deal": 2, "nest": 2, "separ": [2, 4], "out": [2, 5, 7], "_setup_run_parallel": 2, "account": 2, "inform": [2, 7], "summar": [2, 4, 8, 9], "_nnode": 2, "slurm_nnod": 2, "_ncore": 2, "slurm_cpus_on_nod": 2, "_node": 2, "uniqu": 2, "pbs_nodefil": 2, "entri": 2, "search": 2, "order": 
[2, 3, 7], "examin": 2, "ncpu": 2, "profession": 2, "omp_num_thread": 2, "moller_cor": 2, "ppn": 2, "supplement": 2, "some": 2, "befor": [2, 7], "export": 2, "noth": 2, "2": [2, 7, 8, 9], "directori": [2, 3, 4, 5, 7, 8, 9], "id": 2, "_setup_taskenv": 2, "up": [2, 9], "base": [2, 3, 5, 7], "_is_readi": 2, "check": [2, 3, 8, 9, 10], "preced": 2, "been": [2, 3, 5, 7], "successfulli": [2, 3, 4, 7], "If": [2, 3, 4, 5, 7], "remain": [2, 8, 9], "otherwis": [2, 5], "termin": [2, 3, 4, 7], "latest": 2, "profil": 2, "issp": [2, 3, 7, 8, 9], "place": 2, "Their": 2, "depict": 2, "factori": 2, "select": 2, "import": 2, "__init__": [2, 3], "py": [2, 3, 8], "regist": 2, "register_platform": 2, "system_nam": 2, "class_nam": 2, "becom": 2, "avail": [2, 8, 9], "specif": [2, 7], "should": [2, 3, 7], "deriv": 2, "baseslurm": 2, "string": [2, 5, 7], "return": 2, "valu": [2, 4, 5, 7, 9], "parallel_command": 2, "method": [2, 8, 9], "see": 2, "openpb": 2, "torqu": 2, "basepb": 2, "There": [2, 3], "two": 2, "while": 2, "self": 2, "pbs_use_old_format": 2, "latter": 2, "per": [2, 5, 7], "128": 2, "further": 2, "overridden": 2, "relev": 2, "setup": 2, "extract": [2, 4, 7], "generate_head": 2, "generate_funct": 2, "bodi": [2, 8, 9], "generate_vari": 2, "generate_function_bodi": 2, "embed": [2, 5], "multipl": [2, 5], "intern": 2, "acquir": 2, "e": [2, 5, 9], "g": [2, 5, 9], "printenv": 2, "_debug": 2, "debug": 2, "output": [2, 4, 5, 7, 8], "print": 2, "dure": [2, 3], "doe": 2, "well": [2, 3], "recommend": 2, "turn": 2, "defin": [2, 7], "prerequisit": 3, "moller": [3, 5, 7, 10], "includ": [3, 5], "librari": [3, 5], "python": 3, "x": [3, 7], "ruamel": 3, "modul": [3, 7], "tabul": 3, "must": 3, "server": 3, "offici": [3, 8, 9], "page": 3, "github": 3, "repositori": 3, "download": 3, "git": 3, "clone": 3, "http": 3, "com": [3, 5], "center": 3, "dev": 3, "onc": [3, 9], "you": [3, 7], "automat": 3, "same": 3, "time": [3, 5], "cd": [3, 7], "m": [3, 5, 8], "pip": 3, "moller_statu": [3, 6, 7, 8, 9], 
"structur": 3, "licens": [3, 6], "readm": 3, "md": 3, "pyproject": 3, "toml": 3, "doc": [3, 7], "ja": 3, "en": 3, "src": 3, "base_slurm": 3, "base_pb": 3, "base_default": 3, "function": 3, "sampl": [3, 5, 7, 10], "featur": 3, "descript": [3, 4, 6, 8, 9, 10], "first": [3, 7, 8], "detail": [3, 7], "manual": 3, "sh": [3, 7, 8, 9], "transfer": [3, 7], "note": [3, 7, 8, 9], "rel": 3, "path": [3, 5, 8], "absolut": 3, "readi": 3, "system": [3, 5, 6, 7, 8, 9], "case": [3, 7, 8], "c": 3, "qsub": 3, "thu": 3, "after": [3, 5, 7, 8], "finish": [3, 8, 9], "report": [3, 4], "whether": [3, 8], "resum": 3, "again": [3, 7, 8], "yet": [3, 4, 7], "unexecut": [3, 7], "unfinish": 3, "fail": [3, 4, 5, 7], "edit": 3, "chang": [3, 7, 8], "Then": [3, 8], "abov": [3, 7], "synopsi": 4, "job_script": 4, "input_yaml": 4, "supersed": 4, "output_fil": [4, 5], "result": [4, 5, 7, 8, 9], "standard": [4, 5, 7], "h": 4, "displai": 4, "help": 4, "exit": 4, "text": [4, 5, 7, 8, 9], "csv": 4, "html": 4, "ok": 4, "skip": [4, 7], "collaps": 4, "list_fil": 4, "log": [4, 7], "error": [4, 7], "comma": 4, "logfil": 4, "log_": 4, "omit": [4, 7], "filter": 4, "onli": 4, "whose": 4, "ani": 4, "make": [4, 8, 9], "summari": [4, 7], "configur": 5, "consist": [5, 7, 9], "initi": 5, "final": [5, 7], "carri": [5, 7], "betch": 5, "left": 5, "unspecifi": 5, "usual": 5, "regard": 5, "comment": 5, "none": 5, "them": 5, "target": [5, 7], "At": 5, "present": [5, 7], "either": 5, "integ": 5, "rang": 5, "both": [5, 7], "second": 5, "elaps": [5, 7], "hh": 5, "mm": 5, "ss": 5, "other": [5, 6, 7, 9], "head": 5, "direct": 5, "mail": 5, "type": [5, 7], "begin": 5, "end": [5, 7], "user": [5, 7], "requeu": 5, "bea": 5, "r": 5, "y": 5, "prior": 5, "shell": [5, 7, 9], "sequenc": 5, "kei": [5, 7], "smaller": 5, "differ": [5, 7, 8, 9], "sequenti": [5, 7], "openmpi": [5, 7], "hybrid": 5, "prog": 5, "arg1": 5, "replac": 5, "associ": 5, "assum": 5, "These": 5, "suppos": 5, "locat": 5, "what": [6, 10], "contributor": 6, "copyright": 6, 
"oper": 6, "hphi": [6, 10], "dsqss": [6, 10], "guid": 6, "bulk": 6, "how": [6, 10], "work": [6, 8, 9], "extend": 6, "step": 7, "we": [7, 8, 9], "explain": 7, "along": 7, "here": 7, "instruct": 7, "sever": [7, 8], "organ": 7, "synchron": 7, "between": [7, 8], "taken": 7, "everi": 7, "start": 7, "three": 7, "4": [7, 9], "among": 7, "concern": 7, "post": 7, "testjob": 7, "i8cpu": 7, "00": 7, "10": 7, "purg": 7, "load": 7, "oneapi_compil": 7, "5": [7, 8], "oneapi": 7, "classic": 7, "ulimit": 7, "": [7, 10], "unlimit": 7, "home": 7, "materiapp": 7, "intel": 7, "parallelvar": 7, "20210622": 7, "echo": 7, "hello": 7, "world": 7, "txt": 7, "sleep": 7, "hello_again": 7, "done": 7, "date": 7, "being": 7, "made": 7, "preprocess": 7, "common": 7, "hello_world": 7, "sinc": 7, "treat": 7, "basi": 7, "similarli": 7, "pleas": 7, "chapter": 7, "neither": 7, "bash": [7, 8, 9], "care": 7, "csh": 7, "usr": 7, "bin": 7, "l": [7, 8, 9], "1d": 7, "make_input": [7, 8, 9], "enclos": [7, 8, 9], "dataset": [7, 8, 9], "By": [7, 8, 9], "subdirectori": 7, "0001": 7, "0020": 7, "copi": [7, 8], "cp": 7, "confirm": 7, "0002": 7, "0003": 7, "0004": 7, "0005": 7, "0006": 7, "0007": 7, "0008": 7, "0009": 7, "0010": 7, "0011": 7, "0012": 7, "0013": 7, "0014": 7, "0015": 7, "0016": 7, "0017": 7, "0018": 7, "0019": 7, "becaus": 7, "previou": [7, 9], "open": [8, 9], "integr": 8, "mont": 8, "calro": 8, "quantum": [8, 9], "mani": [8, 9], "problem": [8, 9], "temperatur": 8, "magnet": 8, "suscept": 8, "chi": 8, "term": 8, "antiferromagnet": [8, 9], "heisenberg": [8, 9], "chain": [8, 9], "period": [8, 9], "boundari": [8, 9], "length": [8, 9], "t": 8, "sure": [8, 9], "alreadi": 8, "exist": 8, "remov": 8, "like": 8, "l_8__m_1__t_1": 8, "store": [8, 9], "gather": [8, 9], "extract_result": 8, "write": [8, 9], "column": 8, "stderr": 8, "visual": [8, 9], "gnuplot": [8, 9], "plot_m1": 8, "plt": [8, 9], "plot_m2": 8, "persist": [8, 9], "afh": 8, "excit": [8, 9], "gap": [8, 9], "vanish": 8, "reflect": 8, "veri": 8, 
"low": 8, "region": 8, "finit": [8, 9], "size": [8, 9], "effect": [8, 9], "spin": [8, 9], "drop": 8, "exact": 9, "diagon": 9, "delta": 9, "2s_1": 9, "2s_2": 9, "l_8": 9, "l_10": 9, "l_24": 9, "l_18": 9, "addition": 9, "extract_gap": 9, "energi": 9, "pair": 9, "fit": 9, "curv": 9, "delta_": 9, "infti": 9, "exp": 9, "cl": 9, "plot": 9, "logarithm": 9, "correct": 9, "caus": 9, "extrapol": 9, "417": 9, "41048": 9, "6": 9, "qmc": 9, "todo": 9, "kato": 9, "prl": 9, "87": 9, "047203": 9, "2001": 9}, "objects": {}, "objtypes": {}, "objnames": {}, "titleterms": {"moller": [0, 1, 2, 4, 6, 8, 9], "user": 0, "guid": [0, 2], "content": 0, "introduct": 1, "what": [1, 8, 9], "i": 1, "licens": 1, "contributor": 1, "copyright": 1, "oper": 1, "environ": 1, "extens": 2, "bulk": 2, "job": [2, 5, 7], "execut": 2, "how": [2, 8, 9], "work": 2, "structur": 2, "script": [2, 7], "brief": 2, "descript": [2, 5, 7], "function": 2, "extend": 2, "other": 2, "system": 2, "class": 2, "slurm": 2, "schedul": 2, "variant": 2, "pb": 2, "custom": 2, "featur": 2, "port": 2, "new": 2, "type": 2, "troubl": 2, "shoot": 2, "instal": 3, "basic": [3, 7], "usag": [3, 7], "command": 4, "refer": 4, "moller_statu": 4, "file": [5, 7], "format": 5, "gener": [5, 7], "set": 5, "platform": 5, "prologu": 5, "epilogu": 5, "list": [5, 7], "comprehens": 6, "calcul": [6, 8, 9], "util": 6, "prepar": [7, 8, 9], "batch": 7, "creat": 7, "run": [7, 8, 9], "check": 7, "statu": 7, "exampl": [8, 9], "dsqss": 8, "": [8, 9], "thi": [8, 9], "sampl": [8, 9], "hphi": 9, "tutori": 10}, "envversion": {"sphinx.domains.c": 3, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 9, "sphinx.domains.index": 1, "sphinx.domains.javascript": 3, "sphinx.domains.math": 2, "sphinx.domains.python": 4, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx": 60}, "alltitles": {"Moller Users Guide": [[0, "moller-users-guide"]], "Contents:": [[0, null]], "Introduction": [[1, "introduction"]], "What is moller?": [[1, 
"what-is-moller"]], "License": [[1, "license"]], "Contributors": [[1, "contributors"]], "Copyright": [[1, "copyright"]], "Operating environment": [[1, "operating-environment"]], "Extension guide": [[2, "extension-guide"]], "Bulk job execution by moller": [[2, "bulk-job-execution-by-moller"]], "How moller works": [[2, "how-moller-works"]], "Structure of moller script": [[2, "structure-of-moller-script"]], "Brief description of moller script functions": [[2, "brief-description-of-moller-script-functions"]], "How to extend moller for other systems": [[2, "how-to-extend-moller-for-other-systems"]], "Class structure": [[2, "class-structure"]], "SLURM job scheduler variants": [[2, "slurm-job-scheduler-variants"]], "PBS job scheduler variants": [[2, "pbs-job-scheduler-variants"]], "Customizing features": [[2, "customizing-features"]], "Porting to new type of job scheduler": [[2, "porting-to-new-type-of-job-scheduler"]], "Trouble shooting": [[2, "trouble-shooting"]], "Installation and basic usage": [[3, "installation-and-basic-usage"]], "Command reference": [[4, "command-reference"]], "moller": [[4, "moller"]], "moller_status": [[4, "moller-status"]], "File format": [[5, "file-format"]], "Job description file": [[5, "job-description-file"]], "General settings": [[5, "general-settings"]], "platform": [[5, "platform"]], "prologue, epilogue": [[5, "prologue-epilogue"]], "jobs": [[5, "jobs"]], "List file": [[5, "list-file"]], "Comprehensive Calculation Utility (moller)": [[6, "comprehensive-calculation-utility-moller"]], "Basic usage": [[7, "basic-usage"]], "Prepare job description file": [[7, "prepare-job-description-file"]], "Generate batch job script": [[7, "generate-batch-job-script"]], "Create list file": [[7, "create-list-file"]], "Run batch job": [[7, "run-batch-job"]], "Check status": [[7, "check-status"]], "Example for moller calculation with DSQSS": [[8, "example-for-moller-calculation-with-dsqss"]], "What\u2019s this sample?": [[8, "whats-this-sample"], [9, 
"whats-this-sample"]], "Preparation": [[8, "preparation"], [9, "preparation"]], "How to run": [[8, "how-to-run"], [9, "how-to-run"]], "Example for moller calculation with HPhi": [[9, "example-for-moller-calculation-with-hphi"]], "Tutorial": [[10, "tutorial"]]}, "indexentries": {}}) \ No newline at end of file +Search.setIndex({"docnames": ["index", "moller/about/index", "moller/appendix/index", "moller/basic-usage", "moller/command/index", "moller/filespec/index", "moller/index", "moller/tutorial/basic", "moller/tutorial/dsqss", "moller/tutorial/hphi", "moller/tutorial/index"], "filenames": ["index.rst", "moller/about/index.rst", "moller/appendix/index.rst", "moller/basic-usage.rst", "moller/command/index.rst", "moller/filespec/index.rst", "moller/index.rst", "moller/tutorial/basic.rst", "moller/tutorial/dsqss.rst", "moller/tutorial/hphi.rst", "moller/tutorial/index.rst"], "titles": ["Moller Users Guide", "1. Introduction", "6. Extension guide", "2. Installation and basic usage", "4. Command reference", "5. File format", "Comprehensive Calculation Utility (moller)", "3.1. Basic usage", "3.3. Example for moller calculation with DSQSS", "3.2. Example for moller calculation with HPhi", "3. 
Tutorial"], "terms": {"comprehens": [0, 3, 4], "calcul": [0, 2, 3, 4, 7, 10], "util": [0, 3, 7], "introduct": [0, 6], "instal": [0, 6, 8, 9], "basic": [0, 6, 10], "usag": [0, 2, 6, 10], "tutori": [0, 3, 6, 7, 8, 9], "command": [0, 2, 3, 5, 6, 7], "refer": [0, 2, 3, 6, 7], "file": [0, 2, 3, 4, 6, 8, 9, 10], "format": [0, 2, 3, 4, 6, 7], "extens": [0, 6], "In": [1, 2, 3, 5, 7, 8, 9], "recent": 1, "year": 1, "us": [1, 2, 3, 4, 5, 7, 8, 9], "machin": 1, "learn": 1, "predict": 1, "materi": 1, "properti": 1, "design": [1, 7], "substanc": 1, "known": [1, 2], "informat": 1, "ha": [1, 2, 3, 7, 8], "gain": 1, "consider": 1, "attent": 1, "The": [1, 2, 3, 4, 5, 7, 8, 9], "accuraci": 1, "depend": [1, 2, 3, 5, 8, 9], "heavili": 1, "prepar": [1, 3, 5, 10], "appropri": [1, 2], "train": 1, "data": [1, 9], "therefor": [1, 2, 7, 8], "develop": [1, 2], "tool": [1, 2, 3, 7, 8, 9], "rapid": 1, "gener": [1, 2, 3, 4, 8, 9, 10], "expect": [1, 9], "contribut": 1, "significantli": 1, "advanc": 1, "research": 1, "provid": [1, 2, 4, 5, 7], "part": [1, 2, 5], "htp": [1, 3, 8, 9], "packag": [1, 8, 9], "support": 1, "high": 1, "throughput": 1, "comput": [1, 2, 3], "It": [1, 2, 3, 4, 5, 7], "batch": [1, 2, 3, 4, 5, 8, 9, 10], "job": [1, 3, 4, 6, 8, 9, 10], "script": [1, 3, 4, 5, 8, 9, 10], "supercomput": [1, 2, 3, 7, 8, 9], "cluster": 1, "allow": 1, "parallel": [1, 2, 3, 5, 7, 8, 9], "execut": [1, 3, 4, 5, 6, 7, 8, 9], "program": [1, 2, 3, 4, 5, 7, 8, 9], "under": [1, 8, 9], "seri": [1, 7], "condit": [1, 3, 8, 9], "paramet": [1, 2, 3, 4, 5, 7, 8], "current": [1, 7], "ohtaka": [1, 2, 3, 5, 7, 8, 9], "slurm": [1, 3, 5, 7], "schedul": [1, 3, 7], "kugui": [1, 2, 3, 5], "pb": [1, 3, 5, 7], "institut": 1, "solid": 1, "state": 1, "physic": 1, "univers": 1, "tokyo": 1, "distribut": [1, 2], "sourc": [1, 3, 7, 8, 9], "code": [1, 2, 5, 7], "follow": [1, 2, 3, 4, 5, 7, 9], "gnu": [1, 2, 3], "public": 1, "version": [1, 2], "3": [1, 2, 3, 7], "gpl": 1, "v3": 1, "later": 1, "thi": [1, 2, 4, 5, 7, 10], "softwar": 
[1, 8, 9], "wa": [1, 7], "ver": 1, "1": [1, 2, 3, 7, 8, 9], "0": [1, 3, 7, 8, 9], "beta": 1, "releas": 1, "2023": [1, 7], "12": 1, "28": 1, "kazuyoshi": 1, "yoshimi": 1, "instutit": 1, "tatsumi": 1, "aoyama": 1, "yuichi": 1, "motoyama": 1, "masahiro": 1, "fukuda": 1, "kota": 1, "ido": 1, "tetsuya": 1, "fukushima": 1, "nation": 1, "industri": 1, "scienc": 1, "technologi": 1, "aist": 1, "shusuk": 1, "kasamatsu": 1, "yamagata": 1, "takashi": 1, "koretsun": 1, "tohoku": 1, "project": 1, "corrdin": 1, "taisuk": 1, "ozaki": 1, "all": [1, 2, 4, 5, 7], "right": 1, "reserv": 1, "usabl": 1, "test": 1, "platform": [1, 2, 3, 7], "ubuntu": 1, "linux": 1, "python3": [1, 3, 8], "n": [2, 7], "b": [2, 3, 7, 9], "content": [2, 5, 7], "section": [2, 3, 5, 7, 9], "mai": [2, 3, 5, 7], "vari": 2, "A": [2, 5, 7, 8, 9], "mean": [2, 8], "set": [2, 3, 7], "small": [2, 8], "task": [2, 3, 4, 5, 8, 9, 10], "ar": [2, 3, 4, 5, 7, 8, 9], "within": [2, 5, 7, 8, 9], "singl": [2, 5, 7], "submit": [2, 3, 5, 7, 8, 9], "larg": 2, "queue": [2, 5, 7], "i": [2, 3, 4, 5, 6, 7, 8, 9], "schemat": 2, "shown": [2, 7], "which": [2, 3, 5, 7, 8, 9], "launch": 2, "background": 2, "process": [2, 5, 7], "wait": 2, "statement": 2, "invok": [2, 3, 7], "complet": [2, 3, 4, 5, 7], "param_1": 2, "param_2": 2, "param_n": 2, "To": [2, 3, 8, 9], "manag": [2, 7], "requir": [2, 3, 5, 7], "node": [2, 3, 5, 7], "core": [2, 5], "alloc": [2, 5], "over": 2, "so": [2, 3, 7], "thei": [2, 5], "distinct": 2, "also": [2, 3], "need": [2, 3, 7], "arrang": 2, "where": [2, 3, 5, 7], "most": 2, "run": [2, 3, 5, 10], "simultan": 2, "accord": [2, 7], "resourc": 2, "hereaft": 2, "denot": 2, "concurr": [2, 3, 4, 7], "control": 2, "take": [2, 4, 7], "list": [2, 3, 4, 6, 8, 9, 10], "hold": 2, "item": [2, 5], "each": [2, 3, 5, 7, 8], "an": [2, 3, 5, 7, 8, 9], "exampl": [2, 5, 6, 7, 10], "given": [2, 3, 5], "dat": [2, 3, 4, 7, 8, 9], "contain": [2, 3, 4, 5, 7], "line": [2, 3, 4, 5, 7], "cat": 2, "j": 2, "number": [2, 5], "determin": 2, "runtim": 2, 
"from": [2, 4, 7, 8, 9], "obtain": [2, 3, 4, 7, 9], "environ": [2, 5, 6], "degre": [2, 5], "thread": [2, 5], "specifi": [2, 3, 4, 5, 7], "wai": [2, 3, 7], "assign": [2, 5], "For": [2, 3, 7], "call": [2, 7], "srun": [2, 5], "exploit": 2, "option": [2, 3, 4, 5, 7], "exclus": [2, 5], "explicit": 2, "On": [2, 7, 9], "hand": [2, 7, 9], "do": [2, 7], "have": [2, 5, 7], "handl": [2, 7], "divid": 2, "slot": 2, "divis": 2, "kept": 2, "form": [2, 5], "tabl": [2, 5, 7], "variabl": [2, 5], "host": 2, "pin": 2, "through": [2, 4], "mpirun": [2, 5], "mpiexec": [2, 5], "mpi": [2, 5], "implement": 2, "o": [2, 3, 4, 7, 8, 9], "tang": [2, 3], "power": [2, 3], "login": [2, 3, 7], "usenix": [2, 3], "magazin": [2, 3], "februari": [2, 3], "2011": [2, 3], "42": [2, 3], "47": [2, 3], "read": [2, 4], "input": [2, 3, 7, 8, 9], "yaml": [2, 3, 5, 7, 8, 9], "describ": [2, 3, 5, 7], "header": 2, "prologu": [2, 7], "correspond": [2, 7, 8, 9], "block": 2, "written": [2, 4, 5, 7, 8, 9], "definit": 2, "next": [2, 7], "accept": [2, 5], "addit": [2, 3, 5, 7], "argument": [2, 3, 7, 8, 9], "submiss": 2, "sbatch": [2, 3, 5, 7, 8, 9], "pass": [2, 3, 7], "name": [2, 3, 4, 5, 7, 8, 9], "retri": [2, 3, 7], "can": [2, 3, 7], "ignor": [2, 5], "fix": [2, 3], "default": [2, 3, 4, 5], "enabl": 2, "modifi": [2, 7], "when": [2, 4, 5], "more": [2, 4, 5, 7], "than": [2, 4, 5], "one": [2, 4, 5, 7, 8], "procedur": [2, 7], "appli": 2, "fals": [2, 5, 7], "true": [2, 5, 7], "creat": [2, 3, 10], "task_": 2, "pre": [2, 7], "keyword": [2, 5], "substitut": 2, "epilogu": [2, 7], "main": [2, 3, 8], "briefli": 2, "below": [2, 7], "run_parallel": 2, "perform": [2, 4, 7, 8, 9], "statu": [2, 3, 4, 8, 9, 10], "_find_multipl": 2, "find": 2, "actual": [2, 3, 5], "wrap": 2, "_run_parallel_task": 2, "deal": 2, "nest": 2, "separ": [2, 4], "out": [2, 5, 7], "_setup_run_parallel": 2, "account": 2, "inform": [2, 7], "summar": [2, 4, 8, 9], "_nnode": 2, "slurm_nnod": 2, "_ncore": 2, "slurm_cpus_on_nod": 2, "_node": 2, "uniqu": 2, 
"pbs_nodefil": 2, "entri": [2, 7], "search": 2, "order": [2, 3, 7], "examin": 2, "ncpu": 2, "profession": 2, "omp_num_thread": 2, "moller_cor": 2, "ppn": 2, "supplement": 2, "some": 2, "befor": [2, 7], "export": 2, "noth": 2, "2": [2, 7, 8, 9], "directori": [2, 3, 4, 5, 7, 8, 9], "id": 2, "_setup_taskenv": 2, "up": [2, 9], "base": [2, 3, 5, 7], "_is_readi": 2, "check": [2, 3, 8, 9, 10], "preced": 2, "been": [2, 3, 5, 7], "successfulli": [2, 3, 4, 7], "If": [2, 3, 4, 5, 7], "remain": [2, 8, 9], "otherwis": [2, 5], "termin": [2, 3, 4, 7], "latest": 2, "profil": 2, "issp": [2, 3, 7, 8, 9], "place": 2, "Their": 2, "depict": 2, "factori": 2, "select": 2, "import": 2, "__init__": [2, 3], "py": [2, 3, 8], "regist": 2, "register_platform": 2, "system_nam": 2, "class_nam": 2, "becom": 2, "avail": [2, 8, 9], "specif": [2, 7], "should": [2, 3, 7], "deriv": 2, "baseslurm": 2, "string": [2, 5, 7], "return": 2, "valu": [2, 4, 5, 7, 9], "parallel_command": 2, "method": [2, 8, 9], "see": 2, "openpb": 2, "torqu": 2, "basepb": 2, "There": [2, 3, 7], "two": 2, "while": 2, "self": 2, "pbs_use_old_format": 2, "latter": 2, "per": [2, 5, 7], "128": 2, "further": 2, "overridden": 2, "relev": 2, "setup": 2, "extract": [2, 4, 7], "generate_head": 2, "generate_funct": 2, "bodi": [2, 8, 9], "generate_vari": 2, "generate_function_bodi": 2, "embed": [2, 5], "multipl": [2, 5], "intern": 2, "acquir": 2, "e": [2, 5, 7, 9], "g": [2, 5, 7, 9], "printenv": 2, "_debug": 2, "debug": 2, "output": [2, 4, 5, 7, 8], "print": 2, "dure": [2, 3], "doe": 2, "well": [2, 3], "recommend": 2, "turn": 2, "defin": [2, 7], "prerequisit": 3, "moller": [3, 5, 7, 10], "includ": [3, 5], "librari": [3, 5], "python": 3, "x": [3, 7], "ruamel": 3, "modul": [3, 7], "tabul": 3, "must": [3, 7], "server": 3, "offici": [3, 8, 9], "page": 3, "github": 3, "repositori": 3, "download": 3, "git": 3, "clone": 3, "http": 3, "com": [3, 5], "center": 3, "dev": 3, "onc": [3, 9], "you": [3, 7], "automat": 3, "same": 3, "time": [3, 5], "cd": 
[3, 7], "m": [3, 5, 8], "pip": 3, "moller_statu": [3, 6, 7, 8, 9], "structur": 3, "licens": [3, 6], "readm": 3, "md": 3, "pyproject": 3, "toml": 3, "doc": [3, 7], "ja": 3, "en": 3, "src": 3, "base_slurm": 3, "base_pb": 3, "base_default": 3, "function": 3, "sampl": [3, 5, 7, 10], "featur": 3, "descript": [3, 4, 6, 8, 9, 10], "first": [3, 7, 8], "detail": [3, 7], "manual": 3, "sh": [3, 7, 8, 9], "transfer": [3, 7], "note": [3, 7, 8, 9], "rel": 3, "path": [3, 5, 8], "absolut": 3, "readi": 3, "system": [3, 5, 6, 7, 8, 9], "case": [3, 7, 8], "c": [3, 7], "qsub": 3, "thu": 3, "after": [3, 5, 7, 8], "finish": [3, 7, 8, 9], "report": [3, 4], "whether": [3, 8], "resum": 3, "again": [3, 7, 8], "yet": [3, 4, 7], "unexecut": [3, 7], "unfinish": 3, "fail": [3, 4, 5, 10], "edit": [3, 7], "chang": [3, 7, 8], "Then": [3, 8], "abov": [3, 7], "synopsi": 4, "job_script": 4, "input_yaml": 4, "supersed": 4, "output_fil": [4, 5], "result": [4, 5, 7, 8, 9], "standard": [4, 5, 7], "h": 4, "displai": 4, "help": 4, "exit": 4, "text": [4, 5, 7, 8, 9], "csv": 4, "html": 4, "ok": 4, "skip": [4, 7], "collaps": 4, "list_fil": 4, "log": [4, 7], "error": [4, 7], "comma": 4, "logfil": 4, "stat_": 4, "omit": [4, 7], "filter": 4, "onli": 4, "whose": [4, 7], "ani": 4, "make": [4, 8, 9], "summari": [4, 7], "configur": 5, "consist": [5, 7, 9], "initi": 5, "final": [5, 7], "carri": [5, 7], "betch": 5, "left": 5, "unspecifi": 5, "usual": 5, "regard": 5, "comment": 5, "none": 5, "them": 5, "target": [5, 7], "At": 5, "present": [5, 7], "either": 5, "integ": 5, "rang": 5, "both": [5, 7], "second": [5, 7], "elaps": [5, 7], "hh": 5, "mm": 5, "ss": 5, "other": [5, 6, 7, 9], "head": 5, "direct": 5, "mail": 5, "type": [5, 7], "begin": 5, "end": [5, 7], "user": [5, 7], "requeu": 5, "bea": 5, "r": 5, "y": 5, "prior": 5, "shell": [5, 7, 9], "sequenc": 5, "kei": [5, 7], "smaller": 5, "differ": [5, 7, 8, 9], "sequenti": [5, 7], "openmpi": [5, 7], "hybrid": 5, "prog": 5, "arg1": 5, "replac": [5, 7], "associ": 5, 
"assum": 5, "These": 5, "suppos": 5, "locat": 5, "what": [6, 10], "contributor": 6, "copyright": 6, "oper": 6, "hphi": [6, 10], "dsqss": [6, 10], "guid": 6, "bulk": 6, "how": [6, 10], "work": [6, 8, 9], "extend": 6, "step": 7, "we": [7, 8, 9], "explain": 7, "along": 7, "here": 7, "instruct": 7, "sever": [7, 8], "organ": 7, "synchron": 7, "between": [7, 8], "taken": 7, "everi": 7, "start": 7, "three": 7, "4": [7, 9], "among": 7, "concern": 7, "post": 7, "testjob": 7, "i8cpu": 7, "00": 7, "10": 7, "purg": 7, "load": 7, "oneapi_compil": 7, "5": [7, 8], "oneapi": 7, "classic": 7, "ulimit": 7, "": [7, 10], "unlimit": 7, "home": 7, "materiapp": 7, "intel": 7, "parallelvar": 7, "20210622": 7, "echo": 7, "hello": 7, "world": 7, "txt": 7, "sleep": 7, "hello_again": 7, "done": 7, "date": 7, "being": 7, "made": 7, "preprocess": 7, "common": 7, "hello_world": 7, "sinc": 7, "treat": 7, "basi": 7, "similarli": 7, "pleas": 7, "chapter": 7, "neither": 7, "bash": [7, 8, 9], "care": 7, "csh": 7, "usr": 7, "bin": 7, "l": [7, 8, 9], "1d": 7, "make_input": [7, 8, 9], "enclos": [7, 8, 9], "dataset": [7, 8, 9], "By": [7, 8, 9], "subdirectori": 7, "0001": 7, "0020": 7, "copi": [7, 8], "cp": 7, "confirm": 7, "0002": 7, "0003": 7, "0004": 7, "0005": 7, "0006": 7, "0007": 7, "0008": 7, "0009": 7, "0010": 7, "0011": 7, "0012": 7, "0013": 7, "0014": 7, "0015": 7, "0016": 7, "0017": 7, "0018": 7, "0019": 7, "becaus": 7, "previou": [7, 9], "subsequ": 7, "task1": 7, "task2": 7, "task3": 7, "dataset_0001": 7, "dataset_0002": 7, "dataset_0003": 7, "dataset_0004": 7, "dataset_0005": 7, "dataset_0006": 7, "dataset_0007": 7, "dataset_0008": 7, "dataset_0009": 7, "dataset_0010": 7, "dataset_0011": 7, "dataset_0012": 7, "dataset_0013": 7, "dataset_0014": 7, "dataset_0015": 7, "dataset_0016": 7, "dataset_0017": 7, "dataset_0018": 7, "dataset_0019": 7, "dataset_0020": 7, "were": 7, "success": 7, "proceed": 7, "independ": 7, "resubmit": 7, "alreadi": [7, 8], "anyth": 7, "properli": 7, "open": [8, 9], 
"integr": 8, "mont": 8, "calro": 8, "quantum": [8, 9], "mani": [8, 9], "problem": [8, 9], "temperatur": 8, "magnet": 8, "suscept": 8, "chi": 8, "term": 8, "antiferromagnet": [8, 9], "heisenberg": [8, 9], "chain": [8, 9], "period": [8, 9], "boundari": [8, 9], "length": [8, 9], "t": 8, "sure": [8, 9], "exist": 8, "remov": 8, "like": 8, "l_8__m_1__t_1": 8, "store": [8, 9], "gather": [8, 9], "extract_result": 8, "write": [8, 9], "column": 8, "stderr": 8, "visual": [8, 9], "gnuplot": [8, 9], "plot_m1": 8, "plt": [8, 9], "plot_m2": 8, "persist": [8, 9], "afh": 8, "excit": [8, 9], "gap": [8, 9], "vanish": 8, "reflect": 8, "veri": 8, "low": 8, "region": 8, "finit": [8, 9], "size": [8, 9], "effect": [8, 9], "spin": [8, 9], "drop": 8, "exact": 9, "diagon": 9, "delta": 9, "2s_1": 9, "2s_2": 9, "l_8": 9, "l_10": 9, "l_24": 9, "l_18": 9, "addition": 9, "extract_gap": 9, "energi": 9, "pair": 9, "fit": 9, "curv": 9, "delta_": 9, "infti": 9, "exp": 9, "cl": 9, "plot": 9, "logarithm": 9, "correct": 9, "caus": 9, "extrapol": 9, "417": 9, "41048": 9, "6": 9, "qmc": 9, "todo": 9, "kato": 9, "prl": 9, "87": 9, "047203": 9, "2001": 9, "rerun": 10}, "objects": {}, "objtypes": {}, "objnames": {}, "titleterms": {"moller": [0, 1, 2, 4, 6, 8, 9], "user": 0, "guid": [0, 2], "content": 0, "introduct": 1, "what": [1, 8, 9], "i": 1, "licens": 1, "contributor": 1, "copyright": 1, "oper": 1, "environ": 1, "extens": 2, "bulk": 2, "job": [2, 5, 7], "execut": 2, "how": [2, 8, 9], "work": 2, "structur": 2, "script": [2, 7], "brief": 2, "descript": [2, 5, 7], "function": 2, "extend": 2, "other": 2, "system": 2, "class": 2, "slurm": 2, "schedul": 2, "variant": 2, "pb": 2, "custom": 2, "featur": 2, "port": 2, "new": 2, "type": 2, "troubl": 2, "shoot": 2, "instal": 3, "basic": [3, 7], "usag": [3, 7], "command": 4, "refer": 4, "moller_statu": 4, "file": [5, 7], "format": 5, "gener": [5, 7], "set": 5, "platform": 5, "prologu": 5, "epilogu": 5, "list": [5, 7], "comprehens": 6, "calcul": [6, 8, 9], "util": 6, 
"prepar": [7, 8, 9], "batch": 7, "creat": 7, "run": [7, 8, 9], "check": 7, "statu": 7, "rerun": 7, "fail": 7, "task": 7, "exampl": [8, 9], "dsqss": 8, "": [8, 9], "thi": [8, 9], "sampl": [8, 9], "hphi": 9, "tutori": 10}, "envversion": {"sphinx.domains.c": 3, "sphinx.domains.changeset": 1, "sphinx.domains.citation": 1, "sphinx.domains.cpp": 9, "sphinx.domains.index": 1, "sphinx.domains.javascript": 3, "sphinx.domains.math": 2, "sphinx.domains.python": 4, "sphinx.domains.rst": 2, "sphinx.domains.std": 2, "sphinx": 60}, "alltitles": {"Moller Users Guide": [[0, "moller-users-guide"]], "Contents:": [[0, null]], "Introduction": [[1, "introduction"]], "What is moller?": [[1, "what-is-moller"]], "License": [[1, "license"]], "Contributors": [[1, "contributors"]], "Copyright": [[1, "copyright"]], "Operating environment": [[1, "operating-environment"]], "Extension guide": [[2, "extension-guide"]], "Bulk job execution by moller": [[2, "bulk-job-execution-by-moller"]], "How moller works": [[2, "how-moller-works"]], "Structure of moller script": [[2, "structure-of-moller-script"]], "Brief description of moller script functions": [[2, "brief-description-of-moller-script-functions"]], "How to extend moller for other systems": [[2, "how-to-extend-moller-for-other-systems"]], "Class structure": [[2, "class-structure"]], "SLURM job scheduler variants": [[2, "slurm-job-scheduler-variants"]], "PBS job scheduler variants": [[2, "pbs-job-scheduler-variants"]], "Customizing features": [[2, "customizing-features"]], "Porting to new type of job scheduler": [[2, "porting-to-new-type-of-job-scheduler"]], "Trouble shooting": [[2, "trouble-shooting"]], "Installation and basic usage": [[3, "installation-and-basic-usage"]], "Command reference": [[4, "command-reference"]], "moller": [[4, "moller"]], "moller_status": [[4, "moller-status"]], "File format": [[5, "file-format"]], "Job description file": [[5, "job-description-file"]], "General settings": [[5, "general-settings"]], "platform": [[5, 
"platform"]], "prologue, epilogue": [[5, "prologue-epilogue"]], "jobs": [[5, "jobs"]], "List file": [[5, "list-file"]], "Comprehensive Calculation Utility (moller)": [[6, "comprehensive-calculation-utility-moller"]], "Basic usage": [[7, "basic-usage"]], "Prepare job description file": [[7, "prepare-job-description-file"]], "Generate batch job script": [[7, "generate-batch-job-script"]], "Create list file": [[7, "create-list-file"]], "Run batch job": [[7, "run-batch-job"]], "Check status": [[7, "check-status"]], "Rerun failed tasks": [[7, "rerun-failed-tasks"]], "Example for moller calculation with DSQSS": [[8, "example-for-moller-calculation-with-dsqss"]], "What\u2019s this sample?": [[8, "whats-this-sample"], [9, "whats-this-sample"]], "Preparation": [[8, "preparation"], [9, "preparation"]], "How to run": [[8, "how-to-run"], [9, "how-to-run"]], "Example for moller calculation with HPhi": [[9, "example-for-moller-calculation-with-hphi"]], "Tutorial": [[10, "tutorial"]]}, "indexentries": {}}) \ No newline at end of file diff --git a/manual/main/en/moller-usersguide.pdf b/manual/main/en/moller-usersguide.pdf index 1d5a288..e91ce08 100644 Binary files a/manual/main/en/moller-usersguide.pdf and b/manual/main/en/moller-usersguide.pdf differ diff --git a/manual/main/ja/html/_images/task_view.png b/manual/main/ja/html/_images/task_view.png index 0ca4be0..44d038a 100644 Binary files a/manual/main/ja/html/_images/task_view.png and b/manual/main/ja/html/_images/task_view.png differ diff --git a/manual/main/ja/html/_sources/moller/command/index.rst.txt b/manual/main/ja/html/_sources/moller/command/index.rst.txt index 70e2dbd..87cc10b 100644 --- a/manual/main/ja/html/_sources/moller/command/index.rst.txt +++ b/manual/main/ja/html/_sources/moller/command/index.rst.txt @@ -58,7 +58,7 @@ moller_status - list_file - ジョブのリストを格納したファイルを指定します。指定がない場合は、バッチジョブから出力されるログファイル log_{task}.dat から収集します。 + ジョブのリストを格納したファイルを指定します。指定がない場合は、バッチジョブから出力されるログファイル stat_{task}.dat から収集します。 - -o, 
--output output_file @@ -92,5 +92,5 @@ moller_status FILES: - When programs are executed concurrently using the job script generated by moller, the execution status is written to the log files log_{task}.dat. moller_status aggregates these files and formats them into a readable summary. + When programs are executed concurrently using the job script generated by moller, the execution status is written to the log files stat_{task}.dat. moller_status aggregates these files and formats them into a readable summary. diff --git a/manual/main/ja/html/_sources/moller/tutorial/basic.rst.txt b/manual/main/ja/html/_sources/moller/tutorial/basic.rst.txt index 38ac20a..093dc68 100644 --- a/manual/main/ja/html/_sources/moller/tutorial/basic.rst.txt +++ b/manual/main/ja/html/_sources/moller/tutorial/basic.rst.txt @@ -72,7 +72,7 @@ jobsセクションでは、タスクの処理内容を記述します。ジョ .. code-block:: bash - $ /usr/bin/ls -1d > list.dat + $ /usr/bin/ls -1d * > list.dat The tutorial comes with a utility program that creates the datasets and the list file. @@ -115,3 +115,27 @@ mollerで生成したバッチジョブスクリプトをジョブスケジュ "o" indicates a task that finished successfully, "x" a task that ended with an error, "-" a task that was skipped because the preceding task ended with an error, and "." a task that has not been executed yet. In this case, all tasks have finished successfully. + +Rerun failed tasks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If a task fails, the subsequent tasks within that job are not executed. The following is an example run in which each task fails with a probability of 10%. + +.. literalinclude:: ../../../../tutorial/moller/reference/status_failed.txt + +For dataset_0003 and dataset_0004, task1 failed and the subsequent task2 and task3 were not executed. For the other datasets, task1 succeeded and task2 was executed next. In this way, each job is executed independently of the other jobs. + +To rerun the failed tasks, run the batch job again with the retry option. +For the SLURM job scheduler (e.g. ISSP system B), submit the batch job as follows. + +.. code-block:: bash + + $ sbatch job.sh --retry list.dat + +For the PBS job scheduler (e.g. ISSP system C), edit the job script to change the line ``retry=0`` to ``retry=1``, and resubmit the batch job. + +.. 
literalinclude:: ../../../../tutorial/moller/reference/status_retry.txt + +Only the tasks that failed are executed again. In the above example, task1 of dataset_0003 was rerun and finished successfully, but the subsequent task2 failed. For dataset_0004, task1, task2, and task3 were executed successfully. Nothing is executed for datasets whose tasks up to task3 have all finished successfully. + +Note that the list file must not be modified when rerunning. The jobs are managed according to their order in the list file, so if it is changed, the rerun will not be performed correctly. diff --git a/manual/main/ja/html/_static/task_view.pdf b/manual/main/ja/html/_static/task_view.pdf index e85cbb6..3af0603 100644 Binary files a/manual/main/ja/html/_static/task_view.pdf and b/manual/main/ja/html/_static/task_view.pdf differ diff --git a/manual/main/ja/html/_static/task_view.png b/manual/main/ja/html/_static/task_view.png index 0ca4be0..44d038a 100644 Binary files a/manual/main/ja/html/_static/task_view.png and b/manual/main/ja/html/_static/task_view.png differ diff --git a/manual/main/ja/html/genindex.html b/manual/main/ja/html/genindex.html index fd74c2c..d63d47d 100644 --- a/manual/main/ja/html/genindex.html +++ b/manual/main/ja/html/genindex.html @@ -94,7 +94,7 @@$ /usr/bin/ls -1d > list.dat +$ /usr/bin/ls -1d * > list.dat In this tutorial, a utility script
@@ -184,6 +184,72 @@make_inputs.sh
is enclosed which generates datasets and a list file.Check status +
Rerun failed tasks¶
+If a task fails, the subsequent tasks within the job will not be executed. +The following is an example of job status in which each task fails with a probability of 10%.
+| job          | task1 | task2 | task3 |
+|--------------|-------|-------|-------|
+| dataset_0001 | o     | o     | o     |
+| dataset_0002 | o     | x     | -     |
+| dataset_0003 | x     | -     | -     |
+| dataset_0004 | x     | -     | -     |
+| dataset_0005 | o     | o     | o     |
+| dataset_0006 | o     | o     | o     |
+| dataset_0007 | o     | x     | -     |
+| dataset_0008 | o     | o     | o     |
+| dataset_0009 | o     | o     | x     |
+| dataset_0010 | o     | o     | o     |
+| dataset_0011 | o     | o     | o     |
+| dataset_0012 | o     | o     | o     |
+| dataset_0013 | o     | x     | -     |
+| dataset_0014 | o     | o     | o     |
+| dataset_0015 | o     | o     | o     |
+| dataset_0016 | o     | o     | o     |
+| dataset_0017 | o     | o     | o     |
+| dataset_0018 | o     | o     | o     |
+| dataset_0019 | o     | o     | o     |
+| dataset_0020 | o     | o     | o     |
+There, the jobs of dataset_0003 and dataset_0004 failed at task1, and the subsequent task2 and task3 were not executed. The other jobs were successful at task1, and proceeded to task2.
+In this way, each job is executed independently of other jobs.
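The reading of the table above can also be done mechanically. As a minimal sketch, assuming the status has been saved to a text file in the pipe-delimited layout shown above, the following lists every task marked "x". The function name `list_failed` and the file name `status_sample.txt` are placeholders for illustration and are not part of moller.

```shell
#!/bin/sh
# list_failed FILE: print "job task" for each cell marked "x".
# Hypothetical helper; assumes the table layout shown above:
#   row 1 = header, row 2 = separator, remaining rows = one job each.
list_failed() {
  awk -F'|' '
    NR == 1 {                       # remember the task names from the header
      for (i = 3; i < NF; i++) { name[i] = $i; gsub(/ /, "", name[i]) }
      next
    }
    NR == 2 { next }                # skip the |-----| separator row
    {
      job = $2; gsub(/ /, "", job)
      for (i = 3; i < NF; i++) {
        cell = $i; gsub(/ /, "", cell)
        if (cell == "x") print job, name[i]
      }
    }
  ' "$1"
}

# Small sample copied from the first rows of the table above.
cat > status_sample.txt <<'EOF'
| job          | task1   | task2   | task3   |
|--------------|---------|---------|---------|
| dataset_0001 | o       | o       | o       |
| dataset_0002 | o       | x       | -       |
| dataset_0003 | x       | -       | -       |
EOF

list_failed status_sample.txt
```

Only cells marked "x" are reported; skipped ("-") tasks are left out because they did not fail themselves.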
+Users can rerun the failed tasks by submitting the batch job with the retry option. +For SLURM job scheduler (e.g. used in ISSP system B), resubmit the job as follows:
+$ sbatch job.sh --retry list.dat
+For the PBS job scheduler (e.g. used in ISSP system C), edit the job script so that the line ``retry=0`` is replaced by ``retry=1``, and resubmit the job.
+| job          | task1 | task2 | task3 |
+|--------------|-------|-------|-------|
+| dataset_0001 | o     | o     | o     |
+| dataset_0002 | o     | o     | x     |
+| dataset_0003 | o     | x     | -     |
+| dataset_0004 | o     | o     | o     |
+| dataset_0005 | o     | o     | o     |
+| dataset_0006 | o     | o     | o     |
+| dataset_0007 | o     | o     | o     |
+| dataset_0008 | o     | o     | o     |
+| dataset_0009 | o     | o     | o     |
+| dataset_0010 | o     | o     | o     |
+| dataset_0011 | o     | o     | o     |
+| dataset_0012 | o     | o     | o     |
+| dataset_0013 | o     | o     | o     |
+| dataset_0014 | o     | o     | o     |
+| dataset_0015 | o     | o     | o     |
+| dataset_0016 | o     | o     | o     |
+| dataset_0017 | o     | o     | o     |
+| dataset_0018 | o     | o     | o     |
+| dataset_0019 | o     | o     | o     |
+| dataset_0020 | o     | o     | o     |
+Only the tasks that failed are executed in the second run.
+In the above example, task1 for dataset_0003 succeeded in the second run, but task2 then failed.
+For dataset_0004, task1, task2, and task3 were all executed successfully.
+For jobs whose tasks had already finished successfully, the second run does nothing.
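For the PBS case described above, the manual edit of `retry=0` to `retry=1` could be scripted. The following is only a sketch: it assumes the generated job script contains a line that reads exactly `retry=0`, and the helper name `enable_retry` and the demo file names are hypothetical. The stand-in script created here is not a real moller job script.

```shell
#!/bin/sh
# enable_retry SRC DST: write a copy of SRC with "retry=0" changed to "retry=1".
# Hypothetical helper; assumes SRC has a line that is exactly "retry=0".
enable_retry() {
  src=$1
  dst=$2
  if ! grep -q '^retry=0$' "$src"; then
    echo "no 'retry=0' line found in $src" >&2
    return 1
  fi
  sed 's/^retry=0$/retry=1/' "$src" > "$dst"
}

# Stand-in for a generated job script, for demonstration only.
printf '%s\n' '#!/bin/bash' 'retry=0' 'echo "tasks would run here"' > job_demo.sh

enable_retry job_demo.sh job_demo_retry.sh
grep '^retry=' job_demo_retry.sh

# The modified script would then be resubmitted with the site's PBS
# submission command; that step is omitted from this sketch.
```

Writing to a separate file keeps the original job script untouched, so the first-run settings stay available for comparison.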
+N.B. the list file must not be modified on the rerun. The jobs are managed according to the order of entries in the list file, and therefore, if the order is changed, the jobs will not be executed properly.
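Because a rerun depends on the list file being unchanged, one way to guard against accidental edits is to record a checksum at the first submission and compare it before resubmitting. This is a sketch of that idea only: the sidecar file `list_demo.dat.cksum` and both function names are assumptions made for illustration, and moller itself does not perform this check as far as this section states.

```shell
#!/bin/sh
# Hypothetical guard, not part of moller: remember a checksum of the list
# file at first submission and compare it before resubmitting.
save_list_checksum() { cksum < "$1" > "$1.cksum"; }
list_unchanged()     { [ "$(cksum < "$1")" = "$(cat "$1.cksum")" ]; }

# Demo list file with three job directories.
printf '%s\n' dataset_0001 dataset_0002 dataset_0003 > list_demo.dat
save_list_checksum list_demo.dat

if list_unchanged list_demo.dat; then
  echo "list file unchanged; safe to resubmit with the retry option"
else
  echo "list file was modified; do not rerun with this list" >&2
fi
```

A checksum detects reordering as well as added or removed entries, which matters here because the jobs are tracked by their position in the list.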
+ @@ -256,7 +322,7 @@Quick search
| Powered by Sphinx 7.2.6 - & Alabaster 0.7.15 + & Alabaster 0.7.16 | Quick search | Powered by Sphinx 7.2.6 - & Alabaster 0.7.15 + & Alabaster 0.7.16 | Quick search | Powered by Sphinx 7.2.6 - & Alabaster 0.7.15 + & Alabaster 0.7.16 | 3. TutorialCreate list fileRun batch job Check status +Rerun failed tasks 3.2. Example for moller calculation with HPhi @@ -129,7 +130,7 @@
4.2. moller_status
Specifies the job description file for moller.
list_file
-Specifies the file that contains the list of jobs. If it is not specified, the list is collected from the log file log_{task}.dat written by the batch job.
+Specifies the file that contains the list of jobs. If it is not specified, the list is collected from the log file stat_{task}.dat written by the batch job.
-o, --output output_file
Specifies the output file name. If it is not specified, the result is written to standard output.
@@ -121,7 +121,7 @@ 4.2. moller_status
FILES:
-When programs are executed concurrently using the job script generated by moller, the execution status is written to the log files log_{task}.dat. moller_status aggregates these files and formats them into a readable summary.
+When programs are executed concurrently using the job script generated by moller, the execution status is written to the log files stat_{task}.dat. moller_status aggregates these files and formats them into a readable summary.
@@ -194,7 +194,7 @@
Quick search
| Powered by Sphinx 7.2.6 - & Alabaster 0.7.15 + & Alabaster 0.7.16 |
Prepare job description file¶
The job description file describes the processing to be executed in a batch job. Here, a batch job refers to the set of commands submitted to the job scheduler of a supercomputer system or similar. In contrast, in the concurrent execution of programs that moller targets, the processing executed for a single parameter set is called a job. A job consists of several processing units, each of which is called a task. moller runs each task concurrently across the jobs, and synchronization takes place before and after each task.
-