Commit

deploy: 4148016
yxdyc committed Apr 22, 2024
1 parent 9977dea commit 975ae8f
Showing 8 changed files with 152 additions and 9 deletions.
59 changes: 54 additions & 5 deletions _modules/data_juicer/config/config.html
@@ -90,7 +90,7 @@ <h1>Source code for data_juicer.config.config</h1><div class="highlight"><pre>

<span class="kn">from</span> <span class="nn">jsonargparse</span> <span class="kn">import</span> <span class="p">(</span><span class="n">ActionConfigFile</span><span class="p">,</span> <span class="n">ArgumentParser</span><span class="p">,</span> <span class="n">dict_to_namespace</span><span class="p">,</span>
<span class="n">namespace_to_dict</span><span class="p">)</span>
-<span class="kn">from</span> <span class="nn">jsonargparse.typing</span> <span class="kn">import</span> <span class="n">NonNegativeInt</span><span class="p">,</span> <span class="n">PositiveInt</span>
+<span class="kn">from</span> <span class="nn">jsonargparse.typing</span> <span class="kn">import</span> <span class="n">ClosedUnitInterval</span><span class="p">,</span> <span class="n">NonNegativeInt</span><span class="p">,</span> <span class="n">PositiveInt</span>
<span class="kn">from</span> <span class="nn">loguru</span> <span class="kn">import</span> <span class="n">logger</span>

<span class="kn">from</span> <span class="nn">data_juicer.ops.base_op</span> <span class="kn">import</span> <span class="n">OPERATORS</span>
@@ -116,7 +116,7 @@ <h1>Source code for data_juicer.config.config</h1><div class="highlight"><pre>

<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s1">&#39;--config&#39;</span><span class="p">,</span>
<span class="n">action</span><span class="o">=</span><span class="n">ActionConfigFile</span><span class="p">,</span>
-<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path to a configuration file.&#39;</span><span class="p">,</span>
+<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path to a dj basic configuration file.&#39;</span><span class="p">,</span>
<span class="n">required</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>

<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span>
@@ -125,9 +125,58 @@ <h1>Source code for data_juicer.config.config</h1><div class="highlight"><pre>
<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path to a configuration file when using auto-HPO tool.&#39;</span><span class="p">,</span>
<span class="n">required</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span>
-<span class="s1">&#39;--path_3sigma_recipe&#39;</span><span class="p">,</span>
+<span class="s1">&#39;--path_k_sigma_recipe&#39;</span><span class="p">,</span>
<span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span>
-<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path to save a configuration file when using 3-sigma tool.&#39;</span><span class="p">,</span>
+<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path to save a configuration file when using k-sigma tool.&#39;</span><span class="p">,</span>
<span class="n">required</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
+<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span>
+<span class="s1">&#39;--path_model_feedback_recipe&#39;</span><span class="p">,</span>
+<span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span>
+<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path to save a configuration file refined by model feedback.&#39;</span><span class="p">,</span>
+<span class="n">required</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
+<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span>
+<span class="s1">&#39;--model_infer_config&#39;</span><span class="p">,</span>
+<span class="nb">type</span><span class="o">=</span><span class="n">Union</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">dict</span><span class="p">],</span>
+<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path or a dict to model inference configuration file when &#39;</span>
+<span class="s1">&#39;calling model executor in sandbox. If not specified, the model &#39;</span>
+<span class="s1">&#39;inference related hooks will be disabled.&#39;</span><span class="p">,</span>
+<span class="n">required</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
+<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span>
+<span class="s1">&#39;--model_train_config&#39;</span><span class="p">,</span>
+<span class="nb">type</span><span class="o">=</span><span class="n">Union</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">dict</span><span class="p">],</span>
+<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path or a dict to model training configuration file when &#39;</span>
+<span class="s1">&#39;calling model executor in sandbox. If not specified, the model &#39;</span>
+<span class="s1">&#39;training related hooks will be disabled.&#39;</span><span class="p">,</span>
+<span class="n">required</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
+<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span>
+<span class="s1">&#39;--data_eval_config&#39;</span><span class="p">,</span>
+<span class="nb">type</span><span class="o">=</span><span class="n">Union</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">dict</span><span class="p">],</span>
+<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path or a dict to eval configuration file when calling &#39;</span>
+<span class="s1">&#39;auto-evaluator for data in sandbox. &#39;</span>
+<span class="s1">&#39;If not specified, the eval related hooks will be disabled.&#39;</span><span class="p">,</span>
+<span class="n">required</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
+<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span>
+<span class="s1">&#39;--model_eval_config&#39;</span><span class="p">,</span>
+<span class="nb">type</span><span class="o">=</span><span class="n">Union</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">dict</span><span class="p">],</span>
+<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Path or a dict to eval configuration file when calling &#39;</span>
+<span class="s1">&#39;auto-evaluator for model in sandbox. &#39;</span>
+<span class="s1">&#39;If not specified, the eval related hooks will be disabled.&#39;</span><span class="p">,</span>
+<span class="n">required</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
+<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span>
+<span class="s1">&#39;--data_probe_algo&#39;</span><span class="p">,</span>
+<span class="nb">type</span><span class="o">=</span><span class="nb">str</span><span class="p">,</span>
+<span class="n">default</span><span class="o">=</span><span class="s1">&#39;uniform&#39;</span><span class="p">,</span>
+<span class="n">help</span><span class="o">=</span><span class="s1">&#39;Sampling algorithm to use. Options are &quot;uniform&quot;, &#39;</span>
+<span class="s1">&#39;&quot;frequency_specified_field_selector&quot;, or &#39;</span>
+<span class="s1">&#39;&quot;topk_specified_field_selector&quot;. Default is &quot;uniform&quot;. Only &#39;</span>
+<span class="s1">&#39;used for dataset sampling&#39;</span><span class="p">,</span>
+<span class="n">required</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
+<span class="n">parser</span><span class="o">.</span><span class="n">add_argument</span><span class="p">(</span>
+<span class="s1">&#39;--data_probe_ratio&#39;</span><span class="p">,</span>
+<span class="nb">type</span><span class="o">=</span><span class="n">ClosedUnitInterval</span><span class="p">,</span>
+<span class="n">default</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span>
+<span class="n">help</span><span class="o">=</span><span class="s1">&#39;The ratio of the sample size to the original dataset size. &#39;</span>
+<span class="s1">&#39;Default is 1.0 (no sampling). Only used for dataset sampling&#39;</span><span class="p">,</span>
+<span class="n">required</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>

<span class="c1"># basic global paras with extended type hints</span>
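
Note on the hunk above: the two probe options constrain their input. `--data_probe_ratio` is typed as `ClosedUnitInterval`, so only values in [0, 1] should be accepted, and `--data_probe_algo` names one of three sampling algorithms. As a rough illustration of that validation, here is a minimal sketch that uses only the standard-library `argparse` in place of `jsonargparse`. The helper `closed_unit_interval` and the function `build_probe_parser` are hypothetical names introduced for this example, and enforcing the algorithm names through `choices` is an assumption (the commit only lists them in the help text).

```python
import argparse

# Options listed in the help text of --data_probe_algo.
PROBE_ALGOS = (
    "uniform",
    "frequency_specified_field_selector",
    "topk_specified_field_selector",
)


def closed_unit_interval(text: str) -> float:
    """Hypothetical stand-in for jsonargparse's ClosedUnitInterval type."""
    try:
        value = float(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{text!r} is not a number")
    # NaN and infinities fail this comparison and are rejected as well.
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError(f"{value} is outside [0, 1]")
    return value


def build_probe_parser() -> argparse.ArgumentParser:
    """Build a parser with only the two probe options (illustration only)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--data_probe_algo", type=str, default="uniform",
                        choices=PROBE_ALGOS)  # assumption: enforce via choices
    parser.add_argument("--data_probe_ratio", type=closed_unit_interval,
                        default=1.0)  # 1.0 means no sampling
    return parser


if __name__ == "__main__":
    args = build_probe_parser().parse_args(["--data_probe_ratio", "0.25"])
    print(args.data_probe_algo, args.data_probe_ratio)
```

With this sketch, an out-of-range value such as `--data_probe_ratio 1.5` makes `argparse` report an error and exit instead of silently sampling with a bad ratio.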
@@ -524,7 +573,7 @@ <h1>Source code for data_juicer.config.config</h1><div class="highlight"><pre>
<span class="w"> </span><span class="sd">&quot;&quot;&quot;</span>
<span class="sd"> Add ops and its params to parser for command line.</span>

-<span class="sd"> :param configurable_ops: a list of ops to be to added, each item is</span>
+<span class="sd"> :param configurable_ops: a list of ops to be added, each item is</span>
<span class="sd"> a pair of op_name and op_class</span>
<span class="sd"> :param parser: jsonargparse parser need to update</span>
<span class="sd"> &quot;&quot;&quot;</span>
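
Note on the docstring above: it describes a function that takes `(op_name, op_class)` pairs and registers each op's parameters on the parser. The function body is not part of this hunk, so the following is only a sketch of the general idea, written with the standard-library `argparse` and `inspect` rather than `jsonargparse`. The names `add_ops_to_parser` and `WordsNumFilter`, the `--<op_name>.<param>` option naming, and the handling of defaults are all assumptions made for illustration.

```python
import argparse
import inspect


class WordsNumFilter:
    """Hypothetical op class, defined here only for the example."""

    def __init__(self, min_num: int = 10, max_num: int = 10000):
        self.min_num = min_num
        self.max_num = max_num


def add_ops_to_parser(configurable_ops, parser):
    """Register one --<op_name>.<param> option per __init__ parameter."""
    for op_name, op_class in configurable_ops:
        signature = inspect.signature(op_class.__init__)
        for name, param in signature.parameters.items():
            default = param.default
            if name == "self" or default is inspect.Parameter.empty:
                continue  # sketch: skip parameters without a default
            # Sketch: convert only plain int/float values; keep the rest as str.
            is_number = (isinstance(default, (int, float))
                         and not isinstance(default, bool))
            arg_type = type(default) if is_number else str
            parser.add_argument(f"--{op_name}.{name}",
                                type=arg_type, default=default)
    return parser


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    add_ops_to_parser([("words_num_filter", WordsNumFilter)], parser)
    # Dotted option names are stored under the same dotted key.
    parsed = vars(parser.parse_args(["--words_num_filter.min_num", "20"]))
    print(parsed)
```

Deriving options from each class signature keeps the command line in step with the ops themselves, so a new op parameter does not need a hand-written `add_argument` call.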