
Paddle Error Message Writing Specification (English Version)

Chen Weihang edited this page Apr 3, 2020 · 9 revisions

Paddle Error Message Writing Specification


Paddle Error Message Writing Specification (Chinese version)


Specification summary:

  • Section 1, the error message writing template, is a recommended reference. If a simpler, more user-friendly way of writing fits your situation, use it flexibly.
  • Section 2, the mandatory specification entries, lists rules that every error message must follow. The first three are already monitored by CI.
  • Section 3, the error message sample library, contains existing PADDLE_ENFORCE calls extracted from Paddle and rewritten as compliant examples for easy reference.
  • Appendix: when the specification is refined later, first record the rationale and the content to be changed in the appendix, then modify the specification itself.

Additional instructions:

  1. During rollout, cases that the existing specification does not cover may come up, and the specification will need to be supplemented and improved along the way. Please give active feedback.
  2. The current version of the specification defines 12 error types. If you find an error that none of them covers, you can apply to add a new type.
  3. The richer the examples in the sample library, the more useful it is as a reference. You are encouraged to add new examples.
  4. Specification matching is fairly complicated, and a message that conforms to the specification may still be flagged as non-compliant. If that happens, please contact chenwhql (Chen Weihang).


1. Error message writing template

We recommend writing the messages of PADDLE_ENFORCE_* and PADDLE_THROW with the following structure:

Note: The key to an error message is describing the error clearly. The template is for reference only.

Three-part error message (error - expected - suggestion)

First paragraph: state the error (required)

  • State the error directly:

    • Recommended description:
      • A is wrong, B is not initialized, C does not exist, D has an incorrect value, E does not match, etc.
        • example: Mismatched label shape.
    • Not recommended: "A should be ...", "B should not be ..."
      • When something goes wrong, first tell the user directly what the error is.
      • Unless necessary, do not point out the error in a "should / should not" tone.
      • What should or should not be the case belongs in the second paragraph, which describes the expected result.
  • Notes for this paragraph:

    1. State which entity the erroneous variable belongs to. For example, for an Op input or output error, state which Op's input or output is wrong, and distinguish between forward and backward Ops.
    2. Stating the error means telling the user a fact. Magic numbers (numbers whose meaning is unclear) are generally not allowed; describe the fact in an English sentence.

Second paragraph: compare expected and actual values (provide whenever possible)

  • State what input is expected and what input was actually received.

    • example: Expected labels dimension=1. Received 4.
  • Notes for this paragraph:

    1. Provide all the necessary information. For a shape error, for example, print the specific shapes being compared and indicate which dimension is wrong.
    2. This paragraph can be omitted if the error in the first paragraph describes a single value. For example, if A is a null pointer or B does not exist, there is no need to add that A is expected to be non-null or that B should exist.

Third paragraph: suggested fix (provide whenever possible)

  • Explain what caused the error and how to fix it.

    • example: Suggested Fix: If your classifier expects one-hot encoding label, check your n_classes argument to the estimator and/or the shape of your label. Otherwise, check the shape of your label.
  • Notes for this paragraph:

    • Fix suggestions are generally suited to common problems, such as:
      • startup_program has not been executed
      • An important parameter is not set
      • The environment configuration may be wrong
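
The three-part structure above can be sketched as a small helper. This is only an illustration: `BuildErrorMessage` is a hypothetical function that does not exist in Paddle, and it simply joins the error statement, the expected/actual comparison, and the suggestion, skipping any part that is left empty.

```cpp
#include <string>

// Hypothetical helper (not part of Paddle): joins the three parts of the
// template into one message. Optional parts that are empty are skipped.
std::string BuildErrorMessage(const std::string& error,
                              const std::string& expected_vs_actual,
                              const std::string& suggestion) {
  std::string message = error;  // first paragraph: always present
  if (!expected_vs_actual.empty()) {
    message += " " + expected_vs_actual;  // second paragraph: optional
  }
  if (!suggestion.empty()) {
    message += " " + suggestion;  // third paragraph: optional
  }
  return message;
}
```

For instance, calling it with "Mismatched label shape." and "Expected labels dimension=1. Received 4." and an empty suggestion yields the first two paragraphs joined by a single space.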

2. Mandatory specification entries

The messages of PADDLE_ENFORCE_* and PADDLE_THROW must comply with the following entries:

1. Omitted or empty messages are not allowed (monitored by CI)

  • Incorrect examples:
PADDLE_ENFORCE(ctx->HasInput("X"));

PADDLE_ENFORCE(ctx->HasInput("X"), "");

2. Messages must not be too short: at least 20 characters (monitored by CI)

  • Incorrect example:
PADDLE_ENFORCE(i != nullptr, "I must be set");
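
Rules 1 and 2 amount to a length check on the message text. The sketch below is a hypothetical stand-in, not the actual CI check: the name `HasAcceptableLength` and the way characters are counted are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>

// Minimum length taken from rule 2. Assumption: the real CI check may count
// characters differently (for example, before or after formatting).
constexpr std::size_t kMinMessageLength = 20;

// Returns true when the message is neither empty (rule 1) nor shorter than
// the minimum length (rule 2).
bool HasAcceptableLength(const std::string& message) {
  return message.size() >= kMinMessageLength;
}
```

Under this sketch, both the empty string and "I must be set" (13 characters) are rejected.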

3. The error type must be specified (monitored by CI)

  • Twelve error types are currently declared (see Section 3 for detailed examples).
    • InvalidArgument
    • NotFound
    • OutOfRange
    • AlreadyExists
    • ResourceExhausted
    • PreconditionNotMet
    • PermissionDenied
    • ExecutionTimeout
    • Unimplemented
    • Unavailable
    • Fatal
    • External

Usage summary: wrap the entire error message (the format string together with its variable-length argument list) in platform::errors::ErrorType().

A brief example (note the position of the parentheses):

  • Old: PADDLE_ENFORCE(true, "example: %s", str);
  • New: PADDLE_ENFORCE(true, platform::errors::InvalidArgument("example: %s", str));

The correct example:

PADDLE_ENFORCE_GT(y_dims.size(), y_num_col_dims,
                      platform::errors::InvalidArgument("The input tensor Y's dimensions of MulOp "
                      "should be larger than y_num_col_dims. But received Y's "
                      "dimensions = %d, Y's shape = [%s], y_num_col_dims = %d.",
                      y_dims.size(), y_dims, y_num_col_dims));

Incorrect example:

PADDLE_ENFORCE_GT(y_dims.size(), y_num_col_dims,
                      "The input tensor Y's dimensions of MulOp "
                      "should be larger than y_num_col_dims. But received Y's "
                      "dimensions = %d, Y's shape = [%s], y_num_col_dims = %d.",
                      y_dims.size(), y_dims, y_num_col_dims);

Note: PADDLE_ENFORCE under CUDA_ARCH does not yet support declaring an error type. If you run into this case, ask an approver to approve the change.
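
To make the "position of the parentheses" concrete without depending on Paddle headers, here is a self-contained sketch. Everything in it is a stand-in: `ErrorSummary`, the `errors` namespace, and `SKETCH_ENFORCE_GT` are hypothetical simplifications, and the "InvalidArgumentError" label is an assumption. It only shows that the format string and its arguments all go inside the error-type call, which is then passed as a single macro argument.

```cpp
#include <cstdarg>
#include <cstdio>
#include <stdexcept>
#include <string>

// Stand-in type for illustration only; Paddle's real classes differ.
struct ErrorSummary {
  std::string type;     // error type label (assumed naming)
  std::string message;  // formatted message text
};

namespace errors {
// Formats a printf-style message and tags it with an error type.
ErrorSummary InvalidArgument(const char* fmt, ...) {
  char buffer[512];
  va_list args;
  va_start(args, fmt);
  std::vsnprintf(buffer, sizeof(buffer), fmt, args);
  va_end(args);
  return ErrorSummary{"InvalidArgumentError", buffer};
}
}  // namespace errors

// Hypothetical simplified check: throws when (a > b) does not hold.
#define SKETCH_ENFORCE_GT(a, b, error_summary)                         \
  do {                                                                 \
    if (!((a) > (b))) {                                                \
      const ErrorSummary summary = (error_summary);                    \
      throw std::runtime_error(summary.type + ": " + summary.message); \
    }                                                                  \
  } while (0)

// Runs the check and returns the error text, or an empty string on success.
std::string DescribeDimsCheck(int y_dims_size, int y_num_col_dims) {
  try {
    SKETCH_ENFORCE_GT(y_dims_size, y_num_col_dims,
                      errors::InvalidArgument(
                          "The input tensor Y's dimensions of MulOp should "
                          "be larger than y_num_col_dims. But received Y's "
                          "dimensions = %d, y_num_col_dims = %d.",
                          y_dims_size, y_num_col_dims));
  } catch (const std::runtime_error& e) {
    return e.what();
  }
  return "";
}
```

Because the commas of the argument list sit inside the parentheses of `errors::InvalidArgument(...)`, the preprocessor treats the whole call as one macro argument.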

4. Variable abbreviations defined by C++ developers are not allowed in messages; expand them into full English words.

Incorrect example:

PADDLE_ENFORCE(forward_pd != nullptr,
               "Fail to find eltwise_fwd_pd in device context");

5. Make sure the message contains no grammatical errors

Incorrect example:

PADDLE_ENFORCE(context->HasInput("X"),
               "ArrayToLoDTensorOp must has input X."); // "must has" is ungrammatical

3. Error message sample library

Because developers may interpret the rules above differently, there may be doubts about how to classify an error. This section therefore provides as many examples of each error type as possible, together with reference wording for the related messages. Developers are encouraged to consult the examples here when optimizing error messages.

1. InvalidArgument

The user passed in an invalid argument. This covers all kinds of argument errors and should be the most common error type.

1.1 ShapeError

PADDLE_ENFORCE_EQ(
    output_shape[unk_dim_idx] * capacity, -in_size,
    platform::errors::InvalidArgument(
        "The 'shape' attribute in ReshapeOp is invalid. "
        "The input tensor X's size must be divisible by known "
        "capacity of 'shape'. "
        "But received X's shape = [%s], X's size = %d, "
        "'shape' is [%s], known "
        "capacity of 'shape' is %d.",
        in_dims, in_size, framework::make_ddim(shape), capacity));

1.2 The parameter is empty (list is empty, null pointer, etc.)

PADDLE_ENFORCE_NE(vars.empty(), true, platform::errors::InvalidArgument(
                                          "Variable names are empty."));

1.3 The parameter is incorrect and is not equal to the expected value.

PADDLE_ENFORCE_GT(batch_size, 0, platform::errors::InvalidArgument(
                                    "Batch size %d is illegal.", batch_size));

PADDLE_ENFORCE_NE(
    num, 0,
    platform::errors::InvalidArgument(
        "The number of ids can not be zero, you need padding "
        "it in data generator, or if there is something wrong with "
        "the data, please check if the data contains unresolvable "
        "characters.\nplease check this error line: %s.",
        str));

1.4 Incorrect parameter format

PADDLE_ENFORCE_NE(in.format(), MKLDNNMemoryFormat::format_undef,
          platform::errors::InvalidArgument(
              "Input tensor format is invalid. Input tensor should "
              "have specified memory format."));

1.5 Parameter not initialized

PADDLE_ENFORCE_EQ(proto_->IsInitialized(), true,
                  platform::errors::InvalidArgument(
                      "Operator's Proto in op info is not initialized."));

PADDLE_ENFORCE_EQ(
    t->IsInitialized(), true,
    platform::errors::InvalidArgument(
        "The Tensor in the %s Op's Input Variable %s(%s) is "
        "not initialized.",
        Type(), name, ctx.Inputs(name).at(i)));

1.6 Incorrect parameter type

PADDLE_ENFORCE(
    tmp == *data_type || *data_type == dafault_data_type,
    platform::errors::InvalidArgument(
        "The DataType of %s Op's duplicable Variable %s must be "
        "consistent. The current variable type is (%s), but the "
        "previous variable type is (%s).",
        Type(), name, DataTypeToString(tmp),
        DataTypeToString(*data_type)));

PADDLE_ENFORCE_EQ(
    valid, true,
    platform::errors::InvalidArgument(
        "Tensor holds the wrong type, it holds %s, but desires to be %s.",
        DataTypeToString(type_),
        DataTypeToString(DataTypeTrait<T>::DataType())));

1.7 Parameter parsing error

PADDLE_ENFORCE_EQ(success, true,
                  platform::errors::InvalidArgument(
                      "Fail to parse DataFeedDesc from string: %s.",
                      data_feed_desc_str.c_str()));

1.8 LoD error

PADDLE_ENFORCE_GT(lod_level, 0, platform::errors::InvalidArgument(
                                    "Input(X) Tensor of SequencePoolOp "
                                    "does not contain LoD information."));

2. NotFound

The requested entity cannot be found: the variable being looked up is empty, an input or output does not exist, etc.

  • Distinguish this from null pointers: a variable that is not found and a variable that is not correctly assigned are concepts at two different levels.

2.1 Op input and output not found

PADDLE_ENFORCE_EQ(
    ctx->HasInput("X"), true,
    platform::errors::NotFound("Input(X) of MulOp is not found."));
PADDLE_ENFORCE_EQ(
    ctx->HasInput("Y"), true,
    platform::errors::NotFound("Input(Y) of MulOp is not found."));
PADDLE_ENFORCE_EQ(
    ctx->HasOutput("Out"), true,
    platform::errors::NotFound("Output(Out) of MulOp is not found."));

2.2 Missing node

PADDLE_ENFORCE_NOT_NULL(
    p, platform::errors::NotFound("subgraph has no node %s.", name.c_str()));

2.3 File not found

PADDLE_ENFORCE_GT(file_cnt, 0,
                  platform::errors::NotFound("Input file list is empty."));

2.4 Other

PADDLE_ENFORCE_NOT_NULL(
    var_desc, platform::errors::NotFound("%s is not found.", var_name));

PADDLE_ENFORCE_NOT_NULL(
    proto_,
    platform::errors::NotFound("Operator's Proto has not been registered."));

3. OutOfRange

PADDLE_ENFORCE_LT(
    i, N, platform::errors::OutOfRange("Array index out of bounds."));

PADDLE_ENFORCE_GT(value, lower_bound_,
                  platform::errors::OutOfRange("Attribute GreaterThan check failed."));

4. AlreadyExists

The entity being looked up already exists, or multiple instances were found of something that allows only a single instance.

PADDLE_ENFORCE_EQ(
    attrs_.count(attr_name), 0,
    platform::errors::AlreadyExists(
        "The attribute %s has been set in the graph.", attr_name));

PADDLE_ENFORCE_NE(Has(pass_type), true, 
    platform::errors::AlreadyExists(
        "Pass %s has been registered.", pass_type));

PADDLE_ENFORCE_LE(ins.size(), 1UL,
    platform::errors::AlreadyExists(
        "Operator %s's input %s should contain only one variable.", type_, name));
                    
PADDLE_ENFORCE_EQ(
    fused_var_set.count(fused_var_name), 0,
    platform::errors::AlreadyExists(
         "The fused variable already exists."));

5. PermissionDenied

The current operation is not allowed to be executed.

PADDLE_ENFORCE_NE(a, b, platform::errors::PermissionDenied(
                            "Cannot connect the same node in the graph."));

6. ResourceExhausted

PADDLE_THROW_BAD_ALLOC(platform::errors::ResourceExhausted(
    "\n\nOut of memory error on GPU %d. "
    "Cannot allocate %s memory on GPU %d, "
    "available memory is only %s.\n\n"
    "Please check whether there is any other process using GPU %d.\n"
    "1. If yes, please stop them, or start PaddlePaddle on another GPU.\n"
    "2. If no, please decrease the batch size of your model.\n",
    place_.device, string::HumanReadableSize(size), place_.device,
    string::HumanReadableSize(avail), place_.device));

PADDLE_THROW_BAD_ALLOC(platform::errors::ResourceExhausted(
     "\n\nOut of memory error on GPU %d. "
     "Cannot allocate %s memory on GPU %d, "
     "available memory is only %s.\n\n"
     "Please check whether there is any other process using GPU %d.\n"
     "1. If yes, please stop them, or start PaddlePaddle on another GPU.\n"
     "2. If no, please try one of the following suggestions:\n"
     "   1) Decrease the batch size of your model.\n"
     "   2) FLAGS_fraction_of_gpu_memory_to_use is %.2lf now, "
     "please set it to a higher value but less than 1.0.\n"
     "      The command is "
     "`export FLAGS_fraction_of_gpu_memory_to_use=xxx`.\n\n",
     gpu_id_, string::HumanReadableSize(size), gpu_id_,
     string::HumanReadableSize(avail), gpu_id_,
     FLAGS_fraction_of_gpu_memory_to_use));

7. PreconditionNotMet

The currently executed operation requires certain prerequisites to be met before it can be executed.

PADDLE_ENFORCE_NOT_NULL(
    mutex_for_pick_file_,
    platform::errors::PreconditionNotMet(
        "You should call SetFileListMutex before PickOneFile"));

PADDLE_ENFORCE_NOT_NULL(
    root_scope_,
    platform::errors::PreconditionNotMet(
        "root_scope should be set before creating thread scope."));

PADDLE_ENFORCE_NE(
    fetched_var_it, fetched_vars->end(),
    platform::errors::PreconditionNotMet(
        "Cannot find fetched variable(%s). Perhaps the main_program "
        "is not set to ParallelExecutor.",
        var_name));

PADDLE_ENFORCE_EQ(finish_start_, true,
                  platform::errors::PreconditionNotMet(
                      "Datafeed has not started running yet."));

PADDLE_ENFORCE_NE(framework::product(y_dims), 0,
                  platform::errors::PreconditionNotMet(
                      "The Input variable Y(%s) has not "
                      "been initialized. You may need to confirm "
                      "if you put exe.run(startup_program) "
                      "after optimizer.minimize function.",
                      ctx->Inputs("Y").front()));

PADDLE_ENFORCE_NE(FLAGS_use_ngraph, true,
                  platform::errors::PreconditionNotMet(
                      "Please compile with NGRAPH first to use NGRAPH."));

8. ExecutionTimeout

The execution response time is too long, or the communication timed out.

No sample is available yet; one will be added later.

9. Unimplemented

Not yet implemented or supported, but may be implemented later

PADDLE_ENFORCE_NE(iter, operations_.end(),
                  platform::errors::Unimplemented(
                      "Operation %s is not supported yet.", op_type));

PADDLE_ENFORCE_EQ(
    all_reduce_ops.size(), grads.size(),
    platform::errors::Unimplemented(
        "The number of all_reduce OpHandle is not equal to the "
        "number of grads. Maybe some gradients are sparse type, "
        "it is not supported currently."));

10. Unavailable

The current service is not available or the current operation cannot be performed.

10.1 IO error

PADDLE_ENFORCE_NE(file_descriptor, -1, platform::errors::Unavailable(
                                            "Cannot open file %s.", filename));

PADDLE_ENFORCE_EQ(fin.good(), true, platform::errors::Unavailable(
                                        "Cannot open file %s.", filename));

PADDLE_ENFORCE_EQ(
    file.is_open(), true,
    platform::errors::Unavailable("Can not open %s to add benchmark.", path));

11. Fatal

Unexpected, serious errors, such as segmentation faults.

Reserved for try-catch handling of unexpected exceptions; developers will not use it for the time being.

12. External

PADDLE_ENFORCE_CUDA_SUCCESS(
    cudaEventCreate(&event_, cudaEventDisableTiming),
    platform::errors::External(
        "Create event failed in CUDADeviceContextAllocator"));

4. Specification updates and additions

1. Added the OP_INOUT_CHECK macro for Op InferShape (2020.04.03)

  • The input and output checks in Op InferShape have very similar error types and error messages, but because error types were generally not added before, they all need to be modified.

  • A new check macro has recently been added to handle such checks. An example usage is as follows:

OP_INOUT_CHECK(ctx->HasInput("X"), "Input", "X", "Mul");

  • Just pass in the conditional expression, "Input" or "Output", the name of the Op input or output, and the Op name.

  • On the one hand, this simplifies the code and reduces everyone's workload. On the other hand, it keeps the error messages of all Op input and output checks consistent, unifies the various existing ways of writing them, and avoids grammatical problems.
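
The idea behind such a macro can be sketched with a plain function. This is a hypothetical stand-in, not the definition of OP_INOUT_CHECK: the function names, the "NotFoundError" label, and the exact message wording are all assumptions. What it illustrates is that one shared helper builds the message from the role, the input/output name, and the Op name, so every check reads the same.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an input/output check (not Paddle's macro).
// Throws with a uniformly worded message when the condition is false.
void SketchInOutCheck(bool condition, const std::string& role,
                      const std::string& name, const std::string& op_type) {
  if (!condition) {
    throw std::runtime_error("NotFoundError: No " + role + "(" + name +
                             ") found for " + op_type + " operator.");
  }
}

// Runs the check for input X of the Mul Op and reports the outcome as text.
std::string DescribeInOutCheck(bool has_input) {
  try {
    SketchInOutCheck(has_input, "Input", "X", "Mul");
  } catch (const std::runtime_error& e) {
    return e.what();
  }
  return "ok";
}
```

Since the wording lives in one place, changing it updates every input and output check at once.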

Appendix

Record of specification updates

1. Recent Paddle error message optimization changes

Original Paddle error message example:

[Image: original Paddle error message example]

Example of Paddle error message after optimization:

[Image: Paddle error message after optimization]

2. How to add a new error type

Take the error type UNKNOWN as an example:

Step 1: Add a new error code in paddle/fluid/platform/error_codes.proto

UNKNOWN = 13;

Step 2: Register the new error type in paddle/fluid/platform/s.h

REGISTER_ERROR(Unknown, UNKNOWN)

Step 3: Add a new error string to paddle/fluid/platform/errors.cc

case paddle::platform::error::UNKNOWN:
       return "UnknownError";
       break;

Step 4: Use in the code

PADDLE_ENFORCE_EQ(flag, true, platform::errors::Unknown("example"));
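
The four steps above fit together roughly as in the following self-contained sketch. It does not reproduce Paddle's files: `Code`, `ErrorSummary`, `SKETCH_REGISTER_ERROR`, and `ErrorCodeToString` are hypothetical stand-ins, and every enum value other than UNKNOWN = 13 is an assumption made for illustration.

```cpp
#include <string>

// Step 1 analogue: the error code list. Only UNKNOWN = 13 comes from the
// text above; the other value is illustrative.
enum class Code { INVALID_ARGUMENT = 1, UNKNOWN = 13 };

// Stand-in for the object that carries an error code and its message.
struct ErrorSummary {
  Code code;
  std::string message;
};

// Step 2 analogue: a registration macro that generates one factory function
// per error type.
#define SKETCH_REGISTER_ERROR(FUNC, CODE_NAME)           \
  inline ErrorSummary FUNC(const std::string& message) { \
    return ErrorSummary{Code::CODE_NAME, message};       \
  }

SKETCH_REGISTER_ERROR(InvalidArgument, INVALID_ARGUMENT)
SKETCH_REGISTER_ERROR(Unknown, UNKNOWN)

// Step 3 analogue: map each code to the string shown to the user.
std::string ErrorCodeToString(Code code) {
  switch (code) {
    case Code::INVALID_ARGUMENT:
      return "InvalidArgumentError";
    case Code::UNKNOWN:
      return "UnknownError";
  }
  return "UnrecognizedError";  // reached only for values outside the enum
}
```

Step 4 then corresponds to calling the generated factory, for example `Unknown("example")`, wherever an error of that type is raised.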

3. Error type extension record

If the existing 12 error types cannot cover an error encountered in practice, you can apply for a new error type and record it here, including:

  • The name of the new error type
  • A description of the scenarios where the new error type applies
  • PADDLE_ENFORCE examples of the new error type (at least 3)

Other update records
