diff --git a/docs/_sidebar.md b/docs/_sidebar.md index 215e58a..c0da7cd 100644 --- a/docs/_sidebar.md +++ b/docs/_sidebar.md @@ -24,6 +24,7 @@ - [Custom graders](problem_format/custom_graders.md) - [Generators](problem_format/generator.md) - [Problem examples](problem_format/problem_examples.md) + - [C++ Problem Setting Templates](problem_format/cpp_psetting_templates.md) - About - [License](about/LICENSE.md) diff --git a/docs/problem_format/cpp_psetting_templates.md b/docs/problem_format/cpp_psetting_templates.md new file mode 100644 index 0000000..9c70935 --- /dev/null +++ b/docs/problem_format/cpp_psetting_templates.md @@ -0,0 +1,59 @@ +# C++ Problem Setting Templates - `cpp_psetting_templates` + +There are three C++ input-handling templates provided for aiding problem setters. They are as follows: + +- [Validator Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/validator.cpp) +- [Identical Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/identical_checker_interactor.cpp) +- [Standard Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/standard_checker_interactor.cpp) + +## Validator + +This is a template for validating the input data of problems. It aims to be simple and of course, correct. It contains seven functions. The first three are whitespace functions: + +- `void readSpace()` expects a space at the current position in the input, and aborts the program if there is not a space. +- `void readNewLine()` expects a newline at the current position in the input. +- `void readEOF()` expects the input file to end immediately at the current position. + +The remaining four are for actual content: + +- `std::string readToken(char min_char = 0, char max_char = 127)` returns the next token in the input stream. A token is defined as a whitespace-separated string. 
If the next character in the input is a whitespace character, this method aborts the program. The optional arguments `min_char` and `max_char` can be used to enforce a range on the characters in the token. For instance, `readToken('a', 'z')` reads a token consisting only of lowercase English letters. +- `long long readInt(long long lo, long long hi)` parses the next token as an integer. It aborts on overflow, on malformed integers, and if the resulting integer is not in the range [lo, hi], inclusive. +- `long double readFloat(long double lo, long double hi, long double eps = 1e-9)` parses the next token as a float. It aborts on overflow, on malformed floats, and if the resulting float is not in the range [lo, hi], inclusive, using the provided epsilon to perform the comparison. Scientific notation, NaNs, and leading zeroes are not accepted. `-0` and trailing zeroes are permitted. +- `std::vector<T> readIntArray<T>(size_t N, long long lo, long long hi)` parses the next N space-separated integers into an array, and then reads a final newline. It must be given a template argument, which is the type of the array elements. For example, `readIntArray<int>(5, 1, 10)` reads five space-separated integers into a `std::vector<int>`, where each integer is in the range [1, 10], inclusive. + +`readFloat()` will likely be of no use for many validators, and can be safely deleted. Similarly, `readIntArray()` can be deleted if unneeded. + +## Checkers/Interactors + +The next pair of templates are for checkers/interactors. The difference is the type of whitespace handling: the identical checker/interactor expects whitespace to match exactly, while the standard checker/interactor handles whitespace like the `standard` checker. + +The checkers and interactors are designed for the `coci` bridged checker/interactor type. However, updating the exit codes used and the order of command line parameters to work with other types should not be challenging. 
+ +Both files can be used for either checkers or interactors, with the following caveat: interactors MUST close `stdout` BEFORE calling `readEOF()`, so that the user process can terminate in case it _also_ expects an EOF. + +The general format of the checkers/interactors is the same as that of the validator, with a few changes: + +- `readSpace(), readNewLine(), readEOF()`: Under the identical checker, these return Presentation Error if the check fails. Under the standard checker, these return WA. +- `readToken()`: Under the identical checker, this returns Presentation Error if the next character is whitespace (so the token would be empty), and WA if any character is not in range. +- `readInt(), readIntArray(), readFloat()`: These return WA if the token is malformed or out of range. + +Additionally, two new functions are provided. + +- `exitWA()` unconditionally exits with a WA verdict. +- `assertWA(bool)` takes a condition and exits with WA if the condition is false. + +Under the identical checker, corresponding functions `exitPE()` and `assertPE()` are provided. Standard checkers should not use the Presentation Error code, as the builtin `standard` checker does not use this code. + +Finally, there is an empty function `errorHook()`. This function is called whenever one of the provided functions would exit with an error. It should be used to do custom handling, such as providing partial points for outputting part of an answer, or outputting `-1` in interactors to signal errors to the user submission. + +## Standard Checker/Interactor Design + +This section is purely for those interested in the design and inner workings of the standard checker/interactor routines. + +The general overview is that `readSpace()` should read whitespace that contains no line breaks, `readNewLine()` should read whitespace and expect at least one line-break character (`\r` or `\n`), and `readEOF()` should read all whitespace and check for EOF. Additionally, any leading whitespace in the input should be trimmed. 
+ +There are two major challenges with making a standard checker/interactor design ergonomic: +- Under interactors, it is not acceptable to consume all whitespace in the `readNewLine` method, as the user submission will likely output a single line and then wait for the interactor to send another query. If the interactor naively tried to consume all whitespace, it would block, and the user submission would TLE. +- After reading the end of the input, it's most ergonomic to have the checker read a newline, and then call `readEOF()`, as this is the canonical input format. However, the standard checker allows users to forgo the last newline, and if the `readNewLine()` method expected a newline, we would erroneously return WA. + +To solve both of these problems, we employ a lazy whitespace checking scheme. `readSpace()` and `readNewLine()` simply set a flag for `readToken()`. `readToken()` then consumes the whitespace and validates it, before reading the token. Additionally, `readEOF()`, if called after `readNewLine()`, ignores the flag and consumes all whitespace, and then checks for EOF. diff --git a/sample_files/problem_setting/identical_checker_interactor.cpp b/sample_files/problem_setting/identical_checker_interactor.cpp new file mode 100644 index 0000000..d604f5b --- /dev/null +++ b/sample_files/problem_setting/identical_checker_interactor.cpp @@ -0,0 +1,110 @@ +#include +#include +#include +#include +#include +#include +#include + +void errorHook(); +void exitWA() { + errorHook(); + std::exit(1); +} +void exitPE() { + errorHook(); + std::exit(2); +} +void assertWA(bool condition) { + if (!condition) { + exitWA(); + } +} +void assertPE(bool condition) { + if (!condition) { + // Defined for `testlib` and `coci`. 
+ exitPE(); + } +} +void readSpace() { assertPE(getchar() == ' '); } +void readNewLine() { assertPE(getchar() == '\n'); } +void readEOF() { assertPE(getchar() == EOF); } +std::string readToken(char min_char = 0, char max_char = 127) { + static constexpr size_t MAX_TOKEN_SIZE = 1e7; + std::string token; + int c = getchar(); + assertPE(!isspace(c)); + while (!isspace(c) && c != EOF && token.size() < MAX_TOKEN_SIZE) { + assertWA(min_char <= c && c <= max_char); + token.push_back(char(c)); + c = getchar(); + } + ungetc(c, stdin); + return token; +} +namespace regex_detail { +regex_t compile(const char *pattern) { + regex_t re; + if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) { + throw std::runtime_error("Pattern failed to compile."); + } + return re; +} +bool match(regex_t re, const std::string &text) { + return regexec(&re, text.c_str(), 0, NULL, 0) == 0; +} +} // namespace regex_detail +long long readInt(long long lo, long long hi) { + // stoll is horribly insufficient, so we use a regex. + static regex_t re = regex_detail::compile("^(0|-?[1-9][0-9]*)$"); + std::string token = readToken(); + assertWA(regex_detail::match(re, token)); + + long long parsedInt; + try { + parsedInt = stoll(token); + } catch (const std::invalid_argument &) { + exitWA(); + } catch (const std::out_of_range &) { + exitWA(); + } + assertWA(lo <= parsedInt && parsedInt <= hi); + return parsedInt; +} +template <typename T> +std::vector<T> readIntArray(size_t N, long long lo, long long hi) { + std::vector<T> arr; + arr.reserve(N); + for (size_t i = 0; i < N; i++) { + arr.push_back(readInt(lo, hi)); + if (i != N - 1) { + readSpace(); + } + } + readNewLine(); + return arr; +} +long double readFloat(long double min, long double max, + long double eps = 1e-9) { + // stold is horribly insufficient, so we use a regex. 
+ static regex_t re = regex_detail::compile("^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$"); + std::string token = readToken(); + assertWA(regex_detail::match(re, token)); + long double parsedDouble; + try { + parsedDouble = stold(token); + } catch (const std::invalid_argument &) { + exitWA(); + } catch (const std::out_of_range &) { + exitWA(); + } + assertWA(min - eps <= parsedDouble && parsedDouble <= max + eps); + return parsedDouble; +} +// This is a hook for when the reader functions error, including `assertWA`. Use +// it to do custom handling when an error happens, such as overriding WA with +// partials, or printing a flag like `-1` in interactors. +void errorHook() {} + +// If this is a checker, use the following line in main(int argc, char** argv) +// to replace stdin with the user output. freopen(argv[2], "r", stdin); diff --git a/sample_files/problem_setting/standard_checker_interactor.cpp b/sample_files/problem_setting/standard_checker_interactor.cpp new file mode 100644 index 0000000..3669c30 --- /dev/null +++ b/sample_files/problem_setting/standard_checker_interactor.cpp @@ -0,0 +1,161 @@ +#include +#include +#include +#include +#include +#include +#include +#include + +void errorHook(); +void exitWA() { + errorHook(); + std::exit(1); +} +void assertWA(bool condition) { + if (!condition) { + exitWA(); + } +} +namespace standard_whitespace_detail { +enum WhitespaceFlag { NONE = 0, SPACE = 1, NEWLINE = 2, ALL = 3 }; +WhitespaceFlag current_flag = ALL; // At checker start, consume all whitespace. 
+ +void poke_flag(WhitespaceFlag flag) { + if (!(current_flag == NONE || (current_flag == NEWLINE && flag == ALL))) { + throw std::runtime_error("Never call two whitespace methods in a row, " + "except for readNewLine() followed by readEOF()."); + } + current_flag = flag; +} + +enum ConsumeResult { + NO_WHITESPACE, + NO_LINES, + LINES, +}; +ConsumeResult consumeWhitespace() { + int c = getchar(); + ConsumeResult result = NO_WHITESPACE; + while (isspace(c) && c != EOF) { + if (result == NO_WHITESPACE) { + result = NO_LINES; + } + if (c == '\r' || c == '\n') { + result = LINES; + } + c = getchar(); + } + ungetc(c, stdin); + current_flag = NONE; + return result; +} + +void preReadToken() { + switch (current_flag) { + case NONE: + throw std::runtime_error( + "Must not call readInt (or readToken, or readFloat) twice in a row!"); + case SPACE: + assertWA(consumeWhitespace() == NO_LINES); + break; + case NEWLINE: + assertWA(consumeWhitespace() == LINES); + break; + case ALL: + consumeWhitespace(); + break; + } +} +} // namespace standard_whitespace_detail +void readSpace() { + standard_whitespace_detail::poke_flag(standard_whitespace_detail::SPACE); +} +void readNewLine() { + standard_whitespace_detail::poke_flag(standard_whitespace_detail::NEWLINE); +} +void readEOF() { + standard_whitespace_detail::poke_flag(standard_whitespace_detail::ALL); + standard_whitespace_detail::consumeWhitespace(); + assertWA(getchar() == EOF); +} +std::string readToken(char min_char = 0, char max_char = 127) { + standard_whitespace_detail::preReadToken(); + static constexpr size_t MAX_TOKEN_SIZE = 1e7; + std::string token; + int c = getchar(); + assertWA(!isspace(c)); + while (!isspace(c) && c != EOF && token.size() < MAX_TOKEN_SIZE) { + assertWA(min_char <= c && c <= max_char); + token.push_back(char(c)); + c = getchar(); + } + ungetc(c, stdin); + return token; +} +namespace regex_detail { +regex_t compile(const char *pattern) { + regex_t re; + if (regcomp(&re, pattern, REG_EXTENDED | 
REG_NOSUB) != 0) { + throw std::runtime_error("Pattern failed to compile."); + } + return re; +} +bool match(regex_t re, const std::string &text) { + return regexec(&re, text.c_str(), 0, NULL, 0) == 0; +} +} // namespace regex_detail +long long readInt(long long lo, long long hi) { + // stoll is horribly insufficient, so we use a regex. + static regex_t re = regex_detail::compile("^(0|-?[1-9][0-9]*)$"); + std::string token = readToken(); + assertWA(regex_detail::match(re, token)); + + long long parsedInt; + try { + parsedInt = stoll(token); + } catch (const std::invalid_argument &) { + exitWA(); + } catch (const std::out_of_range &) { + exitWA(); + } + assertWA(lo <= parsedInt && parsedInt <= hi); + return parsedInt; +} +template <typename T> +std::vector<T> readIntArray(size_t N, long long lo, long long hi) { + std::vector<T> arr; + arr.reserve(N); + for (size_t i = 0; i < N; i++) { + arr.push_back(readInt(lo, hi)); + if (i != N - 1) { + readSpace(); + } + } + readNewLine(); + return arr; +} +long double readFloat(long double min, long double max, + long double eps = 1e-9) { + // stold is horribly insufficient, so we use a regex. + static regex_t re = regex_detail::compile("^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$"); + std::string token = readToken(); + assertWA(regex_detail::match(re, token)); + long double parsedDouble; + try { + parsedDouble = stold(token); + } catch (const std::invalid_argument &) { + exitWA(); + } catch (const std::out_of_range &) { + exitWA(); + } + assertWA(min - eps <= parsedDouble && parsedDouble <= max + eps); + return parsedDouble; +} +// This is a hook for when the reader functions error, including `assertWA`. Use +// it to do custom handling when an error happens, such as overriding WA with +// partials, or printing a flag like `-1` in interactors. +void errorHook() {} + +// If this is a checker, use the following line in main(int argc, char** argv) +// to replace stdin with the user output. 
freopen(argv[2], "r", stdin); diff --git a/sample_files/problem_setting/validator.cpp b/sample_files/problem_setting/validator.cpp new file mode 100644 index 0000000..860dc66 --- /dev/null +++ b/sample_files/problem_setting/validator.cpp @@ -0,0 +1,70 @@ +#include +#include +#include +#include +#include +#include + +void readSpace() { assert(getchar() == ' '); } +void readNewLine() { assert(getchar() == '\n'); } +void readEOF() { assert(getchar() == EOF); } + +std::string readToken(char min_char = 0, char max_char = 127) { + static constexpr size_t MAX_TOKEN_SIZE = 1e7; + std::string token; + int c = getchar(); + assert(!isspace(c)); + while (!isspace(c) && c != EOF && token.size() < MAX_TOKEN_SIZE) { + assert(min_char <= c && c <= max_char); + token.push_back(char(c)); + c = getchar(); + } + ungetc(c, stdin); + return token; +} + +// We need a regex because of how lax stoll and stold are. +namespace regex_detail { +regex_t compile(const char *pattern) { + regex_t re; + assert(regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) == 0); + return re; +} +bool match(regex_t re, const std::string &text) { + return regexec(&re, text.c_str(), 0, NULL, 0) == 0; +} +} // namespace regex_detail + +long long readInt(long long lo, long long hi) { + static regex_t re = regex_detail::compile("^(0|-?[1-9][0-9]*)$"); + std::string token = readToken(); + assert(regex_detail::match(re, token)); + + long long parsedInt = stoll(token); + assert(lo <= parsedInt && parsedInt <= hi); + return parsedInt; +} + +template <typename T> +std::vector<T> readIntArray(size_t N, long long lo, long long hi) { + std::vector<T> arr; + arr.reserve(N); + for (size_t i = 0; i < N; i++) { + arr.push_back(readInt(lo, hi)); + if (i != N - 1) { + readSpace(); + } + } + readNewLine(); + return arr; +} + +long double readFloat(long double min, long double max, + long double eps = 1e-9) { + static regex_t re = regex_detail::compile("^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$"); + std::string token = readToken(); + assert(regex_detail::match(re, 
token)); + long double parsedDouble = stold(token); + assert(min - eps <= parsedDouble && parsedDouble <= max + eps); + return parsedDouble; +}
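
A note on the token grammars that `readInt` and `readFloat` are documented to enforce (no leading zeroes, an optional minus sign, no scientific notation, `-0` allowed for floats): the following is a minimal standalone sketch of how such grammars might be written as anchored POSIX extended regexes. The names `matchesPattern`, `INT_PATTERN`, `FLOAT_PATTERN`, `isIntToken`, and `isFloatToken` are hypothetical and not part of the templates, and the patterns are one assumed reading of the documented rules rather than a definitive specification.

```cpp
#include <regex.h>
#include <string>

// Hypothetical helper (not part of the templates): returns true when `text`
// matches the POSIX extended regular expression `pattern`.
bool matchesPattern(const char *pattern, const std::string &text) {
  regex_t re;
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
    return false; // Treat a pattern that fails to compile as "no match".
  }
  bool ok = regexec(&re, text.c_str(), 0, NULL, 0) == 0;
  regfree(&re);
  return ok;
}

// Integer tokens: "0", or an optional minus sign followed by a nonzero
// leading digit. The outer group makes both anchors apply to both
// alternatives; without it, `^` binds only to the first alternative and `$`
// only to the second.
const char *const INT_PATTERN = "^(0|-?[1-9][0-9]*)$";

// Float tokens: optional minus sign, an integer part without leading zeroes,
// and an optional fractional part. There is no exponent, so "1e5" is rejected.
const char *const FLOAT_PATTERN = "^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$";

bool isIntToken(const std::string &t) { return matchesPattern(INT_PATTERN, t); }
bool isFloatToken(const std::string &t) {
  return matchesPattern(FLOAT_PATTERN, t);
}
```

Under these assumed patterns, `isIntToken("007")` and `isIntToken("0abc")` are rejected, while `isFloatToken("-0")` and `isFloatToken("123.450")` are accepted. Validating the token shape first and only then calling `stoll` or `stold` keeps the lenient parsing of those functions from admitting malformed input.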