Many templates have been floating around in the DMOJ community for validation and input handling in checkers. This commit aims to consolidate them. It has two main goals:

- Correct. Duh.
- Simple. Other templates that circulate, including the ones I have published, are too complex. People naively try and write their own. I am sick and tired of reading over incorrect validators.

These templates forgo some principles of good design (such as object-oriented programming) in favour of pure simplicity. They should be simple enough that they are understandable by the broader community, and are not a black box. Hopefully this also dissuades re-writing.
Showing 5 changed files with 401 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# C++ Problem Setting Templates - `cpp_psetting_templates` | ||
|
||
There are three C++ input-handling templates provided for aiding problem setters. They are as follows: | ||
|
||
- [Validator Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/validator.cpp) | ||
- [Identical Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/identical_checker_interactor.cpp) | ||
- [Standard Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/standard_checker_interactor.cpp) | ||
|
||
## Validator | ||
|
||
This is a template for validating the input data of problems. It aims to be simple and, of course, correct. It contains seven functions. The first three are whitespace functions: | ||
|
||
- `void readSpace()` expects a space at the current position in the input, and aborts the program if there is not a space. | ||
- `void readNewLine()` expects a newline at the current position in the input. | ||
- `void readEOF()` expects the input file to end immediately at the current position. | ||
|
||
The remaining four are for actual content: | ||
|
||
- `std::string readToken(char min_char = 0, char max_char = 127)` returns the next token in the input stream. A token is defined as a whitespace-separated string. If the next character in the input is a whitespace character, this method aborts the program. The optional arguments `min_char` and `max_char` can be used to enforce a range on the characters in the token. For instance, `readToken('a', 'z')` reads a string of lowercase English letters. | ||
- `long long readInt(long long lo, long long hi)` parses the next token as an integer. It aborts on overflow, malformed integers, and if the resultant integer is not in the range [lo, hi], inclusive. | ||
- `long double readFloat(long double lo, long double hi, long double eps = 1e-9)` parses the next token as a float. It aborts on overflow, malformed floats, and if the resultant float is not in the range [lo, hi], inclusive, using the provided epsilon to perform the comparison. Scientific notation and NaNs are not accepted, nor are leading zeroes. `-0` is allowed. Trailing zeroes are also permitted. | ||
- `std::vector<T> readIntArray(size_t N, long long lo, long long hi)` parses the next space-separated N integers into an array, and then reads a final newline. It must be given a template argument, which is the type of the array elements. For example, `readIntArray<int>(5, 1, 10)` reads five space-separated integers into a `std::vector<int>`, where each integer is in the range [1, 10], inclusive. | ||
|
||
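As a rough illustration of the token formats described above, the sketch below checks individual tokens against that grammar. The helper names `isCanonicalInt` and `isCanonicalFloat` are hypothetical and not part of the templates, and the sketch uses `std::regex` for brevity where the templates use POSIX `regex.h`. It also assumes integers follow the analogous rules to floats (no leading zeroes, no `+` sign) and that `-0` is rejected for integers.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `token` is "0", or an optional '-' followed by
// digits with no leading zero. std::regex_match requires the WHOLE token to
// match, so no explicit ^ and $ anchors are needed.
bool isCanonicalInt(const std::string &token) {
    static const std::regex re("0|-?[1-9][0-9]*");
    return std::regex_match(token, re);
}

// Hypothetical helper: optional '-', an integer part without leading zeroes,
// and an optional fractional part. Scientific notation and NaN do not match.
bool isCanonicalFloat(const std::string &token) {
    static const std::regex re("-?(0|[1-9][0-9]*)(\\.[0-9]+)?");
    return std::regex_match(token, re);
}
```

Under these assumptions, `007`, `+5` and `1e9` are rejected, while `-0` and `3.1400` are valid floats.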
`readFloat()` will likely be of no use for many validators, and can be safely deleted. Similarly, `readIntArray` can be deleted if unneeded. | ||
|
||
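To make the calling pattern concrete, here is a minimal sketch of a validator for a hypothetical input format: a line with `N` (1 to 5), then a line with `N` space-separated integers in [1, 10]. So that it can be run in isolation, it uses simplified stand-ins for the readers that take an explicit `Input` argument and throw on failure. These are assumptions made for illustration; the actual template reads from `stdin` and aborts the program.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>

// In-memory stand-in for stdin (hypothetical; the templates use getchar()).
struct Input {
    std::string data;
    std::size_t pos;
};

void expectChar(Input &in, char want) {
    if (in.pos >= in.data.size() || in.data[in.pos] != want) {
        throw std::runtime_error("unexpected character");
    }
    in.pos++;
}
void readSpace(Input &in) { expectChar(in, ' '); }
void readNewLine(Input &in) { expectChar(in, '\n'); }
void readEOF(Input &in) {
    if (in.pos != in.data.size()) throw std::runtime_error("expected EOF");
}

long long readInt(Input &in, long long lo, long long hi) {
    std::size_t start = in.pos;
    while (in.pos < in.data.size() && in.data[in.pos] != ' ' &&
           in.data[in.pos] != '\n') {
        in.pos++;
    }
    std::string token = in.data.substr(start, in.pos - start);
    // Accept "0", or an optional '-' followed by digits with no leading zero.
    std::size_t i = (!token.empty() && token[0] == '-') ? 1 : 0;
    bool ok = token == "0" ||
              (i < token.size() && token[i] >= '1' && token[i] <= '9');
    for (std::size_t j = i; ok && j < token.size(); j++) {
        ok = token[j] >= '0' && token[j] <= '9';
    }
    if (!ok) throw std::runtime_error("malformed integer");
    long long value = std::stoll(token); // throws std::out_of_range on overflow
    if (value < lo || value > hi) throw std::runtime_error("out of range");
    return value;
}

// Returns true if `text` is exactly: N, newline, N integers, newline, EOF.
bool validate(const std::string &text) {
    Input in{text, 0};
    try {
        long long n = readInt(in, 1, 5);
        readNewLine(in);
        for (long long i = 0; i < n; i++) {
            readInt(in, 1, 10);
            if (i != n - 1) readSpace(in);
        }
        readNewLine(in);
        readEOF(in);
    } catch (const std::exception &) {
        return false;
    }
    return true;
}
```

With the actual template, the same sequence would be `readInt(1, 5)`, `readNewLine()`, `readIntArray<int>(N, 1, 10)` (which reads the trailing newline itself), and finally `readEOF()`.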
## Checkers/Interactors | ||
|
||
The next pair of templates are for checkers/interactors. The difference is the type of whitespace handling: the identical checker/interactor expects whitespace to match exactly. The standard checker/interactor handles whitespace like the `standard` checker. | ||
|
||
The checkers and interactors are designed for the `coci` bridged checker/interactor type. However, updating the exit codes used and the order of command line parameters to work with other types should not be challenging. | ||
|
||
Both files can be used for either checkers or interactors, with the following caveat: interactors MUST close `stdout` BEFORE calling `readEOF()`, so that the user process can terminate in case it _also_ expects an EOF. | ||
|
||
The general format of the checkers/interactors is the same as that of the validator, with a few changes: | ||
|
||
- `readSpace(), readNewLine(), readEOF()`: Under the identical checker, these return Presentation Error if the check fails. Under the standard checker, these return WA. | ||
- `readToken()`: Under the identical checker, this returns Presentation Error if the token is empty, and WA if any character is not in range. | ||
- `readInt(), readIntArray(), readFloat()`: These return WA if the token is malformed or out of range. | ||
|
||
Additionally, two new functions are provided. | ||
|
||
- `exitWA()` unconditionally exits with a WA verdict. | ||
- `assertWA(bool)` takes a condition and exits with WA if the condition is false. | ||
|
||
Under the identical checker, the corresponding functions `exitPE()` and `assertPE(bool)` are provided. Standard checkers should not use the Presentation Error code, as the builtin `standard` checker does not use this code. | ||
|
||
Finally, there is an empty function `errorHook()`. This function is called whenever one of the provided functions would exit with an error. It should be used to do custom handling, such as providing partial points for outputting part of an answer, or outputting `-1` in interactors to signal errors to the user submission. | ||
|
||
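As one possible shape for the interactor case, the sketch below fills in `errorHook()` so that it prints `-1` before the interactor exits. The `sentinelSent` flag is a hypothetical addition, used here only to avoid printing twice and to make the behaviour observable. Whether `-1` is the right signal depends on the problem statement.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical flag, not part of the templates: records that the sentinel
// has already been written to the submission.
bool sentinelSent = false;

// Interactor variant of errorHook(): tell the submission to stop, and flush
// so the message is not left sitting in a buffer.
void errorHook() {
    if (sentinelSent) {
        return; // never send the sentinel twice
    }
    std::puts("-1");
    std::fflush(stdout);
    sentinelSent = true;
}
```

Since `exitWA()` and `exitPE()` both call `errorHook()` before exiting, a body like this would run on every failing read as well as on explicit `assertWA` calls.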
## Standard Checker/Interactor Design | ||
|
||
This section is purely for those interested in the design and inner workings of the standard checker/interactor routines. | ||
|
||
The general overview is that `readSpace()` should read non-line whitespace characters, `readNewLine()` should read whitespace and expect a line whitespace character, and `readEOF()` should read all whitespace and check for EOF. Additionally, any leading whitespace in the input should be trimmed. | ||
|
||
There are two major challenges with making a standard checker/interactor design ergonomic: | ||
- Under interactors, it is not acceptable to consume all whitespace in the `readNewLine` method, as the user submission will likely output a single line and then wait for the interactor to send another query. If the interactor naively tried to consume all whitespace, it would block, and the user submission would TLE. | ||
- After reading the end of the input, it's most ergonomic to have the checker read a newline, and then call `readEOF()`, as this is the canonical input format. However, the standard checker allows users to forgo the last newline, and if the `readNewLine()` method expected a newline, we would erroneously return WA. | ||
|
||
To solve both of these problems, we employ a lazy whitespace checking scheme. `readSpace()` and `readNewLine()` simply set a flag for `readToken()`. `readToken()` then consumes the whitespace and validates it, before reading the token. Additionally, `readEOF()`, if called after `readNewLine()`, ignores the flag and consumes all whitespace, and then checks for EOF. |
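
The lazy scheme can be sketched on an in-memory string as follows. This is a simplified illustration rather than the template itself: the `Lazy` struct, the `accepts` driver and the exception-based failures are assumptions made so the sketch can run in isolation, and it omits the template's checks against calling two whitespace methods or two token methods in a row.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

enum Pending { NONE, SPACE, NEWLINE, ALL };

struct Lazy {
    std::string data;
    std::size_t pos;
    Pending pending; // whitespace owed before the next token
};

// Consume a run of whitespace; report whether there was any, and whether it
// contained a line break.
void skipWhitespace(Lazy &in, bool &any, bool &line) {
    any = line = false;
    while (in.pos < in.data.size() &&
           std::isspace(static_cast<unsigned char>(in.data[in.pos]))) {
        any = true;
        if (in.data[in.pos] == '\n' || in.data[in.pos] == '\r') line = true;
        in.pos++;
    }
    in.pending = NONE;
}

void readSpace(Lazy &in) { in.pending = SPACE; }     // only records the debt
void readNewLine(Lazy &in) { in.pending = NEWLINE; } // only records the debt

std::string readToken(Lazy &in) {
    Pending owed = in.pending;
    bool any, line;
    skipWhitespace(in, any, line); // the owed whitespace is validated here
    if (owed == SPACE && (!any || line)) throw std::runtime_error("expected spaces");
    if (owed == NEWLINE && !line) throw std::runtime_error("expected a line break");
    std::size_t start = in.pos;
    while (in.pos < in.data.size() &&
           !std::isspace(static_cast<unsigned char>(in.data[in.pos]))) {
        in.pos++;
    }
    if (start == in.pos) throw std::runtime_error("expected a token");
    return in.data.substr(start, in.pos - start);
}

void readEOF(Lazy &in) {
    bool any, line;
    skipWhitespace(in, any, line); // a pending newline is optional at EOF
    if (in.pos != in.data.size()) throw std::runtime_error("expected EOF");
}

// Hypothetical driver: expects two tokens on line one and one on line two.
bool accepts(const std::string &text) {
    Lazy in{text, 0, ALL};
    try {
        readToken(in); readSpace(in); readToken(in); readNewLine(in);
        readToken(in); readNewLine(in); readEOF(in);
    } catch (const std::runtime_error &) {
        return false;
    }
    return true;
}
```

Because `readNewLine()` only records what is owed, nothing blocks waiting for further input until the next token is requested, and a missing final newline is forgiven because `readEOF()` never enforces the pending flag.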
110 changes: 110 additions & 0 deletions
sample_files/problem_setting/identical_checker_interactor.cpp
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
#include <algorithm> | ||
#include <cctype> | ||
#include <cstdio> | ||
#include <cstdlib> | ||
#include <regex.h> | ||
#include <stdexcept> | ||
#include <string> | ||
#include <vector> | ||
|
||
void errorHook(); | ||
void exitWA() { | ||
errorHook(); | ||
std::exit(1); | ||
} | ||
void exitPE() { | ||
errorHook(); | ||
std::exit(2); | ||
} | ||
void assertWA(bool condition) { | ||
if (!condition) { | ||
exitWA(); | ||
} | ||
} | ||
void assertPE(bool condition) { | ||
if (!condition) { | ||
// Defined for `testlib` and `coci`. | ||
exitPE(); | ||
} | ||
} | ||
void readSpace() { assertPE(getchar() == ' '); } | ||
void readNewLine() { assertPE(getchar() == '\n'); } | ||
void readEOF() { assertPE(getchar() == EOF); } | ||
std::string readToken(char min_char = 0, char max_char = 127) { | ||
static constexpr size_t MAX_TOKEN_SIZE = 1e7; | ||
std::string token; | ||
int c = getchar(); | ||
assertPE(!isspace(c)); | ||
while (!isspace(c) && c != EOF && token.size() < MAX_TOKEN_SIZE) { | ||
assertWA(min_char <= c && c <= max_char); | ||
token.push_back(char(c)); | ||
c = getchar(); | ||
} | ||
ungetc(c, stdin); | ||
return token; | ||
} | ||
namespace regex_detail { | ||
regex_t compile(const char *pattern) { | ||
regex_t re; | ||
if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) { | ||
throw std::runtime_error("Pattern failed to compile."); | ||
} | ||
return re; | ||
} | ||
bool match(const regex_t &re, const std::string &text) { | ||
return regexec(&re, text.c_str(), 0, NULL, 0) == 0; | ||
} | ||
} // namespace regex_detail | ||
long long readInt(long long lo, long long hi) { | ||
// stoll is horribly insufficient, so we use a regex. | ||
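// The group is required: without it, `|` splits the pattern into `^0` and | ||
// `-?[1-9][0-9]*$`, so tokens such as "007" or "0x" would match. | ||
static regex_t re = regex_detail::compile("^(0|-?[1-9][0-9]*)$"); | ||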
std::string token = readToken(); | ||
assertWA(regex_detail::match(re, token)); | ||
|
||
long long parsedInt; | ||
try { | ||
parsedInt = stoll(token); | ||
} catch (const std::invalid_argument &) { | ||
exitWA(); | ||
} catch (const std::out_of_range &) { | ||
exitWA(); | ||
} | ||
assertWA(lo <= parsedInt && parsedInt <= hi); | ||
return parsedInt; | ||
} | ||
template <typename T> | ||
std::vector<T> readIntArray(size_t N, long long lo, long long hi) { | ||
std::vector<T> arr; | ||
arr.reserve(N); | ||
for (size_t i = 0; i < N; i++) { | ||
arr.push_back(readInt(lo, hi)); | ||
if (i != N - 1) { | ||
readSpace(); | ||
} | ||
} | ||
readNewLine(); | ||
return arr; | ||
} | ||
long double readFloat(long double min, long double max, | ||
long double eps = 1e-9) { | ||
// stold is horribly insufficient, so we use a regex. | ||
// Integer part: "0", or a non-zero digit followed by any number of digits. | ||
static regex_t re = regex_detail::compile("^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$"); | ||
std::string token = readToken(); | ||
assertWA(regex_detail::match(re, token)); | ||
long double parsedDouble; | ||
try { | ||
parsedDouble = stold(token); | ||
} catch (const std::invalid_argument &) { | ||
exitWA(); | ||
} catch (const std::out_of_range &) { | ||
exitWA(); | ||
} | ||
assertWA(min - eps <= parsedDouble && parsedDouble <= max + eps); | ||
return parsedDouble; | ||
} | ||
// This is a hook for when the reader functions error, including `assertWA`. Use | ||
// it to do custom handling when an error happens, such as overriding WA with | ||
// partials, or printing a flag like `-1` in interactors. | ||
void errorHook() {} | ||
|
||
// If this is a checker, use the following line in main(int argc, char** argv) | ||
// to replace stdin with the user output: | ||
//   freopen(argv[2], "r", stdin); |
161 changes: 161 additions & 0 deletions
sample_files/problem_setting/standard_checker_interactor.cpp
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,161 @@ | ||
#include <algorithm> | ||
#include <cassert> | ||
#include <cctype> | ||
#include <cstdio> | ||
#include <cstdlib> | ||
#include <regex.h> | ||
#include <stdexcept> | ||
#include <string> | ||
#include <vector> | ||
|
||
void errorHook(); | ||
void exitWA() { | ||
errorHook(); | ||
std::exit(1); | ||
} | ||
void assertWA(bool condition) { | ||
if (!condition) { | ||
exitWA(); | ||
} | ||
} | ||
namespace standard_whitespace_detail { | ||
enum WhitespaceFlag { NONE = 0, SPACE = 1, NEWLINE = 2, ALL = 3 }; | ||
WhitespaceFlag current_flag = ALL; // At checker start, consume all whitespace. | ||
|
||
void poke_flag(WhitespaceFlag flag) { | ||
if (!(current_flag == NONE || (current_flag == NEWLINE && flag == ALL))) { | ||
throw std::runtime_error("Never call two whitespace methods in a row, " | ||
"except for readNewLine() followed by readEOF()."); | ||
} | ||
current_flag = flag; | ||
} | ||
|
||
enum ConsumeResult { | ||
NO_WHITESPACE, | ||
NO_LINES, | ||
LINES, | ||
}; | ||
ConsumeResult consumeWhitespace() { | ||
int c = getchar(); | ||
ConsumeResult result = NO_WHITESPACE; | ||
while (isspace(c) && c != EOF) { | ||
if (result == NO_WHITESPACE) { | ||
result = NO_LINES; | ||
} | ||
if (c == '\r' || c == '\n') { | ||
result = LINES; | ||
} | ||
c = getchar(); | ||
} | ||
ungetc(c, stdin); | ||
current_flag = NONE; | ||
return result; | ||
} | ||
|
||
void preReadToken() { | ||
switch (current_flag) { | ||
case NONE: | ||
throw std::runtime_error( | ||
"Must not call readInt (or readToken, or readFloat) twice in a row!"); | ||
case SPACE: | ||
assertWA(consumeWhitespace() == NO_LINES); | ||
break; | ||
case NEWLINE: | ||
assertWA(consumeWhitespace() == LINES); | ||
break; | ||
case ALL: | ||
consumeWhitespace(); | ||
break; | ||
} | ||
} | ||
} // namespace standard_whitespace_detail | ||
void readSpace() { | ||
standard_whitespace_detail::poke_flag(standard_whitespace_detail::SPACE); | ||
} | ||
void readNewLine() { | ||
standard_whitespace_detail::poke_flag(standard_whitespace_detail::NEWLINE); | ||
} | ||
void readEOF() { | ||
standard_whitespace_detail::poke_flag(standard_whitespace_detail::ALL); | ||
standard_whitespace_detail::consumeWhitespace(); | ||
assertWA(getchar() == EOF); | ||
} | ||
std::string readToken(char min_char = 0, char max_char = 127) { | ||
standard_whitespace_detail::preReadToken(); | ||
static constexpr size_t MAX_TOKEN_SIZE = 1e7; | ||
std::string token; | ||
int c = getchar(); | ||
assertWA(!isspace(c)); | ||
while (!isspace(c) && c != EOF && token.size() < MAX_TOKEN_SIZE) { | ||
assertWA(min_char <= c && c <= max_char); | ||
token.push_back(char(c)); | ||
c = getchar(); | ||
} | ||
ungetc(c, stdin); | ||
return token; | ||
} | ||
namespace regex_detail { | ||
regex_t compile(const char *pattern) { | ||
regex_t re; | ||
if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) { | ||
throw std::runtime_error("Pattern failed to compile."); | ||
} | ||
return re; | ||
} | ||
bool match(const regex_t &re, const std::string &text) { | ||
return regexec(&re, text.c_str(), 0, NULL, 0) == 0; | ||
} | ||
} // namespace regex_detail | ||
long long readInt(long long lo, long long hi) { | ||
// stoll is horribly insufficient, so we use a regex. | ||
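// The group is required: without it, `|` splits the pattern into `^0` and | ||
// `-?[1-9][0-9]*$`, so tokens such as "007" or "0x" would match. | ||
static regex_t re = regex_detail::compile("^(0|-?[1-9][0-9]*)$"); | ||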
std::string token = readToken(); | ||
assertWA(regex_detail::match(re, token)); | ||
|
||
long long parsedInt; | ||
try { | ||
parsedInt = stoll(token); | ||
} catch (const std::invalid_argument &) { | ||
exitWA(); | ||
} catch (const std::out_of_range &) { | ||
exitWA(); | ||
} | ||
assertWA(lo <= parsedInt && parsedInt <= hi); | ||
return parsedInt; | ||
} | ||
template <typename T> | ||
std::vector<T> readIntArray(size_t N, long long lo, long long hi) { | ||
std::vector<T> arr; | ||
arr.reserve(N); | ||
for (size_t i = 0; i < N; i++) { | ||
arr.push_back(readInt(lo, hi)); | ||
if (i != N - 1) { | ||
readSpace(); | ||
} | ||
} | ||
readNewLine(); | ||
return arr; | ||
} | ||
long double readFloat(long double min, long double max, | ||
long double eps = 1e-9) { | ||
// stold is horribly insufficient, so we use a regex. | ||
// Integer part: "0", or a non-zero digit followed by any number of digits. | ||
static regex_t re = regex_detail::compile("^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$"); | ||
std::string token = readToken(); | ||
assertWA(regex_detail::match(re, token)); | ||
long double parsedDouble; | ||
try { | ||
parsedDouble = stold(token); | ||
} catch (const std::invalid_argument &) { | ||
exitWA(); | ||
} catch (const std::out_of_range &) { | ||
exitWA(); | ||
} | ||
assertWA(min - eps <= parsedDouble && parsedDouble <= max + eps); | ||
return parsedDouble; | ||
} | ||
// This is a hook for when the reader functions error, including `assertWA`. Use | ||
// it to do custom handling when an error happens, such as overriding WA with | ||
// partials, or printing a flag like `-1` in interactors. | ||
void errorHook() {} | ||
|
||
// If this is a checker, use the following line in main(int argc, char** argv) | ||
// to replace stdin with the user output: | ||
//   freopen(argv[2], "r", stdin); |