problem_format: add C++ templates
Many templates have been floating around in the DMOJ community for
validation and input handling in checkers. This commit aims to
consolidate them. It has two main goals:

- Correct. Duh.
- Simple. Other templates that circulate, including the ones I have
  published, are too complex. People naively try and write their own. I
  am sick and tired of reading over incorrect validators.

  These templates forgo some principles of good design (such as
  object-oriented programming) in favour of pure simplicity. They should
  be simple enough that they are understandable by the broader
  community, and are not a black box. Hopefully this also dissuades
  re-writing.
Riolku committed Jan 9, 2024
1 parent 430149b commit 76ec663
Showing 5 changed files with 401 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/_sidebar.md
@@ -24,6 +24,7 @@
- [Custom graders](problem_format/custom_graders.md)
- [Generators](problem_format/generator.md)
- [Problem examples](problem_format/problem_examples.md)
- [C++ Problem Setting Templates](problem_format/cpp_psetting_templates.md)

- About
- [License](about/LICENSE.md)
59 changes: 59 additions & 0 deletions docs/problem_format/cpp_psetting_templates.md
@@ -0,0 +1,59 @@
# C++ Problem Setting Templates - `cpp_psetting_templates`

There are three C++ input-handling templates provided for aiding problem setters. They are as follows:

- [Validator Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/validator.cpp)
- [Identical Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/identical_checker_interactor.cpp)
- [Standard Checker/Interactor Template](https://github.com/DMOJ/docs/blob/master/sample_files/problem_setting/standard_checker_interactor.cpp)

## Validator

This is a template for validating the input data of problems. It aims to be simple and, of course, correct. It contains seven functions. The first three are whitespace functions:

- `void readSpace()` expects a space at the current position in the input, and aborts the program if there is not a space.
- `void readNewLine()` expects a newline at the current position in the input.
- `void readEOF()` expects the input file to end immediately at the current position.

The remaining four are for actual content:

- `std::string readToken(char min_char = 0, char max_char = 127)` returns the next token in the input stream. A token is defined as a whitespace-separated string. If the next character in the input is a whitespace character, this method aborts the program. The optional arguments `min_char` and `max_char` can be used to enforce a range on the characters in the token. For instance, `readToken('a', 'z')` reads a string of lowercase English letters.
- `long long readInt(long long lo, long long hi)` parses the next token as an integer. It aborts on overflow, malformed integers, and if the resultant integer is not in the range [lo, hi], inclusive.
- `long double readFloat(long double lo, long double hi, long double eps = 1e-9)` parses the next token as a float. It aborts on overflow, malformed floats, and if the resultant float is not in the range [lo, hi], inclusive, using the provided epsilon to perform the comparison. Scientific notation and NaNs are not accepted, nor are leading zeroes. `-0` is allowed. Trailing zeroes are also permitted.
- `std::vector<T> readIntArray(size_t N, long long lo, long long hi)` parses the next space-separated N integers into an array, and then reads a final newline. It must be given a template argument, which is the type of the array elements. For example, `readIntArray<int>(5, 1, 10)` reads five space-separated integers into a `std::vector<int>`, where each integer is in the range [1, 10], inclusive.
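The accepted token formats can be illustrated in isolation. The following is a minimal sketch, not the template's code: it uses `std::regex` rather than POSIX `regex.h`, and the names `looksLikeInt`, `looksLikeFloat`, and `inRange` are hypothetical helpers introduced only for this example.

```cpp
#include <regex>
#include <string>

// Hypothetical helpers (not part of the templates). They mirror the token
// formats described above, using std::regex for brevity.

// Accepts "0", or an optionally negative integer without leading zeroes.
// "-0" and "+5" are rejected.
bool looksLikeInt(const std::string &token) {
  static const std::regex re("0|-?[1-9][0-9]*");
  return std::regex_match(token, re); // regex_match tests the whole token
}

// Accepts an optional minus sign, an integer part without leading zeroes,
// and an optional fractional part. No exponent, no NaN, no ".5" or "5.".
bool looksLikeFloat(const std::string &token) {
  static const std::regex re("-?(0|[1-9][0-9]*)(\\.[0-9]+)?");
  return std::regex_match(token, re);
}

// Inclusive range check, widened by eps on both sides as described for
// readFloat.
bool inRange(long double x, long double lo, long double hi,
             long double eps = 1e-9) {
  return lo - eps <= x && x <= hi + eps;
}
```

A format check like this only establishes the shape of the token. Overflow still has to be caught separately, which the templates do by handling the exceptions thrown by `stoll` and `stold`.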

`readFloat()` will likely be of no use for many validators, and can be safely deleted. Similarly, `readIntArray` can be deleted if unneeded.
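To show how the calls fit together, here is a sketch of a validator for a hypothetical input format: one line with `N` (1 to 5), then one line with `N` space-separated integers in [1, 10^9]. It is an in-memory stand-in rather than the template itself: `SketchReader` and `isValidInput` are invented for this example, the reader works on a string and throws instead of reading `stdin` and aborting, and its integer check is simplified to unsigned values.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical in-memory stand-in for the validator template. The real
// template reads stdin and aborts; this sketch reads a string and throws.
struct SketchReader {
  std::string data;
  std::size_t pos = 0;

  explicit SketchReader(std::string s) : data(std::move(s)) {}

  void expect(char ch) {
    if (pos >= data.size() || data[pos] != ch)
      throw std::runtime_error("unexpected character");
    ++pos;
  }
  void readSpace() { expect(' '); }
  void readNewLine() { expect('\n'); }
  void readEOF() {
    if (pos != data.size())
      throw std::runtime_error("expected end of input");
  }
  long long readInt(long long lo, long long hi) {
    std::size_t start = pos;
    while (pos < data.size() && data[pos] != ' ' && data[pos] != '\n')
      ++pos;
    std::string token = data.substr(start, pos - start);
    // Simplified format check: digits only, no sign, no leading zeroes,
    // and at most 18 digits so std::stoll cannot overflow.
    if (token.empty() || token.size() > 18 ||
        (token.size() > 1 && token[0] == '0'))
      throw std::runtime_error("malformed integer");
    for (char ch : token)
      if (ch < '0' || ch > '9')
        throw std::runtime_error("malformed integer");
    long long value = std::stoll(token);
    if (value < lo || value > hi)
      throw std::runtime_error("out of range");
    return value;
  }
};

// Every space, newline, and the end of the input is checked explicitly,
// in the same order a validator built on the template would check them.
bool isValidInput(const std::string &input) {
  SketchReader in(input);
  try {
    long long n = in.readInt(1, 5);
    in.readNewLine();
    for (long long i = 0; i < n; i++) {
      in.readInt(1, 1000000000);
      if (i != n - 1)
        in.readSpace();
    }
    in.readNewLine();
    in.readEOF();
  } catch (const std::runtime_error &) {
    return false;
  }
  return true;
}
```

With the actual template, the body of `isValidInput` would sit in `main()`, and the loop could be replaced by a single `readIntArray` call.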

## Checkers/Interactors

The next pair of templates are for checkers/interactors. The difference is the type of whitespace handling: the identical checker/interactor expects whitespace to match exactly. The standard checker/interactor handles whitespace like the `standard` checker.

The checkers and interactors are designed for the `coci` bridged checker/interactor type. However, updating the exit codes used and the order of command line parameters to work with other types should not be challenging.

Both files can be used for either checkers or interactors, with the following caveat: interactors MUST close `stdout` BEFORE calling `readEOF()`, so that the user process can terminate in case it _also_ expects an EOF.

The general format of the checkers/interactors is the same as the validator, with a few changes:

- `readSpace(), readNewLine(), readEOF()`: Under the identical checker, these return Presentation Error if the check fails. Under the standard checker, these return WA.
- `readToken()`: Under the identical checker, this returns Presentation Error if the token is empty, and WA if any character is not in range.
- `readInt(), readIntArray(), readFloat()`: These return WA if the token is malformed or out of range.

Additionally, two new functions are provided.

- `exitWA()` unconditionally exits with a WA verdict.
- `assertWA(bool)` takes a condition and exits with WA if the condition is false.

Under the identical checker, corresponding functions `exitPE()` and `assertPE(bool)` are provided. Standard checkers should not use the Presentation Error code, as the built-in `standard` checker does not use this code.
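A common pattern is to read values with the reader functions and then pass a problem-specific condition to `assertWA`. The sketch below isolates one such condition as a pure function. The name `isPermutation` and the problem it implies are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical problem-specific condition: each of the values 1..N appears
// exactly once, where N is the length of the vector.
bool isPermutation(const std::vector<int> &values) {
  std::vector<bool> seen(values.size() + 1, false);
  for (int v : values) {
    if (v < 1 || static_cast<std::size_t>(v) > values.size() || seen[v])
      return false;
    seen[v] = true;
  }
  return true;
}

// Inside a checker built on the template, with N already read from the
// input file, this could be used as:
//   std::vector<int> p = readIntArray<int>(N, 1, N);
//   assertWA(isPermutation(p));
```

Keeping the condition separate from the reading code makes it easy to test without running the checker.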

Finally, there is an empty function `errorHook()`. This function is called whenever one of the provided functions would exit with an error. It should be used to do custom handling, such as providing partial points for outputting part of an answer, or outputting `-1` in interactors to signal errors to the user submission.

## Standard Checker/Interactor Design

This section is purely for those interested in the design and inner workings of the standard checker/interactor routines.

The general overview is that `readSpace()` should read non-line whitespace characters, `readNewLine()` should read whitespace and expect a line-break character, and `readEOF()` should read all whitespace and check for EOF. Additionally, any leading whitespace in the input should be trimmed.

There are two major challenges with making a standard checker/interactor design ergonomic:
- Under interactors, it is not acceptable to consume all whitespace in the `readNewLine` method, as the user submission will likely output a single line and then wait for the interactor to send another query. If the interactor naively tried to consume all whitespace, it would block, and the user submission would TLE.
- After reading the end of the input, it's most ergonomic to have the checker read a newline, and then call `readEOF()`, as this is the canonical input format. However, the standard checker allows users to forgo the last newline, and if the `readNewLine()` method expected a newline, we would erroneously return WA.

To solve both of these problems, we employ a lazy whitespace checking scheme. `readSpace()` and `readNewLine()` simply set a flag for `readToken()`. `readToken()` then consumes the whitespace and validates it, before reading the token. Additionally, `readEOF()`, if called after `readNewLine()`, ignores the flag and consumes all whitespace, and then checks for EOF.
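The lazy scheme can be modelled on an in-memory string. This is a sketch under stated assumptions, not the template's implementation: `LazyReader` and `acceptsTwoTokens` are hypothetical names, and a failed check throws `std::runtime_error` where the template would exit with WA.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical in-memory model of the lazy whitespace scheme.
class LazyReader {
  enum class Pending { All, None, Space, NewLine };
  enum class Seen { Nothing, Blanks, LineBreak };

  std::string data_;
  std::size_t pos_ = 0;
  Pending pending_ = Pending::All; // leading whitespace is trimmed

  static void fail() { throw std::runtime_error("WA"); }

  // Skips a run of whitespace and reports what kind it contained.
  Seen skipWhitespace() {
    Seen seen = Seen::Nothing;
    while (pos_ < data_.size() &&
           std::isspace(static_cast<unsigned char>(data_[pos_]))) {
      if (data_[pos_] == '\n' || data_[pos_] == '\r')
        seen = Seen::LineBreak;
      else if (seen == Seen::Nothing)
        seen = Seen::Blanks;
      ++pos_;
    }
    return seen;
  }

public:
  explicit LazyReader(std::string s) : data_(std::move(s)) {}

  // Both calls only record what must precede the next token.
  void readSpace() { pending_ = Pending::Space; }
  void readNewLine() { pending_ = Pending::NewLine; }

  // The recorded expectation is checked here, right before the token.
  std::string readToken() {
    if (pending_ == Pending::None)
      throw std::logic_error("two tokens read without a whitespace call");
    Seen seen = skipWhitespace();
    if (pending_ == Pending::Space && seen != Seen::Blanks)
      fail();
    if (pending_ == Pending::NewLine && seen != Seen::LineBreak)
      fail();
    pending_ = Pending::None;
    std::size_t start = pos_;
    while (pos_ < data_.size() &&
           !std::isspace(static_cast<unsigned char>(data_[pos_])))
      ++pos_;
    if (start == pos_)
      fail(); // ran out of data: there is no token to read
    return data_.substr(start, pos_ - start);
  }

  // Ignores any pending flag, so the final newline is optional.
  void readEOF() {
    skipWhitespace();
    if (pos_ != data_.size())
      fail();
  }
};

// Hypothetical checker body: two tokens on one line, then end of output.
bool acceptsTwoTokens(const std::string &output) {
  LazyReader in(output);
  try {
    in.readToken();
    in.readSpace();
    in.readToken();
    in.readNewLine();
    in.readEOF();
  } catch (const std::runtime_error &) {
    return false;
  }
  return true;
}
```

Because `readNewLine()` never touches the data itself, an interactor built this way would not block waiting for more output after a line, and an output that omits the final newline still passes `readEOF()`.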
110 changes: 110 additions & 0 deletions sample_files/problem_setting/identical_checker_interactor.cpp
@@ -0,0 +1,110 @@
#include <algorithm>
#include <cctype>
#include <cstdio>
#include <cstdlib>
#include <regex.h>
#include <stdexcept>
#include <string>
#include <vector>

void errorHook();
void exitWA() {
  errorHook();
  std::exit(1);
}
void exitPE() {
  errorHook();
  std::exit(2);
}
void assertWA(bool condition) {
  if (!condition) {
    exitWA();
  }
}
void assertPE(bool condition) {
  if (!condition) {
    // Defined for `testlib` and `coci`.
    exitPE();
  }
}
void readSpace() { assertPE(getchar() == ' '); }
void readNewLine() { assertPE(getchar() == '\n'); }
void readEOF() { assertPE(getchar() == EOF); }
std::string readToken(char min_char = 0, char max_char = 127) {
  static constexpr size_t MAX_TOKEN_SIZE = 1e7;
  std::string token;
  int c = getchar();
  assertPE(!isspace(c));
  while (!isspace(c) && c != EOF && token.size() < MAX_TOKEN_SIZE) {
    assertWA(min_char <= c && c <= max_char);
    token.push_back(char(c));
    c = getchar();
  }
  ungetc(c, stdin);
  return token;
}
namespace regex_detail {
regex_t compile(const char *pattern) {
  regex_t re;
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
    throw std::runtime_error("Pattern failed to compile.");
  }
  return re;
}
// Takes the compiled pattern by reference to avoid copying the regex_t.
bool match(const regex_t &re, const std::string &text) {
  return regexec(&re, text.c_str(), 0, NULL, 0) == 0;
}
} // namespace regex_detail
long long readInt(long long lo, long long hi) {
  // stoll is horribly insufficient, so we use a regex. The group makes both
  // anchors apply to each alternative.
  static regex_t re = regex_detail::compile("^(0|-?[1-9][0-9]*)$");
  std::string token = readToken();
  assertWA(regex_detail::match(re, token));

  long long parsedInt;
  try {
    parsedInt = stoll(token);
  } catch (const std::invalid_argument &) {
    exitWA();
  } catch (const std::out_of_range &) {
    exitWA();
  }
  assertWA(lo <= parsedInt && parsedInt <= hi);
  return parsedInt;
}
template <typename T>
std::vector<T> readIntArray(size_t N, long long lo, long long hi) {
  std::vector<T> arr;
  arr.reserve(N);
  for (size_t i = 0; i < N; i++) {
    arr.push_back(readInt(lo, hi));
    if (i != N - 1) {
      readSpace();
    }
  }
  readNewLine();
  return arr;
}
long double readFloat(long double min, long double max,
                      long double eps = 1e-9) {
  // stold is horribly insufficient, so we use a regex. The integer part is
  // either 0 or any number of digits without a leading zero.
  static regex_t re = regex_detail::compile("^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$");
  std::string token = readToken();
  assertWA(regex_detail::match(re, token));
  long double parsedDouble;
  try {
    parsedDouble = stold(token);
  } catch (const std::invalid_argument &) {
    exitWA();
  } catch (const std::out_of_range &) {
    exitWA();
  }
  assertWA(min - eps <= parsedDouble && parsedDouble <= max + eps);
  return parsedDouble;
}
// This is a hook for when the reader functions error, including `assertWA`. Use
// it to do custom handling when an error happens, such as overriding WA with
// partials, or printing a flag like `-1` in interactors.
void errorHook() {}

// If this is a checker, use the following line in main(int argc, char** argv)
// to replace stdin with the user output.
// freopen(argv[2], "r", stdin);
161 changes: 161 additions & 0 deletions sample_files/problem_setting/standard_checker_interactor.cpp
@@ -0,0 +1,161 @@
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdio>
#include <cstdlib>
#include <regex.h>
#include <stdexcept>
#include <string>
#include <vector>

void errorHook();
void exitWA() {
  errorHook();
  std::exit(1);
}
void assertWA(bool condition) {
  if (!condition) {
    exitWA();
  }
}
namespace standard_whitespace_detail {
enum WhitespaceFlag { NONE = 0, SPACE = 1, NEWLINE = 2, ALL = 3 };
WhitespaceFlag current_flag = ALL; // At checker start, consume all whitespace.

void poke_flag(WhitespaceFlag flag) {
  if (!(current_flag == NONE || (current_flag == NEWLINE && flag == ALL))) {
    throw std::runtime_error("Never call two whitespace methods in a row, "
                             "except for readNewLine() followed by readEOF().");
  }
  current_flag = flag;
}

enum ConsumeResult {
  NO_WHITESPACE,
  NO_LINES,
  LINES,
};
ConsumeResult consumeWhitespace() {
  int c = getchar();
  ConsumeResult result = NO_WHITESPACE;
  while (isspace(c) && c != EOF) {
    if (result == NO_WHITESPACE) {
      result = NO_LINES;
    }
    if (c == '\r' || c == '\n') {
      result = LINES;
    }
    c = getchar();
  }
  ungetc(c, stdin);
  current_flag = NONE;
  return result;
}

void preReadToken() {
  switch (current_flag) {
  case NONE:
    throw std::runtime_error(
        "Must not call readInt (or readToken, or readFloat) twice in a row!");
  case SPACE:
    assertWA(consumeWhitespace() == NO_LINES);
    break;
  case NEWLINE:
    assertWA(consumeWhitespace() == LINES);
    break;
  case ALL:
    consumeWhitespace();
    break;
  }
}
} // namespace standard_whitespace_detail
void readSpace() {
  standard_whitespace_detail::poke_flag(standard_whitespace_detail::SPACE);
}
void readNewLine() {
  standard_whitespace_detail::poke_flag(standard_whitespace_detail::NEWLINE);
}
void readEOF() {
  standard_whitespace_detail::poke_flag(standard_whitespace_detail::ALL);
  standard_whitespace_detail::consumeWhitespace();
  assertWA(getchar() == EOF);
}
std::string readToken(char min_char = 0, char max_char = 127) {
  standard_whitespace_detail::preReadToken();
  static constexpr size_t MAX_TOKEN_SIZE = 1e7;
  std::string token;
  int c = getchar();
  assertWA(!isspace(c));
  while (!isspace(c) && c != EOF && token.size() < MAX_TOKEN_SIZE) {
    assertWA(min_char <= c && c <= max_char);
    token.push_back(char(c));
    c = getchar();
  }
  ungetc(c, stdin);
  return token;
}
namespace regex_detail {
regex_t compile(const char *pattern) {
  regex_t re;
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
    throw std::runtime_error("Pattern failed to compile.");
  }
  return re;
}
// Takes the compiled pattern by reference to avoid copying the regex_t.
bool match(const regex_t &re, const std::string &text) {
  return regexec(&re, text.c_str(), 0, NULL, 0) == 0;
}
} // namespace regex_detail
long long readInt(long long lo, long long hi) {
  // stoll is horribly insufficient, so we use a regex. The group makes both
  // anchors apply to each alternative.
  static regex_t re = regex_detail::compile("^(0|-?[1-9][0-9]*)$");
  std::string token = readToken();
  assertWA(regex_detail::match(re, token));

  long long parsedInt;
  try {
    parsedInt = stoll(token);
  } catch (const std::invalid_argument &) {
    exitWA();
  } catch (const std::out_of_range &) {
    exitWA();
  }
  assertWA(lo <= parsedInt && parsedInt <= hi);
  return parsedInt;
}
template <typename T>
std::vector<T> readIntArray(size_t N, long long lo, long long hi) {
  std::vector<T> arr;
  arr.reserve(N);
  for (size_t i = 0; i < N; i++) {
    arr.push_back(readInt(lo, hi));
    if (i != N - 1) {
      readSpace();
    }
  }
  readNewLine();
  return arr;
}
long double readFloat(long double min, long double max,
                      long double eps = 1e-9) {
  // stold is horribly insufficient, so we use a regex. The integer part is
  // either 0 or any number of digits without a leading zero.
  static regex_t re = regex_detail::compile("^-?(0|[1-9][0-9]*)(\\.[0-9]+)?$");
  std::string token = readToken();
  assertWA(regex_detail::match(re, token));
  long double parsedDouble;
  try {
    parsedDouble = stold(token);
  } catch (const std::invalid_argument &) {
    exitWA();
  } catch (const std::out_of_range &) {
    exitWA();
  }
  assertWA(min - eps <= parsedDouble && parsedDouble <= max + eps);
  return parsedDouble;
}
// This is a hook for when the reader functions error, including `assertWA`. Use
// it to do custom handling when an error happens, such as overriding WA with
// partials, or printing a flag like `-1` in interactors.
void errorHook() {}

// If this is a checker, use the following line in main(int argc, char** argv)
// to replace stdin with the user output.
// freopen(argv[2], "r", stdin);