Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Honor new-script markers in the preprocessor #2615

Merged

Conversation

fernewelten
Copy link
Contributor

@fernewelten fernewelten commented Dec 11, 2024

As a first attempt to mitigate #2609

The preprocessor tracks double-quoted strings that begin in the first column and start with __NEWSCRIPTSTART_. Whenever it detects such a string, it resets the line numbering and begins a new script. The new script name is whatever follows the new-script marker.

In consequence, preprocessor errors in dialogs will be reported to the Editor in such a way that you can double-click the error message to get to the dialog that has the error.

Moreover, when the preprocessor encounters an unterminated string, it will state the missing character: ' or ".

NOTE: AGS has two preprocessor files: Preprocessor.cs in the Editor project and preprocessor.cpp in the "classic" compiler project. I've tried to modify both in tandem , but the modifications in the "classic" compiler project are untested – mainly because I don't know how to set up appropriate tests for that compiler and preprocessor .

@@ -13,6 +13,7 @@
//=============================================================================
#include <algorithm>
#include <cctype>
#include <string>
Copy link
Contributor

@ivan-mogilko ivan-mogilko Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a wrong header for "strlen", should be <string.h> instead.

Alternatively, there's std::strlen in header <cstring>.

@@ -245,7 +245,9 @@ namespace Preprocessor {
size_t endOfString = FindIndexOfMatchingCharacter(text, i, text[i]);
if (endOfString == NOT_FOUND) //size_t is unsigned but it's alright
{
LogError(ErrorCode::UnterminatedString, "Unterminated string");
std::string msg = "Unterminated string: <char> is missing";
msg = msg.replace(msg.find("<char>"), 6, std::string(1, text[i]));
Copy link
Contributor

@ivan-mogilko ivan-mogilko Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting may be done using our String class, see String::Format or String::FromFormat.

like

String msg = String::FromFormat("Unterminated string: '%c' is missing", text[i]);
LogError(ErrorCode::UnterminatedString, msg);

@ivan-mogilko
Copy link
Contributor

I confirm that C# Preprocessor works as expected. This seems to resolve both the problem reported by #2609 (misleading error message) and #2389 (preprocessor not reporting an actual dialog).

@ivan-mogilko
Copy link
Contributor

ivan-mogilko commented Dec 16, 2024

In regards to the C++ preprocessor, the changes seem to fix the reporting of unterminated string, but neither the correct line counter, nor the section name are included into the reported error.

The _lineNumber and _scriptName that you set in code are simply not propagated further into LogError. Instead it uses currentline global variable internally, which is not adjusted here. I suppose the quick fix would be to also reset currentline where you reset _lineNumber, but I do not remember how the section name is included in the classic compiler's error struct.

Maybe this could be left out from this pr and dealt with separately.
EDIT: I double checked, it's been written into another global variable called ccCurScriptName. :(

After this is done in ags4 branch, the changes need to be backported to ags3 too.


In any case, following are the tests which I propose to add to preprocessor tests.
They may be added in here:
https://github.com/adventuregamestudio/ags/blob/ags4/Compiler/test/preprocessor_test.cpp

TEST(Preprocess, UnterminatedStringSingleQuote) {
    Preprocessor pp = Preprocessor();
    const char* inpl = R"EOS(
int Func1()
{
    'unterminated string
    return 0;
}
)EOS";

    clear_error();
    String res = pp.Preprocess(inpl, "UnterminatedStringSingleQuote");

    EXPECT_STREQ(last_seen_cc_error(), "Unterminated string: ' is missing");
    auto cc_err = cc_get_error();
    // script lines are 1-based, because line 0 is a NEW SCRIPT MARKER added by preproc
    EXPECT_EQ(cc_err.Line, 4);
}

TEST(Preprocess, UnterminatedStringDoubleQuote) {
    Preprocessor pp = Preprocessor();
    const char* inpl = R"EOS(
int Func1()
{
    "unterminated string
    return 0;
}
)EOS";

    clear_error();
    String res = pp.Preprocess(inpl, "UnterminatedStringDoubleQuote");

    EXPECT_STREQ(last_seen_cc_error(), "Unterminated string: \" is missing");
    auto cc_err = cc_get_error();
    // script lines are 1-based, because line 0 is a NEW SCRIPT MARKER added by preproc
    EXPECT_EQ(cc_err.Line, 4);
}

TEST(Preprocess, MultipleNewScriptMarkers) {
    Preprocessor pp = Preprocessor();
    const char* inpl = R"EOS(
\"__NEWSCRIPTSTART_Dialog1\"
int Func1()
{
    return 0;
}
\"__NEWSCRIPTSTART_Dialog2\"
int Func2()
{
    "unterminated string
    return 0;
}
\"__NEWSCRIPTSTART_Dialog3\"
int Func3()
{
    return 0;
}
)EOS";

    clear_error();
    String res = pp.Preprocess(inpl, "MultipleNewScriptMarkers");

    EXPECT_STREQ(last_seen_cc_error(), "Unterminated string: \" is missing");
    auto cc_err = cc_get_error();
    EXPECT_STREQ(ccCurScriptName, "Dialog2");
    // Line count starts from the nearest NEW SCRIPT MARKER
    EXPECT_EQ(cc_err.Line, 3);
}

@ivan-mogilko
Copy link
Contributor

ivan-mogilko commented Dec 16, 2024

To summarize, the quick fix for the C++ preprocessor would be to assign 2 global variables called currentline and ccCurScriptName along with the _scriptName and _lineNumber:

// Start the new script
_scriptName = literal.Mid(strlen(NEW_SCRIPT_TOKEN_PREFIX));
_lineNumber = 0;
ccCurScriptName = _scriptName;
currentline = 0;

In the long run it would be better to refactor the code and get rid of global variables which contain error reporting details etc. Just like the new compiler does now.

@AlanDrake AlanDrake assigned AlanDrake and unassigned AlanDrake Dec 16, 2024
@fernewelten
Copy link
Contributor Author

fernewelten commented Dec 17, 2024

Updating the PR:

  • Rewrite preprocessor.cpp so that it sets currentlineand ccCurScriptName immediately before calling cc_error() and only there, to facilitate a later rewriting of the class for another error reporting mechanism.
  • Include @ivan-mogilko 's Googletests
  • Use AGS String and String::FromFormat to format the error messages, as per @ivan-mogilko 's suggestion.
  • Fix the #include

// TODO: Do away with '_scriptOfError' when 'Preprocessor'
// is refactored so that it doesn't use 'cc_error()' for
// error reporting
static String _scriptOfError;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to mention previously, but I pushed a fix to the ags4 branch, and you do not need a static buffer for the script name anymore:
3facb6a

This may be removed if you rebase on top of the ags4 branch.

@@ -25,7 +25,7 @@ namespace Preprocessor {
std::vector<AGSString> SplitLines(const AGSString& str)
{
std::vector<AGSString> str_lines = str.Split('\n');
for (int i = 0; i < str_lines.size(); i++) {
for (int i = 0; i < (int) str_lines.size(); i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix this other way around, make "i" variable be of size_t type instead.

Also, when the preprocessor finds unterminated strings, it tells what character is missing (quote or double-quote sign)
Also, when the preprocessor finds unterminated strings, it tells what character is missing (quote or double-quote sign)
- Include 'string.h' instead of 'string'
- Delete external 'currentline' declaration, is already in 'cc_common''
- Introduce static buffer for script name for use in 'cc_error()'
- Set 'currentline' and 'ccCurScriptName' before 'cc_error()' is called, and _only_ there
- Use 'String::FromFormat()' for formatting the error messages

Googletests:
- Add external reference to 'ccCurScriptName'
- Add googletests for multiple scripts in one file; unterminated strings
- Add conversion to shut up 'signed/unsigned' warning
Also, remove '(int)' cast in favour of 'size_t' loop var
@fernewelten fernewelten force-pushed the Editor_NewScriptStart branch from 3eff26a to 3f82698 Compare December 17, 2024 16:38
@fernewelten
Copy link
Contributor Author

Done. (Had to force-push due to the rebase)

@ivan-mogilko ivan-mogilko merged commit e7075a6 into adventuregamestudio:ags4 Dec 17, 2024
21 checks passed
@fernewelten fernewelten deleted the Editor_NewScriptStart branch December 17, 2024 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants