NermNermNerm/LocalizeFromSource

Localization From Source

This library aims to simplify the process of localizing Stardew Valley mods in several ways:

  • Dramatically lower the difficulty of finding and fixing missing or out-of-date translations.
  • Make initial translations more accurate by giving an easy pathway for translators to see exactly how strings are used in the source code.
  • Make the process of converting an English-only mod into a localizable one far less painful.
  • Ensure that translations are complete by detecting un-localized strings at build-time.

With this package, you go from having a de.json that looks like:

{
    // Hausveranstaltungen
    "help-with-monster": "Helfen! Helfen!  Da ist ein Monster im Keller!",

to this:

// Please help make this translation better!  Search for '>>>' to find known issues with the
// translation. When you've corrected everything and tested it locally, send your copy of this
// file back to the mod's author for integration into the mod.  Instructions on how to integrate
// it into the source repository can be found here:
// https://github.com/NermNermNerm/LocalizeFromSource?tab=readme-ov-file#ingesting-translations
//
// Built from commit: 31384eb349d8b6c85a1d42e910e9da90b752daf4  !! DO NOT DELETE THIS LINE !!
{
    // https://github.com/NermNermNerm/Junimatic/blob/b591d6f87c8cabae1ab165116c0fedb1b6475c60/Junimatic/ModConfigMenu.cs#L38
    // >>>SOURCE STRING CHANGED - originally translated by nexus:playerone on 5/6/2021
    // old: "Help! Help!  There is a monster in the basement!"
    // new: "Help! Help!  There is a monster in the attic!"
    "help-with-monster": "Helfen! Helfen!  Da ist ein Monster im Keller!",

The translator now gets context on exactly what is wrong with the translation, plus a pointer to the file on GitHub so they can research what else changed when the monster moved into the attic. For the mod owner, the built-from-commit line and the analysis done when the translation is ingested mean they can be confident that the translation is complete.

The next feature of this package is mainly of interest to mods that haven't been translated yet or expect to add a substantial amount of new code. Instead of localizing by taking something like this:

Game1.addHUDMessage(new HUDMessage("The sun seems a little brighter today."));

and then cutting and pasting the string into default.json and changing the C# to:

Game1.addHUDMessage(new HUDMessage(I18n.HudMessage_SunBrighter));

With this package, there's no fussing with JSON or cut&paste or trying to think up a good key, it's just wrapping the string with L(), like so:

Game1.addHUDMessage(new HUDMessage(L("The sun seems a little brighter today.")));

Not only does that make life easier for you when converting your mod, it also makes it easier for localizers, because the GitHub links generated by the previously-described system can link directly into the code. Instead of you having to come up with clever keys that might give your translators a clue as to how the translation is used, they can follow the link back to the code and see all the context. Take the key we chose for this example, "HudMessage_SunBrighter" - does that seem like a good key to you? Perhaps so. It certainly tells the translator that this is a HUD message and not, say, a dialog. That's good to know, but it gives no clue as to what might have caused the sun to be brighter. When it comes to accurate localization, nuance is often everything. Is the sun actually brighter? Or is this just an expression to convey a sense of optimism? If it's an expression, does a literal translation convey the same meaning in all cultures? Would a brighter sun be welcome in Saudi Arabia? You could try to capture that in your choice of key, but it's extremely difficult to anticipate everything that might be different across all cultures.

Quick Start

Note that these instructions are more difficult than they ultimately should be - the difficulty stems from the ModBuildConfig package, which hard-codes where the i18n folder is. If we can get that to be parameterized, we can do much better because we can put the generated i18n folder in the build output, and then have ModBuildConfig package it from there. But meanwhile, this will work.

  1. Install the NermNermNerm.Stardew.LocalizeFromSource NuGet package.

  2. Create a directory named i18nSource.

  3. Copy or move i18n\default.json to i18nSource

  4. For each language translation file in your i18n folder, run a command like this, substituting your file for de.json and, if you know who supplied the translation, an identifier for that author as the TranslationAuthor, e.g. 'github:NermNermNerm' or 'nexus:NermNermNerm'. Note that if you use a source that starts with 'automation', like 'automation:googletranslate', then every entry in the file will be flagged as coming from automation, signalling that a human really ought to have a look at them.

    dotnet build /t:IngestTranslation "/p:TranslatedFile=.\i18n\de.json;TranslationAuthor=github:id"

  5. delete i18n*

  6. git add --all .

  7. Add 'i18n' to your .gitignore

  8. git commit

This should result in a commit that removes everything (under source control) from i18n and adds everything to 'i18nSource'. Building will recreate the 'i18n' folder with all the annotations described above.

That just gets you started if you have an existing mod and are only interested in the annotated translation files. Later sections cover how to use more of the package's features to help you localize.

Note that when you receive updated translation files from the community, you run that build command on the file that you were given. That will merge the changes into the i18nSource folder for you.

Localizing from the Source

The traditional way to localize a Stardew Valley mod involves finding every localizable string (by eye), copying them to the default.json file (while being careful to escape for json), inventing a key, and changing the code to use that key... and then retesting the entire mod carefully because you could have goofed up any of those manual steps. With this system, you just mark all your localizable strings with an L, like so:

ModEntry.AddQuestItem(
    objects,
    OldJunimoPortalQiid,
    L("a strange little structure"),
    L("At first it looked like a woody weed, but a closer look makes it like a little structure, and it smells sorta like the Wizard's forest-magic potion."),
    0);

A compile-time step takes care of populating the default.json file. Note that the L isn't magic; it works only if you have this line in your using block:

using static NermNermNerm.Stardew.LocalizeFromSource.SdvLocalize;

The other big advantage from using the L() syntax is that we can have links from the generated language files back to the source code so that translators can get the full context on how a string is being used.

Mixed code and localizable strings

In Stardew Valley, there are cases where what amounts to game code is mixed into a localizable string.

else if (e.NameWithoutLocale.IsEquivalentTo("Data/Quests"))
{
    e.Edit(editor =>
    {
        IDictionary<string, string> data = editor.AsDictionary<string, string>().Data;
        data[MeetLinusAtTentQuest] = "Basic/Find Linus At His Tent/Linus said he had something he needed your help with./Go to Linus' tent before 10pm/null/-1/0/-1/false";
        data[MeetLinusAt60Quest] = "Basic/Meet Linus At Level 60/Linus had something he wanted to show you at level 60 of the mines./Follow Linus to level 60/null/-1/0/-1/false";
        data[CatchIcePipsQuest] = "Basic/Catch Six Ice Pips/Catch six ice pips and put them in the mysterious fish tank.//null/-1/0/-1/false";
    });
}

With this system, you could just wrap those strings in L() and call it a day. At that point you'd be pretty much where you would be with the traditional localization system. With this package, you can take it one step further and wrap it in SdvQuest, like so:

else if (e.NameWithoutLocale.IsEquivalentTo("Data/Quests"))
{
    e.Edit(editor =>
    {
        IDictionary<string, string> data = editor.AsDictionary<string, string>().Data;
        data[MeetLinusAtTentQuest] = SdvQuest("Basic/Find Linus At His Tent/Linus said he had something he needed your help with./Go to Linus' tent before 10pm/null/-1/0/-1/false");
        data[MeetLinusAt60Quest] = SdvQuest("Basic/Meet Linus At Level 60/Linus had something he wanted to show you at level 60 of the mines./Follow Linus to level 60/null/-1/0/-1/false");
        data[CatchIcePipsQuest] = SdvQuest("Basic/Catch Six Ice Pips/Catch six ice pips and put them in the mysterious fish tank.//null/-1/0/-1/false");
    });
}

That tells the system that the string is a quest, and it will actually add just the localizable parts to default.json. That is, in this case, it wouldn't add "key": "Basic/Find Linus..., it will add "key1": "Find Linus At His Tent" and "key2": "Linus said... and so on. The win here is that if you ever have to update something in the non-localizable part of the string, you can do so without having to make the same change across all your languages.
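To make the idea concrete, here is a minimal sketch of splitting a quest descriptor into its player-visible parts. It is not the package's implementation: the class and method names are made up for illustration, and it assumes that fields 1 to 3 of a slash-separated Data/Quests entry (title, description, objective) are the localizable ones.

```csharp
using System;
using System.Collections.Generic;

public static class QuestFieldSketch
{
    // Assumption: in a Data/Quests entry, slash-separated fields 1-3
    // (title, description, objective) are the player-visible ones.
    private static readonly int[] LocalizableFieldIndexes = { 1, 2, 3 };

    // Returns the non-empty player-visible fields of a quest descriptor.
    public static List<string> ExtractLocalizableParts(string questDescriptor)
    {
        string[] fields = questDescriptor.Split('/');
        var parts = new List<string>();
        foreach (int index in LocalizableFieldIndexes)
        {
            // Skip fields that are missing or empty (e.g. a quest with no objective).
            if (index < fields.Length && fields[index].Length > 0)
            {
                parts.Add(fields[index]);
            }
        }
        return parts;
    }
}
```

Fed the "Find Linus At His Tent" descriptor, this yields the title, the description and the objective; the type, "null", the numbers and "false" stay in the code where translators never see them.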

For quests, that's nice. But there's also an SdvEvent method, and, well, I think the power of that is self-evident.

Ensuring everything that should be localized is localized

Pseudo-Localization

Generally, you can eyeball when a string should be localized, and you can do a pretty good job of finding them that way. One quick and easy technique for spotting strings that should have been localized but weren't is called "pseudo-localization". That's where you take each localized string and add umlauts and accents to the English text, so that it's easy to see at a glance which strings went through localization. It also helps you find over-localization - that is, cases where you marked a string as localizable but really shouldn't have.

This package turns that on by default for DEBUG builds. You can control that behavior in your ModEntry:

public override void Entry(IModHelper helper)
{
    SdvLocalize.Initialize(this, doPseudoLocInDebug: false);
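If you haven't seen pseudo-localization before, the transformation itself can be sketched in a few lines. This is only an illustration of the technique, not the code this package uses; the class name and the particular accent mapping are made up.

```csharp
using System.Collections.Generic;
using System.Text;

public static class PseudoLocSketch
{
    // Hypothetical mapping: swap plain vowels for accented look-alikes.
    private static readonly Dictionary<char, char> Accented = new Dictionary<char, char>
    {
        ['a'] = 'á', ['e'] = 'é', ['i'] = 'í', ['o'] = 'ö', ['u'] = 'ü',
        ['A'] = 'Á', ['E'] = 'É', ['I'] = 'Í', ['O'] = 'Ö', ['U'] = 'Ü',
    };

    // Returns the text with every mapped vowel replaced; all other characters are kept.
    public static string PseudoLocalize(string source)
    {
        var result = new StringBuilder(source.Length);
        foreach (char c in source)
        {
            result.Append(Accented.TryGetValue(c, out char replacement) ? replacement : c);
        }
        return result.ToString();
    }
}
```

With this mapping, "The sun" comes out as "Thé sün": still readable, but obviously different from any string that bypassed localization. A real implementation would also need to leave placeholders such as {{arg0}} untouched, which this sketch does not attempt.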

Static Analysis

This package also can do static analysis to help ensure that you've got everything. Turn on strict mode by adding a LocalizeFromSource.json file to your project:

{
    "isStrict": true,
    "invariantStringPatterns": [
        // dot and slash-separated identifiers
        "^\\w+[\\./\\\\_][\\w\\./\\\\_]*\\w$",

        // camelCase or PascalCase identifiers - identified by all letters and digits with a lowercase letter or digit right before an uppercase one.
        "^\\w*[\\p{Ll}\\d][\\p{Lu}]\\w*$",

        // Qualified item id's (O)blah.blah or (BC)Chest
        "^\\([A-Z]+\\)[\\p{L}\\d\\.]+$"
    ],
    "invariantMethods": [
        "StardewValley.Farmer.getFriendshipHeartLevelForNPC",
        "StardewValley.Game1.playSound"
    ]
}

The other properties assist in weeding out false-positives. invariantStringPatterns allows you to describe strings that can be mechanically identified as non-localizable. invariantMethods allows you to list methods that take string identifiers as arguments.

Note: the patterns shown in this example are actually already in the default list of invariant patterns for Stardew Valley. Similarly for the method list. They're just listed in this example to give you a clearer idea of what to put here. A good starting point would be { "isStrict": true, "invariantStringPatterns": [], "invariantMethods": [] }
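Before adding a pattern of your own, it can help to try it against a few strings from your mod. The sketch below does that for the three patterns from the example above, with the JSON escaping removed. The class and method names are hypothetical, and this only exercises the regular expressions themselves; it says nothing about how the package applies them.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class InvariantPatternSketch
{
    // The three example patterns, written as C# verbatim strings
    // (so each JSON "\\" becomes a single "\").
    private static readonly Regex[] Patterns =
    {
        new Regex(@"^\w+[\./\\_][\w\./\\_]*\w$"),   // dot and slash-separated identifiers
        new Regex(@"^\w*[\p{Ll}\d][\p{Lu}]\w*$"),   // camelCase or PascalCase identifiers
        new Regex(@"^\([A-Z]+\)[\p{L}\d\.]+$"),     // qualified item ids
    };

    // True if any of the patterns matches the whole candidate string.
    public static bool LooksInvariant(string candidate)
    {
        return Patterns.Any(pattern => pattern.IsMatch(candidate));
    }
}
```

Here "Data/Quests", "playSound" and "(BC)Chest" are all matched, while "Abigail" and " Chicken" are not, which is why single-word identifiers like those need invariantMethods or I() instead.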

There will always be cases that you can't mechanically identify or it's just not worth the hassle of editing the json. For those cases, you can use I(), similar to L:

var c = farmAnimals.Values.Count(a => a.type.Value.EndsWith(I(" Chicken")));

Format strings

The perils of String.Format(identifier, x, y) have been known for a long time, and a number of remedies have been devised with varying degrees of effectiveness. Interpolated strings are certainly one of the most powerful of them, and this package aims to exploit them. Alas, it can't quite be done seamlessly: you have to use LF instead of just L, like so:

quest.currentObjective = LF($"{count} of 6 teleported");

If you do that, then you'll see a string like this generated in your default.json: "{{arg0}} of 6 teleported".

Note that there is also a version for invariant strings, IF.

Why can't we just use L? It's a long story involving how the compiler works. If left to its own devices, $"{count} of 6 teleported" gets turned into instructions that look a lot like allocating a StringBuilder and appending the count and the string to it. If, on the other hand, you pass an interpolated string to a method that takes a System.FormattableString as an argument, then the compiler constructs such an object and passes that to the method. That's the behavior that LF is counting on.

"Aha!" I hear you say. "Just have an overload of L that takes a string and another that takes FormattableString! Problem solved!" Ah, would that it were true. If you do that, you'll find that the FormattableString overload never gets called. That's because, as far as overload resolution is concerned, an interpolated string is a string first: the string overload is an exact match, while the FormattableString overload needs an implicit conversion, and the compiler always prefers the exact match. If you know a way to beat that, please raise an Issue! It'd sure be nice if it could be overloaded!
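You can see both behaviors for yourself with a few lines of plain C#. Nothing here comes from this package; the class and method names are made up purely to demonstrate what the compiler does.

```csharp
using System;

public static class FormattableStringSketch
{
    // Two overloads, like the hypothetical "overloaded L".
    public static string Which(string s) => "string";
    public static string Which(FormattableString f) => "FormattableString";

    // A method with only a FormattableString parameter, like LF.
    public static string FormatOf(FormattableString f) => f.Format;

    // The compiler binds this call to the string overload.
    public static string CallOverloaded(int count) => Which($"{count} of 6 teleported");

    // With no string overload available, the compiler builds a FormattableString.
    public static string CallFormattableOnly(int count) => FormatOf($"{count} of 6 teleported");
}
```

CallOverloaded(3) returns "string", while CallFormattableOnly(3) returns the format "{0} of 6 teleported", with the count held separately as an argument. That separation between the constant format and its arguments is exactly what a localization tool needs.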

Note that a .NET FormattableString uses the String.Format style of formatting, e.g. "{0}", and it shows up in the i18n\*.json files as "{{arg0}}". That's because this package translates from the .NET style to the SDV style. The main reason it does that is simply to make it familiar to translators who have done SDV translations in the past and, in the event there's any tooling out there, to make it compatible with that too.

The idea of using "{{name}}" rather than "{0}" in SDV has a couple of purposes, one being to give translators a little bit of context on what placeholders mean. For open source mods using this package, you probably don't need to do that, because translators who need more context can now follow the link and look at the source code, which provides far more detail on what a placeholder might be replaced with than any single identifier can. However, if you really feel it's important, you can pick your own names (rather than 'arg0', 'arg1' and so on) by writing your string like this: $"{count}|count| of 6 teleported". Basically, you just put the name you want to use (an identifier), surrounded by pipe characters, immediately after the placeholder. That will make it show up as "{{count}} of 6 teleported" in the default.json.
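As a rough illustration of that translation, here is a sketch that turns a .NET format string into the SDV style, honoring the optional |name| suffix. It is an assumption about how such a conversion could be written, not the package's actual code; the names are hypothetical, and it ignores format specifiers like {0:N2} and escaped braces.

```csharp
using System.Text.RegularExpressions;

public static class PlaceholderSketch
{
    // Matches "{0}", optionally followed immediately by "|name|".
    private static readonly Regex Placeholder = new Regex(@"\{(\d+)\}(?:\|(\w+)\|)?");

    // Converts "{0}" to "{{arg0}}" and "{0}|count|" to "{{count}}".
    public static string ToSdvStyle(string dotNetFormat)
    {
        return Placeholder.Replace(dotNetFormat, match =>
        {
            string name = match.Groups[2].Success
                ? match.Groups[2].Value             // explicit name between the pipes
                : "arg" + match.Groups[1].Value;    // default: arg0, arg1, ...
            return "{{" + name + "}}";
        });
    }
}
```

So "{0} of 6 teleported" becomes "{{arg0}} of 6 teleported", and "{0}|count| of 6 teleported" becomes "{{count}} of 6 teleported".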

Installing and using the package

  1. Install the 'NermNermNerm.Stardew.LocalizeFromSource' NuGet package.

  2. In your .csproj file, remove the Pathoschild.Stardew.ModTranslationClassBuilder package if you plan on using the L() syntax and not the manual default.json.

  3. In your ModEntry, add this line to hook up the translator:

    public override void Entry(IModHelper helper)
    {
        SdvLocalize.Initialize(this);

  4. This step is not actually particular to using this library for localization. SDV changes the Locale to the one selected by the user only after some assets are already loaded. To my knowledge, this includes objects and buildings and crafting recipes (but recipes don't contain any localized text so they don't matter). So, if you have any custom objects or buildings, add a line like this in your Entry method as well:

    this.Helper.Events.Content.LocaleChanged += (_,_) => this.Helper.GameContent.InvalidateCache("Data/Objects");

  5. In each C# file that contains strings that should be translated, add this to the using block:

    using static NermNermNerm.Stardew.LocalizeFromSource.SdvLocalize;

  6. Wrap each translatable string in L if it is a plain string, or in LF if it is a format string. (And, if you are using String.Format, convert those calls to interpolated strings, like $"x is {x}".)

If you do all these things and compile, you should see that a default.json file was generated. Indeed, if somebody supplies you with a language file (like a de.json file) that translates all the values in default.json, it should just work if you switch languages.

Common pitfalls

The argument to L MUST be a constant string and likewise LF must take a constant format string.

string s = "world";
spew(L("hello")); // Okay
spew(L($"hello {s}")); // Not okay
spew(LF($"hello {s}")); // Okay, but if s is a string, ensure it's localized appropriately!

spew(L(condition ? "hello" : "world")); // Not okay.
spew(condition ? L("hello") : L("world")); // Okay

Note that this package has both compile-time and run-time elements. All the faults above generate compile-time errors.

Quests and Events

There are special versions of L for use with quest and event descriptors. The quest one is straightforward:

else if (e.NameWithoutLocale.IsEquivalentTo("Data/Quests"))
{
    e.Edit(editor =>
    {
        IDictionary<string, string> data = editor.AsDictionary<string, string>().Data;
        data[MeetLinusAtTentQuest] = SdvQuest("Basic/Find Linus At His Tent/Linus said he had something he needed your help with./Go to Linus' tent before 10pm/null/-1/0/-1/false");
        data[MeetLinusAt60Quest] = SdvQuest("Basic/Meet Linus At Level 60/Linus had something he wanted to show you at level 60 of the mines./Follow Linus to level 60/null/-1/0/-1/false");
        data[CatchIcePipsQuest] = SdvQuest("Basic/Catch Six Ice Pips/Catch six ice pips and put them in the mysterious fish tank.//null/-1/0/-1/false");
    });
}

It's just a matter of using SdvQuest instead of L. The advantage of this is that it will produce localizable strings for each part of the quest in the default.json instead of the whole thing, so if ever you need to tweak the non-localizable parts, you can do that without having to mess with the translations.

Events can be that simple too, depending on where you store your event code. The approach this system supports is the one where you paste your event code directly into your C#.

        private void EditFarmHouseEvents(IDictionary<string, string> eventData)
        {
            eventData[IF($"{MiningJunimoDreamEvent}/H/sawEvent {ReturnJunimoOrbEvent}/time 600 620")]
                = SdvEvent($@"grandpas_theme/
-2000 -1000/
farmer 13 23 2/
skippable/
fade/
addTemporaryActor Grandpa 1 1 -100 -100 2 true/
specificTemporarySprite grandpaSpirit/
viewport -1000 -1000 true/
pause 8000/
speak Grandpa ""My dear boy...^My beloved grand-daughter...#$b#I am sorry to come to you like this, but I had to thank you for rescuing my dear Junimo friend.#$b#He protected me at a time when my darkest enemy was my own failing mind.#$b#In better days, he helped me with my smelters and other mine-related machines.  He will help you too; he really enjoys watching the glow of the fires!#$b#I rest much easier now knowing that my friend is safe.  I am so proud of you...""/playmusic none/
pause 1000/
end bed");
        }

You see that there's an SdvEvent wrapping the event descriptor. This method insists that the value be an interpolated string, simply because you might be injecting some code in there. (For example, maybe you have a custom event command and have a constant for the name of your command.)

Like with quests, SdvEvent parses the event code and looks for localizable things within that code and puts just that into the default.json. The implementation as of now is pretty dumb. It just looks for things between quotes, filters out things that look like identifiers and/or paths, and puts them into default.json. It's true that there are a finite number of stock commands that require localized text (e.g. the speak in this example), but there are custom event commands that we can't know about. The upshot is to take a careful look at your event code and ensure that nothing is quoted that doesn't need to be and that all dialog is quoted. You can also look at your generated default.json file to see if anything incorrect showed up there. If it does, it's not the end of the world, just make sure your translators know to leave that string alone. And, of course, raise an Issue about it so that better solutions can be imagined.
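To get a feel for what a quote-based heuristic like that will and won't pick up from your own events, you can play with a sketch like the one below. It is a guess at the general shape of such a scan, not the package's implementation: the class name, the filter pattern and the sample event string in the follow-up are all made up.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EventTextSketch
{
    // Anything between a pair of double quotes.
    private static readonly Regex Quoted = new Regex("\"([^\"]*)\"");

    // Hypothetical filter: text made only of word characters, dots and
    // slashes (no spaces) is treated as an identifier or path.
    private static readonly Regex IdentifierOrPath = new Regex(@"^[\w./\\]+$");

    // Returns the quoted segments that look like player-visible text.
    public static List<string> FindLocalizableText(string eventCode)
    {
        var found = new List<string>();
        foreach (Match match in Quoted.Matches(eventCode))
        {
            string text = match.Groups[1].Value;
            if (text.Length > 0 && !IdentifierOrPath.IsMatch(text))
            {
                found.Add(text);
            }
        }
        return found;
    }
}
```

For a made-up event like speak Grandpa "My dear boy..."/playSound "doorClose"/end, this keeps "My dear boy..." and drops "doorClose". It also shows the failure mode described above: a one-word line of dialog with no punctuation would be filtered out as if it were an identifier, and any quoted multi-word argument to a custom command would be picked up as text.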

Enabling 'Strict' mode

First off let's be clear here: This package is using heuristics and user-supplied clues to sort out what needs to be localized and what doesn't. Like any static analysis tool, it's not foolproof, and turning this mode on will open you up to some daily friction in your coding. This package has a lot of tools designed to reduce that friction, but it'll never zero it out. Is it worth it? That's up to you. Also, the tools we're describing here have the potential to overdo it - meaning that they may confuse a localizable string for an invariant one, causing mistakes. Again, it's up to you. Use these tools carefully.

With those disclaimers out of the way, let's get on with it. The first thing you should do is to add L and LF strings just by inspection. The goal of doing this first is to, as much as possible, make your first build in strict mode just show the false-positives. Once that's done and compiling well, create a file called LocalizeFromSource.json in the same folder as your .csproj file:

{
    "isStrict": true,
    "invariantStringPatterns": [],
    "invariantMethods": []
}

The first build after that will display all the findings. Note that the errors spewed in this way are using a relatively primitive means of reporting errors in Visual Studio, and so the line numbers will not keep up with changes in the source code, so beware of that. In any case, use this first build to just get an overview of what you need to change. You'll probably be able to assemble them into these categories:

  • String identifiers that are passed to methods. Strict mode, by default, has some ways of identifying string identifiers (e.g. looking for camelCase or path/characters...) but there are cases like Abigail (where there's just one word) that aren't clear-cut enough. For these cases, particularly APIs that you use several times in your mod, consider adding a line to the invariantMethods list with the fully-qualified name of the method. The package maintains a prefab list of such methods (e.g. playSound), but it's far from comprehensive. If you come up with one or more of these, consider creating a pull request to add them to the stock list.
  • Logging. There's a case to be made to localize the logging, to make it easier for players to read the log, but the cost is that it makes it so that you have a hard time reading the log. It also makes it so that the players have a hard time searching forums and so forth for solutions to whatever problem it is that they're experiencing. In a world where machine translation and web searches exist, it's probably better to make your logging in the source language. If you've rolled some of your own logging functions, what you can do is convert them to always taking FormattableString as an argument (rather than plain string) and adding ArgumentIsCultureInvariant as an attribute to the method. Conversion to formattable string as an argument will force you to add a $ in front of all the plain strings you call it with. It's not a great solution, but it feels better than having two different overloads for your logging function.
  • Exception messages are a similar story to logging. It's going to be case-by-case. If you use exceptions within your mod as a means to pass error messages to the user, then they should be localized. However, most exception messages only ever land in the log file, and so are covered by what we said about Logging. Given the case-by-case nature of the thing, it's perhaps best to just go ahead and mark the messages with an I or L and not try and automate the messages away.
  • Methods and classes that just have a ton of non-localized strings in them that aren't ever going to have any localized strings. It happens. You can use the NoStrict attribute on methods and classes like this and it will disable strict-mode just for those methods/classes.
  • Recognizable strings. It could be that there's a string pattern that you use in your code that is mechanically identifiable and will never be localizable. For these, you can write a regular expression to recognize them (a .NET regular expression, escaped to fit in a JSON file) and add that to the invariantStringPatterns array. Be exceedingly careful with these patterns and make sure they don't catch things they shouldn't. The compiler can get confused and make mistakes with LF if these match things they shouldn't.

But there will be a broad set of cases that just don't fit into any of these categories and so you will doubtless end up with a few I and IF calls sprinkled through your code. Hopefully these instances will have some beneficial effect in highlighting the nature of these strings and maybe making the code a little more readable rather than less so.
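Here is roughly what the attribute-based options from the list above might look like in code. Treat it as a sketch under heavy assumptions: the two attribute classes are stand-ins declared locally so the example compiles on its own (in a real mod they come from this package, and their exact names, namespace and targets may differ), and the logging and lookup methods are invented for illustration.

```csharp
using System;

// Stand-in declarations so this sketch compiles by itself. In a real mod
// these attributes come from the package; their exact shape is an assumption.
[AttributeUsage(AttributeTargets.Method)]
public sealed class ArgumentIsCultureInvariantAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class)]
public sealed class NoStrictAttribute : Attribute { }

public static class StrictModeSketch
{
    // A home-grown logging helper: it takes a FormattableString, so every call
    // site passes $"...", and the attribute marks the argument as not localizable.
    [ArgumentIsCultureInvariant]
    public static string FormatLogLine(FormattableString message)
    {
        return "[MyMod] " + FormattableString.Invariant(message);
    }

    // A method that is nothing but identifiers: opt it out of strict mode.
    [NoStrict]
    public static string[] KnownNpcNames()
    {
        return new[] { "Abigail", "Linus", "Wizard" };
    }
}
```

A call like FormatLogLine($"loaded {count} items") then needs neither L nor I, and the single-word names in KnownNpcNames() don't each need an I() wrapper.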

Ingesting Translations

Translators are encouraged to use the language files produced by this package in the same way that they have in the past. The new files just provide more context. However, in this new world, we maintain the translation files as a richer dataset than the old key and value pairs. We also record who did the translation, when they did it, and exactly what it was that they translated at the time. To make that happen, when a translator supplies a new file, you need to "ingest" it.

Here are the steps you follow after a new or updated translation file arrives. Note that if you're a translator and are also a developer, you can do these steps yourself to help the mod author out:

  1. Store the translation file given to you by the translator somewhere convenient. To reduce confusion, it's best if it's outside your source repo.

  2. Review the file you've been given. Check that the 'built from commit' message is still there. Also search for '>>>' and ensure that it looks like the translator hasn't missed anything.

  3. Ensure you've cleaned your repo (e.g. commit or stash anything you happen to be working on)

  4. If the commit in the translation file isn't the head commit, create a branch, like so:

    git branch ingestFrench <the-commit-id>
    git checkout ingestFrench

  5. Run this, replacing PATH-TO-NEW-JSON with the path to the JSON file you got from the translator. AUTHORID should be replaced with an identifier for the author - by convention it is the provider (typically github or nexus) and their ID on that service.

    dotnet build /t:IngestTranslation "/p:TranslatedFile=PATH-TO-NEW-JSON;TranslationAuthor=AUTHORID"

  6. Build (either in Visual Studio or with just dotnet build) and verify there are no errors and the new files in the i18n folder look like what was supplied.

  7. Commit the changes (should only be changes to one file in i18nSource).

  8. Merge with your main branch.

  9. Build again. If translated strings changed between the release that your translator used and now, there might be some more missing strings and other issues that will require another pass of translation.

Help wanted

This library scratches an itch I've had throughout my career. Everywhere I work there's a semi-broken approach to localization, much of it essentially unfixable because of historical, contractual, or just funding reasons. With SDV mods, at least in my own mods, I can fix it. But in any system, there's room for improvement. Here are a few areas where somebody could add value.

Actually construct a call graph

Looking at my IL code (in LocalizeFromSource\Decompiler.cs) you see that it's really pretty stupid - it just looks at the gap between Ldstr instructions and the first call it can recognize. That seems to be good enough, but I'd feel a whole lot better if a call-graph could be constructed.

Roslyn analyzer

I've never written one, but I believe it'd be possible to write something that could basically do all the functions of strict mode within the IDE, prior to compilation. I believe that would yield a much better experience.

Automated Translation

For the size and number of strings in your average mod, the Google Translate API can be used for free. While an automated translation isn't good, it's better than no translation in most cases. It wouldn't be too hard to code up something to supply missing translations using that service. There's already code in this package to flag a translation as coming from an automated source.
