Game updating - The Epic Epic #2357
Comments
@Nexus-Mods/nexusmods-app-developers Request for feedback, this is a bit of an ADR draft. I'd rather not expand this ticket much more as it's already fairly large, but I'd love to get feedback.
For version mapping, I don't know where they're getting the data from, but GOGDB maps build IDs to "human versions": https://www.gogdb.org/product/1423049311#builds
SteamDB is also able to gather patch notes and correlate them to a build ID: https://steamdb.info/patchnotes/16681394
I have a bunch of questions and potential changes/improvements that I would like to discuss after standup.
Trying to illustrate what I mentioned yesterday (2024-12-09) in our meeting about full backups and game updates: After an update happens, any new files or any modified files are part of the set of game files of the new version. The full set is a combination of the full set of game files of the previously applied version and the set of new and modified files.
Meeting notes (10/12/2024)
Game File Hashing and Integration with loadouts
General Requirements
Equating files
To detect the state of a loadout relative to an "unaltered" source of truth for game files,
we need some way to determine what the "base" game state is. We can do this by hashing
files, but we need an authoritative state of files to compare against.
Human Friendly names
In addition, users wish to refer to their game versions by a human-friendly name. Users may say
"I have Phantom Liberty" or "I have version 1.3 of the game". But on the backend, each store (Steam, GOG, etc.)
refers to these files by a hash. We need a way to resolve these human-friendly names to a collection of hashes.
Game updates
When a store updates a game, it will overwrite files in the user's game directory. At that point we may
be able to ask the store (via Game Finder) what files it last put into the game folder, but we need a way
to determine which of the files in the folders are updates, and which were modified by the user.
Hash Equality
Most stores use a cryptographic hash to identify files. For performance reasons we use xxHash3, so we
need a way to translate between hash types. Thus, we'll need some sort of global database that relates all the possible
hash types for files we are aware of.
The hashes we will likely need are:
These hashes should be enough for us to swap between hash types. If we have one of these
hashes we can look up the "row" in the hash database to get the other hashes.
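A minimal sketch of that "row" lookup. The column names (`xxh3`, `md5`, `sha1`) and the `HashDatabase` class are hypothetical and only illustrate the idea; the real columns would be whichever hashes the database ends up tracking.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class HashRecord:
    """One "row": every known hash of a single file's contents.

    The columns are illustrative assumptions, not the final schema.
    """
    xxh3: str
    md5: str
    sha1: str


class HashDatabase:
    def __init__(self) -> None:
        # One index per hash type, e.g. {"md5": {"<hex digest>": record}}.
        self._indexes: dict[str, dict[str, HashRecord]] = {
            f.name: {} for f in fields(HashRecord)
        }

    def add(self, record: HashRecord) -> None:
        # Register the record under each of its hashes.
        for name, index in self._indexes.items():
            index[getattr(record, name)] = record

    def lookup(self, hash_type: str, value: str) -> Optional[HashRecord]:
        """Return the full row given any one of its hashes, or None."""
        return self._indexes[hash_type].get(value)
```

With this shape, a store-provided hash can be turned into the locally computed one with `db.lookup("md5", value).xxh3`, and the other way around.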
Suggested Implementation
In order to facilitate the above requirements, we will need to structure our code in the following way:
Index game files from the stores
We will need to go to each store we want to support, and index the files for the games we want to support. This is fairly
easy for GOG and Steam, but Epic and others may require a more involved process. We can get the information we need
from most stores without downloading the files, but the hashes these stores provide will be cryptographic hashes. In order to
build the global hash database, we will need to download the files and generate the other hashes.
It is suggested that these files be stored in a way that duplicate hashes can be deduplicated. For example, if we have a specific MD5 from one
version of the game, and the same MD5 from another version of the game, we do not need two separate entries in the database.
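As a rough sketch of the indexing step: the downloaded file is read once while several hashes are fed in parallel, and records are keyed by MD5 so that identical content from two versions collapses into one entry. The function names are hypothetical, and xxHash3 is left out only because it is not in the Python standard library.

```python
import hashlib
from typing import BinaryIO


def hash_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> dict[str, str]:
    """Compute several hashes of one stream in a single pass."""
    hashers = {"md5": hashlib.md5(), "sha1": hashlib.sha1()}
    while chunk := stream.read(chunk_size):
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def index_version(
    files: dict[str, BinaryIO],
    records: dict[str, dict[str, str]],
) -> dict[str, str]:
    """Index one game version.

    files:   {relative path: open binary stream}
    records: shared {md5: all hashes} store, reused across versions
    Returns {relative path: md5} for this version.
    """
    listing: dict[str, str] = {}
    for path, stream in files.items():
        hashes = hash_stream(stream)
        # setdefault keeps the existing record when the content is already known.
        records.setdefault(hashes["md5"], hashes)
        listing[path] = hashes["md5"]
    return listing
```

Indexing two versions that share an unchanged executable adds that file's record to `records` only once, while each version keeps its own path listing.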
Depot indexing
For future reference and linking to the store, we should record what files we got from which depot/store id. In order to keep this information as
lossless as possible, it is recommended that we store each game's data in its own format. We shouldn't try to fit data from multiple stores into a
single logical model; if Steam calls them "depot" and "manifest", store them as those names, and merge the results at read-time.
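One way to picture "store in the store's own terms, merge at read-time" is sketched below. The record types and their fields are assumptions made for illustration (in particular the GOG field names); only the idea of keeping Steam's "depot" and "manifest" vocabulary intact comes from the text above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, NamedTuple, Union


@dataclass
class SteamManifest:
    # Steam vocabulary is kept as-is: app, depot, manifest.
    app_id: int
    depot_id: int
    manifest_id: int
    files: dict[str, str]  # relative path -> hash as reported by the store


@dataclass
class GogBuild:
    # Hypothetical GOG-side shape; stored separately from the Steam data.
    product_id: int
    build_id: int
    files: dict[str, str]


class FileEntry(NamedTuple):
    """Unified view that exists only at read time."""
    store: str
    source_id: str
    path: str
    file_hash: str


def iter_files(records: Iterable[Union[SteamManifest, GogBuild]]) -> Iterator[FileEntry]:
    """Merge store-specific records into one stream of file entries."""
    for record in records:
        if isinstance(record, SteamManifest):
            store, source = "steam", f"{record.depot_id}/{record.manifest_id}"
        elif isinstance(record, GogBuild):
            store, source = "gog", str(record.build_id)
        else:
            raise TypeError(f"unsupported record type: {type(record).__name__}")
        for path, file_hash in record.files.items():
            yield FileEntry(store, source, path, file_hash)
```

Because the merged view is derived, adding a new store means adding a record type and one branch in `iter_files`, without reshaping data that is already on disk.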
Version mapping
We will then need to manually link the depot/manifest data to human-friendly names. This process is likely a bit tricky as we would need
to account for DLC and provide some sort of sanity to this information. For some stores (Steam) this information isn't stored in the API and may need to be maintained by hand. However, this process is fairly simple for Steam. For example, for Cyberpunk there is a 1:1 mapping of manifest ids and game versions. For other stores like GOG it may be possible to determine this information programmatically.
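A small sketch of what a hand-maintained mapping plus a sanity check could look like. It assumes the simple 1:1 case described for Cyberpunk; games where one manifest is legitimately shared between versions would need a looser check. The function name and the shape of the mapping are hypothetical.

```python
def build_reverse_index(versions: dict[str, set[int]]) -> dict[int, str]:
    """Turn {version name: manifest ids} into {manifest id: version name}.

    Raises ValueError if two versions claim the same manifest id, which
    catches copy/paste mistakes in the hand-maintained data.
    """
    reverse: dict[int, str] = {}
    for name, manifest_ids in versions.items():
        for manifest_id in manifest_ids:
            if manifest_id in reverse:
                raise ValueError(
                    f"manifest {manifest_id} is claimed by both "
                    f"{reverse[manifest_id]!r} and {name!r}"
                )
            reverse[manifest_id] = name
    return reverse
```

The reverse index answers "which human-friendly version is this manifest?", which is the direction needed when the store only reports manifest ids.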
Overall structure
We will store each record for the data as a separate file in git, categorized by various criteria. The reason for "one record per file"
is to avoid merge conflicts and accidental merge issues. The suggested structure in git is:
In the above example, the `hashes` and `store` folders are programmatically generated, while `games` is maintained by hand.
Compilation of Data
Whenever the git repository is updated, we will need to compile the data into a single database.
There are many formats we could use for this. Originally a suggestion was to zip the files into a `.zip` or `.nx` file,
but the number of files and the unsorted nature of archive TOC entries makes finding a specific file an
O(n) operation. Instead, we could use a SQLite database, MnemonicDB, a custom binary format, or perhaps some "read only"
database like MasterMemory. A decision on this can be made later
and is not critical to the design.
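To make the SQLite option concrete, here is a hedged sketch of a compile step. It assumes each record is a JSON file under a `hashes` folder with `xxh3`, `md5` and `sha1` keys; the record format, the key names and the table layout are all assumptions, since none of them are decided above.

```python
import json
import sqlite3
from pathlib import Path
from typing import Optional


def compile_hashes(repo_root: Path, db_path: Path) -> int:
    """Compile every JSON record under repo_root/hashes into one SQLite table.

    Returns the number of records read.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("DROP TABLE IF EXISTS hashes")
        conn.execute(
            "CREATE TABLE hashes (xxh3 TEXT PRIMARY KEY, md5 TEXT, sha1 TEXT)"
        )
        # Secondary index so lookups by a store hash are not a full scan.
        conn.execute("CREATE INDEX hashes_md5 ON hashes (md5)")
        rows = []
        for record_file in sorted((repo_root / "hashes").rglob("*.json")):
            record = json.loads(record_file.read_text(encoding="utf-8"))
            rows.append((record["xxh3"], record["md5"], record["sha1"]))
        conn.executemany("INSERT OR REPLACE INTO hashes VALUES (?, ?, ?)", rows)
        conn.commit()
        return len(rows)
    finally:
        conn.close()


def lookup_by_md5(db_path: Path, md5: str) -> Optional[tuple]:
    """Return the (xxh3, md5, sha1) row for an MD5, or None if unknown."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT xxh3, md5, sha1 FROM hashes WHERE md5 = ?", (md5,)
        ).fetchone()
    finally:
        conn.close()
```

The indexed columns are what replace the O(n) scan of an archive TOC; any of the other candidate formats would need an equivalent keyed lookup.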
Usage
Based on the above structure we can easily perform any number of queries, and streamline parts of the application. For example:
Creating Loadouts
Now we have an authoritative source of files for a game, so when creating a loadout we can look at what hashes we have archived and on disk,
compare those to the hashes in the database, and the data in the game store. Based on this information we can provide users with a dropdown of
all game versions we can support. If the user has the files for `1.6`, we can show `1.6` in the dropdown. Internally, however, we won't store `1.6`
as the game "version"; instead we will store the manifest ids.
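The dropdown logic could look roughly like the sketch below. `GameVersion` and `selectable_versions` are hypothetical names, and "supportable" is assumed to mean that every file hash of the version is available either in the archive or on disk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameVersion:
    display_name: str        # shown to the user, e.g. "1.6"
    manifest_ids: frozenset  # what the loadout actually stores
    file_hashes: frozenset   # hashes of every file in this version


def selectable_versions(
    known: list[GameVersion], available_hashes: set[str]
) -> list[GameVersion]:
    """Versions whose files are all available (archived or on disk)."""
    return [v for v in known if v.file_hashes <= available_hashes]
```

The UI would render `display_name` for each returned entry, while the loadout persists `manifest_ids`, so a later rename of the human-friendly label does not invalidate stored data.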
Separation of DLC
Since we know what manifest ids are associated with game versions and DLC we can split out specific files from the game and put them
into a separate loadout group. This will allow us to show `Skyrim`, `Skyrim - Dawnguard`, `Skyrim - Hearthfire`, and `Skyrim - Dragonborn` as separate
items in the loadout. This is mostly for organizational purposes, but will allow users to easily see what files are associated with what DLC.
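A short sketch of the grouping, assuming we already know which manifest each file came from and which loadout group each manifest belongs to. Both input mappings and the function name are hypothetical.

```python
from collections import defaultdict


def group_files(
    file_manifest: dict[str, int],
    manifest_group: dict[int, str],
) -> dict[str, list[str]]:
    """Split game files into loadout groups by the manifest they came from.

    file_manifest:  {relative path: manifest id}
    manifest_group: {manifest id: group name, e.g. "Skyrim - Dawnguard"}
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for path, manifest_id in sorted(file_manifest.items()):
        groups[manifest_group[manifest_id]].append(path)
    return dict(groups)
```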
Game Updates
When ingesting changes, we can detect if the files being updated are in the hash database. If they appear to have changed from one valid game hash to another,
and we see that the store has changed the manifests it has installed into the game, we can assume that the game has been updated, and apply these changes
to the game files in the loadout.
Naturally this means that we need to get hashes up as soon as possible after a new release, but if we move any non-matching files into the
`Override` group, we can move them into the game group later, once we find a matching manifest id.
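The decision described above can be sketched as two small functions: one that splits ingested changes into game updates and overrides, and one that later promotes overrides once the hash database catches up. Names and data shapes are assumptions for illustration.

```python
def classify_changes(
    changes: dict[str, tuple[str, str]],
    known_game_hashes: set[str],
    manifests_changed: bool,
) -> tuple[dict[str, str], dict[str, str]]:
    """Split changed files into (game updates, overrides).

    changes: {relative path: (old hash, new hash)}
    A change counts as a game update only if the store reports different
    installed manifests AND the file went from one known game hash to another.
    """
    game: dict[str, str] = {}
    override: dict[str, str] = {}
    for path, (old_hash, new_hash) in changes.items():
        is_update = (
            manifests_changed
            and old_hash in known_game_hashes
            and new_hash in known_game_hashes
        )
        (game if is_update else override)[path] = new_hash
    return game, override


def promote_from_override(
    override: dict[str, str], known_game_hashes: set[str]
) -> tuple[dict[str, str], dict[str, str]]:
    """After the hash database gains entries, return (promoted, remaining)."""
    promoted = {p: h for p, h in override.items() if h in known_game_hashes}
    remaining = {p: h for p, h in override.items() if h not in known_game_hashes}
    return promoted, remaining
```

Running `promote_from_override` after each database refresh covers the window between a game release and its hashes being published.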
Minimal Hash
The minimal hash format assumes that two files will have the same contents if they have the same size, exist on the same path
and generally match the same content. The algorithm for this hash is as follows:
There is some overlap in the middle if the file is below a certain size; this is expected. In general,
this means that only 192KB of the file is read, instead of the entire file. In games such as Cyberpunk this may
save the app from reading the entire 100GB of game files.
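A sketch of how such a partial hash might be computed. The exact steps are not reproduced above, so this assumes three 64 KiB blocks (start, middle, end, totalling 192KB) combined with the file size. `hashlib.blake2b` stands in for xxHash3, which is not in the Python standard library, so the digests will not match the app's.

```python
import hashlib
import os
from typing import BinaryIO

BLOCK_SIZE = 64 * 1024  # assumption: 192 KB total read / 3 blocks


def minimal_hash(stream: BinaryIO) -> str:
    """Hash the first, middle and last block of a seekable stream plus its size."""
    size = stream.seek(0, os.SEEK_END)
    hasher = hashlib.blake2b(digest_size=8)  # stand-in for xxHash3 (64-bit)
    offsets = (
        0,                                       # start block
        max(0, size // 2 - BLOCK_SIZE // 2),     # block centred on the middle
        max(0, size - BLOCK_SIZE),               # end block
    )
    # For small files these ranges overlap, which is harmless.
    for offset in offsets:
        stream.seek(offset)
        hasher.update(stream.read(BLOCK_SIZE))
    # Mixing in the size separates files that share the sampled blocks
    # but differ in length.
    hasher.update(size.to_bytes(8, "little"))
    return hasher.hexdigest()
```

Note the trade-off this makes explicit: a byte changed outside the three sampled blocks does not change the result, which is why the minimal hash is paired with the size and path checks rather than used as a full content hash.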
Implementation steps
… `Override` group
… `Override` to `Game` group)
… `Override` group