Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OptionList items use GUID as key (plus debug!) #313

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

josephmyers
Copy link
Collaborator

@josephmyers josephmyers commented Oct 6, 2022

LfOptionListItem's use the GUID from LCM as their Key, instead of the abbreviation value. This fixes a bug where items without abbreviations weren't being synced. This change requires MongoDB data migration, which is performed just-in-time as part of this PR, and it also requires changes in Language Forge, which at time of writing haven't been done yet.

Please see checkin comments for detailed information regarding each checkin.

@github-actions
Copy link

github-actions bot commented Oct 6, 2022

Test Results

    2 files    20 suites   1m 23s ⏱️
312 tests 261 ✔️ 20 💤 31
315 runs  264 ✔️ 20 💤 31

For more details on these failures, see this check.

Results for commit 23a26f1.

♻️ This comment has been updated with latest results.

@josephmyers josephmyers linked an issue Oct 6, 2022 that may be closed by this pull request
Copy link
Contributor

@megahirt megahirt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Great contribution!

@josephmyers josephmyers changed the title Added debug capability to LfMerge OptionList items use GUID as key (plus debug!) Nov 9, 2022
@josephmyers josephmyers marked this pull request as ready for review November 11, 2022 21:16
@josephmyers josephmyers requested a review from megahirt November 11, 2022 21:16
Copy link
Contributor

@megahirt megahirt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As best as I can tell, this looks like a good refactor. I admit that I don't understand all the changes or the code base very well, but it certainly seems like you've gotten a good understanding of the problem and have fixed up the appropriate methods to suit.

@josephmyers
Copy link
Collaborator Author

What I'm struggling with today is the BSON "automagic" deserialization from MongoDB. A while back I changed multi-option fields like SemanticDomain to List<LfOptionListItem> to accurately represent the UI. This made the refactor less widespread, and everything was working fine.

However, after I ran "docker compose down -v" yesterday to get to a cleaner state, I have been getting critical errors during initial deserialization. MongoDB is complaining it Cannot deserialize a 'List<LfOptionListItem>' from BsonType 'Document'. This doesn't make a ton of sense, given that it is fine with LfOptionListItem's right above and with List's of LF classes right below.

It's an annoying sidetrack, and I am pretty confident it will prevent me from getting to a clean stopping point before my last day.

@josephmyers
Copy link
Collaborator Author

I just remembered that LF can add list options from the settings :(

This will be fine if LF is allowed to push new options with predefined GUID's to LCM, but if not the simplest thing to do would be to disallow the option add feature in LF. We have precedent for this.

@josephmyers
Copy link
Collaborator Author

Just hit another problem, ugh. This task!

For new projects, the MongoDB is getting partially populated before any initialization has happened. There is one option list, with items, but their GUID's are null. The screenshot below shows the DB just after LF started creating the project and just before LFMerge calls MongoConnection.Initialize()

image

Right after initial clone, we call the action TransferLcmToMongo to init the DB. But within that action (ConvertLcmToMongoLexicon.ConvertOptionListFromLcm) we are polling the DB in order to update the DB. So TransferLcmToMongo assumes you have a complete MongoDB usable as a baseline. So TransferLcmToMongo is effectively DB->LF->LCM->DB, instead of LCM->DB.

The best solution in regards to the integrity of the data would be for the action to simply do LCM->DB. It seems safe to assume that initial project setup would give you a clean DB straight from LCM. It's not clear why we're currently depending on the DB to set up the DB.

@megahirt
Copy link
Contributor

megahirt commented Nov 17, 2022

I just remembered that LF can add list options from the settings :(

This will be fine if LF is allowed to push new options with predefined GUID's to LCM, but if not the simplest thing to do would be to disallow the option add feature in LF. We have precedent for this.

Actually, we just disabled adding option list items in a recent PR. So shouldn't be an issue. sillsdev/web-languageforge#1552

@megahirt
Copy link
Contributor

Just hit another problem, ugh. This task!

For new projects, the MongoDB is getting partially populated before any initialization has happened. There is one option list, with items, but their GUID's are null.

It seems reasonable to me to just remove the partial option list. We only support projects that come from flex initially so that code is essentially just getting in the way. A PR against web language Forge to remove the initialized option list would fix this problem

@josephmyers
Copy link
Collaborator Author

Just hit another problem, ugh. This task!
For new projects, the MongoDB is getting partially populated before any initialization has happened. There is one option list, with items, but their GUID's are null.

It seems reasonable to me to just remove the partial option list. We only support projects that come from flex initially so that code is essentially just getting in the way. A PR against web language Forge to remove the initialized option list would fix this problem

Oh that is interesting. I was super stumped as to what was adding that.

This is possible by supplying pbuild with the "development" arg, which signals to docker to download vsdbg to the container. Also adding the config for VS Code to attach to a process in this container.

There are still some kinks to work out. Breakpoints are set in source normally, but during debug it's pulling source files from Nuget. So local changes aren't displayed when stepping through.

Also, it would be nice to get symbol caching up and working, as loading everything takes longer than actually running LfMerge. A fallback is to load only our own DLL's.
…gField and LfStringArrayField, since there a few places in the Convert* classes that find the LCM item from the LfString* value.

We need to shift our class design to be slightly more controlled. This will reduce bugs and make the code easier to read. I made some constructors private so that we only use static create methods, and we require a GUID to be passed in so that we can better depend on the object being fully constructed after calling its construction method (read: partially constructed objects are bad).

In LfStringField, I replaced the Guids property with LcmGuid. The former didn't appear to be used, so I was cleaning up. If we need to revert that part, we can, but it's not great API.

To uplift existing data to GUID's, I changed the setter in LfOptionListItem to only allow GUID's and to ignore incoming values that aren't GUID's. If a GUID is not provided, it will use the existing Guid property. I opted for this over a separate migration action, since that would require manually setting each property and subproperty for each entry. This action would be a big effort and quite bugprone, where changing the setter enforces the change at the one place all these properties use.

More properties were affected than just the option list items, since there are some cases where LF needs to back-convert from a simple string field to its LCM item. Therefore, many of the MultiText properties that are set during entry conversion use the entry GUID, not any GUID for that specific field. I'm not crazy about this, so maybe a further revision will be needed.

The full feature is not yet working, but I haven't worked on the Language Forge side yet.
I realized the actual problem was that the LfSense model was incorrectly representing the option list items as String*Fields. Now LfSense uses LfOptionListItem's and List<LfOptionListItem> to better represent LF. Note that Status went from LfStringArrayField -> LfOptionListItem, instead of a list. Seems like at one point LF was allowing multiple status values? Currently it is only allowing one, so I made that switch.

I also cleaned up the ConvertMongoToLcmOptionList class some. I removed the now unneeded ability to look up LCM items from StringFields. The API now exposes LookupByItem(LfOptionListItem), which looks through the Possibilities, then the _possRepo, then the CanonicalItem's, similar to what the FromStringField was doing. The new API is cleaner and less confusing.

Also added some default value for Key on LfOptionListItem. If it is requested before it's been set, it will use the GUID. Also gave it an IsEmpty method to integrate with serialization.
I can't test this, since the List<LfOptionListItem> deserialization just decided to stop working, but I think this is a good general concept.

Assuming LF can't add list items on its own, any time a GUID is stored in MongoDB, it's from LCM. So if the value from MongoDB is a GUID, find the matching value from the LCM list.
This fixes an exception thrown when trying to deserialize OptionListItem's with partial DB entries. For initial clones, the DB is somewhat filled out (why?), so this adds a try/catch to allow the DB request to fail, in which case an empty LF list will be used.

Also had to change slightly the UpdateRecord call of MongoConnection from using FindOneAndUpdate to UpdateOne. It appeared the only difference was the return type, which was unused, and the former was deserializing its target type before serializing it, causing the same DB exception.
@megahirt megahirt force-pushed the custom-field-sync-fix branch from 8e5d78a to 23a26f1 Compare January 11, 2023 14:01
@megahirt megahirt added the on-hold On hold pending developer availability label Jan 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
on-hold On hold pending developer availability
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LfMerge component to #1295
2 participants