Event Importer: Delete Org Events Prior to Event Import #424
base: develop
@allella I've come up with this solution, which would work based on my knowledge of the duplication issue; however, this will result in the […] This wouldn't be an issue if we were using a UUID as the Id attribute, but in the current state this would be problematic for the table.
app-modules/event-importer/src/Console/Commands/ImportEventsCommand.php
I do wonder: could we limit all of this recreation a bit by trying to detect a potential duplicate, and only running the purge if we think there may be one? For example, if we search our database for any events by the same org, on the same day, with the same start time and the same service, and anything is returned, only then do we blow things away. With this approach, could we further limit it to deleting only the events on the day where there's a potential conflict, so we're not deleting a bunch of stuff when no other events for that org could possibly conflict?
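A minimal sketch of the detect-then-purge idea described above, written in Python purely for illustration (the importer itself is PHP). The `Event` fields and both function names are hypothetical. It also assumes that a stored event with the same service ID as the incoming one is a normal re-import rather than a conflict; that refinement is not stated in the thread.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    # Hypothetical shape; the real model's columns may differ.
    org_id: int
    service: str
    service_id: str
    starts_at: datetime


def conflict_days(existing, incoming):
    """Return (org_id, date) pairs where an incoming event shares org,
    service, and start time with a stored event under a different ID."""
    stored = {}
    for e in existing:
        key = (e.org_id, e.service, e.starts_at)
        stored.setdefault(key, set()).add(e.service_id)

    days = set()
    for e in incoming:
        ids = stored.get((e.org_id, e.service, e.starts_at), set())
        # Assumption: the same service ID means an ordinary update.
        if ids - {e.service_id}:
            days.add((e.org_id, e.starts_at.date()))
    return days


def purge_conflicts(existing, incoming):
    """Keep every stored event except those on a conflicting org/day."""
    days = conflict_days(existing, incoming)
    return [e for e in existing if (e.org_id, e.starts_at.date()) not in days]
```

With this shape, an org with no potential conflict loses nothing, and an org with a conflict loses only the events on the affected day.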
@allella Check this out now. I have set it up so that we only check for duplicates on |
@bogdankharchenko could you check this out? The source of the duplication is still a bit of a mystery, but a look at the HG calendar shows a number of recurring Meetup events that generate duplicates. Our best guess is that it affects events that are set up on a recurring template on Meetup's end, and that the duplicates possibly happen due to some change the organizers are making on Meetup. Matt's suggestion was to search for existing events with the same org + same date + same start time and, if there are any, just purge the existing record and reimport.
@irby I spent a few mins just looking at the duplicated data, and it seems that the ID from GraphQL comes in either as a string or as an integer. Perhaps an easier solution is to just say, ignore events which have string/integer IDs? What I suspect is happening is that at some point this org was under Meetup REST, which was importing the ID as a string, and GraphQL is importing it as an integer. And since we import events many months in advance, this is where the issue started. Does that sound plausible?
@bogdankharchenko we found there's a token value in the Event response that we weren't querying in GraphQL. That token seemed to match the alphabetical value that was causing the duplicate of the integer ID on events that are part of a recurring series. The hunch is we'll be able to use this token to avoid or work around the duplicates. I just pulled Matt's recent PR to log the token values and we'll see if that will help us out.
This was the conversation about the token value.
Yes, @bogdankharchenko to add to what @allella has said, this token field on an event is optional, but when a duplicate is identified it does match the id of the original event. For example: Event 1:
Event 2:
Event 3:
My current line of thinking is we do this: to map the […] I think this is perhaps the best and only way we can resolve the duplicate event issue.
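A minimal sketch of one way the token relationship described above could be used, written in Python purely for illustration. It assumes an event payload is a dict with hypothetical `id` and optional `token` keys, and that `stored_ids` holds the service IDs already in the database. The function name is hypothetical, and this is not necessarily the mapping proposed in the comment above.

```python
def resolve_service_id(event, stored_ids):
    """Pick the ID under which an incoming event should be stored.

    If the event carries a token that matches an ID we already have,
    treat it as the same event and reuse that ID, so the import updates
    the original record instead of inserting a duplicate.
    """
    token = event.get("token")  # optional field, may be absent or None
    if token is not None and str(token) in stored_ids:
        return str(token)
    # No token, or no stored event with that ID: use the event's own ID.
    return str(event["id"])
```

IDs are compared as strings here because the thread notes that the same service can deliver them as either strings or integers.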
A current theory about the cause of #411 is that updates to Meetup events, in particular recurring events, are causing a duplicate Meetup event to be created with a different service ID.
While I haven't been able to reproduce this issue locally, I have seen in some of the API responses that only one of the duplicates appears in the GraphQL API response. This helps support the theory that updates are causing duplicates.
In this solution, I am looking to check and remove duplicates for `meetup` and `meetup_graphql` events only. I check for events within the org with the same timestamp and delete the existing DB records before inserting the new records.
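The delete-before-insert step described above can be sketched as follows, using Python and SQLite purely for illustration (the actual command is PHP). The table layout, column names, and `replace_events` are assumptions made for this example, not the project's schema.

```python
import sqlite3

# Hypothetical schema, for this sketch only.
SCHEMA = """
CREATE TABLE events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    organization_id INTEGER NOT NULL,
    service TEXT NOT NULL,
    service_id TEXT NOT NULL,
    starts_at TEXT NOT NULL
)
"""

MEETUP_SERVICES = ("meetup", "meetup_graphql")


def replace_events(conn, org_id, incoming):
    """incoming: list of (service, service_id, starts_at) tuples."""
    # Only Meetup events trigger a purge; other services are left alone.
    timestamps = {
        starts_at
        for service, _, starts_at in incoming
        if service in MEETUP_SERVICES
    }
    with conn:  # one transaction: commit on success, roll back on error
        # Delete first, so two incoming events at the same time
        # cannot wipe each other out.
        for starts_at in timestamps:
            conn.execute(
                "DELETE FROM events WHERE organization_id = ? "
                "AND starts_at = ? AND service IN (?, ?)",
                (org_id, starts_at, *MEETUP_SERVICES),
            )
        conn.executemany(
            "INSERT INTO events "
            "(organization_id, service, service_id, starts_at) "
            "VALUES (?, ?, ?, ?)",
            [(org_id, s, sid, t) for s, sid, t in incoming],
        )
```

In this sketch, every re-imported row receives a new auto-increment `id`, since the old row is deleted rather than updated.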