Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semantic search storage and routing #495

Merged
merged 36 commits into from
Aug 2, 2021
Merged

Conversation

jgonggrijp
Copy link
Member

This branch implements #486 by storing a (relatively) lightweight JSON representation of the query to the backend, giving it a serial number and an optional mnemonic label, and using that serial number for routing so that queries can be returned to and shared with other users. On deep linking, the search form is also restored with the visual representation of the query, so that the user can easily create subtle variations of pre-existing queries. This will always create a new query with a separate serial number; queries cannot be changed after creation.

The list action on the backend API returns only the user's own queries. This is partly a preparation for #163 and partly to protect the intellectual property of other users. It is still possible to see other user's queries by entering the serial numbers one by one, which is necessary in order to enable sharing of particular queries.

I just realized that there is no admin page yet to manage the queries (e.g. to delete queries on user request) and that I need to update the README of the semantic-search directory. I'll add some commits to the branch to address this.

Other than that, there is a subtle cosmetic issue that remains after the bug I fixed in 9c1ba6c. Select2 has the annoying property that if you select an <option>, it will no longer update its contents afterwards. Before 9c1ba6c, this resulted in chosen types and predicates having blank labels when re-visiting the search form through a deep link. I mostly fixed it by adding more event handling steps, but due to our complex data model, the label of an option sometimes updates more than once. As a result, the Reader class may appear labeled as "Person" in this scenario. I consider this subtle and cosmetic enough that we can document it in a low-priority issue (which I will create after submitting this PR).

Most of the code changes are glue code: import lines, event bindings, declarative configuration, short methods that just call a few pre-existing functions, etcetera. The only "exciting magic" is in frontend/src/semantic-search/model.ts, which is responsible for two-way conversion between the frontend's model-/collection-based intermediate representation and the backend's more lightweight JSON-based storage representation, and in frontend/src/utilities/abstractDeepEqual.ts, which I needed in order to test the former. There is not much code in either of these modules, but the logic is recursive, which could make it a bit trickier to wrap your head around. Because of this recursive complexity, and because correct conversion is highly critical, both are heavily tested in adjacent modules.

By means of review, I suggest focusing on trying the changes locally, though a code review is also welcome.

jgonggrijp added 30 commits July 1, 2021 17:43
@jgonggrijp
Copy link
Member Author

I just realized that there is no admin page yet to manage the queries (e.g. to delete queries on user request) and that I need to update the README of the semantic-search directory. I'll add some commits to the branch to address this.

Done.

Copy link
Contributor

@JeltevanBoheemen JeltevanBoheemen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work @jgonggrijp.

One question: how is the label used (or planned to be used)? I think a nice addition may be to allow accessing a saved query by label (e.g. /explore/query/find-ipad-readers). This would probably need some validation on the label to not produce broken urls though.
I think a place where the user can see his/her latests query would also work, and display the labels there. This might be part of a personal landing page.
Both definitely fall in the 'nice-to-have' category.

list_filter = ('created', 'creator')
show_full_result_count = False

def view_on_site(self, obj):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know this method, very useful!

// Django requires the trailing slash, so add it.
return BackboneModel.prototype.url.apply(this) + '/';
return superUrl + '/';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't follow what's happening here. When a new model is made it shouldn't append the trailing slash?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. By default, Backbone will generate /path/to/api/endpoint/ for saving new models and /path/to/api/endpoint/id for fetching or updating existing models. In the first case, we don't want to append a second slash. In the second case, we do want to finish with a slash because Django will otherwise interpret it as a frontend route.

@@ -69,3 +77,9 @@ export function itemWithOccurrences(control: Controller, node: Node) {
export function searchResultsSources(control: Controller, queryParams: any) {
return control.resetSourceListFromSearchResults(queryParams);
}

export
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

export is placed wrong/not consistent here. Minor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JavaScript/TypeScript allows it and it keeps the lines within 80 columns. It seemed less disruptive than using Egyptian parens for the parameter list. Please feel free to format differently when you see code like this (I've done it in more places).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. My IDE didn't format it as a function, but the code ran without a problem, so its the language server that's wrong :)

@jgonggrijp
Copy link
Member Author

Very nice work @jgonggrijp.

Thanks!

I think a nice addition may be to allow accessing a saved query by label (e.g. /explore/query/find-ipad-readers). This would probably need some validation on the label to not produce broken urls though.

Nice, but also a bit redundant with the id-based routes. Also introduces potential for name clashes, while duplicate names should arguably be allowed in this case.

I think a place where the user can see his/her latests query would also work, and display the labels there. This might be part of a personal landing page.

This is exactly the use case that motivated me to add the label field.

Thanks for the review!

@jgonggrijp jgonggrijp merged commit 77dc765 into develop Aug 2, 2021
@jgonggrijp jgonggrijp deleted the feature/sem-search-routing branch August 2, 2021 18:23
@jgonggrijp jgonggrijp added this to the Next release milestone Aug 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants