Recommender System Improvements #1

Open
horahoradev opened this issue Nov 23, 2023 · 2 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@horahoradev
Owner

horahoradev commented Nov 23, 2023

Context and Motivation

We currently use Gorse as our recommender system.

Gorse maintains its own view of our video ecosystem rather than reading from our source of truth (the Videoservice database in Postgres). It has its own database with its own schema, and we call the relevant API methods to manage the state of our entities within Gorse so that it can provide us with video recommendations.

The important code snippet is at https://github.com/horahoradev/PrometheusTube/blob/main/backend/video_service/internal/models/recommender.go#L102 , which is the main recommender code (this file is horribly messy in general; I or someone else needs to clean it up rofl).

If the user ID is 0, it's assumed to be an anonymous user (nice API bucko), and we provide a mix of most popular videos and nearest neighbors as recommendations. If the user ID is non-zero, it's an authenticated user, and we can provide personalized recommendations based on user signals (see the diagram below for an overview of how this works within Gorse).

[image: overview diagram of how recommendations work within Gorse]
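The anonymous/authenticated split described above can be sketched roughly as follows. This is a minimal illustration, not the code in recommender.go: the `recommendationSource` interface, its method names, and the interleaving strategy in `mix` are all assumptions made for the example and are not the real Gorse Go API.

```go
package main

import "fmt"

// recommendationSource is a HYPOTHETICAL stand-in for the Gorse client.
// The method names are illustrative only, not the real Gorse Go API.
type recommendationSource interface {
	Popular(n int) []string
	Neighbors(videoID string, n int) []string
	ForUser(userID int64, n int) []string
}

// anonymousUserID mirrors the convention from the issue: ID 0 means "not logged in".
const anonymousUserID = 0

// recommend returns up to n video IDs for the page showing videoID.
func recommend(src recommendationSource, userID int64, videoID string, n int) []string {
	if n <= 0 {
		return nil
	}
	if userID != anonymousUserID {
		// Authenticated: personalized recommendations from user signals.
		recs := src.ForUser(userID, n)
		if len(recs) > n {
			recs = recs[:n]
		}
		return recs
	}
	// Anonymous: mix nearest neighbors of the current video with popular videos.
	return mix(src.Neighbors(videoID, n), src.Popular(n), n)
}

// mix interleaves a and b, skipping duplicates, until n IDs are collected.
// Interleaving is an assumption; the issue only says "a mix".
func mix(a, b []string, n int) []string {
	seen := make(map[string]bool)
	out := make([]string, 0, n)
	for i := 0; len(out) < n && (i < len(a) || i < len(b)); i++ {
		for _, list := range [][]string{a, b} {
			if i < len(list) && len(out) < n && !seen[list[i]] {
				seen[list[i]] = true
				out = append(out, list[i])
			}
		}
	}
	return out
}

// fakeSource returns canned data so the sketch runs without Gorse.
type fakeSource struct{}

func (fakeSource) Popular(n int) []string                   { return []string{"p1", "p2", "v2"} }
func (fakeSource) Neighbors(videoID string, n int) []string { return []string{"v2", "v3"} }
func (fakeSource) ForUser(userID int64, n int) []string     { return []string{"u1", "u2"} }

func main() {
	fmt.Println(recommend(fakeSource{}, 0, "v1", 4))  // anonymous path
	fmt.Println(recommend(fakeSource{}, 42, "v1", 4)) // authenticated path
}
```

Naming the sentinel (`anonymousUserID`) instead of comparing against a bare 0 is one small way to make the "user ID 0 means anonymous" convention less surprising during cleanup.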

What really concerns me here, beyond general code cleanliness, is https://github.com/horahoradev/PrometheusTube/blob/main/backend/video_service/internal/models/recommender.go#L121 . For every request to an individual video page, we end up making 20+ separate queries for video details (one per recommended video), which makes video page requests unnecessarily slow.

There are a few possible solutions here:

  1. Batch the query, so all recommended videos are fetched in one round trip.
  2. Embed the information we need into Gorse itself (but that presents its own problems, because we'd need to periodically synchronize from our source of truth; yuck!).
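As a rough sketch of the first option, the 20+ per-video lookups could collapse into a single parameterized query. The table and column names (`videos`, `id`, `title`) and the `video` struct below are hypothetical, not taken from the Videoservice schema. The example only builds the query and restores the recommendation order; it does not open a database connection.

```go
package main

import (
	"fmt"
	"strings"
)

// video holds the details needed per recommendation.
// Field and column names are HYPOTHETICAL, not the real Videoservice schema.
type video struct {
	ID    int64
	Title string
}

// buildBatchQuery returns one parameterized SELECT covering every ID,
// using Postgres-style $1, $2, ... placeholders.
func buildBatchQuery(ids []int64) (string, []any) {
	if len(ids) == 0 {
		return "", nil
	}
	placeholders := make([]string, len(ids))
	args := make([]any, len(ids))
	for i, id := range ids {
		placeholders[i] = fmt.Sprintf("$%d", i+1)
		args[i] = id
	}
	query := "SELECT id, title FROM videos WHERE id IN (" +
		strings.Join(placeholders, ", ") + ")"
	return query, args
}

// orderByRecommendation puts fetched rows back into the order of ids,
// since an IN query gives no ordering guarantee. IDs with no row are skipped.
func orderByRecommendation(ids []int64, rows []video) []video {
	byID := make(map[int64]video, len(rows))
	for _, v := range rows {
		byID[v.ID] = v
	}
	out := make([]video, 0, len(ids))
	for _, id := range ids {
		if v, ok := byID[id]; ok {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	ids := []int64{7, 3, 9}
	query, args := buildBatchQuery(ids)
	fmt.Println(query)
	fmt.Println(args...)

	// Rows as the database might return them, in arbitrary order.
	rows := []video{{ID: 3, Title: "c"}, {ID: 7, Title: "a"}}
	fmt.Println(orderByRecommendation(ids, rows))
}
```

The resulting query and args can be passed to `db.QueryContext(ctx, query, args...)` from `database/sql`. Re-sorting afterwards matters because the recommender's ranking is otherwise lost once the rows come back from Postgres.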

Prerequisites

I'd recommend learning about the following topics before approaching this:

  • Golang (going through A Tour of Go is sufficient)
  • Gorse (read the docs and about the Go API, which we use)
  • SQL

Goals

  1. Clean up recommender system usage
  2. Remove or optimize the per-recommendation SQL query
@horahoradev horahoradev added enhancement New feature or request help wanted Extra attention is needed good first issue Good for newcomers labels Nov 23, 2023
@Sanket-Arekar

Hey @horahoradev, I would like to work on this issue. Can you please assign it to me?

@horahoradev
Owner Author

this one is going to be really tough, i'll probably have to work with you on this
