Store matched (and unmatched) movies in MongoDB #11

Open
audiodude opened this issue Sep 13, 2024 · 3 comments

@audiodude
Collaborator

We have a MongoDB instance set up here:

mongodb+srv://noisebridgeproject.audlswx.mongodb.net/

user: noisebridge
password: same as sfpythonlab.com

Let's set up our wiki_to_netflix.py code to push the data into Mongo. We need to think about the objects we want to store, specifically their shape. Probably something like the CSV output, but with a key for each column name. So for this set of data:

processed_data.append([title, year, netflix_id, wiki_movie_ids_list[index], wiki_genres_list[index], wiki_directors_list[index]])

We should have:

{
  title: "foo title",
  year: ...,
  netflix_id: ...,
  wikidata_id: ...,
  ...etc...
}
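
One way to get from the list rows to that shape is to zip each row with a fixed list of column names. This is only a sketch: the `genres` and `directors` key names are assumptions (the issue only names `title`, `year`, `netflix_id`, and `wikidata_id`), and `row_to_document` is a hypothetical helper, not existing code in wiki_to_netflix.py.

```python
# Column order follows the processed_data.append([...]) call above.
# 'genres' and 'directors' are assumed key names, not settled ones.
COLUMNS = ['title', 'year', 'netflix_id', 'wikidata_id', 'genres', 'directors']


def row_to_document(row):
    """Turn one processed_data row (a list) into a dict keyed by column name."""
    if len(row) != len(COLUMNS):
        raise ValueError(f'expected {len(COLUMNS)} values, got {len(row)}')
    return dict(zip(COLUMNS, row))


# Made-up sample values, for illustration only.
doc = row_to_document(['foo title', 1999, 123, 'Q0', ['Drama'], ['Some Director']])
print(doc['title'], doc['wikidata_id'])
```

Failing loudly on a row of the wrong length is a design choice here; it keeps a silently shifted column from ending up under the wrong key.
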
@audiodude
Collaborator Author

Some sample pseudocode:

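import pymongo

def insert_into_mongo(processed_data):
    client = pymongo.MongoClient('mongodb+srv://...')
    movies = client.mediabridge.movies

    for movie in processed_data:
        # Upsert so re-running does not create duplicates. Matching on
        # netflix_id (column 2) is an assumption about the unique key.
        movies.update_one(
            {'netflix_id': movie[2]},
            {'$set': {
                'title': movie[0],
                'year': movie[1],
                # ... remaining columns ...
            }},
            upsert=True,
        )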

@cocomittens
Collaborator

cocomittens commented Sep 19, 2024

@audiodude Are we storing unmatched movies also? The current output only includes found movies. I feel like it could theoretically be worth adding them as well, but I'm assuming they wouldn't actually be used to make recommendations? So unsure if we actually need them or if I'm missing something.

@audiodude
Collaborator Author

Yes, I think we should store unmatched movies with a wikidata_id of NULL. This will make it easier for the recommendation part to look up movies by their Netflix ID or title.
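
A minimal sketch of what that could look like, assuming Python's `None` is what gets stored as null (pymongo encodes `None` as BSON null) and that lookups use plain equality filters. `make_document`, `by_netflix_id`, and `unmatched` are hypothetical names, and the sample values are made up.

```python
def make_document(title, year, netflix_id, wikidata_id=None):
    """Build a movie document; unmatched movies keep wikidata_id as None (null)."""
    return {
        'title': title,
        'year': year,
        'netflix_id': netflix_id,
        'wikidata_id': wikidata_id,
    }


def by_netflix_id(netflix_id):
    """Filter usable with find_one() to look a movie up by its Netflix ID."""
    return {'netflix_id': netflix_id}


def unmatched():
    """Filter for movies with no Wikidata match.

    In MongoDB, {'wikidata_id': None} matches documents where the field is
    null or missing.
    """
    return {'wikidata_id': None}


matched = make_document('foo title', 1999, 1, 'Q0')
missing = make_document('bar title', 2001, 2)
print(missing)
```

With a pymongo collection, these filters would be passed along the lines of `movies.find_one(by_netflix_id(2))` or `movies.find(unmatched())`.
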
