List collection thumbnails and configure home page URL #2182

SuaYoo · 2024-11-25T22:04:13Z

As a user with crawler permissions, I should be able to configure the entry URL for collection replay and the thumbnail for the entire collection.

Requirements

User can choose the home page URL in collections settings
User can choose a thumbnail from page timestamps
Collection thumbnails are displayed on the org dashboard and org public profile pages

Wireframes

Collection settings

Dashboard

tw4l · 2024-11-25T23:07:15Z

Information we want to return for each collection in the public collection list endpoint:

name
description
caption (Sua's idea, separate public-facing description)
date range (YYYY-YYYY or YYYY)
thumbnail (presigned URL to S3 file)
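
A minimal sketch of the date range formatting described above, assuming the collection stores its earliest and latest page timestamps as datetimes. The function name `format_date_range` and the empty-string fallback are hypothetical, not taken from the backend.

```python
from datetime import datetime
from typing import Optional


def format_date_range(earliest: Optional[datetime], latest: Optional[datetime]) -> str:
    """Return "YYYY-YYYY", or "YYYY" when both dates fall in the same year."""
    if earliest is None or latest is None:
        # Assumption: a collection with no pages has no range to display.
        return ""
    if earliest.year == latest.year:
        return str(earliest.year)
    return f"{earliest.year}-{latest.year}"
```

For example, `format_date_range(datetime(2019, 5, 1), datetime(2024, 11, 25))` gives `"2019-2024"`, while two dates within 2024 collapse to `"2024"`.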

tw4l · 2024-11-25T23:21:32Z

Backend functionality needed:

API endpoint to list URLs in collection, sorted in descending order by snapshot count (the response should also include, for each URL, an array of basic page info such as id and timestamp for every page matching that URL)
API endpoint to GET a public collection
API endpoint to support selecting a collection home URL/snapshot
API endpoint to support uploading a collection thumbnail
Modify public collections list endpoint to add thumbnail and date range, remove unnecessary fields

If doing all of the above via the backend, we'll also need to parse pages from uploads into the database and add a means of backfilling older uploads and crawls from before QA.
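
To make the URL list endpoint concrete, here is a rough in-memory sketch of the grouping and sort order it would need. The field names (`url`, `id`, `ts`, `pageId`, `count`, `snapshots`) and the function name are assumptions for illustration; a real implementation would more likely do this as a database aggregation with pagination.

```python
from collections import defaultdict
from typing import Dict, List, TypedDict


class Snapshot(TypedDict):
    # Basic page info returned for each capture of a URL (hypothetical shape).
    pageId: str
    ts: str


def list_urls_by_snapshot_count(pages: List[dict]) -> List[dict]:
    """Group pages by URL and sort URLs by number of snapshots, descending."""
    grouped: Dict[str, List[Snapshot]] = defaultdict(list)
    for page in pages:
        grouped[page["url"]].append({"pageId": page["id"], "ts": page["ts"]})

    results = [
        {"url": url, "count": len(snapshots), "snapshots": snapshots}
        for url, snapshots in grouped.items()
    ]
    # Most-captured URLs first; ties are broken alphabetically for stable output.
    results.sort(key=lambda item: (-item["count"], item["url"]))
    return results
```

The frontend could then use the `snapshots` array of a selected URL to offer the home page snapshot choices.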

Fixes #2182

This rather large PR adds the rest of what should be needed for public collections work in the frontend.

New API endpoints include:

- Public collections endpoints: GET, streaming download
- Paginated list of URLs in collection with snapshot (page) info for each
- Collection endpoint to set home URL
- Collection endpoint to upload thumbnail as stream
- DELETE endpoint to remove collection thumbnail

Changes to existing API endpoints include:

- Paginating public collection list results
- Several `pages` endpoints that previously only supported `/crawls/` in their path, e.g. `/orgs/{oid}/crawls/all/pages/reAdd`, now support `/uploads/` and `/all-crawls/` namespaces as well. This is necessitated by adding pages for uploads to the database (see below). For `/orgs/{oid}/namespace/all/pages/reAdd`, `crawls` or `uploads` will serve as a filter to only affect crawls of that given type. Other endpoints are more liberal at this point, and will perform the same action regardless of the namespace used in the route (we'll likely want to change this in a follow-up to be more consistent).
- `/orgs/{oid}/namespace/all/pages/reAdd` now kicks off a background job rather than doing all of the computation in an asyncio task in the backend container. The background job additionally updates collection date ranges, page/size counts, and tags for each collection in the org after pages have been (re)added.

Other big changes:

- New uploads will now have their pages read into the database! Collection page counts now also include uploads
- A migration was added to start a background job for each org that will add the pages for previously-uploaded WACZ files to the database and update collections accordingly
- Adds a new `ImageFile` subclass of `BaseFile` for thumbnails that we can use for other user-uploaded image files moving forward, with separate output models for authenticated and public endpoints
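
As an illustration of the namespace behavior for the `reAdd` endpoint, the route namespace could be mapped to an optional crawl type filter roughly like this. The stored type values `"crawl"` and `"upload"` and the helper name are assumptions, not confirmed by the PR description.

```python
from typing import Dict, Optional

# Assumption: crawls and uploads are distinguished by a "type" value on each item.
NAMESPACE_TO_TYPE: Dict[str, Optional[str]] = {
    "crawls": "crawl",
    "uploads": "upload",
    "all-crawls": None,  # no filter: affect both crawls and uploads
}


def crawl_type_filter(namespace: str) -> Optional[str]:
    """Return the type to filter on for a route namespace, or None for no filter."""
    if namespace not in NAMESPACE_TO_TYPE:
        raise ValueError(f"Unsupported namespace: {namespace}")
    return NAMESPACE_TO_TYPE[namespace]
```

Endpoints that ignore the namespace today would simply not call a helper like this, which is the inconsistency the description suggests tidying up in a follow-up.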

- Allows user to choose collection replay home page and collection thumbnail (resolves #2182)
- Displays collection thumbnails on org dashboard and public page
- Enables downloading public collection (resolves #2233)
- Adds caption as "Summary" to metadata dialog
- Moves description editor to "About" tab

Co-authored-by: Emma Segal-Grossman

SuaYoo mentioned this issue Nov 25, 2024

[Feature]: Public org collections page #1051

Open

11 tasks

github-project-automation bot moved this to Triage in Webrecorder Projects Nov 25, 2024

github-project-automation bot added this to Webrecorder Projects Nov 25, 2024

SuaYoo changed the title ~~Collection thumbnails~~ Collection home page URL and thumbnails Nov 25, 2024

SuaYoo self-assigned this Nov 25, 2024

SuaYoo added the front end (Requires front end dev work) and back end (Requires back end dev work) labels Nov 25, 2024

ikreymer moved this from Triage to Todo in Webrecorder Projects Nov 25, 2024

SuaYoo assigned tw4l Nov 25, 2024

SuaYoo moved this from Todo to In Design in Webrecorder Projects Nov 25, 2024

SuaYoo changed the title ~~Collection home page URL and thumbnails~~ List collection thumbnails and configure home page URL Nov 26, 2024

SuaYoo mentioned this issue Nov 26, 2024

feat: Public org profile page #2172

Merged

tw4l mentioned this issue Dec 3, 2024

Backend work for public collections: thumbnails, url list, upload pages, and so on #2198

Merged

SuaYoo moved this from In Design to Ready in Webrecorder Projects Dec 4, 2024

SuaYoo moved this from Ready to Implementing in Webrecorder Projects Dec 4, 2024

SuaYoo mentioned this issue Dec 4, 2024

feat: Collection thumbnails, start page, and public view updates #2209

Merged

SuaYoo moved this from Implementing to In Review in Webrecorder Projects Dec 18, 2024

tw4l mentioned this issue Dec 18, 2024

Add page count to crawl model #2257

Open
