List collection thumbnails and configure home page URL #2182

Open
Tracked by #1051
SuaYoo opened this issue Nov 25, 2024 · 2 comments · Fixed by #2209
Labels: back end (requires back end dev work), front end (requires front end dev work)

Comments

@SuaYoo
Member

SuaYoo commented Nov 25, 2024

As a user with crawler permissions, I should be able to configure the entry URL for collection replay and the thumbnail for the entire collection.

Requirements

  • User can choose the home page URL in collections settings
  • User can choose a thumbnail from page timestamps
  • Collection thumbnails are displayed on the org dashboard and org public profile pages

Wireframes

Collection settings

[Image: collection settings wireframe]

Dashboard

[Image: dashboard wireframe]
@SuaYoo SuaYoo changed the title Collection thumbnails Collection home page URL and thumbnails Nov 25, 2024
@SuaYoo SuaYoo self-assigned this Nov 25, 2024
@SuaYoo SuaYoo added front end Requires front end dev work back end Requires back end dev work labels Nov 25, 2024
@ikreymer ikreymer moved this from Triage to Todo in Webrecorder Projects Nov 25, 2024
@SuaYoo SuaYoo moved this from Todo to In Design in Webrecorder Projects Nov 25, 2024
@tw4l
Member

tw4l commented Nov 25, 2024

Information we want to return for each collection in the public collection list endpoint:

  • name
  • description
  • caption (Sua's idea, separate public-facing description)
  • date range (YYYY-YYYY or YYYY)
  • thumbnail (presigned url to s3 file)
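To make the proposed list item shape concrete, here is a minimal sketch. The class name `PublicCollectionSummary`, its field names, and the helper `format_date_range` are hypothetical and not taken from the Browsertrix codebase; only the five fields and the `YYYY-YYYY or YYYY` format come from the comment above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def format_date_range(earliest: Optional[datetime], latest: Optional[datetime]) -> str:
    """Return "YYYY-YYYY", or "YYYY" when both dates fall in the same year."""
    if earliest is None or latest is None:
        # Assumption: a collection with no dated pages has an empty range.
        return ""
    if earliest.year == latest.year:
        return str(earliest.year)
    return f"{earliest.year}-{latest.year}"


@dataclass
class PublicCollectionSummary:
    """Hypothetical item in the public collection list response."""

    name: str
    description: str
    caption: str  # separate public-facing description
    date_range: str  # "YYYY-YYYY" or "YYYY"
    thumbnail_url: Optional[str] = None  # presigned URL, if a thumbnail exists


summary = PublicCollectionSummary(
    name="Example collection",
    description="Internal notes",
    caption="Public summary",
    date_range=format_date_range(datetime(2019, 5, 1), datetime(2024, 11, 25)),
)
```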

@tw4l
Member

tw4l commented Nov 25, 2024

Backend functionality needed:

  • API endpoint to list URLs in a collection, sorted in descending order by snapshot count (the response should also include an array of basic page info, such as id and timestamp, for each page matching the URL)
  • API endpoint to GET a public collection
  • API endpoint to support selecting a collection home URL/snapshot
  • API endpoint to support uploading a collection thumbnail
  • Modify public collections list endpoint to add thumbnail and date range, remove unnecessary fields

If doing all of the above via the backend, we'll also need to parse pages from uploads into the database and add a means of backfilling older uploads and crawls from before QA.
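The first item in the list above (URLs sorted by snapshot count, each with basic page info) could be sketched as follows. This is an in-memory illustration only, not the actual database query; the function name and the `url`, `id`, and `ts` keys are assumptions.

```python
from collections import defaultdict
from typing import Any


def list_urls_by_snapshot_count(pages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Group pages by URL and sort URLs by snapshot count, descending."""
    snapshots_by_url: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for page in pages:
        # Keep only basic page info for each snapshot of the URL.
        snapshots_by_url[page["url"]].append({"id": page["id"], "ts": page["ts"]})

    results = [
        {
            "url": url,
            "count": len(snapshots),
            "snapshots": sorted(snapshots, key=lambda s: s["ts"]),
        }
        for url, snapshots in snapshots_by_url.items()
    ]
    # Most-captured URLs first; ties broken by URL for a stable order.
    results.sort(key=lambda r: (-r["count"], r["url"]))
    return results


pages = [
    {"id": "p1", "url": "https://example.com/", "ts": "2024-11-01T00:00:00Z"},
    {"id": "p2", "url": "https://example.com/about", "ts": "2024-11-02T00:00:00Z"},
    {"id": "p3", "url": "https://example.com/", "ts": "2024-11-03T00:00:00Z"},
]
url_list = list_urls_by_snapshot_count(pages)
```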

@SuaYoo SuaYoo changed the title Collection home page URL and thumbnails List collection thumbnails and configure home page URL Nov 26, 2024
@SuaYoo SuaYoo moved this from In Design to Ready in Webrecorder Projects Dec 4, 2024
@SuaYoo SuaYoo moved this from Ready to Implementing in Webrecorder Projects Dec 4, 2024
@SuaYoo SuaYoo moved this from Implementing to In Review in Webrecorder Projects Dec 18, 2024
tw4l added a commit that referenced this issue Dec 23, 2024
Fixes #2182 

This rather large PR adds the rest of what should be needed for public
collections work in the frontend.

New API endpoints include:

- Public collections endpoints: GET, streaming download
- Paginated list of URLs in collection with snapshot (page) info for
each
- Collection endpoint to set home URL
- Collection endpoint to upload thumbnail as stream
- DELETE endpoint to remove collection thumbnail
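
As an illustration of what a paginated list response might look like, here is a minimal sketch. The 1-indexed `page` and `page_size` parameters and the `items`/`total` response keys are assumptions, not confirmed by this PR.

```python
from typing import Any, Sequence


def paginate(items: Sequence[Any], page: int = 1, page_size: int = 25) -> dict[str, Any]:
    """Return one page of items plus the total count (hypothetical response shape)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return {
        "items": list(items[start : start + page_size]),
        "total": len(items),
        "page": page,
        "pageSize": page_size,
    }


second_page = paginate(list(range(7)), page=2, page_size=3)
```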

Changes to existing API endpoints include:

- Paginating public collection list results
- Several `pages` endpoints that previously only supported `/crawls/` in
their path, e.g. `/orgs/{oid}/crawls/all/pages/reAdd`, now support
`/uploads/` and `/all-crawls/` namespaces as well. This is necessitated
by adding pages for uploads to the database (see below). For
`/orgs/{oid}/namespace/all/pages/reAdd`, `crawls` or `uploads` will
serve as a filter to only affect crawls of that given type. Other
endpoints are more liberal at this point, and will perform the same
action regardless of the namespace used in the route (we'll likely want
to change this in a follow-up to be more consistent).
- `/orgs/{oid}/namespace/all/pages/reAdd` now kicks off a background job
rather than doing all of the computation in an asyncio task in the
backend container. The background job additionally updates collection
date ranges, page/size counts, and tags for each collection in the org
after pages have been (re)added.
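
The namespace filtering described for `reAdd` can be sketched as below. The mapping values (`"crawl"`, `"upload"`), the `type` key, and the behavior of `all-crawls` as "no filter" are assumptions made for illustration; the PR text only states that `crawls` or `uploads` restricts the operation to that type.

```python
from typing import Any, Optional

# Hypothetical mapping from route namespace to a stored item type.
# None means "do not filter by type" (assumed for all-crawls).
NAMESPACE_TO_TYPE: dict[str, Optional[str]] = {
    "crawls": "crawl",
    "uploads": "upload",
    "all-crawls": None,
}


def select_for_readd(items: list[dict[str, Any]], namespace: str) -> list[dict[str, Any]]:
    """Return the items whose pages would be re-added for the given namespace."""
    if namespace not in NAMESPACE_TO_TYPE:
        raise ValueError(f"unsupported namespace: {namespace}")
    wanted = NAMESPACE_TO_TYPE[namespace]
    return [item for item in items if wanted is None or item["type"] == wanted]


items = [
    {"id": "a", "type": "crawl"},
    {"id": "b", "type": "upload"},
    {"id": "c", "type": "crawl"},
]
only_uploads = select_for_readd(items, "uploads")
```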

Other big changes:

- New uploads will now have their pages read into the database!
Collection page counts now also include uploads
- A migration was added to start a background job for each org that will
add the pages for previously-uploaded WACZ files to the database and
update collections accordingly
- Adds a new `ImageFile` subclass of `BaseFile` for thumbnails that we
can use for other user-uploaded image files moving forward, with
separate output models for authenticated and public endpoints
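
The idea of one stored file model with separate authenticated and public output shapes might look roughly like this. Only the names `ImageFile` and `BaseFile` come from the PR description; every field and method below is hypothetical, and plain dataclasses stand in for whatever model classes the backend actually uses.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class BaseFile:
    """Stand-in for the stored file model; fields are assumed."""

    filename: str
    size: int
    storage_path: str  # internal location, never exposed directly


@dataclass
class ImageFile(BaseFile):
    """Stand-in for an uploaded image such as a collection thumbnail."""

    mime: str = "image/jpeg"
    uploaded_by: str = ""

    def to_public_out(self, presigned_url: str) -> dict[str, Any]:
        # Public output: only what an anonymous viewer needs.
        return {"name": self.filename, "size": self.size, "mime": self.mime, "path": presigned_url}

    def to_authenticated_out(self, presigned_url: str) -> dict[str, Any]:
        # Authenticated output: public fields plus uploader details.
        out = self.to_public_out(presigned_url)
        out["uploadedBy"] = self.uploaded_by
        return out


thumb = ImageFile("thumb.jpg", 2048, "internal/thumb.jpg", uploaded_by="user-1")
public_out = thumb.to_public_out("https://example.com/presigned")
private_out = thumb.to_authenticated_out("https://example.com/presigned")
```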
SuaYoo added a commit that referenced this issue Dec 23, 2024
- Allows user to choose collection replay home page and collection
thumbnail (resolves
#2182)
- Displays collection thumbnails on org dashboard and public page
- Enables downloading public collection (resolves
#2233)
- Adds caption as "Summary" to metadata dialog
- Moves description editor to "About" tab

---------

Co-authored-by: Emma Segal-Grossman <[email protected]>