Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove thumbnail generation code #9682

Merged
merged 13 commits into from
Dec 11, 2024
Merged

Remove thumbnail generation code #9682

merged 13 commits into from
Dec 11, 2024

Conversation

ChrisBAshton
Copy link
Contributor

@ChrisBAshton ChrisBAshton commented Dec 2, 2024

The frontend rendering apps now handle attachments directly, and just use generic icons. The code to generate a custom PNG thumbnail of a PDF was removed in #9163, but Whitehall was still generating and uploading thumbnails unnecessarily: it was simply falling back to a static image asset, pub-cover.png. Every time a PDF was uploaded, this static asset would be uploaded under a new filepath to Asset Manager (which presumably has many thousands of this identical thumbnail now).

This PR removes the thumbnail variant logic from PDF attachments altogether. The code was spread across a large number of places, so I've tried to break it into commits and explain why each chunk of code is safe to delete. I'd appreciate a second opinion particularly on the test deletion commits though.

Trello card: https://trello.com/c/vjSowrs7/3224-remove-thumbnail-generation-from-whitehall


⚠️ This repo is Continuously Deployed: make sure you follow the guidance ⚠️

Follow these steps if you are doing a Rails upgrade.

@ChrisBAshton ChrisBAshton force-pushed the remove-thumbnails branch 7 times, most recently from 8f526d1 to cd87e27 Compare December 4, 2024 14:50
@ChrisBAshton ChrisBAshton marked this pull request as ready for review December 4, 2024 14:50
Copy link
Contributor

@lauraghiorghisor-tw lauraghiorghisor-tw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and OMG so many deletions!! (frantic happy clapping)

  • I suggested one test fix that should work, ok with the deletions of the other tests you pointed out. Overall sadness over the occasionally convoluted integration tests, but out of scope for this PR to try and make them better.
  • Also suggested another factory deletion that could live in this PR.

There is one thing that crossed my mind, and am unsure if it is an issue, maybe we can huddle about it. Is there any risk associated with maintaining the data in WH? We still have ALL the thumbnail assets. For documents that are published, that's only potentially an issue when we republish, but not sure if it will cause some problems for editions in draft.

Looking at:

    attachment_data.assets.each do |asset|
      AssetManager::AssetUpdater.call(asset.asset_manager_id, asset_attributes)
    end

-> it will always try and loop through all associations. I think a little well orchestrated data migration might be needed.

ChrisBAshton added a commit that referenced this pull request Dec 6, 2024
See #9682 (comment)

> I recall that, at the time, we added a variety of factories to give us
> plenty to work with for both tests where "we only care if it has an asset,
> or not even an asset" and tests where "it would be really good to know
> that update fires twice". Now that we only have the one variant you could
> perhaps get rid of the doc factory and the attachment_data_with_asset and
> replace it with pdf. csv may have its own use cases so we should probably
> keep.
@ChrisBAshton ChrisBAshton force-pushed the remove-thumbnails branch 3 times, most recently from a5ce728 to e79cf88 Compare December 6, 2024 10:59
ChrisBAshton added a commit that referenced this pull request Dec 6, 2024
See #9682 (comment)

> I recall that, at the time, we added a variety of factories to give us
> plenty to work with for both tests where "we only care if it has an asset,
> or not even an asset" and tests where "it would be really good to know
> that update fires twice". Now that we only have the one variant you could
> perhaps get rid of the doc factory and the attachment_data_with_asset and
> replace it with pdf. csv may have its own use cases so we should probably
> keep.

While here, I  tidied up some test file `let` declarations that were unused
(following the approach of "if it's only used once, define it
inline rather than as a `let`").

Note that in the dependent tests, I've deliberately used `csv_attachment`
rather than `file_attachment` as the resulting path for the latter is quite
different to the resulting path for the former. `csv_attachment`
generates a path that is most similar to the docx path that these
tests had relied on until now. This meant I was able to make fewer changes
to the tests to get them to pass.
ChrisBAshton added a commit that referenced this pull request Dec 6, 2024
See #9682 (comment)

> I recall that, at the time, we added a variety of factories to give us
> plenty to work with for both tests where "we only care if it has an asset,
> or not even an asset" and tests where "it would be really good to know
> that update fires twice". Now that we only have the one variant you could
> perhaps get rid of the doc factory and the attachment_data_with_asset and
> replace it with pdf. csv may have its own use cases so we should probably
> keep.

While here, I  tidied up some test file `let` declarations that were unused
(following the approach of "if it's only used once, define it
inline rather than as a `let`").

Note that in the dependent tests, I've deliberately used `csv_attachment`
rather than `file_attachment` as the resulting path for the latter is quite
different to the resulting path for the former. `csv_attachment`
generates a path that is most similar to the docx path that these
tests had relied on until now. This meant I was able to make fewer changes
to the tests to get them to pass.
See temporary commit 63dcb2b if you're
interested in the effect `file_attachment` vs `csv_attachment` had on the
test suite!
Copy link
Contributor

@lauraghiorghisor-tw lauraghiorghisor-tw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


with a few mentions:

  • small option test fix / the suggestion from that comment
  • do we need to get rid of the thumbnail assets from the db first?
  • should we do a republish?

lauraghiorghisor-tw pushed a commit that referenced this pull request Dec 9, 2024
See #9682 (comment)

> I recall that, at the time, we added a variety of factories to give us
> plenty to work with for both tests where "we only care if it has an asset,
> or not even an asset" and tests where "it would be really good to know
> that update fires twice". Now that we only have the one variant you could
> perhaps get rid of the doc factory and the attachment_data_with_asset and
> replace it with pdf. csv may have its own use cases so we should probably
> keep.

While here, I  tidied up some test file `let` declarations that were unused
(following the approach of "if it's only used once, define it
inline rather than as a `let`").

I've update the attachment_data factory to not include "_original" as a
suffix on the `asset_manager_id`, as it is no longer the case that file
attachments have variants per se, but only originals. That means we can
replace all usages of the previous `attachment_data_with_asset` factory
with a plain `attachment_data`, and all usages of `file_attachment_with_asset`
with plain `file_attachment` - the only tweak needed being setting the correct
`asset_manager_id` in some of the tests (without suffix).
This is only called by 'thumbnail_url', which we'll remove in the
next commit. Consolidating the method now so that it's clear that
'thumbnailable_attachments' is otherwise unused.
This was being passed to the attachment component params, but
is no longer used, so we can stop passing it.
This test was added in b0f216a and
was changed in fade4f8.
Despite its name, it never _really_ tested that PDF thumbnails
worked.

It now just tests that an array of attachments in the edition
(that have not been explicitly included in the edition body)
are rendered by GovSpeak.
Thumbnails are no longer used on the frontend nor in the backend
UI in Whitehall. Let's stop generating them!

Contains some commented out test cases that need further thought,
and will be tackled in subsequent commits.
Now that the logic defining `version :thumbnail, if: :pdf?` in
`AttachmentUploader` has been removed, there is no attachment
type that has anything other than an `:original` version. We
therefore know that `required_variants` is always just `:original`
and can simplify the method.

I did consider renaming the method to simply `uploaded?` or
`original_uploaded?`, but a method called
`all_asset_variants_uploaded?` is defined in 8 different places,
and called in almost a hundred places, so it's difficult to know
which variation of the method is being called in which place (and
therefore which method calls to rename). Safer to just leave it.

With the method simplified, we can remove the test (since it
would no longer work).
There are now no attachment types (other than images and their
variants) that result in anything other than the 'original' file
being uploaded to Asset Manager.

Co-authored-by: @lauraghiorghisor-tw
I tried swapping out the test definition for the ImageData model
instead, but the behaviour is different for ImageData - the
CarrierWave relative path is returned instead of the URL (as in the
`test "returns store path when the model has no assets, although it
should (still uploading or error has occurred)" do` test.

I'm not entirely clear why the behaviour of ImageData is different
to AttachmentData, but given attachments now only ever have the
'original' variant, I doubt this test is needed anymore. I'm also
not sure if there's a bit of corresponding logic in the AttachmentData
model that can be deleted (or the AssetManagerStorage model, which
is, after all, the thing we're supposed to be testing in this file).
Couldn't see any code simplification opportunities, so settling for
a deleted test.

Here's the code I attempted to swap it with:

```
  test "returns nil when the model has assets but the requested variant is not yet available" do
    model = build(:image_data)
    model.assets = model.assets - [model.assets.last]
    model.save!
    model.reload

    assert_nil model.file.url("s216".to_sym) # what would have been the 'last' asset
  end
```
See #9682 (comment)

> I recall that, at the time, we added a variety of factories to give us
> plenty to work with for both tests where "we only care if it has an asset,
> or not even an asset" and tests where "it would be really good to know
> that update fires twice". Now that we only have the one variant you could
> perhaps get rid of the doc factory and the attachment_data_with_asset and
> replace it with pdf. csv may have its own use cases so we should probably
> keep.

While here, I  tidied up some test file `let` declarations that were unused
(following the approach of "if it's only used once, define it
inline rather than as a `let`").

I've update the attachment_data factory to not include "_original" as a
suffix on the `asset_manager_id`, as it is no longer the case that file
attachments have variants per se, but only originals. That means we can
replace all usages of the previous `attachment_data_with_asset` factory
with a plain `attachment_data`, and all usages of `file_attachment_with_asset`
with plain `file_attachment` - the only tweak needed being setting the correct
`asset_manager_id` in some of the tests (without suffix).
If the asset doesn't respond to `variant`, it's a legacy asset
(PDF thumbnail) and we want to filter it out.

I did consider applying this filter more generally by overloading
the `assets` call as below, but that breaks a bunch of tests - so
any filtering needs to be applied on a per-case basis.

```
  def assets
    super.select { |asset| asset.variant.present? }
  end
```
@lauraghiorghisor-tw lauraghiorghisor-tw merged commit 2834e12 into main Dec 11, 2024
19 checks passed
@lauraghiorghisor-tw lauraghiorghisor-tw deleted the remove-thumbnails branch December 11, 2024 11:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants