Add rake task to report on a regex in editionable content #8397

Merged
brucebolt merged 1 commit into main from add-report-rake
Oct 23, 2023

brucebolt
Member

Sometimes we want to identify which published content matches a regular expression (e.g. if we make a change to govspeak and need to republish affected content).

This adds a rake task that reports on the content whose currently published edition matches a given regex.

The query is broken down into batches of 1000 documents, as our infrastructure does not support large queries being made on a Rails console.
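
The batching approach described above can be sketched roughly as follows. This is a minimal illustration in plain Ruby, not the code in this PR: `Doc`, `matching_content_ids` and the `body` field are hypothetical names, and `each_slice` stands in for ActiveRecord's `find_in_batches`, which the actual task uses against the database.

```ruby
# Hypothetical stand-in for a document with a published edition.
Doc = Struct.new(:content_id, :body)

# Returns the content IDs of documents whose body matches the pattern,
# scanning the collection in fixed-size batches rather than all at once.
# each_slice is an in-memory substitute for find_in_batches here.
def matching_content_ids(docs, pattern, batch_size: 1000)
  regex = Regexp.new(pattern)

  docs.each_slice(batch_size).flat_map do |batch|
    batch.select { |doc| doc.body.match?(regex) }.map(&:content_id)
  end
end

docs = [
  Doc.new("id-1", "Some text in the body"),
  Doc.new("id-2", "Unrelated content"),
]

puts matching_content_ids(docs, "Some text")
```

The batch size only changes how much is held in memory at once; the set of matches should be the same for any batch size.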

Trello card

@brucebolt brucebolt force-pushed the add-report-rake branch 2 times, most recently from 0f87fa2 to 3d370a9 October 23, 2023 12:35
end

test "it prints the content IDs of the matching documents from published editions" do
  assert_output(/#{@document_1.document.content_id}/) { Rake.application.invoke_task "reporting:matching_docs[Some text]" }
Contributor

I wondered whether these could avoid the regex, e.g. assert_output(@document_1.document.content_id) { Rake.application.invoke_task "reporting:matching_docs[Some text]" }, but I don't know if there is a way to do a "contains" match.

Member Author

Yeah, I couldn't find an equivalent method that didn't require a regex.
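
For reference, Minitest's assert_output compares a String argument for exact equality and only does a partial match when given a Regexp. One possible way to get a "contains" check without a regex is to capture the output with capture_io and use assert_includes. The sketch below is an assumption about how that could look, not something from this PR: `print_matching_ids` and `ContainsOutputTest` are hypothetical names standing in for the rake task and its test.

```ruby
require "minitest/autorun"
require "securerandom"

# Hypothetical stand-in for the rake task: prints one content ID per line.
def print_matching_ids(ids)
  ids.each { |id| puts id }
end

class ContainsOutputTest < Minitest::Test
  def test_output_contains_content_id
    content_id = SecureRandom.uuid

    # capture_io returns [stdout, stderr] written inside the block.
    out, _err = capture_io { print_matching_ids(["other-id", content_id]) }

    # Substring check on the captured stdout, no regex required.
    assert_includes out, content_id
  end
end
```

If the regex form is kept instead, wrapping an interpolated value in Regexp.escape guards against it containing regex metacharacters.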

task :matching_docs, [:regex] => :environment do |_, args|
  regex = Regexp.new(/#{args[:regex]}/)

  Document.where.not(live_edition_id: nil).find_in_batches(batch_size: 1000) do |batch|
Contributor

@jkempster34 jkempster34 Oct 23, 2023

I saw on the Trello card that you were investigating other tables. Are there none?

Member Author

I've constrained this PR to just the editionable documents. I was going to raise the scope of the spike in today's stand-up, as I've already hit the 2 days on the ticket.

@brucebolt brucebolt merged commit 450fd33 into main Oct 23, 2023
15 checks passed
@brucebolt brucebolt deleted the add-report-rake branch October 23, 2023 14:20