Filter data by draft status #124
# Some parts (`ldml`, `ldmlBCP47` and `supplementalData`) of CLDR data require that you merge all the
# files with the same root element before doing lookups.
# Ref: https://www.unicode.org/reports/tr35/tr35.html#XML_Format
#
# The return of this method is a merged XML Nokogiri document.
# Note that it technically is no longer compliant with the CLDR `ldml.dtd`, since:
# * it has repeated elements
# * the <identity> elements no longer refer to the filename
#
# However, this is not an issue, since #select will find all of the matches from each of the repeated elements,
# and the <identity> elements are not important to us / make no sense when combined together.
Most of this comment has been moved to `DataFile#merge`.
The method itself has been rewritten to use `reduce`, which is cleaner IMO.
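As a rough illustration of the merge-with-`reduce` idea, here is a minimal sketch. It assumes REXML as a stand-in for Nokogiri, and `merge_documents` plus the sample XML are hypothetical names and data, not the PR's actual `DataFile#merge`.

```ruby
require "rexml/document"

# Hypothetical helper: fold several documents that share a root element into
# one by appending a copy of every child element to the first document's root.
def merge_documents(xml_strings)
  docs = xml_strings.map { |xml| REXML::Document.new(xml) }
  docs.reduce do |merged, doc|
    # deep_clone keeps the source document untouched while we copy its children.
    doc.root.elements.each { |child| merged.root.add_element(child.deep_clone) }
    merged
  end
end

# Made-up sample data with the same root element in both "files".
MERGED = merge_documents([
  "<ldml><identity><language type='en'/></identity><numbers><symbol>a</symbol></numbers></ldml>",
  "<ldml><identity><language type='el'/></identity><numbers><symbol>b</symbol></numbers></ldml>",
])

# The merged document has repeated elements, but an XPath lookup still finds
# the matches inside each of them.
SYMBOLS = REXML::XPath.match(MERGED, "//numbers/symbol").map(&:text)
IDENTITY_COUNT = REXML::XPath.match(MERGED, "/ldml/identity").size
```

The repeated `<identity>` and `<numbers>` elements are exactly the DTD violation the comment mentions; lookups are unaffected because the query walks every repeated element.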
Looks good!
def export(options = {}, &block)
  locales = options[:locales] || Data.locales
  components = options[:components] || Data.components
  self.minimum_draft_status = options[:minimum_draft_status] if options[:minimum_draft_status]
I'm wondering if this should set a default if not provided, since it will throw an exception if not set.
I think I prefer for it to explode, since it indicates that something is not working as expected. 🤷
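A minimal sketch of the "explode when unset" behaviour being discussed, assuming a module-level accessor. `ExportSketch` and its `Error` class are hypothetical names, not the PR's actual `Cldr::Export` code.

```ruby
# Hypothetical stand-in for a module-level setting with no default.
module ExportSketch
  class Error < StandardError; end

  class << self
    attr_writer :minimum_draft_status

    # Reading the setting before anyone has assigned it raises, so a missing
    # configuration step surfaces immediately instead of silently defaulting.
    def minimum_draft_status
      raise Error, "minimum_draft_status is not yet set" if @minimum_draft_status.nil?

      @minimum_draft_status
    end
  end
end
```

The trade-off is the one raised in the thread: a default is more convenient for callers, while raising makes an unconfigured export fail loudly.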
@@ -25,6 +25,10 @@ def self.test(name, &block)
      end
    end
  end

  def setup
    Cldr::Export.minimum_draft_status = Cldr::DraftStatus::CONTRIBUTED
You might want to not set the class variable explicitly by default, so you can see when an exception is thrown in tests when it's being unexpectedly used before it's been set.
You could set it explicitly for tests that expect it to already be set, or mock it and explicitly assert that it was used.
I think I want it to default for everything except for the tests that exercise the draft status filtering code. I've broken the tests in `data_file_test.rb` into two `TestCase`s, and I override the defaulting in the one that is testing the draft status filtering. 👍
Provides a view into the data that is filtered by draft level
It is already handled by `DataFile`
What are you trying to accomplish?

Fixes #73.
Throughout `ruby-cldr`, there were `draft?` calls in some places, but not others.

What approach did you choose and why?

I added a new `--draft-status` CLI option that allows users to specify the minimum draft status that they want all exported data to have.

Instead of each area of the codebase needing to know to check `draft?` all the time, all access to the data is done through a new `DataFile` class, which filters the data by draft status transparently. This ensures that we are not missing places where we need to be doing the filtering.

The default minimum draft status of `contributed` matches the status needed for inclusion into Unicode's ICU (and consequently what `cldr-json` exports).

What should reviewers focus on?

- `DraftStatus` is effectively an enum. Is there a better way to define these in pure Ruby?
- `minimum_draft_status`. It was a lot nicer than passing a new `minimum_draft_status` parameter around everywhere. I feel mostly OK about it.

There is a similar issue with the `alt` attribute, but that will be handled as part of #125.

The impact of these changes

Users can now control the minimum draft status that exported data must have.
Less data is exported as you increase your minimum draft status:
- `main`
- `--draft-status=unconfirmed`
- `--draft-status=provisional`
- `--draft-status=contributed`
- `--draft-status=approved`

(Computed using `find . -type f -exec ls -l {} \; | awk '{sum += $5} END {print sum}'`, since macOS' `du` command doesn't have a bytes flag)

Testing

Find an entry in the `vendor` directory that has `draft=provisional`, for example: `vendor/cldr/common/main/el.xml`:

Export with different minimum draft statuses and compare the results:

And see that the diffs are the `draft=provisional` entries.
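
For readers who want a feel for the transparent filtering described in the approach section, here is a minimal sketch of an ordered draft-status enum and a filter built on it. `DraftStatusSketch`, `Entry`, `filter_by_draft`, and the sample entries are all hypothetical and are not the PR's `DraftStatus` or `DataFile` code. It assumes the CLDR ordering unconfirmed < provisional < contributed < approved, with a missing `draft` attribute treated as approved.

```ruby
# Hypothetical ordered enum: Comparable gives us >=, <, etc. from <=>.
class DraftStatusSketch
  include Comparable
  attr_reader :name, :rank

  def initialize(name, rank)
    @name = name
    @rank = rank
  end

  def <=>(other)
    rank <=> other.rank
  end

  # Lowest to highest confidence.
  ALL = %i[unconfirmed provisional contributed approved]
        .each_with_index.map { |name, rank| new(name, rank) }.freeze

  # A missing draft attribute (nil) is treated as approved.
  def self.fetch(name)
    key = (name || :approved).to_sym
    found = ALL.find { |status| status.name == key }
    raise ArgumentError, "unknown draft status: #{name}" if found.nil?

    found
  end
end

# Made-up data: each entry carries the draft attribute it had in the XML.
Entry = Struct.new(:key, :draft)

# Keep only entries whose draft status is at least the requested minimum.
def filter_by_draft(entries, minimum)
  floor = DraftStatusSketch.fetch(minimum)
  entries.select { |entry| DraftStatusSketch.fetch(entry.draft) >= floor }
end

ENTRIES = [
  Entry.new("a", nil),
  Entry.new("b", "contributed"),
  Entry.new("c", "provisional"),
  Entry.new("d", "unconfirmed"),
].freeze
```

Raising the minimum shrinks the result, which mirrors the export-size trend noted under the impact section: `filter_by_draft(ENTRIES, :unconfirmed)` keeps every entry, while `filter_by_draft(ENTRIES, :contributed)` drops the provisional and unconfirmed ones.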