fix: Document#initialize should be called exactly once #2174
What problem is this PR intended to solve?
Originally this PR was started to address errors in Loofah's test suite on JRuby (see flavorjones/loofah#88) related to Nokogiri object decorators not being applied correctly. The root cause of these errors was that {XML,HTML}Document#initialize was not being called in the JRuby implementation. Surprising! And it breaks the subclassing behavior that Loofah relies on.

As I added tests to Nokogiri's suite to make this failure obvious, I uncovered the fact that in CRuby, Document#initialize was actually being called twice from the .parse method. Even more surprising! But it doesn't obviously break anything.

This PR addresses both of these issues, with the result that Document#initialize is called exactly once on all platforms.

Have you included adequate test coverage?
Yes! Thorough testing is introduced around subclassing XML::Document, HTML::Document, XML::DocumentFragment, and HTML::DocumentFragment constructor calls .new and .parse.

Does this change affect the behavior of either the C or the Java implementations?
Yes, but as noted above the changed behavior is now correct and consistent across the platforms.
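
For readers who want a feel for the kind of subclassing check described under test coverage, here is a minimal plain-Ruby sketch. It is not the code from this PR and does not use Nokogiri: OnceDocument, TwiceDocument, NeverDocument, and InitializeCounter are hypothetical stand-ins that model the three behaviors discussed above (the intended one, the CRuby double call, and the JRuby missing call).

```ruby
# Hypothetical helper: counts how many times #initialize runs on an instance.
module InitializeCounter
  def initialize(*args)
    super
    @initialize_calls = initialize_calls + 1
  end

  def initialize_calls
    @initialize_calls || 0
  end
end

# Stand-in for the intended behavior: .parse goes through .new exactly once.
class OnceDocument
  def self.parse(source)
    new(source)
  end

  def initialize(source)
    @source = source
  end
end

# Stand-in for the CRuby symptom: .parse runs #initialize a second time.
class TwiceDocument < OnceDocument
  def self.parse(source)
    doc = new(source)
    doc.send(:initialize, source) # the redundant second call
    doc
  end
end

# Stand-in for the JRuby symptom: .parse builds the object without #initialize.
class NeverDocument < OnceDocument
  def self.parse(source)
    doc = allocate # allocate skips #initialize entirely
    doc.instance_variable_set(:@source, source)
    doc
  end
end

# User-defined subclasses, as a library like Loofah might define them.
class CountedOnce < OnceDocument; include InitializeCounter; end
class CountedTwice < TwiceDocument; include InitializeCounter; end
class CountedNever < NeverDocument; include InitializeCounter; end

puts CountedOnce.parse("<root/>").initialize_calls  # prints 1
puts CountedTwice.parse("<root/>").initialize_calls # prints 2
puts CountedNever.parse("<root/>").initialize_calls # prints 0
puts CountedOnce.new("<root/>").initialize_calls    # prints 1
```

A subclass-side counter like this catches both failure modes with one assertion: anything other than 1 from either .new or .parse means the subclass's initialize is not being honored.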