Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CSSSanitizer to sanitize_html #731

Merged
merged 2 commits into from
Oct 28, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 5 additions & 3 deletions license_manager/apps/subscriptions/sanitize.py
Original file line number Diff line number Diff line change
@@ -1,22 +1,24 @@
import bleach

from bleach.css_sanitizer import CSSSanitizer

def sanitize_html(html_content):
"""
Sanitize HTML content to allow only safe tags and attributes,
while disallowing JavaScript and unsafe protocols.
"""
# Define allowed tags and attributes
allowed_tags = bleach.ALLOWED_TAGS # Allow all standard HTML tags
allowed_tags = set.union(bleach.ALLOWED_TAGS, set({"span"})) # Allow all standard HTML tags
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I believe {"span"} is equivalent to set({"span"}).

allowed_attrs = {"*": ["className", "class", "style", "id"]}
Copy link
Member

@adamstankiewicz adamstankiewicz Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[sanity check] className is from JSX. When React renders JSX, it converts to class in the resulting HTML. I'm not sure including the JSX className attribute in this HTML field would have an effect, as class does, when it renders. The dangerouslySetInnerHTML assumes the input is raw HTML; it would not transpile JSX -> HTML, afaik.

css_sanitizer = CSSSanitizer(allowed_css_properties=["color", "font-weight"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[suggestion] I wonder if it's worth supporting the default bleach css properties, similar to extending bleach.ALLOWED_TAGS. Looks like not supplying allowed_css_properties would fallback to the defaults.

[suggestion x2] Regardless, for styling, I might recommend considering relying on the Paragon CSS utility classes vs. hardcoding any colors, font-weights, etc. (e.g., font-weight-bold, text-brand, etc.) within the custom HTML.


# Clean the HTML content
sanitized_content = bleach.clean(
html_content,
tags=allowed_tags,
attributes=allowed_attrs,
strip=True, # Strip disallowed tags completely
protocols=["http", "https"], # Only allow http and https URLs
protocols=["http", "https"], # Only allow http and https URLs,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stand corrected, the docs are stating a list, despite examples.
Screenshot 2024-10-28 at 3 04 18 PM

css_sanitizer=css_sanitizer,
)

# Use bleach.linkify to ensure no javascript: links in <a> tags
Expand Down
Loading