docs(adr): ADR no. 11 Handling FQDNs #1184
Merged
@@ -0,0 +1,30 @@
# 11) Handling FQDNs {#adr_0011}

<!--
Don't forget to update the TOC in index.md when adding a new record
-->

Date: 2023-10-04

## Status

proposed

## Context
(FQDN = Fully qualified domain name)

Wikibase.cloud allows its users to create wikis on subdomains of `wikibase.cloud` or to use their own domain names. In both cases, the resulting FQDN is stored in the MariaDB database as-is. In August 2023 it became apparent that FQDNs containing special (= non-ASCII) characters cause trouble in the system [1]; one problem is that k8s only accepts hostnames that conform to RFC 1123 [2][3].

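To make the k8s constraint concrete, here is a minimal sketch that checks an FQDN against the pattern quoted in the validation message in [3]. The helper name `is_rfc1123_subdomain` is hypothetical and not part of the platform; Python is used only for illustration.

```python
import re

# Pattern copied from the Kubernetes validation message quoted in [3].
RFC1123_SUBDOMAIN = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def is_rfc1123_subdomain(fqdn: str) -> bool:
    """Hypothetical helper: True if fqdn passes the pattern from [3]."""
    # Kubernetes additionally limits such names to 253 characters.
    return len(fqdn) <= 253 and RFC1123_SUBDOMAIN.fullmatch(fqdn) is not None
```

With this check, `is_rfc1123_subdomain("então.carolinadoran.com")` is `False` because of the `ã`, while `is_rfc1123_subdomain("example.com")` is `True`.
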
## Decision

To circumvent current and future trouble with non-ASCII domain names, the name is encoded to Punycode [4] (an encoding that represents Unicode in ASCII) from the moment the system receives it during creation of a wiki, and it is handled only in that format internally. As soon as the value leaves the internal API, it is decoded to its original Unicode representation.

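The encode-on-entry, decode-on-exit rule could look roughly like the sketch below. It assumes Python's built-in `idna` codec (which implements IDNA 2003); the function names `to_internal` and `to_external` are hypothetical, and the platform's actual API may use a different language or library.

```python
def to_internal(fqdn: str) -> str:
    """Encode a Unicode FQDN to its ASCII (Punycode) form for internal use."""
    # The idna codec encodes each label separately and leaves ASCII labels as-is.
    return fqdn.encode("idna").decode("ascii")


def to_external(fqdn: str) -> str:
    """Decode an internal ASCII FQDN back to Unicode at the API boundary."""
    return fqdn.encode("ascii").decode("idna")
```

A name such as `então.carolinadoran.com` is stored with an `xn--` prefixed first label and decodes back to the original on the way out; a plain ASCII name such as `example.wikibase.cloud` passes through both functions unchanged.
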
## Consequences

- An ASCII-only representation like Punycode should fix the current issues and avoid further trouble with special characters in FQDNs
- Existing values in the database need to be converted

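The conversion of existing values might be sketched as follows. This is an illustration under assumptions: a plain list stands in for the stored FQDNs, since the actual MariaDB table and column names are not given here, and `convert_existing` is a hypothetical helper built on Python's `idna` codec.

```python
def convert_existing(domains: list[str]) -> dict[str, str]:
    """Map each stored FQDN that needs converting to its ASCII form."""
    changes = {}
    for domain in domains:
        encoded = domain.encode("idna").decode("ascii")
        # Values that are already ASCII come back unchanged and are skipped.
        if encoded != domain:
            changes[domain] = encoded
    return changes
```

Because already-encoded values are skipped, running the conversion a second time yields no further changes.
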
- [1] - https://phabricator.wikimedia.org/T345139
- [2] - https://www.rfc-editor.org/rfc/rfc1123
- [3] - `"message": "Invalid value: \"então.carolinadoran.com\": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')"`
- [4] - https://en.wikipedia.org/wiki/Punycode
What are all the places where the domain names get used? How happy are MediaWiki, QS, etc. with non-ASCII/Punycode FQDNs? Elasticsearch?
I now remember testing Punycode for the k8s ingress only; MediaWiki wasn't happy about that, and I assume similar results for other services. What I'm a bit worried about are the implications for Wikibase and, for example, the results in the Query Service. I can't tell whether this would be problematic or actually be exactly the right thing to do.