Fix attribute encoding when using Shibboleth #102
RFC 2616 states that HTTP headers are encoded in Latin-1 (ISO-8859-1), and the Python/Django request.META (correctly) assumes that incoming headers will be encoded in this way.
However, by default, Shibboleth ignores the ISO-8859-1 restriction: with ShibUseHeaders it puts the UTF-8 encoded values from SAML into its request headers without transliteration [ref]. This results in incorrectly encoded characters when non-ASCII or accented characters are used, e.g. in the first or last name.
There are two ways we could fix this. The approach used here is to simply acknowledge the incorrect encoding and fix it, i.e. force the string to be interpreted as UTF-8 rather than Latin-1. This is backwards compatible and will be invisible to any sites that don't already have incorrectly encoded names.
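As a rough illustration of that approach (not the exact code in this PR), the fix amounts to undoing the Latin-1 decode and decoding the same bytes as UTF-8. The helper name `fix_shib_encoding` and the sample name are hypothetical; the fallback to the original value on failure is an assumption about how to keep correctly encoded values untouched.

```python
def fix_shib_encoding(value: str) -> str:
    """Reinterpret a header value that was decoded as Latin-1 as UTF-8.

    Hypothetical helper: if the round trip fails, the value was not
    UTF-8 bytes read as Latin-1, so it is returned unchanged.
    """
    try:
        # Recover the raw header bytes, then decode them as UTF-8.
        return value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return value


# Simulate what happens today: UTF-8 bytes from Shibboleth are
# decoded as Latin-1 before they reach request.META.
raw_header = "Zoë Müller".encode("utf-8")
mangled = raw_header.decode("latin-1")

fixed = fix_shib_encoding(mangled)
```

Plain ASCII values encode to the same bytes in both encodings, so they pass through this round trip unchanged.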
The alternative would be to make use of Shibboleth's ShibRequestSetting encoding URL option in the Apache config to force Shibboleth to URL encode the string. We would then have to decode it when we consumed it. This approach is arguably more correct since the headers would be RFC compliant, but involves much more work and requires users change their webserver config. It's not backwards compatible.
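For comparison, the consuming side of the alternative might look like the sketch below. It assumes Shibboleth would percent-encode the UTF-8 bytes of each attribute value; the helper name `decode_shib_header` and the sample value are hypothetical, and this is not code from this PR.

```python
from urllib.parse import unquote


def decode_shib_header(value: str) -> str:
    """Decode a percent-encoded attribute value into text.

    Hypothetical helper: assumes the header carries percent-encoded
    UTF-8 bytes, which is pure ASCII on the wire.
    """
    return unquote(value, encoding="utf-8")


decoded = decode_shib_header("Zo%C3%AB%20M%C3%BCller")
```

A site that deployed this decoding without also changing its Apache config would still receive raw UTF-8 headers, which is why this route is not backwards compatible.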