Hello,
I've noticed that the current regex used to extract the Content-Type does not cover all content types (for example, subtypes containing `-` or `.`, which `\w` does not match).
Current regex: https://github.com/laluka/bypass-url-parser/blob/main/src/bypass_url_parser/__init__.py#L1495
REGEX_CONTENT_TYPE = re.compile(r"Content-Type:\s+(\w+/\w+)", re.IGNORECASE)
This can be reproduced on regex101.com too.
This leads to empty content types in the output.
Also, curl sometimes returns header names in a different case (for example, lowercase `content-type`), so to handle all of these cases, I suggest the following regex:
(?i)content-type:\s*([-\w.]+/[-\w.]+(?:\s*;\s*[\w-]+=(?:\"[^\"]*\"|[^\s;]*))*)
So, the code would be:
REGEX_CONTENT_TYPE = re.compile(r'(?i)content-type:\s*([-\w.]+/[-\w.]+(?:\s*;\s*[\w-]+=(?:\"[^\"]*\"|[^\s;]*))*)')
Tested on regex101 as well.
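To make the difference easier to check locally, here is a minimal sketch that runs both patterns against a few header lines. The `extract` helper and the sample headers are my own illustration, not code from the project; only the two regexes are copied from above.

```python
import re

# Pattern currently used in the project (copied from the issue text).
CURRENT = re.compile(r"Content-Type:\s+(\w+/\w+)", re.IGNORECASE)

# Pattern proposed in this issue.
PROPOSED = re.compile(
    r'(?i)content-type:\s*([-\w.]+/[-\w.]+(?:\s*;\s*[\w-]+=(?:\"[^\"]*\"|[^\s;]*))*)'
)


def extract(pattern, header_line):
    """Hypothetical helper: return the captured content type, or "" if no match."""
    match = pattern.search(header_line)
    return match.group(1) if match else ""


# Illustrative header lines (assumed, not taken from real tool output).
SAMPLES = [
    "Content-Type: text/html",
    "content-type:text/html; charset=utf-8",
    "Content-Type: application/x-www-form-urlencoded",
    "Content-Type: application/vnd.ms-excel",
]

for line in SAMPLES:
    print(f"{line!r}: current={extract(CURRENT, line)!r} "
          f"proposed={extract(PROPOSED, line)!r}")
```

With these samples, the current pattern returns an empty string when there is no space after the colon and cuts `application/x-www-form-urlencoded` down to `application/x`, while the proposed pattern returns the full value including parameters such as `charset=utf-8`. Note that types with a `+` suffix such as `application/ld+json` would still be cut at the `+` by both patterns; adding `+` to the two character classes would be one way to cover them.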
Nice catch, I'll have a look with @jtof-fap when we find some free time! 🌹