Support POSIX compliant filenames in arc #145
Comments
Thanks @brianmajor. |
Is there a proposed solution to this issue? We should discuss it here before making any changes. As I recall, the issue is that in the VOSpace REST API filenames are part of a |
Hi Patrick, could you clarify if the VOSpace spec itself (not the uri-spec) prevents the use of some ascii characters in filenames? |
It is the URI spec that has reserved characters. In the above link there is language about encoding reserved characters to make valid URIs, so in principle that sounds like the right thing to do... the hard part will be in deciding where/when in the s/w we decode and encode. I haven't read it very carefully, so TBD. I should mention that there is a feature branch (vos2) in this repo with a more or less complete re-write of code that does this... so depending on how quickly we want to get this issue resolved, we might do it there only or we might have to make the changes in master and port them to vos2. TBD |
Here is some documentation on I'm not sure what language and framework are used on the backend, but in my experience all web frameworks have an equivalent decode function. Usually the encoding/decoding is done at the REST boundary: just before submitting the request and just after receiving it. One of the nice things about the URL encoding scheme is that it is normally safe to decode a string that was never encoded. Previously safe URI characters are simply decoded to themselves. As a result, clients can be migrated gradually to encoding things safely. |
That last thing is not quite true. Somebody at some point decided that encoding space with My reading of the URI spec above talked about hex encoding (eg Less common, but you can play with this easily at: https://meyerweb.com/eric/tools/dencoder/ |
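To make the encode/decode discussion above concrete, here is a minimal Python sketch of percent-encoding a filename at the client boundary and decoding it once on the other side. The helper names `encode_name` and `decode_name` and the sample filenames are hypothetical and not taken from the arc code base. It also shows why decoding a string that was never encoded is not always harmless.

```python
from urllib.parse import quote, unquote, unquote_plus


def encode_name(name: str) -> str:
    # Hypothetical helper: percent-encode everything except unreserved
    # characters. safe="" means "/" is encoded too, so this is meant for
    # a single filename, not a whole path.
    return quote(name, safe="")


def decode_name(encoded: str) -> str:
    # Plain percent-decoding: only %XX sequences are decoded, "+" is kept.
    return unquote(encoded)


name = "my file+v2 100%.txt"
encoded = encode_name(name)
round_tripped = decode_name(encoded)
print(encoded)
print(round_tripped == name)

# A filename that was never encoded but happens to contain "%20" and "+":
raw = "report%20final+v2.txt"
print(unquote(raw))       # the literal "%20" is turned into a space
print(unquote_plus(raw))  # form-style decoding also turns "+" into a space
```

The last two lines are the cases to watch for during a gradual client migration: a decoder cannot tell a literal `%20` or `+` in a POSIX filename from an encoded space, so the encode and decode steps have to be paired exactly once each.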
@pdowler the encode / decode should only happen exactly once at the REST API boundary? Another alternative would be base64 encoding which is entirely free of special characters like +. I’m happy to implement any workaround you suggest and submit a PR, please just point me in the right location. |
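A note on the base64 idea, as a small Python sketch with made-up sample bytes: the standard base64 alphabet itself contains `+` and `/`, so only the URL-safe variant (which uses `-` and `_` instead) avoids those characters. The `=` padding can still appear in both.

```python
import base64

# Sample bytes chosen only for illustration; they are not real filenames
# from the service.
sample = b"ab~"

standard = base64.b64encode(sample).decode("ascii")
url_safe = base64.urlsafe_b64encode(sample).decode("ascii")

print(standard)  # uses the standard alphabet, which includes "+" and "/"
print(url_safe)  # uses "-" and "_" in place of "+" and "/"

# Both variants decode back to the original bytes.
decoded = base64.urlsafe_b64decode(url_safe)
print(decoded == sample)
```

The trade-off compared with percent-encoding is that base64 names are no longer human-readable in URIs or XML documents.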
For URL params (eg the Other such uses that come to mind that are part of the standard: Custom endpoints we added to So those are easy because the REST API already decodes params exactly once and if the client(s) don't encode exactly once that's a problem (bug) in the client. The tricky bit for URIs with spaces is that the URIs are also written in XML documents and are parsed into a URI class in memory (java and python) and those classes enforce the URI spec (eg reject illegal chars). So the hard part of allowing space (eg) is retaining encoding whenever we go from string -> URI in code... seems complicated. I think we should discuss with Adrian (lead dev on the cadc python and pyvo vospace tools) and we'll need to figure out (at least for the server side java code) how to do this kind of change along with the |
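One way to picture the "retain encoding when going from string to URI" problem is the following Python sketch. It assumes a hypothetical authority `example.org~vault` and hypothetical helpers `to_node_uri` and `node_path`; it is not how the cadc Java or Python libraries do it. The idea is that the URI string stays percent-encoded wherever it is stored or written (including XML documents), and the path is decoded exactly once when it is handed to the layer that deals with real filenames.

```python
from urllib.parse import quote, unquote, urlsplit


def to_node_uri(authority: str, path: str) -> str:
    # Hypothetical helper: encode each path segment separately so that "/"
    # keeps its role as the segment separator.
    encoded = "/".join(quote(segment, safe="") for segment in path.split("/"))
    return f"vos://{authority}/{encoded}"


def node_path(uri: str) -> str:
    # Decode exactly once, at the point where the real filename is needed.
    return unquote(urlsplit(uri).path)


uri = to_node_uri("example.org~vault", "home/user/my file.txt")
print(uri)             # encoded form, contains no raw space
print(node_path(uri))  # decoded path as the filesystem layer would see it
```

Python's `urlsplit` is lenient about illegal characters, so this sketch does not reproduce the rejection behaviour of a strict URI class; it only illustrates where the single encode and single decode would sit.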
Support POSIX compliant filenames in the arc VOSpace service. Tools such as vcp and the web portal fail if they encounter files with unsupported characters, such as spaces.