Title as first element causes shell problems due to leading --
#396
Comments
I note a comment here #361 (comment) says
But it seems like you could determine what 'some-file' is from the variable |
And one more comment. in #332 all of the examples of title-first have no leading |
From: Ken Mankoff ***@***.***>
Date: Wed, 17 Jul 2024 22:24:53 -0700
> I'd like to use this `(setq denote-file-name-components-order '(title
> signature keywords))` as my file name scheme. I note that if
> identifier is not included or not first, it adds `@@` as a field
> separator, but does not include this if it is the first field.
Indeed, the identifier gets a delimiter when it is not the first
component. We did this to still make it possible to find it directly.
> I suggest when `title` is the first field, it should also drop the
> `--` separator. Currently, files are named, for example,
> `--foo@@20240717T222108.org`, which is a problem for a lot of bash
> shell commands.
Does it work if you quote the file names? I guess it can still be
tricky.
> Is it possible to remove the leading `--`?
We had discussed this during the development of this feature. I think it
makes sense, though we have to consider how best to do it. What if
someone wants the signature to be first: should it also drop its
delimiter? And, if so, how do we disambiguate them in a scenario like
this:
SIGNATURE@@DATE.org
TITLE@@DATE.org
Inside of Emacs we can rely on the denote-file-name-components-order to
determine the current preference, but even then we cannot know what the
previous preference was and if any files were created using that one.
For shell scripts, this gets trickier because the file name alone does
not tell us what the component is and then we need some other heuristic.
I would personally be inclined to make it so that the TITLE is the one
that can drop its delimiter, though I know there are people who would
like the SIGNATURE to do that, to have Luhmann-style names.
Overall, I am open to ideas. If we can have an elegant solution, I am
all for it.
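To illustrate the point about shell scripts: as long as every non-leading component keeps its delimiter, a script can locate the title from the file name alone. The following is a minimal sketch, assuming titles never contain `.`, `=`, `_`, or `@`. The function name `denote_title` is made up for this example and is not part of Denote.

```sh
# Hypothetical helper, not part of Denote: print the title component of a
# file name by anchoring on the "--" delimiter. Prints nothing when the
# name has no "--", which is the case where some other heuristic is needed.
denote_title() {
    printf '%s\n' "$1" | sed -n 's/.*--\([^.=_@]*\).*/\1/p'
}

denote_title '20240717T222108--foo__keyword.org'   # identifier first
denote_title '--foo@@20240717T222108.org'          # title first, delimiter kept
denote_title 'foo@@20240717T222108.org'            # no delimiter: no output
```

With a bare `foo@@20240717T222108.org`, the script cannot tell whether `foo` is a title or a signature unless it also knows which component order was in effect when the file was created.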
…--
Protesilaos Stavrou
https://protesilaos.com
|
No. Both `"bar"` and `'bar'` produce `--bar`.
You stress the importance of never changing it enough that one option would be to leave it to the user to adjust files if they change order. Or provide a convenience function to assist in renaming. If the old and new orders are provided, I think this is trivial. If I'm missing something about the complexity here, I vote for adding support for only TITLE. Seems fairly elegant to me.
I'm not even trying to do anything that complicated at the shell. Just `grep` fails. |
From: Ken Mankoff ***@***.***>
Date: Wed, 17 Jul 2024 22:59:39 -0700
> Does it work if you quote the file names?
No. Both `"bar"` and `'bar'` produce `--bar`.
Indeed, I tested it now. For relative paths, the "./" prefix seems to
work:
$ grep -E "test" ./--testing-the-merge@@20240626T212048__denote_hello_one_testing.org
> we cannot know what the previous preference was and if any files were
> created using that one.
You stress the importance of never changing it enough that one option
would be to leave it to the user to adjust files if they change order.
Or provide a convenience function to assist in renaming. If the old
and new orders are provided, I think this is trivial. If I'm missing
something about the complexity here, I vote for adding support for
only TITLE. Seems fairly elegant to me.
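As a rough idea of what such a renaming helper could look like for the TITLE-only case, here is a sketch in POSIX shell. It is an assumption for illustration, not a Denote command: the function name `strip_leading_title_delimiter` is invented, and it only handles files in a single directory.

```sh
# Hypothetical one-off migration, not a Denote command: drop the leading
# "--" from every file in directory $1. Files whose new name already
# exists are skipped rather than overwritten.
strip_leading_title_delimiter() {
    for f in "$1"/--*; do
        [ -e "$f" ] || continue      # the glob matched nothing
        base=${f##*/}                # file name without the directory
        new="$1/${base#--}"          # same name minus the leading "--"
        [ -e "$new" ] || mv -- "$f" "$new"
    done
}
```

This covers only the move to a delimiter-free leading title. Converting between two arbitrary orders would need both the old and the new order as input, as noted above.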
One thing I forgot to mention is how all this affects the Denote
functions that read the file name to find the relevant component.
Whatever decision we make here will need to be reflected there, so
hopefully we do not make it too complex.
> For shell scripts, this gets trickier because the file name alone does
> not tell us what the component is and then we need some other
> heuristic.
I'm not even trying to do anything that complicated at the shell. Just
`grep` fails.
Hopefully the "./" is an option for you. Otherwise, you have to rely on
absolute file system paths.
…--
Protesilaos Stavrou
https://protesilaos.com
|
> Does it work if you quote the file names?
I (obviously) misunderstood the question. I tried to quote the title when creating the note, not the filename when accessing.
Yes, most (all?) bash commands have options to work with leading `-`. Either a `\` or `--` to denote end-of-args, quotes, leading path elements, etc.
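A minimal sketch of the workarounds mentioned in this thread, assuming a POSIX shell and a scratch directory. The file contents are made up for illustration. Each command passes a file whose name begins with `--` to `grep` without the name being parsed as an option:

```sh
# Set up a scratch directory with one file named in the title-first scheme.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'testing the merge\n' > ./--foo@@20240717T222108.org

# 1. Prefix the relative path with "./" so the argument does not start with "-".
grep -c "test" ./--foo@@20240717T222108.org

# 2. Use "--" to mark the end of options; what follows is pattern and file.
grep -c -- "test" --foo@@20240717T222108.org

# 3. Use an absolute path.
grep -c "test" "$tmpdir/--foo@@20240717T222108.org"
```

Putting `--` before the pattern keeps the second form valid even for implementations that stop option parsing at the first operand.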
> One thing I forgot to mention is how all this affects the Denote
> functions that read the file name to find the relevant component.
> Whatever decision we make here will need to be reflected there, so
> hopefully we do not make it too complex.
Yes.
I mostly work in Emacs where this is only an aesthetic issue. It just seems... inelegant to see the leading `--` in a dired listing and elsewhere in Emacs, and it adds complication outside of Emacs. But it does work as-is.
It's very good software. Thank you.
|
I'm not familiar with all the technicalities of the naming convention but here is a suggestion that could help solve this (or related) issues. The approach is to create the more complex regular expressions using rx from basic patterns like `denote-id-regexp`.
(setq file "20240802T184947--example-file-name__keyword.org")
(setq file2 "--example-file-name__keyword@@20240802T184947.org")
(setq file3 "example-file-name__keyword@@20240802T184947.org")
;; denote-title-text-regexp recreated in rx.
(setq test-denote-title-regexp
      (rx (seq (literal "--")
               (group (regexp "[^.]*?"))
               (or (regexp "==.*")
                   (regexp "__.*")
                   (seq (literal "@@")
                        (regexp denote-id-regexp))))))
;; version that captures the title
(setq test-denote-title-regexp2
      (rx (or (seq (zero-or-one (literal "--"))
                   (group (regexp "[^.]*?"))
                   (zero-or-one (regexp "==.*"))
                   (zero-or-one (regexp "__.*"))
                   (seq (literal "@@")
                        (regexp denote-id-regexp)))
              (seq (literal "--")
                   (group-n 1 (regexp "[^.]*?"))
                   (or (regexp "==.*")
                       (regexp "__.*")
                       (seq (literal "@@")
                            (regexp denote-id-regexp)))))))
(and (string-match test-denote-title-regexp2 file3)
     (match-string-no-properties 1 file3))
(and (string-match test-denote-title-regexp2 file2)
     (match-string-no-properties 1 file2))
(and (string-match test-denote-title-regexp2 file)
     (match-string-no-properties 1 file))
;; `benchmark-run' is a macro, so the form below is what gets timed.
;; (The function `benchmark' would need the form to be quoted.)
;; same performance between denote-title-regexp and rx version
(benchmark-run 10000
  (and (string-match test-denote-title-regexp file)
       (match-string-no-properties 1 file)))
(benchmark-run 10000
  (and (string-match denote-title-regexp file)
       (match-string-no-properties 1 file)))
;; No noticeable performance difference for the regex that captures the leading title
(benchmark-run 10000
  (and (string-match test-denote-title-regexp2 file3)
       (match-string-no-properties 1 file3)))
|
From: Mirko Hernández ***@***.***>
Date: Fri, 2 Aug 2024 16:21:40 -0700
I'm not familiar with all the technicalities of the naming convention
but here is a suggestion that could help solve this (or related)
issues. The approach is to create the more complex regular expressions
using rx from basic patterns like `denote-id-regexp`.
[... 64 lines elided]
I have not tried this yet. Can you tell me what difference it makes?
I am asking because I cannot tell just by looking at the code and I am
not familiar with 'rx' (it is a big macro with its own language).
From what I understand, 'rx' is a way to write regular expressions in a
more Lispy way than some long string. But the end result should always
be the same, right?
…--
Protesilaos Stavrou
https://protesilaos.com
|
Yes, exactly.
It allows the composition of regular expressions from basic patterns. Since the new denote file name convention allows many combinations of valid file names I thought it would be useful to specify these using rx. A secondary benefit is that many different regular expressions can be benchmarked programmatically. |
From: Mirko Hernández ***@***.***>
Date: Mon, 5 Aug 2024 10:15:56 -0700
> From what I understand, 'rx' is a way to write regular expressions in a
> more Lispy way than some long string. But the end result should always
> be the same, right?
Yes, exactly.
Good to know!
> I have not tried this yet. Can you tell me what difference it makes?
It allows the composition of regular expressions from basic patterns.
Since the new denote file name convention allows many combinations of
valid file names I thought it would be useful to specify these using
rx.
Indeed. Then we can also write more tests for it.
A secondary benefit is that many different regular expressions can be
benchmarked programmatically.
This is a nice extra.
Now the blocker is that I must learn 'rx'...
* * *
On the point of this issue though, 'rx' will not change the status quo,
meaning that users will still need to escape a leading "-" in file names.
…--
Protesilaos Stavrou
https://protesilaos.com
|
A clarification on the rx example. It makes it easy to specify "conditions" in regular expressions. The following example matches 3 possible positions for the title (leading title, leading '--' then the title, '--' and title after some other construct). This would have to be repeated for the other components, although basic patterns could be written as variables ("==.*", "__.*"). I don't understand why the regular expression approach is not enough. Let's say there is a leading signature, then the title will have a leading '--'; if there is a leading title, the signature will have a leading '=='. It seems to me that a complex regex can match all these examples.
(setq test-denote-title-regexp2
      (rx (or (seq (zero-or-one (literal "--"))
                   (group (regexp "[^.]*?"))
                   (zero-or-one (regexp "==.*"))
                   (zero-or-one (regexp "__.*"))
                   (seq (literal "@@")
                        (regexp denote-id-regexp)))
              (seq (literal "--")
                   (group-n 1 (regexp "[^.]*?"))
                   (or (regexp "==.*")
                       (regexp "__.*")
                       (seq (literal "@@")
                            (regexp denote-id-regexp))))))) |
Side tip: most shell commands accept an |
By the way, I think this is a good example towards what @jeanphilippegg said in an earlier discussion:
|
@MirkoHernandez re:
Titles can contain nested
So if signatures are allowed to drop leading identifier, there's an ambiguity:
(We *could* look inside the file's frontmatter to disambiguate, but some of us use Denoted naming for things that don't have frontmatter - PDFs, pictures, media, directories, ...; and one of the big benefits of denoted naming so far is that code doesn't need to open the file to know which part of the name is what.)
After surmounting that problem, we still have ambiguities - given `foo.md`, is "foo" a title, signature, or even ID (if ID is allowed to be anything other than ISO8601 datetime - I don't remember if that's already implemented but it has been discussed favorably). |
From: mentalisttraceur ***@***.***>
Date: Mon, 9 Sep 2024 07:47:47 -0700
[... 9 lines elided]
(We *could* look inside the file's frontmatter to disambiguate, but
some of us use Denoted naming for things that don't have frontmatter -
PDFs, pictures, media, directories, ...; and one of the big benefits
of denoted naming so far is that code doesn't need to open the file to
know which part of the name is what.)
A nice thing about not relying on file contents is that we are not tied
to Emacs. The files can be used with 'find' while retaining their
semantics. If we drop delimiters, then we make this longer-term
benefit/portability harder to retain.
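As an illustration of that portability, here is a sketch of a keyword search that relies only on `find` and the file name delimiters. The helper name `denote_find_keyword` is hypothetical and not part of Denote, and the pattern assumes that a keyword is preceded by `_` and followed by another delimiter character or the file extension.

```sh
# Hypothetical helper, not part of Denote: list files under directory $1
# that carry keyword $2, judging only by the file name. The bracket
# expression requires a delimiter or "." right after the keyword, so
# "hello" does not match a longer keyword such as "helloworld".
denote_find_keyword() {
    find "$1" -type f -name "*_$2[_.@=-]*" | sort
}
```

For example, `denote_find_keyword ~/notes denote` would list `--testing-the-merge@@20240626T212048__denote_hello_one_testing.org` if that file lives under `~/notes`, without opening any file. The directory `~/notes` is only a placeholder.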
After surmounting that problem, we still have ambiguities - given
`foo.md`, is "foo" a title, signature, or even ID (if ID is allowed to
be anything other than ISO8601 datetime - I don't remember if that's
already implemented but it has been discussed favorably).
I have no problem with different patterns for identifiers. It is all a
matter of (i) keeping the defaults, (ii) having something we can
maintain, and (iii) not creating other complications with file names
(e.g. by using some of the characters we employ elsewhere).
…--
Protesilaos Stavrou
https://protesilaos.com
|
💡 idea!
|
To keep things simple, this is the rule I would implement:
Titles can drop their delimiter as first component, identifiers can become any string, there is no ambiguity and we remain backward-compatible. But, as you said, once it is done, we cannot allow signatures/keywords to drop their delimiter, as "foo.org" would be ambiguous. (I did not mention the case of someone who wants to make a title that looks like "20240505T050505". I would ignore this case. It may not even be worth mentioning as a limitation...) |
I'd like to use this `(setq denote-file-name-components-order '(title signature keywords))` as my file name scheme. I note that if identifier is not included or not first, it adds `@@` as a field separator, but does not include this if it is the first field.
I suggest when `title` is the first field, it should also drop the `--` separator. Currently, files are named, for example, `--foo@@20240717T222108.org`, which is a problem for a lot of bash shell commands.
Is it possible to remove the leading `--`?