-
Notifications
You must be signed in to change notification settings - Fork 264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow regexes to contain ␀ byte #713
base: master
Are you sure you want to change the base?
Conversation
@rnpnr I really don’t know what to do with this one. Do we really care about binary files in |
I'm not sure. It definitely doesn't really work right now. As for this patch its not really intrusive and any time I can use strings with a length instead of the NUL terminated mistake I prefer it.
I think you probably need to use @lluchs if you are still interested in this I will look at any updates otherwise someone else can take this over. I don't think any of these changes should be too complicated. |
Thanks for having a look at this PR! I rebased it to current master.
The actual matching in the text is already null-byte-safe (see
I did that now and managed to make |
This allows entering patterns containing a null byte with /.
@@ -1590,8 +1589,7 @@ Regex *vis_regex(Vis *vis, const char *pattern) { | |||
text_regex_free(regex); | |||
return NULL; | |||
} | |||
if (len == 0) | |||
register_put0(vis, &vis->registers[VIS_REG_SEARCH], pattern); | |||
register_put(vis, &vis->registers[VIS_REG_SEARCH], pattern, len+1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The +1
here is not very intuitive. For some reason, register_get
gives us a length without the terminating null byte (which is still part of the buffer), but register_put
expects a length including the null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Lukas,
After reviewing again I think this looks good. I just left one other comment and then this can be merged!
vis-motions.c
Outdated
Regex *regex = vis_regex(vis, expr); | ||
Regex *regex = vis_regex(vis, expr, strlen(expr)); | ||
if (!regex) { | ||
snprintf(expr, sizeof(expr), "\\<%s\\>", buf); | ||
regex = vis_regex(vis, expr); | ||
regex = vis_regex(vis, expr, strlen(expr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These strlen
s can be avoided by saving the return value from snprintf
.
Edit: Check line 18 for the other snprintf
; github messed up my response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did that now by removing the ~500 character limit on the word search. That seemed to be better than handling the case where snprintf
truncates its output.
Words longer than ~500 bytes would previously result in a truncated search pattern.
As previously noted in #359, vis has some trouble with searching in binary files. I noticed that the optional TRE library does actually support NUL bytes in the regex, and started modifying vis to make use of that.
With this patch, it's possible to search for NUL bytes by yanking such a pattern into the "/ register and then using n/N. It's not possible yet to enter a pattern containing NUL bytes directly, as the command line works with zero-terminated strings. Actually entering the pattern works fine, but it discards everything after the NUL byte. I haven't figured out how to change that yet.