From 7a7a93eb1b780ad49e44793ff73c62aeb520c599 Mon Sep 17 00:00:00 2001
From: Karl Williamson
Date: Mon, 22 Jul 2024 14:33:53 -0600
Subject: [PATCH] mv hacktips text to clib

perlclib is the place where the libc and perl interfaces are most
extensively documented. This commit removes redundant text from
perlhacktips, and moves the non-redundant parts to perlclib.
---
 pod/perlclib.pod     |  31 +++++++++---
 pod/perlhacktips.pod | 110 ++++++-------------------------------------
 2 files changed, 40 insertions(+), 101 deletions(-)

diff --git a/pod/perlclib.pod b/pod/perlclib.pod
index 1351cd30d6a4..a6c22f2886c9 100644
--- a/pod/perlclib.pod
+++ b/pod/perlclib.pod
@@ -138,7 +138,8 @@ different from their C library counterparts:
 
     fputs(s, stream)            PerlIO_puts(perlio, s)
 
-There is no equivalent to C; one should use C instead:
+There is no equivalent to C (or the deprecated C); one
+should use C instead:
 
     fgets(s, n, stream)         sv_gets(sv, perlio, append)
 
@@ -163,6 +164,10 @@ There is no equivalent to C; one should use C instead:
     t* p = malloc(n)                Newx(p, n, t)
     t* p = calloc(n, s)             Newxz(p, n, t)
     p = realloc(p, n)               Renew(p, n, t)
+
+It is not portable to try to allocate 0 bytes; allocating 1 or more is
+portable.
+
     memcpy(dst, src, n)             Copy(src, dst, n, t)
     memmove(dst, src, n)            Move(src, dst, n, t)
     memcpy(dst, src, sizeof(t))     StructCopy(src, dst, t)
@@ -218,11 +223,24 @@ If you do need raw strings, use these instead:
 Similiarly, you can use SVs for creating strings from formats
 
     sprintf(s, fmt, ...)            sv_setpvf(sv, fmt, ...)
+    vsprintf(str, fmt, va_list)     sv_vsetpvf(sv, fmt, va_list)
+
+Or for raw strings,
+    my_snprintf(dt, len, fmt, ...)
+    my_vsnprintf(dt, len, fmt, va_list)
+
     vsprintf(str, fmt, va_list)     sv_vsnprintf(sv, fmt, va_list)
 
 Note also the existence of C and C, combining
-concatenation with formatting.
+concatenation with formatting; and L()|perlapi/form> for
+another form of formatted populating.
+Note that glibc C, C, etc. are buggy before glibc
+version 2.17. They won't allow a C<%.s> format with a precision to
+create a string that isn't valid UTF-8 if the current underlying locale
+of the program is UTF-8. What happens is that the C<%s> and its
+operand are simply skipped without any notice.
+L.
 
 =head2 Character Class Tests
 
@@ -293,7 +311,8 @@ for collation.
     strtol(s, &p, n)            Strtol(s, &p, b)
     strtoul(s, &p, n)           Strtoul(s, &p, b)
 
-But note that these are subject to locale; see L.
+But note that even the alternative functions are subject to locale; see
+L.
 
 Typical use is to do range checks on C before casting:
 
@@ -386,9 +405,9 @@ think you do, use the C stack in F instead.
    strftime()        Perl_sv_strftime_tm()
    strtod()          my_strtod() or Strtod()
    system(s)         Don't. Look at pp_system or use my_popen.
-  ~tempnam()         mkstemp() or tmpfile()
-  ~tmpnam()          mkstemp() or tmpfile()
-   tmpnam_r()        mkstemp() or tmpfile()
+  ~tempnam()         mkstemp()
+  ~tmpnam()          mkstemp()
+   tmpnam_r()        mkstemp()
    uselocale()       Perl_setlocale()
    vsnprintf()       my_vsnprintf()
    wctob()           wcrtomb()
diff --git a/pod/perlhacktips.pod b/pod/perlhacktips.pod
index 2e206cfab80e..0896136341e5 100644
--- a/pod/perlhacktips.pod
+++ b/pod/perlhacktips.pod
@@ -992,30 +992,28 @@ L.
 Assuming the contents of static memory pointed to by the return values
 of Perl wrappers for C library functions doesn't change. Many C library
 functions return pointers to static storage that can be
-overwritten by subsequent calls to the same or related functions. Perl
-has wrappers for some of these functions. Originally many of those
-wrappers returned those volatile pointers. But over time almost all of
-them have evolved to return stable copies. To cope with the remaining
-ones, do a L to make a copy, thus avoiding these
-problems. You will have to free the copy when you're done to avoid
-memory leaks. If you don't have control over when it gets freed,
-you'll need to make the copy in a mortal scalar, like so
+overwritten by subsequent calls to the same or related functions. If
+you handle those returns before one of those functions that share the
+storage gets called, this is fine, but in embedded perls, or when using
+threads, such a function may get called before you get a chance to
+handle it.
 
-    SvPVX(sv_2mortal(newSVpv(volatile_string, 0)))
+L contains a list of
+problematic functions with good advice as to how to cope with them.
 
 =back
 
 =head2 Problematic System Interfaces
 
-=over 4
-
-=item *
+There are lots of issues with using various C library functions,
+including security ones. You should read L which covers
+things in detail.
 
-Perl strings are NOT the same as C strings: They may contain C
-characters, whereas a C string is terminated by the first C. That
-is why Perl API functions that deal with strings generally take a
-pointer to the first byte and either a length or a pointer to the byte
-just beyond the final one.
+Remember that Perl strings are NOT the same as C strings: They may
+contain C characters, whereas a C string is terminated by the first
+C. That is why Perl API functions that deal with strings generally
+take a pointer to the first byte and either a length or a pointer to the
+byte just beyond the final one.
 
 And this is the reason that many of the C library string handling
 functions should not be used. They don't cope with the full generality
@@ -1042,84 +1040,6 @@ functions need an additional parameter to give the string length.
 In the case of literal string parameters, perl has defined macros that
 calculate the length for you. See L.
 
-=item *
-
-malloc(0), realloc(0), calloc(0, 0) are non-portable. To be portable
-allocate at least one byte. (In general you should rarely need to work
-at this low level, but instead use the various malloc wrappers.)
-
-=item *
-
-snprintf() - the return type is unportable. Use my_snprintf() instead.
-
-=back
-
-=head2 Security problems
-
-Last but not least, here are various tips for safer coding. See also
-L for libc/stdio replacements one should use.
-
-=over 4
-
-=item *
-
-Do not use gets()
-
-Or we will publicly ridicule you. Seriously.
-
-=item *
-
-Do not use tmpfile()
-
-Use mkstemp() instead.
-
-=item *
-
-Do not use strcpy() or strcat() or strncpy() or strncat()
-
-Use my_strlcpy() and my_strlcat() instead: they either use the native
-implementation, or Perl's own implementation (borrowed from the public
-domain implementation of INN).
-
-=item *
-
-Do not use sprintf() or vsprintf()
-
-If you really want just plain byte strings, use my_snprintf() and
-my_vsnprintf() instead, which will try to use snprintf() and
-vsnprintf() if those safer APIs are available. If you want something
-fancier than a plain byte string, use L()|perlapi/form> or
-SVs and L|perlapi/sv_catpvf>.
-
-Note that glibc C, C, etc. are buggy before glibc
-version 2.17. They won't allow a C<%.s> format with a precision to
-create a string that isn't valid UTF-8 if the current underlying locale
-of the program is UTF-8. What happens is that the C<%s> and its
-operand are simply skipped without any notice.
-L.
-
-=item *
-
-Do not use atoi()
-
-Use grok_atoUV() instead. atoi() has ill-defined behavior on
-overflows, and cannot be used for incremental parsing. It is also
-affected by locale, which is bad.
-
-=item *
-
-Do not use strtol() or strtoul()
-
-Use grok_atoUV() instead. strtol() or strtoul() (or their
-IV/UV-friendly macro disguises, Strtol() and Strtoul(), or Atol() and
-Atoul() are affected by locale, which is bad.
-
-=for apidoc_section $numeric
-=for apidoc AmhD||Atol|const char * nptr
-=for apidoc AmhD||Atoul|const char * nptr
-
-=back
-
 =head1 DEBUGGING
 
 You can compile a special debugging version of Perl, which allows you