Commit

mv hacktips text to clib
perlclib is the place where the libc and perl interfaces are most
extensively documented.  This commit removes redundant text from
perlhacktips, and moves the non-redundant parts to perlclib.
khwilliamson committed Jul 23, 2024
1 parent 17bd7e6 commit 7a7a93e
Showing 2 changed files with 40 additions and 101 deletions.
31 changes: 25 additions & 6 deletions pod/perlclib.pod
@@ -138,7 +138,8 @@ different from their C library counterparts:

fputs(s, stream) PerlIO_puts(perlio, s)

There is no equivalent to C<fgets>; one should use C<sv_gets> instead:
There is no equivalent to C<fgets> (or the deprecated C<gets>); one
should use C<sv_gets> instead:

fgets(s, n, stream) sv_gets(sv, perlio, append)

@@ -163,6 +164,10 @@ There is no equivalent to C<fgets>; one should use C<sv_gets> instead:
t* p = malloc(n) Newx(p, n, t)
t* p = calloc(n, s) Newxz(p, n, t)
p = realloc(p, n) Renew(p, n, t)

It is not portable to try to allocate 0 bytes; allocating 1 or more is
portable.
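
One way to follow this advice in plain C, outside the Perl allocation macros, is to round a zero-byte request up to one byte. This is only a sketch: the helper name C<alloc_at_least_one> is hypothetical and is not part of the Perl API.

```c
#include <stdlib.h>

/* Hypothetical helper, not part of the Perl API: rounds a zero-byte
 * request up to one byte, so the result never depends on how the
 * platform treats malloc(0).  The caller must free() the result. */
static void *alloc_at_least_one(size_t n)
{
    return malloc(n ? n : 1);
}
```

With this shape, a NULL return always means allocation failure, never "the platform returns NULL for an empty request".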

memcpy(dst, src, n) Copy(src, dst, n, t)
memmove(dst, src, n) Move(src, dst, n, t)
memcpy(dst, src, sizeof(t)) StructCopy(src, dst, t)
@@ -218,11 +223,24 @@ If you do need raw strings, use these instead:
Similarly, you can use SVs for creating strings from formats

sprintf(s, fmt, ...) sv_setpvf(sv, fmt, ...)
vsprintf(str, fmt, va_list) sv_vsetpvf(sv, fmt, va_list)

Or for raw strings,

my_snprintf(dt, len, fmt, ...)
my_vsnprintf(dt, len, fmt, va_list)
vsprintf(str, fmt, va_list) sv_vsnprintf(sv, fmt, va_list)

Note also the existence of C<sv_catpvf> and C<sv_vcatpvfn>, combining
concatenation with formatting.
concatenation with formatting; and L<C<Perl_form>()|perlapi/form> for
another form of formatted populating.

Note that glibc C<printf()>, C<sprintf()>, etc. are buggy before glibc
version 2.17. They won't allow a C<%.s> format with a precision to
create a string that isn't valid UTF-8 if the current underlying locale
of the program is UTF-8. What happens is that the C<%s> and its
operand are simply skipped without any notice.
L<https://sourceware.org/bugzilla/show_bug.cgi?id=6530>.
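
The reason for preferring a length-limited formatter for raw strings can be sketched in plain C. The example below assumes a C99-conforming C<vsnprintf>, whose return value is the length the full output would have had; the helper name C<format_checked> is hypothetical. Code inside Perl itself would call C<my_snprintf> or C<my_vsnprintf> as listed above rather than the libc functions directly.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: formats into dst (capacity len bytes) and
 * reports whether the whole result fit.  Returns 1 on success and
 * 0 on truncation or on a formatting error.  Assumes C99 vsnprintf
 * semantics for the return value. */
static int format_checked(char *dst, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(dst, len, fmt, ap);
    va_end(ap);

    /* n is the length the output needed, excluding the final NUL. */
    return n >= 0 && (size_t)n < len;
}
```

Unlike C<sprintf>, the call never writes past C<len> bytes, and the return value lets the caller detect output that did not fit.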

=head2 Character Class Tests

@@ -293,7 +311,8 @@ for collation.
strtol(s, &p, n) Strtol(s, &p, b)
strtoul(s, &p, n) Strtoul(s, &p, b)

But note that these are subject to locale; see L</Dealing with locales>.
But note that even the alternative functions are subject to locale; see
L</Dealing with locales>.

Typical use is to do range checks on C<uv> before casting:

@@ -386,9 +405,9 @@ think you do, use the C<JMPENV> stack in F<scope.h> instead.
strftime() Perl_sv_strftime_tm()
strtod() my_strtod() or Strtod()
system(s) Don't. Look at pp_system or use my_popen.
~tempnam() mkstemp() or tmpfile()
~tmpnam() mkstemp() or tmpfile()
tmpnam_r() mkstemp() or tmpfile()
~tempnam() mkstemp()
~tmpnam() mkstemp()
tmpnam_r() mkstemp()
uselocale() Perl_setlocale()
vsnprintf() my_vsnprintf()
wctob() wcrtomb()
110 changes: 15 additions & 95 deletions pod/perlhacktips.pod
@@ -992,30 +992,28 @@ L<https://sourceforge.net/p/predef/wiki/Home/>.
Assuming the contents of static memory pointed to by the return values
of Perl wrappers for C library functions doesn't change. Many C
library functions return pointers to static storage that can be
overwritten by subsequent calls to the same or related functions. Perl
has wrappers for some of these functions. Originally many of those
wrappers returned those volatile pointers. But over time almost all of
them have evolved to return stable copies. To cope with the remaining
ones, do a L<perlapi/savepv> to make a copy, thus avoiding these
problems. You will have to free the copy when you're done to avoid
memory leaks. If you don't have control over when it gets freed,
you'll need to make the copy in a mortal scalar, like so
overwritten by subsequent calls to the same or related functions. If
you handle those returns before one of those functions that share the
storage gets called, this is fine, but in embedded perls, or when using
threads, such a function may get called before you get a chance to
handle it.

SvPVX(sv_2mortal(newSVpv(volatile_string, 0)))
L<perlclib/Dealing with embedded perls and threads> contains a list of
problematic functions with good advice as to how to cope with them.

=back
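
The static-storage hazard in the item above can be illustrated in plain C. This is a sketch only: C<volatile_name> is a hypothetical stand-in for a library function that returns a pointer to static storage, and C<dup_string> is a hypothetical copy helper. In Perl source, L<perlapi/savepv> serves the copying role.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library function that returns a pointer
 * to static storage; every call overwrites the previous result. */
static const char *volatile_name(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Copies a NUL-terminated string into freshly allocated memory so
 * that later calls cannot change it.  The caller must free() it. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```

A pointer saved from the first call to C<volatile_name> would silently change after the second call; the copy made by C<dup_string> does not.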

=head2 Problematic System Interfaces

=over 4

=item *
There are lots of issues with using various C library functions,
including security ones. You should read L<perlclib> which covers
things in detail.

Perl strings are NOT the same as C strings: They may contain C<NUL>
characters, whereas a C string is terminated by the first C<NUL>. That
is why Perl API functions that deal with strings generally take a
pointer to the first byte and either a length or a pointer to the byte
just beyond the final one.
Remember that Perl strings are NOT the same as C strings: They may
contain C<NUL> characters, whereas a C string is terminated by the first
C<NUL>. That is why Perl API functions that deal with strings generally
take a pointer to the first byte and either a length or a pointer to the
byte just beyond the final one.

And this is the reason that many of the C library string handling
functions should not be used. They don't cope with the full generality
@@ -1042,84 +1040,6 @@ functions need an additional parameter to give the string length. In
the case of literal string parameters, perl has defined macros that
calculate the length for you. See L<perlapi/String Handling>.
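
The difference between a NUL-terminated C string and a pointer-plus-length string can be sketched in plain C. The names C<LIT_WITH_LEN> and C<count_byte> are hypothetical and exist only for this illustration; the macro is modeled on the idea of the literal-string macros mentioned above, not copied from the Perl headers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro: expands a string literal into two arguments,
 * the literal itself and its length excluding the final NUL.  The
 * "" s "" form makes the compiler reject anything but a literal. */
#define LIT_WITH_LEN(s) ("" s ""), (sizeof(s) - 1)

/* Counts occurrences of byte c in the len bytes starting at p.
 * It is driven by the length, so embedded NULs do not stop it. */
static size_t count_byte(const char *p, size_t len, char c)
{
    const char *end = p + len;
    size_t n = 0;

    while (p < end) {
        if (*p++ == c)
            n++;
    }
    return n;
}
```

For the literal C<"a\0b\0a">, C<strlen> stops at the first C<NUL> and reports 1, while the length-driven function sees all five bytes.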

=item *

malloc(0), realloc(0), calloc(0, 0) are non-portable. To be portable
allocate at least one byte. (In general you should rarely need to work
at this low level, but instead use the various malloc wrappers.)

=item *

snprintf() - the return type is unportable. Use my_snprintf() instead.

=back

=head2 Security problems

Last but not least, here are various tips for safer coding. See also
L<perlclib> for libc/stdio replacements one should use.

=over 4

=item *

Do not use gets()

Or we will publicly ridicule you. Seriously.

=item *

Do not use tmpfile()

Use mkstemp() instead.

=item *

Do not use strcpy() or strcat() or strncpy() or strncat()

Use my_strlcpy() and my_strlcat() instead: they either use the native
implementation, or Perl's own implementation (borrowed from the public
domain implementation of INN).

=item *

Do not use sprintf() or vsprintf()

If you really want just plain byte strings, use my_snprintf() and
my_vsnprintf() instead, which will try to use snprintf() and
vsnprintf() if those safer APIs are available. If you want something
fancier than a plain byte string, use L<C<Perl_form>()|perlapi/form> or
SVs and L<C<Perl_sv_catpvf()>|perlapi/sv_catpvf>.

Note that glibc C<printf()>, C<sprintf()>, etc. are buggy before glibc
version 2.17. They won't allow a C<%.s> format with a precision to
create a string that isn't valid UTF-8 if the current underlying locale
of the program is UTF-8. What happens is that the C<%s> and its
operand are simply skipped without any notice.
L<https://sourceware.org/bugzilla/show_bug.cgi?id=6530>.

=item *

Do not use atoi()

Use grok_atoUV() instead. atoi() has ill-defined behavior on
overflows, and cannot be used for incremental parsing. It is also
affected by locale, which is bad.

=item *

Do not use strtol() or strtoul()

Use grok_atoUV() instead. strtol() or strtoul() (or their
IV/UV-friendly macro disguises, Strtol() and Strtoul(), or Atol() and
Atoul()) are affected by locale, which is bad.

=for apidoc_section $numeric
=for apidoc AmhD||Atol|const char * nptr
=for apidoc AmhD||Atoul|const char * nptr

=back

=head1 DEBUGGING

You can compile a special debugging version of Perl, which allows you