OP_SUBSTR_NIBBLE - a specialised OP_SUBSTR variant
This commit adds OP_SUBSTR_NIBBLE and associated machinery for fast
handling of the constructions:

        substr EXPR,0,LENGTH,''
and
        substr EXPR,0,LENGTH

where EXPR is a scalar lexical, the OFFSET is zero, and either there
is no REPLACEMENT or it is the empty string. LENGTH can be anything
that OP_SUBSTR supports. These constraints allow for a heavily
stripped-back and optimised version of pp_substr.
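
As a minimal sketch of the two source-level forms described above (the
variable names are illustrative only): whether a given statement is
actually compiled to the new op depends on the perl build, but the
visible behaviour of substr is the same either way.

```perl
use strict;
use warnings;

my $buf = 'ABCDEFGH';

# 3-arg form, OFFSET 0, no REPLACEMENT: returns the leading
# characters and leaves $buf untouched.
my $peek = substr $buf, 0, 3;        # 'ABC', $buf is still 'ABCDEFGH'

# 4-arg form, OFFSET 0, empty REPLACEMENT: returns the leading
# characters and removes them from $buf.
my $bite = substr $buf, 0, 3, '';    # 'ABC', $buf is now 'DEFGH'

# LENGTH does not have to be a constant.
my $n    = 2;
my $more = substr $buf, 0, $n, '';   # 'DE', $buf is now 'FGH'
```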

The primary motivation was situations where a scalar containing
network packets or some other binary data structure is being parsed
piecemeal. Nibbling away at the scalar can be useful when you don't
know exactly how it will be parsed and unpacked until you get started.
It also means that you don't need to worry about correctly updating
a separate offset variable.
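
The piecemeal parsing described above might look something like the
following sketch. The wire format (a 1-byte type, a 2-byte big-endian
payload length, then the payload) is an assumption made up for this
example and is not taken from the commit.

```perl
use strict;
use warnings;

# Hypothetical wire format (an assumption for illustration only):
#   1-byte type, 2-byte big-endian payload length, then the payload.
my $packet = pack('C n a*', 1, 5, 'hello')
           . pack('C n a*', 2, 3, 'abc');

my @records;
while (length $packet) {
    # Nibble the fixed-size header off the front of the buffer.
    my ($type, $len) = unpack 'C n', substr($packet, 0, 3, '');

    # The payload length is only known once the header is decoded.
    my $payload = substr($packet, 0, $len, '');

    push @records, [ $type, $payload ];
}
```

No offset variable is needed here: each call consumes what it returns,
so the next field always starts at position zero.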

This operator also turns out to be an efficient way to (destructively)
break a string up into fixed-size chunks. For example, given:

    my $x = ''; my $str = "A"x100_000_000;

This code:

    $x = substr($str, 0, 5, "") while ($str);

is twice as fast as doing:

    for ($pos = 0; $pos < length($str); $pos += 5) {
        $x = substr($str, $pos, 5);
    }
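
As a small sketch of what the two loops above compute, the following
collects the chunks from each approach so they can be compared. It
uses `length $str` as the loop condition rather than `$str` itself,
because a leftover string of "0" is false in Perl and would end the
loop early; that substitution is this example's choice and not part of
the commit.

```perl
use strict;
use warnings;

my $orig = join '', 'a' .. 'z';   # 26 characters, so the last chunk is short

# Destructive version: nibble from the front of a copy until it is empty.
my $str = $orig;
my @nibbled;
push @nibbled, substr($str, 0, 5, '') while length $str;

# Non-destructive version: walk an explicit offset over the original.
my @indexed;
for (my $pos = 0; $pos < length($orig); $pos += 5) {
    push @indexed, substr($orig, $pos, 5);
}
```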

Compared with blead, `$y = substr($x, 0, 5)` runs 40% faster and
`$y = substr($x, 0, 5, '')` runs 45% faster.
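
One way to check figures like these on a particular build is the core
Benchmark module. The sketch below times the two chunking loops from
this message; the string size and iteration count are arbitrary
assumptions, and the results will vary with the perl version and the
machine.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

my $src = 'A' x 1_000;   # arbitrary size, chosen to keep the run short

my $results = timethese(2_000, {
    nibble => sub {
        my $str = $src;
        my $x;
        $x = substr($str, 0, 5, '') while length $str;
    },
    offset => sub {
        my $str = $src;
        my $x;
        for (my $pos = 0; $pos < length($str); $pos += 5) {
            $x = substr($str, $pos, 5);
        }
    },
}, 'none');              # 'none' suppresses timethese's own report

cmpthese($results);      # prints a relative-rate comparison table
```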
richardleach committed Nov 26, 2024
1 parent ff0ce7d commit d6f958e
Showing 15 changed files with 793 additions and 395 deletions.
1 change: 1 addition & 0 deletions MANIFEST
@@ -6436,6 +6436,7 @@ t/op/studytied.t See if study works with tied scalars
t/op/sub.t See if subroutines work
t/op/sub_lval.t See if lvalue subroutines work
t/op/substr.t See if substr works
+t/op/substr_nibble.t See if substr($x, 0, $l, '') optimisation works
t/op/substr_thr.t See if substr works in another thread
t/op/svflags.t See if POK is set as expected.
t/op/svleak.pl Test file for svleak.t
5 changes: 3 additions & 2 deletions ext/Opcode/Opcode.pm
@@ -1,4 +1,4 @@
-package Opcode 1.66;
+package Opcode 1.67;

use strict;

@@ -322,7 +322,8 @@ invert_opset function.
slt sgt sle sge seq sne scmp
isa
-substr vec stringify study pos length index rindex ord chr
+substr substr_nibble vec stringify study pos length index
+rindex ord chr
ucfirst lcfirst uc lc fc quotemeta trans transr chop schop
chomp schomp
21 changes: 20 additions & 1 deletion lib/B/Deparse.pm
@@ -7,7 +7,7 @@
# This is based on the module of the same name by Malcolm Beattie,
# but essentially none of his code remains.

-package B::Deparse 1.80;
+package B::Deparse 1.81;
use strict;
use Carp;
use B qw(class main_root main_start main_cv svref_2object opnumber perlstring
@@ -3419,6 +3419,25 @@ sub pp_substr {
maybe_local(@_, listop(@_, "substr"))
}

+sub pp_substr_nibble {
+    my ($self,$op,$cx) = @_;
+
+    my $lex = ($op->private & OPpTARGET_MY);
+
+    my $val = 'substr(' . $self->deparse($op->first->sibling, $cx)
+            . ', 0, ' . $self->deparse($op->first->sibling->sibling->sibling, $cx)
+            . ( (($op->private & 7) == 3) ? '' : ", '')" );
+
+    if ($lex) {
+        my $targ = $op->targ;
+        my $var = $self->maybe_my($op, $cx, $self->padname($op->targ),
+                                  $self->padname_sv($targ),
+                                  0);
+        $val = $self->maybe_parens("$var = $val", $cx, 7);
+    }
+    $val;
+}
+
sub pp_index {
# Also handles pp_rindex.
#
8 changes: 8 additions & 0 deletions lib/B/Deparse.t
@@ -1757,6 +1757,14 @@ print sort(foo('bar'));
substr(my $a, 0, 0) = (foo(), bar());
$a++;
####
+# 4-arg substr (non-nibble)
+my $str = 'ABCD';
+my $bbb = substr($str, 1, 1, '');
+####
+# 4-arg substr (nibble)
+my $str = 'ABCD';
+my $aaa = substr($str, 0, 1, '');
+####
# This following line works around an unfixed bug that we are not trying to
# test for here:
# CONTEXT BEGIN { $^H{a} = "b"; delete $^H{a} } # make %^H localised
1 change: 1 addition & 0 deletions lib/B/Op_private.pm

Generated file; diff not rendered.