argra****@users*****
Tue, 16 Apr 2013 04:37:14 JST
Index: docs/perl/5.10.1/perldebguts.pod diff -u /dev/null docs/perl/5.10.1/perldebguts.pod:1.1 --- /dev/null Tue Apr 16 04:37:14 2013 +++ docs/perl/5.10.1/perldebguts.pod Tue Apr 16 04:37:14 2013 @@ -0,0 +1,1787 @@ + +=encoding euc-jp + +=head1 NAME + +=begin original + +perldebguts - Guts of Perl debugging + +=end original + +perldebguts - Perl デバッグの内部 + +=head1 DESCRIPTION + +=begin original + +This is not the perldebug(1) manpage, which tells you how to use +the debugger. This manpage describes low-level details concerning +the debugger's internals, which range from difficult to impossible +to understand for anyone who isn't incredibly intimate with Perl's guts. +Caveat lector. + +=end original + +これは、デバッガの使い方を記した perldebug(1) man ページではありません。 +この man ページは、難しいものから Perl の内部にものすごく詳しい人でなければ +理解することができないようなものまで、デバッガの内部に関する低レベルな詳細を +記述しています。 +読者に対する注意です。 + +=head1 Debugger Internals + +(デバッガの内部) + +=begin original + +Perl has special debugging hooks at compile-time and run-time used +to create debugging environments. These hooks are not to be confused +with the I<perl -Dxxx> command described in L<perlrun>, which is +usable only if a special Perl is built per the instructions in the +F<INSTALL> podpage in the Perl source tree. + +=end original + +Perl has special debugging hooks at compile-time and run-time used +to create debugging environments. These hooks are not to be confused +with the I<perl -Dxxx> command described in L<perlrun>, which is +usable only if a special Perl is built per the instructions in the +F<INSTALL> podpage in the Perl source tree. +(TBT) + +=begin original + +For example, whenever you call Perl's built-in C<caller> function +from the package C<DB>, the arguments that the corresponding stack +frame was called with are copied to the C<@DB::args> array. These +mechanisms are enabled by calling Perl with the B<-d> switch. +Specifically, the following additional features are enabled +(cf. 
L<perlvar/$^P>): + +=end original + +For example, whenever you call Perl's built-in C<caller> function +from the package C<DB>, the arguments that the corresponding stack +frame was called with are copied to the C<@DB::args> array. These +mechanisms are enabled by calling Perl with the B<-d> switch. +Specifically, the following additional features are enabled +(cf. L<perlvar/$^P>): +(TBT) + +=over 4 + +=item * + +=begin original + +Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require +'perl5db.pl'}> if not present) before the first line of your program. + +=end original + +Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require +'perl5db.pl'}> if not present) before the first line of your program. +(TBT) + +=item * + +=begin original + +Each array C<@{"_<$filename"}> holds the lines of $filename for a +file compiled by Perl. The same is also true for C<eval>ed strings +that contain subroutines, or which are currently being executed. +The $filename for C<eval>ed strings looks like C<(eval 34)>. +Code assertions in regexes look like C<(re_eval 19)>. + +=end original + +Each array C<@{"_<$filename"}> holds the lines of $filename for a +file compiled by Perl. The same is also true for C<eval>ed strings +that contain subroutines, or which are currently being executed. +The $filename for C<eval>ed strings looks like C<(eval 34)>. +Code assertions in regexes look like C<(re_eval 19)>. +(TBT) + +=begin original + +Values in this array are magical in numeric context: they compare +equal to zero only if the line is not breakable. + +=end original + +Values in this array are magical in numeric context: they compare +equal to zero only if the line is not breakable. +(TBT) + +=item * + +=begin original + +Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed +by line number. Individual entries (as opposed to the whole hash) +are settable. 
Perl only cares about Boolean true here, although +the values used by F<perl5db.pl> have the form +C<"$break_condition\0$action">. + +=end original + +Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed +by line number. Individual entries (as opposed to the whole hash) +are settable. Perl only cares about Boolean true here, although +the values used by F<perl5db.pl> have the form +C<"$break_condition\0$action">. +(TBT) + +=begin original + +The same holds for evaluated strings that contain subroutines, or +which are currently being executed. The $filename for C<eval>ed strings +looks like C<(eval 34)> or C<(re_eval 19)>. + +=end original + +The same holds for evaluated strings that contain subroutines, or +which are currently being executed. The $filename for C<eval>ed strings +looks like C<(eval 34)> or C<(re_eval 19)>. +(TBT) + +=item * + +=begin original + +Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is +also the case for evaluated strings that contain subroutines, or +which are currently being executed. The $filename for C<eval>ed +strings looks like C<(eval 34)> or C<(re_eval 19)>. + +=end original + +Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is +also the case for evaluated strings that contain subroutines, or +which are currently being executed. The $filename for C<eval>ed +strings looks like C<(eval 34)> or C<(re_eval 19)>. +(TBT) + +=item * + +=begin original + +After each C<require>d file is compiled, but before it is executed, +C<DB::postponed(*{"_<$filename"})> is called if the subroutine +C<DB::postponed> exists. Here, the $filename is the expanded name of +the C<require>d file, as found in the values of %INC. + +=end original + +After each C<require>d file is compiled, but before it is executed, +C<DB::postponed(*{"_<$filename"})> is called if the subroutine +C<DB::postponed> exists. Here, the $filename is the expanded name of +the C<require>d file, as found in the values of %INC. 
+(TBT) + +=item * + +=begin original + +After each subroutine C<subname> is compiled, the existence of +C<$DB::postponed{subname}> is checked. If this key exists, +C<DB::postponed(subname)> is called if the C<DB::postponed> subroutine +also exists. + +=end original + +After each subroutine C<subname> is compiled, the existence of +C<$DB::postponed{subname}> is checked. If this key exists, +C<DB::postponed(subname)> is called if the C<DB::postponed> subroutine +also exists. +(TBT) + +=item * + +=begin original + +A hash C<%DB::sub> is maintained, whose keys are subroutine names +and whose values have the form C<filename:startline-endline>. +C<filename> has the form C<(eval 34)> for subroutines defined inside +C<eval>s, or C<(re_eval 19)> for those within regex code assertions. + +=end original + +A hash C<%DB::sub> is maintained, whose keys are subroutine names +and whose values have the form C<filename:startline-endline>. +C<filename> has the form C<(eval 34)> for subroutines defined inside +C<eval>s, or C<(re_eval 19)> for those within regex code assertions. +(TBT) + +=item * + +=begin original + +When the execution of your program reaches a point that can hold a +breakpoint, the C<DB::DB()> subroutine is called if any of the variables +C<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true. These variables +are not C<local>izable. This feature is disabled when executing +inside C<DB::DB()>, including functions called from it +unless C<< $^D & (1<<30) >> is true. + +=end original + +When the execution of your program reaches a point that can hold a +breakpoint, the C<DB::DB()> subroutine is called if any of the variables +C<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true. These variables +are not C<local>izable. This feature is disabled when executing +inside C<DB::DB()>, including functions called from it +unless C<< $^D & (1<<30) >> is true. 
+(TBT) + +=item * + +=begin original + +When execution of the program reaches a subroutine call, a call to +C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the +name of the called subroutine. (This doesn't happen if the subroutine +was compiled in the C<DB> package.) + +=end original + +When execution of the program reaches a subroutine call, a call to +C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the +name of the called subroutine. (This doesn't happen if the subroutine +was compiled in the C<DB> package.) +(TBT) + +=back + +=begin original + +Note that if C<&DB::sub> needs external data for it to work, no +subroutine call is possible without it. As an example, the standard +debugger's C<&DB::sub> depends on the C<$DB::deep> variable +(it defines how many levels of recursion deep into the debugger you can go +before a mandatory break). If C<$DB::deep> is not defined, subroutine +calls are not possible, even though C<&DB::sub> exists. + +=end original + +Note that if C<&DB::sub> needs external data for it to work, no +subroutine call is possible without it. As an example, the standard +debugger's C<&DB::sub> depends on the C<$DB::deep> variable +(it defines how many levels of recursion deep into the debugger you can go +before a mandatory break). If C<$DB::deep> is not defined, subroutine +calls are not possible, even though C<&DB::sub> exists. +(TBT) + +=head2 Writing Your Own Debugger + +(独自のデバッガを書く) + +=head3 Environment Variables + +(環境変数) + +=begin original + +The C<PERL5DB> environment variable can be used to define a debugger. +For example, the minimal "working" debugger (it actually doesn't do anything) +consists of one line: + +=end original + +The C<PERL5DB> environment variable can be used to define a debugger. 
+For example, the minimal "working" debugger (it actually doesn't do anything) +consists of one line: +(TBT) + + sub DB::DB {} + +=begin original + +It can easily be defined like this: + +=end original + +It can easily be defined like this: +(TBT) + + $ PERL5DB="sub DB::DB {}" perl -d your-script + +=begin original + +Another brief debugger, slightly more useful, can be created +with only the line: + +=end original + +Another brief debugger, slightly more useful, can be created +with only the line: +(TBT) + + sub DB::DB {print ++$i; scalar <STDIN>} + +=begin original + +This debugger prints a number which increments for each statement +encountered and waits for you to hit a newline before continuing +to the next statement. + +=end original + +This debugger prints a number which increments for each statement +encountered and waits for you to hit a newline before continuing +to the next statement. +(TBT) + +=begin original + +The following debugger is actually useful: + +=end original + +The following debugger is actually useful: +(TBT) + + { + package DB; + sub DB {} + sub sub {print ++$i, " $sub\n"; &$sub} + } + +=begin original + +It prints the sequence number of each subroutine call and the name of the +called subroutine. Note that C<&DB::sub> is being compiled into the +package C<DB> through the use of the C<package> directive. + +=end original + +It prints the sequence number of each subroutine call and the name of the +called subroutine. Note that C<&DB::sub> is being compiled into the +package C<DB> through the use of the C<package> directive. +(TBT) + +=begin original + +When it starts, the debugger reads your rc file (F<./.perldb> or +F<~/.perldb> under Unix), which can set important options. +(A subroutine (C<&afterinit>) can be defined here as well; it is executed +after the debugger completes its own initialization.) 
+ +=end original + +When it starts, the debugger reads your rc file (F<./.perldb> or +F<~/.perldb> under Unix), which can set important options. +(A subroutine (C<&afterinit>) can be defined here as well; it is executed +after the debugger completes its own initialization.) +(TBT) + +=begin original + +After the rc file is read, the debugger reads the PERLDB_OPTS +environment variable and uses it to set debugger options. The +contents of this variable are treated as if they were the argument +of an C<o ...> debugger command (q.v. in L<perldebug/Options>). + +=end original + +After the rc file is read, the debugger reads the PERLDB_OPTS +environment variable and uses it to set debugger options. The +contents of this variable are treated as if they were the argument +of an C<o ...> debugger command (q.v. in L<perldebug/Options>). +(TBT) + +=head3 Debugger internal variables + +(デバッガの内部変数) + +=begin original + +In addition to the file and subroutine-related variables mentioned above, +the debugger also maintains various magical internal variables. + +=end original + +In addition to the file and subroutine-related variables mentioned above, +the debugger also maintains various magical internal variables. +(TBT) + +=over 4 + +=item * + +=begin original + +C<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which +holds the lines of the currently-selected file (compiled by Perl), either +explicitly chosen with the debugger's C<f> command, or implicitly by flow +of execution. + +=end original + +C<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which +holds the lines of the currently-selected file (compiled by Perl), either +explicitly chosen with the debugger's C<f> command, or implicitly by flow +of execution. +(TBT) + +=begin original + +Values in this array are magical in numeric context: they compare +equal to zero only if the line is not breakable. 
+ +=end original + +Values in this array are magical in numeric context: they compare +equal to zero only if the line is not breakable. +(TBT) + +=item * + +=begin original + +C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which +contains breakpoints and actions keyed by line number in +the currently-selected file, either explicitly chosen with the +debugger's C<f> command, or implicitly by flow of execution. + +=end original + +C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which +contains breakpoints and actions keyed by line number in +the currently-selected file, either explicitly chosen with the +debugger's C<f> command, or implicitly by flow of execution. +(TBT) + +=begin original + +As previously noted, individual entries (as opposed to the whole hash) +are settable. Perl only cares about Boolean true here, although +the values used by F<perl5db.pl> have the form +C<"$break_condition\0$action">. + +=end original + +As previously noted, individual entries (as opposed to the whole hash) +are settable. Perl only cares about Boolean true here, although +the values used by F<perl5db.pl> have the form +C<"$break_condition\0$action">. +(TBT) + +=back + +=head3 Debugger customization functions + +(デバッガカスタマイズ関数) + +=begin original + +Some functions are provided to simplify customization. + +=end original + +Some functions are provided to simplify customization. +(TBT) + +=over 4 + +=item * + +=begin original + +See L<perldebug/"Configurable Options"> for a description of options parsed by +C<DB::parse_options(string)>. + +=end original + +See L<perldebug/"Configurable Options"> for a description of options parsed by +C<DB::parse_options(string)>. +(TBT) + +=item * + +=begin original + +C<DB::dump_trace(skip[,count])> skips the specified number of frames +and returns a list containing information about the calling frames (all +of them, if C<count> is missing). 
Each entry is a reference to a hash +with keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine +name, or info about C<eval>), C<args> (C<undef> or a reference to +an array), C<file>, and C<line>. + +=end original + +C<DB::dump_trace(skip[,count])> skips the specified number of frames +and returns a list containing information about the calling frames (all +of them, if C<count> is missing). Each entry is a reference to a hash +with keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine +name, or info about C<eval>), C<args> (C<undef> or a reference to +an array), C<file>, and C<line>. +(TBT) + +=item * + +=begin original + +C<DB::print_trace(FH, skip[, count[, short]])> prints +formatted info about caller frames. The last two functions may be +convenient as arguments to C<< < >>, C<< << >> commands. + +=end original + +C<DB::print_trace(FH, skip[, count[, short]])> prints +formatted info about caller frames. The last two functions may be +convenient as arguments to C<< < >>, C<< << >> commands. +(TBT) + +=back + +=begin original + +Note that any variables and functions that are not documented in +this manpage (or in L<perldebug>) are considered for internal +use only, and as such are subject to change without notice. + +=end original + +Note that any variables and functions that are not documented in +this manpage (or in L<perldebug>) are considered for internal +use only, and as such are subject to change without notice. +(TBT) + +=head1 Frame Listing Output Examples + +(フレームリスト出力の例) + +=begin original + +The C<frame> option can be used to control the output of frame +information. For example, contrast this expression trace: + +=end original + +The C<frame> option can be used to control the output of frame +information. For example, contrast this expression trace: +(TBT) + + $ perl -de 42 + Stack dump during die enabled outside of evals. + + Loading DB routines from perl5db.pl patch level 0.94 + Emacs support available. 
+ + Enter h or `h h' for help. + + main::(-e:1): 0 + DB<1> sub foo { 14 } + + DB<2> sub bar { 3 } + + DB<3> t print foo() * bar() + main::((eval 172):3): print foo() + bar(); + main::foo((eval 168):2): + main::bar((eval 170):2): + 42 + +=begin original + +with this one, once the C<o>ption C<frame=2> has been set: + +=end original + +with this one, once the C<o>ption C<frame=2> has been set: +(TBT) + + DB<4> o f=2 + frame = '2' + DB<5> t print foo() * bar() + 3: foo() * bar() + entering main::foo + 2: sub foo { 14 }; + exited main::foo + entering main::bar + 2: sub bar { 3 }; + exited main::bar + 42 + +=begin original + +By way of demonstration, we present below a laborious listing +resulting from setting your C<PERLDB_OPTS> environment variable to +the value C<f=n N>, and running I<perl -d -V> from the command line. +Examples using various values of C<n> are shown to give you a feel +for the difference between settings. Long though it may be, this +is not a complete listing, but only excerpts. + +=end original + +By way of demonstration, we present below a laborious listing +resulting from setting your C<PERLDB_OPTS> environment variable to +the value C<f=n N>, and running I<perl -d -V> from the command line. +Examples using various values of C<n> are shown to give you a feel +for the difference between settings. Long though it may be, this +is not a complete listing, but only excerpts. +(TBT) + +=over 4 + +=item 1 + + entering main::BEGIN + entering Config::BEGIN + Package lib/Exporter.pm. + Package lib/Carp.pm. + Package lib/Config.pm. + entering Config::TIEHASH + entering Exporter::import + entering Exporter::export + entering Config::myconfig + entering Config::FETCH + entering Config::FETCH + entering Config::FETCH + entering Config::FETCH + +=item 2 + + entering main::BEGIN + entering Config::BEGIN + Package lib/Exporter.pm. + Package lib/Carp.pm. + exited Config::BEGIN + Package lib/Config.pm. 
+ entering Config::TIEHASH + exited Config::TIEHASH + entering Exporter::import + entering Exporter::export + exited Exporter::export + exited Exporter::import + exited main::BEGIN + entering Config::myconfig + entering Config::FETCH + exited Config::FETCH + entering Config::FETCH + exited Config::FETCH + entering Config::FETCH + +=item 3 + + in $=main::BEGIN() from /dev/null:0 + in $=Config::BEGIN() from lib/Config.pm:2 + Package lib/Exporter.pm. + Package lib/Carp.pm. + Package lib/Config.pm. + in $=Config::TIEHASH('Config') from lib/Config.pm:644 + in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 + in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li + in @=Config::myconfig() from /dev/null:0 + in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 + in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 + in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574 + in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574 + in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574 + in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574 + +=item 4 + + in $=main::BEGIN() from /dev/null:0 + in $=Config::BEGIN() from lib/Config.pm:2 + Package lib/Exporter.pm. + Package lib/Carp.pm. + out $=Config::BEGIN() from lib/Config.pm:0 + Package lib/Config.pm. 
+ in $=Config::TIEHASH('Config') from lib/Config.pm:644 + out $=Config::TIEHASH('Config') from lib/Config.pm:644 + in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 + in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/ + out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/ + out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 + out $=main::BEGIN() from /dev/null:0 + in @=Config::myconfig() from /dev/null:0 + in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 + out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 + in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 + out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 + in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574 + out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574 + in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574 + +=item 5 + + in $=main::BEGIN() from /dev/null:0 + in $=Config::BEGIN() from lib/Config.pm:2 + Package lib/Exporter.pm. + Package lib/Carp.pm. + out $=Config::BEGIN() from lib/Config.pm:0 + Package lib/Config.pm. 
+ in $=Config::TIEHASH('Config') from lib/Config.pm:644 + out $=Config::TIEHASH('Config') from lib/Config.pm:644 + in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 + in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E + out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E + out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 + out $=main::BEGIN() from /dev/null:0 + in @=Config::myconfig() from /dev/null:0 + in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574 + out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574 + in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574 + out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574 + +=item 6 + + in $=CODE(0x15eca4)() from /dev/null:0 + in $=CODE(0x182528)() from lib/Config.pm:2 + Package lib/Exporter.pm. + out $=CODE(0x182528)() from lib/Config.pm:0 + scalar context return from CODE(0x182528): undef + Package lib/Config.pm. + in $=Config::TIEHASH('Config') from lib/Config.pm:628 + out $=Config::TIEHASH('Config') from lib/Config.pm:628 + scalar context return from Config::TIEHASH: empty hash + in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 + in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171 + out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171 + scalar context return from Exporter::export: '' + out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 + scalar context return from Exporter::import: '' + +=back + +=begin original + +In all cases shown above, the line indentation shows the call tree. +If bit 2 of C<frame> is set, a line is printed on exit from a +subroutine as well. If bit 4 is set, the arguments are printed +along with the caller info. 
If bit 8 is set, the arguments are +printed even if they are tied or references. If bit 16 is set, the +return value is printed, too. + +=end original + +In all cases shown above, the line indentation shows the call tree. +If bit 2 of C<frame> is set, a line is printed on exit from a +subroutine as well. If bit 4 is set, the arguments are printed +along with the caller info. If bit 8 is set, the arguments are +printed even if they are tied or references. If bit 16 is set, the +return value is printed, too. +(TBT) + +=begin original + +When a package is compiled, a line like this + +=end original + +When a package is compiled, a line like this +(TBT) + + Package lib/Carp.pm. + +=begin original + +is printed with proper indentation. + +=end original + +is printed with proper indentation. +(TBT) + +=head1 Debugging regular expressions + +(正規表現のデバッグ) + +=begin original + +There are two ways to enable debugging output for regular expressions. + +=end original + +There are two ways to enable debugging output for regular expressions. +(TBT) + +=begin original + +If your perl is compiled with C<-DDEBUGGING>, you may use the +B<-Dr> flag on the command line. + +=end original + +If your perl is compiled with C<-DDEBUGGING>, you may use the +B<-Dr> flag on the command line. +(TBT) + +=begin original + +Otherwise, one can C<use re 'debug'>, which has effects at +compile time and run time. It is not lexically scoped. + +=end original + +Otherwise, one can C<use re 'debug'>, which has effects at +compile time and run time. It is not lexically scoped. +(TBT) + +=head2 Compile-time output + +(コンパイル時出力) + +=begin original + +The debugging output at compile time looks like this: + +=end original + +The debugging output at compile time looks like this: +(TBT) + + Compiling REx `[bc]d(ef*g)+h[ij]k$' + size 45 Got 364 bytes for offset annotations. 
+ first at 1 + rarest char g at 0 + rarest char d at 0 + 1: ANYOF[bc](12) + 12: EXACT <d>(14) + 14: CURLYX[0] {1,32767}(28) + 16: OPEN1(18) + 18: EXACT <e>(20) + 20: STAR(23) + 21: EXACT <f>(0) + 23: EXACT <g>(25) + 25: CLOSE1(27) + 27: WHILEM[1/1](0) + 28: NOTHING(29) + 29: EXACT <h>(31) + 31: ANYOF[ij](42) + 42: EXACT <k>(44) + 44: EOL(45) + 45: END(0) + anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating) + stclass `ANYOF[bc]' minlen 7 + Offsets: [45] + 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1] + 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0] + 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0] + 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0] + Omitting $` $& $' support. + +=begin original + +The first line shows the pre-compiled form of the regex. The second +shows the size of the compiled form (in arbitrary units, usually +4-byte words) and the total number of bytes allocated for the +offset/length table, usually 4+C<size>*8. The next line shows the +label I<id> of the first node that does a match. + +=end original + +The first line shows the pre-compiled form of the regex. The second +shows the size of the compiled form (in arbitrary units, usually +4-byte words) and the total number of bytes allocated for the +offset/length table, usually 4+C<size>*8. The next line shows the +label I<id> of the first node that does a match. +(TBT) + +=begin original + +The + +=end original + +The +(TBT) + + anchored `de' at 1 floating `gh' at 3..2147483647 (checking floating) + stclass `ANYOF[bc]' minlen 7 + +=begin original + +line (split into two lines above) contains optimizer +information. In the example shown, the optimizer found that the match +should contain a substring C<de> at offset 1, plus substring C<gh> +at some offset between 3 and infinity. 
Moreover, when checking for +these substrings (to abandon impossible matches quickly), Perl will check +for the substring C<gh> before checking for the substring C<de>. The +optimizer may also use the knowledge that the match starts (at the +C<first> I<id>) with a character class, and no string +shorter than 7 characters can possibly match. + +=end original + +line (split into two lines above) contains optimizer +information. In the example shown, the optimizer found that the match +should contain a substring C<de> at offset 1, plus substring C<gh> +at some offset between 3 and infinity. Moreover, when checking for +these substrings (to abandon impossible matches quickly), Perl will check +for the substring C<gh> before checking for the substring C<de>. The +optimizer may also use the knowledge that the match starts (at the +C<first> I<id>) with a character class, and no string +shorter than 7 characters can possibly match. +(TBT) + +=begin original + +The fields of interest which may appear in this line are + +=end original + +The fields of interest which may appear in this line are +(TBT) + +=over 4 + +=item C<anchored> I<STRING> C<at> I<POS> + +=item C<floating> I<STRING> C<at> I<POS1..POS2> + +=begin original + +See above. + +=end original + +See above. +(TBT) + +=item C<matching floating/anchored> + +=begin original + +Which substring to check first. + +=end original + +Which substring to check first. +(TBT) + +=item C<minlen> + +=begin original + +The minimal length of the match. + +=end original + +The minimal length of the match. +(TBT) + +=item C<stclass> I<TYPE> + +=begin original + +Type of first matching node. + +=end original + +Type of first matching node. +(TBT) + +=item C<noscan> + +=begin original + +Don't scan for the found substrings. + +=end original + +Don't scan for the found substrings. 
+(TBT) + +=item C<isall> + +=begin original + +Means that the optimizer information is all that the regular +expression contains, and thus one does not need to enter the regex engine at +all. + +=end original + +Means that the optimizer information is all that the regular +expression contains, and thus one does not need to enter the regex engine at +all. +(TBT) + +=item C<GPOS> + +=begin original + +Set if the pattern contains C<\G>. + +=end original + +Set if the pattern contains C<\G>. +(TBT) + +=item C<plus> + +=begin original + +Set if the pattern starts with a repeated char (as in C<x+y>). + +=end original + +Set if the pattern starts with a repeated char (as in C<x+y>). +(TBT) + +=item C<implicit> + +=begin original + +Set if the pattern starts with C<.*>. + +=end original + +Set if the pattern starts with C<.*>. +(TBT) + +=item C<with eval> + +=begin original + +Set if the pattern contains eval-groups, such as C<(?{ code })> and +C<(??{ code })>. + +=end original + +Set if the pattern contains eval-groups, such as C<(?{ code })> and +C<(??{ code })>. +(TBT) + +=item C<anchored(TYPE)> + +=begin original + +If the pattern may match only at a handful of places (with C<TYPE> +being C<BOL>, C<MBOL>, or C<GPOS>). See the table below. + +=end original + +If the pattern may match only at a handful of places (with C<TYPE> +being C<BOL>, C<MBOL>, or C<GPOS>). See the table below. +(TBT) + +=back + +=begin original + +If a substring is known to match at end-of-line only, it may be +followed by C<$>, as in C<floating `k'$>. + +=end original + +If a substring is known to match at end-of-line only, it may be +followed by C<$>, as in C<floating `k'$>. +(TBT) + +=begin original + +The optimizer-specific information is used to avoid entering (a slow) regex +engine on strings that will not definitely match. If the C<isall> flag +is set, a call to the regex engine may be avoided even when the optimizer +found an appropriate place for the match. 
+ +=end original + +The optimizer-specific information is used to avoid entering (a slow) regex +engine on strings that will not definitely match. If the C<isall> flag +is set, a call to the regex engine may be avoided even when the optimizer +found an appropriate place for the match. +(TBT) + +=begin original + +Above the optimizer section is the list of I<nodes> of the compiled +form of the regex. Each line has format + +=end original + +Above the optimizer section is the list of I<nodes> of the compiled +form of the regex. Each line has format +(TBT) + +=begin original + +C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>) + +=end original + +C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>) +(TBT) + +=head2 Types of nodes + +(ノードの型) + +=begin original + +Here are the possible types, with short descriptions: + +=end original + +Here are the possible types, with short descriptions: +(TBT) + + # TYPE arg-description [num-args] [longjump-len] DESCRIPTION + + # Exit points + END no End of program. + SUCCEED no Return from a subroutine, basically. + + # Anchors: + BOL no Match "" at beginning of line. + MBOL no Same, assuming multiline. + SBOL no Same, assuming singleline. + EOS no Match "" at end of string. + EOL no Match "" at end of line. + MEOL no Same, assuming multiline. + SEOL no Same, assuming singleline. + BOUND no Match "" at any word boundary + BOUNDL no Match "" at any word boundary + NBOUND no Match "" at any word non-boundary + NBOUNDL no Match "" at any word non-boundary + GPOS no Matches where last m//g left off. + + # [Special] alternatives + ANY no Match any one character (except newline). + SANY no Match any one character. + ANYOF sv Match character in (or not in) this class. 
+ ALNUM no Match any alphanumeric character + ALNUML no Match any alphanumeric char in locale + NALNUM no Match any non-alphanumeric character + NALNUML no Match any non-alphanumeric char in locale + SPACE no Match any whitespace character + SPACEL no Match any whitespace char in locale + NSPACE no Match any non-whitespace character + NSPACEL no Match any non-whitespace char in locale + DIGIT no Match any numeric character + NDIGIT no Match any non-numeric character + + # BRANCH The set of branches constituting a single choice are hooked + # together with their "next" pointers, since precedence prevents + # anything being concatenated to any individual branch. The + # "next" pointer of the last BRANCH in a choice points to the + # thing following the whole choice. This is also where the + # final "next" pointer of each individual branch points; each + # branch starts with the operand node of a BRANCH node. + # + BRANCH node Match this alternative, or the next... + + # BACK Normal "next" pointers all implicitly point forward; BACK + # exists to make loop structures possible. + # not used + BACK no Match "", "next" ptr points backward. + + # Literals + EXACT sv Match this string (preceded by length). + EXACTF sv Match this string, folded (prec. by length). + EXACTFL sv Match this string, folded in locale (w/len). + + # Do nothing + NOTHING no Match empty string. + # A variant of above which delimits a group, thus stops optimizations + TAIL no Match empty string. Can jump here from outside. + + # STAR,PLUS '?', and complex '*' and '+', are implemented as circular + # BRANCH structures using BACK. Simple cases (one character + # per match) are implemented with STAR and PLUS for speed + # and to minimize recursive plunges. + # + STAR node Match this (simple) thing 0 or more times. + PLUS node Match this (simple) thing 1 or more times. + + CURLY sv 2 Match this simple thing {n,m} times. + CURLYN no 2 Match next-after-this simple thing + # {n,m} times, set parens. 
+ CURLYM no 2 Match this medium-complex thing {n,m} times. + CURLYX sv 2 Match this complex thing {n,m} times. + + # This terminator creates a loop structure for CURLYX + WHILEM no Do curly processing and see if rest matches. + + # OPEN,CLOSE,GROUPP ...are numbered at compile time. + OPEN num 1 Mark this point in input as start of #n. + CLOSE num 1 Analogous to OPEN. + + REF num 1 Match some already matched string + REFF num 1 Match already matched string, folded + REFFL num 1 Match already matched string, folded in loc. + + # grouping assertions + IFMATCH off 1 2 Succeeds if the following matches. + UNLESSM off 1 2 Fails if the following matches. + SUSPEND off 1 1 "Independent" sub-regex. + IFTHEN off 1 1 Switch, should be preceded by switcher . + GROUPP num 1 Whether the group matched. + + # Support for long regex + LONGJMP off 1 1 Jump far away. + BRANCHJ off 1 1 BRANCH with long offset. + + # The heavy worker + EVAL evl 1 Execute some Perl code. + + # Modifiers + MINMOD no Next operator is not greedy. + LOGICAL no Next opcode should set the flag only. + + # This is not used yet + RENUM off 1 1 Group with independently numbered parens. + + # This is not really a node, but an optimized away piece of a "long" node. + # To simplify debugging output, we mark it as if it were a node + OPTIMIZED off Placeholder for dump. + +=for unprinted-credits +Next section M-J. 
Dominus (mjd-p****@plove*****) 20010421 + +=begin original + +Following the optimizer information is a dump of the offset/length +table, here split across several lines: + +=end original + +Following the optimizer information is a dump of the offset/length +table, here split across several lines: +(TBT) + + Offsets: [45] + 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1] + 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0] + 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0] + 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0] + +=begin original + +The first line here indicates that the offset/length table contains 45 +entries. Each entry is a pair of integers, denoted by C<offset[length]>. +Entries are numbered starting with 1, so entry #1 here is C<1[4]> and +entry #12 is C<5[1]>. C<1[4]> indicates that the node labeled C<1:> +(the C<1: ANYOF[bc]>) begins at character position 1 in the +pre-compiled form of the regex, and has a length of 4 characters. +C<5[1]> in position 12 +indicates that the node labeled C<12:> +(the C<< 12: EXACT <d> >>) begins at character position 5 in the +pre-compiled form of the regex, and has a length of 1 character. +C<12[1]> in position 14 +indicates that the node labeled C<14:> +(the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the +pre-compiled form of the regex, and has a length of 1 character---that +is, it corresponds to the C<+> symbol in the precompiled regex. + +=end original + +The first line here indicates that the offset/length table contains 45 +entries. Each entry is a pair of integers, denoted by C<offset[length]>. +Entries are numbered starting with 1, so entry #1 here is C<1[4]> and +entry #12 is C<5[1]>. C<1[4]> indicates that the node labeled C<1:> +(the C<1: ANYOF[bc]>) begins at character position 1 in the +pre-compiled form of the regex, and has a length of 4 characters. 
+C<5[1]> in position 12 +indicates that the node labeled C<12:> +(the C<< 12: EXACT <d> >>) begins at character position 5 in the +pre-compiled form of the regex, and has a length of 1 character. +C<12[1]> in position 14 +indicates that the node labeled C<14:> +(the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the +pre-compiled form of the regex, and has a length of 1 character---that +is, it corresponds to the C<+> symbol in the precompiled regex. +(TBT) + +=begin original + +C<0[0]> items indicate that there is no corresponding node. + +=end original + +C<0[0]> items indicate that there is no corresponding node. +(TBT) + +=head2 Run-time output + +(実行時出力) + +=begin original + +First of all, when doing a match, one may get no run-time output even +if debugging is enabled. This means that the regex engine was never +entered and that all of the job was therefore done by the optimizer. + +=end original + +First of all, when doing a match, one may get no run-time output even +if debugging is enabled. This means that the regex engine was never +entered and that all of the job was therefore done by the optimizer. +(TBT) + +=begin original + +If the regex engine was entered, the output may look like this: + +=end original + +If the regex engine was entered, the output may look like this: +(TBT) + + Matching `[bc]d(ef*g)+h[ij]k$' against `abcdefg__gh__' + Setting an EVAL scope, savestack=3 + 2 <ab> <cdefg__gh_> | 1: ANYOF + 3 <abc> <defg__gh_> | 11: EXACT <d> + 4 <abcd> <efg__gh_> | 13: CURLYX {1,32767} + 4 <abcd> <efg__gh_> | 26: WHILEM + 0 out of 1..32767 cc=effff31c + 4 <abcd> <efg__gh_> | 15: OPEN1 + 4 <abcd> <efg__gh_> | 17: EXACT <e> + 5 <abcde> <fg__gh_> | 19: STAR + EXACT <f> can match 1 times out of 32767... 
+ Setting an EVAL scope, savestack=3 + 6 <bcdef> <g__gh__> | 22: EXACT <g> + 7 <bcdefg> <__gh__> | 24: CLOSE1 + 7 <bcdefg> <__gh__> | 26: WHILEM + 1 out of 1..32767 cc=effff31c + Setting an EVAL scope, savestack=12 + 7 <bcdefg> <__gh__> | 15: OPEN1 + 7 <bcdefg> <__gh__> | 17: EXACT <e> + restoring \1 to 4(4)..7 + failed, try continuation... + 7 <bcdefg> <__gh__> | 27: NOTHING + 7 <bcdefg> <__gh__> | 28: EXACT <h> + failed... + failed... + +=begin original + +The most significant information in the output is about the particular I<node> +of the compiled regex that is currently being tested against the target string. +The format of these lines is + +=end original + +The most significant information in the output is about the particular I<node> +of the compiled regex that is currently being tested against the target string. +The format of these lines is +(TBT) + +=begin original + +C< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE> + +=end original + +C< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE> +(TBT) + +=begin original + +The I<TYPE> info is indented with respect to the backtracking level. +Other incidental information appears interspersed within. + +=end original + +The I<TYPE> info is indented with respect to the backtracking level. +Other incidental information appears interspersed within. +(TBT) + +=head1 Debugging Perl memory usage + +(Perl のメモリ使用のデバッグ) + +=begin original + +Perl is a profligate wastrel when it comes to memory use. There +is a saying that to estimate memory usage of Perl, assume a reasonable +algorithm for memory allocation, multiply that estimate by 10, and +while you still may miss the mark, at least you won't be quite so +astonished. This is not absolutely true, but may provide a good +grasp of what happens. + +=end original + +Perl is a profligate wastrel when it comes to memory use. 
There +is a saying that to estimate memory usage of Perl, assume a reasonable +algorithm for memory allocation, multiply that estimate by 10, and +while you still may miss the mark, at least you won't be quite so +astonished. This is not absolutely true, but may provide a good +grasp of what happens. +(TBT) + +=begin original + +Assume that an integer cannot take less than 20 bytes of memory, a +float cannot take less than 24 bytes, a string cannot take less +than 32 bytes (all these examples assume 32-bit architectures, the +result are quite a bit worse on 64-bit architectures). If a variable +is accessed in two of three different ways (which require an integer, +a float, or a string), the memory footprint may increase yet another +20 bytes. A sloppy malloc(3) implementation can inflate these +numbers dramatically. + +=end original + +Assume that an integer cannot take less than 20 bytes of memory, a +float cannot take less than 24 bytes, a string cannot take less +than 32 bytes (all these examples assume 32-bit architectures, the +result are quite a bit worse on 64-bit architectures). If a variable +is accessed in two of three different ways (which require an integer, +a float, or a string), the memory footprint may increase yet another +20 bytes. A sloppy malloc(3) implementation can inflate these +numbers dramatically. +(TBT) + +=begin original + +On the opposite end of the scale, a declaration like + +=end original + +On the opposite end of the scale, a declaration like +(TBT) + + sub foo; + +=begin original + +may take up to 500 bytes of memory, depending on which release of Perl +you're running. + +=end original + +may take up to 500 bytes of memory, depending on which release of Perl +you're running. +(TBT) + +=begin original + +Anecdotal estimates of source-to-compiled code bloat suggest an +eightfold increase. This means that the compiled form of reasonable +(normally commented, properly indented etc.) 
code will take +about eight times more space in memory than the code took +on disk. + +=end original + +Anecdotal estimates of source-to-compiled code bloat suggest an +eightfold increase. This means that the compiled form of reasonable +(normally commented, properly indented etc.) code will take +about eight times more space in memory than the code took +on disk. +(TBT) + +=begin original + +The B<-DL> command-line switch is obsolete since circa Perl 5.6.0 +(it was available only if Perl was built with C<-DDEBUGGING>). +The switch was used to track Perl's memory allocations and possible +memory leaks. These days the use of malloc debugging tools like +F<Purify> or F<valgrind> is suggested instead. See also +L<perlhack/PERL_MEM_LOG>. + +=end original + +The B<-DL> command-line switch is obsolete since circa Perl 5.6.0 +(it was available only if Perl was built with C<-DDEBUGGING>). +The switch was used to track Perl's memory allocations and possible +memory leaks. These days the use of malloc debugging tools like +F<Purify> or F<valgrind> is suggested instead. See also +L<perlhack/PERL_MEM_LOG>. +(TBT) + +=begin original + +One way to find out how much memory is being used by Perl data +structures is to install the Devel::Size module from CPAN: it gives +you the minimum number of bytes required to store a particular data +structure. Please be mindful of the difference between the size() +and total_size(). + +=end original + +One way to find out how much memory is being used by Perl data +structures is to install the Devel::Size module from CPAN: it gives +you the minimum number of bytes required to store a particular data +structure. Please be mindful of the difference between the size() +and total_size(). +(TBT) + +=begin original + +If Perl has been compiled using Perl's malloc you can analyze Perl +memory usage by setting the $ENV{PERL_DEBUG_MSTATS}. 
+ +=end original + +If Perl has been compiled using Perl's malloc you can analyze Perl +memory usage by setting the $ENV{PERL_DEBUG_MSTATS}. +(TBT) + +=head2 Using C<$ENV{PERL_DEBUG_MSTATS}> + +(C<$ENV{PERL_DEBUG_MSTATS}> を使う) + +=begin original + +If your perl is using Perl's malloc() and was compiled with the +necessary switches (this is the default), then it will print memory +usage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS} +> 1 >>, and before termination of the program when C<< +$ENV{PERL_DEBUG_MSTATS} >= 1 >>. The report format is similar to +the following example: + +=end original + +If your perl is using Perl's malloc() and was compiled with the +necessary switches (this is the default), then it will print memory +usage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS} +> 1 >>, and before termination of the program when C<< +$ENV{PERL_DEBUG_MSTATS} >= 1 >>. The report format is similar to +the following example: +(TBT) + + $ PERL_DEBUG_MSTATS=2 perl -e "require Carp" + Memory allocation statistics after compilation: (buckets 4(4)..8188(8192) + 14216 free: 130 117 28 7 9 0 2 2 1 0 0 + 437 61 36 0 5 + 60924 used: 125 137 161 55 7 8 6 16 2 0 1 + 74 109 304 84 20 + Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048. + Memory allocation statistics after execution: (buckets 4(4)..8188(8192) + 30888 free: 245 78 85 13 6 2 1 3 2 0 1 + 315 162 39 42 11 + 175816 used: 265 176 1112 111 26 22 11 27 2 1 1 + 196 178 1066 798 39 + Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144. + +=begin original + +It is possible to ask for such a statistic at arbitrary points in +your execution using the mstat() function out of the standard +Devel::Peek module. + +=end original + +It is possible to ask for such a statistic at arbitrary points in +your execution using the mstat() function out of the standard +Devel::Peek module. 
+(TBT) + +=begin original + +Here is some explanation of that format: + +=end original + +Here is some explanation of that format: +(TBT) + +=over 4 + +=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)> + +=begin original + +Perl's malloc() uses bucketed allocations. Every request is rounded +up to the closest bucket size available, and a bucket is taken from +the pool of buckets of that size. + +=end original + +Perl's malloc() uses bucketed allocations. Every request is rounded +up to the closest bucket size available, and a bucket is taken from +the pool of buckets of that size. +(TBT) + +=begin original + +The line above describes the limits of buckets currently in use. +Each bucket has two sizes: memory footprint and the maximal size +of user data that can fit into this bucket. Suppose in the above +example that the smallest bucket were size 4. The biggest bucket +would have usable size 8188, and the memory footprint would be 8192. + +=end original + +The line above describes the limits of buckets currently in use. +Each bucket has two sizes: memory footprint and the maximal size +of user data that can fit into this bucket. Suppose in the above +example that the smallest bucket were size 4. The biggest bucket +would have usable size 8188, and the memory footprint would be 8192. +(TBT) + +=begin original + +In a Perl built for debugging, some buckets may have negative usable +size. This means that these buckets cannot (and will not) be used. +For larger buckets, the memory footprint may be one page greater +than a power of 2. If so, case the corresponding power of two is +printed in the C<APPROX> field above. + +=end original + +In a Perl built for debugging, some buckets may have negative usable +size. This means that these buckets cannot (and will not) be used. +For larger buckets, the memory footprint may be one page greater +than a power of 2. If so, case the corresponding power of two is +printed in the C<APPROX> field above. 
+(TBT) + +=item Free/Used + +=begin original + +The 1 or 2 rows of numbers following that correspond to the number +of buckets of each size between C<SMALLEST> and C<GREATEST>. In +the first row, the sizes (memory footprints) of buckets are powers +of two--or possibly one page greater. In the second row, if present, +the memory footprints of the buckets are between the memory footprints +of two buckets "above". + +=end original + +The 1 or 2 rows of numbers following that correspond to the number +of buckets of each size between C<SMALLEST> and C<GREATEST>. In +the first row, the sizes (memory footprints) of buckets are powers +of two--or possibly one page greater. In the second row, if present, +the memory footprints of the buckets are between the memory footprints +of two buckets "above". +(TBT) + +=begin original + +For example, suppose under the previous example, the memory footprints +were + +=end original + +For example, suppose under the previous example, the memory footprints +were +(TBT) + + free: 8 16 32 64 128 256 512 1024 2048 4096 8192 + 4 12 24 48 80 + +=begin original + +With non-C<DEBUGGING> perl, the buckets starting from C<128> have +a 4-byte overhead, and thus an 8192-long bucket may take up to +8188-byte allocations. + +=end original + +With non-C<DEBUGGING> perl, the buckets starting from C<128> have +a 4-byte overhead, and thus an 8192-long bucket may take up to +8188-byte allocations. +(TBT) + +=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS> + +=begin original + +The first two fields give the total amount of memory perl sbrk(2)ed +(ess-broken? :-) and number of sbrk(2)s used. The third number is +what perl thinks about continuity of returned chunks. So long as +this number is positive, malloc() will assume that it is probable +that sbrk(2) will provide continuous memory. + +=end original + +The first two fields give the total amount of memory perl sbrk(2)ed +(ess-broken? :-) and number of sbrk(2)s used. 
The third number is +what perl thinks about continuity of returned chunks. So long as +this number is positive, malloc() will assume that it is probable +that sbrk(2) will provide continuous memory. +(TBT) + +=begin original + +Memory allocated by external libraries is not counted. + +=end original + +Memory allocated by external libraries is not counted. +(TBT) + +=item C<pad: 0> + +=begin original + +The amount of sbrk(2)ed memory needed to keep buckets aligned. + +=end original + +The amount of sbrk(2)ed memory needed to keep buckets aligned. +(TBT) + +=item C<heads: 2192> + +=begin original + +Although memory overhead of bigger buckets is kept inside the bucket, for +smaller buckets, it is kept in separate areas. This field gives the +total size of these areas. + +=end original + +Although memory overhead of bigger buckets is kept inside the bucket, for +smaller buckets, it is kept in separate areas. This field gives the +total size of these areas. +(TBT) + +=item C<chain: 0> + +=begin original + +malloc() may want to subdivide a bigger bucket into smaller buckets. +If only a part of the deceased bucket is left unsubdivided, the rest +is kept as an element of a linked list. This field gives the total +size of these chunks. + +=end original + +malloc() may want to subdivide a bigger bucket into smaller buckets. +If only a part of the deceased bucket is left unsubdivided, the rest +is kept as an element of a linked list. This field gives the total +size of these chunks. +(TBT) + +=item C<tail: 6144> + +=begin original + +To minimize the number of sbrk(2)s, malloc() asks for more memory. This +field gives the size of the yet unused part, which is sbrk(2)ed, but +never touched. + +=end original + +To minimize the number of sbrk(2)s, malloc() asks for more memory. This +field gives the size of the yet unused part, which is sbrk(2)ed, but +never touched. 
+(TBT) + +=back + +=head1 SEE ALSO + +=begin original + +L<perldebug>, +L<perlguts>, +L<perlrun> +L<re>, +and +L<Devel::DProf>. + +=end original + +L<perldebug>, +L<perlguts>, +L<perlrun>, +L<re>, +L<Devel::DProf> + +=begin meta + +Translate: SHIRAKATA Kentaro <argra****@ub32*****> (5.10.1) +Status: in progress + +=end meta + Index: docs/perl/5.10.1/perliol.pod diff -u docs/perl/5.10.1/perliol.pod:1.1 docs/perl/5.10.1/perliol.pod:1.2 --- docs/perl/5.10.1/perliol.pod:1.1 Wed Apr 3 04:38:27 2013 +++ docs/perl/5.10.1/perliol.pod Tue Apr 16 04:37:14 2013 @@ -26,13 +26,14 @@ =end original -This document describes the behavior and implementation of the PerlIO -abstraction described in L<perlapio> when C<USE_PERLIO> is defined (and -C<USE_SFIO> is not). -(TBT) +この文書は、C<USE_PERLIO> が定義されている(そして C<USE_SFIO> が +定義されていない)場合に L<perlapio> で記述されている PerlIO 抽象化の +振る舞いと実装について記述しています。 =head2 History and Background +(歴史と背景) + =begin original The PerlIO abstraction was introduced in perl5.003_02 but languished as @@ -42,7 +43,7 @@ =end original -The PerlIO abstraction was introduced in perl5.003_02 but languished as +PerlIO 抽象化は was introduced in perl5.003_02 but languished as just an abstraction until perl5.7.0. However during that time a number of perl extensions switched to using it, so the API is mostly fixed to maintain (source) compatibility. @@ -63,6 +64,8 @@ =head2 Basic Structure +(基本構造) + =begin original PerlIO is a stack of layers. @@ -212,6 +215,8 @@ =head2 Layers vs Disciplines +(層とディシプリン) + =begin original Initial discussion of the ability to modify IO streams behaviour used @@ -244,6 +249,8 @@ =head2 Data Structures +(データ構造体) + =begin original The basic data structure is a PerlIOl: @@ -333,6 +340,8 @@ =head2 Functions and Attributes +(関数と属性) + =begin original The functions and attributes are accessed via the "tab" (for table) @@ -413,8 +422,7 @@ =end original -Opening and setup functions -(TBT) +オープンと設定のための関数 =item 2. 
@@ -425,7 +433,6 @@ =end original 基本 IO 操作 -(TBT) =item 3. @@ -436,7 +443,6 @@ =end original Stdio クラスバッファリングオプション。 -(TBT) =item 4. @@ -446,7 +452,7 @@ =end original -Perl の伝統的なバッファへの「高速」アクセスに体操する関数。 +Perl の伝統的なバッファへの「高速」アクセスに対応する関数。 =back @@ -471,6 +477,8 @@ =head2 Per-instance Data +(インスタンス単位のデータ) + =begin original The per-instance data are held in memory beyond the basic PerlIOl @@ -508,6 +516,8 @@ =head2 Layers in action. +(実行中の層) + table perlio unix | | +-----------+ +----------+ +--------+ @@ -623,6 +633,8 @@ =head2 Per-instance flag bits +(インスタンス単位のフラグビット) + =begin original The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced @@ -835,6 +847,8 @@ =head2 Methods in Detail +(メソッドの詳細) + =over 4 =item fsize @@ -932,7 +946,6 @@ =end original 層がバッファリングされている。 -(TBT) =item * PERLIO_K_RAW @@ -1564,8 +1577,8 @@ =end original -Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient. -(TBT) +ファイル終端指示子を返します。 +普通は C<PerlIOBase_eof()> で十分です。 =begin original @@ -1586,8 +1599,8 @@ =end original -Return error indicator. C<PerlIOBase_error()> is normally sufficient. -(TBT) +エラー指示子を返します。 +普通は C<PerlIOBase_error()> で十分です。 =begin original @@ -1596,9 +1609,8 @@ =end original -Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set, -0 otherwise. -(TBT) +エラー (普通は C<PERLIO_F_ERROR> がセットされている) の場合は 1、 +さもなければ 0 を返します。 =item Clearerr @@ -1655,8 +1667,7 @@ =end original -Return the number of bytes that last C<Fill()> put in the buffer. -(TBT) +直前の C<Fill()> がバッファに設定したバイト数を返します。 =item Get_ptr @@ -1668,8 +1679,7 @@ =end original -Return the current read pointer relative to this layer's buffer. -(TBT) +現在の層のバッファに関連する現在の読み込みポインタを返します。 =item Get_cnt @@ -1681,8 +1691,7 @@ =end original -Return the number of bytes left to be read in the current buffer. -(TBT) +現在のバッファで読み込まれるために残っているバイト数を返します。 =item Set_ptrcnt @@ -1706,14 +1715,15 @@ =head2 Utilities +(ユーティリティ) + =begin original To ask for the next layer down use PerlIONext(PerlIO *f). 
=end original -To ask for the next layer down use PerlIONext(PerlIO *f). -(TBT) +次の層を調べるには PerlIONext(PerlIO *f) を使います。 =begin original @@ -1745,7 +1755,7 @@ =end original -PerlIOSelf(PerlIO* f, type) return the PerlIOBase cast to a type. +PerlIOSelf(PerlIO* f, type) は PerlIOBase cast to a type. (TBT) =begin original @@ -1816,6 +1826,8 @@ =head2 Implementing PerlIO Layers +(PerlIO 層の実装) + =begin original If you find the implementation document unclear or not sufficient, @@ -1854,8 +1866,7 @@ =end original -PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core. -(TBT) +Perl コアの PerlIO::encoding, PerlIO::scalar, PerlIO::via。 =begin original @@ -1863,8 +1874,7 @@ =end original -PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN. -(TBT) +CPAN の PerlIO::gzip と APR::PerlIO (mod_perl 2.0)。 =item * Perl implementations @@ -1874,8 +1884,7 @@ =end original -PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN. -(TBT) +Perl コアの PerlIO::via::QuotedPrint と CPAN の PerlIO::via::*。 =back @@ -1944,14 +1953,15 @@ =head2 Core Layers +(コア層) + =begin original The file C<perlio.c> provides the following layers: =end original -The file C<perlio.c> provides the following layers: -(TBT) +ファイル C<perlio.c> は以下の層を提供します: =over 4 @@ -2133,6 +2143,8 @@ =head2 Extension Layers +(エクステンション層) + =begin original Layers can made available by extension modules. When an unknown layer @@ -2152,8 +2164,8 @@ =end original -Where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to: -(TBT) +I<layer> が不明な層とします。 +F<PerlIO.pm> は以下を試します: require PerlIO::layer; @@ -2174,8 +2186,7 @@ =end original -The following extension layers are bundled with perl: -(TBT) +以下のエクステンション層は perl に組み込まれています: =over 4 @@ -2206,8 +2217,7 @@ =end original -Provides support for reading data from and writing data to a scalar. 
-(TBT) +スカラに対するデータの読み書き対応を提供します。 open( $fh, "+<:scalar", \$scalar ); @@ -2232,8 +2242,8 @@ =end original -Please note that this layer is implied when calling open() thus: -(TBT) +open() を呼び出すときにこの層が暗示されていることにどうか注意してください; +従って: open( $fh, "+<", \$scalar ); @@ -2245,8 +2255,8 @@ =end original -Provided to allow layers to be implemented as Perl code. For instance: -(TBT) +Perl コードとして実装された層を使えるようにします。 +例えば: use PerlIO::via::StripHTML; open( my $fh, "<:via(StripHTML)", "index.html" ); @@ -2257,8 +2267,7 @@ =end original -See L<PerlIO::via> for details. -(TBT) +詳しくは L<PerlIO::via> を参照してください。 =back @@ -2270,44 +2279,20 @@ =end original -Things that need to be done to improve this document. -(TBT) +この文書を改良するために行われる必要のあることです。 =over =item * -=begin original - -Explain how to make a valid fh without going through open()(i.e. apply -a layer). For example if the file is not opened through perl, but we -want to get back a fh, like it was opened by Perl. - -=end original - Explain how to make a valid fh without going through open()(i.e. apply a layer). For example if the file is not opened through perl, but we want to get back a fh, like it was opened by Perl. -(TBT) - -=begin original How PerlIO_apply_layera fits in, where its docs, was it made public? -=end original - -How PerlIO_apply_layera fits in, where its docs, was it made public? -(TBT) - -=begin original - Currently the example could be something like this: -=end original - -Currently the example could be something like this: -(TBT) - PerlIO *foo_to_PerlIO(pTHX_ char *mode, ...) { char *mode; /* "w", "r", etc */ @@ -2332,61 +2317,25 @@ =item * -=begin original - fix/add the documentation in places marked as XXX. -=end original - -fix/add the documentation in places marked as XXX. -(TBT) - =item * -=begin original - The handling of errors by the layer is not specified. e.g. when $! should be set explicitly, when the error handling should be just delegated to the top layer. 
-=end original - -The handling of errors by the layer is not specified. e.g. when $! -should be set explicitly, when the error handling should be just -delegated to the top layer. -(TBT) - -=begin original - Probably give some hints on using SETERRNO() or pointers to where they can be found. -=end original - -Probably give some hints on using SETERRNO() or pointers to where they -can be found. -(TBT) - =item * -=begin original - -I think it would help to give some concrete examples to make it easier -to understand the API. Of course I agree that the API has to be -concise, but since there is no second document that is more of a -guide, I think that it'd make it easier to start with the doc which is -an API, but has examples in it in places where things are unclear, to -a person who is not a PerlIO guru (yet). - -=end original - I think it would help to give some concrete examples to make it easier to understand the API. Of course I agree that the API has to be concise, but since there is no second document that is more of a guide, I think that it'd make it easier to start with the doc which is an API, but has examples in it in places where things are unclear, to a person who is not a PerlIO guru (yet). -(TBT) =back Index: docs/perl/5.10.1/perlreapi.pod diff -u /dev/null docs/perl/5.10.1/perlreapi.pod:1.1 --- /dev/null Tue Apr 16 04:37:14 2013 +++ docs/perl/5.10.1/perlreapi.pod Tue Apr 16 04:37:14 2013 @@ -0,0 +1,1567 @@ + +=encoding euc-jp + +=head1 NAME + +=begin original + +perlreapi - perl regular expression plugin interface + +=end original + +perlreapi - perl 正規表現プラグインインターフェース + +=head1 DESCRIPTION + +=begin original + +As of Perl 5.9.5 there is a new interface for plugging and using other +regular expression engines than the default one. 
+ +=end original + +Perl 5.9.5 から、デフォルトと異なるその他の正規表現エンジンを使うための +新しいインターフェスがあります。 + +=begin original + +Each engine is supposed to provide access to a constant structure of the +following format: + +=end original + +それぞれのエンジンは以下の形式の定数構造体へのアクセスを提供することに +なっています: + + typedef struct regexp_engine { + REGEXP* (*comp) (pTHX_ const SV * const pattern, const U32 flags); + I32 (*exec) (pTHX_ REGEXP * const rx, char* stringarg, char* strend, + char* strbeg, I32 minend, SV* screamer, + void* data, U32 flags); + char* (*intuit) (pTHX_ REGEXP * const rx, SV *sv, char *strpos, + char *strend, U32 flags, + struct re_scream_pos_data_s *data); + SV* (*checkstr) (pTHX_ REGEXP * const rx); + void (*free) (pTHX_ REGEXP * const rx); + void (*numbered_buff_FETCH) (pTHX_ REGEXP * const rx, const I32 paren, + SV * const sv); + void (*numbered_buff_STORE) (pTHX_ REGEXP * const rx, const I32 paren, + SV const * const value); + I32 (*numbered_buff_LENGTH) (pTHX_ REGEXP * const rx, const SV * const sv, + const I32 paren); + SV* (*named_buff) (pTHX_ REGEXP * const rx, SV * const key, + SV * const value, U32 flags); + SV* (*named_buff_iter) (pTHX_ REGEXP * const rx, const SV * const lastkey, + const U32 flags); + SV* (*qr_package)(pTHX_ REGEXP * const rx); + #ifdef USE_ITHREADS + void* (*dupe) (pTHX_ REGEXP * const rx, CLONE_PARAMS *param); + #endif + +=begin original + +When a regexp is compiled, its C<engine> field is then set to point at +the appropriate structure, so that when it needs to be used Perl can find +the right routines to do so. + +=end original + +正規表現がコンパイルされるとき、C<engine> フィールドが適切な構造体を +指すように設定されるので、使われる必要があるとき、Perl はそうするための +正しいルーチンを見つけられます。 + +=begin original + +In order to install a new regexp handler, C<$^H{regcomp}> is set +to an integer which (when casted appropriately) resolves to one of these +structures. When compiling, the C<comp> method is executed, and the +resulting regexp structure's engine field is expected to point back at +the same structure. 
+ +=end original + +新しい正規表現ハンドラをインストールするために、 +C<$^H{regcomp}> is set +to an integer which (when casted appropriately) resolves to one of these +structures. When compiling, the C<comp> method is executed, and the +resulting regexp structure's engine field is expected to point back at +the same structure. +(TBT) + +=begin original + +The pTHX_ symbol in the definition is a macro used by perl under threading +to provide an extra argument to the routine holding a pointer back to +the interpreter that is executing the regexp. So under threading all +routines get an extra argument. + +=end original + +The pTHX_ symbol in the definition is a macro used by perl under threading +to provide an extra argument to the routine holding a pointer back to +the interpreter that is executing the regexp. So under threading all +routines get an extra argument. +(TBT) + +=head1 Callbacks + +=head2 comp + + REGEXP* comp(pTHX_ const SV * const pattern, const U32 flags); + +=begin original + +Compile the pattern stored in C<pattern> using the given C<flags> and +return a pointer to a prepared C<REGEXP> structure that can perform +the match. See L</The REGEXP structure> below for an explanation of +the individual fields in the REGEXP struct. + +=end original + +Compile the pattern stored in C<pattern> using the given C<flags> and +return a pointer to a prepared C<REGEXP> structure that can perform +the match. See L</The REGEXP structure> below for an explanation of +the individual fields in the REGEXP struct. +(TBT) + +=begin original + +The C<pattern> parameter is the scalar that was used as the +pattern. previous versions of perl would pass two C<char*> indicating +the start and end of the stringified pattern, the following snippet can +be used to get the old parameters: + +=end original + +The C<pattern> parameter is the scalar that was used as the +pattern. 
previous versions of perl would pass two C<char*> indicating +the start and end of the stringified pattern, the following snippet can +be used to get the old parameters: +(TBT) + + STRLEN plen; + char* exp = SvPV(pattern, plen); + char* xend = exp + plen; + +=begin original + +Since any scalar can be passed as a pattern it's possible to implement +an engine that does something with an array (C<< "ook" =~ [ qw/ eek +hlagh / ] >>) or with the non-stringified form of a compiled regular +expression (C<< "ook" =~ qr/eek/ >>). perl's own engine will always +stringify everything using the snippet above but that doesn't mean +other engines have to. + +=end original + +Since any scalar can be passed as a pattern it's possible to implement +an engine that does something with an array (C<< "ook" =~ [ qw/ eek +hlagh / ] >>) or with the non-stringified form of a compiled regular +expression (C<< "ook" =~ qr/eek/ >>). perl's own engine will always +stringify everything using the snippet above but that doesn't mean +other engines have to. +(TBT) + +=begin original + +The C<flags> parameter is a bitfield which indicates which of the +C<msixp> flags the regex was compiled with. It also contains +additional info such as whether C<use locale> is in effect. + +=end original + +The C<flags> parameter is a bitfield which indicates which of the +C<msixp> flags the regex was compiled with. It also contains +additional info such as whether C<use locale> is in effect. +(TBT) + +=begin original + +The C<eogc> flags are stripped out before being passed to the comp +routine. The regex engine does not need to know whether any of these +are set as those flags should only affect what perl does with the +pattern and its match variables, not how it gets compiled and +executed. + +=end original + +The C<eogc> flags are stripped out before being passed to the comp +routine. 
The regex engine does not need to know whether any of these +are set as those flags should only affect what perl does with the +pattern and its match variables, not how it gets compiled and +executed. +(TBT) + +=begin original + +By the time the comp callback is called, some of these flags have +already had effect (noted below where applicable). However most of +their effect occurs after the comp callback has run in routines that +read the C<< rx->extflags >> field which it populates. + +=end original + +By the time the comp callback is called, some of these flags have +already had effect (noted below where applicable). However most of +their effect occurs after the comp callback has run in routines that +read the C<< rx->extflags >> field which it populates. +(TBT) + +=begin original + +In general the flags should be preserved in C<< rx->extflags >> after +compilation, although the regex engine might want to add or delete +some of them to invoke or disable some special behavior in perl. The +flags along with any special behavior they cause are documented below: + +=end original + +In general the flags should be preserved in C<< rx->extflags >> after +compilation, although the regex engine might want to add or delete +some of them to invoke or disable some special behavior in perl. The +flags along with any special behavior they cause are documented below: +(TBT) + +=begin original + +The pattern modifiers: + +=end original + +The pattern modifiers: +(TBT) + +=over 4 + +=item C</m> - RXf_PMf_MULTILINE + +=begin original + +If this is in C<< rx->extflags >> it will be passed to +C<Perl_fbm_instr> by C<pp_split> which will treat the subject string +as a multi-line string. + +=end original + +If this is in C<< rx->extflags >> it will be passed to +C<Perl_fbm_instr> by C<pp_split> which will treat the subject string +as a multi-line string. 
+(TBT) + +=item C</s> - RXf_PMf_SINGLELINE + +=item C</i> - RXf_PMf_FOLD + +=item C</x> - RXf_PMf_EXTENDED + +=begin original + +If present on a regex, C<#> comments will be handled differently by the +tokenizer in some cases. + +=end original + +If present on a regex, C<#> comments will be handled differently by the +tokenizer in some cases. +(TBT) + +=begin original + +TODO: Document those cases. + +=end original + +TODO: Document those cases. +(TBT) + +=item C</p> - RXf_PMf_KEEPCOPY + +=back + +=begin original + +Additional flags: + +=end original + +Additional flags: +(TBT) + +=over 4 + +=item RXf_PMf_LOCALE + +=begin original + +Set if C<use locale> is in effect. If present in C<< rx->extflags >>, +C<split> will use the locale-dependent definition of whitespace +when RXf_SKIPWHITE or RXf_WHITE are in effect. Under ASCII whitespace +is defined as per L<isSPACE|perlapi/ISSPACE>, and by the internal +macros C<is_utf8_space> under UTF-8 and C<isSPACE_LC> under C<use +locale>. + +=end original + +Set if C<use locale> is in effect. If present in C<< rx->extflags >>, +C<split> will use the locale-dependent definition of whitespace +when RXf_SKIPWHITE or RXf_WHITE are in effect. Under ASCII whitespace +is defined as per L<isSPACE|perlapi/ISSPACE>, and by the internal +macros C<is_utf8_space> under UTF-8 and C<isSPACE_LC> under C<use +locale>. +(TBT) + +=item RXf_UTF8 + +=begin original + +Set if the pattern is L<SvUTF8()|perlapi/SvUTF8>, set by Perl_pmruntime. + +=end original + +Set if the pattern is L<SvUTF8()|perlapi/SvUTF8>, set by Perl_pmruntime. +(TBT) + +=begin original + +A regex engine may want to set or disable this flag during +compilation. The perl engine for instance may upgrade non-UTF-8 +strings to UTF-8 if the pattern includes constructs such as C<\x{...}> +that can only match Unicode values. + +=end original + +A regex engine may want to set or disable this flag during +compilation. 
The perl engine for instance may upgrade non-UTF-8 +strings to UTF-8 if the pattern includes constructs such as C<\x{...}> +that can only match Unicode values. +(TBT) + +=item RXf_SPLIT + +=begin original + +If C<split> is invoked as C<split ' '> or with no arguments (which +really means C<split(' ', $_)>, see L<split|perlfunc/split>), perl will +set this flag. The regex engine can then check for it and set the +SKIPWHITE and WHITE extflags. To do this the perl engine does: + +=end original + +If C<split> is invoked as C<split ' '> or with no arguments (which +really means C<split(' ', $_)>, see L<split|perlfunc/split>), perl will +set this flag. The regex engine can then check for it and set the +SKIPWHITE and WHITE extflags. To do this the perl engine does: +(TBT) + + if (flags & RXf_SPLIT && r->prelen == 1 && r->precomp[0] == ' ') + r->extflags |= (RXf_SKIPWHITE|RXf_WHITE); + +=back + +=begin original + +These flags can be set during compilation to enable optimizations in +the C<split> operator. + +=end original + +These flags can be set during compilation to enable optimizations in +the C<split> operator. +(TBT) + +=over 4 + +=item RXf_SKIPWHITE + +=begin original + +If the flag is present in C<< rx->extflags >> C<split> will delete +whitespace from the start of the subject string before it's operated +on. What is considered whitespace depends on whether the subject is a +UTF-8 string and whether the C<RXf_PMf_LOCALE> flag is set. + +=end original + +If the flag is present in C<< rx->extflags >> C<split> will delete +whitespace from the start of the subject string before it's operated +on. What is considered whitespace depends on whether the subject is a +UTF-8 string and whether the C<RXf_PMf_LOCALE> flag is set. +(TBT) + +=begin original + +If RXf_WHITE is set in addition to this flag C<split> will behave like +C<split " "> under the perl engine. 
+ +=end original + +If RXf_WHITE is set in addition to this flag C<split> will behave like +C<split " "> under the perl engine. +(TBT) + +=item RXf_START_ONLY + +=begin original + +Tells the split operator to split the target string on newlines +(C<\n>) without invoking the regex engine. + +=end original + +Tells the split operator to split the target string on newlines +(C<\n>) without invoking the regex engine. +(TBT) + +=begin original + +Perl's engine sets this if the pattern is C</^/> (C<plen == 1 && *exp +== '^'>), even under C</^/s>; see L<split|perlfunc/split>. Of course a +different regex engine might want to use the same optimizations +with a different syntax. + +=end original + +Perl's engine sets this if the pattern is C</^/> (C<plen == 1 && *exp +== '^'>), even under C</^/s>; see L<split|perlfunc/split>. Of course a +different regex engine might want to use the same optimizations +with a different syntax. +(TBT) + +=item RXf_WHITE + +=begin original + +Tells the split operator to split the target string on whitespace +without invoking the regex engine. The definition of whitespace varies +depending on whether the target string is a UTF-8 string and on +whether RXf_PMf_LOCALE is set. + +=end original + +Tells the split operator to split the target string on whitespace +without invoking the regex engine. The definition of whitespace varies +depending on whether the target string is a UTF-8 string and on +whether RXf_PMf_LOCALE is set. +(TBT) + +=begin original + +Perl's engine sets this flag if the pattern is C<\s+>. + +=end original + +Perl's engine sets this flag if the pattern is C<\s+>. +(TBT) + +=item RXf_NULL + +=begin original + +Tells the split operator to split the target string on +characters. The definition of character varies depending on whether +the target string is a UTF-8 string. + +=end original + +Tells the split operator to split the target string on +characters. 
The definition of character varies depending on whether +the target string is a UTF-8 string. +(TBT) + +=begin original + +Perl's engine sets this flag on empty patterns, this optimization +makes C<split //> much faster than it would otherwise be. It's even +faster than C<unpack>. + +=end original + +Perl's engine sets this flag on empty patterns, this optimization +makes C<split //> much faster than it would otherwise be. It's even +faster than C<unpack>. +(TBT) + +=back + +=head2 exec + + I32 exec(pTHX_ REGEXP * const rx, + char *stringarg, char* strend, char* strbeg, + I32 minend, SV* screamer, + void* data, U32 flags); + +=begin original + +Execute a regexp. + +=end original + +Execute a regexp. +(TBT) + +=head2 intuit + + char* intuit(pTHX_ REGEXP * const rx, + SV *sv, char *strpos, char *strend, + const U32 flags, struct re_scream_pos_data_s *data); + +=begin original + +Find the start position where a regex match should be attempted, +or possibly whether the regex engine should not be run because the +pattern can't match. This is called as appropriate by the core +depending on the values of the extflags member of the regexp +structure. + +=end original + +Find the start position where a regex match should be attempted, +or possibly whether the regex engine should not be run because the +pattern can't match. This is called as appropriate by the core +depending on the values of the extflags member of the regexp +structure. +(TBT) + +=head2 checkstr + + SV* checkstr(pTHX_ REGEXP * const rx); + +=begin original + +Return a SV containing a string that must appear in the pattern. Used +by C<split> for optimising matches. + +=end original + +Return a SV containing a string that must appear in the pattern. Used +by C<split> for optimising matches. 
+(TBT) + +=head2 free + + void free(pTHX_ REGEXP * const rx); + +=begin original + +Called by perl when it is freeing a regexp pattern so that the engine +can release any resources pointed to by the C<pprivate> member of the +regexp structure. This is only responsible for freeing private data; +perl will handle releasing anything else contained in the regexp structure. + +=end original + +Called by perl when it is freeing a regexp pattern so that the engine +can release any resources pointed to by the C<pprivate> member of the +regexp structure. This is only responsible for freeing private data; +perl will handle releasing anything else contained in the regexp structure. +(TBT) + +=head2 Numbered capture callbacks + +(番号付き捕捉コールバック) + +=begin original + +Called to get/set the value of C<$`>, C<$'>, C<$&> and their named +equivalents, ${^PREMATCH}, ${^POSTMATCH} and ${^MATCH}, as well as the +numbered capture buffers (C<$1>, C<$2>, ...). + +=end original + +Called to get/set the value of C<$`>, C<$'>, C<$&> and their named +equivalents, ${^PREMATCH}, ${^POSTMATCH} and ${^MATCH}, as well as the +numbered capture buffers (C<$1>, C<$2>, ...). +(TBT) + +=begin original + +The C<paren> parameter will be C<-2> for C<$`>, C<-1> for C<$'>, C<0> +for C<$&>, C<1> for C<$1> and so forth. + +=end original + +The C<paren> parameter will be C<-2> for C<$`>, C<-1> for C<$'>, C<0> +for C<$&>, C<1> for C<$1> and so forth. +(TBT) + +=begin original + +The names have been chosen by analogy with L<Tie::Scalar> method +names with an additional B<LENGTH> callback for efficiency. However +named capture variables are currently not tied internally but +implemented via magic. + +=end original + +The names have been chosen by analogy with L<Tie::Scalar> method +names with an additional B<LENGTH> callback for efficiency. However +named capture variables are currently not tied internally but +implemented via magic. 
+(TBT) + +=head3 numbered_buff_FETCH + + void numbered_buff_FETCH(pTHX_ REGEXP * const rx, const I32 paren, + SV * const sv); + +=begin original + +Fetch a specified numbered capture. C<sv> should be set to the scalar +to return, the scalar is passed as an argument rather than being +returned from the function because when it's called perl already has a +scalar to store the value, creating another one would be +redundant. The scalar can be set with C<sv_setsv>, C<sv_setpvn> and +friends, see L<perlapi>. + +=end original + +Fetch a specified numbered capture. C<sv> should be set to the scalar +to return, the scalar is passed as an argument rather than being +returned from the function because when it's called perl already has a +scalar to store the value, creating another one would be +redundant. The scalar can be set with C<sv_setsv>, C<sv_setpvn> and +friends, see L<perlapi>. +(TBT) + +=begin original + +This callback is where perl untaints its own capture variables under +taint mode (see L<perlsec>). See the C<Perl_reg_numbered_buff_fetch> +function in F<regcomp.c> for how to untaint capture variables if +that's something you'd like your engine to do as well. + +=end original + +This callback is where perl untaints its own capture variables under +taint mode (see L<perlsec>). See the C<Perl_reg_numbered_buff_fetch> +function in F<regcomp.c> for how to untaint capture variables if +that's something you'd like your engine to do as well. +(TBT) + +=head3 numbered_buff_STORE + + void (*numbered_buff_STORE) (pTHX_ REGEXP * const rx, const I32 paren, + SV const * const value); + +=begin original + +Set the value of a numbered capture variable. C<value> is the scalar +that is to be used as the new value. It's up to the engine to make +sure this is used as the new value (or reject it). + +=end original + +Set the value of a numbered capture variable. C<value> is the scalar +that is to be used as the new value. 
It's up to the engine to make +sure this is used as the new value (or reject it). +(TBT) + +=begin original + +Example: + +=end original + +Example: +(TBT) + + if ("ook" =~ /(o*)/) { + # `paren' will be `1' and `value' will be `ee' + $1 =~ tr/o/e/; + } + +=begin original + +Perl's own engine will croak on any attempt to modify the capture +variables; to do this in another engine use the following callback +(copied from C<Perl_reg_numbered_buff_store>): + +=end original + +Perl's own engine will croak on any attempt to modify the capture +variables; to do this in another engine use the following callback +(copied from C<Perl_reg_numbered_buff_store>): +(TBT) + + void + Example_reg_numbered_buff_store(pTHX_ REGEXP * const rx, const I32 paren, + SV const * const value) + { + PERL_UNUSED_ARG(rx); + PERL_UNUSED_ARG(paren); + PERL_UNUSED_ARG(value); + + if (!PL_localizing) + Perl_croak(aTHX_ PL_no_modify); + } + +=begin original + +Actually perl will not I<always> croak in a statement that looks +like it would modify a numbered capture variable. This is because the +STORE callback will not be called if perl can determine that it +doesn't have to modify the value. This is exactly how tied variables +behave in the same situation: + +=end original + +Actually perl will not I<always> croak in a statement that looks +like it would modify a numbered capture variable. This is because the +STORE callback will not be called if perl can determine that it +doesn't have to modify the value. This is exactly how tied variables +behave in the same situation: +(TBT) + + package CaptureVar; + use base 'Tie::Scalar'; + + sub TIESCALAR { bless [] } + sub FETCH { undef } + sub STORE { die "This doesn't get called" } + + package main; + + tie my $sv => "CaptureVar"; + $sv =~ y/a/b/; + +=begin original + +Because C<$sv> is C<undef> when the C<y///> operator is applied to it +the transliteration won't actually execute and the program won't +C<die>. 
This is different from how 5.8 and earlier versions behaved +since the capture variables were READONLY variables then; now they'll +just die when assigned to in the default engine. + +=end original + +Because C<$sv> is C<undef> when the C<y///> operator is applied to it +the transliteration won't actually execute and the program won't +C<die>. This is different from how 5.8 and earlier versions behaved +since the capture variables were READONLY variables then; now they'll +just die when assigned to in the default engine. +(TBT) + +=head3 numbered_buff_LENGTH + + I32 numbered_buff_LENGTH (pTHX_ REGEXP * const rx, const SV * const sv, + const I32 paren); + +=begin original + +Get the C<length> of a capture variable. There's a special callback +for this so that perl doesn't have to do a FETCH and run C<length> on +the result. Since the length is (in perl's case) known from an offset +stored in C<< rx->offs >>, this is much more efficient: + +=end original + +Get the C<length> of a capture variable. There's a special callback +for this so that perl doesn't have to do a FETCH and run C<length> on +the result. Since the length is (in perl's case) known from an offset +stored in C<< rx->offs >>, this is much more efficient: +(TBT) + + I32 s1 = rx->offs[paren].start; + I32 s2 = rx->offs[paren].end; + I32 len = s2 - s1; + +=begin original + +This is a little bit more complex in the case of UTF-8; see what +C<Perl_reg_numbered_buff_length> does with +L<is_utf8_string_loclen|perlapi/is_utf8_string_loclen>. + +=end original + +This is a little bit more complex in the case of UTF-8; see what +C<Perl_reg_numbered_buff_length> does with +L<is_utf8_string_loclen|perlapi/is_utf8_string_loclen>. +(TBT) + +=head2 Named capture callbacks + +(名前付き捕捉コールバック) + +=begin original + +Called to get/set the value of C<%+> and C<%-> as well as by some +utility functions in L<re>. + +=end original + +Called to get/set the value of C<%+> and C<%-> as well as by some +utility functions in L<re>. 
+(TBT) + +=begin original + +There are two callbacks: C<named_buff> is called in all the cases the +FETCH, STORE, DELETE, CLEAR, EXISTS and SCALAR L<Tie::Hash> callbacks +would be on changes to C<%+> and C<%-> and C<named_buff_iter> in the +same cases as FIRSTKEY and NEXTKEY. + +=end original + +There are two callbacks: C<named_buff> is called in all the cases the +FETCH, STORE, DELETE, CLEAR, EXISTS and SCALAR L<Tie::Hash> callbacks +would be on changes to C<%+> and C<%-> and C<named_buff_iter> in the +same cases as FIRSTKEY and NEXTKEY. +(TBT) + +=begin original + +The C<flags> parameter can be used to determine which of these +operations the callbacks should respond to; the following flags are +currently defined: + +=end original + +The C<flags> parameter can be used to determine which of these +operations the callbacks should respond to; the following flags are +currently defined: +(TBT) + +=begin original + +Which L<Tie::Hash> operation is being performed from the Perl level on +C<%+> or C<%->, if any: + +=end original + +Which L<Tie::Hash> operation is being performed from the Perl level on +C<%+> or C<%->, if any: +(TBT) + + RXapif_FETCH + RXapif_STORE + RXapif_DELETE + RXapif_CLEAR + RXapif_EXISTS + RXapif_SCALAR + RXapif_FIRSTKEY + RXapif_NEXTKEY + +=begin original + +Whether C<%+> or C<%-> is being operated on, if any. + +=end original + +Whether C<%+> or C<%-> is being operated on, if any. +(TBT) + + RXapif_ONE /* %+ */ + RXapif_ALL /* %- */ + +=begin original + +Whether this is being called as C<re::regname>, C<re::regnames> or +C<re::regnames_count>, if any. The first two will be combined with +C<RXapif_ONE> or C<RXapif_ALL>. + +=end original + +Whether this is being called as C<re::regname>, C<re::regnames> or +C<re::regnames_count>, if any. The first two will be combined with +C<RXapif_ONE> or C<RXapif_ALL>. 
+(TBT) + + RXapif_REGNAME + RXapif_REGNAMES + RXapif_REGNAMES_COUNT + +=begin original + +Internally C<%+> and C<%-> are implemented with a real tied interface +via L<Tie::Hash::NamedCapture>. The methods in that package will call +back into these functions. However, the usage of +L<Tie::Hash::NamedCapture> for this purpose might change in future +releases. For instance, this might be implemented by magic instead +(would need an extension to mgvtbl). + +=end original + +Internally C<%+> and C<%-> are implemented with a real tied interface +via L<Tie::Hash::NamedCapture>. The methods in that package will call +back into these functions. However, the usage of +L<Tie::Hash::NamedCapture> for this purpose might change in future +releases. For instance, this might be implemented by magic instead +(would need an extension to mgvtbl). +(TBT) + +=head3 named_buff + + SV* (*named_buff) (pTHX_ REGEXP * const rx, SV * const key, + SV * const value, U32 flags); + +=head3 named_buff_iter + + SV* (*named_buff_iter) (pTHX_ REGEXP * const rx, const SV * const lastkey, + const U32 flags); + +=head2 qr_package + + SV* qr_package(pTHX_ REGEXP * const rx); + +=begin original + +The package the qr// magic object is blessed into (as seen by C<ref +qr//>). It is recommended that engines change this to their package +name for identification regardless of whether they implement methods +on the object. + +=end original + +The package the qr// magic object is blessed into (as seen by C<ref +qr//>). It is recommended that engines change this to their package +name for identification regardless of whether they implement methods +on the object. +(TBT) + +=begin original + +The package this method returns should also have the internal +C<Regexp> package in its C<@ISA>. C<< qr//->isa("Regexp") >> should always +be true regardless of what engine is being used. + +=end original + +The package this method returns should also have the internal +C<Regexp> package in its C<@ISA>. 
C<< qr//->isa("Regexp") >> should always +be true regardless of what engine is being used. +(TBT) + +=begin original + +An example implementation might be: + +=end original + +An example implementation might be: +(TBT) + + SV* + Example_qr_package(pTHX_ REGEXP * const rx) + { + PERL_UNUSED_ARG(rx); + return newSVpvs("re::engine::Example"); + } + +=begin original + +Any method calls on an object created with C<qr//> will be dispatched to the +package as for a normal object. + +=end original + +Any method calls on an object created with C<qr//> will be dispatched to the +package as for a normal object. +(TBT) + + use re::engine::Example; + my $re = qr//; + $re->meth; # dispatched to re::engine::Example::meth() + +=begin original + +To retrieve the C<REGEXP> object from the scalar in an XS function use +the C<SvRX> macro; see L<"REGEXP Functions" in perlapi|perlapi/REGEXP +Functions>. + +=end original + +To retrieve the C<REGEXP> object from the scalar in an XS function use +the C<SvRX> macro; see L<"REGEXP Functions" in perlapi|perlapi/REGEXP +Functions>. +(TBT) + + void meth(SV * rv) + PPCODE: + REGEXP * re = SvRX(rv); + +=head2 dupe + + void* dupe(pTHX_ REGEXP * const rx, CLONE_PARAMS *param); + +=begin original + +On threaded builds a regexp may need to be duplicated so that the pattern +can be used by multiple threads. This routine is expected to handle the +duplication of any private data pointed to by the C<pprivate> member of +the regexp structure. It will be called with the preconstructed new +regexp structure as an argument, the C<pprivate> member will point at +the B<old> private structure, and it is this routine's responsibility to +construct a copy and return a pointer to it (which perl will then use to +overwrite the field as passed to this routine.) + +=end original + +On threaded builds a regexp may need to be duplicated so that the pattern +can be used by multiple threads. 
This routine is expected to handle the +duplication of any private data pointed to by the C<pprivate> member of +the regexp structure. It will be called with the preconstructed new +regexp structure as an argument, the C<pprivate> member will point at +the B<old> private structure, and it is this routine's responsibility to +construct a copy and return a pointer to it (which perl will then use to +overwrite the field as passed to this routine.) +(TBT) + +=begin original + +This allows the engine to dupe its private data but also if necessary +modify the final structure if it really must. + +=end original + +This allows the engine to dupe its private data but also if necessary +modify the final structure if it really must. +(TBT) + +=begin original + +On unthreaded builds this field doesn't exist. + +=end original + +On unthreaded builds this field doesn't exist. +(TBT) + +=head1 The REGEXP structure + +(REGEXP 構造体) + +=begin original + +The REGEXP struct is defined in F<regexp.h>. All regex engines must be able to +correctly build such a structure in their L</comp> routine. + +=end original + +The REGEXP struct is defined in F<regexp.h>. All regex engines must be able to +correctly build such a structure in their L</comp> routine. +(TBT) + +=begin original + +The REGEXP structure contains all the data that perl needs to be aware of +to properly work with the regular expression. It includes data about +optimisations that perl can use to determine if the regex engine should +really be used, and various other control info that is needed to properly +execute patterns in various contexts such as is the pattern anchored in +some way, or what flags were used during the compile, or whether the +program contains special constructs that perl needs to be aware of. + +=end original + +The REGEXP structure contains all the data that perl needs to be aware of +to properly work with the regular expression. 
It includes data about +optimisations that perl can use to determine if the regex engine should +really be used, and various other control info that is needed to properly +execute patterns in various contexts such as is the pattern anchored in +some way, or what flags were used during the compile, or whether the +program contains special constructs that perl needs to be aware of. +(TBT) + +=begin original + +In addition it contains two fields that are intended for the private +use of the regex engine that compiled the pattern. These are the +C<intflags> and C<pprivate> members. C<pprivate> is a void pointer to +an arbitrary structure whose use and management is the responsibility +of the compiling engine. perl will never modify either of these +values. + +=end original + +In addition it contains two fields that are intended for the private +use of the regex engine that compiled the pattern. These are the +C<intflags> and C<pprivate> members. C<pprivate> is a void pointer to +an arbitrary structure whose use and management is the responsibility +of the compiling engine. perl will never modify either of these +values. +(TBT) + + typedef struct regexp { + /* what engine created this regexp? */ + const struct regexp_engine* engine; + + /* what re is this a lightweight copy of? */ + struct regexp* mother_re; + + /* Information about the match that the perl core uses to manage things */ + U32 extflags; /* Flags used both externally and internally */ + I32 minlen; /* minimum possible length of string to match */ + I32 minlenret; /* minimum possible length of $& */ + U32 gofs; /* chars left of pos that we search from */ + + /* substring data about strings that must appear + in the final match, used for optimisations */ + struct reg_substr_data *substrs; + + U32 nparens; /* number of capture buffers */ + + /* private engine specific data */ + U32 intflags; /* Engine Specific Internal flags */ + void *pprivate; /* Data private to the regex engine which + created this object. 
*/ + + /* Data about the last/current match. These are modified during matching*/ + U32 lastparen; /* last open paren matched */ + U32 lastcloseparen; /* last close paren matched */ + regexp_paren_pair *swap; /* Swap copy of *offs */ + regexp_paren_pair *offs; /* Array of offsets for (@-) and (@+) */ + + char *subbeg; /* saved or original string so \digit works forever. */ + SV_SAVED_COPY /* If non-NULL, SV which is COW from original */ + I32 sublen; /* Length of string pointed by subbeg */ + + /* Information about the match that isn't often used */ + I32 prelen; /* length of precomp */ + const char *precomp; /* pre-compilation regular expression */ + + char *wrapped; /* wrapped version of the pattern */ + I32 wraplen; /* length of wrapped */ + + I32 seen_evals; /* number of eval groups in the pattern - for security checks */ + HV *paren_names; /* Optional hash of paren names */ + + /* Refcount of this regexp */ + I32 refcnt; /* Refcount of this regexp */ + } regexp; + +=begin original + +The fields are discussed in more detail below: + +=end original + +The fields are discussed in more detail below: +(TBT) + +=head2 C<engine> + +=begin original + +This field points at a regexp_engine structure which contains pointers +to the subroutines that are to be used for performing a match. It +is the compiling routine's responsibility to populate this field before +returning the regexp object. + +=end original + +This field points at a regexp_engine structure which contains pointers +to the subroutines that are to be used for performing a match. It +is the compiling routine's responsibility to populate this field before +returning the regexp object. +(TBT) + +=begin original + +Internally this is set to C<NULL> unless a custom engine is specified in +C<$^H{regcomp}>, perl's own set of callbacks can be accessed in the struct +pointed to by C<RE_ENGINE_PTR>. 
+ +=end original + +Internally this is set to C<NULL> unless a custom engine is specified in +C<$^H{regcomp}>, perl's own set of callbacks can be accessed in the struct +pointed to by C<RE_ENGINE_PTR>. +(TBT) + +=head2 C<mother_re> + +=begin original + +TODO, see L<http://www.mail-archive.com/perl5****@perl*****/msg17328.html> + +=end original + +TODO, see L<http://www.mail-archive.com/perl5****@perl*****/msg17328.html> +(TBT) + +=head2 C<extflags> + +=begin original + +This will be used by perl to see what flags the regexp was compiled +with, this will normally be set to the value of the flags parameter by +the L<comp|/comp> callback. See the L<comp|/comp> documentation for +valid flags. + +=end original + +This will be used by perl to see what flags the regexp was compiled +with, this will normally be set to the value of the flags parameter by +the L<comp|/comp> callback. See the L<comp|/comp> documentation for +valid flags. +(TBT) + +=head2 C<minlen> C<minlenret> + +=begin original + +The minimum string length required for the pattern to match. This is used to +prune the search space by not bothering to match any closer to the end of a +string than would allow a match. For instance there is no point in even +starting the regex engine if the minlen is 10 but the string is only 5 +characters long. There is no way that the pattern can match. + +=end original + +The minimum string length required for the pattern to match. This is used to +prune the search space by not bothering to match any closer to the end of a +string than would allow a match. For instance there is no point in even +starting the regex engine if the minlen is 10 but the string is only 5 +characters long. There is no way that the pattern can match. +(TBT) + +=begin original + +C<minlenret> is the minimum length of the string that would be found +in $& after a match. + +=end original + +C<minlenret> is the minimum length of the string that would be found +in $& after a match. 
+(TBT) + +=begin original + +The difference between C<minlen> and C<minlenret> can be seen in the +following pattern: + +=end original + +The difference between C<minlen> and C<minlenret> can be seen in the +following pattern: +(TBT) + + /ns(?=\d)/ + +=begin original + +where the C<minlen> would be 3 but C<minlenret> would only be 2 as the \d is +required to match but is not actually included in the matched content. This +distinction is particularly important as the substitution logic uses the +C<minlenret> to tell whether it can do in-place substitution, which can result in +considerable speedup. + +=end original + +where the C<minlen> would be 3 but C<minlenret> would only be 2 as the \d is +required to match but is not actually included in the matched content. This +distinction is particularly important as the substitution logic uses the +C<minlenret> to tell whether it can do in-place substitution, which can result in +considerable speedup. +(TBT) + +=head2 C<gofs> + +=begin original + +Left offset from pos() to start match at. + +=end original + +Left offset from pos() to start match at. +(TBT) + +=head2 C<substrs> + +=begin original + +Substring data about strings that must appear in the final match. This +is currently only used internally by perl's engine but might be +used in the future for all engines for optimisations. + +=end original + +Substring data about strings that must appear in the final match. This +is currently only used internally by perl's engine but might be +used in the future for all engines for optimisations. +(TBT) + +=head2 C<nparens>, C<lastparen>, and C<lastcloseparen> + +=begin original + +These fields are used to keep track of how many paren groups could be matched +in the pattern, which was the last open paren to be entered, and which was +the last close paren to be entered. 
+ +=end original + +These fields are used to keep track of how many paren groups could be matched +in the pattern, which was the last open paren to be entered, and which was +the last close paren to be entered. +(TBT) + +=head2 C<intflags> + +=begin original + +The engine's private copy of the flags the pattern was compiled with. Usually +this is the same as C<extflags> unless the engine chose to modify one of them. + +=end original + +The engine's private copy of the flags the pattern was compiled with. Usually +this is the same as C<extflags> unless the engine chose to modify one of them. +(TBT) + +=head2 C<pprivate> + +=begin original + +A void* pointing to an engine-defined data structure. The perl engine uses the +C<regexp_internal> structure (see L<perlreguts/Base Structures>) but a custom +engine should use something else. + +=end original + +A void* pointing to an engine-defined data structure. The perl engine uses the +C<regexp_internal> structure (see L<perlreguts/Base Structures>) but a custom +engine should use something else. +(TBT) + +=head2 C<swap> + +=begin original + +TODO: document + +=end original + +TODO: document +(TBT) + +=head2 C<offs> + +=begin original + +A C<regexp_paren_pair> structure which defines offsets into the string being +matched which correspond to the C<$&> and C<$1>, C<$2> etc. captures. The +C<regexp_paren_pair> struct is defined as follows: + +=end original + +A C<regexp_paren_pair> structure which defines offsets into the string being +matched which correspond to the C<$&> and C<$1>, C<$2> etc. captures. The +C<regexp_paren_pair> struct is defined as follows: +(TBT) + + typedef struct regexp_paren_pair { + I32 start; + I32 end; + } regexp_paren_pair; + +=begin original + +If C<< ->offs[num].start >> or C<< ->offs[num].end >> is C<-1> then that +capture buffer did not match. C<< ->offs[0].start/end >> represents C<$&> (or +C<${^MATCH}> under C<//p>) and C<< ->offs[paren].end >> matches C<$$paren> where +C<< $paren >= 1 >>. 
+ +=end original + +If C<< ->offs[num].start >> or C<< ->offs[num].end >> is C<-1> then that +capture buffer did not match. C<< ->offs[0].start/end >> represents C<$&> (or +C<${^MATCH}> under C<//p>) and C<< ->offs[paren].end >> matches C<$$paren> where +C<< $paren >= 1 >>. +(TBT) + +=head2 C<precomp> C<prelen> + +=begin original + +Used for optimisations. C<precomp> holds a copy of the pattern that +was compiled and C<prelen> its length. When a new pattern is to be +compiled (such as inside a loop) the internal C<regcomp> operator +checks whether the last compiled C<REGEXP>'s C<precomp> and C<prelen> +are equivalent to the new one, and if so uses the old pattern instead +of compiling a new one. + +=end original + +Used for optimisations. C<precomp> holds a copy of the pattern that +was compiled and C<prelen> its length. When a new pattern is to be +compiled (such as inside a loop) the internal C<regcomp> operator +checks whether the last compiled C<REGEXP>'s C<precomp> and C<prelen> +are equivalent to the new one, and if so uses the old pattern instead +of compiling a new one. +(TBT) + +=begin original + +The relevant snippet from C<Perl_pp_regcomp>: + +=end original + +The relevant snippet from C<Perl_pp_regcomp>: +(TBT) + + if (!re || !re->precomp || re->prelen != (I32)len || + memNE(re->precomp, t, len)) + /* Compile a new pattern */ + +=head2 C<paren_names> + +=begin original + +This is a hash used internally to track named capture buffers and their +offsets. The keys are the names of the buffers; the values are dualvars, +with the IV slot holding the number of buffers with the given name and the +pv being an embedded array of I32. The values may also be contained +independently in the data array in cases where named backreferences are +used. + +=end original + +This is a hash used internally to track named capture buffers and their +offsets. 
The keys are the names of the buffers; the values are dualvars, +with the IV slot holding the number of buffers with the given name and the +pv being an embedded array of I32. The values may also be contained +independently in the data array in cases where named backreferences are +used. +(TBT) + +=head2 C<substrs> + +=begin original + +Holds information on the longest string that must occur at a fixed +offset from the start of the pattern, and the longest string that must +occur at a floating offset from the start of the pattern. Used to do +Fast-Boyer-Moore searches on the string to find out if it's worth using +the regex engine at all, and if so where in the string to search. + +=end original + +Holds information on the longest string that must occur at a fixed +offset from the start of the pattern, and the longest string that must +occur at a floating offset from the start of the pattern. Used to do +Fast-Boyer-Moore searches on the string to find out if it's worth using +the regex engine at all, and if so where in the string to search. +(TBT) + +=head2 C<subbeg> C<sublen> C<saved_copy> + +=begin original + +Used during execution phase for managing search and replace patterns. + +=end original + +Used during execution phase for managing search and replace patterns. +(TBT) + +=head2 C<wrapped> C<wraplen> + +=begin original + +Stores the string C<qr//> stringifies to. The perl engine for example +stores C<(?-xism:eek)> in the case of C<qr/eek/>. + +=end original + +Stores the string C<qr//> stringifies to. The perl engine for example +stores C<(?-xism:eek)> in the case of C<qr/eek/>. 
+(TBT) + +=begin original + +When using a custom engine that doesn't support the C<(?:)> construct +for inline modifiers, it's probably best to have C<qr//> stringify to +the supplied pattern. Note that this will create undesired patterns in +cases such as: + +=end original + +When using a custom engine that doesn't support the C<(?:)> construct +for inline modifiers, it's probably best to have C<qr//> stringify to +the supplied pattern. Note that this will create undesired patterns in +cases such as: +(TBT) + + my $x = qr/a|b/; # "a|b" + my $y = qr/c/i; # "c" + my $z = qr/$x$y/; # "a|bc" + +=begin original + +There's no solution for this problem other than making the custom +engine understand a construct like C<(?:)>. + +=end original + +There's no solution for this problem other than making the custom +engine understand a construct like C<(?:)>. +(TBT) + +=head2 C<seen_evals> + +=begin original + +This stores the number of eval groups in the pattern. This is used for security +purposes when embedding compiled regexes into larger patterns with C<qr//>. + +=end original + +This stores the number of eval groups in the pattern. This is used for security +purposes when embedding compiled regexes into larger patterns with C<qr//>. +(TBT) + +=head2 C<refcnt> + +=begin original + +The number of times the structure is referenced. When this falls to 0 the +regexp is automatically freed by a call to pregfree. This should be set to 1 in +each engine's L</comp> routine. + +=end original + +The number of times the structure is referenced. When this falls to 0 the +regexp is automatically freed by a call to pregfree. This should be set to 1 in +each engine's L</comp> routine. +(TBT) + +=head1 HISTORY + +=begin original + +Originally part of L<perlreguts>. + +=end original + +Originally part of L<perlreguts>. + +=head1 AUTHORS + +=begin original + +Originally written by Yves Orton, expanded by E<AElig>var ArnfjE<ouml>rE<eth> +Bjarmason. 
+ +=end original + +Originally written by Yves Orton, expanded by E<AElig>var ArnfjE<ouml>rE<eth> +Bjarmason. + +=head1 LICENSE + +Copyright 2006 Yves Orton and 2007 E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason. + +This program is free software; you can redistribute it and/or modify it under +the same terms as Perl itself. + +=begin meta + +Translate: SHIRAKATA Kentaro <argra****@ub32*****> (5.10.1) +Status: in progress + +=end meta + +=cut +