Search     or:     and:
 LINUX 
 Language 
 Kernel 
 Package 
 Book 
 Test 
 OS 
 Forum 
 iakovlev.org 
 Books
  Краткое описание
 Linux
 W. R. Стивенс TCP 
 W. R. Стивенс IPC 
 A.Rubini-J.Corbet 
 K. Bauer 
 Gary V. Vaughan 
 Д Вилер 
 В. Сталлинг 
 Pramode C.E. 
 Steve Pate 
 William Gropp 
 K.A.Robbins 
 С Бекман 
 Р Стивенс 
 Ethereal 
 Cluster 
 Languages
 C
 Perl
 M.Pilgrim 
 А.Фролов 
 Mendel Cooper 
 М Перри 
 Kernel
 C.S. Rodriguez 
 Robert Love 
 Daniel Bovet 
 Д Джеф 
 Максвелл 
 G. Kroah-Hartman 
 B. Hansen 
NEWS
Последние статьи :
  Тренажёр 16.01   
  Эльбрус 05.12   
  Алгоритмы 12.04   
  Rust 07.11   
  Go 25.12   
  EXT4 10.11   
  FS benchmark 15.09   
  Сетунь 23.07   
  Trees 25.06   
  Apache 03.02   
 
TOP 20
 Linux Kernel 2.6...5164 
 Trees...935 
 Максвелл 3...861 
 Go Web ...814 
 William Gropp...794 
 Ethreal 3...779 
 Ethreal 4...766 
 Gary V.Vaughan-> Libtool...764 
 Rodriguez 6...755 
 Clickhouse...748 
 Steve Pate 1...748 
 Ext4 FS...748 
 Ethreal 1...736 
 Secure Programming for Li...718 
 C++ Patterns 3...711 
 Ulrich Drepper...692 
 Assembler...686 
 DevFS...654 
 Стивенс 9...644 
 MySQL & PosgreSQL...621 
 
  01.01.2024 : 3621733 посещений 

iakovlev.org

Встроенные перловые структуры


CONTENTS


В этом разделе рассматриваются встроенные перловые структуры. Это будет полезно не только тем,кто работает с перл, но и всем,кому интересно внутренне устройство интерпретатора.

Introduction

Perl написан на C и имеет библиотеки,которые напрямую можно прилинковать к коду,написанному собственно на C/C++. Для этого нужно знать,как перл хранит свои собственные структуры и как интерпретируются перловые типы данных.

Может так случиться,что вы напишите свои расширения или модули, для которых понадобятся специфические структуры.

На момент написания этой книги текущей версией перл была 5.002b. b означает beta.

Здесь вы найдете примеры того,как вызывать перловые функции из си. И наоборот,вы можете вызывать си-шные функции из перл. Си-шные функции,которые вы захотите прилинковать к перловым библиотекам, потребуют хидеров из перлового дистрибутива.

Вам понадобится GNU C compiler. Если у вас его нет - идите на oak.oakland.edu ftp.

Обзор исходников самого Perl

This section covers some of the header files in your Perl distribution. Table 25.1 provides a brief description of what the ones covered here contain. You can track the values or specific definitions by starting from these header files.

Table 25.1. The Perl header files.
FilePurpose
XSUB.h Defines the XSUB interface (see Chapter 27, "Writing Extensions in C," for more information)
av.h Array variable information
config.h Generated when Perl is installed
cop.h Glob pointers
cv.h Conversion structure
dosish.h Redefining stat, fstat, and fflush for DOS
embed.h For embedding Perl in C
EXTERN.h Global and external variables
form.h For form feed and line feed definitions
gv.h Glob pointer definitions
handy.h Used for embedding Perl in C
hv.h Hash definitions
INTERN.h Perl internal variables
keywords.h For Perl keywords
mg.h Definitions for using MAGIC structures
op.h Perl operators
patchlevel.h For current patchlevel information
pp.h For preprocessor directives
perl.h Main header for Perl
perly.h For the yylex parser
proto.h Function prototypes
regexp.h Regular expressions
scope.h Scoping rule definitions
sv.h Scalar values
util.h Blank header file
unixish.h For UNIX-specific definitions

The source files in the Perl distribution are as follows. They come with very sparse comments.

av.c mg.cpp_sys.c
deb.c miniperlmain.c regcomp.c
doio.c op.cregexec.c
doop.c perl.crun.c
dump.c perlmain.c scope.c
globals.c perly.csv.c
gv.c pp.ctaint.c
hv.c pp_ctl.ctoke.c
malloc.c pp_hot.cutil.c

The name of each file gives a hint as to what the code in the file does. Run head *.c > text to get a list of the headers for the files.

Now that you know a little about what source files to consult, you're ready to learn about the building blocks of Perl programs: the variables.

Perl Variable Types

Perl has three basic data types: scalars, arrays, and hashes. Perl enables you to have references to these data types as well as references to subroutines. Most references use scalar values to store their values, but you can have arrays of arrays, arrays of references, and so on. It's quite possible to build complicated data structures using the three basic types in Perl.

Variables in Perl programs can even have two types of values, depending on how they are interpreted. For instance, $i can be an integer when used in a numeric operation, and $i is a string when used in a string operation. Another example is the $!, which is the errno code when used as a number but a string when used within a print statement.

Naming Conventions

Because variables internal to Perl source code can have many types of values and definitions, the name must be descriptive enough to indicate what type it is. By convention, there are three tokens in Perl source code variable names: arrays, hashes, and scalar variables. A scalar variable can be further qualified to define the type of value it holds. The list of token prefixes for these Perl types are shown in the following list:

AV
Array variables
HV
Hash variables
SV
Generic scalar variables
I32
32-bit integer (scalar)
I16
16-bit integer (scalar)
IV
Integer or pointer only (scalar)
NV
Double only (scalar)
PV
String pointer only (scalar)

If you see SV in a function or variable name, the function is probably working on a scalar item. The convention is followed closely in the Perl source code, and you should be able to glean the type of most variable names as you scan through the code. Function names in the source code can begin with sv_ for scalar variables and related operations, av_ for array variables, and hv_ for hashes.

Let's now cover these variable types and the functions to manipulate them.

Scalars and Scalar Functions

Scalar variables in the Perl source are those with SV in their names. A scalar variable on a given system is the size of a pointer or an integer, whichever is larger. Specific types of scalars exist to specify numbers such as IV for integer or pointer, NV for doubles, and so on. The SV definition is really a typedef declaration of the sv structure in the header file called sv.h. NV, IV, PV, I32, and I16 are type-specific definitions of SV for doubles, generic pointers, strings, and 32- and 16-bit numbers.

Floating-point numbers and integers in Perl are stored as doubles. Thus, a variable with NV will be a double that you can cast in a C program to whatever type you want.

Four types of routines exist to create an SV variable. All four return a pointer to a newly created variable. You call these routines from within an XS Perl extension file:

  • SV *newSViv(IV I);
  • SV *newSVnv(double d);
  • SV *newSVpv(char * p, int len);
  • SV *newSVsv(SV *svp);

The way to read these function declarations is as follows. Take the newSViv(IV) declaration, for example. The new portion of the declaration asks Perl to create a new object. The SV indicates a scalar variable. The iv indicates a specific type to create: iv for integer, nv for double, pv for a string of a specified length, and sv for all other types of scalars.

Three functions exist to get the value stored in an SV. The type of value returned depends on what type of value was set at the time of creation:

int SvIV(SV*); This function returns an integer value of the SV being pointed to. Cast the return value to a pointer if that is how you intend to use it. A sister macro, SvIVX(SV*), does the same thing as the SvIV function.
double SvNV(SV*); This function returns a floating-point number. The SvNVX macro does the same thing as the SvNV() function.
char *SvPV(SV*, STRLEN len); This function returns a pointer to a char. The STRLEN in this function call is really specifying a pointer to the len variable. The pointer to len is used by the function to return the length of the string in SV. The SvPVX pointer returns the string too, but you do not have to specify the STRLEN len argument.

You can modify the value contained in an already existing SV by using the following functions:

void sv_setiv(SV* ptr, IV incoming); This function sets the value of the SV being pointed to by ptr to the integer value in incoming.
void sv_setnv(SV* ptr, double); This function sets the value of the SV being pointed to by ptr to the value in incoming.
void sv_setsv(SV* dst, SV*src); This function sets the value of the SV being pointed to by dst to the value pointed to by src in incoming.

Perl does not keep NULL-terminated strings like C does. In fact, Perl strings can have multiple NULLs in them. Perl tracks strings by a pointer and the length of the string. Strings can be modified in one of these ways:

void sv_setpvn(SV* ptr, char* anyt , int len);
This sets the value of the SV being pointed to by ptr to the value in anyt. The string anyt contains an array of char items and does not have to be a NULL-terminated string. In fact, the string anyt can contain NULLs because the function uses the value in len to keep the string in memory.
void sv_setpv(SV* ptr, char* nullt);
This sets the value of the SV being pointed to by ptr to the value in nullt. The nullt string is a NULL-terminated string like those in C, and the function calculates and sets the length for you automatically.
SvGROW(SV* ptr, STRLEN newlen);
This function increases the size of a string to the size in newlen. You cannot decrease the size of a string using this function. Make a new variable and copy into it. You can use the function SvCUR(SV*) to get the length of a string and SvCUR_set(SV*, I32 length) to set the length of a string.
void sv_catpv(SV* ptr, char*);
This function appends a NULL-terminated string to a string in SV.
void sv_catpvn(SV* ptr, char*, int);
This function appends a string of length len to the SV pointed at by ptr.
void sv_catsv(SV*dst, SV*src);
This appends another SV* to an SV.

Your C program using these programs will crash if you are not careful enough to check whether these variables exist. To check whether a scalar variable exists, you can call these functions:

SvPOK(SV*ptr) For string
SvIOK(SV*ptr) For integer
SvNOK(SV*ptr) For double
SvTRUE(SV *ptr) For Boolean value

A value of FALSE received from these functions means that the variable does not exist. You can only get two returned values, either TRUE or FALSE, from the functions that check whether a variable is a string, integer, or double. The SvTRUE(SV*) macro returns 0 if the value pointed at by SV is an integer zero or if SV does not exist. Two other global variables, sv_yes and sv_no, can be used instead of TRUE and FALSE, respectively.

Note
The Perl scalar undef value is stored in an SV instance called sv_undef. The sv_undef value is not (SV *) 0 as you would expect in most versions of C.

You can get a pointer to an existing scalar by specifying its variable name in the call to the function:

SV*  perl_get_sv("myScalar", FALSE);

The FALSE parameter requests the function to return sv_undef if the variable does not exist. If you specify a TRUE value as the second parameter, a new scalar variable is created for you and assigned the name myScalar in the current name space.

In fact, you can use package names in the variable name. For example, the following call creates a variable called desk in the VRML package:

SV *desk;
desk =   perl_get_sv("VRML::desk", FALSE);

Now let's look at collections of scalars: arrays.

Array Functions

The functions for handling array variables are similar in operation to those for scalar variables. To create an array called myarray, you would use this call:

AV *myarray = (AV* ) newAV();

To get the array by specifying the name, you can also use the following function. This perl_get_av() returns NULL if the variable does not exist:

AV*  perl_get_av(char *myarray, bool makeIt);

The makeIt variable can be set to TRUE if you want the array created, and FALSE if you are merely checking for its existence and do not want the array created if it does not exist.

To initialize an array at the time of creation, you can use the av_make() function. Here's the syntax for the av_make() function:

AV *myarray =  (AV  *)av_make(I32 num, SV **data);

The num parameter is the size of the AV array, and data is a pointer to an array of pointers to scalars to add to this new array called myarray. Do you see how the call uses pointers to SV, rather than SVs? The added level of indirection permits Perl to store any type of SV in an array. So, you can store strings, integers, and doubles all in one array in Perl. The array passed into the av_make() function is copied into a new memory area; therefore, the original data array does not have to persist.

Check the av.c source file in your Perl distribution for more details on the functions and their parameters. Here is a quick list of the functions you would most likely perform on AVs.

void av_push(AV *ptr, SV *item);
Pushes an item to the back of an array.
SV* av_pop(AV *ptr);
Pops an item off the back of an array.
SV* av_shift(AV *ptr);
Removes an item from the front of the array.
void av_unshift(AV *ptr, I32 num);
Inserts num items into the front of the array. The operation in this function only creates space for you. You still have to call the av_store() function (defined below) to assign values to the newly added items.
I32 av_len(AV *ptr);
Returns the highest index in array.
SV** av_fetch(AV *ptr, I32 offset, I32 lval);
Gets the value in the array at the offset. If lval is a nonzero value, the value at the offset is replaced with the value of lval.
SV** av_store(AV *ptr, I32 key, SV* item);
Stores the value of the item at the offset.
void av_clear(AV *ptr);
Sets all items to zero but does not destroy the array in *ptr.
void av_undef(AV *ptr);
Removes the array and all its items.
void av_extend(AV *ptr, I32 size);
Resizes the array to the maximum of the current size or the passed size.

Hash Functions

Hash variables have HV in their names and are created in a manner similar to creating array functions. To create an HV type, you call this function:

HV*  newHV();

Here's how to use an existing hash function and refer to it by name:

HV*  perl_get_hv("myHash", FALSE);

The function returns NULL if the variable does not exist. If the hash does not already exist and you want Perl to create the variable for you, use:

HV*  perl_get_hv("myHash", TRUE);

As with the AV type, you can perform the following functions on an HV type of variable:

  • SV** hv_store(HV* hptr,
  • char* key  The key of the hash item
  • U32 klen,  The length of the key
  • SV* val,   The scalar to insert
  • U32 hash)  Zero unless you compute the hash function yourself
  • SV**  hv_fetch(HV* hptr,
  • char* keyq The key of the hash
  • U32 klen   The length of the key
  • I32 lval)  Its value

Check the file hv.c in your Perl distribution for the function source file for details about how the hash function is defined. Both of the previous functions return pointers to pointers. The return value from either function will be NULL.

The following functions are defined in the source file:

bool hv_exists(HV*, char* key, U32 klen);
This function returns TRUE or FALSE.
SV* hv_delete(HV*, char* key, U32 klen, I32 flags);
This function deletes the item, if it exists, at the specified key.
void hv_clear(HV*);
This function leaves the hash but removes all its items.
void hv_undef(HV*);
This function removes the hash and its items.

You can iterate through the hash table using indexes and pointers to hash table entries using the HE pointer type. To iterate through the array (such as with the each command in Perl), you can use hv_iterinit(HV*) to set the starting point and then get the next item as an HE pointer from a call to the hv_iternext(HV*) function. To get the item being traversed, make a call to this function:

SV*    hv_iterval(HV* hashptr, HE* entry);

The next SV is available via a call to this function:

SV*    hv_iternextsv(HV*hptr, char** key, I32* retlen);

The key and retlen arguments are return values for the key and its length. See line 600 in the hv.c.

Mortality of Variables

Values in Perl exist until explicitly freed. They are freed by the Perl garbage collector when the reference count to them is zero, by a call to the undef function, or if they were declared local or my and the scope no longer exists. In all other cases, variables declared in one scope persist even after execution has left the code block in which they were declared. For example, declaring and using $a in a function keeps $a in the main program even after returning from the subroutine. This is why it's necessary to create local variables in subroutines using the my keyword so that the Perl interpreter will automatically destroy these variables, which will no longer be used after the subroutine returns.

References to variables in Perl can also be modified using the following functions:

int SvREFCNT(SV* sv);
This function returns the current reference count to an existing SV.
void SvREFCNT_inc(SV* sv);
This function increments the current reference count.
void SvREFCNT_dec(SV* sv);
This function decrements the current reference count. You can make the reference count be zero to delete the SV being pointed to and let the garbage handler get rid
of it.

Because the values declared within code blocks persist for a long time, they are referred to as immortal. Sometimes declaring and creating variable names in code blocks have the side effect of persisting even if you do not want them to. When writing code that declares and creates such variables, it's a good idea to create variables that you do not want to persist as mortal; that is, they die when code leaves the current scope.

The functions that create a mortal variable are as follows:

SV* sv_newmortal(); This function creates a new mortal variable and returns a pointer to it.
SV* sv_2mortal(SV*); This function converts an existing immortal SV into a mortal variable. Be careful not to convert an already mortal SV into a mortal SV because this operation may result in the reference count for the variable to be decremented twice, leading to unpredictable results.
SV* sv_mortalcopy(SV*); This function copies an existing SV (without regard to the mortality of the passed SV) into a new mortal SV.

To create AV and HV types, you have to cast the input parameters to and from these three functions as AV* and HV*.

Subroutines and Stacks

Perl subroutines use the stack to get and return values to the callers. Chapter 27, "Writing Extensions in C," covers how the stack is manipulated. This section describes the functions available for you to manipulate the stack.

Note
Look in the XSUB.h file for more details than this chapter can give you. The details in the header include macro definitions for manipulating the stack in an extension module.

Arguments on a stack to a Perl function are available via the ST(n) macro set. The topmost item on the stack is ST(0), and the mth one is ST(m-1). You may assign the return value to a static value, like this:

SV *arg1 = ST(1); // Assign argument[1] to arg1;

You can even increase the size of the argument stack in a function. (This is necessary if you are returning a list from a function call, for example. I cover this in more detail in Chapter 27.) To increase the length of the stack, make a call to the macro:

EXTEND(sp, num);

sp is the stack pointer and num is an extra number of elements to add to the stack. You cannot decrease the size of the stack.

To add items to the stack, you have to specify the type of variable you're adding. Four functions are available for four of the most generic types to push:

  • PUSHi(IV)
  • PUSHn(double)
  • PUSHp(char*, I32)
  • PUSHs(SV*)

If you want the stack to be adjusted automatically, make the calls to these macros:

  • XPUSHi(IV)
  • XPUSHn(double)
  • XPUSHp(char*, I32)
  • XPUSHs(SV*)

These macros are a bit slower but simpler to use.

In Chapter 27, you'll see how to use stacks in the section titled "The typemap File." Basically, a typemap file is used by the extensions compiler xsubpp for the rules to convert from Perl's internal data types (hash, array, and so on) to C's data types (int, char *, and so on). These rules are stored in the typemap file in your Perl distribution's ./lib/ExtUtils directory.

The definitions of the structures in the typemap file are specified in the internal format for Perl.

What Is Magic?

As you go through the online docs and the source for Perl, you'll often see the word magic. The mysterious connotations of this word are further enhanced by the almost complete lack of documentation on what magic really is. In order to understand the phrases "then magic is applied to whatever" or "automagically [sic]" in the Perl documentation, you have to know what "magic" in Perl really means. Perhaps after reading this section, you will have a better feel for Perl internal structures and actions of the Perl interpreter.

Basically, a scalar value in Perl can have special features for it to become "magical." When you apply magic to a variable, that variable is placed into a linked list for methods. A method is called for each type of magic assigned to that variable when a certain action takes place, such as retrieving or storing the contents of the variable. Please refer to a comparable scheme in Perl when using the tie() function as described in Chapter 6, "Binding Variables to Objects." When tie-ing a variable to an action, you are defining actions to take when a scalar is accessed or when an array item is read from. In the case of magic actions of a scalar, you have a set of magic methods that are called when the Perl interpreter takes a similar action on (like getting a value from or putting a value into) a scalar variable.

To check whether a variable has magic methods associated with it, you can get the flags for it using the SvFLAGS(sv) macro. The (sv) here is the name of the variable. The SvMAGIC(variable) macro returns the list of methods that are magically applied to the variable. The SvTYPE() of the variable is SVt_PVMG if it has a list of methods. A normal SV type is upgraded to a magical status by Perl if a method is requested for it.

The structure to maintain the list is found in the file mg.h in the Perl source files:

struct magic {
    MAGIC*     mg_moremagic;
               // pointer to next method. NULL if none.
    MGVTBL*    mg_virtual;
                // pointer to table of methods.
    U16   mg_private;   // internal variable
    char  mg_type;      // type of methods
    U8    mg_flags;     // flags for this method
    SV*   mg_obj;       // Reference to itself
    char* mg_ptr;       // name of the magic variable
    I32   mg_len;       // length of the name
};

The mg_type value sets up how the magic function is applied. The following items are used in the magic table. You can see the values in use in the sv.c file at about line 1950. Table 25.2, which has been constructed from the switch statement, tells you how methods have to be applied.

Table 25.2. Types of Magic functions in mg_type.

Mg_type
Virtual Magic TableAction Calling Method
\0
vtbl_sv Null operation
A
vtbl_amagic Operator overloading
a
vtbl_amagicelem Operator overloading
c
0 Used in operator overloading
B
vtbl_bm Unknown
E
vtbl_env %ENV hash
e
vtbl_envelem %ENV hash element
g
vtbl_mglob Regexp applied globally
I
vtbl_isa @ISA array
i
vtbl_isaelem @ISA array element
L
0 Unknown
l
tbl_dbline n line debugger
P
tbl_pack Tied array or hash
p
vtbl_packelem Tied array or hash element
q
vtbl_packelem Tied scalar or handle
S
vtbl_sig Signal hash
s
vtbl_sigelem Signal hash element
t
vtbl_taint Modified tainted variable
U
vtbl_uvar Unknown variable type
v
vtbl_vec Vector
x
vtbl_substr Substring
*
vtbl_glob The GV type
#
vtbl_arylen Array length
 .
vtbl_pos $. scalar variable

The magic virtual tables are defined in embed.h. The mg_virtual field in each magic entry is assigned to the address of the virtual table.

Each entry in the magic virtual table has five items, each of which is defined in the following structure in the file mg.h:

struct mgvtbl {
    int        (*svt_get)    _((SV *sv, MAGIC* mg));
    int        (*svt_set)    _((SV *sv, MAGIC* mg));
    U32        (*svt_len)    _((SV *sv, MAGIC* mg));
    int        (*svt_clear)  _((SV *sv, MAGIC* mg));
    int        (*svt_free)   _((SV *sv, MAGIC* mg));
};

The svt_get() function is called when the data in SV is retrieved. The svt_set() function is called when the data in SV is stored. The svt_len() function is called when the length of the string is changed. The svt_clear() function is called when SV is cleared, and the svt_free() function is called when SV is destroyed.

All tables shown in the perl.h file are assigned mgvtbl structures. The values in each mgvtbl structure for each item in a table define a function to call when an action that affects entries in this table is taken by the Perl interpreter. Here is an excerpt from the file:

EXT MGVTBL vtbl_sv =
     {magic_get, magic_set, magic_len,0,0};
EXT MGVTBL vtbl_env =
     {0,  0,   0,   0,   0};
EXT MGVTBL vtbl_envelem =
     {0,  magic_setenv, 0,magic_clearenv, 0};
EXT MGVTBL vtbl_sig =
     {0,  0,    0,  0,  0};
EXT MGVTBL vtbl_sigelem =
     {0,  magic_setsig, 0,0, 0};

The vbtl_sv is set to call three methods: magic_get(), magic_set(), and magic_len() for the magic entries in sv. The zeros for vtbl_sig indicate that no magic methods are called.

The Global Variable (GV) Type

If you are still awake, you'll notice a reference to GV in the source file. GV stands for global variable, and the value stored in GV is any data type from scalar to a subroutine reference. GV entries are stored in a hash table, and the keys to each entry are the names of the symbols being stored. A hash table with GV entries is also referred to as a stash. Internally, a GV type is the same as an HV type.

Keys in a stash are also package names, with the data item pointing to other GV tables containing the symbol within the package.

Where to Look for More Information

Most of the information in this chapter has been gleaned from source files or the online documents on the Internet. There is a somewhat old file called perlguts.html by Jeff Okamoto (e-mail okamoto@corp.hp.com) in the www.metronet.com archives that has the Perl API functions and information about the internals.

Note that the perlguts.html file was dated 1/27/1995, so it's probably not as up-to-date as you would like.

Please refer to the perlguts.html or the perlguts man page for a comprehensive listing of the Perl API. If you want a listing of the functions in the Perl source code and the search strings, use the ctags *.c command on all the .c files in the Perl source directory. The result will be a very long file (800 lines), called tags, in the same directory. A header of this file is shown here:

ELSIF     perly.c   /^"else : ELSIF '(' expr ')' block else",$/
GvAVn     gv.c /^AV *GvAVn(gv)$/
GvHVn     gv.c /^HV *GvHVn(gv)$/
Gv_AMupdate    gv.c /^Gv_AMupdate(stash)$/
HTOV util.c    /^HTOV(htovs,short)$/
PP   pp.c /^PP(pp_abs)$/
PP   pp.c /^PP(pp_anoncode)$/
PP   pp.c /^PP(pp_anonhash)$/
PP   pp.c /^PP(pp_anonlist)$/
PP   pp.c /^PP(pp_aslice)$/

If you are a vi hack, you can type :tag functionName to go to the line and file immediately from within a vi session. Ah, the old vi editor still has a useful function in this day and age. emacs users can issue the command etags *.c and get a comparable tags file for use with the M-x find-tag command in emacs.

Summary

This chapter is a reference-only chapter to prepare you for what lies ahead in the rest of the book. You'll probably be referring to this chapter quite a bit as you write include extensions.

There are three basic types of variables in Perl: SV for scalar, AV for arrays, and HV for hash values. Macros exist for getting data from one type to another. You'll need to know about these internal data types if you're going to be writing Perl extensions, dealing with platform-dependent issues, or (ugh) embedding C code in Perl and vice versa. The dry information in this chapter will serve you well in the rest of this book.

Chapter 26

Writing C Extensions in Perl


CONTENTS


This chapter introduces a way to embed the Perl interpreter into a C program. After reading this chapter, you should be able to integrate the Perl interpreter or its library code with a C application. The information in this chapter relies heavily on the discussion of Perl's internal data types from Chapter 25, "Perl Internal Files and Structures."

Introduction

The Perl interpreter is written in C and can therefore be linked with C code. The combination process is relatively straightforward. However, there are some caveats that you should be aware of. I discuss these caveats in this chapter. Also, even though you can combine C and Perl code, you might want to rethink the way you want to use each language. Both C and Perl have their strong points. C code can be optimized to a greater degree than can Perl code. Perl is great for processing text files and strings, whereas manipulation strings in C is a bit clumsy at times.

Several alternatives exist for combining C and Perl code. You can use extensions, call complete programs, run a background process, and so on. For example, if you want to create a library of mathematical functions to use in your Perl programs, you can write an extension and use it from within your Perl code. If the functionality you need from a C program can be encapsulated into an executable program, you can call this program from within a system call. The system call will force your Perl program to wait until the called program terminates. You can start the executable program as a background process from the command line or from within a Perl program using a fork system call.

If one of these methods will solve your problem, you may not have to take the more complicated route of embedding Perl in C code. Think through your design thoroughly before deciding which solution is best. If you feel that your design requires you to write a C program and also use the functionality of Perl from within your program, you are left with no alternative except to follow the more complicated route.

The first question to answer is Whether the solution you are deriving is based for the UNIX platform? Most of the tested situations to date for combining C code with Perl are those on the UNIX platform. Such a combination is therefore not viable for non-UNIX platforms. If you intend to port the combined code to an NT system, it simply won't compile, let alone work. Therefore, if you've concluded that you must embed the Perl interpreter in your C code, keep in mind that you're also limiting your solution to UNIX platforms. Embedding Perl in C code on Windows will not work at the moment, so you might want to rethink your design to see whether you can obtain comparable functionality using extension libraries.

The second question to answer is What exactly is the solution you are trying to achieve by combining Perl with C? Again, if all you want do is use C code from within Perl, you might want to consider writing the C code as a Perl extension. In such an event, you should rethink your design to see how to use only portions of the C code as parts of an extension library for a Perl program. The methods to do just this are covered in Chapter 27, "Writing Extensions in C."

Normally you want to write extensions in C because the compiled code is faster. For example, complex mathematical calculations are probably best written in C rather than in Perl. The compiled C code runs much faster than interpreted code in Perl for specific tasks. Also, with the use of extensions, it's possible to send in pointers to variables, existing with the Perl program, into the extension code. By receiving pointers to the Perl variables, code within extensions can modify the contents of variables directly.

Of course, if you are attempting to run another complete C program from Perl, consider using the system(), backtick, or exec() calls. They are convenient ways of calling complete programs. With the open() call using pipes, it's easy to collect the standard output from the executed application. However, you are constrained to reading only the output from the child process you started. The child process cannot manipulate any Perl variables directly. The Perl code and the called code run as completely different applications in their own address spaces. This is in contrast to working with Perl's extensions that run C as part of the calling process. (Of course, you can take explicit measures to create a bi-directional pipe through which to communicate.)

The flexibility in the two methods outlined up to now in this section solves a majority of the problems solved using C and Perl. In both methods, though, the calling program is a Perl application. What if you want to call Perl code from within a C application? You can write the Perl code as an executable script and then make a system() call from within the C code. However, you'll be constrained by the same things as you were using the system() call from within Perl. There is no direct connection between the child process started with system() and the calling application, unless you take explicit measures to create a communication channel like a socket or pipe.

Sometimes you just have to call Perl code from within a C program. For example, the C language is not designed to process regular expressions. The functionality of parsing tokens from a string using the strtok() function may not be sufficient for your problem. There are ways to port some regular expression parsers like grep into C. (A good example is Allen Holub's port of grep in an article in Dr. Dobbs, October, 1984, and in The Dr. Dobbs Toolbox of C, Brady Books, 1986.) Perhaps all you need is simple regular expression parsing, in which case using a port of grep makes sense. The grep code you link will not have the overhead of the entire Perl preprocessor.

However, if all else fails and you do want the Perl interpreter in your C program, or at the very least want to call a Perl subroutine from within your C program, this chapter might provide the information you need. Here are the topics I will cover:

  • How to add a Perl interpreter to a C program
  • How to call a Perl subroutine from within a C program
  • How to use Perl pattern matches and string substitutions

Compiling a C Program That Calls Perl

First of all, make sure you have read access to the Perl 5 distribution including Perl header files and linking Perl libraries. Make sure that your Perl 5.002 source code distribution is complete and installed correctly. Copying the Perl executable program from another machine won't work because you need the whole source tree to work with your C program.

Essentially, the files you need are the EXTERN.h and perl.h header files. The Perl libraries should exist in a directory using the following format:

/usr/local/lib/perl5/your_architecture_here/CORE

Execute this statement for a hint about where to find CORE:

perl -e 'use Config; print $Config{archlib}\n'

On my Linux system, this program returned the following pathname (the path might be completely different for your machine):

/usr/lib/perl5/i486-linux/5.002

The command line to use for compiling and running a program is

gcc -o filename filename.c -L($LIBS) -I($IncS)

The -L and -I flags define the locations of the library and header files. The libraries that you are linking with are

perl -e 'use Config; print $Config{libs} , "\n";'

An easier way to find out which libraries you are using is to use the ExtUtils::embed module extensions. This library is available from any CPAN site and is in the public domain.

What Is ExtUtils::embed?

ExtUtils::embed is a set of utility functions used when embedding a Perl interpreter and extensions in your C/C++ applications. You do not have to use this library, but it does a lot of the basic grunt work (such as finding the correct libraries, include files, and definitions) for you. The author of the ExtUtils::embed package is Doug MacEachern (dougm@osf.org). You can address problems and comments to him directly.

Installing the package is easy. Simply untar the archive, which will create a directory called ExtUtils-embed-version_number. Change directory into this new directory and run these commands:

perl Makefile.PL
make
make test
make install

Using the embed.pm Package

Usually, you'll place a call to the embed.pm function in a makefile. The instructions on how to do this are in the embed.pm package itself. In practice, however, I just took the values returned from a few calls to the embed.pm functions and placed them directly in a makefile. This way there was a written record in the makefile for the explicit pathnames used to build the programs I happen to be using. The risk was that if the location of the Perl library changed in the future, my makefile would break. This change is too unlikely to happen on my machine because I administer it. Your case might be different if you are on a multiuser system or are writing this makefile for the benefit of a group.

The one-line command to compile and link the ex1.c file in a makefile is

$gcc -o ex1 ex1.c 'perl -MExtUtils::embed -e ccopts -e ldopts'

In fact, this line can be placed in a shell script like this:

$gcc -o $1 $1.c 'perl -MExtUtils::embed -e ccopts -e ldopts'

Each execution of the makefile will cause the Perl script to recreate the include and link paths. This might slow down each execution of make. A make variable set to a constant value will probably let make process the makefile faster. I discuss this procedure in a moment. In any event, the returned values from the call to use the ExtUtils::embed function are used to define the link libraries and include files. The returned values can be used directly in makefiles should the Perl -e command not work or if you prefer to use a make variable.

Listing 26.1 presents a sample makefile for the Linux machine.


Listing 26.1. A sample makefile.
1 IncK= -D__USE_BSD_SIGNAL -Dbool=char -DHAS_BOOL
Â-I/usr/local/include  -rdynamic  -I /usr/lib/perl5/i486-linux/5.002/CORE
2 LIBK =  -L/usr/local/lib /usr/lib/perl5/i486-linux/5.002
Â/auto/DynaLoader/DynaLoader.a -L/usr/lib/perl5/i486-linux/5.002/CORE
Â-lperl -lgdbm -ldbm -ldb -ldl -lm -lc -lbsd
3 K_LIBS = -lgdbm -ldbm -ldb -ldl -lm -lc -lbsd
4
5 ex2 : ex2.c
6     $(cc) -fno-strict-prototype ex2.c -o ex2 -L$(LIBK) -I$(IncK)

Note the flags for the gcc compiler. The main problem with the gcc compiler is how the "safefree" function prototype is declared in the proto.h and handy.h files. There is a slight difference in the syntax of each declaration, but programmatically it makes no difference. To turn off the checking in gcc, simply use the -fno-strict-prototype flag at the command line for the gcc command.

The LIBK and IncK paths are set to values that are derived from running the Perl program shown in Listing 26.2. There is one possible problem you must be aware of when you run this program: if you get errors stating that it cannot find embed.pm in the @Inc array, then modify the @Inc array to include the path where the file is located. The commented lines in Listing 26.2 are examples.

If all else fails, copy the embed.pm path to the directory you happen to be in. If you get an error stating that embed.pm could not be included or that it was empty, you have to modify the embed.pm file. Go to the statement with the _END_ label and add the line 1; before it. This step is only necessary if the ExtUtils::embed file could not be included.


Listing 26.2. A sample makefile.
 1 #!/usr/bin/perl
 2
 3 use ExtUtils::embed;
 4
 5 #
 6 # unshift(@Inc,"/usr/local/lib/perl5/site_perl");
 7 #
 8
 9 &ccopts;  # Create the path for headers
10 &ldopts;  # Create the path for libraries

The call to the &ccopts function creates the include path for use with the -I flag. The call to &ldopts creates the path for libraries to be linked with the -L flag. Pipe the output to a saved file and create the makefile as shown in Listing 26.1.

Adding a Perl Interpreter to Your C Program

A C program using a Perl interpreter is really creating and running a PerlInterpreter object. The PerlInterpreter object type is defined in the Perl library, and the C program simply makes a reference to this library object. Several sample files come in the embed.pm file and are used here as examples.

Listing 26.3 presents a quick example of how to embed a Perl interpreter in a C program.


Listing 26.3. The first example from the embed.pm module.
 1 /*
 2 ** Sample program used from embed.pm module package.
 3 */
 4 #include <stdio.h>
 5 #include <EXTERN.h>
 6 #include <perl.h>
 7
 8 static PerlInterpreter *my_perl;
 9
10 main(int argc, char **argv, char **env)
11 {
12
13         my_perl = perl_alloc();
14         perl_construct(my_perl);
15         perl_parse(my_perl, NULL, argc, argv, env);
16         perl_run(my_perl);
17         perl_destruct(my_perl);
18         perl_free(my_perl);
19 }

Lines 3 to 6 are the include headers you have to use to get this to work. The EXTERN.h and perl.h files will be picked from where the Perl distribution is installed.

At line 8, the C program creates a pointer reference to the PerlInterpreter object defined in the Perl libraries. The reference will actually be resolved in line 13 when the object is created.

At line 10, the main program interface is called. All three arguments are required. Do not use either of these lines because they both caused compiler error, even with the -fno-strict-prototypes flag set:
main(int argc, char **argv);
main(int argc, char *argv[], char *env[])

Lines 14 and 15 construct the PerlInterpreter object and parse any environment variables and command-line arguments. You can read and execute Perl statements from a file at any time in a C program by simply placing the name of the file in argv[1] before calling the perl_run function. The perl_run function is called at line 16 in the sample code in Listing 26.2. The function can be called repeatedly in the C code before the calls are made to destruct and free the object (lines 17 and 18 in the sample code).

Now make this program, called ex2.c, and run it. In the sample run that follows, note how variables are defined and used interactively in this sample run:

$ ex2
$a = 1;
$b = 3;
print $a," ",$b," ",$a+$b,"\n";
^D
1 3 4
$

To run a script file, simply redirect the contents of a file into the input of the interpreter. The file you are feeding into your C mini-interpreter does not have to have its execute bit set in its permissions. Here's a sample run.

$ cat test.pl
$a = 1;
$b = 1;
print "a + b = ", $a + $b, "\n";
$
$ ex2 < test.pl
a + b = 2
$

There you have it-a small Perl interpreter embedded in C code. Once you have this interpreter embedded in your C code, you can evaluate Perl statements by simply feeding them into the interpreter one line at a time.

There will be occasions, though, when you simply want to call a Perl subroutine directly from within the C code. I show you how to do this in the next section.

Calling Perl Subroutines from Within a C Program

In order to call a Perl subroutine by name, simply replace the call to perl_run() with a call to perl_call_argv(). An example is shown in the code in Listing 26.4.


Listing 26.4. Calling a subroutine in Perl directly.
 1 #include <stdio.h>
 2 #include <EXTERN.h>
 3 #include <perl.h>
 4
 5     static PerlInterpreter *my_perl;
 6
 7     int main(int argc, char **argv, char **env)
 8     {
 9         my_perl = perl_alloc();
10         perl_construct(my_perl);
11
12         perl_parse(my_perl, NULL, argc, argv, env);
13  /* The next line calls a function in the file named in
14   * argv[1] of the program !!*/
15         perl_call_argv("showUser", G_DISCARD | G_NOARGS, argv);
16         perl_destruct(my_perl);
17         perl_free(my_perl);
18     }

Look closely at line 15. This is really the only line that is different from the code shown in Listing 26.3. A Perl subroutine called showUser is being called here. The showUser subroutine takes no arguments, so you specify a G_NOARGS flag, and returns no values, so specify the G_DISCARD flag. The argv vector is used to store the filename in argv[1]. To invoke this program, type the name of the file (showMe.pl, in this case) at the command line:

$ ex3 showMe.pl
Process 2689 : UID is 501 and GID is 501

The showMe.pl file named in argv[1] is shown in Listing 26.5.


Listing 26.5. The file containing the subroutine being called.
1 #!/usr/bin/perl
2 sub showUser {
3     print "Process $$ : UID is $< and GID is $(\n";
4 }

Calling Perl functions with arguments and using the return values requires manipulation of the Perl stack. When you make the call, you push values onto the stack, and upon return from the function you get the returned value off the stack. Let's see how to work with the calling stack.

Note
A note to the Perl-savvy reader: You can get the values from Perl special variables directly. See the section titled "Getting Special Variable Values," later in this chapter.

Here is another Perl subroutine, which prints whatever parameters are passed to it:

sub PrintParameters
    {
        my(@args) = @_ ;
        for $i (@args) { print "$i\n" }
    }

Here is the function to use when calling this Perl subroutine:

static char *words[] = {"My", "karma", "over", "my", "dogma", NULL} ;
    static void justDoit()
    {
        dSP;
        perl_call_argv("PrintParameters", G_DISCARD, words) ;
    }

As you can see, it's easy to construct parameters and then use them directly in C statements to call Perl functions or programs.

Working with the Perl Stack

Perl has several C functions to use when calling Perl subroutines. Here are the most important ones to know:

  • perl_call_sv
  • perl_call_pv
  • perl_call_method
  • perl_call_argv

The perl_call_sv function is called by the other three functions in the list. These three functions simply fiddle with their arguments before calling the perl_call_sv function. All these functions take a flags parameter to pass options. We'll discuss the meaning of these flags shortly.

All these functions return an integer of type I32. The value returned is the number of items on the Perl stack after the call returns. It's a good idea to check this value when using unknown code. Calling functions use the return value to determine how many returned values to pick up from the stack after the calling function returns.

Note
One important point to note here is that you'll see a lot of types of returned values and arguments passed into functions. These declarations come from the Perl sources and header files starting from perl.h. You really do not need to know how I32 is defined, but its name suggests it's a 32-bit integer. If you are really curious to see how it works or what the definitions are, check out the perl.h header file. Please refer to Chapter 25 for more information on internal Perl variables.

The perl_call_sv Function

The syntax for this function is

I32 perl_call_sv(SV* sv, I32 flags) ;

The perl_call_sv takes two arguments. The first argument, sv, is a pointer to an SV structure. (The SV structure is also defined in the header file perl.h.) The SV stands for scalar vector. This way you can specify the subroutine to call either as a C string by name or by a reference to a subroutine.

The perl_call_pv Function

The function perl_call_pv is like the perl_call_sv call except that it expects its first parameter to be a string containing the name of the subroutine to call. The syntax to this function is

I32 perl_call_pv(char *subname, I32 flags) ;

For example, to specify the name of function, you would make a call like this:

perl_call_pv("showUser",0);

The function name can be qualified to use a package name. To explicitly call a routine in a package, simply prepend the name with a PackageName:: string. For example, to call the function Time in the Datum package, you would make this call:

perl_call_pv("Datum::Time",0);

The perl_call_method Function

This function is used to call a method from a Perl class. The syntax for this call is

I32 perl_call_method(char *methodName, I32 flags) ;

methodName is set to the name of the method to be called. The class in which the called method is defined is passed on the stack and not in the argument list. The class can be specified either by name or by reference to an object.

The perl_call_argv Function

This subroutine calls the subroutine specified in the subname parameter. The values of the flags that are passed in when making this function call will be discussed shortly. The argv pointer is set in the same manner as the ARGV array to a Perl program: each item in the argv array is a pointer to NULL-terminated strings. The syntax for this call is

I32 perl_call_argv(char *subname, I32 flags, register char **argv) ;

The Flags to Use

The flags parameter in all the functions discussed in the previous section is a bitmask that can consist of the following values:

  • G_DISCARD
  • G_NOARGS
  • G_SCALAR
  • G_ARRAY
  • G_EVAL

The following sections detail how each flag affects the behavior of the called function.

The G_DISCARD Flag

The G_DISCARD flag is used to specify whether a function returns a value or not. If the G_DISCARD flag is set to True, no values are returned. By default, a returned value or a set of values from the perl_call functions are placed on the stack, and placing these values on the stack does take time and processing. If you are not interested in these values, set the G_DISCARD flag to get rid of these items. If this flag is not specified, there might be returned values pushed onto the stack that you might not be aware of unless you explicitly check the returned value of the function call. Therefore, specify the flag explicitly if you don't care to even look for returned values.

The G_NOARGS Flag

Normally you'll be passing arguments into a Perl subroutine. The default procedure is to create the @_ array for the subroutine to work with. If you are not passing any parameters to the Perl subroutine, you can save processing cycles by not creating the @_ array. Setting the flag stops the flag from being created.

Be sure that the function you are calling does not use arguments. If the called Perl subroutine does use parameters and you do not pass any parameters, your program might return totally bogus results. In such cases, the last value of @_ is used.

The G_SCALAR Flag

The flag specifies that only one scalar value will be expected back from this function. The called subroutine might attempt to return a list, in which case only the last item in the list will be returned. Setting this flag sets the context under which the called subroutine will run. G_SCALAR is the default context if no context is specified.

When this flag is specified, the returned value from the called function will either be 0 or 1: 0 is returned if the G_DISCARD flag is set; otherwise, 1 is returned. If a 1 is returned, you must remove the returned value to restore the stack back to what it was before the call was made.

Tip
The called subroutine can call the wantarray function to determine if it was called in scalar or array context. If the call returns false, the called function is called in scalar context. If the call returns true, the called function is called in array context.

The G_ARRAY Flag

Set this flag when you want to return a list from the called Perl subroutine. This flag causes the Perl subroutine to be called in a list context. 0 is returned when the G_DISCARD flag is specified along with this flag. If the G_DISCARD flag is not specified, the number of items on the stack is returned. This number is usually the number of items the calling routine pushed on the stack when making the function call; however, the number of items could be different if the called subroutine has somehow manipulated the stack internally. If G_ARRAY was specified and the returned value is 0, an error has occurred in the called routine.

The G_EVAL Flag

The called subroutine might crash while processing bogus input parameters. It's possible to trap an abnormal termination by specifying the G_EVAL statement. The net effect of this flag is to put the called subroutine in an evaluated block from which it's possible to trap all but the more shameful errors. Whenever control returns from the function, you have to check the contents of the $@ variable for any possible error messages. Remember that any anomaly can set the contents of $@, so check the value as soon as you return from a subroutine call.

The value returned from the perl_call_* function is dependent on what other flags have been specified and whether an error has occurred. Here are all the different cases that can occur: If an error occurs when a G_SCALAR is specified, the value on top of the stack will be undef. The value of $@ will also be set. Just remember to pop the undef from the stack.

Using SCALAR Context

Here is an example of how to call a Perl function that takes three arguments and returns an integer. This is an example of using a function in a scalar context because it returns a scalar value. The code is shown in Listing 26.6.


Listing 26.6. A function used in a scalar context.
 1 /* How to call a Perl subroutine from C */
 2 #include <stdio.h>
 3 #include <EXTERN.h>
 4 #include <perl.h>
 5
 6     static int getSeconds(int s, int m, int h);
 7     static PerlInterpreter *my_perl;
 8
 9     int main(int argc, char **argv, char **env)
10     {
11         my_perl = perl_alloc();
12         perl_construct(my_perl);
13
14         perl_parse(my_perl, NULL, argc, argv, env);
15     getSeconds(10,30,4);   /* TIME = 10:30:04 AM */
16         perl_destruct(my_perl);
17         perl_free(my_perl);
18     }
19
20     static int getSeconds(int s, int m, int h)
21     {
22         dSP ;        /* init stack pointer */
23         int count ;    /* keep return value  */
24         ENTER ;         /* start temporary area */
25         SAVETMPS;
26         PUSHMARK(sp) ;        /* push mark for last argument. */
27         XPUSHs(sv_2mortal(newSViv(s))); /* leftmost argument */
28         XPUSHs(sv_2mortal(newSViv(m))); /* go from left to right */
29         XPUSHs(sv_2mortal(newSViv(h))); /* rightmost argument */
30         PUTBACK ;            /* make stack pointer available */
31         count = perl_call_pv("seconds", G_SCALAR); /* call */
32         SPAGAIN ;       /* reset stack pointer */
33         if (count != 1)    /* check return value */
34             croak("Whoa Nelly! This is wrong\n") ;
35         printf ("The number of seconds so far = %f for  %d:%d:%d\n",
         POPi,h,m,s) ;
36         PUTBACK ;  /* put the popped value on stack again */
37         FREETMPS ; /* free up temporary variables (NOT count) */
38         LEAVE ;    /* get out and clean up stack */
39     }

Lines 2 through 4 declare the mandatory headers. At line 7 we declare the prototype for the function we are going to call. You can declare the function here instead if you like. The call to this function, getSeconds, is made at line 15. The function itself is declared static to prevent unlikely confusion with any predefined functions in the Perl libraries.

At line 22, the call to the dSP macro initializes the stack. At line 23, the count variable is declared to be a nontemporary variable on the stack. The ENTER and SAVETMPS macros at lines 24 and 25 start the temporary variable area. A marker to this stack location is pushed on at line 26.

The ENTER/SAVETMPS pair creates the start of code for all temporary variables that will be destroyed on the return from the C function. The FREETMPS/LEAVE pair will be used to clean up and destroy the space allocated on the calling stack for these temporary variables.

Now the parameters to the Perl subroutine are pushed onto the stack one at a time from the leftmost parameter to the rightmost parameter. Remember this order because the Perl function expecting these parameters will be declared as this:

sub seconds {
        my($h, $m, $s) = @_ ;
    my $t;
    $t = $s + $m * 60 + $h * 3600;
           return $t
    }

The stack pointer is made into a globally available value with the PUTBACK macro in line 30. The actual call to the subroutine is made in line 31 with the G_SCALAR flag. On return from the subroutine, we have to reset the stack pointer with the SPAGAIN (stack pointer again) macro. The count must be 1, or else we have an error.

The returned value is in POPi. The called function returned an integer value. To get other types of values, you can use one of the following macros:

  • POPs for an SV
  • POPp for a pointer (such as a pointer to a string)
  • POPn for a double
  • POPi for an integer
  • POPl for a long

The PUTBACK macro is used to reset the Perl stack back to a consistent state just before exiting the function. The POPi macro call only updated the local copy of the stack pointer. We have to set the global value on the stack, too. All parameters pushed onto the stack must be bracketed by the PUSHMARK and PUTBACK macros. These macros count the number of parameters being pushed and hence let Perl know how to size the @_ array. The PUSHMARK macro tells Perl to mark the stack pointer and must be specified even if you are using no parameters. The PUTBACK macro sets the global copy of the stack pointer to the value of the local copy of the stack pointer.

Here's another example of how to use a returned string from a function using the POPp macro. (See Listing 26.7.) Look at lines 23 through 25 to see how strings and integers are pushed onto the stack with the XPUSHs(sv_2mortal(newSVpv(str,offset))); and XPUSHs(sv_2mortal(newSViv(offset))); functions. The returned value from the actual call to the Perl function is retrieved with the POPp macro.


Listing 26.7. Using a returned string value.
 1 /* Using returned strings from functions */
 2 #include <stdio.h>
 3 #include <EXTERN.h>
 4 #include <perl.h>
 5
 6     static int MySubString(char *a, int offset, int len);
 7     static PerlInterpreter *my_perl;
 8
 9     int main(int argc, char **argv, char **env)
10     {
11         my_perl = perl_alloc();
12         perl_construct(my_perl);
13         perl_parse(my_perl, NULL, argc, argv, env);
14          MySubString("Kamran Was Here",7,3); /* return 'Was' */
15         perl_destruct(my_perl);
16         perl_free(my_perl);
17     }
18
19     static int MySubString(char *a, int offset, int len)
20     {
21         dSP ;
22         PUSHMARK(sp) ;
23         XPUSHs(sv_2mortal(newSVpv(a, 0)));
24         XPUSHs(sv_2mortal(newSViv(offset)));
25         XPUSHs(sv_2mortal(newSViv(len)));
26         PUTBACK ;
27         perl_call_pv("Csubstr", G_SCALAR);
28         SPAGAIN ;
29         printf ("The substring is %s\n",(char *)POPp) ;
30         PUTBACK ;
31         FREETMPS ;
32         LEAVE ;
33     }

The function Csubstr calls the Perl substr() as shown here:

sub Csubstr {
    my ($s,$o,$l) = @_;
    return substr($s,$o,$l);
    }

The value of the substr() function call is returned back from the subroutine call. It's this returned value that is used in the C program.

Returning Lists from Subroutines

Many Perl functions return lists as their results. C programs can retrieve these values as well. Here's a simple Perl function that returns the ratio of two numbers. (See Listing 26.8.) The C program to call this function is shown in Listing 26.9.


Listing 26.8. Ratio of numbers in a Perl function.
 1 sub GetRatio
 2 {
 3    my($a, $b) = @_ ;
 4 my $c, $d;
 5 if ($a == 0) { $c = 1; $d = 0; }
 6 elsif ($b == 0) { $c = 0; $d = 1; }
 7 else {
 8       $c = $a/$b;
 9       $d = $b/$a;
10      }
11 ($c,$d);
12 }

Look at lines 34 and 35 in Listing 26.9. The returned values from the Perl function are picked off one at a time using the POPn macro to get double values from the stack. The global stack is readjusted before returning from the C function.


Listing 26.9. Calling the GetRatio function.
 1    /* How to return lists back from Perl functions */
 2
 3 #include <stdio.h>
 4 #include <EXTERN.h>
 5 #include <perl.h>
 6
 7     static void getRatio(int a, int b);
 8     static PerlInterpreter *my_perl;
 9
10     int main(int argc, char **argv, char **env)
11     {
12         my_perl = perl_alloc();
13         perl_construct(my_perl);
14         perl_parse(my_perl, NULL, argc, argv, env);
15     getRatio(8,3);
16         perl_destruct(my_perl);
17         perl_free(my_perl);
18     }
19
20
21     static void getRatio(int a, int b)
22     {
23         dSP ;
24         int count ;
25         ENTER ;
26         SAVETMPS;
27         PUSHMARK(sp) ;
28         XPUSHs(sv_2mortal(newSViv(a)));
29         XPUSHs(sv_2mortal(newSViv(b)));
30         PUTBACK ;
31         count = perl_call_pv("GetRatio", G_ARRAY);
32         SPAGAIN ;
33         if (count != 2) croak("Whoa! \n") ;
34         printf ("%d / %d = %f\n", a, b, POPn) ;
35         printf ("%d / %d = %f\n", b, a, POPn) ;
36         PUTBACK ;
37         FREETMPS ;
38         LEAVE ;
39     }

Placing G_SCALAR instead of G_ARRAY in the code in Listing 29.9 would have forced a scalar value to be returned. Only the last item of the array would have been returned and the value of count would be set to 1.

Note how this Perl subroutine takes precautions not to crash by first checking the divisor by zero. Now, this time let's not return a value. Instead, let's call the die() function if a bogus value is sent into the function GetRatio. We'll try to trap the errors caused by calling the function with the G_EVAL flag set.

Using G_EVAL

The G_EVAL flag is useful when calling functions you think may die(). The G_EVAL flag is OR-ed in with any other flags to such a call. Listing 26.10 presents a Perl function that calls the die function in case one of the arguments sent into it is zero. Because we know that this function can die, we'll send in a value that causes it to die. The code to make this fatal call (to illustrate how G_EVAL is used) is shown in Listing 26.11.


Listing 26.10. A Perl function that can die.
 1 sub GetRatioEval
 2     {
 3        my($a, $b) = @_ ;
 4     my $c, $d;
 5     die "Hey! A is 0 \n" if ($a == 0);
 6     die "Hey! B is 0 \n" if ($b == 0);
 7     $c = $a/$b;
 8     $d = $b/$a;
 9            ($c,$d);
10     }


Listing 26.11. A C program to use the G_EVAL flag.
 1  /* Call the suicidal function to use G_EVAL */
 2
 3 #include <stdio.h>
 4 #include <EXTERN.h>
 5 #include <perl.h>
 6
 7     static void getRatio(int a, int b);
 8     static PerlInterpreter *my_perl;
 9
10     int main(int argc, char **argv, char **env)
11     {
12         my_perl = perl_alloc();
13         perl_construct(my_perl);
14         perl_parse(my_perl, NULL, argc, argv, env);
15          getRatio(8,0);
16         perl_destruct(my_perl);
17         perl_free(my_perl);
18     }
19
20
21     static void getRatio(int a, int b)
22     {
23         dSP ;
24         int count ;
25          SV *svp;          /* New line */
26         ENTER ;
27         SAVETMPS;
28         PUSHMARK(sp) ;
29         XPUSHs(sv_2mortal(newSViv(a)));
30         XPUSHs(sv_2mortal(newSViv(b)));
31         PUTBACK ;
32         count = perl_call_pv("GetRatioEval", G_ARRAY | G_EVAL);
33         SPAGAIN ;
34         svp = GvSV(gv_fetchpv("@", TRUE, SVt_PV));
35         if (SvTRUE(svp))
36         {
37             printf ("Die by division: %s\n", SvPV(svp, na)) ;
38             POPs ;
39         }
40         else
41         {
42         if (count != 2) croak("Whoa! \n") ;
43         printf ("%d / %d = %f\n", a, b, POPn) ;
44         printf ("%d / %d = %f\n", b, a, POPn) ;
45     }
46         PUTBACK ;
47         FREETMPS ;
48         LEAVE ;
49     }

In the code shown in Listing 26.11, the call to the Perl function will terminate in a die() function call. The returned value from this Perl function call is checked in line 34. The variable we are looking at is the $@ variable in Perl. The syntax for this call is

svp = GvSV(gv_fetchpv("@", TRUE, SVt_PV));

The value of the variable is checked in the following lines, and the stack is adjusted with a call to pop off the string. The string is retrieved with a call to the SvPV() function in line 38.

Getting Special Variable Values

In Listing 26.7 we recovered values of special variables, $< and $(, to get the UID and GID of the calling process via a Perl subroutine. The Perl subroutine was simply an example of how to call a routine. Now let's see how we can get values of special variables in Perl directly. The call to get these values is

svp = GvSV(gv_fetchpv(variableName, defaultValue, SVt_PV));

Let's look at the function in Listing 26.12 to see how to get the UID and GID of a calling C program.


Listing 26.12. Function for getting the values of $< and $( directly.
 1 static int getUserInfo()
 2 {
 3     dSP ;
 4 int tmp;
 5 SV *svp;
 6     PUSHMARK(sp);
 7     svp = GvSV(gv_fetchpv("<", 0, SVt_PV));
 8 tmp =  SvIV(svp);
 9 printf ("\n UID = %d",tmp);
10     svp = GvSV(gv_fetchpv("(", 0, SVt_PV));
11 tmp =  SvIV(svp);
12 printf ("\n GID = %d",tmp);
13 }

The GvSV() function returns the current value of the variable named in the first parameter. The default value is 0 for these calls. The returned value from each GvSV call is a pointer from which an integer is extracted with a call to SvIV().

Using the ST Macros

The POPi, POPp, and POPn macros are great for getting items off the stack one at a time. To get the individual items in the stack though, you have to use the ST() macros. Basically, ST(n) returns the nth item from the top of the stack. However, you have to adjust the number of items on the stack yourself.

Listing 26.13 illustrates how the stack will look using the ST macros with a different function. Line 39 is where the ST(i) macro is used. The stack length is adjusted in lines 35 and 36.


Listing 26.13. Using the ST macros.
 1 /* demonstration C program */
 2 /* Using the ST macros.*/
 3
 4 #include <stdio.h>
 5 #include <EXTERN.h>
 6 #include <perl.h>
 7
 8     static void squares(int a);
 9     static PerlInterpreter *my_perl;
10
11     int main(int argc, char **argv, char **env)
12     {
13         my_perl = perl_alloc();
14         perl_construct(my_perl);
15         perl_parse(my_perl, NULL, argc, argv, env);
16     getRatio(8);
17         perl_destruct(my_perl);
18         perl_free(my_perl);
19     }
20
21
22     static void Squares(int a)
23     {
24         dSP ;
25     I32 ax;
26     int i;
27         int count ;
28         ENTER ;
29         SAVETMPS;
30         PUSHMARK(sp) ;
31         XPUSHs(sv_2mortal(newSViv(a)));
32         PUTBACK ;
33         count = perl_call_pv("squares", G_ARRAY);
34         SPAGAIN ;
35         sp -= count ;
36         ax = (sp - stack_base) + 1 ;   /* adjust the stack */
37         if (count != 2) croak("Whoa! \n") ;
38         for (i = 0; i < count; i++)
39                 printf ("%d ", SvIV(ST(i))) ;
40         PUTBACK ;
41         FREETMPS ;
42         LEAVE ;
43     }

The for loop recovers only as many values as are on the stack. If you do not want to adjust the stack, you can remove lines 35 and 36 from this code and replace the code in lines 38 and 39 with this:

for (i = 0; i < count; i++)
          printf ("%d ", POPi) ;

Of course, the choice of which type of function to use is entirely up to you.

Evaluating Perl Expressions

In addition to calling Perl code, you can also use the eval function to directly evaluate a Perl statement. This lets you use a C program by itself without having the need to declare the code for Perl elsewhere. By using C strings to hold your Perl programs, you can create entire applications using one C source file. Listing 26.14 presents a sample file.


Listing 26.14. Using expressions in C.
 1 #include <stdio.h>
 2 #include <EXTERN.h>
 3 #include <perl.h>
 4
 5    static PerlInterpreter *my_perl;
 6
 7
 8 /*
 9 ** This is a wrapper around the eval call
10 */
11    int evalExpression(char *evaluatedString)
12    {
13      char *argv[2];
14      argv[0] = evaluatedString;
15      argv[1] = NULL;
16      perl_call_argv("_eval_", 0, argv);
17    }
18
19 main (int argc, char **argv, char **env)
20 {
21
22 /*
23 ** Plain code to do some parsing.
24 */
25      char *codeToUse[] = { "", "-e", "sub _eval_ { eval $_[0] }" };
26      STRLEN length;
27
28      my_perl = perl_alloc();
29      perl_construct( my_perl );
30
31     /*
32     ** Fake out the call by creating your own argc, argv, and env
33     */
34      perl_parse(my_perl, NULL, 3, codeToUse, env);
35
36      evalExpression("$x = 3; $y = 2; $rho= sqrt($x * $x + $y * $y);");
37      printf("x = %d, y = %d and rho = %f \n",
38         SvIV(perl_get_sv("x", FALSE)),
39         SvIV(perl_get_sv("y", FALSE)),
40         SvNV(perl_get_sv("rho", FALSE)));
41      evalExpression("$wisdom
42      = 'Able was I ere I saw Elba'; $wisdom = reverse($wisdom); ");
43      printf("wisdom = %s\n", SvPV(perl_get_sv("wisdom", FALSE), length));
44
45      evalExpression("$wisdom = 'I ran a mile today, and said \
46     Here Lady Take your purse'; $wisdom = reverse($wisdom); ");
47      printf(" %s\n", SvPV(perl_get_sv("wisdom", FALSE), length));
48
49      evalExpression("$joke = 'I was walking down the street when something
50 caught my eye and dragged it twenty feet'; $joke = uc($joke); ");
51      printf("%s\n", SvPV(perl_get_sv("joke", FALSE), length));
52
53      perl_destruct(my_perl);
54      perl_free(my_perl);
55 }

Here's the output from running this program:

x = 3, y = 2 and rho = 3.605551
wisdom = ablE was I ere I saw elbA
 esrup ruoy ekaT ydaL ereH dias dna ,yadot elim a nar I
I WAS WALKING DOWN THE STREET WHEN SOMETHING
CAUGHT MY EYE AND DRAGGED IT TWENTY FEET

Now let's look at some of the lines of relevance in the listing. Lines 11 through 17 define a function that will serve as the wrapper around the Perl eval() function. Basically, the function evalExpression takes one string as an argument, creates an argument vector, and then calls the _eval_ function with this newly created argument vector.

Then in line 25 we actually define the _eval_ function as if it was typed on the command line. The text is preserved in the string variable codeToUse. The perl_parse function is then called with the codeToUse string and a length of 3 arguments as argc. The environment is passed in verbatim.

At line 36 we test out how to use integers and floating point numbers in a calculation. Note how the returned values are extracted by naming functions without $. The $ is automatically prepended to the variable name; therefore, you should not use it explicitly. If you prepend $ yourself, the value of the variable $$var, not $var, will be extracted.

Strings are probably where you'll benefit the most when using Perl functions from within C. Lines 42 through 51 show examples of how to call Perl functions to reverse and change the case of some strings. Note how the strings can cross multiple lines.

Actually, you can put in entire Perl functions instead of these calls and have the eval operator work on these functions as strings. This enables you to create very powerful applications using the power of each of the languages (C and Perl).

Pattern Matches and Substitutions from Your C Program

It's possible to use the Perl interpreter to do pattern matches and string substitutions from within C code. Pattern matches in Perl are easy when compared with the same process in C. This is one area where you can use Perl to make up for the lack of pattern matching and string substitution features in C. Here's a definition of two functions in C that use the Perl interpreter:

  • int match(char *inputString, char *matchString)
  • This function takes an input string, and attempts to match it with the pattern in matchString. The matchString is a string with a regular pattern in it; for example, "/Gumb[oy]/" or "/[Dd]onald/". Be careful when using double quotes within patterns because you'll have to escape them with a backslash ("\").
  • int substitute(char *inputString, char *substitution)
  • This function takes an input string followed by a substitution operation such as "s/courage/stupidity/" or "s/ition/ate/g". The inputString will be modified if any substitution is made.

Listing 26.15 presents the code to define and use these functions.


Listing 26.15. Using pattern matching and substitution in C.
  1 /*
  2 ** Using the Perl interpreter to do pattern matching
  3 ** and string substitution.
  4 */
  5 #include <stdio.h>
  6 #include <EXTERN.h>
  7 #include <perl.h>
  8
  9 /* Undefine this to see how it all works. */
 10 #define KDEBUG 1
 11
 12 static PerlInterpreter *my_perl;
 13
 14 /*
 15 ** A wrapper around the Perl eval statement
 16 */
 17 int doExpression(char *string)
 18 {
 19      char *argv[2];
 20      argv[0] = string;
 21      argv[1] = NULL;
 22      perl_call_argv("_eval_", 0, argv);
 23 }
 24
 25 /*
 26 ** A global to this program since I am too lazy to use mallocs
 27 */
 28 static char command[256];
 29
 30
 31 /*
 32 ** Do a pattern match
 33 */
 34 char match(char *string, char *pattern)
 35 {
 36 sprintf(command, "$string = '%s';
       $return = $string =~ %s", string, pattern);
 37 #ifdef KDEBUG
 38     printf (" %s", command);
 39 #endif
 40 doExpression(command);
 41 return SvIV(perl_get_sv("return", FALSE));
 42 }
 43
 44 /*
 45 ** Do a string substitution
 46 */
 47 int substitute(char *string, char *pattern)
 48 {
 49 char   *bfptr = command;
 50 STRLEN length;
 51 sprintf(command, "$string = '%s'; $ret = ($string =~ %s)",string,pattern);
 52
 53 #ifdef KDEBUG
 54 printf (" %s", command);
 55 #endif
 56
 57 doExpression(command);
 58 bfptr = (char *) SvPV(perl_get_sv("string", FALSE), length);
 59 strcpy(string,bfptr);
 60 return SvIV(perl_get_sv("ret", FALSE));
 61 }
 62
 63
 64 /*
 65 ** The main program itself.
 66 */
 67 main (int argc, char **argv, char **env)
 68 {
 69 char *embedding[] = { "", "-e", "sub _eval_ { eval $_[0] }" };
 70 STRLEN length;
 71 char *text;
 72 int i,j;
 73
 74      my_perl = perl_alloc();     /* Allocate the interpreter */
 75      perl_construct( my_perl ); /* Call the constructor */
 76     /* Fake the call to the script to use */
 77      perl_parse(my_perl, NULL, 3, embedding, env);
 78
 79     /*
 80     ** Do the loop
 81     */
 82
 83     while (1) {
 84          doExpression("$reply = <STDIN>; chop $reply;");
 85     text = SvPV(perl_get_sv("reply", FALSE), length);
 86          printf("reply = %s\n", text);
 87
 88          if (match(text, "/[Qq]uit|[Ee]xit/")) /* Bail out? */
 89             break;
 90          if (match(text, "/[Cc]om/"))
 91         {
 92                printf("match: Text contains the word 'com'\n\n");
 93            substitute(text, "s/com/commercial/g");
 94              printf("\nAfter substitution = %s\n", text);
 95         }
 96          else
 97                printf("match: Text doesn't contain the word Com.\n\n");
 98
 99         } /* while loop ends */
100
101     /*
102     ** Clean up after yourself.
103     */
104      perl_destruct(my_perl);
105      perl_free(my_perl);
106 }

The program is derived from the code in Listing 26.14. Most of the constructs are the same with the exception of the global command buffer declared at line 28. You might want to consider using mallocs within functions for a more complicated application. The KDEBUG flag is set to 1 so as to show how the command strings are constructed. You can comment out the code in line 10 for a less verbose output.

The match function is defined at line 34. The command string is constructed at line 36 with the command executed at line 40. The value of the $return string is returned. The $return contains the first matched pattern and will be empty if no matches are found. If $return is empty, 0 will be returned.

The substitute function is defined at line 47. A command string is constructed at line 51 using the global "command" buffer. At line 58, we retrieve the value of the $ret variable after the substitution has been made, and we overwrite this new string onto the input string. The length of the new string is returned at line 60. The bulk of the work in the program is being done in lines 83 to 99 during the while loop. At line 84, we collect an input string from the user and point to it with the text char *pointer. Then we match the incoming string to see whether we have to exit in line 88 and break out of the loop if there is a match. At line 91, we search for another pattern and, if found, substitute a string in the input pattern.

Summary

This chapter shows ways of incorporating the Perl interpreter into your C code. Writing a makefile involves getting the correct paths to the libraries. The ExtUtils::embed module can help you get these paths. The C code that utilizes the Perl functions must initialize and maintain a PerlInterpreter object. Calls from C into Perl have to maintain a stack on which values are sent into and retrieved from functions. Both scalar and array values can be returned from Perl functions, and the calling routine has to be aware of how to handle these inputs. The Perl functions being called can either reside on disk or can be embedded in C strings. When combining code from both these languages, you have to balance the division to get the most out of each language by using their strong points. For example, use Perl for string manipulation and use C for coding complex operations involving calculations.

Chapter 27

Writing Extensions in C


CONTENTS


In this chapter you'll work with the Perl XS language, which is used to create an interface between Perl and a C library. Such interfaces are called extensions to Perl because they enable your code to look and feel just like it is a part of Perl. Extensions are useful in appending extra functionality to Perl. This chapter covers the basics of writing extensions. The examples are simple enough to build on in order to create your own extensions library.

Introduction to XS

In Perl, XS refers to a programming language interface used to create an interface between C code and Perl scripts. Using the XS API, you can create a library that can be loaded dynamically into Perl.

The XS interface defines language components that wrap around Perl constructs. To use the XS language, you need the xsubpp compiler to embed the constructs for you. The base construct in the XS language is the XSUB function. The XSUB function is called when calls are made to pass data and control between Perl and C code. Basically, Perl scripts call the XSUB routines to get to the C code in the libraries encapsulated by the XSUB functions.

To ensure that correct data types are mapped between C and Perl scripts, the XS compiler does a mapping of one C type of variable to a type in Perl. The mapping is maintained in a file called typemap. When you're looking for extension files, it's often instructive to see how files are mapped by using the typemap file in the same directory as the source file. typemap files are covered later in this chapter.

You may be asking yourself why someone would want to write an extension in C when Perl is a perfectly good working language. For one thing, Perl is slow compared to C. Your C compiler can generate really tight code. All that power does have drawbacks. Another reason is that you might already have source code written and working in C, and porting this existing code to Perl would not make much sense. Also, it's possible to write code that can be accessed with Perl and C if you write the interfaces to your core functions correctly.

Steps for Creating an Extension

The best way to show the creation of an extension is by example. This section steps you through the creation of the futureValue world function for Perl.

Step 1: Change to the Perl Directory

Go to the directory where you installed the Perl distribution. This is important. Do not try this in any other directory or you'll get errors.

Step 2: Run h2sx

For Step 2, you have to run the header to extension program h2xs. To obtain a list of the options for this program, use the -h option, as shown here:

$ h2xs -h

h2xs [-Acfh] [-n module_name] [headerfile [extra_libraries]]
    -f   Force creation of the extension even if the C header does not exist.
    -n   Specify a name to use for the extension (recommended).
    -c   Omit the constant() function and specialised AUTOLOAD from the XS file.
    -A   Omit all autoloading facilities (implies -c).
    -h   Display this help message
extra_libraries
         are any libraries that might be needed for loading the
         extension, e.g. -lm would try to link in the math library.

Run h2XS -n Finance. This creates a directory named Finance, possibly under the subdirectory ext/ if it exists in the current working directory. Four files are created in the Finance directory: MANIFEST, Makefile.PL, Finance.pm, and Finance.xs. Here's what the output on your terminal will look like:

$ h2xs -n Finance
Writing ext/Finance/Finance.pm
Writing ext/Finance/Finance.xs
Writing ext/Finance/Makefile.PL

The MANIFEST file in the ext/Finance directory contains the names of the four files created. You have to change directories to ./ext/Finance to be able to work with these four files. Also, depending on who installed your Perl distribution, you might have to run as root.

The contents of file Makefile.PL are shown in Listing 27.1.


Listing 27.1. The contents of Makefile.PL.
 1 use ExtUtils::MakeMaker;
 2 # See lib/ExtUtils/MakeMaker.pm for details of how to influence
 3 # the contents of the Makefile that is written.
 4 WriteMakefile(
 5     'NAME'       => 'Finance',
 6     'VERSION'    => '0.1',
 7     'LIBS'       => ['-lm'],   # e.g., '-lm'
 8     'DEFINE'     => '',     # e.g., '-DHAVE_SOMETHING'
 9     'Inc'        => '',     # e.g., '-I/usr/include/other'
10 );

The h2xs script also creates a .pm file. The contents of file Finance.pm are shown in Listing 27.2. The Finance.pm file is where you add your exports and any Perl code.


Listing 27.2. The module file: Finance.pm.
 1 package Finance;
 2
 3 require Exporter;
 4 require DynaLoader;
 5 require AutoLoader;
 6
 7 @ISA = qw(Exporter DynaLoader);
 8 # Items to export into callers namespace by default. Note: do not export
 9 # names by default without a very good reason. Use EXPORT_OK instead.
10 # Do not simply export all your public functions/methods/constants.
11 @EXPORT = qw(
12
13 );
14 sub AUTOLOAD {
15     # This AUTOLOAD is used to 'autoload' constants from the constant()
16     # XS function.  If a constant is not found then control is passed
17     # to the AUTOLOAD in AutoLoader.
18
19     local($constname);
20     ($constname = $AUTOLOAD) =~ s/.*:://;
21     $val = constant($constname, @_ ? $_[0] : 0);
22     if ($! != 0) {
23      if ($! =~ /Invalid/) {
24          $AutoLoader::AUTOLOAD = $AUTOLOAD;
25          goto &AutoLoader::AUTOLOAD;
26      }
27      else {
28          ($pack,$file,$line) = caller;
29          die "Your vendor has not defined Finance macro
$constname, used at $file line $line.
30 ";
31      }
32     }
33     eval "sub $AUTOLOAD { $val }";
34     goto &$AUTOLOAD;
35 }
36
37 bootstrap Finance;
38
39 # Preloaded methods go here.
40
41 # Autoload methods go after _ _END_ _, and are processed by the autosplit program.
42
43 1;
44 _ _END_ _

All scripts which use Finance.pm will now have to tell Perl to use functions in the Finance.pm extension with the following command:

use Finance;

When Perl sees this use command, it searches for a Finance.pm file of the same name in the various directories listed in the @Inc array. If it cannot find the file, Perl stops with an error message.

The .pm file extension generally requests that the Exporter and Dynamic Loader extensions also be loaded. You need them for exporting functions and dynamic loading. Perl uses the @ISA array to get any methods that are not found in the current package. After this set, the library is loaded as an extension into Perl.

There are two files to look at in the Finance example we just created. The Finance.xs file (Listing 27.3) holds the C routines that contain all the C code for the extension, and the Finance.pm file (Listing 27.2) contains routines that tell Perl how to load this extension and what functions are exported.

Step 3: Create a Makefile

In Step 3, you have to generate a makefile. Generating and invoking the make command makefile will create a working version of the library Finance.so in the ../../lib/ext/Finance directory. After testing, you can move the finished versions of these files to the /usr/lib or /usr/local/lib tree. In all further testing in this section, you must point the @Inc array to the ../../lib/ext/Finance location for this finance.so file.

You may also see a blib directory in the ext/Finance directory. The man page templates for your extensions are kept here.

Finally, the Finance.xs file where you place all your code for extension in C is shown in Listing 27.3.


Listing 27.3. The initial Finance.xs file.
 1 #ifdef _ _cplusplus
 2 extern "C" {
 3 #endif
 4 #include "EXTERN.h"
 5 #include "perl.h"
 6 #include "XSUB.h"
 7 #ifdef _ _cplusplus
 8 }
 9 #endif
10
11 static int
12 not_here(s)
13 char *s;
14 {
15     croak("%s not implemented on this architecture", s);
16     return -1;
17 }
18
19 static double
20 constant(name, arg)
21 char *name;
22 int arg;
23 {
24     errno = 0;
25     switch (*name) {
26     }
27     errno = EINVAL;
28     return 0;
29
30 not_there:
31     errno = ENOENT;
32     return 0;
33 }
34
35 MODULE = Finance        PACKAGE = Finance
36
37 double
38 constant(name,arg)
39      char *         name
40      int       arg

Step 4: Add Some Code

Now that you have created the necessary files for your own Perl extension, you can move to Step 3, which involves adding your own code to the newly generated extension files. Add some simple futureValue world application code to the Finance.xs file. Listing 27.4 shows what Finance.xs looks like with the code addition. Be sure to create this file because you'll be using it throughout the rest of the chapter.


Listing 27.4. After code is added to Finance.xs.
  1 #ifdef _ _cplusplus
  2 extern "C" {
  3 #endif
  4 #include "EXTERN.h"
  5 #include "perl.h"
  6 #include "XSUB.h"
  7 #ifdef _ _cplusplus
  8 }
  9 #endif
 10
 11 static int
 12 not_here(s)
 13 char *s;
 14 {
 15     croak("%s not implemented on this architecture", s);
 16     return -1;
 17 }
 18
 19 static double
 20 constant(name, arg)
 21 char *name;
 22 int arg;
 23 {
 24     errno = 0;
 25     switch (*name) {
 26     }
 27     errno = EINVAL;
 28     return 0;
 29
 30 not_there:
 31     errno = ENOENT;
 32     return 0;
 33 }
 34
 35 MODULE = Finance        PACKAGE = Finance
 36
 37 double
 38 constant(name,arg)
 39      char *         name
 40      int       arg
 41
 42 double
 43 futureValue(present,rate,time)
 44      double present
 45      double rate
 46      double time
 47
 48      CODE:
 49      double d;
 50      extern double pow(double x, double y);
 51
 52      d = present * pow((1.0 + rate),time);
 53      /* printf("\n Future Value = %f \n", d); */
 54      RETVAL =  d;
 55
 56      OUTPUT:
 57      RETVAL
 58
 59 double
 60 presentValue(future,rate,time)
 61      double future
 62      double rate
 63      double time
 64
 65      CODE:
 66      double d;
 67      extern double pow(double x, double y);
 68
 69      d = future / pow((1.0 + rate),time);
 70      /* printf("\n Present Value = %f \n", d); */
 71      RETVAL =  d;
 72
 73      OUTPUT:
 74      RETVAL
 75
 76 void
 77 Gordon(d,r,g)
 78      double d
 79      double r
 80      double g
 81
 82      CODE:
 83           /* Return Gordon Growth Model Value of stock */
 84      g = d / (r -g);
 85      printf("\n g = %f", g);
 86
 87 void
 88 depreciateSL(p,s,n)
 89      double p
 90      double s
 91      double n
 92
 93      PpcODE:
 94      double sum;
 95      double value[10];
 96      double dep;
 97      int i;
 98      int in;
 99      in  = (int)n;
100      if ((n < 10) && (n > 0))
101           {
102           EXTEND(sp,in);
103           sum  = p;
104           printf("value = %f for %d years \n", sum,in);
105           dep = (sum - s)/ n;
106           for (i=0; i<n; i++)
107                {
108                sum -= dep;
109                printf("value = %f \n", sum);
110                value[i] = sum;
111                PUSHs(sv_2mortal(newSViv(sum)));
112                }
113           }

A function to calculate the future value of an investment is defined in line 43. The function to calculate the present value of money to be received in the future from an investment is defined starting at line 60. Both functions return one value and require three arguments, which are defined one per each line following the function declaration. For example, for the present value function in line 60, the three arguments must be defined in one line as

double future, rate, time;

At line 77, I define a Gordon Growth model function for a stock. This function only prints something and does not return any values. At line 88, I define a straight line depreciation model function that returns more than one value on the calling stack.

Caution
Be careful to put the type of function and the name of the function on separate lines. In other words, do not shorten the two lines into one. The return parameter and the function name must be on two lines for the XS specification. Any arguments to the function would have be to listed one at a time on each line following the function name.

Next, run the command perl Makefile.PL. This creates a real makefile, which make needs. The Makefile.PL checks to see whether your Perl distribution is complete and then writes the makefile for you.

If you get any errors at this point, you should check to see whether your Perl distribution is complete. As a check, try running the Makefile.PL script as root. If your Perl distribution was installed by root, you may not have permission to overwrite some files. It won't hurt to try.

If you do not get any errors, proceed.

Run the makefile on your newly created Makefile. The following output should be pretty close to what you see:

# make

umask 0 && cp Finance.pm ../../lib/Finance.pm
../../perl -I../../lib -I../../lib ../../lib/ExtUtils/xsubpp
-typemap ../../lib/ExtUtils/typemap Finance.XS
>Finance.tc && mv Finance.tc Finance.c
cc -c -D_ _USE_BSD_SIGNAL -Dbool=char -DHAS_BOOL
-O2    -DVERSION=\"0.1\" -fpic -I../..  Finance.c
Running Mkbootstrap for Finance ()
chmod 644 Finance.bs
LD_RUN_PATH="" cc -o ../../lib/auto/Finance/Finance.so -shared
     -L/usr/local/lib Finance.o
chmod 755 ../../lib/auto/Finance/Finance.so
cp Finance.bs ../../lib/auto/Finance/Finance.bs
chmod 644 ../../lib/auto/Finance/Finance.bs

Wait! Before you execute this gem of a script shown above, look at the output in Listing 27.6. The shared version of Finance.so is in the directory ../../lib/auto/Finance/Finance.so. It is important that you copy Finance.so to your @Inc path. It's important at this point to either modify the @Inc array in your scripts that use Finance.pm or, by default, point to a known test location where this .so file will reside. Perl will search the directories listed in @Inc to load the extension module.

Step 5: Test Your Extension Module

Now, in the Test1 directory, create the test script shown in Listing 27.5 and name it t.pl.


Listing 27.5. The test program.
 1 #!/usr/bin/perl
 2
 3 use Finance;
 4
 5 Finance::Gordon(50,4,2);
 6 print "\n";
 7
 8 #
 9 #
10 #
11 $pv = 1000.0;
12 $time = 10 * 12;
13 $rate = 0.05 / 12;
14 $fv = Finance::futureValue($pv,$rate,$time);
15 printf "\n Future Value of %f after %f periods accruing interest ",
16            $pv, $rate;
17 printf "\n for %f months will be r %f \n",
18           $time, $fv;
19
20 printf "\n Let us go backward and see if we get the same numbers";
21 $pvBack = Finance::presentValue($fv,$rate,$time);
22 printf "\n Present Value ", $pvBack;
23
24 printf "\n Difference = %f", ($pv - $pvBack);
25
26 printf "\n ------- see any difference? --------- \n" ;

Notice that Finance::Gordon is used to explicitly call the Gordon Growth Model function (see line 5). Look up the formula in your finance textbook if you don't believe me. It would be cumbersome to keep typing in Finance:: to all your functions. Add the declaration to the @EXPORT array in Finance.pm and remake. Now you can use the function Gordon by itself. The @EXPORT array in the .pm file tells Perl which of the extension's routines should be placed in the calling package's own name space.

Final Considerations

There are some things you should be aware of before you export everything in your module. Sure, it saves you typing and makes the code easier to read by not having all those Finance:: prefixes everywhere. However, what about the same function name residing in both the main and the modules? In this case, the function in the main may prevail. Why create an ambiguity when the choice of a good name for a function will suffice? Also, it's not a good idea to export every function in your extension. After all, the idea behind the extension is to hide some of the functionality and intricacies in the extension from the application that is using this extension module.

Most of the time you do not want to export the names of your extension's subroutines because they might accidentally clash with other modules' subroutines from other extensions or from the calling program itself.

The xsubpp Compiler

If you examine the makefile for your extension, you'll see a call to a program called xsubpp. This is a preprocessor compiler for XS code.

The compiler xsubpp takes the XS code in the .xs file, converts it into C code, and places it in a file whose suffix is .c. The C code created makes heavy use of the C functions within Perl.

Tip
Make sure the xsubpp compiler is in your path and that the first line in the file points to /usr/bin/perl and not to .miniperl (if you did not install miniperl).

An XSUB function is just like a C function in that it takes arguments and returns one or more single values (if not declared void). Values may also be returned via pointers to arguments passed to the function.

Now move on to something a little bit more exotic and create a function that takes arguments and returns something. This function calculates and returns the Julian day given a calendar day. The Julian day calculation is very important in astronomical calculations since it's a reference counter from all the days since January 1, 4713 B.C. and was founded by the French scholar Joseph Scaliger (1540-1609) in 1583 A.D. The formula for calculating the Julian day is given in forms in just as many astronomical texts. One version of this formula is:

Julian Day = 367 * YEAR - 7 * (YEAR + (M + 9)/12)4 +
                      (275 * M / 9) + 1721013.5;

Don't try to shorten the formula by reducing it to an algebraic equivalent since the formula relies on dropping bits off the right side of the decimal point. One such way of implementing this formula is the function for calculating the Julian day, as shown in Listing 27.6.


Listing 27.6. The Julian.c file.
 1 #include <math.h>
 2
 3 long JulianDay( int month, int day, int year)
 4 {
 5 double t1, t2;  long jd;
 6 t1 = 7 * floor((year + floor(month * 9)/12)/4);
 7 t2 = floor(275 * month / 9);
 8 jd = (long) (floor(367 * year - t1 + t2 + day + 1721013.5));
 9  return(jd);
10 }
11 /* SAMPLE USAGE
12 main(int argc, char *argv[])
13 {
14 long day1, day2;
15 day1 = JulianDay(7,21,1962);
16 day2 = JulianDay(2,2,1996);
17 printf("\n Thou are only %d days old", day2 - day1);
18 }
19 */

Run h2xs -A -n Julian as before to get the Julian module started. This creates the ext/Julian directory with the necessary files. This time, however, you'll be adding a lot more code into the functions in Julian.xs, as shown in Listing 27.7.


Listing 27.7. Adding functions to Julian.xs.
 1 #ifdef _ _cplusplus
 2 extern "C" {
 3 #endif
 4 #include "EXTERN.h"
 5 #include "perl.h"
 6 #include "XSUB.h"
 7 #ifdef _ _cplusplus
 8 }
 9 #endif
10
11 MODULE = Julian         PACKAGE = Julian
12
13 #include <math.h>
14
15 long
16 JulianDay( month, day, year)
17      int month
18      int day
19      long year
20
21      CODE:
22           long jul;
23 double t1, t2;
24 t1 = 7 * floor((year + floor(month * 9)/12)/4);
25 t2 = floor(275 * month / 9);
26 jul = (long) (floor(367 * year - t1 + t2 + day + 1721013.5));
27
28           }
29           RETVAL = jul;
30      OUTPUT:
31      RETVAL

Note that each line containing the arguments after the declaration of JulianDay is indented one tab. It is not necessary to have a tab between the type of variable and the name of the variable. Also note that that there is no semicolon following the declaration of each variable.

Edit the file Makefile.PL so that the corresponding line looks like this:

'LIBS'      => ['-lm'],   # e.g., '-lm'

Notice that an extra library to link in has been specified in this case, the math library, libm. You'll learn later in this chapter how to write XSUBs that can call every routine in a library.

Generate the makefile and run make. A test script for running this program is shown in List-ing 27.8.


Listing 27.8. Testing the Julian Day module.
 1 #!/usr/bin/perl
 2
 3 use Julian;
 4
 5 $m1 = 07;
 6 $d1 = 21;
 7 $d2 = 22;
 8
 9 $yr1 = 1996;
10 $yr2 = 1996;
11
12 $jd1 = &Julian::JulianDay($m1,$d1,$yr1);
13 $jd2 = &Julian::JulianDay($m1,$d2,$yr2);
14
15 print "Day 1: $jd1 \n";
16 print "Day 2: $jd2 \n";
17 print "Delta: ",$jd2 - $jd1, "\n" ;

Now that you've learned some of the ways to create and use C extensions, here's how to put them all together. First of all, look at the Julian.c file in Listing 27.9, put together by the Perl script.

The function is called XS, and the arguments are specified via the dXSARGS keyword.


Listing 27.9. The Julian.c file.
 1 /*
 2  * This file was generated automatically by xsubpp version 1.923 from the
 3  * contents of Julian.xs. Don't edit this file, edit Julian.XS instead.
 4  *
 5  *     ANY chANGES MADE HERE WILL BE LOST!
 6  *
 7  */
 8
 9 #ifdef _ _cplusplus
10 extern "C" {
11 #endif
12 #include "EXTERN.h"
13 #include "perl.h"
14 #include "XSUB.h"
15 #ifdef _ _cplusplus
16 }
17 #endif
18
19 #include <math.h>
20 XS(XS_Julian_JulianDay)
21 {
22     dXSARGS;
23     if (items != 3)
24        croak("Usage: Julian::JulianDay(month, day, year)");
25     {
26        int    month = (int)SvIV(ST(0));
27        int    day = (int)SvIV(ST(1));
28        long   year = (long)SvIV(ST(2));
29        long   RETVAL;
30               long jul;
23               double t1, t2;
24               t1 = 7 * floor((year + floor(month * 9)/12)/4);
25               t2 = floor(275 * month / 9);
26               jul = (long) (floor(367 * year - t1 + t2 + day + 1721013.5));
27               }
28               RETVAL = jul;
29        ST(0) = sv_newmortal();
30        sv_setiv(ST(0), (IV)RETVAL);
31     }
32     XSRETURN(1);
33 }
34
35 #ifdef _ _cplusplus
36 extern "C"
37 #endif
38 XS(boot_Julian)
39 {
40     dXSARGS;
41     char* file = _ _FILE_ _;
42
43         newXS("Julian::JulianDay", XS_Julian_JulianDay, file);
44     ST(0) = &sv_yes;
45     XSRETURN(1);
46 }

The meaning of keywords such as newXS, SvIV, and so on in the output C file are explained in Chapter 25, "Perl Internal Files and Structures." However, for the moment, concentrate on the compiler itself and how it expects input.

Input and Output Parameters to Functions

The functions compiled by the xsubpp compiler are referred to as XSUB. You specify the parameters that are passed into the XSUB just after you declare the function return value and name. The list of parameters looks very C-like, but the lines must be indented by a tab stop, and each line should not have an ending semicolon.

The list of output parameters occurs after the OUTPUT: directive. The default value returned is RETVAL. The use of RETVAL tells Perl that you want to send this value back as the return value of the XSUB function. You still have set RETVAL to something. You can also specify which variables used in the XSUB function should be placed into the respective Perl variables that are passed in.

The typemap File

The xsubpp compiler uses rules to convert from Perl's internal data types to C's data types. These rules are stored in the typemap file. The rules in typemap contain mappings for converting ints, unsigned ints, and so on into Perl scalars. Arrays are mapped to char** or void pointers, and so on. The typemap file with all the mappings is located in the ExtUtils directory under the Perl installation.

The typemap file is split into three sections. The first section is a mapping of various C data types into a tag value. The second section is for converting input parameters to C, and the third is for outputting parameters from C to Perl.

Take a look at Listing 27.9 again. Note the SvIV for the month declaration. Now look in the typemap file for the declaration of int. You'll see it defined as T_IV. Now go to the second INPUT part in the file to see how T_IV is mapped for input. You'll see the following lines, which map the integer from the Perl variable:

T_IV
     $var = ($type)SvIV($arg);

Similarly, in the OUTPUT section of the typemap file, you'll see the following lines to generate a returned value. This fragment places an integer into the ST array (which is indexed from 0 on up for all the incoming and outgoing arguments of a function):

T_IV
     sv_setif($arg, (IV)$var);

Tip
With C pointers, the asterisk indirection operator (*) defines the type foo pointer. When using the address of a variable, the ampersand (&) is considered part of the variable, and you should use a pointer type.

If you forgot to create the typemap file, you might see output that looks like this:

Error: 'const char *' not in typemap in Julian.xs, line XXX

This error means that you have used a C data type that xsubpp doesn't know how to convert between Perl and C. The solution is to create a custom typemap file that tells xsubpp how to do the conversions.

You can define your own typemap entries if you find certain parameters in the file that you cannot find in the existing typemap file. For example, the type double is understood by Perl, but not double *. In this case you have to make an entry in the typemap file to convert the pointer to double to something Perl will understand. Try a void pointer.

The bootstrap Function

All Perl extensions require a Perl module with a call to the bootstrap function, which loads the extension into Perl. The module's functions are two-fold: Export all the extension's functions and global references to variables to the Perl script using the extension, and load the XSUBs into Perl using dynamic linking. Thus, you require two modules: the Exporter to export your functions and the DynaLoader for dynamic loading. See the following example for the Julian package:

package Julian;     # my package

require Exporter;   # <--- so that I can export functions
require DynaLoader; # <--- for loading extensions

@ISA = qw(Exporter DynaLoader); #inherit functionality.

@EXPORT = qw( JulianDay );  # to reduce use of Julian:: in scripts.

bootstrap Julian;   # Load the extension in to Perl
1;

Passing Arguments

Parameters are passed into an XSUB function via an argument stack. The same stack is used to store the XSUB's return value. All Perl functions are stack oriented and use indexes in their own stack to access their variables.

The stacks are organized bottom up and can be indexed using the ST(index) macro. The first position on that stack that belongs to the active function is referred to as index 0 for that function. The positions on the stack are referred to as ST(0) for the first item, ST(1) for the next, and so on. The incoming parameters and outgoing return values for an XSUB are always position 0. Parameters are pushed left to right.

Caution
ST(x) is a macro. Be careful not to do something like this when using it: ST(x++). Depending on how the macro is defined, you may increment x more than once. It won't now, but it just might. Handle macros with care.

The RETVAL Variable and the OUTPUT Section

The OUTPUT section of the xs file is where you place return values. The return value is always ST(0). However, this value will not be set unless the OUTPUT section with RETVAL is defined. You must have the two lines in a function to get it to return a value:

OUTPUT:
RETVAL

You also have to remember to set RETVAL somewhere along the code to have a value to return. The type of RETVAL is the type of the function you declared at the top. So, the JulianDay function has a long RETVAL, whereas the futureValue function has a double RETVAL. For return types of void, the RETVAL variable is not defined and you cannot use it.

Input parameters in an XSUB are normally initialized with their values from the values pushed on the argument stack at the time of the call. Entries in the typemap file are used to map the Perl values into their C counterparts in the XSUB function. You can use code that would be generated by the xsubpp compile directly to gain access to a variable. For example, in the following function, the first argument is accessed via the SvPV map function:

GetDayOfWeek(julianDay,dayOfWeek)
     long julianDay = (long)SvPV(ST(0),na)
     long dayOfWeek = 0

In this example, dayOfWeek is assigned to a value of 0 as the default value. This is done so that if nothing is passed in for dayOfWeek, then it will be set to 0. Defensive programming like this makes the package easier to use.

You can even place assignments in the parameter list, like this:

DayOfWeek(julianDay,dayOfWeek = 0)
       long julianDay = (long)SvPV(ST(0),na)
      long dayOfWeek

The default values set in the parameters may only be a number or a string, not pointers. Also, you can define such values from a right-to-left order in the parameter list. Thus, the following line would cause unspeakable errors from xsubpp:

DayOfWeek(dayOfWeek = 0, julianDay)

To allow the XSUB for DayOfWeek() to have a default date value, you could rearrange the parameters to the XSUB. A Perl program will then be able to call DayOfWeek() with either of the following statements:

$status = DayOfWeek( $julianDay );
$status = DayOfWeek( );   # Force it default to jday of 0

The code in the Julian.xs file would look like the following:

int
DayOfWeek(jday = 0)
     long jday;

     CODE:

     if (jday == 0) {
          RETVAL = 0;
     }
     else      {
          RETVAL = (jday % 7);
     }

     OUTPUT:
     RETVAL

The XSUB code generated for this segment of code would look this:

XS(XS_Julian_DayOfWeek)
{
    dXSARGS;
    if (items < 0 || items > 1)
     croak("Usage: Julian::DayOfWeek(jday = 0)");
    {
     long jday;
     int  RETVAL;

     if (items < 1)
         jday = 0;
     else {
         jday = (long)SvIV(ST(0));
     }
     if (jday == 0) {
          RETVAL = 0;
     }
     else      {
     RETVAL = (jday % 7);
     }

     ST(0) = sv_newmortal();
     sv_setiv(ST(0), (IV)RETVAL);
    }
    XSRETURN(1);
}

In this code fragment, the special variable items tells the routine how many parameters have been passed into the function. The items variable is tested to see how to initialize jday in this example. The sv_newmortal() function is used to clear out the return values for this XSUB function.

The use of ellipses (...) for passing variable-length argument lists is also supported in XSUBs. Your function can easily get the number of arguments passed into it by looking at the special items variable. The items keyword is a reserved variable and the xsubpp compiler supplies items for all XSUBs. Using the items variable lets you accept an unknown number of arguments in your XSUB function.

Keywords

There are several special keywords in the .xs file that can also be used when writing extensions.

The MODULE Keyword

The MODULE keyword is used to start the XS code and to specify the name of the package currently being defined. There is only one MODULE keyword per .xs file. All text before the MODULE keyword is not processed in any way by xsubpp. Do not modify the code before the MODULE keyword. If you have to add code, it will be passed through to the final C file.

Here's the syntax for the MODULE keyword:

MODULE packageName

The packageName is used as the name of the bootstrap function for this module extension. The MODULE keyword is generated for you by xsubpp.

The PACKAGE Keyword

On occasion, you may have more than one package per module. In this case, the PACKAGE keyword is used to indicate which package within the module contains the code that follows. Generally, the name following the PACKAGE keyword is the same as that following the MODULE keyword.

The PACKAGE keyword is used with the MODULE keyword and must follow on the same line as the MODULE keyword. You have to edit the .xs file yourself to make sure which package gets which function.

The CODE: Keyword

The CODE: keyword is used to indicate where the real C code for a function begins. Use just C code until you start a new block with another keyword, such as OUTPUT:. You can use C comments (/*...*/), ampersands, and so on, and they will not be touched by the xsubpp compiler.

xsubpp matches certain C preprocessor directives that are allowed within the CODE: block. It also matches # used for Perl comments. The compiler passes the preprocessor directives that it recognizes through to the final C file untouched and will remove the commented lines. Comments can be added to XSUBs by placing # at the beginning of the line, too. Nested comments are not supported.

Be careful not to make the comment look like a C preprocessor directive! The xsubpp compiler could be confused if a Perl comment begins to look like a C preprocessor directive. The following is a bad idea:

#define a variable

Is the above line a comment or a C statement? I do not know how this will be interpreted.

If you are going to mess with the argument stack, though, you'll want to use the PpcODE keyword, which will be discussed later in the chapter.

The OUTPUT: Keyword

The OUTPUT: keyword specifies the return values from a function. You have seen it used earlier in the case of the RETVAL assignment as a return. The OUTPUT: keyword generates code that does the mapping of the XSUB function's variables back to those in the Perl program calling XSUB. This keyword is used after the code in the CODE: area. The RETVAL variable is not the default return variable in the CODE: area. Only by specifying it after the OUTPUT: keyword are you letting xsubpp know that it's a return variable for this function.

The OUTPUT: keyword also lists the input parameters for use as output variables. This may be necessary when a parameter has been modified within the function and the programmer would like the update to be seen by Perl.

Say that you define a function, which returns the day of the week, given an isFriday() function in the Julian package. The function returns true (1) if the day is a Friday. The day of the week is returned in the second parameter passed to the function. The function is shown in the Julian.xs file as this:

int
IsFriday(jDay, dayOfWeek)
     long jDay
     int dayOfWeek = NO_INIT

     CODE:
          int dw;
          dw = (int)(jDay % 7);
          if (dw == 3) RETVAL = 1;
          else RETVAL = 0;
          dayOfWeek = dw;
     OUTPUT:
     dayOfWeek
     RETVAL

This example uses a NO_INIT keyword to show that the dayOfWeek is an output value. The xsubpp compiler normally generates code to read the values of all function parameters from the argument stack and assign them to C variables upon entry to the function. NO_INIT tells the compiler that these passed parameters are used for output rather than for input and that they are assigned before the function terminates. Thus, the function isFriday() uses the dayOfWeek variable only as an output variable and does not care about its initial contents.

PpcODE: for Returning More Than One Value

The PpcODE: keyword can be used instead of the CODE: keyword. The PpcODE: keyword tells the xsubpp compiler that the code in the XSUBs will be modifying the return stack itself. You'll use the PpcODE: keyword to return lists instead of values. Only void functions allow the use of PpcODE: keywords.

Look at the following section of code from the Finance.pm module and extension. It returns the straight line depreciation list for an asset. The function is called depreciateSL:

void
depreciateSL(p,s,n)
     double p
     double s
     double n

     PpcODE:
     double sum;
     double value[10];
     double dep;
     int i;
     int in;
     in  = (int)n;
     if ((n < 10) && (n > 0))
          {
          EXTEND(sp,in);
          sum  = p;
          printf("value = %f for %d years \n", sum,in);
          dep = (sum - s)/ n;
          for (i=0; i<n; i++)
               {
               sum -= dep;
               printf("value = %f \n", sum);
               value[i] = sum;
               PUSHs(sv_2mortal(newSViv(sum)));
               }
          }

Note the use of the PpcODE variable to show that you will be messing with the stack. The EXTEND(sp,n) macro is used to make room on the argument stack for n return values with the call to EXTEND(sp,n). The PpcODE: directive causes the xsubpp compiler to create a stack pointer called sp, which is conveniently used in the EXTEND() call. The values are then pushed onto the stack with the PUSHs() macro, with the value returned from the sv_2mortal() call to map the value to the stack.

The way to call this function is as follows in Listing 27.10:


Listing 27.10. Using the Julian.xs file.
 1 #!/usr/bin/perl
 2 use Finance;
 3 #
 4 #  Demonstrate the return value back as list from an extension
 5 #
 6 $present = 900;
 7 $time = 5;
 8 $salvage = 70;
 9 print "\n -============= first part ===============- \n";
10 print "First assign it all back to a list";
11 @deplist =  Finance::depreciateSL($present,$salvage,$time);
12 $n = 0;
13 foreach $i (@deplist) {
14     print "\n Year ", $n, " Returned Value = ", $i;
15      $n++;
16 }
17 print "\n -============= second part ===============- \n";
18 print "Now with explicit values \n";
19 ($y1,$y2,$y3,$y4,$y5) =  Finance::depreciateSL($present,$salvage,$time);
20 print "Returned 1 " . $y1 . "\n";
21 print "Returned 2" . $y2 . "\n";
22 print "Returned 3 " . $y3 . "\n";
23 print "Returned 4 " . $y4 . "\n";
24 print "Returned 5 " . $y5 . "\n";

Lines 13 through 15 run a for loop to test different values for a Julian day. Here's the output from this script:

-============= first part ===============-
First assign it all back to a listvalue = 900.000000 for 5 years
value = 734.000000
value = 568.000000
value = 402.000000
value = 236.000000
value = 70.000000

 Year 0 Returned Value = 734
 Year 1 Returned Value = 568
 Year 2 Returned Value = 402
 Year 3 Returned Value = 236
 Year 4 Returned Value = 70
 -============= second part ===============-
Now with explicit values
value = 900.000000 for 5 years
value = 734.000000
value = 568.000000
value = 402.000000
value = 236.000000
value = 70.000000
Returned 1 734
Returned 2568
Returned 3 402
Returned 4 236
Returned 5 70

Returning Undef and Empty Lists

In Perl, there are times when you'll want to return empty or undefined lists. You have to use the PpcODE block, not the CODE: block. To return empty lists, just return nothing. The Perl code sends back a pointer to any empty list.

For return undef, you have to set the value of ST(0) to undef. You can use the sv_newmortal() call to initialize a return value to undef and set it to ST(0).

ST(0) = sv_newmortal();  /* Undefine the return value. */

In later sections of the code, you can use the sv_setnv() function to set the value to something else if you want. Here's the syntax for this call:

sv_setnv(ST(index), value);

Alternatively, you can set the ST(index) value to sv_undef, where a pointer to list in the ST(index) was expected. More than likely, the index value is 0, unless you are returning references to arrays. For example, to return an undefined list, use this statement:

ST(0) = &sv_undef;

Note
Remember that if you have to use these constructs you have to use the PpcODE block, not the CODE: block.

The BOOT: Keyword

The BOOT: keyword is used to add code to the extension's bootstrap function. The bootstrap function is generated by the xsubpp compiler and normally holds the statements necessary to register any XSUBs with Perl. With the BOOT: keyword, the programmer can tell the compiler to add extra statements to the bootstrap function.

This keyword may be used any time after the first MODULE keyword and should appear on a line by itself. The first blank line after the keyword terminates the code block. A sample usage is shown here:

BOOT:
# The following message will be printed when the bootstrap function
# executes.
     printf("LOADING JULIAN\n");

The Listings of Modules

The listing of Julian.xs using the BOOT: and other keywords is shown in Listing 27.11.


Listing 27.11. The Julian.xs file.
 1 #ifdef _ _cplusplus
 2 extern "C" {
 3 #endif
 4 #include "EXTERN.h"
 5 #include "perl.h"
 6 #include "XSUB.h"
 7 #ifdef _ _cplusplus
 8 }
 9 #endif
10
11 MODULE = Julian         PACKAGE = Julian
12
13 BOOT:
14      printf("\n LOADING JULIAN");
15
16 #include <math.h>
17
18 int
19 IsFriday(jDay, dayOfWeek)
20      long jDay
21      int dayOfWeek = NO_INIT
22
23      CODE:
24           int dw;
25           dw = (int)(jDay % 7);
26           if (dw == 3) RETVAL = 1;
27           else RETVAL = 0;
28           dayOfWeek = dw;
29      OUTPUT:
30      dayOfWeek
31      RETVAL
32
33 long
34 JulianDay( month, day, year)
35      int month
36      int day
37      long year
38
39      CODE:
40           long jul;
23 double t1, t2;
24 t1 = 7 * floor((year + floor(month * 9)/12)/4);
25 t2 = floor(275 * month / 9);
26 jul = (long) (floor(367 * year - t1 + t2 + day + 1721013.5));
27
28           RETVAL = jul;
29      OUTPUT:
30      RETVAL
31
32 int
33 DayOfWeek(jday = 0)
34      long jday;
35
36      CODE:
37
38      if (jday == 0) {
39           RETVAL = 0;
40      }
41      else      {
42      RETVAL = (jday % 7);
43      }
44
45      OUTPUT:
46      RETVAL

Summary

Extensions to Perl are written in the XS language. Templates for creating the modules are written by the h2xs program. Module-specific Perl is kept in the *.pm file, and the C code is kept in the .xs file. The xsubpp program parses the .xs file to generate a C file, which in turn is compiled and linked to form the shared library (with an .so extension). xsubpp expects the .xs files to follow the XS language syntax and specification. After compilation, the .so files with the extension code can be moved into your installation directory.

Оставьте свой комментарий !

Ваше имя:
Комментарий:
Оба поля являются обязательными

 Автор  Комментарий к данной статье