GULP Wrapper Generator User Manual

$CambiosHeader: gulp2/GulpUserManual.htm,v 1.2 2021/12/22 07:51:13 cambios Exp $
Jonathan W. Greene
gulpinfo@cambioscomputing.com

Copyright 2004-2021 Cambios Computing, LLC

Introduction

GULP generates Python wrappers for a C++ API.

To use GULP you need:

Writing the .hpp Files

Ideally, any .hpp file acceptable to the C++ compiler would be sufficient. However there are a few additional restrictions.

FIX add more information

Supported C++ Types

Wrapped functions and methods may take and return any of the following types:

You can define custom type mappings for other types in the configuration file.

Iterators

If a class has a pair of methods beginXXX() and endXXX() returning iterators of the same type, these will generate a single method itrXXX() in the corresponding Python class that returns a Python iterator.

If the C++ code uses a map iterator like this:


std::map<std::string, Child*>::const_iterator Parent::beginChild();
std::map<std::string, Child*>::const_iterator Parent::endChild();

the Python wrapper could be used like this:

for name, child in parent.itrChild():
    print(name, child)

If the C++ code uses a set iterator like this:


std::set<Child*>::const_iterator Parent::beginChild();
std::set<Child*>::const_iterator Parent::endChild();

the Python wrapper could be used like this:

for child in parent.itrChild():
    print(child)

Providing Memory Management Guidance

You can provide comments in method or function declarations to help GULP manage new/delete of passed and returned objects. For an argument, the comment goes between the type and the parameter name; for a return value, it goes between the return type and the function name, e.g.
Node * /*TAKE*/ getCopy();
void attachNode(Node * /*GIVETO(this)*/ n);

Comments applicable to objects passed as arguments:

Comments applicable to objects returned from methods or functions:

Class Name Mapping

C++ classes whose name has a suffix of "_" or "__" are treated specially.

If there is a base class CtAtom_, its methods are placed on the Python class CtAtom. Such classes are typically Sgen base classes, or other base classes that should be invisibly inherited by the subclass.

If there is a subclass CtPoint__, it is assumed to be a subclass of CtPoint that adds additional methods but no data members. This is typically used when we want to add additional special methods that should show up only in Python (e.g. __getitem__), and aren't part of the standard C++ API. These additional methods will show up on the Python class CtPoint. We call new and delete only for CtPoint, but for other method calls we cast to a CtPoint__.
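The naming rule in this section can be sketched in a few lines of Python. This is only an illustration of the mapping described above, not code taken from Gulp; the helper name python_class_name is hypothetical.

```python
def python_class_name(cpp_name):
    """Return the Python class name that a C++ class name maps to.

    Illustrative only: strips one trailing "__" or "_" suffix, following
    the rule described in this section. Not taken from Gulp's source.
    """
    # Check the longer suffix first so "CtPoint__" does not become "CtPoint_".
    for suffix in ("__", "_"):
        if cpp_name.endswith(suffix):
            return cpp_name[: -len(suffix)]
    return cpp_name


for name in ("CtAtom_", "CtPoint__", "CtPoint"):
    print(name, "->", python_class_name(name))
```

All three names end up on the same kind of target: methods from CtAtom_ appear on CtAtom, and methods from CtPoint__ appear on CtPoint.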

Writing the Configuration File

Special Type Mappings

Gulp maps most C++ things to and from Python automatically. However, if you define special classes or types, you can tell Gulp how to map them as follows. First, list the type in the configuration file with the following syntax:

EnterSpecialType('CtPoint')
EnterSpecialType('CtPoint', 'comment')
EnterSpecialType('CtPoint', 'comment', 'Point') # optional short name; use if name of class has non-letter characters

You'll also need to provide the following in your C++ code:

Namespaces

If extension functions and classes are in a namespace, you must indicate that with a statement:

EnterNamespace('Ct')

Trapping Exceptions

If you want the wrappers to catch a type of C++ exception:

EnterException('CtException', '(const char *)%s')

The first argument is the name of the C++ exception class to catch.

The second argument is a string showing how to convert a reference to such an exception into a const char *.

Additional Include Files

To add include files to the generated wrap.cpp file (but not generate wrappers for things in them):

EnterIncludeFile('ct_gulp.hpp')

Passing or Returning Instances or References (&)

If a C++ function (or method) takes an object instance or reference as an argument, the wrapper will dereference the pointer stored in the Python object accordingly. However, functions that return an object or reference are normally ignored. To allow these:

EnterObject('CtPoint', allowRefs=1)

Ownership is never transferred when passing or returning an object or reference. When passing in a reference, it is assumed that the recipient function will make a copy if necessary. When returning a reference, the caller gets a new, independent copy of the object; it is assumed that the object has a proper copy constructor.

How Gulp Wrappers Work

The module init function, written in C, creates a PyClassObject for each extension class. The appropriate base classes, whether C or Python, are attached. The appropriate methods are attached to the class object's dictionary.

The __init__ method for each extension class calls new with the appropriate constructor arguments to create the corresponding C instance.

The __del__ method for each extension class deletes, if necessary, the C instance.

Memory Model

Any C instance we point to is owned either by us or, directly or indirectly, by another Python instance. In the latter case, keep a ref to the owning PyInstanceObject in the subsidiary PyInstanceObject. This reference will keep the owner alive to avoid prematurely killing the subsidiary C instance.

Of course it is still possible for the subsidiary to be killed on the C side by some method of the parent, e.g. parent.clear(). If all wrappable C classes can derive from a base class Wrappable, we can at least detect this problem and avoid crashing the Python interpreter. Two possibilities:

States of an Instance Object (or None object used as an instance):

Object | Owner | This    | State
None   | N/A   | N/A     | Null instance of any class
Inst   | null  | null    | instance that has been deleted on the C++ side or for some other reason become invalid
Inst   | self  | nonNull | instance owned by self. (Ref count of owner is not incremented.)
Inst   | owner | nonNull | instance owned by another owner. (Ref count of owner is incremented.)
Inst   | null  | nonNull | instance owned by indeterminate owner other than self.

(The last case gives dumb behavior like SWIG with thisown=0: when the PyObject is deleted, the C instance isn't. This is useful for static objects.)

When we create a python instance A to hold an extension instance obtained from python instance B, the owner in A is set to the owner in B, which may or may not be B itself. This avoids keeping alive intermediate python objects which are not otherwise needed.
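The owner-propagation rule can be illustrated with a pure-Python stand-in. This is a sketch under stated assumptions, not Gulp's implementation: the class Wrapper and its attributes this and owner are hypothetical names, and a plain string stands in for the pointer to the C instance.

```python
class Wrapper:
    """Pure-Python stand-in for a wrapped instance (hypothetical, for illustration)."""

    def __init__(self, this, owner=None):
        self.this = this  # stands in for the pointer to the C instance
        # With no explicit owner, the wrapper owns its C instance itself.
        # (The real wrappers do not increment the ref count in this case;
        # here the resulting cycle is simply left to Python's garbage collector.)
        self.owner = self if owner is None else owner

    def owns_itself(self):
        return self.owner is self

    def child(self, this):
        # The new wrapper copies our owner instead of pointing at us,
        # so intermediate wrappers do not have to stay alive.
        return Wrapper(this, owner=self.owner)


root = Wrapper("root C instance")
mid = root.child("mid C instance")    # owned by root
leaf = mid.child("leaf C instance")   # owner copied from mid, so also root

print(leaf.owner is root)  # True
print(leaf.owner is mid)   # False
```

Because leaf refers to root rather than to mid, dropping mid does not affect leaf, while root stays alive for as long as leaf needs it.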

Ownership guidance in comments in C++ code

You can add comments in the C++ code to help instruct Gulp how to handle new/delete of objects returned or passed as arguments to functions or methods.

Note: one thing this scheme doesn't support is the kind of ownership relations an iterator would have. The child objects returned by methods of an iterator are owned by the parent object of the iterator. So the iterator needs to know the parent object. But the iterator can be deleted at any time. Gulp uses special hard-coded iterators to produce this behavior.

Subclasses

Assume S is a subclass of base class B. Method m is implemented differently on each class. Method n is implemented only by B.

Case 1: call s->m. The void * "this" pointer is cast to the class indicated by the Python class of the object. Since the class hierarchy of the Python classes reflects the C++ hierarchy, the correct version of s->m is called.

Case 2: call s->n. The wrapper for B::n gets called, which casts the pointer to a B*, not an S*. If method n calls method m, it is B::m rather than S::m which gets called. But this is the same behavior we would see in C++.

Pickling

Pickle with custom __getstate__:
    If invalid object: raise exception (can't pickle invalid object)
    If owned by another object: raise exception (can't pickle subsidiary object)
    Else: Pickle dictionary and a self-contained representation of C instance.

Unpickle with custom __setstate__:
    Unpickle dictionary.
    If representation is None, it is a null instance.
    If representation is not None, unpickle C instance.
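The pickling rules can be sketched with a pure-Python stand-in. Everything here is an assumption made for illustration: the class PicklableWrapper, its attributes (value, owner, valid, attrs), the choice of pickle.PicklingError as the exception, and the use of owner=None to mean "owned by self" are not taken from Gulp.

```python
import pickle


class PicklableWrapper:
    """Pure-Python stand-in illustrating the pickling rules (hypothetical)."""

    def __init__(self, value=None, owner=None, valid=True):
        self.value = value   # stands in for the C instance's contents
        self.owner = owner   # in this sketch, None means "owned by self"
        self.valid = valid
        self.attrs = {}      # stands in for the instance dictionary

    def __getstate__(self):
        if not self.valid:
            raise pickle.PicklingError("can't pickle invalid object")
        if self.owner is not None:
            raise pickle.PicklingError("can't pickle subsidiary object")
        # Dictionary plus a self-contained representation of the C instance.
        return {"attrs": self.attrs, "repr": self.value}

    def __setstate__(self, state):
        self.attrs = state["attrs"]
        self.owner = None
        self.valid = True
        # A representation of None yields a null instance.
        self.value = state["repr"]


if __name__ == "__main__":
    parent = PicklableWrapper(value=[1, 2, 3])
    parent.attrs["label"] = "top"
    clone = pickle.loads(pickle.dumps(parent))
    print(clone.value, clone.attrs)

    child = PicklableWrapper(value=[1], owner=parent)
    try:
        pickle.dumps(child)
    except pickle.PicklingError as exc:
        print("refused:", exc)
```

Refusing to pickle a subsidiary object avoids writing out a C instance whose lifetime is controlled by a parent that would not be part of the pickle.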

How to Store the Additional Data (Owner and This) in the Python Instance Object?

Before Python 2.2, it was difficult or impossible to use anything other than a PyInstanceObject for an instance. So we store the data in a PyCObject self._this. The pointer to the C instance ("this") is stored in the PyCObject as cobject, and the owning PyInstanceObject (Owner), which may be ourselves, as desc.

In Python 2.2, we can set a __new__ method on the class, which allows us to use other types of objects for instances. This could allow us to store our data in an expanded version of PyInstanceObject in the future.

Sharing C APIs Among Extension Modules

See the section in Python documentation entitled: 1.12 Providing a C API for an Extension Module. We use this mechanism for extension modules to access the gulp library module as well.

Iterators

A Python class can have a method __iter__(self) returning an iterator. Python iterators also support the __iter__ method, which returns the iterator itself (not a fresh iterator). The iterator also has a method next(self) that returns the first item and then, on each later call, the next item. If there is no such item, it raises the StopIteration exception.

In Python 2.2, these methods are used to support the "for x in y" syntax.
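The protocol described above can be sketched in pure Python. Note that in Python 3 the method is spelled __next__ rather than next. The class names ChildIterator and Parent are hypothetical and are not the iterators that Gulp generates.

```python
class ChildIterator:
    """Iterator over a fixed list of items (illustrative stand-in)."""

    def __init__(self, items):
        self._items = items
        self._pos = 0

    def __iter__(self):
        # An iterator returns itself, not a fresh iterator.
        return self

    def __next__(self):  # spelled next(self) in Python 2
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item


class Parent:
    """Container whose __iter__ hands out a fresh iterator on each call."""

    def __init__(self, children):
        self._children = list(children)

    def __iter__(self):
        return ChildIterator(self._children)


parent = Parent(["a", "b", "c"])
for child in parent:
    print(child)
```

The for statement calls __iter__ on parent once, then calls __next__ on the returned iterator until StopIteration is raised.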

Raising Exceptions

Sometimes it may be necessary to have the wrapped C function raise a Python exception. This can be done by calling PyErr_Set*, and then throwing a GulpException. The wrappers will catch the GulpException but not alter the exception set with PyErr_Set*.

Notes On Creating Extension Classes

On Windows

On Windows, Python is generally built with Visual C++; this is how the prebuilt binaries are distributed. With VC++, the /GR flag (which enables run-time type information) is required to use dynamic_cast; otherwise warning C4541 results. The Python distribution is built with the opposite setting, /GR-. To avoid this complication, avoid the use of dynamic_cast.

Background on the Design of Gulp vs Other Alternatives

Why Not Ref Counting Of C Instances?

Requires substantial changes to existing C code, additional memory. Since Python object might refer to subsidiary instance, either ref count on parent instance should be incremented or parent should first check ref count of subsidiary children before deleting itself. Problems with circular references?

SWIG

Swig marks python object as owning the C instance or not. When PyObject is destructed, C instance only deleted if it is owned. Problems can arise if non-owned C object deleted while a Python object is still pointing to it.

Wrappy

"Attribute caching" (sec. 4.3): has something to do with saving owner pointers, but not clear.

"Class wrapping method", "second method" (sec. 4.2): store pointer to corresponding PyObject in C instance. Use this to return same PyObject each time we need one for this C instance. PyObject ref count is incremented an extra time when constructed and decremented by C instance's destructor so PyObject is around for lifetime of C instance. This may not be a good thing if we don't need PyObject!

See http://www.cgl.ucsf.edu/home/gregc/wrappy/ for details.

SIP

Sip uses a shadow class, but attaches methods lazily. When an unknown method is invoked, the built-in function getattro on the ClassObjectType or InstanceObjectType is called. These functions look up the right method and, in the case of the instance version, attach it to the instance. Here are relevant comments from siplib.c:

This is the replacement class getattro function. We want to create a class's methods on demand because creating all of them at the start slows the startup time and consumes too much memory, especially for scripts that use a small subset of a large class library. We used to define a __getattr__ function for each class, but this doesn't work when you want to use the standard technique for calling a base class's method from the same method in a derived class, ie. Base.method(self,...).

We don't cache methods called using class.method(self,...). If we did then, because we use the Python class and instance getattr functions before we see if an attribute is lazy, then in the case that we were looking for a method that is being called using self.method(...) for the first time - and after it has been called using class.method(self,...) - the Python instance getattr function would find the unbound method which would cause an exception. Because the most common calling method is cached, the performance penalty should be negligible.