I’m currently struggling with a bug in my open source project and it has to do with temporary object lifetimes. I believe the core issue is that I have a Python object that holds a pointer to an underlying C++ object that’s gotten deleted too early. I’ve created a toy SWIG project to try to isolate the issue.
The code: swig_test.tar.gz
To build the project on your machine, just unzip and run make:
    tar xfz swig_test.tar.gz
    cd swig_test
    make
This will build a Python C++ extension called swig_test and install it into your home directory.
The Buffer class:

    struct Buffer
    {
        Buffer();
        Buffer(const Buffer & copy);
        ~Buffer();

        Buffer & operator=(const Buffer & rhs);

        Buffer & operator<<(const Buffer & rhs);
        Buffer & operator<<(double rhs);

        std::string __str__() const;
        std::string __repr__() const;

    private:

        std::vector<double> _data;
        int _id;
    };

The Buffer class is just a container; I’m using operator<< to concatenate data to the end of the Buffer. This is just a toy example illustrating the problem I’m having with SWIG; my actual class does a whole lot more.
The _data member is just a list of numbers that the container holds, and _id is a unique identifier so we can tell which Buffer gets deleted.
The swig_test.i file:

    %module swig_test

    %include "std_string.i"

    %{
    #include "Buffer.hpp"
    #include <iostream>
    %}

    %ignore Buffer::operator=;

    %include "Buffer.hpp"

This is about as basic as it can be.
The go_test.py Python script:

    from swig_test import Buffer


    def zeros(n):
        '''
        Returns a Buffer filled with 'n' zeros.
        '''
        b = Buffer()
        for i in xrange(n):
            b << 0.0
        return b


    def ones(n):
        '''
        Returns a Buffer filled with 'n' ones.
        '''
        b = Buffer()
        for i in xrange(n):
            b << 1.0
        return b


    def main():

        #--------------------------------------------------------------------------
        # This section works as expected

        print "-" * 80
        print "Works as expected:"

        b0 = zeros(3)
        print " b0 = ", b0

        b1 = ones(3)
        print " b1 = ", b1

        y = b0 << b1
        print " b0 = ", b0
        print " y = ", y
        print " b1 = ", b1

        print " repr(b0) = ", repr(b0)
        print " repr(y) = ", repr(y)

        #--------------------------------------------------------------------------
        # Funny things are happening here!

        print "Funny business:"

        b2 = zeros(3) << ones(3)
        print " repr(b2) = ", repr(b2)

        b3 = zeros(3) << 4.0
        print " repr(b3) = ", repr(b3)


    if __name__ == "__main__":
        main()

Running make also executes this test script, which illustrates the issue I’m running into:

    python go_test.py
    --------------------------------------------------------------------------------
    Works as expected:
     b0 = Buffer(0, 0, 0, )
     b1 = Buffer(1, 1, 1, )
     b0 = Buffer(0, 0, 0, 1, 1, 1, )
     y = Buffer(0, 0, 0, 1, 1, 1, )
     b1 = Buffer(1, 1, 1, )
     repr(b0) = Buffer(id = 0, vector at 0x020bf450, data at 0x020aeb30, size = 6)
     repr(y) = Buffer(id = 0, vector at 0x020bf450, data at 0x020aeb30, size = 6)
    Funny business:
    Deleting Buffer(id = 2)
    Deleting Buffer(id = 3)
     repr(b2) = Buffer(id = 2, vector at 0x020bf790, data at 0x00, size = 4257068)
    Deleting Buffer(id = 4)
     repr(b3) = Buffer(id = 4, vector at 0x02037040, data at 0x0204a4e0, size = 6)
    Deleting Buffer(id = 0)
    Deleting Buffer(id = 1)

The ‘Deleting Buffer(id = X)’ lines are printed from inside the C++ code, so we can see that in the ‘Funny business’ section the C++ Buffer objects are deleted too early! The Python objects ‘b2’ and ‘b3’ should be holding references to the C++ Buffer objects with id=2 and id=4, yet both are deleted before repr() is called, and b2 even reports a null data pointer and a garbage size.
This summarizes the issue I’m having.
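One way to picture what may be going on is a small pure Python model of proxy ownership. This is only a sketch, and it rests on an assumption that nothing above verifies: that SWIG wraps the Buffer & returned by operator<< in a brand-new proxy object that does not own the C++ object, while the temporary proxy returned by zeros(3) does own it. CxxBuffer, Proxy, and the owns flag are hypothetical names invented for this illustration; they are not part of SWIG or of swig_test. The sketch is written for Python 3 and relies on CPython’s reference counting to run __del__ promptly.

```python
class CxxBuffer:
    """Stand-in for the C++ Buffer (hypothetical, for illustration only)."""

    def __init__(self, ident):
        self.ident = ident
        self.data = []
        self.alive = True  # flips to False once the "C++ object" is deleted


class Proxy:
    """Hypothetical model of a proxy: a pointer plus an ownership flag."""

    log = []  # records simulated destructor calls

    def __init__(self, target, owns):
        self.target = target
        self.owns = owns

    def __lshift__(self, value):
        self.target.data.append(value)
        # Assumption: the returned reference is wrapped in a NEW proxy
        # that does not own the underlying object.
        return Proxy(self.target, owns=False)

    def __del__(self):
        # Only an owning proxy destroys the underlying object.
        if self.owns:
            self.target.alive = False
            type(self).log.append("Deleting Buffer(id = %d)" % self.target.ident)


def zeros(n, ident):
    b = Proxy(CxxBuffer(ident), owns=True)
    for _ in range(n):
        b << 0.0
    return b


# The owning temporary from zeros() is dropped as soon as << returns,
# so b2 is left pointing at an object that has already been "deleted".
b2 = zeros(3, ident=2) << 4.0
print(Proxy.log)
print(b2.target.alive)
```

Under CPython the log holds a single 'Deleting Buffer(id = 2)' entry and b2.target.alive is False, which mirrors the ‘Funny business’ output: the name b2 survives, but the object it points at does not.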
Pure Python Implementation
I’ve been searching the web to learn what’s going on here. I found this Stack Overflow question and thought it might be related, so I’ve implemented a pure Python version of Buffer in the script go_test_pure.py. Here’s its output:

    python go_test_pure.py
    --------------------------------------------------------------------------------
    Works as expected:
     b0 = Buffer(0.0, 0.0, 0.0, )
     b1 = Buffer(1.0, 1.0, 1.0, )
     b0 = Buffer(0.0, 0.0, 0.0, 1.0, 1.0, 1.0, )
     y = Buffer(0.0, 0.0, 0.0, 1.0, 1.0, 1.0, )
     b1 = Buffer(1.0, 1.0, 1.0, )
     repr(b0) = Buffer(id = 0, data at 0x7FFEBD772CB0, size = 6)
     repr(y) = Buffer(id = 0, data at 0x7FFEBD772CB0, size = 6)
    No funny business:
    Deleting Buffer(id = 3)
     repr(b2) = Buffer(id = 2, data at 0x7FFEBD7AB050, size = 6)
     repr(b3) = Buffer(id = 4, data at 0x7FFEBD772CF8, size = 4)
    Deleting Buffer(id = 1)
    Deleting Buffer(id = 0)
    Deleting Buffer(id = 2)
    Deleting Buffer(id = 4)

Here, however, everything works as expected. Notice that the Buffers with ids 2 and 4 are deleted only at the very end, after b2 and b3 have been printed, which is the behavior I want from the SWIG version.
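go_test_pure.py itself is not listed here, so the following is only a guess at its shape: a minimal Python 3 sketch of a pure Python Buffer whose << returns self. The id counter and the deleted list are assumptions added for illustration. Because the caller receives the very same Python object, the temporary on the left-hand side stays alive after the expression, and only the right-hand temporary is collected.

```python
import itertools


class Buffer:
    """Minimal pure Python stand-in for Buffer (a sketch, not go_test_pure.py)."""

    _next_id = itertools.count()  # hypothetical id generator
    deleted = []                  # ids of Buffers whose __del__ has run

    def __init__(self):
        self._data = []
        self._id = next(type(self)._next_id)

    def __lshift__(self, rhs):
        if isinstance(rhs, Buffer):
            self._data.extend(rhs._data)
        else:
            self._data.append(float(rhs))
        # Returning self hands the caller a reference to the SAME object.
        return self

    def __del__(self):
        type(self).deleted.append(self._id)


def zeros(n):
    b = Buffer()
    for _ in range(n):
        b << 0.0
    return b


def ones(n):
    b = Buffer()
    for _ in range(n):
        b << 1.0
    return b


# The left temporary gets id 0 and the right temporary gets id 1.
b2 = zeros(3) << ones(3)
print(b2._id, len(b2._data), Buffer.deleted)
```

In this sketch only the right-hand temporary (id 1) is deleted right after the expression, while b2 keeps the left-hand Buffer (id 0) and its six values alive.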
Experiment 1: No Destructors
The first experiment I tried was to explicitly turn off destructors in the SWIG file by using %nodefaultdtor and ignoring the C++ destructor:

    %module swig_test

    %include "std_string.i"

    %{
    #include "Buffer.hpp"
    #include <iostream>
    %}

    %ignore Buffer::operator=;

    %ignore Buffer::~Buffer;  // ignoring C++ destructor

    %nodefaultdtor Buffer;    // don't generate a destructor

    %include "Buffer.hpp"

And here’s the output:

    python go_test.py
    --------------------------------------------------------------------------------
    Works as expected:
     b0 = Buffer(0, 0, 0, )
     b1 = Buffer(1, 1, 1, )
     b0 = Buffer(0, 0, 0, 1, 1, 1, )
     y = Buffer(0, 0, 0, 1, 1, 1, )
     b1 = Buffer(1, 1, 1, )
     repr(b0) = Buffer(id = 0, vector at 0x02865fa0, data at 0x02855b30, size = 6)
     repr(y) = Buffer(id = 0, vector at 0x02865fa0, data at 0x02855b30, size = 6)
    Funny business:
    swig/python detected a memory leak of type 'Buffer *', no destructor found.
    swig/python detected a memory leak of type 'Buffer *', no destructor found.
     repr(b2) = Buffer(id = 2, vector at 0x028662e0, data at 0x028119f0, size = 6)
    swig/python detected a memory leak of type 'Buffer *', no destructor found.
     repr(b3) = Buffer(id = 4, vector at 0x0281d930, data at 0x027f1520, size = 4)
    swig/python detected a memory leak of type 'Buffer *', no destructor found.
    swig/python detected a memory leak of type 'Buffer *', no destructor found.

Now the Python objects b2, b3 are pointing to ids 2, 4 and they are valid. But this approach leaks memory, so this is not a solution.
Experiment 2: Using A Typemap
My next experiment involves using a typemap like so:

    %module swig_test

    %include "std_string.i"

    %{
    #include "Buffer.hpp"
    #include <iostream>
    %}

    %ignore Buffer::operator=;

    %typemap(out) Buffer &
    {
        // TYPEMAP
        $result = $self;
        // TYPEMAP
    }

    %include "Buffer.hpp"

If you compare the generated SWIG wrapper code in swig_test.cpp, you will see that this typemap appears to do the right thing: it returns the original PyObject that points to the C++ Buffer object on the left-hand side of the expression, just as the pure Python method returns self. However, at run time, a SWIG internal conversion error occurs and the code fails with:

    python go_test.py
    --------------------------------------------------------------------------------
    Works as expected:
    Deleting Buffer(id = 0)
    Traceback (most recent call last):
      File "go_test.py", line 71, in <module>
        main()
      File "go_test.py", line 38, in main
        b0 = zeros(3)
      File "go_test.py", line 12, in zeros
        b << 0.0
    TypeError: unsupported operand type(s) for <<: 'Buffer' and 'float'
    make: *** [all] Error 1

Asking The Community For Help
I’ve run out of ideas and now I need some new eyes to take a look. Hopefully a SWIG expert can quickly point out my error and I can move on with writing code.
So I’ve posted this question to Stack Overflow with a link to this blog post. If a solution comes up, I’ll be sure to link to it from here.
A Solution!
After some more searching I came across this thread, which eventually led to a working solution. Using a typemap(out) in combination with a Py_INCREF() did the trick.

    %module swig_test

    %include "std_string.i"

    %{
    #include "Buffer.hpp"
    #include <iostream>
    %}

    %ignore Buffer::operator=;

    %typemap(out) Buffer & operator<<
    {
        if(result) { /* suppress unused warning */ }
        Py_INCREF($self);
        $result = $self;
    }

    %include "Buffer.hpp"

Now I get the desired behavior with no memory leaks!
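Why the Py_INCREF() matters can be sketched from the Python side. My understanding, stated as an assumption rather than something verified here, is that the interpreter treats the object a wrapper function returns as a new reference, whereas $self is only borrowed from the caller; handing it back without incrementing its count would leave the object one reference short. The Python 3 snippet below uses a hypothetical Demo class (not part of swig_test) and sys.getrefcount to show that a << which returns self leaves the caller holding one additional reference to the same object. That additional reference is what the Py_INCREF() accounts for in the C wrapper.

```python
import sys


class Demo:
    """Hypothetical class whose << returns self, like the fixed wrapper."""

    def __lshift__(self, rhs):
        return self


b = Demo()
before = sys.getrefcount(b)

y = b << 1.0  # y is a second name for the same object

after = sys.getrefcount(b)
print(y is b, after - before)
```

Only the difference between the two counts is meaningful here; the absolute values reported by sys.getrefcount include temporary references and vary between interpreter versions.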