Pybind11 demystified. Ch 2: Hack Python built-in modules

Xianbo QIAN
3 min read · Aug 28, 2022

Writing Python extensions is one way of enhancing Python, but you can also fork the repo and build your own version of the Python interpreter.

In practice, you will rarely need to build Python yourself as a user. Still, it is worth playing with the build to get familiar with the CPython code base and to understand what the Python C API is doing underneath. This experience will help you write better extension code and give you more confidence when judging the correctness of your code.

Build Python

git clone git@github.com:python/cpython.git --depth=1
cd cpython
mkdir debug # Built artifacts are stored in the debug folder
cd debug
../configure --with-pydebug --enable-optimizations --with-lto
make -j8 # change 8 to the number of CPUs on your workstation

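Once the build finishes, you will find an executable named python in the current directory (the debug folder). On macOS, where the default file system is case-insensitive, the executable is named python.exe instead.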
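To confirm that the interpreter you just built really is a debug build, a small check like the one below can be run with ./python. It is only a sketch: it relies on sys.gettotalrefcount being present only in debug builds of CPython, and the helper names is_debug_build and build_flags are made up for this example.

```python
import sys
import sysconfig


def is_debug_build():
    """Return True when the running interpreter looks like a debug build."""
    # sys.gettotalrefcount is only defined in CPython debug builds
    # (configured with --with-pydebug).
    return hasattr(sys, "gettotalrefcount")


def build_flags():
    """Collect the configure-time settings recorded by sysconfig."""
    # Both values may be None on platforms that do not record them.
    return {
        "Py_DEBUG": sysconfig.get_config_var("Py_DEBUG"),
        "CONFIG_ARGS": sysconfig.get_config_var("CONFIG_ARGS"),
    }


print("debug build:", is_debug_build())
print(build_flags())
```

Running the same script with your system Python and with ./python makes the difference between the two builds easy to see.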

Update the implementation of a Python built-in module

Now let’s tweak the built-in dict type by changing its documentation string, which is defined in the C variable dictionary_doc.

# You should be in the `debug` folder
# Edit Objects/dictobject.c and update `dictionary_doc`
vim ../Objects/dictobject.c
make -j8 build_all
./python
>>> print(dict.__doc__) <- You should be able to see the new string

Congratulations! You’ve just made the first change to the Python interpreter!
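If you would rather check the change from a script than from the REPL, something like the following could work. It is a minimal sketch; doc_summary is a hypothetical helper, not part of CPython, and it simply prints the first non-empty line of a docstring so you can compare the stock interpreter with your patched one.

```python
import inspect


def doc_summary(obj):
    """Return the first non-empty line of obj's docstring, or '' if none."""
    doc = inspect.getdoc(obj) or ""
    for line in doc.splitlines():
        if line.strip():
            return line.strip()
    return ""


# Run this with both the system python and ./python and compare the output.
print(doc_summary(dict))
```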

Tailoring built-in Python modules

We can also remove specific modules by updating the Setup configuration file.

# You should be in the `debug` folder
# Comment out the last line (pwd module) by prepending #
vi Modules/Setup.bootstrap
# Rebuild Python
make -j8 build_all
# Now let's verify that the pwd module no longer exists
./python
>>> import pwd
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pwd'
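Besides trying the import, you can ask the interpreter which modules are compiled into it. The sketch below uses sys.builtin_module_names and importlib.util.find_spec; the helper module_status and its three labels are assumptions made for this example.

```python
import importlib.util
import sys


def module_status(name):
    """Classify a top-level module as 'builtin', 'available' or 'missing'."""
    # Modules compiled into the interpreter binary are listed here.
    if name in sys.builtin_module_names:
        return "builtin"
    # Anything else importable (shared library or .py file) has a spec.
    if importlib.util.find_spec(name) is not None:
        return "available"
    return "missing"


for name in ("pwd", "xx"):
    print(name, module_status(name))
```

Run with ./python after the rebuild, pwd should no longer be reported as builtin. The xx entry becomes relevant in the next section.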

Adding a new built-in module

The CPython code base already has a comprehensive example of a built-in module in Modules/xxmodule.c (note that this file lives in the source tree, that is, the parent folder, not the debug folder). This example can also be used as a template when writing new modules.

Let’s enable the xx module so you can see what’s going on. I highly recommend reading through its source code: pybind11 is a C++ wrapper around this C API, and later in the series you will see how the wrapper code is designed.

# Still in the `debug` folder
# Enable xxmodule by adding a new line with content `xx xxmodule.c`.
# Make sure to end this file with an empty line
vi Modules/Setup.local
# Rebuild Python
make -j8 build_all
./python
>>> import xx
>>> xx.Str('hi')
'hi'
>>> xx.bug([1,2,3])
1

Quiz: After reading the source code, can you tell what the bug in the bug method is?

Answer: This is a very subtle bug related to reference counting. Check here for an excellent explanation. Below is some example code that triggers the error:

>>> class ValueA:
...     def __del__(self):
...         print("A is removed\n")
...
>>> class ValueB:
...     def __del__(self):
...         print("B is removed\n")
...         global input
...         del input[0]
...
>>> input = [ValueA(), ValueB()]
>>> xx.bug(input)
B is removed
A is removed
<refcnt -2459565876494606883 at 0x7f8b47e61c30> <- weird output
>>> input
[0]
>>> input = [ValueA(), ValueB()]
>>> input
[<__main__.ValueA object at 0x7f8b47b4fbc0>, <__main__.ValueB object at 0x7f8b47b4edb0>]
>>> input[0]
<__main__.ValueA object at 0x7f8b47b4fbc0> <- expected output
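The same hazard can be approximated in pure Python, which may help build intuition before reading the C code. This is only an analogy under stated assumptions: a weak reference stands in for the borrowed pointer that bug holds, a local variable stands in for taking an owned reference with Py_INCREF, and the example relies on CPython freeing objects as soon as their reference count drops to zero. The names Value, replace_while_borrowing and replace_while_owning are made up for this sketch.

```python
import weakref


class Value:
    """Plain object that supports weak references."""


def replace_while_borrowing(items):
    # A weak reference does not keep the object alive, much like a
    # borrowed pointer to a list item in C.
    borrowed = weakref.ref(items[0])
    items[0] = 0  # the list drops its only reference to the object
    return borrowed() is not None  # is the object still alive?


def replace_while_owning(items):
    # Binding a local name takes a strong reference, the Python-level
    # counterpart of calling Py_INCREF on the item.
    owned = items[0]
    probe = weakref.ref(owned)
    items[0] = 0
    return probe() is not None  # `owned` keeps the object alive


print(replace_while_borrowing([Value()]))
print(replace_while_owning([Value()]))
```

On CPython the first call reports that the object is gone and the second that it survived. In C there is no safe weak reference to fall back on: the borrowed pointer simply dangles, which is why the output above shows a garbage refcnt value.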

If you’re interested, there are other examples in the same folder, such as xxlimited.c and xxsubtype.c, for your reference.

>> Chapter 3: Your first extension
