
Using Metaclasses

type()

The biggest difference between dynamic and static languages lies in how functions and classes are defined: in a dynamic language they are not fixed at compile time but created dynamically at runtime.

For example, to define a Hello class, create a hello.py module:

python
class Hello(object):
    def hello(self, name='world'):
        print('Hello, %s.' % name)

When the Python interpreter loads the hello module, it executes all the statements sequentially, resulting in the dynamic creation of a Hello class object. Here's how to test it:

python
>>> from hello import Hello
>>> h = Hello()
>>> h.hello()
Hello, world.
>>> print(type(Hello))
<class 'type'>
>>> print(type(h))
<class 'hello.Hello'>

The type() function can be used to check the type of a type or variable. Hello is a class, so its type is type, while h is an instance, so its type is the Hello class.

We said that classes are created dynamically at runtime; the mechanism that creates them is the type() function.

The type() function can both return the type of an object and create new types. For example, you can create the Hello class using the type() function without using the class Hello(object)... syntax:

python
>>> def fn(self, name='world'):  # First, define a function
...     print('Hello, %s.' % name)
...
>>> Hello = type('Hello', (object,), dict(hello=fn))  # Create Hello class
>>> h = Hello()
>>> h.hello()
Hello, world.
>>> print(type(Hello))
<class 'type'>
>>> print(type(h))
<class '__main__.Hello'>

To create a class object, the type() function takes three arguments in sequence:

  1. Class name: The name of the class.
  2. Parent classes tuple: The tuple of parent classes from which the class inherits. Note that Python supports multiple inheritance; if there is only one parent class, remember to include the comma in the tuple.
  3. Attributes dictionary: A dictionary that binds method names to functions. Here, we bind the function fn to the method name hello.
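As a small illustration of the three arguments, the sketch below builds a class with two parent classes and an attribute dictionary holding both a method and a plain value. The names Greeter, Shouter, and LoudGreeter are hypothetical and chosen only for this example.

```python
# Hypothetical parent classes used only for this illustration.
class Greeter(object):
    def greet(self, name='world'):
        return 'Hello, %s.' % name


class Shouter(object):
    def shout(self, text):
        return text.upper()


def loud_greet(self, name='world'):
    # Combine the behavior inherited from both parents.
    return self.shout(self.greet(name))


# 1. class name, 2. tuple of parent classes, 3. attribute dictionary
LoudGreeter = type(
    'LoudGreeter',
    (Greeter, Shouter),
    dict(loud_greet=loud_greet, punctuation='!'),
)

g = LoudGreeter()
print(LoudGreeter.__name__)   # LoudGreeter
print(g.loud_greet('type'))   # HELLO, TYPE.
print(g.punctuation)          # !
```

The attribute dictionary is not limited to functions: any value placed in it becomes a class attribute, just as if it had been assigned in the body of a class statement.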

Classes created with the type() function behave the same as those defined with the class statement. This is because when the Python interpreter encounters a class definition, it executes the class body to collect the attributes and then, by default, calls type() with the name, the parent classes, and those attributes to create the class.

In normal circumstances, we use the class Xxx... syntax to define classes. However, the type() function also allows us to dynamically create classes, meaning that dynamic languages inherently support the creation of classes at runtime. In contrast, static languages require constructing source code strings and invoking a compiler or using tools to generate bytecode to create classes at runtime, which is inherently more complex.
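To make runtime class creation concrete, here is a minimal sketch of a factory that builds a class from data that is only known while the program is running. The function make_record_class and the field names are assumptions made for this example; they are not part of any library.

```python
def make_record_class(class_name, field_names):
    """Build a simple class whose instances store the given fields."""

    def __init__(self, **kw):
        for field in field_names:
            # Fields that are not supplied default to None.
            setattr(self, field, kw.get(field))

    def __repr__(self):
        pairs = ', '.join('%s=%r' % (f, getattr(self, f)) for f in field_names)
        return '%s(%s)' % (class_name, pairs)

    attrs = dict(__init__=__init__, __repr__=__repr__, fields=tuple(field_names))
    return type(class_name, (object,), attrs)


# The class name and fields could just as well come from a config file.
Point = make_record_class('Point', ['x', 'y'])
p = Point(x=1, y=2)
print(p)             # Point(x=1, y=2)
print(Point.fields)  # ('x', 'y')
```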

Metaclasses

Besides using the type() function to dynamically create classes, you can also control class creation behavior using metaclasses.

A metaclass can be explained simply as follows:

  • Once you have defined a class, you can create instances from it. By the same logic, to create a class you first need something that defines it: a metaclass.
  • In other words: define a metaclass first, then create classes based on that metaclass, and finally create instances from those classes.

Thus, metaclasses allow you to create or modify classes. In other words, you can think of classes as "instances" created by metaclasses.
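The idea that classes are instances of metaclasses can be checked directly. In the sketch below, Plain, Meta, and Special are hypothetical names; Meta does nothing and exists only so that there is a metaclass other than type to look at.

```python
class Plain(object):
    pass


class Meta(type):
    """A do-nothing metaclass, used only for illustration."""


class Special(object, metaclass=Meta):
    pass


# Every class is an instance of some metaclass; the default one is type.
print(isinstance(Plain, type))      # True
print(type(Special) is Meta)        # True
print(isinstance(Special, Meta))    # True
# Instances of Special are ordinary objects, not classes.
print(isinstance(Special(), Meta))  # False
```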

Metaclasses are among the most difficult and magical features of Python's object-oriented programming. Under normal circumstances you will not need them, so the following content may be hard to follow and is generally not necessary for everyday use.

Let's look at a simple example where a metaclass adds an add method to our custom MyList class:

Define ListMetaclass. By convention, metaclass names end with Metaclass to make it clear that they are metaclasses:

python
# Metaclass must inherit from `type` since it is a template for classes
class ListMetaclass(type):
    def __new__(cls, name, bases, attrs):
        attrs['add'] = lambda self, value: self.append(value)
        return type.__new__(cls, name, bases, attrs)

With ListMetaclass defined, when creating a new class, you need to specify that it uses ListMetaclass by passing the keyword argument metaclass:

python
class MyList(list, metaclass=ListMetaclass):
    pass

When you pass the metaclass keyword argument, the magic takes effect. It instructs the Python interpreter to use ListMetaclass.__new__() to create the MyList class. Inside __new__(), we can modify the class definition, for example by adding new methods, before the class is created and returned.

The __new__() method receives the following parameters in order:

  1. cls: The metaclass itself (here, ListMetaclass); the class being created is the object that __new__() returns.
  2. name: The name of the class.
  3. bases: A tuple of the parent classes the class inherits from.
  4. attrs: A dictionary of the class's attributes.
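One way to see what __new__() actually receives is to record the arguments on the class it builds. The following is a minimal sketch; TracingMetaclass, Stack, and the trace attribute are hypothetical names introduced for this example.

```python
class TracingMetaclass(type):
    """Hypothetical metaclass that records the arguments passed to __new__()."""

    def __new__(cls, name, bases, attrs):
        new_class = type.__new__(cls, name, bases, attrs)
        # Store what was received so it can be inspected afterwards.
        new_class.trace = {
            'cls': cls,
            'name': name,
            'bases': bases,
            # Skip entries such as __module__ that Python adds itself.
            'attr_names': sorted(k for k in attrs if not k.startswith('__')),
        }
        return new_class


class Stack(list, metaclass=TracingMetaclass):
    limit = 10

    def push(self, value):
        self.append(value)


print(Stack.trace['cls'] is TracingMetaclass)  # True
print(Stack.trace['name'])                     # Stack
print(Stack.trace['bases'] == (list,))         # True
print(Stack.trace['attr_names'])               # ['limit', 'push']
```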

Test whether MyList can call the add() method:

python
>>> L = MyList()
>>> L.add(1)
>>> L
[1]

In contrast, a regular list does not have an add() method:

python
>>> L2 = list()
>>> L2.add(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'add'

What's the point of dynamic modification? Directly writing the add() method within the MyList class would be simpler, right? Under normal circumstances, you should indeed write it directly. Using a metaclass to modify the class is overkill.

However, there are situations where you need to modify class definitions using metaclasses. An example of this is Object-Relational Mapping (ORM).

ORM Example

ORM stands for "Object Relational Mapping," which maps a row in a relational database to an object, meaning one class corresponds to one table. This makes coding simpler as you don't need to directly manipulate SQL statements.

In an ORM framework, the model classes cannot be written in advance: only the user knows the table structure, so the framework has to build each class dynamically from the user's definition.

Let's attempt to write an ORM framework based on the following usage interface. For example, if a user wants to define a User class to operate on the corresponding database table User, we expect them to write code like this:

python
class User(Model):
    # Define class attributes mapping to table columns:
    id = IntegerField('id')
    name = StringField('username')
    email = StringField('email')
    password = StringField('password')

# Create an instance:
u = User(id=12345, name='Michael', email='test@orm.org', password='my-pwd')
# Save to the database:
u.save()

Here, the parent class Model and the attribute types StringField and IntegerField are provided by the ORM framework. The remaining methods, such as save(), are all supplied by the parent class Model. Although writing the metaclass is complex, the ORM is exceptionally simple for its users.

Now, let's implement this ORM according to the above interface.

Step 1: Define the Field Class

First, define the Field class, which is responsible for storing the field name and field type of a database table:

python
class Field(object):

    def __init__(self, name, column_type):
        self.name = name
        self.column_type = column_type

    def __str__(self):
        return '<%s:%s>' % (self.__class__.__name__, self.name)

Step 2: Define Specific Field Types

Building upon the Field class, further define various field types like StringField and IntegerField:

python
class StringField(Field):
    def __init__(self, name):
        super(StringField, self).__init__(name, 'varchar(100)')

class IntegerField(Field):
    def __init__(self, name):
        super(IntegerField, self).__init__(name, 'bigint')

Step 3: Define the ModelMetaclass

Next, write the most complex part—the ModelMetaclass:

python
class ModelMetaclass(type):
    def __new__(cls, name, bases, attrs):
        if name == 'Model':
            return type.__new__(cls, name, bases, attrs)
        print('Found model: %s' % name)
        mappings = dict()
        for k, v in attrs.items():
            if isinstance(v, Field):
                print('Found mapping: %s ==> %s' % (k, v))
                mappings[k] = v
        for k in mappings.keys():
            attrs.pop(k)
        attrs['__mappings__'] = mappings  # Save the attribute to column mapping
        attrs['__table__'] = name  # Assume table name is the same as class name
        return type.__new__(cls, name, bases, attrs)

Step 4: Define the Model Base Class

Define the base class Model that uses the ModelMetaclass:

python
class Model(dict, metaclass=ModelMetaclass):
    def __init__(self, **kw):
        super(Model, self).__init__(**kw)

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            # Report the actual class name (e.g. User) rather than a fixed 'Model'
            raise AttributeError("'%s' object has no attribute '%s'" % (type(self).__name__, key))

    def __setattr__(self, key, value):
        self[key] = value

    def save(self):
        fields = []
        params = []
        args = []
        for k, v in self.__mappings__.items():
            fields.append(v.name)
            params.append('?')
            args.append(getattr(self, k, None))
        sql = 'INSERT INTO %s (%s) VALUES (%s)' % (self.__table__, ','.join(fields), ','.join(params))
        print('SQL: %s' % sql)
        print('ARGS: %s' % str(args))

Step 5: Define the User Class and Test the ORM

When a user defines a User class that inherits from Model, the Python interpreter first looks for a metaclass in the User class definition. If not found, it continues to look in the parent Model class, finds ModelMetaclass, and uses it to create the User class. This means that the metaclass can be implicitly inherited by subclasses, but the subclasses themselves remain unaware of it.
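The implicit inheritance of a metaclass can be observed with a small sketch. BaseMetaclass, Base, Child, and GrandChild are hypothetical names; the metaclass simply records the name of every class it creates.

```python
class BaseMetaclass(type):
    """Hypothetical metaclass that collects the names of classes it creates."""
    created = []

    def __new__(cls, name, bases, attrs):
        cls.created.append(name)
        return type.__new__(cls, name, bases, attrs)


class Base(object, metaclass=BaseMetaclass):
    pass


class Child(Base):          # no metaclass keyword here
    pass


class GrandChild(Child):    # nor here
    pass


print(type(Child) is BaseMetaclass)       # True
print(type(GrandChild) is BaseMetaclass)  # True
print(BaseMetaclass.created)              # ['Base', 'Child', 'GrandChild']
```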

The ModelMetaclass performs several tasks:

  1. Skip the Model base class itself, leaving it unmodified.
  2. Find all Field attributes in the current class (e.g., User) and save them to a __mappings__ dictionary while removing them from the class attributes. Otherwise the Field objects left on the class would shadow the values stored in the instance, because __getattr__() is only called when normal attribute lookup fails; u.id would then return the Field rather than 12345.
  3. Save the table name to __table__, simplifying it by assuming the table name is the same as the class name.
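The effect of leaving a Field behind as a class attribute can be reproduced in a stripped-down sketch. Python calls __getattr__() only when normal attribute lookup fails, so a class attribute with the same name as a dictionary key is found first. Record and ShadowedRecord are hypothetical stand-ins for Model and a Model subclass, and the string stands in for a Field object.

```python
class Record(dict):
    """Minimal stand-in for Model: attribute access falls back to dict keys."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)


class ShadowedRecord(Record):
    # Stand-in for a Field object left behind as a class attribute.
    id = '<IntegerField:id>'


clean = Record(id=12345)
shadowed = ShadowedRecord(id=12345)

print(clean.id)        # 12345
print(shadowed.id)     # <IntegerField:id>
print(shadowed['id'])  # 12345
```

The value is still stored in the dictionary in both cases; only attribute access is affected, which is exactly what save() relies on when it calls getattr().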

In the Model class, you can define various database operation methods like save(), delete(), find(), update(), etc.

We've implemented the save() method, which prints the executable SQL statement and the list of parameters.

Let's write some code to test it:

python
u = User(id=12345, name='Michael', email='test@orm.org', password='my-pwd')
u.save()

On Python 3.7 and later, where dictionaries preserve insertion order, the output will be:

Found model: User
Found mapping: id ==> <IntegerField:id>
Found mapping: name ==> <StringField:username>
Found mapping: email ==> <StringField:email>
Found mapping: password ==> <StringField:password>
SQL: INSERT INTO User (id,username,email,password) VALUES (?,?,?,?)
ARGS: [12345, 'Michael', 'test@orm.org', 'my-pwd']

As you can see, save() prints a valid SQL statement and the matching parameter list. To complete the functionality, you would connect to the database and execute the SQL statement with those parameters.
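One possible way to take that last step is sketched here with Python's built-in sqlite3 module, which accepts the same ? placeholders. The helper run_insert, the in-memory database, and the table layout are all assumptions made for this example; a real ORM would manage connections and schemas differently.

```python
import sqlite3


def run_insert(conn, table, columns, args):
    """Build the same kind of INSERT statement as save() and execute it."""
    placeholders = ','.join('?' for _ in columns)
    sql = 'INSERT INTO %s (%s) VALUES (%s)' % (table, ','.join(columns), placeholders)
    # The values are passed separately so the driver handles quoting.
    conn.execute(sql, args)
    conn.commit()
    return sql


# An in-memory database with a hand-written table, for illustration only.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE User (id bigint, username varchar(100), '
    'email varchar(100), password varchar(100))'
)

sql = run_insert(
    conn,
    'User',
    ['id', 'username', 'email', 'password'],
    [12345, 'Michael', 'test@orm.org', 'my-pwd'],
)
print(sql)
rows = conn.execute('SELECT id, username, email, password FROM User').fetchall()
print(rows)  # [(12345, 'Michael', 'test@orm.org', 'my-pwd')]
conn.close()
```

Inside Model.save(), the equivalent change would be to pass sql and args to a connection's execute() call instead of printing them.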

In fewer than 100 lines of code, we've implemented a concise ORM framework using metaclasses. Isn't that surprisingly simple?

Summary

Metaclasses are highly magical objects in Python that can alter the behavior of class creation. This powerful feature should be used with caution.
