Advantages of Python namedtuple
Python has a dedicated namedtuple container type, yet it remains one of the language’s most underappreciated features.
Namedtuples can be a great alternative to defining a class by hand, and this section also introduces some of their other interesting features.
So what is a namedtuple, and what’s so special about it? A good way to understand namedtuples is to think of them as an extension of the built-in tuple data type.
Python tuples are simple data structures for grouping arbitrary objects. Tuples are also immutable and cannot be modified after creation. Let’s take a look at a simple example:
>>> tup = ('hello', object(), 42)
>>> tup
('hello', <object object at 0x105e76b70>, 42)
>>> tup[2]
42
>>> tup[2] = 23
TypeError: "'tuple' object does not support item assignment"
A disadvantage of simple tuples is that the data stored in them can only be accessed through integer indices. It is impossible to assign names to individual attributes stored in a tuple, making the code less readable.
In addition, a tuple is always an ad hoc structure. It is difficult to ensure that two tuples have the same number of fields and store the same attributes in them. This makes it easy to introduce subtle bugs, for example by mixing up the field order between different tuples.
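To make the field-order problem concrete, here is a minimal sketch. The data and the describe() helper are made up for illustration; the point is only that nothing stops two tuples from disagreeing about what each position means.

```python
# Two plain tuples meant to describe the same kind of record (illustrative data).
car_a = ('red', 3812.4)     # (color, mileage)
car_b = (1200.0, 'blue')    # fields accidentally swapped: (mileage, color)

def describe(car):
    # Relies on the convention that index 0 is the color and index 1 the mileage.
    return f'{car[0]} car with {car[1]} miles'

print(describe(car_a))  # red car with 3812.4 miles
print(describe(car_b))  # 1200.0 car with blue miles (no error, just wrong)
```

Python raises no error for the second call, so the mistake only shows up in the output.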
Advantages of Python namedtuples: Namedtuples in Action
Namedtuples are designed to solve two problems.
First, like regular tuples, namedtuples are immutable containers. Once data is stored in a namedtuple’s top-level attributes, the attributes cannot be updated. All attributes on namedtuple objects adhere to the “write once, read many” principle.
Second, namedtuples are simply tuples with names. Each object stored within them can be accessed using a unique (human-readable) identifier. This eliminates the need to remember integer indices or resort to workarounds like defining integer constants as mnemonics for indices.
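For comparison, the integer-constant workaround mentioned above might look like the following sketch. The constant names COLOR and MILEAGE are hypothetical.

```python
# Hypothetical index constants used as mnemonics for tuple positions.
COLOR = 0
MILEAGE = 1

car = ('red', 3812.4)

print(car[COLOR])    # red
print(car[MILEAGE])  # 3812.4
```

The constants live apart from the data, so nothing ties them to a particular tuple layout, and every new tuple shape needs its own set.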
Let’s take a look at namedtuples:
>>> from collections import namedtuple
>>> Car = namedtuple('Car', 'color mileage')
Namedtuples were first added to the standard library in Python 2.6. To use them, you need to import the collections module. The example above defines a simple Car data type with two fields: color and mileage.
You might be wondering why the string 'Car' is passed as the first argument to the namedtuple factory function in this example.
This argument, called typename in the Python documentation, serves as the name of the newly created class when the namedtuple function is called.
Since the namedtuple function doesn’t know which variable the created class will ultimately be assigned to, it needs to be told explicitly which class name to use. That name then appears in the docstring and the __repr__ that namedtuple generates automatically.
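A small sketch shows where the typename ends up. The variable is deliberately called Vehicle here (a name chosen only for this illustration) to show that it is independent of the class name.

```python
from collections import namedtuple

# The variable name and the typename are independent; here they differ on purpose.
Vehicle = namedtuple('Car', 'color mileage')

print(Vehicle.__name__)         # Car
print(Vehicle.__doc__)          # Car(color, mileage)
print(repr(Vehicle('red', 1)))  # Car(color='red', mileage=1)
```

In practice the two names are usually kept identical to avoid confusion.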
There’s another quirk in this example: why are the fields passed as a single string like 'color mileage'?
The answer is that the namedtuple factory function calls split() on the field name string to parse it into a list of field names. Breaking it down into two steps:
>>> 'color mileage'.split()
['color', 'mileage']
>>> Car = namedtuple('Car', ['color', 'mileage'])
Of course, if you prefer to write it separately, you can also pass in a list with string field names directly. The advantage of using lists is that it’s easier to reformat the code if you need to split it across multiple lines:
>>> Car = namedtuple('Car', [
... 'color',
... 'mileage',
... ])
Regardless of how it’s initialized, you can now use the Car factory function to create new “Car” objects. It behaves as if you had defined a Car class manually and given it a constructor that accepts color and mileage values:
>>> my_car = Car('red', 3812.4)
>>> my_car.color
'red'
>>> my_car.mileage
3812.4
In addition to accessing values stored in a namedtuple by identifier, indexed access is still available. So namedtuples can be used as a replacement for regular tuples:
>>> my_car[0]
'red'
>>> tuple(my_car)
('red', 3812.4)
Tuple unpacking and the * operator for function argument unpacking also work:
>>> color, mileage = my_car
>>> print(color, mileage)
red 3812.4
>>> print(*my_car)
red 3812.4
The automatically generated string representation of the namedtuple object is also quite nice, so you don’t have to write the relevant functions yourself:
>>> my_car
Car(color='red', mileage=3812.4)
Like tuples, namedtuples are immutable. Attempting to overwrite a field will result in an AttributeError exception:
>>> my_car.color = 'blue'
AttributeError: "can't set attribute"
Namedtuple objects are implemented internally as regular Python classes. When it comes to memory usage, they are also “better” than regular classes and just as memory-efficient as regular tuples.
Think of it this way: Namedtuples are a good way to quickly and memory-efficiently define an immutable class in Python.
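One rough way to see where the memory difference comes from is to check for a per-instance __dict__. This sketch assumes CPython; PlainCar is a hypothetical hand-written class used only for comparison. Note that sys.getsizeof is shallow and its results vary by version and platform, so no expected sizes are shown.

```python
import sys
from collections import namedtuple

Car = namedtuple('Car', 'color mileage')

class PlainCar:
    """Hand-written class used only for comparison (hypothetical)."""
    def __init__(self, color, mileage):
        self.color = color
        self.mileage = mileage

nt_car = Car('red', 3812.4)
plain_car = PlainCar('red', 3812.4)

# Namedtuple instances carry no per-instance __dict__; plain instances do.
print(hasattr(nt_car, '__dict__'))     # False
print(hasattr(plain_car, '__dict__'))  # True

# Shallow sizes of the namedtuple and of an equivalent plain tuple.
print(sys.getsizeof(nt_car), sys.getsizeof(tuple(nt_car)))
# Shallow size of the plain instance plus its attribute dictionary.
print(sys.getsizeof(plain_car) + sys.getsizeof(plain_car.__dict__))
```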
Advantages of Python namedtuples: Subclassing namedtuples
Because namedtuples are built on top of regular Python classes, you can also add methods to namedtuple objects. For example, you can extend the namedtuple class just like any other class, adding methods and new attributes to it. Let’s look at an example:
Car = namedtuple('Car', 'color mileage')

class MyCarWithMethods(Car):
    def hexcolor(self):
        if self.color == 'red':
            return '#ff0000'
        else:
            return '#000000'
Now we can create a MyCarWithMethods object and call the hexcolor() method, just as expected:
>>> c = MyCarWithMethods('red', 1234)
>>> c.hexcolor()
'#ff0000'
This approach can be a bit clumsy. It may be worth doing if you want a class with immutable attributes, but it is also easy to run into problems.
For example, because of how namedtuples are structured internally, adding a new immutable field is tricky. The easiest way to create a hierarchy of namedtuples is to use the base tuple’s _fields attribute:
>>> Car = namedtuple('Car', 'color mileage')
>>> ElectricCar = namedtuple(
... 'ElectricCar', Car._fields + ('charge',))
The result is as expected:
>>> ElectricCar('red', 1234, 45.0)
ElectricCar(color='red', mileage=1234, charge=45.0)
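The subclassing pitfall mentioned earlier can be sketched as follows. CarWithExtras is a hypothetical subclass that does not declare __slots__, so its instances get a __dict__ and accept ordinary, mutable attributes that are not real fields.

```python
from collections import namedtuple

Car = namedtuple('Car', 'color mileage')

class CarWithExtras(Car):
    """Hypothetical subclass that does not declare __slots__."""

c = CarWithExtras('red', 1234)

# The subclass instance has a __dict__, so new attributes can be set freely...
c.charge = 45.0
c.charge = 50.0   # ...and changed again: this attribute is mutable.

# The extra attribute is not a field, so helpers and repr() do not know about it.
print(c._fields)  # ('color', 'mileage')
print(c)          # CarWithExtras(color='red', mileage=1234)
```

Building a second namedtuple from _fields, as in the ElectricCar example, keeps every field immutable and visible.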
Advantages of Python namedtuples: Built-in helper methods
In addition to the _fields attribute, each namedtuple instance provides several other useful helper methods. These methods all begin with a single underscore (_). A single underscore typically indicates that a method or attribute is “private” and not part of the stable public interface of a class or module.
However, in namedtuples, underscores have a different meaning. These helper methods and attributes are part of the namedtuple’s public interface, and they begin with a single underscore only to avoid naming conflicts with user-defined tuple fields. So feel free to use them when necessary.
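A short sketch of why the underscore prefix matters: the hypothetical Part type below has fields named make and replace, which would collide with the helpers if those had no prefix. Field names themselves may not start with an underscore, so the two namespaces cannot overlap.

```python
from collections import namedtuple

# Hypothetical type whose fields are named like the helpers, minus the underscore.
Part = namedtuple('Part', 'make replace')

p = Part._make(['Acme', True])    # _make is the helper method
print(p.make)                     # Acme (make is the user-defined field)
print(p._replace(replace=False))  # Part(make='Acme', replace=False)

# Field names may not start with an underscore, so they cannot clash with helpers.
rejected = False
try:
    namedtuple('Bad', '_make')
except ValueError as exc:
    rejected = True
    print('rejected:', exc)
```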
The following describes some scenarios where these namedtuple helper methods can be useful. Let’s start with the _asdict() helper method, which returns the contents of a namedtuple as a dictionary (an OrderedDict before Python 3.8, as in the output below, and a regular dict in later versions):
>>> my_car._asdict()
OrderedDict([('color', 'red'), ('mileage', 3812.4)])
This is handy for avoiding typos in field names when generating JSON output:
>>> import json
>>> json.dumps(my_car._asdict())
'{"color": "red", "mileage": 3812.4}'
Another useful helper is the _replace() method. It creates a shallow copy of a tuple and lets you selectively replace some of its fields:
>>> my_car._replace(color='blue')
Car(color='blue', mileage=3812.4)
Finally, the _make() class method creates a new namedtuple instance from a sequence or iterable:
>>> Car._make(['red', 999])
Car(color='red', mileage=999)
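The three helpers combine naturally. The sketch below uses made-up rows, of the kind a CSV reader or database cursor might produce, and wraps _asdict() in dict() so the printed result looks the same across Python versions.

```python
from collections import namedtuple

Car = namedtuple('Car', 'color mileage')

# Illustrative rows, e.g. as they might come from a CSV reader or a database cursor.
rows = [('red', 3812.4), ('blue', 120.0)]

cars = [Car._make(row) for row in rows]

# _replace returns a new object; the original is left untouched.
repainted = cars[0]._replace(color='green')
print(cars[0])    # Car(color='red', mileage=3812.4)
print(repainted)  # Car(color='green', mileage=3812.4)

# dict() normalizes the result of _asdict() across Python versions.
records = [dict(car._asdict()) for car in cars]
print(records)
```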
Advantages of Python namedtuples: When to use namedtuples
Namedtuples can better organize data, making code cleaner and easier to read.
For example, moving from ad hoc data types such as fixed-format dictionaries to namedtuples expresses the developer’s intent more clearly. Often, when I attempt this refactoring, I magically come up with a better solution to the problem at hand.
Replacing unstructured tuples and dictionaries with namedtuples can also reduce the burden on my colleagues, because namedtuples make the data being passed around somewhat self-documenting.
On the other hand, if namedtuples don’t help me write cleaner, more maintainable code, I’ll try to avoid them. Like many other techniques in this book, overusing namedtuples can have negative consequences.
But when used carefully, namedtuples can undoubtedly make Python code better and more readable.
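As a rough illustration of the dictionary-to-namedtuple refactoring described in this section, consider the sketch below. The data, including the deliberately misspelled key, is hypothetical.

```python
from collections import namedtuple

# Before: a dictionary with an implicit, fixed format (hypothetical data).
car_dict = {'color': 'red', 'milage': 3812.4}  # typo in the key goes unnoticed

# After: the expected fields are stated once, in the type definition.
Car = namedtuple('Car', 'color mileage')

rejected = False
try:
    Car(**car_dict)  # the misspelled key is rejected immediately
except TypeError as exc:
    rejected = True
    print('rejected:', exc)

car = Car(color='red', mileage=3812.4)
print(car.mileage)  # 3812.4
```

The dictionary accepts any key silently, while the namedtuple constructor fails as soon as a field name is wrong or missing.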
Key Takeaways
- collections.namedtuple is a memory-efficient shortcut for manually defining an immutable class in Python.
- Using namedtuples allows you to organize data into a more understandable structure, simplifying your code.
- Namedtuples provide some useful helper methods. Although these methods begin with a single underscore, they are actually part of the public interface and can be used normally.