Before starting this chapter, make sure you have a good understanding of both basic classes and function wrappers (decorators). If you don't feel comfortable with these topics, go back and review them before continuing.
Dataclasses are a special type of class that is used to store data. They are similar to regular classes, but they are more convenient to use, and they have some additional functionality. Since most classes are defined to store data, with similar methods to initialize objects and access (or compare) their data, dataclasses are a useful shortcut for creating classes. For basic purposes, these two examples are equivalent:
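class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        # Only compare with other Person objects
        if not isinstance(other, Person):
            return NotImplemented
        return self.name == other.name and self.age == other.age

    def __repr__(self):
        # !r shows each value using its own repr, e.g. name='John'
        return f"Person(name={self.name!r}, age={self.age!r})"

    def __str__(self):
        return f"{self.name} ({self.age})"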
Compare this with the dataclass version:
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
The second example is much shorter and easier to read. It is also more convenient, since we don't have to write the __init__(), __eq__() and __repr__() methods ourselves: the @dataclass decorator generates them automatically, along with others on request. Note that it does not generate __str__(), so str() and print() fall back to the generated __repr__(), which produces output such as Person(name='John', age=42). We can also define our own methods on the class, as usual.
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

    def greet(self):
        print(f"Hello, my name is {self.name} and I'm {self.age} years old")

person = Person("John", 42)
person.greet()  # Prints: Hello, my name is John and I'm 42 years old
another_person = Person("John", 42)
print(person == another_person)  # Prints: True (thanks to the automatically generated __eq__() method)
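To make the generated methods more concrete, here is a minimal sketch that inspects them and turns on one of the optional ones. The order=True parameter is a standard option of @dataclass; the names alice and bob are just sample data for this illustration.

```python
from dataclasses import dataclass


# order=True asks the decorator to also generate __lt__, __le__, __gt__ and __ge__.
# Instances are then compared field by field, in definition order.
@dataclass(order=True)
class Person:
    name: str
    age: int


alice = Person("Alice", 30)
bob = Person("Bob", 25)

# The generated __repr__ shows each field using its own repr,
# so strings appear with quotes.
print(repr(alice))  # Person(name='Alice', age=30)

# No __str__ is generated, so str() falls back to __repr__.
print(str(alice) == repr(alice))  # True

# "Alice" < "Bob", so the name field decides the ordering here.
print(alice < bob)  # True
print(sorted([bob, alice]))  # [Person(name='Alice', age=30), Person(name='Bob', age=25)]
```

Without order=True, comparing two instances with < raises a TypeError, because only __eq__() is generated by default.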
We won't discuss dataclasses in detail here, but you can read more about them in the official documentation. Later on, we will probably prefer an external library called Pydantic, which is very similar to the built-in dataclasses module but is more powerful and has more features.
Most functions that we write in classes are called "instance methods". This means that they are called on an instance of the class, and they have access to the instance's data. In other words, they need access to self for each object. For example:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greet(self):
        print(f"Hello, my name is {self.name} and I'm {self.age} years old")
However, sometimes we want to write functions that are called on the class itself, instead of on an instance of the class. These are called "class methods", and they are defined with the @classmethod decorator. For example:
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    @classmethod
    def from_dict(cls, data: dict):
        # cls is the class itself, not an instance of the class,
        # which is why it's called "cls" instead of "self"
        return cls(data["name"], data["age"])

    def greet(self):
        print(f"Hello, my name is {self.name} and I'm {self.age} years old")

# This example will allow us to have multiple "constructors" to the same type of objects
# instead of just the default `__init__()` method. For example:
person = Person("John", 42)
person.greet()  # Prints: Hello, my name is John and I'm 42 years old
another_person = Person.from_dict({"name": "John", "age": 42})
another_person.greet()  # Prints: Hello, my name is John and I'm 42 years old
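One reason to build the object with cls(...) rather than naming Person directly is that the constructor keeps working for subclasses. The sketch below illustrates this; Student is a hypothetical subclass introduced only for this example.

```python
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    @classmethod
    def from_dict(cls, data: dict):
        # cls is whichever class the method was called on,
        # so subclasses get instances of their own type.
        return cls(data["name"], data["age"])


# Hypothetical subclass, used only for this illustration.
class Student(Person):
    pass


student = Student.from_dict({"name": "Jane", "age": 20})
print(type(student).__name__)  # Student
print(isinstance(student, Person))  # True
```

Had from_dict() returned Person(...) instead, Student.from_dict() would have produced a plain Person.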
Class methods are useful for more than just building constructors. We may need some functionality that concerns only the class attributes, and we wouldn't want it to accidentally read or modify the instance attributes. For example:
class Person:
    all_people = []

    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age
        Person.all_people.append(self)

    @classmethod
    def count_people(cls):
        return len(cls.all_people)

john = Person("John", 42)
print(Person.count_people())  # Prints: 1
john.all_people = []  # Creates an instance attribute; the class attribute is untouched
print(Person.count_people())  # Prints: 1
print(john.count_people())  # Prints: 1
Notice that even though we assigned a new value to all_people on the instance, the class value of all_people remained the same: the assignment created a separate instance attribute that shadows the class attribute. So even though we called john.count_people(), it still returned the correct value. This is because count_people() is a class method: it receives the class as cls, not the instance, so it reads the class attribute rather than the instance attribute.
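The shadowing described above can be observed directly with the built-in vars() function, which returns the instance's own attribute dictionary. This is a minimal sketch using a simplified Person class (name only), not the exact class from the example.

```python
class Person:
    all_people = []  # class attribute, shared by every instance

    def __init__(self, name: str):
        self.name = name
        Person.all_people.append(self)


john = Person("John")

# Before the assignment, the instance has no attribute of its own,
# so the lookup falls through to the class.
print("all_people" in vars(john))  # False
print(john.all_people is Person.all_people)  # True

# Assigning through the instance creates a new instance attribute
# that shadows the class attribute without touching it.
john.all_people = []
print("all_people" in vars(john))  # True
print(len(john.all_people))  # 0
print(len(Person.all_people))  # 1

# Deleting the instance attribute makes the class attribute visible again.
del john.all_people
print(len(john.all_people))  # 1
```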
Static methods are similar to class methods, but they don't have access to the class or the instance, at all. This is similar to a regular function that isn't defined on the class at all, but that we do want to keep logically in the same place as the class. For example:
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    @staticmethod
    def greet():
        print("Hello, I'm a person")

    def greet_again(self):
        print(f"Hello, my name is {self.name} and I'm {self.age} years old")
In this example, greet() is a static method, and greet_again() is an instance method. Notice that greet() doesn't take self or cls as an argument, so it doesn't have access to the class or the instance (it can take other arguments, just like any other normal function).
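As a sketch of a static method that does take arguments, consider a small helper that logically belongs with the class but needs neither self nor cls. The is_adult() method and its threshold of 18 are assumptions made up for this illustration.

```python
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    @staticmethod
    def is_adult(age: int) -> bool:
        # No self or cls: the result depends only on the argument.
        # The threshold of 18 is an arbitrary choice for this example.
        return age >= 18


# A static method can be called on the class...
print(Person.is_adult(42))  # True

# ...or on an instance; neither is passed to the function.
john = Person("John", 42)
print(john.is_adult(10))  # False
```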
Properties are a special type of attribute that can be accessed like a regular attribute, but that actually calls a function when it's accessed. This allows us to have more control over the attributes, and to add additional functionality to them. For example:
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    @property
    def greeting(self):
        return f"Hello, my name is {self.name} and I'm {self.age} years old"

person = Person("John", 42)
print(person.greeting)  # Prints: Hello, my name is John and I'm 42 years old
Even though greeting is a function, we can access it like a regular attribute, without the parentheses. This is because of the @property decorator. This is useful when we want to add some functionality to an attribute, but we don't want to change the way it's accessed. For example, we may want to count the number of times that the attribute was accessed:
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age
        self._greeting_count = 0  # Private by convention (leading underscore), not meant to be accessed directly

    @property
    def greeting(self):
        self._greeting_count += 1
        return f"Hello, my name is {self.name} and I'm {self.age} years old"

    def get_greeting_count(self):
        return self._greeting_count

person = Person("John", 42)
print(person.greeting)  # Prints: Hello, my name is John and I'm 42 years old
print(person.greeting)  # Prints: Hello, my name is John and I'm 42 years old
print(person.get_greeting_count())  # Prints: 2
However, this kind of property does not allow us to change the attribute. For example, we can't do this:
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age
        self._greeting_count = 0  # Private by convention (leading underscore), not meant to be accessed directly

    @property
    def greeting(self):
        self._greeting_count += 1
        return f"Hello, my name is {self.name} and I'm {self.age} years old"

person = Person("John", 42)
person.greeting = "Hello, I'm John"  # Raises: AttributeError (the property has no setter)
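The failed assignment can be seen without crashing the program by catching the exception. This is a minimal sketch; the assignment_failed variable is a hypothetical name used only to record the outcome.

```python
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    @property
    def greeting(self):
        return f"Hello, my name is {self.name} and I'm {self.age} years old"


person = Person("John", 42)

try:
    person.greeting = "Hello, I'm John"
except AttributeError as error:
    # The exact message differs between Python versions,
    # so only the exception type is relied on here.
    assignment_failed = True
    print(f"Could not assign: {error}")
else:
    assignment_failed = False

print(assignment_failed)  # True
print(person.greeting)  # Hello, my name is John and I'm 42 years old
```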
Sometimes, that will be the intended behavior, because we don't want the ability to change this value. Other times, we may want the option of changing the underlying value, so we need to create a "setter" method. For example:
class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age
        self._greeting_count = 0  # This is a private attribute, it's not meant to be accessed directly

    @property
    def greeting(self):
        self._greeting_count += 1
        return f"Hello, my name is {self.name} and I'm {self.age} years old"

    @greeting.setter
    def greeting(self, value):
        self._greeting_count = 0
        self.name = value

    def get_greeting_count(self):
        return self._greeting_count

person = Person("John", 42)
print(person.greeting)  # Prints: Hello, my name is John and I'm 42 years old
print(person.greeting)  # Prints: Hello, my name is John and I'm 42 years old
print(person.get_greeting_count())  # Prints: 2
person.greeting = "Jane"
print(person.greeting)  # Prints: Hello, my name is Jane and I'm 42 years old
print(person.get_greeting_count())  # Prints: 1
This is useful when we want to allow changing the underlying value while still keeping some control over it. One common example is validation: we may want to make sure that the value is always a string, or always a number, or always a number between 0 and 100. For example:
class Person:
    def __init__(self):
        self._age = 0

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, value):
        if not isinstance(value, int):
            raise TypeError("Age must be an integer")
        if value < 0 or value > 100:
            raise ValueError("Age must be between 0 and 100")
        self._age = value

person = Person()
print(person.age)  # Prints: 0
person.age = 42
print(person.age)  # Prints: 42
person.age = "42"  # Raises: TypeError: Age must be an integer
person.age = 142  # Raises: ValueError: Age must be between 0 and 100
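A possible extension of the validation example, sketched under the assumption that we want the same checks to apply when the object is created: if __init__() assigns to self.age instead of self._age, the assignment goes through the setter. The try_create() helper is hypothetical and exists only to show the outcomes without stopping the script.

```python
class Person:
    def __init__(self, age: int = 0):
        # Assigning to self.age (not self._age) goes through the setter,
        # so the same checks run when the object is created.
        self.age = age

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, value):
        if not isinstance(value, int):
            raise TypeError("Age must be an integer")
        if value < 0 or value > 100:
            raise ValueError("Age must be between 0 and 100")
        self._age = value


def try_create(age):
    """Hypothetical helper: return a Person, or the error message if validation fails."""
    try:
        return Person(age)
    except (TypeError, ValueError) as error:
        return str(error)


print(try_create(42).age)  # 42
print(try_create("42"))  # Age must be an integer
print(try_create(142))  # Age must be between 0 and 100
```

With this design, an invalid Person can never exist, because both creation and later assignment pass through the same setter.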