Engineer in Tokyo

Pickling Objects with Cached Properties

Python descriptors let you define attributes on Python objects whose values are computed by running some code. One of the simplest ways of doing that is the @property decorator. Here, accessing myprop calls the method and returns the resulting "data".

    class MyClass(object):
        @property
        def myprop(self):
            return "data"

One variant of this is the cached_property pattern. There are many implementations floating around. There is a package on PyPI. Werkzeug, and by extension Flask, has one. Django has one. These implementations generally rely on the fact that a value stored in an object's __dict__ takes precedence over a non-data descriptor (one that defines only __get__), so the __dict__ can be used as a quickly accessible place to store cached data. Note that this does not hold for data descriptors such as a plain @property, which win over the instance __dict__.
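To make the pattern concrete, here is a minimal sketch of such a descriptor. It is not the code of any of the packages mentioned above, just an illustration of the idea; the Report class and its total attribute are hypothetical names made up for the example.

```python
class cached_property(object):
    """Minimal non-data descriptor: it defines __get__ only, so an entry
    in the instance __dict__ shadows it after the first access."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not on an instance
        value = self.func(obj)
        # Store the result under the same name; later lookups find it in
        # __dict__ and never reach __get__ again.
        obj.__dict__[self.name] = value
        return value


class Report(object):  # hypothetical example class
    calls = 0

    @cached_property
    def total(self):
        Report.calls += 1  # count how often the method body runs
        return sum(range(1000))


report = Report()
first = report.total   # runs the method and caches the result
second = report.total  # served straight from report.__dict__
print(Report.calls, "total" in report.__dict__)
```

The method body runs only on the first access; after that the value lives in the instance __dict__ like any ordinary attribute.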

This has a downside, however: the cached data is pickled along with your object when you serialize it to disk or to a caching layer like memcached. In extreme cases this can lead to your pickled binary data exceeding memcached's per-item size limit (1 MB by default).
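A quick way to see the effect is to compare the pickled size of an object before and after a cached property has been accessed. This sketch assumes functools.cached_property from the standard library (Python 3.8+), which also stores its result in the instance __dict__; the Document class is a hypothetical example.

```python
import pickle
from functools import cached_property  # assumption: Python 3.8+


class Document(object):  # hypothetical example class
    def __init__(self, name):
        self.name = name

    @cached_property
    def words(self):
        # Stand-in for an expensive computation with a large result.
        return ["word%d" % i for i in range(10000)]


doc = Document("example")
size_before = len(pickle.dumps(doc))  # only "name" is in __dict__
doc.words                             # populates the cache
size_after = len(pickle.dumps(doc))   # the cached list is pickled too
print(size_before, size_after)
```

The second pickle is much larger because the cached list is now part of the object's state.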

I came up with a way to avoid this by adding a mixin to classes, but it’s not terribly clean and seems like it would be brittle.

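    # Assumption: the standard library version (Python 3.8+); substitute the
    # cached_property from Werkzeug, Django, etc. if that is what you use.
    from functools import cached_property


    class CachedPropertyMixin(object):
        def __getstate__(self):
            state = self.__dict__.copy()
            # Iterate over a snapshot of the keys: deleting from a dict
            # while iterating over it raises RuntimeError.
            for key in list(state):
                if (hasattr(self.__class__, key) and
                        isinstance(getattr(self.__class__, key), cached_property)):
                    del state[key]
            return state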


    class MyClass(CachedPropertyMixin, object):
        @cached_property
        def myprop(self):
            return "data"
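As a rough check that the mixin does what it is meant to, the sketch below round-trips an object through pickle. The mixin is restated in a compact form so the snippet runs on its own, and it assumes functools.cached_property (Python 3.8+); the name attribute is a hypothetical addition so that there is some regular state to keep.

```python
import pickle
from functools import cached_property  # assumption: Python 3.8+


class CachedPropertyMixin(object):
    def __getstate__(self):
        cls = type(self)
        # Keep every attribute except those shadowing a cached_property.
        return {
            key: value
            for key, value in self.__dict__.items()
            if not isinstance(getattr(cls, key, None), cached_property)
        }


class MyClass(CachedPropertyMixin, object):
    def __init__(self, name):
        self.name = name  # hypothetical regular attribute

    @cached_property
    def myprop(self):
        return "data"


obj = MyClass("example")
obj.myprop  # populate the cache on the original object
restored = pickle.loads(pickle.dumps(obj))
print(obj.__dict__)       # contains both "name" and "myprop"
print(restored.__dict__)  # contains only "name"
print(restored.myprop)    # recomputed on first access
```

The restored object starts with an empty cache and simply recomputes the property the first time it is needed.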

I’m not really satisfied with this solution, so I’d be interested in hearing any other ideas about how to avoid pickling cached data.