Experiment: Replace __setattr__ With A Descriptor

As part of daily life I see a ton of Python code. Between the code at work and all of the open source projects I watch, I’m surprised my eyes don’t bleed. Over the years my style has changed dramatically. I guess you could say I am getting more Pythonic. Looking back at old code sometimes just makes me feel ill. I know better now.

There was a project that I worked on where we wanted to ensure all attributes were typed properly. (No comments about this - it’s not the point of the post.) To do this we used an evil combination of Hungarian notation and __setattr__ magic. An example class looks something like this:

class Data(object):

    def __init__(self, strX='', fltY=0.0, intZ=0):
        self.strX = strX
        self.fltY = fltY
        self.intZ = intZ

    def __setattr__(self, name, value):
        if name.startswith('int'):
            object.__setattr__(self, name, int(value))
        elif name.startswith('flt'):
            object.__setattr__(self, name, float(value))
        elif name.startswith('str'):
            object.__setattr__(self, name, str(value))
        else:
            object.__setattr__(self, name, value)
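
To make the coercion concrete, here is a small self-contained usage sketch. It repeats the class so it runs on its own, assumes Python 3 for the print calls, and uses sample values that are purely illustrative:

```python
class Data(object):
    """Same prefix-based coercion as above, repeated so the sketch is runnable."""

    def __init__(self, strX='', fltY=0.0, intZ=0):
        self.strX = strX
        self.fltY = fltY
        self.intZ = intZ

    def __setattr__(self, name, value):
        # The Hungarian prefix of the attribute name picks the converter.
        if name.startswith('int'):
            object.__setattr__(self, name, int(value))
        elif name.startswith('flt'):
            object.__setattr__(self, name, float(value))
        elif name.startswith('str'):
            object.__setattr__(self, name, str(value))
        else:
            object.__setattr__(self, name, value)


# Values are converted on the way in, including inside __init__.
d = Data(strX=3, fltY='1.5', intZ='42')
print(repr(d.strX), repr(d.fltY), repr(d.intZ))  # '3' 1.5 42

# A name without a recognised prefix is stored untouched.
d.other = [1, 2]

# A value that cannot be converted raises from the converter itself.
try:
    d.intZ = 'not a number'
except ValueError as exc:
    print('rejected:', exc)
```

Note that the check only looks at the start of the name, so an attribute called something like `interval` would also be pushed through int().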

As ugly as it is, it actually works. All attributes are forced to a type based on the prefix of their name. I know that __setattr__ is rather slow, but I never really gave it a second thought, even though I periodically see this code. This weekend a neuron misfired and I had an idea. So I gave this a whirl:

class TypedAttr(object):

    def __init__(self, _type, value):
        self._type = _type
        self._default = self._value = value

    def __get__(self, obj, _type):
        return self._value

    def __set__(self, obj, value):
        self._value = self._type(value)

    def __delete__(self, obj):
        self._value = self._default

class Data(object):
    strX = TypedAttr(str, '')
    fltY = TypedAttr(float, 0.0)
    intZ = TypedAttr(int, 0)

This Data class is much cleaner. The TypedAttr class can be defined in another module, making this code much smaller and more declarative.
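
One caveat with the TypedAttr above: the value lives on the descriptor object, and each descriptor is created once per class, so every Data instance reads and writes the same value. The following is a minimal sketch of a per-instance variant, not the code from the project. It assumes Python 3.6 or later for `__set_name__`, which did not exist in the Python 2.5 used for this post:

```python
class TypedAttr(object):
    """Coercing descriptor that keeps each value on the instance, not on itself."""

    def __init__(self, _type, default):
        self._type = _type
        self._default = default
        self._name = None  # filled in by __set_name__

    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created (Python 3.6+).
        self._name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class itself, e.g. Data.intZ.
            return self
        return obj.__dict__.get(self._name, self._default)

    def __set__(self, obj, value):
        # Because __set__ is defined, the descriptor still wins over the
        # instance __dict__ entry on lookup, so reusing the name is safe.
        obj.__dict__[self._name] = self._type(value)

    def __delete__(self, obj):
        # Deleting falls back to the default, as in the version above.
        obj.__dict__.pop(self._name, None)


class Data(object):
    strX = TypedAttr(str, '')
    fltY = TypedAttr(float, 0.0)
    intZ = TypedAttr(int, 0)


a, b = Data(), Data()
a.intZ = '42'
print(a.intZ, b.intZ)  # 42 0
del a.intZ
print(a.intZ)  # 0
```

The Data class body stays exactly as declarative as before; only the storage location changes.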

The descriptor approach outperforms the __setattr__ approach. My tests used the timeit module in Python 2.5. Running the old way gave me about 25 usec and the new way about 11 usec.
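
The post does not show the benchmark itself, so the following is only a guess at what such a comparison might look like. The class names SetattrData and DescriptorData, the assign helper, and the choice to time three attribute assignments are all assumptions, and the sketch is written for Python 3. Timings depend heavily on the interpreter version and the machine, so they may not match the figures above:

```python
import timeit


class SetattrData(object):
    """The original approach: coercion driven by the attribute name prefix."""

    def __setattr__(self, name, value):
        if name.startswith('int'):
            object.__setattr__(self, name, int(value))
        elif name.startswith('flt'):
            object.__setattr__(self, name, float(value))
        elif name.startswith('str'):
            object.__setattr__(self, name, str(value))
        else:
            object.__setattr__(self, name, value)


class TypedAttr(object):
    """The descriptor as posted (value stored on the descriptor itself)."""

    def __init__(self, _type, value):
        self._type = _type
        self._default = self._value = value

    def __get__(self, obj, _type):
        return self._value

    def __set__(self, obj, value):
        self._value = self._type(value)


class DescriptorData(object):
    strX = TypedAttr(str, '')
    fltY = TypedAttr(float, 0.0)
    intZ = TypedAttr(int, 0)


def assign(obj):
    # Hypothetical workload: one assignment per typed attribute.
    obj.strX = 'x'
    obj.fltY = 1.0
    obj.intZ = 1


old, new = SetattrData(), DescriptorData()
n = 20000
t_old = timeit.timeit(lambda: assign(old), number=n)
t_new = timeit.timeit(lambda: assign(new), number=n)
print('__setattr__: %.2f usec per call' % (t_old / n * 1e6))
print('descriptor:  %.2f usec per call' % (t_new / n * 1e6))
```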

I think I’ll suggest the change :-)

Comments

  • dstanek
    @Ralph

    If I understand you right, you are saying to have a class for each property. So for a user class I would have an Email class, a Username class, etc. If that is true, I would think that there would be too much boilerplate code lying around.

    Do you have an example of what you are talking about?
  • I guess my main question is why? You can abstract the data object and do the datatype assertion. A simple datatype validation object would probably be cleaner.
  • dstanek
    @Tom Lynn

    There may be some cases when we have a string attribute that actually holds a number. If we wanted more detailed validation we could always pass in a more specific factory method as _type.
  • Tom Lynn
    With the current implementation it may be a bit too lenient I suspect. e.g. "foo.strX = 3" is allowed, when that probably wasn't the intention.
  • dstanek
    Again, I don't know that this is the end solution. I am just exploring alternate ways to do what we are doing. Here is code that takes the instance into account, but still has other issues:

    class TypedAttr(object):

        def __init__(self, _type, value):
            self._type = _type
            self._name = '__TypedAttr_' + str(id(self))
            self._default = value

        def __get__(self, obj, _type):
            return obj.__dict__.get(self._name, self._default)

        def __set__(self, obj, value):
            obj.__dict__[self._name] = self._type(value)

        def __delete__(self, obj):
            if self._name in obj.__dict__:
                del obj.__dict__[self._name]
  • dstanek
    Yes I know that verbatim the code won't do exactly what I want. It was a test to see what the difference would be. The real solution would be more like the property builtin. Just with the additional type casting.
  • Peter Otten
    Unless you store the attribute in the instance (the obj argument of the __set__()/__get__() methods), it will be shared across all Data instances:

    >>> a = Data()
    >>> a.intZ = 42
    >>> b = Data()
    >>> b.intZ
    42
  • Although descriptors are great, I wouldn't recommend that implementation just yet. Add the following to the bottom of your code.

    if __name__ == '__main__':
        a = Data()
        b = Data()
        assert a.intZ == b.intZ

        a.intZ = 4
        assert a.intZ != b.intZ, ("changing a.intZ changed "
                                  "b.intZ (%r, %r)" % (a.intZ, b.intZ))


    As you can see, separate instances of the Data object get the same instance of each descriptor. In general, that's not what you want.

    Since "obj" is passed to each descriptor method, there are ways to use them for instance-specific data storage. See http://projects.amor.org/dejavu/browser/trunk/units.py#l332 for an example of using a descriptor for type coercion which stores the actual data in the original object.
  • NOTE: there is a small bug in your implementation.

    Try creating two instances of Data and setting strX differently in each. Then check the values.

    >>> a = Data()
    >>> b = Data()
    >>> a.strX="hello"
    >>> b.strX="goodbye"
    >>> a.strX
    'goodbye'
    >>> b.strX
    'goodbye'
    >>>


    Doing what you want requires a little bit more work. I gave an open space talk on it at PyCon two years ago. Looking for the materials now.
  • I wonder if the approach you've taken is at all similar to spec.py, a module within the QP web framework package.

    A spec is a specification - more than just a type; For example:

    ->> match('123', both(string, pattern('[a-zA-Z].+')))
    False
    ->> match('M', both(string, pattern('[a-zA-Z].+')))
    False
    ->> match('Mike', both(string, pattern('[a-zA-Z].+')))
    True


    class Person(Specified):
        name_is = both(string, pattern('[a-zA-Z].+'))
        address_is = Address
        age_is = int
        def __init__(self, name):
            init(self, name=name)

    add_getters_and_setters(Person)

    ->> p = Person('Mike')
    ->> p.set_address('123 Main Street')
    Traceback (most recent call last):
      File "", line 2, in
      File "/usr/local/lib/python2.5/site-packages/qp/lib/spec.py", line 725, in f
        require(value, getattr(klass, name + '_is'))
      File "/usr/local/lib/python2.5/site-packages/qp/lib/spec.py", line 171, in require
        raise TypeError(error)
    TypeError:
    Expected: Address
    Got: '123 Main Street'


    http://www.mems-exchange.org/software/qp/qp-2.0.tar.gz/qp-2.0/lib/spec.py

    http://www.mems-exchange.org/software/qp/
  • That's pretty slick. We've pretty much not delved into the goodness of new-style classes, really only switching over to them for the gc benefits... Now that things are becoming (briefly) less insane for me, maybe it's time to go exploring.