Saturday, March 19, 2011

An API that tries too hard

The purpose of this post is to make a point about the dangers of
excessive automation.

Let me emphasize: the purpose of this post is not to criticize ConfigParser in particular, but to make a point about API over-design. ConfigParser is an old project and, if it were redesigned today, it would hopefully look very different.


The Python 2 standard library contains a ConfigParser module that
parses "ini"-style files. (In Python 3, it was renamed to
configparser.) Ini files model information as named
sections containing named options. They're typically used in a
schema-less manner, meaning there's no schema for defining what
sections, options, or option values are valid. I've found this format
to be powerful enough to handle lots of configuration problems and
less cumbersome than more sophisticated mechanisms based on XML or
other formats requiring schemas.

I apologize in advance to the maintainers of configparser. I appreciate
and value your effort, really, but ...

The ConfigParser module is a good example of a module that tries too
hard to help and, in so doing, makes a simple problem complicated.

When I parse an INI file, all I want is a function that takes a
string and returns a dictionary of dictionaries. (An ordered
dictionary of ordered dictionaries is a bonus.) ConfigParser has
this functionality embedded in a relatively short private function
buried in layers of unhelpful and non-pythonic APIs.
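
A minimal sketch of such a function is shown below. It is not the
private function from ConfigParser: parse_ini is a hypothetical name,
and the sketch assumes a reduced syntax (one "name = value" option per
line, no continuation lines, no ":" delimiter). Values are returned
exactly as written after the "=", so any trimming is left to the caller.

```python
from collections import OrderedDict


def parse_ini(text):
    """Parse INI-style text into an ordered dict of ordered dicts.

    Hypothetical helper: no interpolation, no case folding, and no
    trimming of option values.
    """
    sections = OrderedDict()
    current = None
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip() or line.lstrip().startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.rstrip().endswith("]"):
            name = line.rstrip()[1:-1]
            current = sections.setdefault(name, OrderedDict())
            continue
        if current is None:
            raise ValueError("line %d: option outside of a section" % lineno)
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("line %d: expected 'name = value'" % lineno)
        current[key.rstrip()] = value  # value kept verbatim
    return sections
```

Everything else (case folding, trimming, interpolation) can then be
layered on top of the returned dictionaries by whoever needs it.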

ConfigParser provides a bad variable-interpolation syntax that's
an attractive nuisance. Because this mechanism was used by
PasteDeploy, % characters in PasteDeploy's configuration files have
to be escaped, and the APIs defined by PasteDeploy have an awkward
"global configuration" parameter that exists solely to accept a
ConfigParser's default section.
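
The escaping problem is easy to reproduce. The sketch below uses
Python 3's configparser rather than the Python 2 module; the "server"
section, the "rate" option, and the helper names are made up for
illustration.

```python
import configparser


def read_rate(source, **parser_options):
    """Return the 'rate' option of the 'server' section."""
    parser = configparser.ConfigParser(**parser_options)
    parser.read_string(source)
    return parser.get("server", "rate")


def fails_to_interpolate(source):
    """Report whether reading the value trips over interpolation."""
    try:
        read_rate(source)
    except configparser.InterpolationSyntaxError:
        return True
    return False


ESCAPED = "[server]\nrate = 50%%\n"  # what the file has to contain
PLAIN = "[server]\nrate = 50%\n"     # what one would like to write
```

With the default parser, read_rate(ESCAPED) yields "50%" while
fails_to_interpolate(PLAIN) is true. Passing interpolation=None to
read_rate switches the mechanism off, and the plain form is then read
back unchanged.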

ConfigParser provides a policy of case-folding option names that
you have to go out of your way to disable.

ConfigParser provides a policy of trimming leading and trailing
spaces from option values that can't be overridden.
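
Both policies can be observed in a few lines. Again, this sketch uses
Python 3's configparser, and the "paths" section and "HomeDir" option
are invented for the example. Replacing optionxform is the documented
way to turn off case folding; there is no comparable switch for the
trimming of values.

```python
import configparser

SOURCE = "[paths]\nHomeDir =    /home/me   \n"


def read_options(source, preserve_case=False):
    """Return the options of the 'paths' section as a plain dict."""
    parser = configparser.ConfigParser(interpolation=None)
    if preserve_case:
        # Going out of your way: replace the name-transforming hook.
        parser.optionxform = str
    parser.read_string(source)
    return dict(parser.items("paths"))
```

By default the option comes back as "homedir"; with preserve_case=True
it comes back as "HomeDir". In both cases the surrounding spaces of the
value are gone.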

I've used ConfigParser in the zc.buildout project for some
time. The trimming of leading spaces in configuration values is a
headache. I'm currently working on a port of zc.buildout to
Python 3 and found that Python 3's configparser wasn't backward
compatible with ConfigParser, so I was forced to copy the function
at the heart of ConfigParser. This function is straightforward and
expressed in ~70 lines of code, not counting comments, docstrings and
some exception classes. The function would be even simpler if it
weren't saddled with some legacy syntax support. (Because I used
ConfigParser, I'm saddled with that legacy too.) My code is now
simpler as a result of using this function and I'll be able to adjust
the text trimming policy easily later (after the Python 3 port is
done).

In an effort to help me, ConfigParser did things for me that
ultimately got in my way. Policies like variable interpolation, case
folding or string trimming can easily be done after initial parsing.
Because these policies are coupled with parsing, users are either
stuck with the policy decisions or have to work around them. A
function that simply parsed a string and returned an ordered
dictionary of ordered dictionaries would have been far more helpful.
People might
appreciate higher-level functions that provide some of these policies,
but these should have been provided as optional conveniences along
with the simpler function.
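
As a rough sketch of what "policy after parsing" could look like,
assume the parser has already produced an ordered dictionary of ordered
dictionaries. The apply_policy helper below is hypothetical and not
part of any library; it simply maps caller-supplied functions over
option names and values.

```python
from collections import OrderedDict


def apply_policy(sections, name_policy=None, value_policy=None):
    """Return a copy of `sections` with optional policies applied.

    `name_policy` and `value_policy` are callables taking and returning
    a string; None means "leave unchanged".
    """
    def identity(text):
        return text

    name_policy = name_policy or identity
    value_policy = value_policy or identity
    return OrderedDict(
        (section, OrderedDict(
            (name_policy(name), value_policy(value))
            for name, value in options.items()))
        for section, options in sections.items())


# Stand-in for the output of a policy-free parser.
raw = OrderedDict([("app", OrderedDict([("Name", "  demo  ")]))])

folded = apply_policy(raw, name_policy=str.lower)   # opt in to case folding
trimmed = apply_policy(raw, value_policy=str.strip)  # opt in to trimming
```

Each policy is a separate, optional step, and the untouched raw
dictionaries remain available to callers who want none of them.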