Monday, June 6, 2011

A basic look at dojo.Animation

Recently, I needed to animate the rotation of an HTML button. I use the Dojo JavaScript library, which has some frameworks for animation. Unfortunately, the core framework, dojo.Animation, isn't well documented. For example, if you look at:

http://dojotoolkit.org/reference-guide/quickstart/Animation.html

the documentation and examples are written in terms of higher-level APIs. It turns out that the base animation API is quite simple. At a minimum, all you need to do is specify starting and ending values and set up some event handlers. Here's some example code to rotate a node:

new dojo.Animation({
    curve: [0, 360],
    onAnimate: function (v) {
        node.style[transform] = 'rotate(' + v + 'deg)';
    }
}).play();

In the example, I used a variable, transform, for the style name. This is necessary because different browsers use different names for the style property that provides rotation. A full working example can be found at:

http://jimfulton.info/demos/dojo-animated-rotate.html

There are various additional properties you can provide to the Animation constructor to control things like duration, frame rate and repeats. You can also provide handlers for the beginning and end of an animation. The latter is especially important because animation is asynchronous and you often want to perform some action when an animation is done. The API documentation covers these pretty well:

http://dojotoolkit.org/api/1.6/dojo/Animation

The dojo helper functions provide some convenience. For example, to fade a node in:

dojo.fadeIn({node: node})

which is simpler than:

new dojo.Animation({
    curve: [0, 1],
    onAnimate: function (v) { node.style.opacity = v; }
}).play()

but it's nice to know that fadeIn is just shorthand for some pretty simple code.

This is a good example of a low-level API that provides a lot of value, but that gets obscured by higher-level APIs that provide convenience in common cases.

Saturday, March 19, 2011

An API that tries too hard

The purpose of this post is to make a point about the dangers of
excessive automation.

Let me emphasize: the purpose of this post is not to criticize ConfigParser in particular, but to make a point about API over-design. ConfigParser is an old project and, if it were redesigned today, it would hopefully look very different.

The Python 2 standard library contains a ConfigParser module that
parses "ini"-style files. (In Python 3, it was renamed to
configparser.) Ini files model information as named
sections containing named options. They're typically used in a
schema-less manner, meaning there's no schema for defining what
sections, options, or option values are valid. I've found this format
to be powerful enough to handle lots of configuration problems and
less cumbersome than more sophisticated mechanisms based on XML or
other formats requiring schemas.

I apologize in advance to the maintainers of configparser. I appreciate
and value your effort, really, but ...

The ConfigParser module is a good example of a module that tries too
hard to help and, in so doing, makes a simple problem complicated.

When I parse an INI file, all I want is a function that takes a
string and returns a dictionary of dictionaries. (An ordered
dictionary of ordered dictionaries is a bonus.) ConfigParser has
this functionality embedded in a relatively short private function
buried in layers of unhelpful and non-pythonic APIs.

ConfigParser provides a bad variable-interpolation syntax that's
an attractive nuisance. Because this mechanism was used by
PasteDeploy, % characters in PasteDeploy's configuration files have
to be escaped and the APIs defined by PasteDeploy have an awkward
"global configuration" parameter that exists solely to accept a
ConfigParser's default section.

ConfigParser provides a policy of case-folding option names that
you have to go out of your way to disable.

ConfigParser provides a policy of trimming leading and trailing
spaces from option values that can't be overridden.
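These three policies are easy to see in a short session. The following is only a sketch: it uses Python 3's configparser rather than the Python 2 module discussed here, and the section name, option name and value are made up for illustration.

```python
import configparser

# Made-up sample: a mixed-case option name, padding around the value,
# and a percent sign that has to be doubled to survive interpolation.
SAMPLE = "[app]\nLogFormat =   %%(level)s done   \n"

# Default parser: all three policies are applied while parsing/reading.
parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
option_names = list(parser["app"])    # option name has been lowercased
value = parser["app"]["logformat"]    # trimmed, and '%%' collapsed to '%'

# Opting out of case folding and interpolation takes extra steps.
raw = configparser.ConfigParser(interpolation=None)
raw.optionxform = str                 # keep option names as written
raw.read_string(SAMPLE)
raw_names = list(raw["app"])
raw_value = raw["app"]["LogFormat"]   # '%%' kept, but still trimmed

print(option_names, repr(value))
print(raw_names, repr(raw_value))
```

Even with case folding and interpolation switched off, the padding around the value is gone.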

I've used ConfigParser in the zc.buildout project for some
time. The trimming of leading spaces in configuration values is a
headache. I'm currently working on a port of zc.buildout to
Python 3 and found that Python 3's configparser wasn't backward
compatible with ConfigParser and I was forced to copy the function
at the heart of ConfigParser. This function is straightforward and
expressed in ~70 lines of code, not counting comments, docstrings and
some exception classes. The function would be even simpler if it
wasn't saddled with some legacy syntax support. (Because I used
ConfigParser, I'm saddled with that legacy too.) My code is now
simpler as a result of using this function and I'll be able to adjust
the text trimming policy easily later (after the Python 3 port is
done).

In an effort to help me, ConfigParser did things for me that
ultimately got in my way. Policies like variable interpolation, case
folding or string trimming can easily be done after initial parsing.
By coupling these policies with parsing, users are either stuck with
the policy decisions, or have to work around them. A simple function that
simply parsed a string and returned an ordered dictionary of ordered
dictionaries would have been far more helpful. People might
appreciate higher-level functions that provide some of these policies,
but these should have been provided as optional conveniences along
with the simpler function.
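To make the idea concrete, here is a minimal sketch of the kind of function I have in mind. It is not the function inside ConfigParser and not the one I copied into zc.buildout; the name parse_ini is made up, and it assumes a simplified syntax (no continuation lines, no legacy "name: value" form).

```python
from collections import OrderedDict


def parse_ini(text):
    """Parse INI-style text into an OrderedDict of OrderedDicts.

    No interpolation, no case folding and no value trimming are
    applied; callers can layer those policies on afterwards.
    """
    sections = OrderedDict()
    current = None
    for lineno, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        if not stripped or stripped[0] in "#;":
            continue  # skip blank lines and comments
        if stripped.startswith("[") and stripped.endswith("]"):
            # Start (or reopen) a section.
            current = sections.setdefault(stripped[1:-1], OrderedDict())
            continue
        if current is None or "=" not in line:
            raise ValueError("line %d: cannot parse %r" % (lineno, line))
        name, value = line.split("=", 1)
        current[name.strip()] = value  # value is kept exactly as written
    return sections


config = parse_ini("[A]\nKey =  v1 \n; a comment\n[b]\nx=%s\n")

# A policy such as trimming becomes a separate, optional step.
trimmed = OrderedDict(
    (section, OrderedDict((k, v.strip()) for k, v in options.items()))
    for section, options in config.items()
)
```

Because the parser leaves values untouched, a caller who cares about leading spaces keeps them, and a caller who doesn't can strip them in one extra expression.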

Sunday, February 13, 2011

Health 2 code-a-thon in DC Feb 12, 2011

http://health2challenge.org/code-a-thon/washington-dc/

Someone sent a link about this to the DC Python Meetup Group
(http://meetup.zpugdc.org/) a few weeks ago. It looked like fun and a
way to learn about a new domain, so I signed up. I don't know whether any
other Python folks were there. I didn't bump into any.

I didn't really know what to expect. I knew pretty close to nothing
about the field. I wondered what technology would be used. It wasn't
clear how teams would be assembled.

A major motivation of this event was to leverage a growing collection
of health-related databases:

http://health2challenge.org/code-a-thon/data-resources/

The event was fun, if a bit chaotic. It was hard to find an
appropriate team and contribute. I gather some teams had formed ahead
of time, but as an outsider, there didn't seem to be any way to get
hooked up ahead of time.

I spent some time brainstorming with one loose team that was
interested in raising awareness at the community level of the economic
impact on a community of health issues. There were some ideas thrown
around that didn't seem very realistic. The "public" aren't likely to
visit dedicated health policy sites or even play health policy games.

I suggested that a good way to reach people in communities might be
through their community newspapers and web sites. The idea was to
develop database-based content in the form of mini applications,
possibly augmented by prose written by health professionals that could
be leveraged by community newspapers. Making this database-based
meant that the content could be relevant to the local community.

This idea was well received. This was a pleasant surprise, since it's
actually kinda close to my day job.

I worked for a while on a prototype application that would provide a
small bit of content of the form:

The hospital readmission rates in MYCOMMUNITY are X.
This compares to a rate of Y in MYSTATE and Z nationally.
To find out more, see http://services.healthindicators.gov.

where obviously MYCOMMUNITY and MYSTATE are community specific and X,
Y and Z are provided by a health database. We used data from
http://services.healthindicators.gov. The idea is that this blurb
would be published as an app that community newspapers could use to
create content. The specific blurb was just a proof of concept.
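Filling in the blurb is plain string formatting once the numbers are in hand. A minimal sketch follows; the function name render_blurb and its parameters are made up, and the state and national figures in the usage line are placeholders, not values from the database.

```python
BLURB = (
    "The hospital readmission rates in {community} are {local_rate}.\n"
    "This compares to a rate of {state_rate} in {state} "
    "and {national_rate} nationally.\n"
    "To find out more, see http://services.healthindicators.gov."
)


def render_blurb(community, state, local_rate, state_rate, national_rate):
    """Fill the blurb template with community-specific values."""
    return BLURB.format(
        community=community,
        state=state,
        local_rate=local_rate,
        state_rate=state_rate,
        national_rate=national_rate,
    )


# The 18% and 19% figures are placeholders for illustration only.
text = render_blurb("Arlington", "Virginia", "17%", "18%", "19%")
print(text)
```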

The database provides SOAP and REST interfaces. I ended up using
suds (http://pypi.python.org/pypi/suds) to access the SOAP
interface. This was really easy:

from suds.client import Client
url = 'http://services.healthindicators.gov/v1/SOAP.svc?wsdl'
client = Client(url)

To get a list of all of the methods:

print(client)

To call a method:

client.service.SomeMethod()

(All of the methods in this API have camel-case names with initial
upper case letters.)

Of course, since this is Python, I could do all of this interactively!
(I say this for the benefit of Health 2.0 readers who read this.)
I was exploring the API in a few minutes. Nice!

For some reason, the API breaks most requests into pages. Each
request has three parts:

foo(some_args, page): Get a page of data.
    For example: GetLocales, GetIndicatorsByLocaleID, GetGenders.
fooCount(some_args): Get the result count.
    For example: GetLocalesCount, GetIndicatorsByLocaleIDCount, GetGendersCount.
    (In case you're wondering, client.service.GetGendersCount() returns 2.)
fooPageCount(some_args): Get the page count.
    For example: GetLocalesPageCount, GetIndicatorsByLocaleIDPageCount, GetGendersPageCount.

I ended up creating a helper function:

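def paged(client, name, *args):
    """Call the paged method `name` once per page and return all results."""
    r = []
    service = client.service
    # Pages are numbered from 1; <name>PageCount says how many there are.
    for page in range(1, getattr(service, name+'PageCount')(*args)+1):
        # Each page arrives wrapped in a one-item sequence, hence the [0].
        r.extend(getattr(service, name)(*(args+(page, )))[0])
    return r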

(If you're paying close attention, you might be wondering about the
[0] in the code above. For some reason, each "page" of data was
returned by suds as a sequence object with one item containing a
list of the actual data. I don't know if this is a quirk of the API or
of suds.)

This let me ignore the paging entirely. For example, to get all locales:

locales = paged(client, 'GetLocales')
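If you want to try the paging helper without hitting the real service, a stub works. This is a sketch only: FakeService and FakeClient are made-up stand-ins for the suds client, the locale names and page size are invented, and the helper is restated in Python 3 syntax.

```python
class FakeService:
    """Made-up stand-in for client.service: 5 locales, 2 per page."""

    _locales = ["A", "B", "C", "D", "E"]
    _page_size = 2

    def GetLocalesPageCount(self):
        # Ceiling division: 5 items at 2 per page gives 3 pages.
        return -(-len(self._locales) // self._page_size)

    def GetLocales(self, page):
        start = (page - 1) * self._page_size
        # Mimic the one-item wrapper around each page of results.
        return [self._locales[start:start + self._page_size]]


class FakeClient:
    """Made-up stand-in for a suds Client."""

    service = FakeService()


def paged(client, name, *args):
    """Collect the results of every page of the method called `name`."""
    results = []
    service = client.service
    page_count = getattr(service, name + "PageCount")(*args)
    for page in range(1, page_count + 1):
        results.extend(getattr(service, name)(*args, page)[0])
    return results


locales = paged(FakeClient(), "GetLocales")
print(locales)
```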

As is to be expected, the database is challenging. Data are not
uniformly available. Some data are available down to the county
level, but other data aren't. For example, hospital readmission rates
are available at the level of "Health Referral Region", which is
typically (always?) much larger than a county. Different localities
have different amounts of data. Prince William County has on the
order of 300 health indicators available, while DC has around 10,000.

Speaking of "indicators", as with any domain, this one has confusing
jargon. There were "indicator descriptions", like "Acute Hospital
Readmission Rate" and "indicators", like "the value in Arlington is
17%". As it was explained to me, the indicator descriptions are the
questions and the indicators are the answers. The answers are
qualified and adjusted in various ways, probably based on whatever
studies they came out of. I suspect that there will be lots of naive
and misleading uses of this data. I hope these automated
applications get some careful review by domain experts.

Using the database effectively requires either familiarity
with the data, or the ability to quickly browse. The SOAP interface
to the database is pretty slow and doesn't provide very targeted
queries. For example, there's no way to request one type of indicator
for a locale. You can pick an indicator, and get data for all locales,
or pick a locale and get all indicators for it. Getting all of the
indicators for DC took several minutes. They're working on their
search capabilities, so I'm sure this will improve over time.

These sorts of databases will be used for a variety of
applications and run-time use of the databases will likely prove to be
impractical. Taking snapshots is unattractive, as data will
be out of date. Probably, a download model with update
subscriptions would be a better way to go. In other words,
applications might be well served by downloading a database and either
polling for updates or getting updates sent to them.

We decided to bail on our prototype because we didn't feel the data
was local enough. This was a mistake! We should have finished the
prototype. The actual data didn't matter. The presentation of the
prototype would have been a good time to discuss the issues. Dang.

I wandered over to another team that was working with the same
database. They were working on a system for looking at local policy
decisions based on county government databases and connecting these to
outcomes via the health indicators database. I think this is a cool
idea and they were led by a domain expert who had a pretty definite
idea of what he was trying to accomplish. I'm pretty sure that this
will lead to success.

I was hoping to provide some help because I had gained some
familiarity with the database. Unfortunately, they were bogged down
accessing the database using some Java-based SOAP
interface. Gaaaa. Their Java programmer was obviously good, but he was
still using Java. Most of the developers were just sitting around
waiting for the Java programmer. I tried to explain some of the issues
with the data, but the Java programmer was just too busy hacking
Java. I ended up learning the Google chart API so I could help them
eventually display the data.

I eventually got bored and left early. I wish, in hindsight, I'd
finished the prototype I was working on. Hopefully, this blog will be
useful and make up for this a little bit. :)

I wouldn't mind doing this again, especially if I could hook up with a
team ahead of time. I'd even be willing to finish that prototype if
there was interest. I can't spend too much time on this though, as I
have too many other interesting projects.

Saturday, January 29, 2011

A little web application for keeping an eye on things

There are a lot of things I'd keep an eye on, if it were easier,
including:

Nagios problem summaries
Various plots showing metrics for our hosted applications
finance.google.com
social network feeds
...

I could keep web pages open, but that takes too much space.
What I want is the equivalent of an electronic photo frame for web
pages.

I decided to throw something together today that would do this for me:

http://www.riversnake.com/webframe

This is a purely client side application that collects URLs and
cycles through them, staying on each one for a minute at a time. You
can add, list and remove URLs. You can stop and start cycling, move
forward and back, and select URLs to display.

There are a few interesting things to note:

Web Storage: The localStorage facility provided by modern browsers is used to store the list of URLs. The storage is keyed on the URL, so you can keep multiple lists of URLs by adding query strings (or copying the application to other URLs).
Layout: The application has a row of controls across the top and an iframe that takes up the rest of the page. Getting the iframe to fill the remainder of the page, and getting the URL bar at the top to fill the middle of the control bar, was a bit tricky. Using CSS percentage-based sizes wasn't acceptable because I didn't want to scale all components equally when resizing a page (or when zooming in or out).
The most common technique seems to be to use JavaScript handlers for page resizes to resize the page contents. Dojo, which I used for the control widgets, has some mechanisms to do this, but I wanted to see if I could manage it totally with CSS.
I ended up using a combination of absolute and fixed positioning expressed in terms of ems. See the CSS styles in the HTML for more information.
Unicode Fun: I was too lazy to go scrounge up images to use for the player controls, so I ended up using Unicode text that got me close enough. :)

Jim Fulton