[UFO Chicago] The City of Chicago wants you to fork its data on GitHub

Neil R. Ormos ormos at ripco.com
Tue Mar 19 09:48:20 PDT 2013


Brian Sobolak wrote:

> http://feedproxy.google.com/~r/oreilly/radar/atom/~3/Q1sCGlXyP-8/the-city-of-chicago-wants-you-to-fork-its-data-on-github.html
>
> I'm interested to see what, if anything, comes of this.  I haven't looked at it yet but plan to.

Thanks, Brian, for calling this to attention.

A lot of government data is supposedly "available"
but in a form calculated to make it difficult or
expensive for external users to actually use in
some automated way. I've read about an agency
(elsewhere, but I don't recall where) that was
forced to release some data against its will.
Instead of releasing an extract of the computer
database it actually used in its operations, the
agency printed a report, scanned the report
(poorly), and released the scanned images.  IIRC,
an official explained
that in his view, legitimate users would browse
the data, rather than search or analyze it using
automated tools.  Therefore, releasing the images
would satisfy legitimate users, while frustrating
illegitimate users who would use automated
analysis of the data as ammunition to challenge
agency practices.

In contrast, Chicago's practice of releasing this
stuff on GitHub with an MIT license lowers the
barriers to public use and editing to an almost
negligible level.  It also appears to be a
diametric reversal of Chicago's former policies of
making public records either inaccessible or very
expensive. (And yes, this reversal has occurred
gradually; the city has had a data portal for some
time.)

Goldstein's comment

|> These datasets are items that are visible in
|> daily life -- transportation and buildings. It
|> is not proprietary data and should be open,
|> editable, and usable by the public.

is worthy of some scrutiny, as is the policy of
removing from the data "internal codes" which the
city does not believe are helpful to users.  I
haven't looked at the complete breadth of datasets
available, but it will be interesting to see if
the city likewise releases most of the other
non-proprietary data, the stuff which may not be
visible in "daily life" but which nonetheless is
supposed to be a matter of open, public record.
As potential users discover Chicago's trove of
released data, perhaps it will spark a public
debate on exactly what data should be proprietary.

One aspect that is pretty nice is that the city is
not just releasing the data, but is also providing
step-by-step instructions on how to use the data,
and is doing so for several programming languages,
including R, which is reasonably accessible for
non-programmers.

--Neil

