So Apparently I Am A White House Champion of Change for Open Innovation

This morning I am in Washington DC to take part in a White House Champions of Change event focused on open innovation. Instead of trying to characterize this event myself, I'll just copy/paste some text:

You have been selected by United States Chief Technology Officer Aneesh Chopra to be highlighted as a ‘Champion of Change’, which is part of President Obama’s Winning the Future initiative.  As the White House executes President Obama’s plan to “out-innovate, out-educate and out-build the rest of the world”, entrepreneurs like you are being recognized for the innovative work you have undertaken in your community. 

Each week, we feature a group of Americans who embody the President’s commitment to ‘Innovate, Educate, and Build’.  

CTO Aneesh Chopra and CIO Vivek Kundra as well as other Administration officials will host an Open Innovation / Champions of Change event at the White House on Friday, June 10th, 9am-11am.

The project being highlighted– and which I'll discuss in a roundtable with about a dozen really smart people with very interesting projects– is Citypayments.org. Here's the blog post I wrote when we launched Citypayments in July 2009. I also wrote extensively earlier this week about Citypayments in the context of the great public data being released by the City of Chicago.

I am honored to be a part of this event. Like others here, I just found out about this late last week, so we're a bit stunned to be a part of this. More to come on all their great projects as they emerge.

Deep Dive: City of Chicago Payments Data

For open data geeks, there is much to be happy about in Chicago municipal government. John Tolva, a longtime open data advocate and city hacker, is now the Chief Technology Officer. Brett Goldstein, the early OpenTable technologist who most recently founded the predictive analytics practice at the Chicago Police Department, is the Chief Data Officer. For people like me, this is a dream come true. Seriously.

A few weeks have passed since their appointment by Mayor Emanuel, and I want to start reviewing the concrete steps they've taken, especially in the raw data they've been publishing to the City's data repository at data.cityofchicago.org. Many of us in the open data movement have been arguing loudly for more raw data. The time for arguing is over. The time for making is now.

Let's start with a high-value data set with which I am intimately familiar– payments to City contractors.

HISTORY

The payment data available in the payments section of Chicago's data repository goes back to 1996, and the data on the City Web site lookup tool goes back to 1993. Keep in mind that none of this data covers the Chicago Park District, the Chicago Housing Authority, or the Chicago Transit Authority. As our friends at the Better Government Association have recently shown, there might be some issues over there.

It is impossible to create great applications using civic data without first attempting to understand how that data came to be. It's popular among developers to make fun of difficult-to-navigate municipal Web sites, but I take a more grateful approach. Think of it this way– for more than 15 years, municipal technology workers have been feeding and caring for this database of fresh, reliably formatted information. I appreciate the time, energy, hiring, skill, storage, and sheer electricity it took to get that done.

Here is a 1998 white paper, "Reengineering the purchasing function: identifying best practices for the City of Chicago" on the creation of FMPS (pdf). The author of the paper is Kathryn M. Kustermann, who appears to have been a Deputy Chief Information Officer at the time FMPS was designed. Reading this document, it becomes clear that a main goal of the system was to "redesign the core purchasing processes at the city of Chicago", and it achieved that goal.

One objective was to "us[e] the system to quickly and easily send and receive information through fax server and e-mail technology, eliminating a majority of manual effort and lost documents".

So this is where their heads were at– getting the City's purchasing system current to technology that was already a few years past its wide acceptance. That's how these things go, and I don't see anything wrong with it. I'd rather my city work more slowly and deliberately than the private sector does. There is a role for everyone.

CITYPAYMENTS.ORG

In 2006 I worked as a contractor for the City of Chicago and I came into contact with the Financial Management and Purchasing Systems (FMPS). This "enterprise system" provides the basis for a whole slew of innards that helps the City get things done. The main public interface into the FMPS was the Vendor, Contract, and Payment Search on the City of Chicago Web site.  It struck me as an odd combination of deeply rich and immensely opaque. As the years went by, I stayed interested in this deep little database.

In October of 2008, I posted a message to the poliparse (political parsers) group to see if I could rustle up anyone smarter than me who could actually scrape this info and get it all out. After a while, a friend of mine got excited about liberating this data and making it more searchable.

After we launched CityPayments, we moved on to other things, but I kept tabs on what was going on. One thing the City did (in the previous administration) was to list the "10 Most Recent Awarded Contracts". This was a welcome change, and it is still pretty useful. Often the contracts are so new that they're not even available in PDF format yet, so you have to wait for the actual contract to get uploaded. You can see here that the first contract number is not linked yet:

Picture 2

 
We have a number of improvements we've wanted to make, but haven't had time. Here's the list. If you're in the mood to code on CityPayments, let us know!

REVIEW OF CURRENT PUBLISHED DATA

The FMPS data provided in this first release by the City is described as follows:

All vendor payments made by the City of Chicago from 1996 to present. Payments from 1996 through 2002 have been rolled-up and appear as "2002." Payment information is available as summarized totals for 2003 through 2009. These data are extracted from the City’s Vendor, Contract, and Payment Search. Time Period: 1996 to present. Frequency: Data is updated daily. Related Applications: City of Chicago Vendor, Contract, and Payments Search (http://webapps.cityofchicago.org/VCSearchWeb/org/cityofchicago/vcsearch/controller/payments/begin.do?agencyId=city).

Note that the "Payment information is available as summarized totals for 2003 through 2009" bit was added after a bug was discovered and reported by Dan Sinker and acknowledged right away. We are the bug fixers we've been waiting for. Also, Sinker popped the data dump into a Fusion Table.

Here are the fields published in the data:

Voucher number: This is simply the number of the invoice that is being covered by the payment. This field is often empty, and I never saw it as very important in the whole scheme of things. Also, it is difficult to do further research based on this field, because there's no way to search for it on the source site.

Recently, the City added a new method of initiating a payment:

New in 2010: Direct Voucher payments from January 2010 to the present. Direct Vouchers are used to pay for miscellaneous products and services that are not associated with a signed contract between the City and the Payee. Examples include debt service, utilities, third party payroll expenditures, court and legal settlements, and small payments such as travel reimbursements.

I don't really understand DV or why it was added. Former Alderman Eugene Schulter received more than $50,000 in these types of payments in the last year and a half. 

Amount: the amount of money paid to the vendor for that particular voucher. Definitely useful when trying to match contracts to awards.

Check date: Self-explanatory.

Department: There are 57 departments listed in the Contracts and Awards search on the City's Web site.

Contract number: This is the key field if you are looking to do further research. With this number, you can search on the City Web site for pretty much everything you want to know about the contract. My advice: READ THE CONTRACTS. Better than fiction. Funny to see all the signatures.

Vendor name: name of the vendor being paid. There are often errors in this field– names are conflated or misspelled, for instance. I wouldn't rely on this field if you want to find *all* contracts of a type.

If you've read all the way down to here, it's time you're rewarded with a cool picture. Here it is:

Peoples Gas Education Pavilion at the Nature Boardwalk at Lincoln Park Zoo. (Note: there is a vendor called "LINCOLN PARK ZOOLOGICAL SCTY" and another called "THE LINCOLN PK ZOOLOGICAL SOC.". Again, don't count on Vendor Name to have unique IDs.)
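To make the Vendor Name problem concrete, here is a minimal Python sketch of one way to build a rough matching key so that the two zoo spellings land in the same bucket. The abbreviation map and the function names are hypothetical, made up for this illustration; a real cleanup would need a much longer, hand-checked list and a human looking at every merge.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation map, just enough to cover the zoo example.
ABBREVIATIONS = {
    "PK": "PARK",
    "SCTY": "SOCIETY",
    "SOC": "SOCIETY",
}


def vendor_key(name):
    """Return a rough matching key: uppercase, no punctuation,
    known abbreviations expanded, leading THE dropped."""
    words = re.sub(r"[^A-Z0-9 ]", " ", name.upper()).split()
    words = [ABBREVIATIONS.get(word, word) for word in words]
    if words and words[0] == "THE":
        words = words[1:]
    return " ".join(words)


def group_vendors(names):
    """Group raw vendor names that share the same matching key."""
    groups = defaultdict(list)
    for name in names:
        groups[vendor_key(name)].append(name)
    return dict(groups)


groups = group_vendors([
    "LINCOLN PARK ZOOLOGICAL SCTY",
    "THE LINCOLN PK ZOOLOGICAL SOC.",
])
for key, raw_names in groups.items():
    print(key, raw_names)
```

A key like this only suggests candidates. Two different companies can easily collapse to the same key, so the Contract number is still the safer field for real research.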

FUN WITH DATA

These are things you can do with this current data:

  • Use it as a jumping-off point for discovering more about city contracts and vendors. Now that there is a feed of new payments being pumped to this Web page, you can see the City's checkbook as money goes out. That can spark deeper looks into the actual text of the underlying contracts. Good stuff in there. Some of it is goofy. Note: you do not have to be a coder (I Am Not A Coder) to do this– you just have to copy/paste contract numbers, download PDFs, and read them
  • Make broad year-by-year calculations of spending by Department and whack them against the numbers in the published budgets for each year going back to 2000. Back of the napkin stuff, just for fun. Again, this is knowable stuff based on that which is already published
  • Do some fun things with the Vendor Name field. Rank them in order of dollars paid by Department. Do automated Google searches for the company names, boards of directors, Web site URLs and provide an index of the companies for others to annotate and build a directory
  • See how many of the General Contractors from this list are also city vendors
  • Clean up the Vendor Name field in general, combining obvious duplicates
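For the year-by-year idea in the list above, here is a back-of-the-napkin Python sketch that totals payments by Department and year from a CSV export. The column headers (AMOUNT, CHECK DATE, DEPARTMENT) and the sample rows are assumptions invented for illustration, not the City's actual export. It also assumes the rolled-up 1996 through 2002 rows carry a bare "2002" in the date field while other rows use MM/DD/YYYY, so taking the last four characters yields the year in both cases.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def totals_by_department_and_year(csv_file):
    """Sum payment amounts keyed by (department, year)."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(csv_file):
        # Assumed formats: "2002" for rolled-up rows, MM/DD/YYYY otherwise.
        year = row["CHECK DATE"].strip()[-4:]
        amount = Decimal(row["AMOUNT"].replace("$", "").replace(",", ""))
        totals[(row["DEPARTMENT"].strip(), year)] += amount
    return dict(totals)


# Made-up sample rows; header names are hypothetical.
SAMPLE = "\n".join([
    "VOUCHER NUMBER,AMOUNT,CHECK DATE,DEPARTMENT,CONTRACT NUMBER,VENDOR NAME",
    'PV0001,"1,000.00",2002,DEPT A,111,EXAMPLE VENDOR ONE',
    "PV0002,250.50,03/15/2010,DEPT A,222,EXAMPLE VENDOR TWO",
    "PV0003,$99.50,04/01/2010,DEPT A,222,EXAMPLE VENDOR TWO",
    "PV0004,10.00,04/02/2010,DEPT B,333,EXAMPLE VENDOR ONE",
])

totals = totals_by_department_and_year(io.StringIO(SAMPLE))
for (department, year), amount in sorted(totals.items()):
    print(department, year, amount)
```

Decimal is used instead of float so the sums stay exact to the cent. Keep in mind that any "2002" total here would really cover seven years of payments, so it can't be compared to a single year's budget.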

MOAR IDEAS

This Payments stuff is clearly just a first step when it comes to structured data published out of the city's financial information system. I'd like to see more connections among payment data, vendor data, finished work, and so on. Data is almost never interesting by itself. It's the connections that make it interesting.

Chicagoans really need to get a better view of what we're getting for our money. And I don't mean this in an investigative reporter-style way. I mean that we should be able to look at a voucher and be able to see what we got out of it. And it's not all on the City to provide the info.

I want to see if other contractors could do it more cheaply. If a contractor got beat out of a job, they should use this data to prove how they could have done it for less. It's one thing for nutty developers to grab all of this data and make broad connections. It's another thing altogether for nutty business owners to take teensy slices of this data and make teensy judgements about it.

In the same vein, I want contractors who were awarded the work to make claims about quality, or better staff, or a better return on investment than the cheaper option.

I want to see the weekly report that the contractor provided related to the deliverables. Let's move past feet-to-the-fire-ism and move toward free market public relations. ("Yes, we got paid today, and here's what we did to earn it.")

I want to see a picture of what we got for the money, whether it's a bridge or a bucket. Tie the payment system into the Home Depot Point of Sale application. Provide VISA card-style itemized purchases. Why not? It certainly exists somewhere, why not everywhere?

Everything is so disconnected for now– there is the record of progress on the LaSalle Intermodal at Congress Parkway and Financial Place here, a detailed PDF of the work over here, the 2nd Ward Alderman talks about it over here, etc. We all need to know we're all talking about the same thing. The contract UID is the key, and we all need to find ways to embed them into our lives more easily.

But that's for tomorrow. All hail the city of Chicago, as well as the City of Chicago.

(Bonus link: original research on ICAM, the primogenitor of all Web-based crime mapping applications that started off as a PC-based MapInfo 2.0 application in 1995).

 

Stoked for Muti

About a year ago I won two tickets to the Chicago Symphony Orchestra at a benefit for the Chicago Opera Theater (thanks, Angel!).

It was two tickets, and there was a wide range of dates to choose from. Opening night and other big performances were not available, and after parsing the schedule closely, we chose February 3, 2011– a random Thursday in the dead of winter.

Then Muti took ill in October and the whole season got muffed. Now he's on the mend and his return date is– Feb 3. Should be a great night.

BatchGeo, Socrata, and Vacant Building Service Requests in the City of Chicago

I was knocking around on the City of Chicago's Data Portal recently and I was impressed to see this data set:

Vacant and Abandoned Buildings Service Requests. Here's how they describe it:

Data set contains all 311 calls for open and vacant buildings reported to the City of Chicago since January 1, 2010. The information is updated daily with the previous day's calls added to the records. The data set provides the date of the 311 service request and the unique Service Request # attached to each request. For each request, the following information (as reported by the 311 caller) is available: address location of building; whether building is vacant or occupied; whether the building is open or boarded; entry point if building is open; whether non-residents are occupying or using the building, if the building appears dangerous or hazardous and if the building is vacant due to a fire.

The Data Portal site has been the subject of widespread scorn since its launch, because its main focus seemed to be on providing lists of FOIA requests submitted by reporters.

This was considered a sly finger in the eye by most observers. Goose, gander, etc.

That's why I was surprised to see such a high-quality data set there. It is updated daily and it contains great details. It's not hard to see how a list like this can be useful to community groups and concerned neighbors.

The City's Data Portal runs on software from Socrata, a really solid company out of Seattle. I've seen their services evolve over the last few years, and they are on to something. At every turn, their interface is about extending and sharing data. They take the verbs of video sharing (embed, discuss) as well as programming (export, filter) and Web development (visualization) and apply it to civic data. Here's an embed of the Vacant Building data:

Vacant and Abandoned Buildings Service Requests

Powered by Socrata

That's a worthwhile copy/paste right there.

So I downloaded the data (back over the Christmas break) and looked around. I really don't have much on the technical skill front. I can do some things, but I cannot natively program anything.

So the point is, I downloaded the data in CSV format and really couldn't do anything with it. I opened it in Microsoft Excel and marveled at the 9,601 records going back to February 7, 2008. I got that raw feeling of knowledge and power. It was pretty cool.

Think of all of the calls to the city's 311 system that led to this spreadsheet on my desktop. How many call center operators had to ask how many questions of how many people to pull this detail out of them? We really do rail about city workers and corruption and Pam Zekman-craziness. But this is some great work product, and I am grateful for them.

The first thing I did in Excel was concatenate the various fields in the source data that made up the valid street address. Once I had a single string– and then hid the original, separate fields– I was ready to map the data.
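For anyone who would rather script that concatenation step than do it in Excel, here is a minimal Python sketch. The four column names are hypothetical stand-ins for whatever the export actually calls its address pieces, and appending "Chicago, IL" is an added assumption to help a geocoder place the address.

```python
# Hypothetical column names for the separate address pieces.
ADDRESS_FIELDS = [
    "STREET NUMBER",
    "STREET DIRECTION",
    "STREET NAME",
    "STREET SUFFIX",
]


def build_address(row, city="Chicago", state="IL"):
    """Join the non-empty address pieces of one row into a single string."""
    parts = [row.get(field, "").strip() for field in ADDRESS_FIELDS]
    street = " ".join(part for part in parts if part)
    return f"{street}, {city}, {state}"


# Made-up row for illustration.
example_row = {
    "STREET NUMBER": "123",
    "STREET DIRECTION": "N",
    "STREET NAME": "EXAMPLE",
    "STREET SUFFIX": "AVE",
}
print(build_address(example_row))
```

Skipping empty pieces matters because some rows will be missing a direction or a suffix, and doubled spaces can trip up address matching.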

For that, I tried BatchGeo for the first time. I found it to be crazy-simple and super-powerful. All I had to do was copy/paste the fields into a box:

Batch-geo-copy-paste-box

At first, I tried to map all of the data, but the free BatchGeo system only goes up to 5,000 records. They have a paid service– Maptive– for larger data sets. I tried 5,000 records for a while, but the system kept choking on that as well. So I decided to go back to July 1, 2010, which was about 2,500 records.
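The trimming step can be sketched in Python as well: keep only requests on or after a cutoff date and cap the result at the free-tier ceiling. The "DATE RECEIVED" column name and the sample rows are assumptions for illustration, and the MM/DD/YYYY date format is a guess at how the export writes its dates.

```python
from datetime import date, datetime

FREE_TIER_LIMIT = 5000  # the free BatchGeo ceiling described above


def rows_since(rows, start, date_field="DATE RECEIVED", limit=FREE_TIER_LIMIT):
    """Keep rows dated on or after `start`, newest first, capped at `limit`."""
    dated = []
    for row in rows:
        # Assumed date format: MM/DD/YYYY.
        received = datetime.strptime(row[date_field].strip(), "%m/%d/%Y").date()
        if received >= start:
            dated.append((received, row))
    dated.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in dated[:limit]]


# Made-up rows; the column names are hypothetical.
sample = [
    {"DATE RECEIVED": "02/07/2008", "ADDRESS": "1 EXAMPLE ST"},
    {"DATE RECEIVED": "07/01/2010", "ADDRESS": "2 EXAMPLE ST"},
    {"DATE RECEIVED": "12/22/2010", "ADDRESS": "3 EXAMPLE ST"},
]
kept = rows_since(sample, date(2010, 7, 1))
print(len(kept), kept[0]["ADDRESS"])
```

Sorting newest first means that if the cap does kick in, it is the oldest requests that get dropped.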

This worked like a charm: Vacant and Abandoned Buildings Service Requests in Chicago, IL from June 1 to December 22, 2010.

View Vacant and Abandoned Buildings Service Requests in a full screen map

After I looked at the data on the page, I realized that the first item– the last one reported before I downloaded it from the City Web site– was for 1746 E 75TH ST, and was dated 12/22/2010. This is next door to the terrible fire and building collapse that killed two firefighters on the same date. This item was most likely entered by a city worker after the fire next door.

Public data is about life and death. The firefighters were inside the building when they died because they were searching for homeless people they believed were inside.

We're winning– more and more data is being published, more politicians and policy makers are understanding the value of it. We need more developers to devote themselves to turning this stuff into better cities.

AldermanicWebsites.com– a Comprehensive Take on Chicago Political Web Sites

Here's something I worked on over Christmas vacation: AldermanicWebsites. It is a fun/manic little Web site that "contains links to and reviews of the Web sites, Facebook pages, Twitter accounts, and other Web referencia for each of the 349 people who filed nominating petitions to run for Alderman in one of the cities wards."

Here's some more copy/paste explanation:

You can read the detailed notes on the evaluation criteria here. The best political reporting Web site in Chicago, Early and Often, provided lots of the early info that feeds this site.

For each candidate in each ward, we link to the campaign Web site and provide a screenshot of the homepage. We’ve got the first sentence of their bio and their slogan if they have one. For each Web site, we provide one thing that is unique about it.

The next obsession is the stars in campaign imagery, covering whether they use stars in their design and whether the star is the unique six-pointed star of the Chicago flag.

There’s links to any Facebook, Twitter, and EveryBlock accounts maintained by the candidate. There are some amazing discussions going on among neighbors and candidates.

If you’re looking for info on a particular candidate, your best bet is to search for the candidate’s name in the search form on the right.

If you’re interested in examples of particular technologies used by particular candidates, the Quick-See pulldown menu is your best bet.

For instance, you can see all candidates using the WordPress Web development platform, all candidates using PayPal to collect contributions, or everyone who uses Constant Contact to send out mass emails.

Same goes for wards– just choose the ward you're interested in (the 24th is super-lively) and you’ll see all candidates. The Ward list is in alphabetical order (when the ward number is spelled out). That’s a little goofy, I know– what can I say; I have limited skills.

If you’re just interested in the highlights (and low lights), I would stick with the “Analysis” links. That’s where I boil down all of the sites into some upshots.

I like the Analysis best– so far I've got five posts there, covering everything from good stars and bad stars to remarkable sites and popular Web development platforms. And you've really got to view source to see anything.

More to come, including project documentation– I used about two dozen tools to make this thing. Stay tuned!