Few things puzzle me more than a government entity actively hiding, obfuscating, or misplacing its records. Sure, the excuses range from “We can’t,” (flakspeak for “We don’t know how”) to “We just don’t feel comfortable doing that right now” (flakspeak for “I’m so scared I just peed myself”), but a no is a no regardless of reason.
A big fat nondisclosure is a loss for all parties involved. The journalist loses because they can’t pursue a story or fully contextualize one they already have. The public loses because they’ve been locked out, labeled as untrustworthy by the government they fund.
Then there’s another loser: the government itself. A nondisclosure makes the entity seem secretive, makes it seem like it has something to hide. A nondisclosure does no favors to officials, keeping the public in the dark and making it impossible for citizens to draw conclusions about how officials carry out their interests.
For years, I believed in the dichotomy of openness. An entity was either open or closed. Arizona is open. Phoenix is closed. Columbia is open. Springfield is closed. Missouri is open. And so on.
But there’s a third level that is more beneficial to all parties than mere openness: Proactive openness.
Proactive openness is providing information as it comes, nullifying the need for a records request. It is providing information in machine-readable formats, eliminating the need for scraping that’s cumbersome to both the scraper and the scrapee. Proactive openness is providing data in its raw form, doing away with spin because there are no aggregates or analysis to sugarcoat the truth.
What I call proactive openness is actually an eGovernance movement gaining traction around the country. A recent survey ranked the programs, finding that Washington D.C. and Portland are the best around. Portland, it should be noted, was so miffed at its showing, it even went out and ordered an RFP to get some citizens involved in making things better.
Et tu, Phoenix? Not surprisingly, the fifth-largest city in the nation doesn’t make the list. Not anywhere. And what’s sad is there’s no reason it has to be the case. From my interactions with city personnel, it appears staying away from eGovernance techniques has been a deliberate choice. Phoenix doesn’t even provide electronic data when requested under public records law, so I can only imagine the horror at the concept of sharing data electronically before it’s even requested.
I can understand the fear of being open. Often, people who request documents are looking for shady behavior. Their motives alone make them suspect. But fear of transparency? That’s beyond me. There are plenty of good reasons to pursue proactive openness:
- The public can do the work. A collection of APIs, or a data warehouse as provided by OCTO in D.C., turns every citizen into a web producer. Rather than spend weeks toiling over design and UI, government IT people could focus on simply opening the spigot, and move on to bigger and better things. Or, I guess, making sure the municipal judges have the right screen saver. Whatever. Opening up saves time and taxpayer money while increasing productivity.
- It adds new perspectives. There probably aren’t very many watershed experts/tax policy advisers on the city payroll. Just a guess. Opening data allows for invested, educated and interested citizens to become unpaid interns of a sort, offering their expertise on issues that otherwise would get bogged down in bureaucracy.
- It can save money. Or at least allow the city to better allocate its resources. Check out the list of PIO contacts in the city of Phoenix. That’s a lot of people paid to bang out press releases. When information is freely available, press releases become less necessary to get out a message. At the same time, a transparent city has to walk the walk. If actions don’t match up with words, it won’t take a rocket scientist to figure it out. Just any dude with the Internets.
- No spin is the best spin. Raw data is like Scoop, the 1930s reporter guy. “Just the facts, ma’am.” Compared to the content of most press releases, it’s Gospel. Where did Councilman Bob use his P-Card last week? How many potholes are there that they haven’t found time to fill the one on my street yet? Questions could be answered quickly using some hypothetical data sources. And the answers would be trusted.
- Because we want it to be easy to care. Civic involvement is boring. Data are boring. But opening up civic data to the Web opens all sorts of possibilities for making both engaging and… dare I say it?… fun. Again, it’s easy to look to D.C. for examples. Want to find parking info? Go for it. Curious what those sirens were last night? You’re covered. Opening up — proactively opening up with close to real-time information — will make for a more plugged-in citizenry. Ideally, that’s in line with the City’s own desires.
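To make the pothole question concrete, here is a rough sketch in Python of how little work it takes once the data is machine-readable. Everything about the data is an assumption for illustration: the column names (`report_id`, `street`, `reported`, `filled`) and the sample rows are made up, and a real city export would have its own layout.

```python
import csv
import io
from collections import Counter

# Hypothetical export: a real city feed would have its own columns and URL.
SAMPLE_CSV = """report_id,street,reported,filled
101,Central Ave,2009-03-02,2009-03-09
102,Roosevelt St,2009-03-04,
103,Central Ave,2009-03-05,
104,7th St,2009-03-06,2009-03-20
"""


def open_potholes_by_street(csv_text):
    """Count reported potholes that have no fill date, grouped by street."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["street"] for row in rows if not row["filled"].strip())


counts = open_potholes_by_street(SAMPLE_CSV)
for street, n in counts.most_common():
    print(f"{street}: {n} unfilled")
```

No scraping, no records request, no press release. The same dozen lines would work on the full file if the city published one.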
Now obviously, I’m a journalist, and having government crack open its data would make my life better.
But this isn’t born entirely out of self-interest. Being in my admittedly unique situation has given me a front row seat to the inefficiencies of both government and the citizenry, inefficiencies that could largely be addressed by the type of eGovernance that but for the grace of god is catching on. I have had citizens call me up and ask me what happened on their corner last night. I have had PIOs pitch me stories about an uptick in requests for neighborhood cleanups.
Both deserve better.
One of the most confounding things about the pervasiveness of Caspio is how totally unnecessary it is. Many news sites opt to shell out $8/database/mo. to this service, ignoring a myriad of better, cheaper alternatives.
Part of the rush may be attributable to the, ah, “me too” mentality of the industry. Part of it, I assume, stems from ignorance.
There’s not much mere mortals can do about tweens at the mall. But I can do something about ignorance. And so, without further ado, I introduce the Absolutely Incomplete, Horribly Biased List of Ways to get Interactive Data on Your Website. The entries are listed from easiest to implement to most complex, so figure out where your organization can make hay and do so. Please.
For the record, the example pages I made myself are quick and ugly. I’ve left in obvious errors and haven’t optimized anything. Almost all of them are capable of doing much more, and there is lots of documentation and help available on all of the options listed.
Absolutely Incomplete, Horribly Biased List of Ways to get Interactive Data on Your Website
Zero programming solutions:
1. Zoho Creator. Zoho has been impressing me since Day 1. If Caspio represents the closest you’ve come to programming, you’ll be right at home with these guys. They provide a friendly GUI that allows for embedding into any webpage. They offer a wide range of tools to make your data useful outside of the constraints of the database. An API and coherent documentation of their weird scripting language (Deluge) gives these guys a clear leg up.
- Drawbacks. Cost, first and foremost. But Zoho’s pricing, IMHO, is more fair than Caspio’s bizarre cost-per-“datapage” method, instead charging for the amount of data you store. Like most of these options, Zoho also doesn’t reload the page every time the user performs a search. That’s better for the user, but if you expect the majority of your data traffic to come from repeat searches, it may result in a page view nosedive.
- Example: Missouri State University salaries.
2. Google Spreadsheets. You knew they had to make an appearance, right? Google Spreadsheets allows for embedding a bunch of data-backed “Gadgets” into a site. One option is a simple table, complete with filtering and sorts right off the bat. They also offer other basics and not-so-basics, like maps, org charts, timelines, etc.
- Drawbacks. It’s a table. A dang table. And that is all it’s going to be. The charts offer other opportunities, but this isn’t the sort of stuff you’re going to write home about.
- Example: State populations.
3. Dabble DB. Their video can explain it better than me. It’s fast fast fast.
- Drawbacks. Pricing scheme is still at work here. Again, it’s a cheaper, more reasonable setup than the datapage method. I have no experience with this system, but the web is mostly devoid of complaints. Tried it yourself? Tell me how it went.
- Example: Couldn’t find one, but they say it’s embeddable. Watch the video.
4. IBM’s ManyEyes. This is kind of a departure, since they do NOT offer your basic embeddable-search-and-report table as an option. They do, however, offer a pretty extensive collection of data visualizations, including everything from pie charts to cool New York Times-like block histograms and wordles.
- Drawbacks. One big pageview the whole time it’s loaded. Great if your shop is forward-thinking enough to charge advertisers by time-spent.
- Example: Click the links above!
Pseudo-programming solutions:
5. jQuery. According to Wikipedia, jQuery is a “lightweight JavaScript library that emphasizes interaction between JavaScript and HTML.” It’s also the object of my undying affection. JavaScript is one of the things that makes Web 2.0 so neat, allowing for all sorts of rich, interactive changes to a page (think Gmail, many Facebook applications, etc.). It’s also hard for me to wrap my head around. No more. A variety of plugins abstract what used to be a time-consuming job. Flexigrid and Tablesorter make HTML tables interactive with little code. Once Tablesorter is installed, for instance, you just drop a five-word JavaScript function at the top of a page, and the plugin takes over from there. Easy peasy, and good looking, too!
- Drawbacks. Absolutely none. Out-of-the-box, some limitations may be that the table has to exist on the page, which would probably drag out load times. But jQuery is an open-source solution, which means if there are drawbacks (like the no-page-view-for-you syndrome) someone, somewhere, can edit them out (or in, as the case may be). Ludicrously apropos.
- Example. The flexigrid link above, Example #3, shows what jQuery is capable of.
6. The Simile Project’s Exhibit. I’ve toyed around with Simile’s TimeLine offering before, but Exhibit was new to me. This is another great example of abstracting wonky code, leaving just the useful stuff. It only takes a handful of files to get up and running, and most of the work has been done for you. Because data is in one file, and formatting in another, Exhibit makes it possible to do just about any visualization of your records. Very slick.
Note: For me, the hardest part was understanding their data format. Luckily, they provide a tutorial for another tool, Babel, that takes care of even that.
- Drawbacks. Quickly slows down as the record count mounts.
- Example. How about a whole page?
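If the data format is the sticking point, a few lines of Python can do the reshaping too. This is a rough sketch, assuming the basic Exhibit JSON layout of a top-level `items` list in which each item carries a `label` and a `type`; check Exhibit’s own documentation before relying on it. The spreadsheet columns, the sample people, and the `csv_to_exhibit` helper are all made up for illustration.

```python
import csv
import io
import json

# Hypothetical spreadsheet export; column names are made up for this sketch.
SAMPLE_CSV = """name,department,salary
Pat Doe,Biology,52000
Sam Roe,History,48500
"""


def csv_to_exhibit(csv_text, label_column, item_type):
    """Build an Exhibit-style dict: a list of items, each with a label and type."""
    items = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = {"label": row[label_column], "type": item_type}
        # Carry the remaining columns over as plain properties.
        item.update({k: v for k, v in row.items() if k != label_column})
        items.append(item)
    return {"items": items}


exhibit_data = csv_to_exhibit(SAMPLE_CSV, label_column="name", item_type="Employee")
print(json.dumps(exhibit_data, indent=2))
```

Save the printed JSON to its own file and the data side is done. The formatting lives separately, which is the whole appeal.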
And I’d be silly to leave out the heavy hitters. Actual Programming:
7. PHP (or any other scripting language). There are tons and tons and tons of templates that claim to be able to build searchable, sortable tables and web forms with ease. But honestly, they may be overcomplicating matters. PHP and MySQL go together like milk and cookies, and some good people have made local installation a snap. Why not just go pick it up yourself with one of the myriad of tutorials out there?
- Drawbacks. PHP is plenty powerful. But the learning curve is the same learning curve any non-programmer faces when they first start dabbling with this stuff. I should know — I still barely understand what an include does. That said, there are some great “Getting started” resources out there, and most every community college offers an intro weekend course. Journalists are compulsive learners; put that compulsion to good use.
- Example. Name something cool on the Internet. There’s a good chance it’s done in PHP. TechCrunch, Facebook, feedburner, iStockphoto, Vimeo, YouSendit… the list goes on and on. As for news databases using it, check out the Arizona Republic’s ASU Salaries and dog registration databases.
8. Django/Rails. My “sort by difficulty asc” might have these and PHP flipped, depending on your learning style. If you need to know how things work through and through, PHP/MySQL is your fit. If you just want to get things done, a framework is what you need. These guys are all the buzz in the news industry now for good reason. They allow for rapid development, which works well with the no-holds-barred news cycle. My own experience with them is limited to desktop installations and mucking around, but I’ve been plenty impressed. Friendly script repositories make it easy to stand on the shoulders of giants, further streamlining the process.
- Drawbacks. Like PHP, this is real, live programming. And that has its own initial complications. But fear not! For Django/Python, there’s a handy guide called How to Think Like a Computer Scientist, available for free online in its entirety. I’m sure there’s a Ruby equivalent, too. Since they aren’t nearly as omnipresent as PHP, finding local classes is probably going to be tough. IRE’s newest boot camp fills that void.
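For a feel of what the “actual programming” options boil down to, here is a rough sketch of a searchable, sortable salary table in plain Python. It is neither PHP nor Django: Python’s built-in `sqlite3` stands in for MySQL so the example runs anywhere, and the table, the rows, and the `search` helper are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?, ?)",
    [
        ("Pat Doe", "Biology", 52000),
        ("Sam Roe", "History", 48500),
        ("Lee Poe", "Biology", 61000),
    ],
)

# Columns a reader is allowed to sort by.
SORTABLE = {"name", "department", "salary"}


def search(conn, department, sort_by="salary"):
    """Return rows for one department, sorted descending by a whitelisted column."""
    if sort_by not in SORTABLE:
        raise ValueError(f"cannot sort by {sort_by!r}")
    # The search term goes in as a parameter; only the vetted column name is
    # interpolated, since placeholders cannot stand in for identifiers.
    query = (
        "SELECT name, department, salary FROM salaries "
        f"WHERE department = ? ORDER BY {sort_by} DESC"
    )
    return conn.execute(query, (department,)).fetchall()


results = search(conn, "Biology")
for name, department, salary in results:
    print(f"{name:10} {department:10} {salary:>8,}")
```

A framework wraps this in URL routing, templates and an admin screen, but a search box and a sort link are still just the two arguments to `search`.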
As the title implies, this is by no means a complete list. I’ve missed a lot, and probably done an injustice to a few more. If you have anything you’d like to add, drop me a note and I’ll add it to the list or leave it in the comments.