The case for transparency

Few things puzzle me more than a government entity actively hiding, obfuscating, or misplacing its records. Sure, the excuses range from “We can’t,” (flakspeak for “We don’t know how”) to “We just don’t feel comfortable doing that right now” (flakspeak for “I’m so scared I just peed myself”), but a no is a no regardless of reason.

A big fat nondisclosure is a loss for all parties involved. The journalist loses because they can’t pursue a story or fully contextualize one they already have. The public loses because they’ve been locked out, labeled as untrustworthy by the government they fund.

Then there’s another loser: the government itself. A nondisclosure makes the entity seem secretive, makes it seem like it has something to hide. A nondisclosure does no favors to officials, keeping the public in the dark and making it impossible for citizens to draw conclusions about how officials are carrying out the public’s interests.

For years, I believed in the dichotomy of openness. An entity was either open or closed. Arizona is open. Phoenix is closed. Columbia is open. Springfield is closed. Missouri is open. And so on.

But there’s a third level that is more beneficial to all parties than mere openness: Proactive openness.

Proactive openness is providing information as it comes, nullifying the need for a records request. It is providing information in machine-readable formats, eliminating the need for scraping that’s cumbersome to both the scraper and the scrapee. Proactive openness is providing data in its raw form, doing away with spin because there are no aggregates or analysis to sugarcoat the truth.

What I call proactive openness is actually an eGovernance movement gaining traction around the country. A recent survey ranked the programs, finding that Washington, D.C. and Portland are the best around. Portland, it should be noted, was so miffed at its showing that it went out and issued an RFP to get some citizens involved in making things better.

Et tu, Phoenix? Not surprisingly, the fifth-largest city in the nation doesn’t make the list. Not anywhere. And what’s sad is there’s no reason that has to be the case. From my interactions with city personnel, it appears staying away from eGovernance techniques has been a deliberate choice. Phoenix doesn’t even provide electronic data when requested under public records law, so I can only imagine the horror at the concept of sharing data electronically before it’s even requested.

I can understand the fear of being open. Often, people who request documents are looking for shady behavior. Their motives alone make them suspect. But fear of transparency? That’s beyond me. There are plenty of good reasons to pursue proactive openness:

  • The public can do the work. A collection of APIs, or a data warehouse as provided by OCTO in D.C., turns every citizen into a web producer. Rather than spend weeks toiling over design and UI, government IT people could focus on simply opening the spigot, and move on to bigger and better things. Or, I guess, making sure the municipal judges have the right screen saver. Whatever. Opening up saves time and taxpayer money while increasing productivity.
  • It adds new perspectives. There probably aren’t very many watershed experts/tax policy advisers on the city payroll. Just a guess. Opening data allows for invested, educated and interested citizens to become unpaid interns of a sort, offering their expertise on issues that otherwise would get bogged down in bureaucracy.
  • It can save money. Or at least allow the city to better allocate its resources. Check out the list of PIO contacts in the city of Phoenix. That’s a lot of people paid to bang out press releases. When information is freely available, press releases become less necessary to get out a message. At the same time, a transparent city has to walk the walk. If actions don’t match up with words, it won’t take a rocket scientist to figure it out. Just any dude with the Internets.
  • No spin is the best spin. Raw data is like Scoop, the 1930s reporter guy. “Just the facts, ma’am.” Compared to the content of most press releases, it’s Gospel. Where did Councilman Bob use his P-Card last week? How many potholes are there that they haven’t found time to fill the one on my street yet? Questions could be answered quickly using some hypothetical data sources. And the answers would be trusted.
  • Because we want it to be easy to care. Civic involvement is boring. Data are boring. But opening up civic data to the Web opens all sorts of possibilities for making both engaging and… dare I say it?… fun. Again, it’s easy to look to D.C. for examples. Want to find parking info? Go for it. Curious what those sirens were last night? You’re covered. Opening up (proactively opening up, with close to real-time information) will make for a more plugged-in citizenry. Ideally, that’s in line with the City’s own desires.
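To make the pothole question above concrete, here is a minimal sketch of what “machine-readable” buys you. Everything in it is an assumption for illustration: the CSV layout, the column names and the rows are invented, and no city I know of publishes this exact file. The point is only that once the raw records are out there, the answer is a dozen lines of Python away.

```python
import csv
import io
from collections import Counter

# Hypothetical extract of a city service-request feed. The column names
# and every row are invented for illustration only.
SAMPLE_CSV = """request_id,type,street,status,opened
101,pothole,Elm St,open,2008-03-01
102,pothole,Oak Ave,closed,2008-03-02
103,streetlight,Elm St,open,2008-03-03
104,pothole,Elm St,open,2008-03-05
105,pothole,Main St,open,2008-03-07
"""


def open_potholes_by_street(csv_text):
    """Count pothole requests that are still open, grouped by street."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        if row["type"] == "pothole" and row["status"] == "open":
            counts[row["street"]] += 1
    return counts


if __name__ == "__main__":
    counts = open_potholes_by_street(SAMPLE_CSV)
    print("Open potholes citywide:", sum(counts.values()))
    for street, n in counts.most_common():
        print(street, n)
```

Swap the sample text for a real download and the same function works unchanged, provided the city keeps its column names stable. That stability is most of what proactive openness asks for.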

Now obviously, I’m a journalist, and having government crack open its data would make my life better.

But this isn’t born entirely out of self-interest. Being in my admittedly unique situation has given me a front row seat to the inefficiencies of both government and the citizenry, inefficiencies that could largely be addressed by the type of eGovernance that, by the grace of God, is catching on. I have had citizens call me up and ask me what happened on their corner last night. I have had PIOs pitch me stories about an uptick in requests for neighborhood cleanups.

Both deserve better.

Why I think the generational nonsense is so much BS

Ryan Sholin, Steve Yelvington, Shannan Bowen and others have been weighing in on the journalism generation gap. Got me to thinking of exceptions.

Tom Warhover, Executive Editor for Innovation (or something like that) at the Missourian. 50ish. Gets it. Wants more multimedia. Wants more data. Wants to provide it in ways that aren’t measured in inches. Gets excited by the new and wants to try it out.

He has the job of teaching the majority of students at the Missouri School of Journalism (a type that, I’m sure, is prevalent at similar schools around the country), who, you know, want to write for a living and can’t see why they need to do all this other stuff. Oh. My. God. I can’t tell you how many times I heard these people prattle on about their desire to write, and travel, and… that’s about the extent of it. Take photos? Video? Out of the question. The typical reaction to industry layoffs could be summarized as “More jobs for us!” Er, no. Not you, oh clueless one. Can they ever get it? Sure, but being young is by no means the equivalent of being clued in.

Don Wyatt, Executive Editor at the Springfield News-Leader. Gets it in a big way. Instituted online goals for reporters… reporters! Asked me about the feasibility of providing cell phone interfaces to data. Where’d that come from? His subscription to ESPN mobile, of course. (Are we supposed to acknowledge that 50-ish folks have cell phones?) Made sure to work recorders for every reporter into a tight budget, and even instituted some in-house training. Yeah, he’s 50ish, too.

To cast this split as generational is to ignore key truths. It creates a false Us v. Them along age lines that just doesn’t exist.

You can start cooking up ideas no matter your age. You can fail to see their use no matter your youth. There is no easy litmus test; the proof is in the pudding.

</Soapbox>

The “fair use” line in the sand

The thesis of that last post would probably be something like, “Free, useful APIs are routinely overlooked in many newsrooms, a policy that should be re-explored.”

APIs offer free content. They’re typically intriguing. And they’re built to be torn apart and rebuilt as you see fit. The negative image of them might be the new thorn in my ass: Sports stats.

“Stats,” in this instance, is all-encompassing. It’s the stats that make up the back of the baseball card. It’s W-L record. It’s divisional standings and game scores, league leaders and historical info. It’s ridiculously compelling, and with so many sources, they’re easy to get and can be rebuilt as you see fit.

The problem is that this time, you’re not getting permission. And that’s a whole new can of worms.

The dichotomy comes into play when we start flipping through the print edition. Vendors typically supply the staples, things like box scores and standings. But the minute we start rolling out aggregated statistics not collected by the vendor, we’re delving into new territory. And often the only way we can do that is by giving credit to the source we stole them from, typically ESPN, FOXsports, cbssportsline, or some other online wealth of sports knowledge. We could even go to the league homepage, where solid, reputable stats are aggregated.

But for some reason, we scoff when the same sourcing must be used for an online application. And that’s a shame. Maybe it’s a residual effect of growing up listening to some version of this before every game:

“This copyrighted telecast is presented by authority of the Office of the Commissioner of Baseball. It may not be reproduced or retransmitted in any form, and the accounts and descriptions of this game may not be disseminated, without express written consent.”

Does anyone have examples of getting around this Catch-22? It’s like we’re going thirsty in the ocean: surrounded by sports stats, unable to use any. Sports information seems like it should be the holy grail of online journalism. A creative, telling visualization would almost certainly draw repeat traffic. It’s continually updated information, and it’s highly relevant to a specific demographic. If you don’t believe me, start a fantasy football league in your office and try batting away the takers.

The possibilities with this stuff are numerous. But our hands are tied.

Where does fair use begin and end? What’s the public domain, what’s proprietary, and is there any middle ground?

Our pride could impede progress

I get a little silly over a good web service. If there’s an API involved, all the better. Programmableweb has a prominent place in my feed reader, and I try to keep fairly abreast of what the rest of the online world is cooking up and how best I can use it.

What worries me is how little of it is allowed to translate into my industry. Even more depressing, I think, are the reasons so little of it is allowed to take root. I think they can be boiled down to a simple character flaw: Pride.

Maybe the best example of this is Yelp. Yelp is a great site, which, at its least advanced, is basically a phone book. But on top of that, it adds a layer of user reviews and social networking, powerful features that have made it the peer-review go-to source on the Internet.

About a year ago Yelp released an API that allows pretty much anyone to bring the site’s reviews, ratings, neighborhood searches, etc. into any other Website.

To me, that’s an application that begs for a newspaper.com to take advantage:

  • Few news orgs have truly worthwhile dining/entertainment/calendar sites.
  • Almost all, though, make an attempt.
  • With resources being what they are, it seems like a natural fit to take what Yelp is offering and use it to cut down the jobs to be done. The cost is zero, and meanwhile, you’ve added a great feature to your site.

In return for having access to Yelp’s data, all a news site has to do is slap a Yelp logo on the results.

And that, unfortunately, is where the wheels fall off the bus. I’ve heard publishers say things like, “If we do this, we’ll legitimize Yelp. And they are the competition.”

Maybe it’s a sign of how far to the Dark Side I’ve come, but I don’t see it that way. In fact, I see it much the opposite. Showing that we are willing to use free data from a “competitor,” when offered, will make us seem that much closer to getting this whole Web thing. Using their data, I think, legitimizes us to the early adopters that have already embraced the useful tool.

To be sure, Yelp is probably the most controversial example of this. Other services based more on functionality than content would probably be an easier sell within a newsroom. But even then, I don’t think we’re taking advantage as much as we could and should.

Take Twitter. Totally, 100 percent free. Great functionality, out-of-the-box SMS support, solid and growing base of users, etc. Those are things any news site could use. Slowly, I think, we’re coming around to that. We tweet blog posts. But there has to be more there, more that’s available to us because, again, they’ve given us the keys.

It seems like a simple script could turn the Twitter API into something much like OhDontforget. Does that have a place on your newspaper’s entertainment site?
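Here is roughly what I mean by “simple script,” as a sketch and nothing more. It does not touch the real Twitter API: the `post` argument is a hypothetical stand-in for whatever call actually publishes a message, and the user names, events and times are made up. The only part I’m vouching for is the reminder logic itself, which picks out events starting soon and writes a short note to each person who asked.

```python
from datetime import datetime, timedelta


def due_reminders(events, now, lead=timedelta(hours=2)):
    """Build reminder texts for events that start within `lead` of `now`.

    `events` is a list of (user, title, starts_at) tuples.
    """
    messages = []
    for user, title, starts_at in events:
        if now <= starts_at <= now + lead:
            when = starts_at.strftime("%H:%M")
            messages.append("@%s Don't forget: %s at %s" % (user, title, when))
    return messages


def send_all(messages, post):
    """Hand each message to `post`, a placeholder for a real publishing call."""
    for text in messages:
        post(text)


if __name__ == "__main__":
    # Invented sample data: one event tonight, one tomorrow morning.
    now = datetime(2008, 4, 25, 17, 0)
    events = [
        ("reader_one", "Gallery opening", datetime(2008, 4, 25, 18, 30)),
        ("reader_two", "Farmers market", datetime(2008, 4, 26, 8, 0)),
    ]
    send_all(due_reminders(events, now), print)
```

Passing `print` as the `post` function keeps the sketch runnable on its own. Wiring it to an actual messaging service is the part that would need that service’s documentation and terms.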

There’s another argument to be made here, but I’m late for work. More to come.

Journalism, journalists and money

Over the past few weeks, a sort of taboo subject has continued to bubble up. More buyouts at some of the best news orgs in the country. Plunging stock prices. An intriguing article on the relationship between the newsroom and print advertising.

As if fated to be seen in contrast, the Pulitzer winners were announced, and some fantastic work got the credit it was due. Google beat out Wall Street projections and posted a profitable first quarter. And some yayhoo railed against Rob Curley, calling him a schlub because his products don’t make money (I have no clue whether that’s true or not, but, regardless, the barb was lofted).

So I ask, with both feet firmly planted in the journalism camp: At what point does all of this become our problem? At what point do we, as journalists, as the webby voices in the good ol’ MSM, start actively thinking about how we can make it better?

If there were a theoretical continuum mapping out the stance on this problem in the average newsroom, I’d wager the needle would be staunchly on the “Not my job” side of things. Any product of a worthwhile J-School has heard the horror stories: Staples Center, the CBS New Year’s Eve gaffe, various examples of ad placement for story coverage or spiked stories to preserve an advertising relationship. The overall message many students walk away with is, “If you think about how any of your work will make money, you’re dirty.”

This isn’t true. What’s more, it’s hurting us. Go over to TechCrunch and check out the list of startups. The plurality of those applications would have been ideal undertakings for a news organization. Those ideas were cooked up to make life better or more interesting, sure, but they were also meant to make money. Generally, they’re succeeding at both.

We’re doing good work, too.

But too often, we’re leaving it in the hands of advertising people to see that it makes money. Their solution, inevitably, is, “Slap an ad on it!” “Upsell X, Y and Z!” or my favorite, “You can’t do that, we sell something similar in print.”

These aren’t wrong answers. Well, except for that last one. Unfortunately for all of us, it isn’t working. It’s time for a new plan. What about allowing subscription cell phone updates for our best apps, or a choice for ad-supported and free? What about harvesting user information and allowing for targeted, premium advertising (The Facebook model)? What about sponsorship?

The journalists who are doing this kind of work are spilling over with ideas. We’re passionate. We love what we do and we want to keep doing it. And I honestly believe that if we started thinking about this, from Project Day 1, we’d come up with something that could work.

To be clear, I’m talking about turning our best ideas into sources of money, not building ideas around sources of money. That’s an important distinction, and a tougher pill for our bosses to swallow. We don’t have to compromise our passions to make this go. Doing so would subvert the entire undertaking. But the belief that our employers should let us do good work because that’s just what news organizations do is somewhere between dead and dying. We have to prove ourselves.

And we can.

ADDENDUM #1: Any and all comments appreciated. If you think this means I lost my soul, please say so.

ADDENDUM #2: Heard from an advertising/marketing person who was looking to repair a relationship with a news editor after mentioning that a new product was mostly being created because it would lead to new revenue. At the very least, that denial of the business side of this business gotsta stop.

In defense of “Raw Data”

Raw data get a bad rap. They’re told they mean nothing, unless someone goes in and adds “context.” If I were data, I’d hate context. Always stealing my thunder. One day data are told they’re the future of journalism, and the next day journalists are complaining about how they’re used.

It’s a tough life, being data. But I think the rollercoaster ride they’ve been through is unwarranted. Why? Because posting data is the 21st century equivalent of what we’ve been doing all along. As in, ask your grandfather how he ran his newspaper, and you’ll start to see some parallels.

If you, like me, don’t have a grandfather who knew jack about the news, let me offer up this paragraph from “The Elements of Journalism” (emphasis my own):

“The individual reporter may not be able to move much beyond a surface level of accuracy in the first story. But the first story builds to a second, in which the sources of news have responded to mistakes and missing elements in the first, and the second to a third, and so on. Context is added in each successive layer.”

I’m a 20-something who’s known computers my whole life, so I can’t speak from experience on this one. But I’ll bet good money that the progression laid out in the above passage played out in but a percentage of stories — that some simply stopped at the first story. There was no added “context”; there were no complaints from sources.

I don’t think that’s bad. Carl Bernstein talks about something called “the best obtainable version of the truth.” And in many cases, that’s what a dataset can offer.

Take public salaries (Please! Really folks, I’ll be here all week). Dozens of newspapers have posted the salaries of their local County, school district, Board of Equalization, PTA, Myrtle’s knitting group, and whatever else they can get their hands on. CAR practitioners have bemoaned the move, saying it’s just not journalism.

Here’s why I take issue. When I worked in the IRE Resource Center, I had the thankless task of processing and reading hundreds of “investigative” stories from all around the globe. Most of them came in around the time of the annual IRE contest, and every year there would be a slew of stories with headlines like, “How much do they make?” It’s the easiest CAR story in the book, right? Find your local municipality’s salary database, ask some questions to fill out a story, and score a talker.

In about 70 percent of the stories I read, that was the methodology. In its entirety. As these were contest entries, I can confidently say that there were no follow-ups to those stories; they were meant to stand on their own.
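For a sense of how thin that methodology is, here is a sketch of the data half of a “How much do they make?” story. The names, departments and salaries are invented, and the column layout is an assumption; a real municipal file will look different. The summary it produces (headcount, median and top salary per department) is about all the analysis many of those entries contained.

```python
import csv
import io
from statistics import median

# Invented sample rows standing in for a municipal salary database.
SAMPLE_CSV = """name,department,salary
A. Alvarez,Police,61000
B. Brown,Police,58000
C. Chen,Parks,41000
D. Diaz,Parks,39000
E. Evans,City Manager,150000
"""


def salary_summary(csv_text):
    """Return {department: (headcount, median salary, top salary)}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    by_dept = {}
    for row in reader:
        by_dept.setdefault(row["department"], []).append(int(row["salary"]))
    return {
        dept: (len(pay), median(pay), max(pay))
        for dept, pay in by_dept.items()
    }


if __name__ == "__main__":
    for dept, (count, mid, top) in sorted(salary_summary(SAMPLE_CSV).items()):
        print(dept, count, mid, top)
```

If that is the whole of the reporting, the gap between publishing the story and publishing the table is smaller than we like to admit.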

Were those journalism?

Many will put those stories on a journalism continuum. “Yes, it was journalism, just not very good journalism.” That argument assumes that, just because the information was laid out in story form, journalism took place.

You can’t have it both ways. If a crappy salary story is journalism, then a raw salary database is journalism.

Me? I’d argue it’s somewhere in the middle. Journalistic, perhaps; done with the intent of providing context to the community, rather than context to the numbers.

It’s not a catch-all argument. Few are. But I think we need to admit that, sometimes, data on its own can achieve the same outcome as a story, especially when “findings” in analysis are lackluster. Dare I say it, Wire fiends? In some cases, I even think it allows us to achieve more with less.