Ha
A Google Image search for “Caspio” yields my CaspioFail. For old time’s sake:

We have a working model and an admin that allows us to interface with its data. It’s functional and looks pretty, but it’s still not terribly helpful. Let’s soldier forth.
Managing Users and Permissions- First things first. You’ll notice you have a built-in app (provided django.contrib.auth was installed, as it should be) that includes two editable models: groups and users. This is, amazingly, where you can control groups and users.
It should be pretty intuitive. Dive into the “users” model and add a test account. Save that and you’ll be taken to a second page where you see how customizable each account can be. Here’s a quick rundown of what everything does:
Displaying information in the model admin- In the previous post, anything we added to the “High Schools” model was displayed as “HighSchool object,” which is about the least useful name for anything ever. This is easy to address.
Edit your admin.py file to look like this:
//admin.py
...
class HighSchoolAdmin(admin.ModelAdmin):
    list_display = ('name',)
admin.site.register(HighSchool, HighSchoolAdmin)
All we did there was access the list_display property of our ModelAdmin and tell it to show us the name of the school. Access that model admin section again, and you’ll see “HighSchool object” now reads the actual name of your school. Adding other fields to the page is just a matter of stringing field names together in the same format. For example, an admin displaying all of our fields would just require:
//admin.py
...
    list_display = ('name', 'principal', 'enrollment', 'website')
The order you call them in here denotes the order they appear in the admin.
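One thing worth a second look in the first example is the trailing comma in ('name',). That is plain Python rather than anything Django-specific: parentheses alone don't make a tuple, the comma does. Here is a quick sketch you can run in any Python shell (the variable names are just for illustration):

```python
# Parentheses alone don't create a tuple; the comma does.
not_a_tuple = ('name')      # just the string 'name'
one_field = ('name',)       # a tuple holding one field name
all_fields = ('name', 'principal', 'enrollment', 'website')

print(type(not_a_tuple).__name__)   # str
print(type(one_field).__name__)     # tuple
print(list(all_fields))             # column order follows tuple order
```

Forget the comma on a single-field list_display and you are handing the admin a bare string instead of a list of field names, which is probably not what you meant.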
Making the model admin searchable- Making your model searchable is similarly easy to execute. Again, it requires only one line in the admin:
//admin.py
...
class HighSchoolAdmin(admin.ModelAdmin):
    search_fields = ('name', 'principal')
admin.site.register(HighSchool, HighSchoolAdmin)
Put that in your pipe and smoke it. When you reload the High School page, you’ll see a handy-dandy search box that will search the values of the school name and principal fields.
By default, searches are enclosed in wild cards. So searching for “Smith,” in this example, will return both John Smith High School and principal Hymie Goldsmith.
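If it helps to picture what that wildcard search is doing, here is a rough pure-Python imitation: a case-insensitive substring match against each searched field. This is only a sketch of the behavior described above, not Django's actual implementation, and the sample records and the search_schools helper are made up for illustration.

```python
# Hypothetical sample data standing in for rows in the HighSchool table.
schools = [
    {"name": "John Smith High School", "principal": "Ann Lee"},
    {"name": "Central High School", "principal": "Hymie Goldsmith"},
    {"name": "Desert Vista High School", "principal": "Maria Lopez"},
]

def search_schools(records, term, fields=("name", "principal")):
    """Return records where any searched field contains the term (case-insensitive)."""
    term = term.lower()
    return [
        record for record in records
        if any(term in record[field].lower() for field in fields)
    ]

for match in search_schools(schools, "Smith"):
    print(match["name"], "/", match["principal"])
```

The first record matches on the school name and the second on the principal, which is the same double hit described above.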
Making fields optional - There is a lot more that can be done with admin.py, but more common problems seem to crop up at the individual record level. Many of these issues need to be addressed in models.py. For instance, let’s say we get three records into our project before we realize that private schools don’t have principals. As things stand, any attempt to leave that field blank will be met with a stern warning from Django saying the Principal field is required.
To make the field optional, we’ll add an argument to its entry in models.py like so:
//models.py
class HighSchool(models.Model):
    name = models.CharField(max_length=50)
    principal = models.CharField(max_length=50, blank=True)
    enrollment = models.IntegerField()
    website = models.URLField(verify_exists=True, max_length=120)
Explicitly allowing blanks means the field is now optional. On the “Add High School” page, you’ll notice the name of the field is in regular typeface as opposed to bold.
Adding some signposts- In my experience, the Django admin is significantly more intuitive than PHPMyAdmin. But that’s still not always enough. Sometimes you need to be able to give your users hints along the way.
This, too, is an easy addition. Let’s say we’ve received a frenzied e-mail from our well-intentioned temp saying he doesn’t know what to put in the “Enrollment” field. He has different counts for each grade, he says, but when he tries to list them, he gets an error saying “Enter a whole number.” Help?
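The error makes sense once you remember the field is an IntegerField: it wants a single whole number, not a list. Here is a rough plain-Python sketch of the difference (the grade counts are invented, and this mimics the idea rather than Django's own validation code):

```python
# Invented per-grade counts like the ones the temp might have on hand.
grade_counts = {"9th": 410, "10th": 395, "11th": 380, "12th": 350}

# What the temp tried to type into the Enrollment field.
typed_value = ", ".join(str(count) for count in grade_counts.values())

try:
    int(typed_value)            # "410, 395, 380, 350" is not a whole number
    accepted = True
except ValueError:
    accepted = False

# What the field actually wants: one value for the whole school.
total_enrollment = sum(grade_counts.values())

print(accepted)            # False
print(total_enrollment)    # 1535
```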
Let’s dive back into models.py and add some help text:
//models.py
class HighSchool(models.Model):
    name = models.CharField(max_length=50)
    principal = models.CharField(max_length=50, blank=True)
    enrollment = models.IntegerField(help_text="Enter one value for the whole school.")
    website = models.URLField(verify_exists=True, max_length=120)
Well hot damn. Our handy-dandy little signpost has been cleanly inserted right under our field, making this thing all but idiot proof. At this point, clicking the “Add High School” button should give you something much like this:

Between these two posts, you should have enough to rapidly build some simple data entry tools for brand new datasets your organization creates on its own.
Of course, that’s not usually the case. Often, especially in journalism, we need to interface with existing data. Our next installment will cover how to get our arms around legacy datasets and get them into our system.
Data entry is like Elvin from the Cosby Show: You can’t ignore it when it’s there, but it’s too small of a problem to actually get riled.
So you deal. Some news organizations distribute Excel templates in hopes of getting back something somewhat standardized, an approach that generally results in a data cleaning nightmare on the back end, lost sleep and serious drinking problems. Some try to use tools like PHPMyAdmin, only to hear complaints from users who don’t “get” the system.
Wouldn’t it be great if there were something in the middle? Something as rigid as a PHPMyAdmin instance yet as simple to deploy as an Excel spreadsheet?
The Django admin makes this possible. With less than an hour of work you can deploy a beautiful, user-friendly data entry system that adheres to standards you set. The best part? Anyone can do it. (Yes, that means you.) And even if you never use Django for web production, it’s still a painless way to slay the data entry dragon.
This has been a huge help at work, so I’m going to be presenting a short intro to the subject at next week’s NICAR conference. In preparation, and as a resource for any poor soul who might wander in, I thought I’d put together this little how-to.
To get started, find the file in your django install called django-admin.py. Using a command window, get to that directory and type
//in command window
django-admin.py startproject testproject
That will create a new project, within which you can have many different applications. For testing purposes, let’s say our data entry task is to create a resource with all of the high schools in town. To make that happen, we’d do like so:
//in command window
cd testproject (gets us into the project we just created)
django-admin.py startapp highschools
Take a look at what just happened. By typing the startproject and startapp commands, you’ve created the skeleton structure of a fully functional django project. Inside a directory called ‘testproject’, there should be a subdirectory called ‘highschools.’ And within ‘highschools’ there should be a handful of files.
The final part of setting up is telling Django this thing is ready to go. Dive into a file called ’settings.py.’ It should be located in the testproject directory. At the very bottom of the file, you’ll see a section called INSTALLED_APPS. Add a line pointing to our application, and if it’s not there, add ‘django.contrib.admin,’ too.
//settings.py
....
INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.admin',
    'testproject.highschools',
)
One last bit of housekeeping is to make sure Django’s admin is accessible. To make sure, open up the urls.py file in testproject and uncomment the lines it says to uncomment to use the admin. Save your changes, and we’re off to the races.
Now to do some work. One of the files in the ‘highschools’ directory is called ‘models.py.’ Open it up in a text editor (on Windows I like Notepad++; on a Mac I like TextWrangler) and let’s get started.
Models.py is basically where we tell Django what our data will look like. Using field types found in the documentation, we can define our data however we think it makes sense. For our high school application, let’s say we want to track each school’s name, its principal, its enrollment and its website. A bare bones models file might look something like
//models.py
from django.db import models
class HighSchool(models.Model):
    name = models.CharField(max_length=50)
    principal = models.CharField(max_length=50)
    enrollment = models.IntegerField()
    website = models.URLField(max_length=120)
Let’s take a look at what we did there. Most of this is boilerplate, so we’ll focus on the actual definition of our data. You can see we’re using three different field types. CharFields are just as you suspect — a field with a set amount of characters. We’re required to give a max_length argument, which specifies how long the field could be. IntegerFields are also just as you would assume them to be; fields that will contain Integer values. Lastly, we use a URLField. As it stands now, this is little more than a text field. But let’s show off a neat out-of-the-box trick and make it a bit more useful.
Let’s say we want to guard against typos in the URL field. One way to do that would be to make sure any value typed corresponds to an actual site. To add that functionality, we simply add some language to our existing model:
//models.py
....
    website = models.URLField(verify_exists=True, max_length=120)
Perfect. Believe it or not, the hard work of all this is done. To make sure it works, change into your testproject directory and run the following command:
//in command window
manage.py syncdb
Running this tells Django to make the changes you’ve outlined in models.py. In this case, it’s going to actually create a database for you as outlined in settings (If you used BitNami, this is handled automagically and the changes will be made in MySQL). If it’s your first time running the command, Django will also ask you to create a superuser account. Do so, and remember the login and password you supply. This is your master account.
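For the curious, here is roughly what the resulting table looks like, sketched with Python's built-in sqlite3 module. Treat it as an approximation: the exact SQL Django generates depends on your database backend and Django version. The table name follows Django's default of app name plus model name, and the sample row is invented.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Approximation of the table syncdb creates for the HighSchool model.
conn.execute("""
    CREATE TABLE highschools_highschool (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name VARCHAR(50) NOT NULL,
        principal VARCHAR(50) NOT NULL,
        enrollment INTEGER NOT NULL,
        website VARCHAR(120) NOT NULL
    )
""")

# Invented sample record, the kind of thing the admin form would save.
conn.execute(
    "INSERT INTO highschools_highschool (name, principal, enrollment, website) "
    "VALUES (?, ?, ?, ?)",
    ("Central High School", "Ann Lee", 1535, "http://www.example.com/"),
)

rows = conn.execute(
    "SELECT name, enrollment FROM highschools_highschool"
).fetchall()
print(rows)
```

The point is only that each field in models.py becomes a column, with the max_length values showing up as column widths.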
This has been a gas so far. It gets better. The last bit of this project involves defining how we want our data to behave within the admin. To do this, move into your highschools directory and create a new file called admin.py. Make it look like this:
//admin.py
from testproject.highschools.models import *
from django.contrib import admin
class HighSchoolAdmin(admin.ModelAdmin):
    model = HighSchool
admin.site.register(HighSchool, HighSchoolAdmin)
This is about the most minimal admin.py file possible. All it does is tell Django that there’s a model and we want it to have an associated admin site.
That is all.
Let’s see what that did for us, shall we? Access the url where you have django installed and type ‘admin’ at the end. For example, if your site is called www.azcentral.com, head over to www.azcentral.com/admin. If you’re just testing on your machine, you’ll be at localhost/admin. And if you used BitNami, you’ll be at localhost:(port you entered during setup)/admin. You’ll be prompted for your login (your superuser credentials, as discussed above), so log in and behold:

Poke around a bit and you’ll see what we’ve done. Click “High Schools” to see, well, a blank page. Let’s put something on it. Click the “Add High School” button in the corner and you’ll be taken to a startlingly gorgeous page where you can add high schools to your heart’s content.

As long as this post has somehow managed to be, the crux of what was done was unbelievably simple. Now that you have Django installed and things hooked up, replicating this process should take three steps:
The result is a clean, customized database that holds all our data entry results and can be deployed anywhere. We use this tack at the Republic for applications that need to live in PHP (for some reason) and it makes life significantly more enjoyable.
Now you may be noticing that what we’ve done leaves a little to be desired. Every High School we enter is displayed on the main High School table page as “Highschool object,” which isn’t exactly the most helpful thing in the world.
If you want to learn more about Django and how to make the admin more intuitive, try following along with the intro tutorial in your fancy new install. Here’s a taste: To change “Highschool object” to the school name takes one line of code.
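For anyone who can't wait, that one line is most likely a string-representation method on the model. Current Django on Python 3 uses __str__; the Django versions around when this was written used __unicode__ instead. The class below is a plain-Python stand-in, not a real Django model, just to show the effect:

```python
# Plain-Python stand-in for the model; a real one would subclass models.Model.
class HighSchool:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        # The one line that matters: show the school's name.
        return self.name

school = HighSchool("Central High School")
print(str(school))
```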
Thoughts? Comments? Questions? Debates?
Last Wednesday, ten staffers from the Republic got up and talked for five minutes in front of about 250-300 colleagues about everything.
I talked about better allocating resources through covering events by collecting facts alone, rather than trying to mush them into a narrative. Andrew Long talked about how to be innovative. Other people talked about cutting edge concepts for advertisements, creative new ways to drum up advertisers and how exactly the paper gets to every doorstep by 6 a.m.
It’s a sad truth at the vast majority of news organizations: The walls we painstakingly built over decades of “professionalization” are so thick that we have no idea what anyone else is doing. I sit in the newsroom and wonder why those guys in advertising aren’t doing anything innovative. Marketing people see me and wonder what the hell I do all day. IT people operate under the assumption that the newsroom doesn’t even know what the Internets are.
Little of the suspicion and distrust is based on actual interaction. Most of it is utterly false.
We wanted to stop that. So a few months back Andrew cooked up the idea of having an Ignite of our own, modeled after the popular format used at O’Reilly events around the country. We took all of our cues from the Phoenix model. We had a submission period, during which we received 30+ ideas for short presentations. We had a batch of impartial judges pick out submissions that represented a variety of topics. We took on the job of getting the room set up, marketing, and so on.
And then it just happened. For an hour last Wednesday, 250 people crowded into our standing-room-only auditorium to hear people they knew and didn’t know talk about whatever it was they wanted to talk about.
The feedback has been explosively positive. People spotted parallel projects and have identified new ways to get involved in good things. Our publisher has said he wants to shepherd a few of the ideas on to the next level.
Even some of the ideas that weren’t selected have led to good things. I submitted two ideas. One was about databasing news events, which is the submission that was selected. The other was some general stuff on Django and how it’s neat. Someone in advertising, who needed a way to set up a database quickly where users could log in, add and edit things (sound familiar?), saw my submission and asked if Django could help. Now we’re solving his problem together.
The best part of the Ignite model is that it’s relatively easy to do with proper guidance. Any news organization can make it work, and I promise any that do will see immediate benefits.
I’d be remiss not to mention some people here. The aforementioned Mr. Long, for cooking up a great idea, which I guess is par for the course for that guy. Chris George and Allisence Chang for running a tight ship. The inimitable (thank god) Jeff Moriarty, for graciously offering us his guidance throughout.
If you want to get this going in your shop, shoot me an e-mail, Twitter, whatever. I can’t get behind this idea enough.
EDIT: Just got the final count. In total, 319 attended or watched remotely. Simply amazing.
In California. Crap.
To be fair, The Az legislature has a great reputation for opening up data. When asked. It has provided attendance records, lobbyist disclosures (crappy though they may be), and dozens of reports at various parties’ request.
But California has made the logical leap from mere openness to transparency. And all signs point to it being legitimate (moreso than, say, data.gov). The “Official California Legislative Information” website now trumpets a downloadable database that appears to be updated daily, allowing citizens a reliable way to answer the question, “How is my representative reflecting my beliefs?”
Bravo to Maplight and the California First Amendment Coalition for fighting the fight. And kudos to California for admitting its mistakes and trying to move past them.
Apparently I need to handle some serious issues in the “My Work” section. It’s on the to-do list. Thanks for the e-mails, everyone.
Phoenix is the capital city of Arizona. It’s also in the middle of a desert. Yet somehow the state has evolved some wack records laws that don’t seem to accept this as a truth.
Most people I talk to are shocked to learn that records of water usage are entirely blocked from public access. It makes sense. Water is our number one resource here; we’re barraged with ways to conserve water and take advantage of what natural moisture there is.
Yet we also have the most acres of golf courses in the nation. We have green lawns down every block. And every other house has a pool. While there are laws that govern the way water is generally used, especially by large consumers, there is no way for a regular Joey Citizen to see for himself if those laws are being followed. The records are entirely hidden away. It’s illegal for a water district to give them out. It’s a crime for a citizen to have them.
Sometimes it’s hard to explain why not having access to information is bad. Other times, the argument is made for you, and all you have to do is point to it. Enter the Panama City Press Herald, and a phenomenal piece by Matt Dixon.
Here’s the kicker:
Over the past five years, 2.4 billion gallons of water — 23 percent of all water purchased by Panama City — has gone unaccounted for, according to an analysis of utility records obtained in a public records request. In 2006 alone, the city lost 631 million gallons, the largest single-year amount since at least 1996, according to utility records.
Arizona, we deserve to see how we stack up.
Few things puzzle me more than a government entity actively hiding, obfuscating, or misplacing its records. Sure, the excuses range from “We can’t,” (flakspeak for “We don’t know how”) to “We just don’t feel comfortable doing that right now” (flakspeak for “I’m so scared I just peed myself”), but a no is a no regardless of reason.
A big fat nondisclosure is a loss for all parties involved. The journalist loses because they can’t pursue a story or fully contextualize one they already have. The public loses because they’ve been locked out; labeled as untrustworthy by the government they fund.
Then there’s another loser: the government itself. A nondisclosure makes the entity seem secretive, makes it seem like it has something to hide. A nondisclosure does no favors to officials, keeping the public in the dark and making it impossible for citizens to draw conclusions about how officials carry out their interests.
For years, I believed in the dichotomy of openness. An entity was either open or closed. Arizona is open. Phoenix is closed. Columbia is open. Springfield is closed. Missouri is open. And so on.
But there’s a third level that is more beneficial to all parties than mere openness: Proactive openness.
Proactive openness is providing information as it comes, nullifying the need for a records request. It is providing information in machine-readable formats, eliminating the need for scraping that’s cumbersome to both the scraper and the scrapee. Proactive openness is providing data in its raw form, doing away with spin because there are no aggregates or analysis to sugarcoat the truth.
What I call proactive openness is actually an eGovernance movement gaining traction around the country. A recent survey ranked the programs, finding that Washington D.C. and Portland are the best around. Portland, it should be noted, was so miffed at its showing, it even went out and ordered an RFP to get some citizens involved in making things better.
Et tu, Phoenix? Not surprisingly, the fifth largest city in the nation doesn’t make the list. Not anywhere. And what’s sad is there’s no reason it has to be the case. From my interactions with city personnel, it appears staying away from eGovernance techniques has been a deliberate choice. Phoenix doesn’t even provide electronic data when requested under public records law, so I can only imagine the horror at the concept of sharing data electronically before it’s even requested.
I can understand the fear of being open. Often, people who request documents are looking for shady behavior. Their motives alone make them suspect. But fear of transparency? That’s beyond me. There are plenty of good reasons to pursue proactive openness:
Now obviously, I’m a journalist, and having government crack open its data would make my life better.
But this isn’t borne entirely out of self interest. Being in my admittedly unique situation has given me a front row seat to the inefficiencies of both government and the citizenry, inefficiencies that could largely be addressed by the type of eGovernance that but for the grace of god is catching on. I have had citizens call me up and ask me what happened on their corner last night. I have had PIOs pitch me stories about an uptick in requests for neighborhood cleanups.
Both deserve better.
One of the most confounding things about the permeance of Caspio is how totally unnecessary it is. Many news sites opt to shell out $8/database/mo. to this service, ignoring a myriad of better, cheaper alternatives.
Part of the rush may be attributable to the, ah, “me too” mentality of the industry. Part of it, I assume, stems from ignorance.
There’s not much mere mortals can do about tweens at the mall. But I can do something about ignorance. And so, without further ado, I introduce the Absolutely Incomplete, Horribly Biased List of Ways to get Interactive Data on Your Website. The entries are listed from easiest to implement to most complex, so figure out where your organization can make hay and do so. Please.
For the record, the example pages I made myself are quick and ugly. I’ve left in obvious errors and haven’t optimized anything. Almost all of them are capable of doing much more, and there is lots of documentation and help available on all of the options listed.
Absolutely Incomplete, Horribly Biased List of Ways to get Interactive Data on Your Website
Zero programming solutions:
1. Zoho Creator. Zoho has been impressing me since Day 1. If Caspio represents the closest you’ve come to programming, you’ll be right at home with these guys. They provide a friendly GUI that allows for embedding into any webpage. They offer a wide range of tools to make your data useful outside of the constraints of the database. An API and coherent documentation of their weird scripting language (Deluge) gives these guys a clear leg up.
2. Google Spreadsheets. You knew they had to make an appearance, right? Google Spreadsheets allows for embedding a bunch of data-backed “Gadgets” into a site. One option is a simple table, complete with filtering and sorts right off the bat. They also offer other basics and not-so-basics, like maps, org charts, timelines, etc.
3. Dabble DB. Their video can explain it better than me. It’s fast fast fast.
4. IBM’s ManyEyes. This is kind of a departure, since they do NOT offer your basic embeddable-search-and-report table as an option. They do, however, offer a pretty extensive collection of data visualizations, including everything from pie charts to cool New York Times-like block histograms and wordles.
Pseudo-programming solutions:
5. jQuery. According to Wikipedia, jQuery is a “lightweight JavaScript library that emphasizes interaction between JavaScript and HTML.” It’s also the object of my undying affection. Javascript is one of the things that makes Web 2.0 so neat, allowing for all sorts of rich, interactive changes to a page (think Gmail, many facebook applications, etc.) It’s also hard for me to wrap my head around. No more. A variety of plugins abstract what used to be a time consuming job. FlexiGrid and Tablesorter make HTML tables interactive with little code. Once TableSorter is installed, for instance, you just drop a five-word javascript function at the top of a page, and the plugin takes over from there. Easy peasy, and good looking, too!
6. The Simile Project’s Exhibit. I’ve toyed around with Simile’s TimeLine offering before, but Exhibit was new to me. This is another great example of abstracting wonky code, leaving just the useful stuff. It only takes a handful of files to get up and running, and most of the work has been done for you. Because data is in one file, and formatting in another, Exhibit makes it possible to do just about any visualization of your records. Very slick.
Note: For me, the hardest part was understanding their data format. Luckily, they provide a tutorial for another tool, Babel, that takes care of even that.
And I’d be silly to leave out the heavy hitters. Actual Programming:
7. PHP (or any other scripting language). There are tons and tons and tons of templates that claim to be able to build searchable, sortable tables and web forms with ease. But honestly, they may be over complicating matters. PHP and MySQL go together like milk and cookies, and some good people have made local installation a snap. Why not just go pick it up yourself with one of the myriad of tutorials out there?
8. Django/Rails. My “sort by difficulty asc” might have these and PHP flipped, depending on your learning style. If you need to know how things work through and through, PHP/MySQL is your fit. If you just want to get things done, a framework is what you need. These guys are all the buzz in the news industry now for good reason. They allow for rapid development, which works well with the no-holds-barred news cycle. My own experience with them is limited to desktop installations and mucking around, but I’ve been plenty impressed. Friendly script repositories make it easy to stand on the shoulders of giants, further streamlining the process.
As the title implies this is by no means a complete list. I’ve missed a lot, and probably done an injustice to a few more. If you have anything you’d like to add, drop me a note and I’ll add it to the list or leave it in the comments.
Today Salon published a piece about a flawed polling procedure and what it could mean for the November election. Because the majority of pollsters only place calls to landline phones, the article argues, key demographics are being under-represented:
“…(a) sample that’s predominantly under 40 years of age (oops, that one favors Obama); disproportionately renters rather than homeowners (Obama-leaning again); full of college students (sounds like a Starbucks Obama thing to me) — and, for good measure, includes a higher proportion of blacks and Hispanics than the national population does.”
Seems like a pretty boneheaded move. But I have a hunch many newspaper companies are making the same mistake, skewing our research and leading us to make poorly-informed decisions based on false majorities.
I’d be curious to hear what people learn if they actually go take a look at internal research. I’ll kick it off: Gannett uses landline phones only for the massive readership surveys conducted for all papers.