Ha

A Google Image search for “Caspio” yields my CaspioFail. For old time’s sake:

Putting the admin to use

We have a working model and an admin that allows us to interface with its data. It’s functional and looks pretty, but it’s still not terribly helpful. Let’s soldier forth.

Managing Users and Permissions- First things first. You’ll notice you have a built-in app (providing django.contrib.auth was installed, as it should be) that includes two editable models: groups and users. This is, amazingly, where you can control groups and users.

It should be pretty intuitive. Dive into the “users” model and add a test account. Save that and you’ll be taken to a second page where you see how customizable each account can be. Here’s a quick rundown of what everything does:

  • The user’s first name, last name and e-mail address are not required, but can be nice to have on hand.
  • In order to log in, a person has to be both staff and active, so both those boxes have to be checked.
  • Superuser status gives a person full permission to do everything. They can add and edit anything that shows up in any app. Be careful in giving someone this level of access.
  • Finally, we reach “User permissions.” This is the meat of any CRUD interface. Each model has three different levels of access that a user can have: Add, Change and Delete. They do exactly what you think they should do. Any user with Add permission can add entries to the model. Any user with Change permission cannot add a new listing, but can make changes to any existing entry. A user with Delete permission can’t add or change anything, but can zap anything from the system.
    Obviously, it doesn’t make sense to give these out one by one. The most common permission, I’d say, gives users the ability to Add and Change any entry.
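If it helps to see those rules as logic, here’s a rough plain-Python sketch. It is not Django’s actual code; the User class, the can() helper and permission strings like add_highschool are stand-ins I made up to mirror the checkboxes described above.

```python
# A plain-Python sketch of the admin's access rules; not Django's real code.
from dataclasses import dataclass, field


@dataclass
class User:
    # Hypothetical stand-in for an account on the admin's user page.
    username: str
    is_active: bool = True
    is_staff: bool = False
    is_superuser: bool = False
    permissions: set = field(default_factory=set)


def can_log_in(user):
    # Both boxes have to be checked to get into the admin.
    return user.is_active and user.is_staff


def can(user, action, model):
    # action is "add", "change" or "delete"; model is a name like "highschool".
    if not can_log_in(user):
        return False
    if user.is_superuser:
        return True  # superusers can do everything
    return f"{action}_{model}" in user.permissions


# The common case: a staffer who can Add and Change, but not Delete.
temp = User("temp", is_staff=True,
            permissions={"add_highschool", "change_highschool"})
print(can(temp, "add", "highschool"))     # True
print(can(temp, "delete", "highschool"))  # False
```

The point of the sketch is just that the three permissions are independent switches, and that none of them matter unless the account can log in to begin with.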

Displaying information in the model admin- In the previous post, anything we added to the “High Schools” model was displayed as “HighSchool object,” which is about the least useful name for anything ever. This is easy to address.

Edit your admin.py file to look like this:

//admin.py
...
class HighSchoolAdmin(admin.ModelAdmin):
    list_display = ('name',)

admin.site.register(HighSchool, HighSchoolAdmin)

All we did there was access the list_display property of our ModelAdmin and tell it to show us the name of the school. Access that model admin section again, and you’ll see “HighSchool object” now reads the actual name of your school. Adding other fields to the page is just a matter of stringing field names together in the same format. For example, an admin displaying all of our fields would just require:

//admin.py
...
    list_display = ('name', 'principal', 'enrollment', 'website')

The order you call them in here denotes the order they appear in the admin.

Making the model admin searchable- Making your model searchable is similarly easy to execute. Again, it requires only one line in the admin:

//admin.py
...
class HighSchoolAdmin(admin.ModelAdmin):
    search_fields = ('name', 'principal')

admin.site.register(HighSchool, HighSchoolAdmin)

Put that in your pipe and smoke it. When you reload the High School page, you’ll see a handy-dandy search box that will search the values of the school name and principal fields.
By default, searches are enclosed in wild cards. So searching for “Smith,” in this example, will return both John Smith High School as well as principal Hymie Goldsmith.
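To make the wild card behavior concrete, here’s a rough plain-Python imitation of it. This is not the admin’s real code, just my understanding of the default matching: each word you type has to show up somewhere inside at least one of the search fields, ignoring case. The search() helper and the sample records (other than John Smith High School and Hymie Goldsmith) are made up for illustration.

```python
# A plain-Python sketch of the search box's matching; not Django's real code.
schools = [
    {"name": "John Smith High School", "principal": "Ann Lee"},
    {"name": "Central High School", "principal": "Hymie Goldsmith"},
    {"name": "North High School", "principal": "Pat Jones"},
]

search_fields = ("name", "principal")


def search(records, query, fields=search_fields):
    # Every word in the query must appear somewhere in at least one field,
    # ignoring case and matching anywhere inside the value.
    words = query.lower().split()
    return [
        rec for rec in records
        if all(any(w in rec[f].lower() for f in fields) for w in words)
    ]


# Prints John Smith High School, then Central High School (via Goldsmith).
for school in search(schools, "Smith"):
    print(school["name"])
```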

Making fields optional - There is a lot more that can be done with admin.py, but more common problems seem to crop up at the individual record level. Many of these issues need to be addressed in models.py. For instance, let’s say we get three records into our project before we realize that private schools don’t have principals. As things stand, any attempt to leave that field blank will be met with a stern warning from Django saying the Principal field is required.

To make the field optional, we’ll add an argument to its entry in models.py like so:

//models.py
class HighSchool(models.Model):
    name = models.CharField(max_length=50)
    principal = models.CharField(max_length=50, blank=True)
    enrollment = models.IntegerField()
    website = models.URLField(verify_exists=True, max_length=120)

Explicitly allowing blanks means the field is now optional. On the “Add High School” page, you’ll notice the name of the field is in regular typeface as opposed to bold.
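Here’s the same idea as a rough plain-Python sketch, in case it helps to see the rule spelled out. It is not how Django does it internally; the missing_required() helper and the sample school are made up, and the only thing borrowed from the model above is that principal is the one field allowed to be blank.

```python
# A plain-Python sketch of required vs. optional fields; not Django's real code.
# Mirrors the model above: only principal is allowed to be blank.
OPTIONAL_FIELDS = {"principal"}
ALL_FIELDS = ("name", "principal", "enrollment", "website")


def missing_required(record):
    # Return the names of required fields that were left empty.
    return [
        f for f in ALL_FIELDS
        if f not in OPTIONAL_FIELDS and not str(record.get(f, "")).strip()
    ]


private_school = {
    "name": "St. Example Academy",
    "principal": "",  # left blank on purpose
    "enrollment": 400,
    "website": "http://example.com",
}
print(missing_required(private_school))                 # []
print(missing_required({"name": "", "principal": ""}))  # ['name', 'enrollment', 'website']
```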

Adding some signposts- In my experience, the Django admin is significantly more intuitive than PHPMyAdmin. But that’s still not always enough. Sometimes you need to be able to give your users hints along the way.

This, too, is an easy addition. Let’s say we’ve received a frenzied e-mail from our well-intentioned temp saying he doesn’t know what to put in the “Enrollment” field. He has different counts for each grade, he says, but when he tried to list them, he gets an error saying “Enter a whole number.” Help?

Let’s dive back into models.py and add some help text:

//models.py
class HighSchool(models.Model):
    name = models.CharField(max_length=50)
    principal = models.CharField(max_length=50, blank=True)
    enrollment = models.IntegerField(help_text="Enter one value for the whole school.")
    website = models.URLField(verify_exists=True, max_length=120)

Well hot damn. Our handy-dandy little signpost has been cleanly inserted right under our field, making this thing all but idiot proof. At this point, clicking the “Add High School” button should give you something much like this:


For next time

Between these two posts, you should have enough to rapidly build some simple data entry tools for brand new datasets your organization creates on its own.

Of course, that’s not usually the case. Often, especially in journalism, we need to interface with existing data. Our next installment will cover how to get our arms around legacy datasets and get them into our system.

Taming data entry with the Django Admin

Data entry is like Elvin from the Cosby Show: You can’t ignore it when it’s there, but it’s too small of a problem to actually get riled.

So you deal. Some news organizations distribute Excel templates in hopes of getting back something somewhat standardized, an approach that generally results in a data cleaning nightmare on the back end, lost sleep and serious drinking problems. Some try to use tools like PHPMyAdmin, only to hear complaints from users who don’t “get” the system.

Wouldn’t it be great if there were something in the middle? Something as rigid as a PHPMyAdmin instance yet as simple to deploy as an Excel spreadsheet?

The Django admin makes this possible. With less than an hour of work you can deploy a beautiful, user-friendly data entry system that adheres to standards you set. The best part? Anyone can do it. (Yes, that means you.) And even if you never use Django for web production, it’s still a painless way to slay the data entry dragon.

This has been a huge help at work, so I’m going to be presenting a short intro to the subject at next week’s NICAR conference. In preparation, and as a resource for any poor soul who might wander in, I thought I’d put together this little how-to.

Setting Up

  • If you don’t already have Django installed, try using BitNami’s Djangostack. Last I checked it still ran ver 1.0, but that will handle this task nicely. If you like the results you can always do a proper installation later.

To get started, find the file in your django install called django-admin.py. Using a command window, get to that directory and type

//in command window
django-admin.py startproject testproject

That will create a new project, within which you can have many different applications. For testing purposes, let’s say our data entry task is to create a resource with all of the high schools in town. To make that happen, we’d do like so:

//in command window
cd testproject (gets us into the project we just created)
django-admin.py startapp highschools

Take a look at what just happened. By typing the startproject and startapp commands, you’ve created the skeleton structure of a fully functional django project. Inside a directory called ‘testproject’, there should be a subdirectory called ‘highschools.’ And within ‘highschools’ there should be a handful of files.

The final part of setting up is telling Django this thing is ready to go. Dive into a file called ‘settings.py.’ It should be located in the testproject directory. At the very bottom of the file, you’ll see a section called INSTALLED_APPS. Add a line pointing to our application, and if it’s not there, add ‘django.contrib.admin,’ too.

//settings.py
....
INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.admin',
    'testproject.highschools',
)

One last bit of housekeeping is to make sure Django’s admin is accessible. To make sure, open up the urls.py file in testproject and uncomment the lines it says to uncomment to use the admin. Save your changes, and we’re off to the races.

Defining Data

Now to do some work. One of the files in the ‘highschools’ directory is called ‘models.py.’ Open it up in a text editor (on Windows I like Notepad++; on a Mac I like TextWrangler) and let’s get started.

Models.py is basically where we tell Django what our data will look like. Using field types found in the documentation, we can define our data however we think it makes sense. For our high school application, let’s say we want to track each school’s name, its principal, its enrollment and its website. A bare bones models file might look something like

//models.py
from django.db import models

class HighSchool(models.Model):
    name = models.CharField(max_length=50)
    principal = models.CharField(max_length=50)
    enrollment = models.IntegerField()
    website = models.URLField(max_length=120)

Let’s take a look at what we did there. Most of this is boilerplate, so we’ll focus on the actual definition of our data. You can see we’re using three different field types. CharFields are just as you suspect — a field with a set amount of characters. We’re required to give a max_length argument, which specifies how long the field could be. IntegerFields are also just as you would assume them to be; fields that will contain Integer values. Lastly, we use a URLField. As it stands now, this is little more than a text field. But let’s show off a neat out-of-the-box trick and make it a bit more useful.
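If you’re wondering what those field types actually enforce, here’s a rough plain-Python sketch of the two basic checks. It is not Django’s code; check_char() and check_integer() are names I made up, and they only approximate the rules described above.

```python
# A plain-Python sketch of the two basic field checks; not Django's real code.
def check_char(value, max_length):
    # CharField-style rule: text no longer than max_length characters.
    return len(value) <= max_length


def check_integer(value):
    # IntegerField-style rule: the input must parse as a whole number.
    try:
        int(str(value).strip())
    except ValueError:
        return False
    return True


print(check_char("John Smith High School", 50))  # True
print(check_integer("1200"))                     # True
print(check_integer("300, 310, 295, 280"))       # False
```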

Let’s say we want to guard against typos in the URL field. One way to do that would be to make sure any value typed corresponds to an actual site. To add that functionality, we simply add some language to our existing model:

//models.py
    ....
    website = models.URLField(verify_exists=True, max_length=120)

Perfect. Believe it or not, the hard work of all this is done.  To make sure it works, change into your testproject directory and run the following command:

//in command window
manage.py syncdb

Running this tells Django to make the changes you’ve outlined in models.py. In this case, it’s going to actually create a database for you as outlined in settings (If you used BitNami, this is handled automagically and the changes will be made in MySQL). If it’s your first time running the command, Django will also ask you to create a superuser account. Do so, and remember the login and password you supply. This is your master account.
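For the curious, here’s a rough sketch of the kind of table that command sets up for our model. I’m using Python’s built-in sqlite3 module so it runs anywhere; the real SQL depends on your database backend (MySQL, if you went the BitNami route), so treat the column definitions as an approximation. The table name follows Django’s usual appname_modelname pattern.

```python
# A rough sketch of the kind of table syncdb sets up, using Python's
# built-in sqlite3 so it runs anywhere. The exact SQL Django generates
# depends on your database backend; this is an approximation.
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the demo
conn.execute(
    """
    CREATE TABLE highschools_highschool (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name VARCHAR(50) NOT NULL,
        principal VARCHAR(50) NOT NULL,
        enrollment INTEGER NOT NULL,
        website VARCHAR(120) NOT NULL
    )
    """
)

# List the columns to confirm they line up with the fields in models.py.
columns = [row[1] for row in conn.execute("PRAGMA table_info(highschools_highschool)")]
print(columns)  # ['id', 'name', 'principal', 'enrollment', 'website']
```

Note the id column: we never asked for it, but Django adds an auto-incrementing primary key to every model unless told otherwise.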

The Payoff Pitch

This has been a gas so far. It gets better. The last bit of this project involves defining how we want our data to behave within the admin. To do this, move into your highschools directory and create a new file called admin.py. Make it look like this:

//admin.py
from testproject.highschools.models import HighSchool

from django.contrib import admin

class HighSchoolAdmin(admin.ModelAdmin):
    model = HighSchool

admin.site.register(HighSchool, HighSchoolAdmin)

This is the least possible admin.py file. All it does is tell Django that there’s a model and we want it to have an associated admin site.
That is all.
Let’s see what that did for us, shall we? Access the url where you have django installed and type ‘admin’ at the end. For example, if your site is called www.azcentral.com, head over to www.azcentral.com/admin. If you’re just testing on your machine, you’ll be at localhost/admin. And if you used BitNami, you’ll be at localhost:(port you entered during setup)/admin. You’ll be prompted for your login (your superuser credentials, as discussed above), so log in and behold:

Poke around a bit and you’ll see what we’ve done. Click “High Schools” to see, well, a blank page. Let’s put something on it. Click the “Add High School” button in the corner and you’ll be taken to a startlingly gorgeous page where you can add high schools to your heart’s content.

Wrap up

As long as this post has somehow managed to be, the crux of what was done was unbelievably simple. Now that you have Django installed and things hooked up, replicating this process should take three steps:

  • Creating an app and setting up the models.py file
  • Adding the app to INSTALLED_APPS
  • Adding an admin.py to said app

The result is a clean, customized database that holds all our data entry results and can be deployed anywhere. We use this tack at the Republic for applications that need to live in PHP (for some reason) and it makes life significantly more enjoyable.

Now you may be noticing that what we’ve done leaves a little to be desired. Every High School we enter is displayed on the main High School table page as “Highschool object,” which isn’t exactly the most helpful thing in the world.

If you want to learn more about Django and how to make the admin more intuitive, try following along with the intro tutorial in your fancy new install. Here’s a taste: To change “Highschool object” to the school name takes one line of code.
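Here’s a rough plain-Python sketch of the idea behind that one line, assuming the usual approach of giving the model a method that returns its display text. These are bare classes I made up, not real Django models (those would subclass models.Model). On the Python 2 installs that Django 1.0 runs on, the method is conventionally called __unicode__; in Python 3 the equivalent is __str__, which is what the sketch uses so it runs as-is.

```python
# A plain-Python sketch of the idea; a real Django model would subclass
# models.Model instead of being a bare class.
class PlainSchool:
    def __init__(self, name):
        self.name = name


class NamedSchool(PlainSchool):
    def __str__(self):
        # The "one line": tell Python what text to show for this object.
        return self.name


print(str(NamedSchool("John Smith High School")))    # John Smith High School
print("object" in str(PlainSchool("Central High")))  # True
```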

Thoughts? Comments? Questions? Debates?

Breaking down all the silos in one hour or less

Last Wednesday, ten staffers from the Republic got up and talked for five minutes in front of about 250-300 colleagues about everything.

I talked about better allocating resources through covering events by collecting facts alone, rather than trying to mush them into a narrative. Andrew Long talked about how to be innovative. Other people talked about cutting edge concepts for advertisements, creative new ways to drum up advertisers and how exactly the paper gets to every doorstep by 6 a.m.

It’s a sad truth at the vast majority of news organizations: The walls we painstakingly built over decades of “professionalization” are so thick that we have no idea what anyone else is doing. I sit in the newsroom and wonder why those guys in advertising aren’t doing anything innovative. Marketing people see me and wonder what the hell I do all day. IT people operate under the assumption that the newsroom doesn’t even know what the Internets are.

Little of the suspicion and distrust is based on actual interaction. Most of it is utterly false.

We wanted to stop that. So a few months back Andrew cooked up the idea of having an Ignite of our own, modeled after the popular format used at O’Reilly events around the country. We took all of our cues from the Phoenix model. We had a submission period, during which we received 30+ ideas for short presentations. We had a batch of impartial judges pick out submissions that represented a variety of topics. We took on the job of getting the room set up, marketing, and so on.

And then it just happened. For an hour last Wednesday, 250 people crowded into our standing-room-only auditorium to hear people they knew and didn’t know talk about whatever it was they wanted to talk about.

The feedback has been explosively positive. People spotted parallel projects and have identified new ways to get involved in good things. Our publisher has said he wants to shepherd a few of the ideas on to the next level.

Even some of the ideas that weren’t selected have led to good things. I submitted two ideas. One was about databasing news events, which is the submission that was selected. The other was some general stuff on Django and how it’s neat. Someone in advertising, who needed a way to set up a database quickly where users could log in, add and edit things (sound familiar?), saw my submission and asked if Django could help. Now we’re solving his problem together.

The best part of the Ignite model is that it’s relatively easy to do with proper guidance. Any news organization can make it work, and I promise any that do will see immediate benefits.

I’d be remiss not to mention some people here. The aforementioned Mr. Long, for cooking up a great idea, which I guess is par for the course for that guy. Chris George and Allisence Chang for running a tight ship. The inimitable (thank god) Jeff Moriarty, for graciously offering us his guidance throughout.

If you want to get this going in your shop, shoot me an e-mail, Twitter, whatever. I can’t get behind this idea enough.

EDIT: Just got the final count. In total, 319 attended or watched remotely. Simply amazing.

State voting records open up!

In California. Crap.

To be fair, the Az legislature has a great reputation for opening up data. When asked. It has provided attendance records, lobbyist disclosures (crappy though they may be), and dozens of reports at various parties’ request.

But California has made the logical leap from mere openness to transparency. And all signs point to it being legitimate (moreso than, say, data.gov). The “Official California Legislative Information” website now trumpets a downloadable database that appears to be updated daily, allowing citizens a reliable way to answer the question, “How is my representative reflecting my beliefs?”

Bravo to Maplight and the California First Amendment Coalition for fighting the fight. And kudos to California for admitting its mistakes and trying to move past them.

Maintenance note

Apparently I need to handle some serious issues in the “My Work” section. It’s on the to-do list. Thanks for the e-mails, everyone.

Water records need to be public, too

Phoenix is the capital city of Arizona. It’s also in the middle of a desert. Yet somehow the state has evolved some wack records laws that don’t seem to accept this as a truth.

Most people I talk to are shocked to learn that records of water usage are entirely blocked from public access. It makes sense. Water is our number one resource here; we’re barraged with ways to conserve water and take advantage of what natural moisture there is.

Yet we also have the most acres of golf courses in the nation. We have green lawns down every block. And every other house has a pool. While there are laws that govern the way water is generally used, especially by large consumers, there is no way for a regular Joey Citizen to see for himself if those laws are being followed. The records are entirely hidden away. It’s illegal for a water district to give them out. It’s a crime for a citizen to have them.

Sometimes it’s hard to explain why not having access to information is bad. Other times, the argument is made for you, and all you have to do is point to it. Enter the Panama City Press Herald, and a phenomenal piece by Matt Dixon.

Here’s the kicker:

Over the past five years, 2.4 billion gallons of water — 23 percent of all water purchased by Panama City — has gone unaccounted for, according to an analysis of utility records obtained in a public records request. In 2006 alone, the city lost 631 million gallons, the largest single-year amount since at least 1996, according to utility records.

Arizona, we deserve to see how we stack up.

The case for transparency

Few things puzzle me more than a government entity actively hiding, obfuscating, or misplacing  its records.  Sure, the excuses range from “We can’t,” (flakspeak for “We don’t know how”) to “We just don’t feel comfortable doing that right now” (flakspeak for “I’m so scared I just peed myself”), but a no is a no regardless of reason.

A big fat nondisclosure is a loss for all parties involved. The journalist loses because they can’t pursue a story or fully contextualize one they already have. The public loses because they’ve been locked out; labeled as untrustworthy by the government they fund.

Then there’s another loser: the government itself. A nondisclosure makes the entity seem secretive, makes it seem like it has something to hide.  A nondisclosure does no favors to officials, keeping the public in the dark and making it impossible for citizens to draw conclusions about how  officials carry out their interests.

For years, I believed in the dichotomy of openness. An entity was either open or closed. Arizona is open. Phoenix is closed. Columbia is open. Springfield is closed. Missouri is open. And so on.

But there’s a third level that is more beneficial to all parties than mere openness: Proactive openness.

Proactive openness is providing information as it comes, nullifying the need for a records request. It is providing information in machine-readable formats, eliminating the need for scraping that’s cumbersome to both the scraper and the scrapee. Proactive openness is providing data in its raw form, doing away with spin because there are no aggregates or analysis to sugarcoat the truth.

What I call proactive openness is actually an eGovernance movement gaining traction around the country. A recent survey ranked the programs, finding that Washington D.C. and Portland are the best around. Portland, it should be noted, was so miffed at its showing, it even went out and ordered an RFP to get some citizens involved in making things better.

Et tu, Phoenix? Not surprisingly, the fifth largest city in the nation doesn’t make the list. Not anywhere. And what’s sad is there’s no reason it has to be the case. From my interactions with city personnel, it appears staying away from eGovernance techniques has been a deliberate choice. Phoenix doesn’t even provide electronic data when requested under public records law, so I can only imagine the horror at the concept of sharing data electronically before it’s even requested.

I can understand the fear of being open. Often, people who request documents are looking for shady behavior. Their motives alone make them suspect. But fear of transparency? That’s beyond me. There are plenty of good reasons to pursue proactive openness:

  • The public can do the work. A collection of APIs, or a data warehouse as provided by OCTO in D.C., turns every citizen into a web producer. Rather than spend weeks toiling over design and UI, government IT people could focus on simply opening the spigot, and move on to bigger and better things. Or, I guess, making sure the municipal judges have the right screen saver. Whatever. Opening up saves time and taxpayer money while increasing productivity.
  • It adds new perspectives. There probably aren’t very many watershed experts/tax policy advisers on the city payroll. Just a guess. Opening data allows for invested, educated and interested citizens to become unpaid interns of a sort, offering their expertise on issues that otherwise would get bogged down in bureaucracy.
  • It can save money. Or at least allow the city to better allocate its resources. Check out the list of PIO contacts in the city of Phoenix. That’s a lot of people paid to bang out press releases. When information is freely available, press releases become less necessary to get out a message. At the same time, a transparent city has to walk the walk. If actions don’t match up with words, it won’t take a rocket scientist to figure it out. Just any dude with the Internets.
  • No spin is the best spin. Raw data is like Scoop, the 1930s reporter guy. “Just the facts, ma’am.” Compared to the content of most press releases, it’s Gospel. Where did Councilman Bob use his P-Card last week? How many potholes are there that they haven’t found time to fill the one on my street yet? Questions could be answered quickly using some hypothetical data sources. And the answers would be trusted.
  • Because we want it to be easy to care. Civic involvement is boring. Data are boring. But opening up civic data to the Web opens all sorts of possibilities for making both engaging and… dare I say it?… fun. Again, it’s easy to look to D.C. for examples. Want to find parking info? Go for it. Curious what those sirens were last night? You’re covered. Opening up, and I mean proactively opening up with close to real-time information, will make for a more plugged in citizenry. Ideally, that’s in line with the City’s own desires.

Now obviously, I’m a journalist, and having government crack open its data would make my life better.

But this isn’t borne entirely out of self interest. Being in my admittedly unique situation has given me a front row seat to the inefficiencies of both government and the citizenry, inefficiencies that could largely be addressed by the type of eGovernance that but for the grace of god is catching on. I have had citizens call me up and ask me what happened on their corner last night. I have had PIOs pitch me stories about an uptick in requests for neighborhood cleanups.

Both deserve better.

Eight Ways to Get Interactive Data on Your Site

One of the most confounding things about the pervasiveness of Caspio is how totally unnecessary it is. Many news sites opt to shell out $8/database/mo. to this service, ignoring a myriad of better, cheaper alternatives.

Part of the rush may be attributable to the, ah, “me too” mentality of the industry. Part of it, I assume, stems from ignorance.

There’s not much mere mortals can do about tweens at the mall. But I can do something about ignorance. And so, without further ado, I introduce the Absolutely Incomplete, Horribly Biased List of Ways to get Interactive Data on Your Website. The entries are listed from easiest to implement to most complex, so figure out where your organization can make hay and do so. Please.

For the record, the example pages I made myself are quick and ugly. I’ve left in obvious errors and haven’t optimized anything. Almost all of them are capable of doing much more, and there is lots of documentation and help available on all of the options listed.

Absolutely Incomplete, Horribly Biased List of Ways to get Interactive Data on Your Website

Zero programming solutions:

1. Zoho Creator. Zoho has been impressing me since Day 1. If Caspio represents the closest you’ve come to programming, you’ll be right at home with these guys. They provide a friendly GUI that allows for embedding into any webpage. They offer a wide range of tools to make your data useful outside of the constraints of the database. An API and coherent documentation of their weird scripting language (Deluge) gives these guys a clear leg up.

  • Drawbacks. Cost, first and foremost. But Zoho’s pricing, IMHO, is more fair than Caspio’s bizarre cost-per-“datapage” method, instead charging for the amount of data you store. Like most of these options, Zoho also doesn’t reload the page every time the user performs a search. That’s better for the user, but if you expect the majority of your data traffic to come from repeat searches, it may result in a page view nosedive.
  • Example: Missouri State University salaries.

2. Google Spreadsheets. You knew they had to make an appearance, right? Google Spreadsheets allows for embedding a bunch of data-backed “Gadgets” into a site. One option is a simple table, complete with filtering and sorts right off the bat. They also offer other basics and not-so-basics, like maps, org charts, timelines, etc.

  • Drawbacks. It’s a table. A dang table. And that is all it’s going to be. The charts offer other opportunities, but this isn’t the sort of stuff you’re going to write home about.
  • Example: State populations.

3. Dabble DB. Their video can explain it better than me. It’s fast fast fast.

  • Drawbacks. Pricing scheme is still at work here. Again, it’s a cheaper, more reasonable setup than the datapage method. I have no experience with this system, but the web is mostly devoid of complaints. Tried it yourself? Tell me how it went.
  • Example: Couldn’t find one, but they say it’s embeddable. Watch the video.

4. IBM’s ManyEyes. This is kind of a departure, since they do NOT offer your basic embeddable-search-and-report table as an option. They do, however, offer a pretty extensive collection of data visualizations, including everything from pie charts to cool New York Times-like block histograms and wordles.

  • Drawbacks. One big pageview the whole time it’s loaded. Great if your shop is forward-thinking enough to charge advertisers by time-spent.
  • Example: Click the links above!

Pseudo-programming solutions:

5. jQuery. According to Wikipedia, jQuery is a “lightweight JavaScript library that emphasizes interaction between JavaScript and HTML.” It’s also the object of my undying affection. JavaScript is one of the things that makes Web 2.0 so neat, allowing for all sorts of rich, interactive changes to a page (think Gmail, many Facebook applications, etc.) It’s also hard for me to wrap my head around. No more. A variety of plugins abstract what used to be a time-consuming job. FlexiGrid and Tablesorter make HTML tables interactive with little code. Once TableSorter is installed, for instance, you just drop a five-word JavaScript function at the top of a page, and the plugin takes over from there. Easy peasy, and good looking, too!

  • Drawbacks. Absolutely none. Out-of-the-box, some limitations may be that the table has to exist on the page, which would probably drag out load times. But jQuery is an open-source solution, which means if there are drawbacks (like the no-page-view-for-you syndrome) someone, somewhere, can edit them out (or in, as the case may be). Ludicrously apropos.
  • Example. The flexigrid link above, Example #3, shows what jQuery is capable of.

6. The Simile Project’s Exhibit. I’ve toyed around with Simile’s TimeLine offering before, but Exhibit was new to me. This is another great example of abstracting wonky code, leaving just the useful stuff. It only takes a handful of files to get up and running, and most of the work has been done for you. Because data is in one file, and formatting in another, Exhibit makes it possible to do just about any visualization of your records. Very slick.

Note: For me, the hardest part was understanding their data format. Luckily, they provide a tutorial for another tool, Babel, that takes care of even that.

  • Drawbacks. Quickly slows down as the record count mounts.
  • Example. How about a whole page?

And I’d be silly to leave out the heavy hitters. Actual Programming:

7. PHP (or any other scripting language). There are tons and tons and tons of templates that claim to be able to build searchable, sortable tables and web forms with ease. But honestly, they may be over complicating matters. PHP and MySQL go together like milk and cookies, and some good people have made local installation a snap. Why not just go pick it up yourself with one of the myriad of tutorials out there?

  • Drawbacks. PHP is plenty powerful. But the learning curve is the same learning curve any non-programmer faces when they first start dabbling with this stuff. I should know — I still barely understand what an include does. That said, there are some great “Getting started” resources out there, and most every community college offers an intro weekend course. Journalists are compulsive learners; put that compulsion to good use.
  • Example. Name something cool on the Internet. There’s a good chance it’s done in PHP. TechCrunch, Facebook, feedburner, iStockphoto, Vimeo, YouSendit… the list goes on and on. As for news databases using it, check out the Arizona Republic’s ASU Salaries and dog registration databases.

8. Django/Rails. My “sort by difficulty asc” might have these and PHP flipped, depending on your learning style. If you need to know how things work through and through, PHP/MySQL is your fit. If you just want to get things done, a framework is what you need. These guys are all the buzz in the news industry now for good reason. They allow for rapid development, which works well with the no-holds-barred news cycle. My own experience with them is limited to desktop installations and mucking around, but I’ve been plenty impressed. Friendly script repositories make it easy to stand on the shoulders of giants, further streamlining the process.

  • Drawbacks. Like PHP, this is real, live programming. And that has its own initial complications. But fear not! For Django/Python, there’s a handy guide called How to Think Like a Computer Scientist, available for free online in its entirety. I’m sure there’s a Ruby equivalent, too. Since they aren’t nearly as omnipresent as PHP, finding local classes is probably going to be tough. IRE’s newest boot camp fills that void.

As the title implies this is by no means a complete list. I’ve missed a lot, and probably done an injustice to a few more. If you have anything you’d like to add, drop me a note and I’ll add it to the list or leave it in the comments.

Speaking of, how does your in-house polling stack up?

Today Salon published a piece about a flawed polling procedure and what it could mean for the November election. Because the majority of pollsters only place calls to landline phones, the article argues, key demographics are being under-represented:

“…(a) sample that’s predominantly under 40 years of age (oops, that one favors Obama); disproportionately renters rather than homeowners (Obama-leaning again); full of college students (sounds like a Starbucks Obama thing to me) — and, for good measure, includes a higher proportion of blacks and Hispanics than the national population does.”

Seems like a pretty boneheaded move. But I have a hunch many newspaper companies are making the same mistake, skewing our research and leading us to make poorly-informed decisions based on false majorities.

I’d be curious to hear what people learn if they actually go take a look at internal research. I’ll kick it off: Gannett uses landline phones only for the massive readership surveys conducted for all papers.