If You Care About Your Data . . . $B1:$C2#?!

By Charles Falk

More than ever before everything is data and data is everything. Regrettably, managing data has become a black art involving Microsoft Excel VLOOKUPs and highly individualized spreadsheets. Though there has been a push to move data to more robust systems such as SQL Server, a large portion of data ends up at the opposite end of the software spectrum—in scary spreadsheets that are emailed and cut and pasted and generally lack the validation necessary to maintain data integrity.

The reasons for this misuse of software are numerous, but the most common are a lack of resources and limited skills. In both small businesses and large institutions (particularly those whose IT budgets have been slashed in the current recession), individuals often find it necessary to go it alone and come up with ad hoc methods for entering and retrieving data in programs they are familiar with—usually Excel. Overreliance on spreadsheets occurs when other options seem prohibitively difficult or expensive to implement. It can also be the result of a bunker mentality that sets in or when IT enforces the use of specialized, tightly controlled databases, driving some users to go rogue.

The spreadsheet solution seems great at first. You are empowering yourself and getting over on The Man. But over time a spreadsheet tends to turn into a mess of workbooks and worksheets that starts looking more like game of Battleship as your data sinks in a grid of B9:Z88 cells. Ultimately each one evolves into a hopelessly idiosyncratic contrivance that only one user understands. And then you go on vacation and someone renames a worksheet and all your VLOOKUPS and calculations fail. Or worse, you work late one evening and distractedly sort your columns—and scramble your data. Sometimes the spreadsheet solution is flexible to a fault.

When the situation approaches crisis proportions a data analyst is called in to fix the problem. Shouldn't be too difficult—after all, it’s just data, right? Well, one person's data can be another person's nightmare of trying to interpret the myriad of calculated cells across multiple spreadsheets. More often than not, it turns out that columns are riddled with various text headings, pictures, and random cells or nulls that make it impossible to easily integrate the data into another platform like SQL Server or Microsoft Access.

So if you really care about your data, use a program that is designed to protect data through validation using form controls and is robust enough to allow for relational tables. Excel is a highly flexible spreadsheet program that is exceptional for analyzing data but should not be relied upon as a data storage system. For that, there are plenty of off-the-shelf software solutions or developers who can design a custom database to meet your data system needs. The spreadsheets with the most reliable data are usually those that are outputted from a database system.