Got data? These useful tools can turn it into informative, engaging graphics.
You may not think you’ve got much in common with an investigative journalist or an academic medical researcher. But if you’re trying to extract useful information from an ever-increasing inflow of data, you’ll likely find visualization useful — whether it’s to show patterns or trends with graphics instead of mountains of numbers, or to try to explain complex issues to a nontechnical audience.
There are many tools around to help turn data into graphics, but they can carry hefty price tags. The cost can make sense for professionals whose primary job is to find meaning in mountains of information, but you might not be able to justify such an expense if you or your users only need a graphics application from time to time, or if your budget for new tools is somewhat limited. If one of the higher-priced options is out of your reach, there are a surprising number of highly robust tools for data visualization and analysis that are available at no charge.
They range from easy enough for a beginner (i.e., anyone who can do rudimentary spreadsheet data entry) to expert (requiring hands-on coding). But they all share one important characteristic: They’re free. Your main investment: time.
Before you can analyze and visualize data, it often needs to be cleaned. What does that mean? Perhaps some entries list “New York City” while others say “New York, NY” and you need to standardize them before you can see patterns. There might be some records with misspellings or numerical data-entry errors. “Cleaning” tools are designed to help get your data in shape to be analyzed.
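To make the “New York City” versus “New York, NY” problem concrete, here is a minimal Python sketch of what standardizing entries can look like if you do it by hand. The mapping table, the function name and the sample records are all hypothetical, made up for illustration; the cleaning tools reviewed here do this kind of work interactively rather than through code like this.

```python
# Hypothetical lookup table: every known variant maps to one canonical spelling.
CANONICAL = {
    "new york city": "New York, NY",
    "new york, ny": "New York, NY",
    "nyc": "New York, NY",
}

def standardize_city(value: str) -> str:
    """Collapse stray whitespace, then swap known variants for the canonical form."""
    tidy = " ".join(value.split())
    return CANONICAL.get(tidy.lower(), tidy)

# Made-up sample entries, including one with extra spaces and odd capitalization.
records = ["New York City", "New York, NY", "  new york city ", "Boston, MA"]
cleaned = [standardize_city(r) for r in records]
print(cleaned)
```

Entries that are not in the lookup table pass through unchanged (apart from whitespace), so a sketch like this only fixes the variants you already know about.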
DataWrangler (and subsequently Trifacta)
What they do: The DataWrangler web-based service from Stanford University’s Visualization Group is designed for cleaning and rearranging data so it’s in a form that other tools such as a spreadsheet app can use.
Click on a row or column, and DataWrangler will suggest changes. For example, if you click on a blank row, several suggestions pop up such as “delete row” or “delete empty rows.”
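For comparison, a “delete empty rows” operation is easy to picture in code. The following Python sketch is only an approximation of the idea, not DataWrangler’s actual implementation; the sample data and the function name are invented for this example.

```python
import csv
import io

# Hypothetical sample data: one fully blank line and one row of empty cells.
raw = "Year,Property crime rate\n2004,4029.3\n\n2005,3900\n,\n2006,3937\n"

def drop_empty_rows(rows):
    """Keep only rows that contain at least one non-blank cell."""
    return [row for row in rows if any(cell.strip() for cell in row)]

rows = list(csv.reader(io.StringIO(raw)))
cleaned = drop_empty_rows(rows)
print(cleaned)
```

Both the blank line and the row made up solely of empty cells are treated as empty here; a real tool may draw that line differently.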
There’s also a history list that allows for easy undo — a feature that’s also available in Open Refine (reviewed next).
The team behind DataWrangler later went to work on the commercial Trifacta product, although the original web service can still be used as is. Trifacta is desktop software. Its free version allows one user (without collaboration) and can import local CSV, JSON, text and Excel files.
What’s cool: Text editing is especially easy in DataWrangler. For example, when I selected “Alabama” in one row of sample data headlined “Reported crime in Alabama” and then selected “Alaska” in the next group of data, it led to a suggestion to extract every state name. Hover your mouse over a suggestion, and you can see affected rows highlighted in red.
Drawbacks: I found that unexpected changes occurred as I attempted to explore DataWrangler’s options; I constantly had to click “clear” to reset. And not all suggestions are useful (“promote row to header” seemed an odd suggestion when the row was blank) or easy to understand (“fold split 1 using 2 as key”).
Skill level: Advanced beginner
Runs on: Any web browser for DataWrangler; Windows or macOS for Trifacta
Learn more: There’s a screencast on the DataWrangler home page. Also, see this post on using DataWrangler to format data (from Tableau Public’s blog). For more on Trifacta, see its resources page.