Data visualization is quite fun. Perhaps when you think of data visualization, you think of ugly Microsoft Excel spreadsheets with half-a$$ed graphs.
This tutorial is meant to push you out of the Excel mindset just a little bit, and introduce you to the popular Python library, matplotlib.
The project we will create takes the sample data from the repository that you will download (a.k.a. clone) in Part 0: Setup, parses it from columns and rows into a list of dictionaries, then renders that data as two different graphs and, on GitHub, as a map.
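To make "columns and rows to a list of dictionaries" concrete, here is a minimal sketch using the standard library's csv module. The column names, the sample rows, and the `parse` helper are all made up for illustration; the real sample file may be laid out differently.

```python
import csv
import io

# Hypothetical sample: these column names and rows are invented for
# illustration and are not taken from the tutorial's data file.
SAMPLE_CSV = """Category,DayOfWeek,X,Y
THEFT,Monday,-122.41,37.77
ASSAULT,Tuesday,-122.42,37.78
"""


def parse(raw_file, delimiter=","):
    """Parse delimited text into a list of dictionaries, one per row."""
    # DictReader uses the first line as the keys for every following row.
    reader = csv.DictReader(raw_file, delimiter=delimiter)
    return [dict(row) for row in reader]


# io.StringIO stands in for an opened file so the sketch is self-contained.
rows = parse(io.StringIO(SAMPLE_CSV))
print(rows[0]["Category"])
```

Every value comes back as a string, so coordinates or counts need converting (for example with `float`) before you do any math on them.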
The sample data that is included is a snapshot of public crime filings from the San Francisco police. Once you’ve gone through this tutorial, feel free to find other data that interests you, and rework our visualization functions.
Understand how to:
What else you will be exposed to:
collections module
NumPy (pronounced num-pie) is a popular scientific library for Python that gives a developer, academic, or scientist tools to work with high-level mathematical functions as well as multi-dimensional arrays and matrices.
We won’t be using much of NumPy directly, but it must be installed before we can install and use matplotlib.
matplotlib is another popular scientific library that gives the developer tools to produce 2D figures. No longer do you need your TI-89 calculator, where you must punch in long lines of formulas and wait precious seconds for a graph to render, only to find it so zoomed in that you miss an important axis point. matplotlib comes packed with detailed examples and lets you make publication/presentation-quality graphs from the comfort of your keyboard.
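As a taste of what a matplotlib figure looks like in code, here is a rough sketch that tallies some values with `Counter` from the collections module and draws them as a bar chart. The weekday data is invented, and the non-interactive "Agg" backend and in-memory buffer are assumptions made only so the sketch runs without a display or a writable folder.

```python
import io
from collections import Counter

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical data: invented weekday values, not the tutorial's sample file.
days = ["Mon", "Tue", "Mon", "Wed", "Mon", "Tue"]

# Counter tallies how many times each value appears.
counts = Counter(days)
labels = list(counts.keys())
values = [counts[label] for label in labels]

fig, ax = plt.subplots()
ax.bar(labels, values)
ax.set_xlabel("Day of week")
ax.set_ylabel("Number of incidents")
ax.set_title("Incidents per day (made-up data)")

# Render the figure as PNG bytes in memory; pass a filename such as
# "days.png" to savefig instead to write the image to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

Labeling the axes and giving the chart a title takes three short lines, which is a big part of why the result looks more finished than a default spreadsheet graph.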
GeoJSON is a derivative of JSON, and very similar to TopoJSON. It’s a data format for simple geographic features, including coordinate points.
We’ll be using a third-party module to help us create GeoJSON files: geojson.
GitHub has an awesome feature that allows folks to paste GeoJSON files into Gists, which it then automatically renders as a map.
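To show what a GeoJSON file actually contains, here is a small sketch that builds one by hand with the standard library's json module instead of the geojson package. The rows, the "X"/"Y" column names, and the `to_feature_collection` helper are hypothetical and only stand in for real parsed data.

```python
import json

# Hypothetical rows: the keys and values are invented for illustration.
# "X" is assumed to hold longitude and "Y" latitude.
rows = [
    {"Category": "THEFT", "X": "-122.41", "Y": "37.77"},
    {"Category": "ASSAULT", "X": "-122.42", "Y": "37.78"},
]


def to_feature_collection(rows):
    """Turn a list of row dictionaries into a GeoJSON FeatureCollection."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON positions are ordered [longitude, latitude].
                "coordinates": [float(row["X"]), float(row["Y"])],
            },
            "properties": {"title": row["Category"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Serialize to text that could be saved in a .geojson file.
geo_text = json.dumps(to_feature_collection(rows), indent=2)
print(geo_text)
```

Note the coordinate order: longitude comes first, which is the opposite of how many people say "lat, long" out loud, and is an easy way to end up with points in the wrong ocean.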