Which states are the most concerned by gun crime?

I recently discovered the Capitol Words API and have had some fun playing around with it. One of the categories in the API allows you to search for the words spoken by the senators of each state in the USA, and I was interested in finding out the number of times the word “gun” was recorded on a state bill between January 2012 and December 2013.


As we can see, the most populous states, New York, California, Illinois and, to a lesser extent, Texas, mention the word “gun” the most often. It is interesting (but not surprising) to note that the more Republican and pro-gun Midwestern states are conspicuously quiet about guns. We can also track the monthly frequency with which the word “gun” was mentioned in state bills between January 2012 and December 2013:


The sharp peak we observe across many states in April 2013 illustrates the national response and outrage that followed the tragic Boston Marathon bombing and subsequent shootings. We can also see that the state of California shows peaks in February, June and November 2013, which can be associated with the Christopher Dorner shooting, the June 7 Santa Monica shooting and the November 1 LAX shooting.

Finally, we can explore the underlying relationship between references to “education” in state bills and references to “gun” and “shooting”. Again, the obvious outliers are Connecticut, California and Illinois, which all refer to education an unusually large number of times. Interestingly, if these three outliers were removed, we could argue that a decent linear fit (with a positive coefficient) could be achieved between the number of times the word “education” is stated in a bill and the number of times “gun” and “shooting” are. In that case, we could interpret this as education being mentioned as a result of gun crime and shootings (a causal analysis would be in order for future work, namely finding the average lag time between shooting events and the reaction of statesmen).


Relationship between the number of times the words “shooting” and “education” were mentioned in state bills between January 2012 and December 2013


Relationship between the number of times the words “gun” and “education” were mentioned in state bills between January 2012 and December 2013
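The linear fit with the three outliers removed, as discussed above, could be sketched roughly as follows. This is only an illustration: the counts are made-up placeholders rather than the values behind the figures, and fit_line is a hypothetical helper written with the standard library only.

```python
# Ordinary least-squares fit of y = slope * x + intercept (standard library only).
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Placeholder counts per state: (mentions of "gun", mentions of "education").
# These numbers are invented for illustration only.
counts = {
    'CT': (40, 900),
    'CA': (120, 1500),
    'IL': (90, 1200),
    'OH': (20, 110),
    'WA': (30, 160),
    'GA': (10, 60),
    'MN': (25, 135),
}
outliers = {'CT', 'CA', 'IL'}

# Drop the outliers, then fit "education" against "gun" on the remaining states.
kept = [pair for state, pair in counts.items() if state not in outliers]
gun = [g for g, _ in kept]
education = [e for _, e in kept]
slope, intercept = fit_line(gun, education)
print('slope = %.2f, intercept = %.2f' % (slope, intercept))
```

A positive slope here would correspond to the positive coefficient mentioned above; it says nothing about causality on its own.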


Using APIs in Python: a quick example

Python has an extremely intuitive and straightforward way of dealing with APIs, and makes it simple for people like you or me to access and retrieve information from databases. Before I quickly describe how to use APIs in Python, maybe we should begin with: What is an API?

API (Application Programming Interface): An API is a software intermediary that makes it possible for application programs to interact with each other and share data. It’s often an implementation of REST that exposes a specific software functionality while protecting the rest of the application (definition taken straight from Google itself). Here, it is important to also define the REST acronym (Representational State Transfer), which is a fancy way of describing a set of conventions for sending and receiving data (in JSON, XML or even plain text format) between a client and a server, typically over HTTP.
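To make the client/server exchange a bit more concrete, here is a small sketch of what a REST-style request and a JSON response look like, using only the Python standard library. The endpoint http://example.com/api/items.json, the parameter names and the JSON payload are placeholders invented for illustration, not a real service.

```python
import json
from urllib.parse import urlencode

# The client encodes its query as key=value pairs appended to the endpoint URL.
endpoint = 'http://example.com/api/items.json'  # placeholder endpoint
params = {'phrase': 'debt', 'per_page': 100}
url = endpoint + '?' + urlencode(params)
print(url)

# The server answers with text (here JSON), which the client parses into
# ordinary Python dictionaries and lists.
raw_response = '{"results": [{"state": "NY", "count": 12}]}'  # made-up payload
data = json.loads(raw_response)
print(data['results'][0]['state'], data['results'][0]['count'])
```

Libraries such as requests take care of both steps (building the URL and decoding the JSON) for you.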

A typical (and popular!) use of APIs revolves around the Twitter API, which many people have used to predict outcomes, perform sentiment analysis, geomapping, etc. (for example: http://www.kazemjahanbakhsh.com/codes/election.html)

But without further ado, I will show how to use APIs in Python with the awesome Capitol Words API (heavily inspired by the highly recommended Codecademy tutorial):

# import required libraries
import requests

# set query parameters
query_params = {
    'apikey': 'XXXXXXXXXXXXXX',  # replace with your own API key
    'per_page': 100,
    'phrase': 'debt',
    'sort': 'count desc',
    'start_date': '2012-01-01',
    'end_date': '2013-12-31',
}

# define the endpoint we wish to query
endpoint = 'http://capitolwords.org/api/phrases/state.json'

# extract data
response = requests.get(endpoint, params=query_params)
response.raise_for_status()  # stop here if the request failed
data = response.json()

# write one "count,state" line per result to a file
with open('debt_reference.txt', 'w') as out_file:
    for result in data['results']:
        out_file.write('%s,%s\n' % (result['count'], result['state']))


Here, we used the ‘phrases/state.json’ endpoint to find how many times the word “debt” was uttered by senators of each US state between January 1 2012 and December 31 2013.
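Once the response has been parsed, the results can be reshaped however you like. As a minimal sketch, the snippet below ranks states by count. The sample payload is invented and only mirrors the ‘count’ and ‘state’ fields used in the code above, and top_states is a hypothetical helper, not part of the API.

```python
def top_states(data, n=3):
    """Return the n (state, count) pairs with the highest counts."""
    pairs = [(r['state'], r['count']) for r in data['results']]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:n]

# Made-up payload with the same shape as the parsed response used above.
sample = {'results': [
    {'state': 'TX', 'count': 7},
    {'state': 'NY', 'count': 15},
    {'state': 'CA', 'count': 11},
    {'state': 'VT', 'count': 2},
]}

top = top_states(sample, n=2)
for state, count in top:
    print('%s,%s' % (state, count))
```

Sorting locally like this does not rely on the order in which the server returns its results.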

In order for the code above to work, you will need to obtain your own API key and insert it in the code. You can obtain an API key by simply registering (for free) for the Capitol Words API: http://sunlightlabs.github.io/Capitol-Words/
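If you would rather not paste the key directly into your script, one option is to read it from an environment variable. This is just a sketch: the variable name CAPITOL_WORDS_API_KEY is an arbitrary choice made here, not something the API requires, and build_query_params is a hypothetical helper.

```python
import os

def build_query_params(phrase, env_var='CAPITOL_WORDS_API_KEY'):
    """Build the query parameters, reading the API key from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError('Set the %s environment variable first' % env_var)
    return {
        'apikey': api_key,
        'per_page': 100,
        'phrase': phrase,
        'sort': 'count desc',
        'start_date': '2012-01-01',
        'end_date': '2013-12-31',
    }
```

The returned dictionary can then be passed as the params argument of requests.get, exactly like query_params in the example above.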