Data, data, data!


pope tracker

February 24, 2013 | Tags: python, pandas, ipython

Although I’m not Catholic, the papal election fascinates me. I’m not sure if it’s the old rituals, the worldwide interest, or simply the fact that the Catholic Church has left a huge mark on history.

There’s no way I know enough about the inner workings of the Catholic Church to have any idea who the next Pope may be.

Since domain knowledge is out, the next best option?

Follow the money!

Last week, I set up a scraper to pull the odds hourly from a prominent bookmaker. With a week’s worth of data in hand, it was time to fire up IPython and see the action.

[Figure: line plot titled "Probability of Being the Next Pope", one line per leading candidate over time]

The betting lines seem to have stabilized over the last few days. As we get closer to the papal conclave and the remaining cardinal electors arrive in Rome, the lines should show some movement.
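For anyone wondering how a betting line turns into a probability: here is a minimal sketch, assuming the bookmaker quotes fractional odds such as 7/2. The format actually stored in odds.csv may differ, and the helper name `implied_probability` is made up for illustration. Note that it does not strip out the bookmaker margin, so the implied probabilities across all candidates would add up to more than 1.

```python
from fractions import Fraction


def implied_probability(fractional_odds):
    """Convert fractional odds such as '7/2' to an implied probability.

    Odds of a/b mean a profit of a on a stake of b, so the implied
    probability is b / (a + b), which equals 1 / (1 + a/b).
    The bookmaker margin is NOT removed here.
    """
    odds = Fraction(fractional_odds)
    return float(1 / (1 + odds))


# hypothetical quotes, not taken from the real data set
for quote in ["7/2", "4/1", "33/1"]:
    print(quote, round(implied_probability(quote), 3))
```

Shorter odds mean a higher implied probability, which is why the favorites sit at the top of the plot.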

Script to generate the plot:

  import csv
  import datetime
  import urllib.request
  import pandas as pd

  # read in .csv
  url = "https://raw.github.com/dataparadigms/popeTracker/master/odds.csv"
  webpage = urllib.request.urlopen(url)

  # convert to pandas data frame
  odds = pd.read_csv(webpage, 
    header = None, 
    names=['date','position','name', 'country', 'odds','probability'])

  # parse the YYYYmmddHHMMSS timestamp into a real datetime
  odds['date'] = pd.to_datetime(odds['date'].astype(str),
    format='%Y%m%d%H%M%S')

  # drop duplicate (date, name) rows, keeping the last one seen
  odds = odds.drop_duplicates(subset=['date','name'], keep='last')

  # pivot to get one column per name
  data = odds.pivot(index = 'date',
          columns = 'name',
          values = 'probability')

  # keep those with a > .1 chance of winning
  leaders = data[data > .1]
  leaders = leaders.dropna(axis=1, how='all')

  # make the plot
  leaders.plot(title='Probability of Being the Next Pope', 
    grid=True,
    figsize=(10,10));
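To see what the dedupe, pivot, and filter steps do without downloading anything, here is a small sketch on made-up rows. The names A, B, and C and their probabilities are invented for illustration and are not real odds; the column names match the script above.

```python
import pandas as pd

# Made-up rows for illustration only; these are not real odds.
toy = pd.DataFrame({
    'date': pd.to_datetime(['2013-02-20', '2013-02-20', '2013-02-20',
                            '2013-02-21', '2013-02-21', '2013-02-21',
                            '2013-02-21']),
    'name': ['A', 'B', 'C', 'A', 'B', 'C', 'C'],
    'probability': [0.25, 0.12, 0.02, 0.22, 0.08, 0.03, 0.04],
})

# keep only the last row for each (date, name) pair
toy = toy.drop_duplicates(subset=['date', 'name'], keep='last')

# one row per date, one column per name
wide = toy.pivot(index='date', columns='name', values='probability')

# values of .1 or less become NaN; columns that are all NaN are dropped
leaders = wide[wide > .1].dropna(axis=1, how='all')
print(leaders)
```

In this toy data, C never clears the threshold and is dropped entirely, while B clears it on the first day only. B keeps its column but gets a NaN on the second day, so a candidate who dips below .1 shows up as a gap in the plotted line rather than vanishing from the chart.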

The full code, data set, and IPython notebook are available on GitHub.

Thoughts?
