Friday, October 31, 2014

creating Chrysler MOP/MIS charts in Pandas

Pandas is a Python package meant to be used for data analysis.  I love Python, but I'm also a long-term R user.  I've really come to appreciate how quickly I can create a customized chart in R, but let's try a practical example in Pandas and see how it compares to R.

Here's what I came up with, I'm new to Pandas and I'm willing to consider that maybe this isn't the most efficient way to get a chart up:

import pandas as pd
from matplotlib import pyplot as plt

data = pd.read_csv("../charts/my14 mis12/recovery%all_smlc%all_lcc.csv")

#change from int to string, I can do this in the call to read_csv also, but I think that way I need a dict of all column names
data["Model Year"] = data["Model Year"].astype("str")

#clean up column names, remove forward slashes
data = data.rename(columns=lambda x: x.replace('/', ''))

#construct a column of model year - MOP, where MOP is always 2 character
data['MY_MOP'] = "0"+data['MOP']
data['MY_MOP'] = [i[-2:] for i in data["MY_MOP"]]
data['MY_MOP'] = data["Model Year"]+"-"+data["MY_MOP"]

#narrow the dataframe down to just what I need:
data = data[data.MOP!="TOTALS"][data.YTDCMP=="YTD"][['MY_MOP','MIS','C1000']]

#pivot table of C/1000 by Model year - MOP and MIS
#obviously work to do on axis labels!
table = data.pivot_table(values="C1000",index=["MY_MOP"],columns=["MIS"],aggfunc=max)
table.plot()
plt.show()