So I wanted to know how much time I spend at work. I have Google track my every step through Android’s location history, so I am sitting on a proper treasure of personal data. Also, I am an advertising product, but that’s beside the point. Sooo, I pulled my location data from the Location History takeout, started a Jupyter notebook and read my location history into a pandas dataframe.

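import json
import pandas as pd

with open('LocationHistory.json', 'r') as fh:
    raw = json.loads(fh.read())

df = (
    # assumption: the takeout file keeps its records under the 'locations' key
    pd.DataFrame(raw['locations'])
    .assign(dt = lambda x: pd.to_datetime(x['timestampMs'].astype(int),unit='ms'))
    .sort_values('dt', ascending=True)
    # coordinates are stored as integers scaled by 1E7
    .assign(lat = lambda x: x['latitudeE7']/1E7)
    .assign(lon = lambda x: x['longitudeE7']/1E7)
    # hours since the previous record
    .assign(dur = lambda x: x['dt'].diff().dt.total_seconds() / 3600)
    .assign(weekday = lambda x: x['dt'].dt.weekday)
    # shift so that each row holds the hours until the next record
    .assign(dur = lambda x: x['dur'].shift(-1))
    # Monday (0) to Friday (4)
    .assign(workday = lambda x: x['weekday'].isin(range(0,5)))
)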
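
To make the duration trick concrete, here is a minimal sketch on three made-up records. The timestamps and coordinates are invented, not taken from my history, and `toy` is just an illustrative name. Each row gets the gap to the next row as its duration, so the last row has none.

```python
import pandas as pd

# Three invented records in the takeout layout (values are made up)
sample = [
    {'timestampMs': '1472716800000', 'latitudeE7': 523700000, 'longitudeE7': 49000000},
    {'timestampMs': '1472720400000', 'latitudeE7': 523700000, 'longitudeE7': 49000000},
    {'timestampMs': '1472727600000', 'latitudeE7': 523800000, 'longitudeE7': 49100000},
]

toy = (
    pd.DataFrame(sample)
    .assign(dt=lambda x: pd.to_datetime(x['timestampMs'].astype('int64'), unit='ms'))
    .sort_values('dt')
    .assign(lat=lambda x: x['latitudeE7'] / 1e7)
    .assign(lon=lambda x: x['longitudeE7'] / 1e7)
    # hours until the next record; the last record gets NaN
    .assign(dur=lambda x: x['dt'].diff().dt.total_seconds().shift(-1) / 3600)
)
print(toy[['dt', 'lat', 'lon', 'dur']])
```

The first two rows come out at 1 and 2 hours, and the last one stays empty because nothing follows it.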

Now we have a list of activities with their durations. How do we map them to work / home / other? Let’s take a look at a histogram of the latitudes and longitudes for the period since I started my current job and plot it in 2D.

import matplotlib.pyplot as plt

df_subset = df.query("'2015-09-01' < dt")
_ = plt.hist2d(df_subset['lon'], df_subset['lat'], bins=100)


That’s too far out. I must have traveled to different places, so let’s just zoom into Amsterdam.
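
Zooming in comes down to restricting the histogram to a bounding box. A rough sketch, assuming a box around Amsterdam whose corner values I picked by eye and a few made-up points; `np.histogram2d` is used here so the counts can be inspected directly, and `plt.hist2d` accepts the same `range` argument:

```python
import numpy as np

# Hypothetical bounding box roughly around Amsterdam (assumed values)
lon_min, lon_max = 4.75, 5.05
lat_min, lat_max = 52.30, 52.43

# Made-up points: two inside the box, one far away
lon = np.array([4.90, 4.91, 13.40])
lat = np.array([52.37, 52.36, 52.52])

counts, lon_edges, lat_edges = np.histogram2d(
    lon, lat, bins=100,
    range=[[lon_min, lon_max], [lat_min, lat_max]],
)
# Points outside the box are simply not counted
print(int(counts.sum()))
```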


OK, it seems most of my activities are logged at three locations. A quick check confirms those are my work and the two places I’ve lived at. Let’s label the activities that fall into the area around my work:

work = [,]
lat_range =
lon_range =

select_work = (
   (work[0] - lat_range < df['lat']) & (df['lat'] < work[0] + lat_range) &
   (work[1] - lon_range < df['lon']) & (df['lon'] < work[1] + lon_range)
)

df.loc[select_work, 'label'] = 'work'

Etc. for home and other.
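
The same box test can be wrapped in a small helper and applied to each place in turn. This is only a sketch: `in_box`, the `places` dictionary, the coordinates and the 0.005 degree margin are all made up for illustration and are not my real locations.

```python
import pandas as pd

def in_box(frame, center, lat_range, lon_range):
    """Boolean mask of rows strictly within +/- range of center (lat, lon)."""
    lat, lon = center
    return (
        (frame['lat'] - lat).abs().lt(lat_range)
        & (frame['lon'] - lon).abs().lt(lon_range)
    )

# Hypothetical centers, not real addresses
places = {'work': (52.34, 4.89), 'home': (52.37, 4.90)}

toy = pd.DataFrame({
    'lat': [52.3401, 52.3699, 48.85],
    'lon': [4.8902, 4.9001, 2.35],
})

# Everything starts as 'other' and gets overwritten when it falls into a box
toy['label'] = 'other'
for name, center in places.items():
    toy.loc[in_box(toy, center, 0.005, 0.005), 'label'] = name

print(toy)
```

Starting from 'other' means anything outside the known boxes keeps a label instead of ending up as NaN.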

Now we can use these labels to see how I’ve spent September 2016:

labeled_location_times = (df
    .query("'2016-09-05' < dt and workday == True")
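
One way to turn labeled rows into hours per day and label, sketched on made-up rows. The grouping below is an assumption about how such a summary could be built, not necessarily what my notebook does, and `toy` and `hours_per_day` are illustrative names.

```python
import pandas as pd

# Made-up labeled rows; 'dur' is in hours
toy = pd.DataFrame({
    'dt': pd.to_datetime([
        '2016-09-05 08:00', '2016-09-05 09:00',
        '2016-09-05 18:00', '2016-09-06 09:00',
    ]),
    'label': ['home', 'work', 'home', 'work'],
    'dur': [1.0, 9.0, 15.0, 8.5],
})

hours_per_day = (
    toy
    .assign(date=lambda x: x['dt'].dt.date)
    # total hours per calendar day and label
    .groupby(['date', 'label'])['dur'].sum()
    # one column per label, one row per day
    .unstack('label')
)
print(hours_per_day)
```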



I’ve spent a lot of time at work this September. Let’s see how much overtime I usually do:

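work_hours = (
    df.query("'2015-09-01' < dt and workday == True and label == 'work'")
      # assumed step: sum the durations per calendar day to get daily hours at work
      .assign(date = lambda x: x['dt'].dt.date)
      .groupby('date')['dur'].sum()
      .to_frame('work_hours')
      .assign(over_hours = lambda x: x['work_hours'] - 8)
      .assign(over_hours_rolling_mean =
              lambda x: x['over_hours'].rolling(10, center=True).mean())
)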

ax = work_hours.filter(like='over').plot()


It seems I do a lot of overtime. I should probably get a hobby. So what is my effective hourly rate? What share of the hours I work do I actually get paid for?

           .assign(eff = lambda x: 8 / x['work_hours'])

I average 9.45 hours at work per day, so my effective hourly rate is 85.7% of my actual hourly rate.
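
A note on the arithmetic: `eff` is a per-day ratio, and the average of daily ratios is not the same as 8 divided by the average day (8 / 9.45 is about 84.7%). Assuming the 85.7% is the mean of the daily `eff` values, the small gap is expected. A sketch with made-up hours shows the effect:

```python
import pandas as pd

# Made-up daily hours at work, for illustration only
toy = pd.DataFrame({'work_hours': [8.0, 10.0, 12.0]})
toy = toy.assign(eff=lambda x: 8 / x['work_hours'])

mean_of_ratios = toy['eff'].mean()              # average of the daily ratios
ratio_of_means = 8 / toy['work_hours'].mean()   # 8 divided by the average day

print(round(mean_of_ratios, 3), round(ratio_of_means, 3))
```

The mean of the ratios comes out higher whenever the daily hours vary, because short days pull the ratio up more than long days pull it down.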

So there you have it! If there is enough interest, I will also publish the notebook.