Introduction

Lets see who really has fantasy football bragging rights and who is just lucky (or unlucky).

During this analysis I will display most of my code. While this doesn’t provided the most pleasent visual experience, it will hopefully help the reader learn from my examples. (And potentially provide feedback on how to improve my methods.)

Libraries

The libraries I will use for this analysis are the following:

library(tidyverse)
library(stringr)
library(httr)
library(jsonlite)
library(knitr)
library(DT)

Data Manipulation

It took me a while, but I finally figured out how to access the ESPN Fantasy Football API.

url = "http://games.espn.com/ffl/api/v2/leagueSettings?leagueId=1314476&seasonId=2017"
raw.result <- GET(url = url)
this.raw.content <- rawToChar(raw.result$content)
this.content <- fromJSON(this.raw.content)

Clean and Parse

The API attains the data in JSON format which can be difficult to work with. While I am proud about some of my parsing, it does leave much that can be improved. The experienced with R/JSON may look down on my methods… but they work.

I’ll start by creating a function:

Note - I made a change recently. Apparently ESPN has changed their API. You can see the new code and the old code (commented out) below. Thanks so much to Jeffrey Sumner below for the solution!

games<-function(n)
{
#home and away team scores
scoreaway = lapply(n, "[",c("awayTeamScores"))
scorehome = lapply(n, "[",c("homeTeamScores"))
#who the away team is
test = lapply(n, "[[",4)
awayteam2 = lapply(test,"[",c("teamLocation"))
awayteam1 = lapply(test,"[",c("teamNickname"))
#who the home team is
test2 = lapply(n, "[[",6)
hometeam2 = lapply(test2,"[",c("teamLocation"))
hometeam1 = lapply(test2,"[",c("teamNickname"))

fhelper = data.frame(Away = paste(unlist(awayteam2), unlist(awayteam1)), Home = paste(unlist(hometeam2), unlist(hometeam1)), awayscore = unlist(scoreaway), homescore = unlist(scorehome), week = 1:16)
return(fhelper)
}
# games<-function(n)
# {
#   #home and away team scores
#   scoreaway = lapply(n, "[",c(2))
#   scorehome = lapply(n, "[",c(9))
#   #who the away team is
#   test = lapply(n, "[[",3)
#   awayteam1 = lapply(test,"[",4)
#   awayteam2 = lapply(test,"[",6)
#   #who the home team is
#   test2 = lapply(n, "[[",12)
#   hometeam1 = lapply(test2,"[",4)
#   hometeam2 = lapply(test2,"[",6)
#   
#   fhelper = data.frame(Away = paste(unlist(awayteam2), unlist(awayteam1)), Home = paste(unlist(hometeam2), unlist(hometeam1)), awayscore = unlist(scoreaway), homescore = unlist(scorehome), week = 1:16)
#   return(fhelper)
# }

After writing my funtion, I’ll run it:

fantasystart = games(
  c(
    this.content$leaguesettings$teams$`1`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`2`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`3`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`4`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`5`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`6`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`7`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`8`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`9`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`10`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`11`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`12`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`13`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`14`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`15`$scheduleItems$matchups,
    this.content$leaguesettings$teams$`16`$scheduleItems$matchups
  )
)

Lets look at the data. It contains every game and result thoughtout the season. This includes games that have not been played yet so we’ll have to account for that later.

as.tibble(fantasystart)
## # A tibble: 256 x 5
##    Away                   Home                   awayscore homescore  week
##    <chr>                  <chr>                      <dbl>     <dbl> <int>
##  1 'R'm Chair Quarterback Epsilon Ballers            125.       64.3     1
##  2 Team Guns N' Rosas     'R'm Chair Quarterback     151.      109.      2
##  3 2-2-2 Mega-H-Ertz      'R'm Chair Quarterback     138.       99.5     3
##  4 'R'm Chair Quarterback Las Vegas Radians           85.6     105.      4
##  5 Green Baye's Theorem   'R'm Chair Quarterback      83.5     121.      5
##  6 'R'm Chair Quarterback Baby got Dak               101.       73.4     6
##  7 'R'm Chair Quarterback a^2+b^2= c^hawks           130.       78.3     7
##  8 'R'm Chair Quarterback Team LaPorte                83.8      89.6     8
##  9 Berry Berry Good       'R'm Chair Quarterback      98        93.4     9
## 10 'R'm Chair Quarterback The Big Gronkowski          97.2     124.     10
## # ... with 246 more rows

Transform

So far so good. Now lets transform the data in 4 columns: Team Name, Score, Who Scored The Points, and the Week of the Matchup.

fantasy = 
fantasystart %>%
select(Away, awayscore, week) %>%
  mutate(Team = Away) %>%
  mutate(Score = awayscore) %>%
  mutate(Who = "For") %>%
  select(Team, Score, Who, week) %>%
  bind_rows(
    fantasystart %>%
      select(Home, homescore, week) %>%
      mutate(Team = Home) %>%
      mutate(Score = homescore) %>%
      mutate(Who = "For") %>%
      select(Team, Score, Who, week)
  ) %>%
  bind_rows(
    fantasystart %>%
      select(Away, homescore, week) %>%
      mutate(Team = Away) %>%
      mutate(Score = homescore) %>%
      mutate(Who = "Against") %>%
      select(Team, Score, Who, week) 
   ) %>%
  bind_rows(
    fantasystart %>%
      select(Home, awayscore, week) %>%
      mutate(Team = Home) %>%
      mutate(Score = awayscore) %>%
      mutate(Who = "Against") %>%
      select(Team, Score, Who, week)
  )

Peak at the data again.

##                     Team  Score Who week
## 1 'R'm Chair Quarterback 164.50 For   11
## 2 'R'm Chair Quarterback 164.50 For   11
## 3     Team Guns N' Rosas 163.42 For   16
## 4     Team Guns N' Rosas 163.42 For   16
## 5     Team Guns N' Rosas 159.86 For    3
## 6     Team Guns N' Rosas 159.86 For    3
##                 Team  Score     Who week
## 1        Team Grymes 164.50 Against   11
## 2        Team Grymes 164.50 Against   11
## 3 Calgary Stampeders 163.42 Against   16
## 4 Calgary Stampeders 163.42 Against   16
## 5  Las Vegas Radians 159.86 Against    3
## 6  Las Vegas Radians 159.86 Against    3

Now that our data is taking shape, lets add a row to indicate which team performed the worse each week - and also who played the team that performed the worse.

In our league, the worse performane wins a golden plunger, hence the variable names.

plung = 
fantasy %>%
  group_by(week) %>%
  summarise(Plunger = min(Score))

fantasy = 
  fantasy %>%
  left_join(plung) %>%
  mutate(Plunger = ifelse(Plunger==Score,1,0))

as.tibble(fantasy)
## # A tibble: 1,024 x 5
##    Team                   Score Who    week Plunger
##    <chr>                  <dbl> <chr> <int>   <dbl>
##  1 'R'm Chair Quarterback 125.  For       1       0
##  2 Team Guns N' Rosas     151.  For       2       0
##  3 2-2-2 Mega-H-Ertz      138.  For       3       0
##  4 'R'm Chair Quarterback  85.6 For       4       0
##  5 Green Baye's Theorem    83.5 For       5       0
##  6 'R'm Chair Quarterback 101.  For       6       0
##  7 'R'm Chair Quarterback 130.  For       7       0
##  8 'R'm Chair Quarterback  83.8 For       8       0
##  9 Berry Berry Good        98   For       9       0
## 10 'R'm Chair Quarterback  97.2 For      10       0
## # ... with 1,014 more rows

Mutate

Now lets mutate the data to get relevant descriptive statistics.

fantasy =
fantasy %>%
  filter(Score > 0) %>%
  mutate(Score = as.numeric(as.character(Score))) %>%
  mutate(Team = ifelse(Team == "5 Guns N' Rosas", "Team Guns N' Rosas", Team)) %>%
  mutate(Team = ifelse(Team == "3 Baye's Theorem", "Green Baye's Theorem", Team)) %>%
  mutate(Team = ifelse(Team == "16 Grymes", "Team Grymes", Team)) %>%
  group_by(Team, Who) %>%
  summarise(AverageScore = mean(Score), 
            STDEV = sd(Score), 
            max = max(Score), 
            min = min(Score),
            Plunger = sum(Plunger/2)
            ) %>%
  mutate(PlungerLuck = ifelse(Who == "Against",Plunger,""), Plunger = ifelse(Who == "For", Plunger, "")) %>%
  ungroup() %>%
  arrange(desc(Who), AverageScore) 

Lets look at the data again.

## # A tibble: 5 x 8
##   Team                   Who   AverageScore STDEV   max   min Plunger PlungerLuck
##   <chr>                  <chr>        <dbl> <dbl> <dbl> <dbl> <chr>   <chr>      
## 1 Team Guns N' Rosas     For           116.  27.3  163.  72.5 0       ""         
## 2 Team Mauled Mentos     For           116.  18.7  147.  78.2 0       ""         
## 3 The Big Gronkowski     For           115.  20.0  148.  68.6 0       ""         
## 4 'R'm Chair Quarterback For           109.  22.8  164.  83.8 0       ""         
## 5 Calgary Stampeders     For           108.  27.4  158.  68.9 0       ""
## # A tibble: 5 x 8
##   Team                   Who     AverageScore STDEV   max   min Plunger PlungerLuck
##   <chr>                  <chr>          <dbl> <dbl> <dbl> <dbl> <chr>   <chr>      
## 1 Team Guns N' Rosas     Against         113.  20.9  145.  66.1 ""      1          
## 2 a^2+b^2= c^hawks       Against         109.  21.9  150.  73.2 ""      0          
## 3 Team Grymes            Against         108.  27.7  164.  68.6 ""      0          
## 4 'R'm Chair Quarterback Against         107.  30.1  155.  64.3 ""      0          
## 5 Green Baye's Theorem   Against         104.  20.4  143.  61.3 ""      1

Plotting Preparation

Getting even better. Now we have the descriptive statistics, we need to add a couple of columns to help with plotting the data.

To do this, I will:

  1. Add a dummy variable dummy to help my plotting aesthetics.
  2. Add a dummy variable ord to help set the order of my facets.
  3. Set the order of my Teams so ggplot will facet them in the right order.
fantasy = 
fantasy %>%
  mutate(dummy = 0) %>%
  mutate(ord = row_number())
  
fantasy$Team = factor(fantasy$Team, levels=unique(fantasy$Team[order(fantasy$ord)]), ordered=TRUE)

One last look at the data:

## # A tibble: 5 x 10
##   Team                   Who   AverageScore STDEV   max   min Plunger PlungerLuck dummy   ord
##   <ord>                  <chr>        <dbl> <dbl> <dbl> <dbl> <chr>   <chr>       <dbl> <int>
## 1 Team Guns N' Rosas     For           116.  27.3  163.  72.5 0       ""              0    16
## 2 Team Mauled Mentos     For           116.  18.7  147.  78.2 0       ""              0    15
## 3 The Big Gronkowski     For           115.  20.0  148.  68.6 0       ""              0    14
## 4 'R'm Chair Quarterback For           109.  22.8  164.  83.8 0       ""              0    13
## 5 Calgary Stampeders     For           108.  27.4  158.  68.9 0       ""              0    12
## # A tibble: 5 x 10
##   Team                   Who     AverageScore STDEV   max   min Plunger PlungerLuck dummy   ord
##   <ord>                  <chr>          <dbl> <dbl> <dbl> <dbl> <chr>   <chr>       <dbl> <int>
## 1 Team Guns N' Rosas     Against         113.  20.9  145.  66.1 ""      1               0    32
## 2 a^2+b^2= c^hawks       Against         109.  21.9  150.  73.2 ""      0               0    31
## 3 Team Grymes            Against         108.  27.7  164.  68.6 ""      0               0    30
## 4 'R'm Chair Quarterback Against         107.  30.1  155.  64.3 ""      0               0    29
## 5 Green Baye's Theorem   Against         104.  20.4  143.  61.3 ""      1               0    28

Results

Plot

Nice, now lets plot the data. It will be valuable to see how each team scores (dark values) verses how much they get scored against (light values). We’ll plot the max and min scores as well as a confidence interval at the true mean of their score.

a=.5
ggplot(data = fantasy, aes(y = AverageScore, x = dummy,ymin = AverageScore-2*STDEV, ymax = AverageScore+2*STDEV)) +
# ggplot(data = fantasy, aes(y = AverageScore, x = dummy,ymin = min, ymax = max)) +
  geom_point(aes(alpha = Who, shape = Who, size = a)) +
  geom_point(aes(y = max, x = dummy, color = "red", alpha = Who, shape = Who, size = a)) +
  geom_point(aes(y = min, x = dummy, color = "blue", alpha = Who, shape = Who, size = a)) +
  guides(colour=FALSE, size = FALSE) +
  geom_errorbar(aes(alpha = Who)) +
  scale_alpha_discrete(range=c(0.3, 1)) +
  scale_radius() +
  geom_text(aes(y = 35, x = 0, label = Plunger, fontface = "bold")) +
  geom_text(aes(y = 175, x = 0, label = PlungerLuck, fontface = "bold")) +
  facet_grid(Team~.) +
  # geom_text(hjust = 0, aes(label = Plunger)) +
  theme(strip.text.y = element_text(angle = 0),
        # axis.title.y=element_blank(),
        axis.text.y=element_blank(),
        # axis.text.y=Plunger,
        axis.ticks.y =element_blank()) +
  xlab("Left - Plungers Earned : Right - Plungers Played Against") + ylab("Average Weekly Score") +
  coord_flip() +
  ggtitle("Is Your Team As Bad (or Good) As You Think It Is?") +
  labs(caption = "Through Week 10") 
## Warning: Using alpha for a discrete variable is not advised.

Analysis

If you’ll take a look at the chart, you can interpret the chart in the following way:

  1. Teams are sorted by average points scored per game.
  2. The unshaded colors note points your team has scored.
  3. The shaded colors note points that have been scored against you.
  4. The confidence interval contains the true mean points that your team scores.
  5. The blue dots are the max points you scored in a week (or was scored against you if shaded).
  6. The red dots are the min points you scored in a week (or was scored against you if shaded).

Key takeaways:

  1. Even the “best” teams do not score significantly more than the “worst” teams.
  2. Potentially, the good teams don’t necessarily score more, they just have less points scored against them.
  3. Teams with inconsistent scoring: Baby Got Dak, Guns N’ Rosas
  4. Teams with consistent scoring: Team Mauled Mentos, ‘R’m Chair Quarterback, a^2 + b^2 = c^Hawks
  5. Teams who are ranked higher have generally played teams who have won the plunger.

Measures of Luck

Lets extract some more data from our JSON output to help us measure the ‘luck’ teams are experiencing. In the painful code below, we extract team names and their current playoff seeding.

standing = data.frame(
  Team = c(paste(this.content$leaguesettings$teams$`1`$teamLocation, this.content$leaguesettings$teams$`1`$teamNickname),
           paste(this.content$leaguesettings$teams$`2`$teamLocation, this.content$leaguesettings$teams$`2`$teamNickname),
           paste(this.content$leaguesettings$teams$`3`$teamLocation, this.content$leaguesettings$teams$`3`$teamNickname),
           paste(this.content$leaguesettings$teams$`4`$teamLocation, this.content$leaguesettings$teams$`4`$teamNickname),
           paste(this.content$leaguesettings$teams$`5`$teamLocation, this.content$leaguesettings$teams$`5`$teamNickname),
           paste(this.content$leaguesettings$teams$`6`$teamLocation, this.content$leaguesettings$teams$`6`$teamNickname),
           paste(this.content$leaguesettings$teams$`7`$teamLocation, this.content$leaguesettings$teams$`7`$teamNickname),
           paste(this.content$leaguesettings$teams$`8`$teamLocation, this.content$leaguesettings$teams$`8`$teamNickname),
           paste(this.content$leaguesettings$teams$`9`$teamLocation, this.content$leaguesettings$teams$`9`$teamNickname),
           paste(this.content$leaguesettings$teams$`10`$teamLocation, this.content$leaguesettings$teams$`10`$teamNickname),
           paste(this.content$leaguesettings$teams$`11`$teamLocation, this.content$leaguesettings$teams$`11`$teamNickname),
           paste(this.content$leaguesettings$teams$`12`$teamLocation, this.content$leaguesettings$teams$`12`$teamNickname),
           paste(this.content$leaguesettings$teams$`13`$teamLocation, this.content$leaguesettings$teams$`13`$teamNickname),
           paste(this.content$leaguesettings$teams$`14`$teamLocation, this.content$leaguesettings$teams$`14`$teamNickname),
           paste(this.content$leaguesettings$teams$`15`$teamLocation, this.content$leaguesettings$teams$`15`$teamNickname),
           paste(this.content$leaguesettings$teams$`16`$teamLocation, this.content$leaguesettings$teams$`16`$teamNickname)),
  
  rank = c(this.content$leaguesettings$teams$`1`$record$overallStanding,
           this.content$leaguesettings$teams$`2`$record$overallStanding,
           this.content$leaguesettings$teams$`3`$record$overallStanding,
           this.content$leaguesettings$teams$`4`$record$overallStanding,
           this.content$leaguesettings$teams$`5`$record$overallStanding,
           this.content$leaguesettings$teams$`6`$record$overallStanding,
           this.content$leaguesettings$teams$`7`$record$overallStanding,
           this.content$leaguesettings$teams$`8`$record$overallStanding,
           this.content$leaguesettings$teams$`9`$record$overallStanding,
           this.content$leaguesettings$teams$`10`$record$overallStanding,
           this.content$leaguesettings$teams$`11`$record$overallStanding,
           this.content$leaguesettings$teams$`12`$record$overallStanding,
           this.content$leaguesettings$teams$`13`$record$overallStanding,
           this.content$leaguesettings$teams$`14`$record$overallStanding,
           this.content$leaguesettings$teams$`15`$record$overallStanding,
           this.content$leaguesettings$teams$`16`$record$overallStanding)
  )

And to look at it:

as.tibble(standing)
## # A tibble: 16 x 2
##    Team                    rank
##    <chr>                  <int>
##  1 'R'm Chair Quarterback    10
##  2 Las Vegas Radians          8
##  3 Green Baye's Theorem      11
##  4 Epsilon Ballers           15
##  5 Team Guns N' Rosas         6
##  6 2-2-2 Mega-H-Ertz         13
##  7 a^2+b^2= c^hawks          14
##  8 Baby got Dak               2
##  9 Team HotHanksHitters       9
## 10 Team LaPorte               4
## 11 Viking Joe Lindquist       7
## 12 The Big Gronkowski         5
## 13 Berry Berry Good          16
## 14 Calgary Stampeders         1
## 15 Team Mauled Mentos         3
## 16 Team Grymes               12

Lets join the data with our previous dataset.

fantasy2 = fantasy %>%
  filter(Who == "For") %>%
  left_join(standing, by = "Team")
as.tibble(fantasy)
## # A tibble: 32 x 10
##    Team                 Who   AverageScore STDEV   max   min Plunger PlungerLuck dummy   ord
##    <ord>                <chr>        <dbl> <dbl> <dbl> <dbl> <chr>   <chr>       <dbl> <int>
##  1 Team Grymes          For           84.4  27.7  150.  43.1 5       ""              0     1
##  2 Green Baye's Theorem For           91.8  21.4  129.  64.0 1       ""              0     2
##  3 Epsilon Ballers      For           95.3  20.1  128.  56   1       ""              0     3
##  4 Berry Berry Good     For           95.9  22.7  137.  64.2 4       ""              0     4
##  5 Team HotHanksHitters For           99.0  20.5  143.  63.4 0       ""              0     5
##  6 Team LaPorte         For          100.   22.7  136.  61.3 2       ""              0     6
##  7 2-2-2 Mega-H-Ertz    For          102.   17.6  138.  73.2 0       ""              0     7
##  8 Baby got Dak         For          103.   25.1  148.  60.8 1       ""              0     8
##  9 a^2+b^2= c^hawks     For          105.   15.9  144.  78.3 0       ""              0     9
## 10 Las Vegas Radians    For          105.   20.2  141.  66.1 1       ""              0    10
## # ... with 22 more rows

Almost complete: Next lets add a column to keep track of each team’s ‘actual rank’ (according to points scored) as well as a measure of luck. This measure, though a simple calculation, is the the euclidian distance from how much (or how much less) a team is performing verses what they should expect given their current performance.

fantasy2 =
fantasy2 %>%
  mutate(actualrank = c(16:1)) %>%
  mutate(Luck = actualrank-rank)

Finally! Lets plot the data and see how everyone is doing.

ggplot(data = fantasy2, aes(x = actualrank, y = rank)) +
  geom_point() +
  geom_text(aes(label = paste(Team,":" ,Luck), color = Luck), vjust =1.5, size=3) +
  scale_colour_gradient(low = "red", high = "black") +
  geom_text(aes(x = 8, y = 8, label = "Line of Luck"), angle = 35, nudge_y = .6) +
  geom_abline(slope = 1, intercept = 0) +
  ylim(c(1,17)) + xlim(c(1,17)) +
  xlab("Points Scored Rank") + ylab("Actual Standings") + ggtitle("Are You Over-Performing or Under-Performing?") +
  labs(caption = "The Luck Index measures how much better (or worse) your team is doing than what you should expect.")

This is a way to sort the data to get another view of who is ranked up to their performance.

fantasyfor = fantasy %>%
  filter(Who == "For") %>%
  mutate(MaxPointsScored = max, MinPointsScored = min, PlungersPlayedAgainst = PlungerLuck) %>%
  select(Team, AverageScore, MaxPointsScored, MinPointsScored, Plunger)
fantasyagainst = fantasy %>%
  filter(Who == "Against")%>%
  mutate(MaxPointsAgainst = max, MinPointsAgainst = min, PlungersPlayedAgainst = PlungerLuck) %>%
  select(Team, MaxPointsAgainst, MinPointsAgainst, PlungersPlayedAgainst) 
FantasyChartstart = fantasyfor %>%
  left_join(fantasyagainst, by = "Team")

FantasyChart = FantasyChartstart %>%
  mutate(Scored = round(AverageScore),2) %>%
  mutate(Max = MaxPointsScored) %>%
  mutate(Min = MinPointsScored) %>%
  mutate(MaxFoe = MaxPointsAgainst) %>%
  mutate(MinFoe = MinPointsAgainst) %>%
  mutate(PlungerFoe = PlungersPlayedAgainst)%>%
  select(Team,Scored, Max,Min,Plunger,MaxFoe,MinFoe,PlungerFoe)

# options(DT.options = list(pageLength = 16))
datatable(FantasyChart, options = list(
  columnDefs = list(list(className = 'dt-center', targets = 8)),
  pageLength = 16
))

Lastly, lets look at who just eeks out wins and who snatches defeat from the jaws of victory.

start = fantasystart %>%
  mutate(delta = awayscore-homescore) %>%
  mutate(win = ifelse(delta>0, "Win", "Loss")) %>%
  group_by(Away) %>%
  mutate(margin = ifelse(win == "Win", min(delta[delta>0]),max(delta[delta<0]))) %>%
  group_by(Away, win) %>%
  summarise(Smallest = mean(margin)) %>%
  ungroup() %>%
  mutate(Away = ifelse(Away == "5 Guns N' Rosas", "Team Guns N' Rosas", Away)) %>%
  mutate(Away = ifelse(Away == "3 Baye's Theorem", "Green Baye's Theorem", Away)) %>%
  mutate(Away = ifelse(Away == "16 Grymes", "Team Grymes", Away)) 

a = start %>%
  filter(win == "Win") 
b = start %>%
  filter(win == "Loss") 

MarginofVictory = left_join(a,b, by = "Away") %>%
  mutate(SmallestWin = round(Smallest.x,2), SmallestLoss = round(Smallest.y,2), Team = Away) %>%
  select(Team, SmallestWin, SmallestLoss)

And the Chart…

datatable(MarginofVictory, options = list(
  columnDefs = list(list(className = 'dt-center', targets = 3)),
  pageLength = 16
))