The goal of this exercise is to read data on wins and losses for all World Series games.

  1. Read the R documentation for the function scan(). In particular, pay attention to the arguments what, skip, and nlines.
  2. Use scan() to read data on wins and losses for all World Series games.
  3. The function scan() reads each line from left to right, but the dataset is laid out in columns, so the years come out in a strange order. Solve this problem (Hint: use the order() function).
  4. Make a data frame with year and pattern columns.

# Read the wseries dataset:
# - Skip the first 35 lines
# - Then read 23 lines of data
# - The data come in pairs: a year (integer) and a pattern (character)
world_series <- scan("http://lib.stat.cmu.edu/datasets/wseries",
                     skip = 35,
                     nlines = 23,
                     what = list(year = integer(0), pattern = character(0)))


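# Sort the years: order() returns the permutation that puts them in increasing order
perm <- order(world_series$year)
# Build a data frame with both components rearranged into chronological order
world_series <- data.frame(year = world_series$year[perm],
                           pattern = world_series$pattern[perm],
                           stringsAsFactors = FALSE)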
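
# A minimal sketch of why step 3 needs order(), using made-up toy input
# rather than the real file. The names toy_text, toy, and toy_perm and the
# pattern strings are hypothetical, chosen only for illustration.
# The toy text has two lines, each holding two (year, pattern) pairs, so
# reading down a column gives consecutive years but reading across does not.
toy_text <- "1903 AAAA 1905 CCCC\n1904 BBBB 1906 DDDD"
toy <- scan(text = toy_text,
            what = list(year = integer(0), pattern = character(0)),
            quiet = TRUE)
toy$year                     # 1903 1905 1904 1906 (row-wise reading order)
toy_perm <- order(toy$year)  # indices that sort the years: 1 3 2 4
toy$year[toy_perm]           # 1903 1904 1905 1906
toy$pattern[toy_perm]        # "AAAA" "BBBB" "CCCC" "DDDD"
# Applying the same permutation to both components keeps each pattern
# attached to its own year.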