This course is about Data Science and Data Humanism, and about blending the two. The reference authors are data scientist Hadley Wickham and information designer Giorgia Lupi. I will try to follow these teaching principles:
watch The joy of stats by Hans Rosling
watch The whole game by Hadley Wickham
watch Data Humanism by Giorgia Lupi
learn Entity-Relationship model
learn Relational model
learn Relational algebra
use Relax with RelaX
use yEd graph editor
make Dry run
use R
use RStudio
watch Learn R with Mike Marin
learn Inside R [html / Rmd]
watch A focus on data frames [Part 1 / Part 2]
learn Data frames or tibbles?
learn R Markdown
learn R Markdown formats
glance Markdown cheatsheet
glance R Markdown Cheatsheet
glance Git and Github
dig RStudio, Git and GitHub
make Dry run
glance Cheatsheet. Base R
glance Cheatsheet. RStudio
learn Little bunny Foo Foo
watch Data import in base R
learn Data import in base R. Read chapter 4 in R Cookbook
make Dry run
learn Data import with readr
glance Tidyverse
learn Tidy data with tidyr
make Dry run
glance Cheatsheet. Data import: readr, tibble, tidyr
learn Data transformation in base R. Also read chapters from 5.18 to 5.31 in R Cookbook
learn Data transformation with dplyr
learn Joins with dplyr
learn From the shell [shell, adult.data]
make Dry run
glance Cheatsheet. Data transformation: dplyr
read The Great Wave off Kanagawa
learn Data visualization in base R. Read chapter 10 in R Cookbook
watch Plotting with Base R. Part 1 / Part 2 / Part 3
learn Data visualization with ggplot
learn Exploratory Data Analysis with dplyr and ggplot
learn Perfection is in the details
learn Animated plots [Rmd / html]
learn Interactive plots
learn Shiny
glance HTML widgets
glance Dashboards
make Dry run
glance Generative art in R
glance Cheatsheet. Data visualization: ggplot2
watch Linear regression in base R
learn Model basics
learn Model building
learn Many models
glance The corrplot package
make Dry run
watch How we can find ourselves in data
glance Giorgia Lupi
read Data humanism, the revolution will be visualized
use Processing
learn A hasty tour inside Processing
learn Customized data visualization in Processing
use Arduino
watch Wired on Arduino
learn From Arduino to Processing and back
learn Visualizing real-time data
make Great gaps in the world of art auctions
make Your Lucky Numbers
make Pick 3 visualizations from La Lettura and dig into them
make Dry run

learn: I teach, you listen (and hopefully learn).
make: I give you an assignment, and you work on it during the class. We discuss the solutions during the next class.
use: You use a piece of software: download, install and run it for the first time. I give you a brief practical introduction to it.
watch: We watch a video together. By and large, the video acts as a teaser, introducing the next topic in an informal and attractive way.
glance: You take a brief, quick look at something, generally an informative website. I steer you towards the most important sections.
read: You read a story, typically at home. We discuss it together during the following class.
dig: You read a more in-depth treatment of the current topic, normally at home. We talk about it during one of the next classes.
listen: The class is given by an invited speaker, an expert in the field.

In a data story (or data challenge) you tell a story with data. Find a dataset, pose questions, and try to answer them using an analysis notebook in R. Follow your curiosity and be creative.
The mid-course assignment covers the full pipeline of Data Science. You're asked to investigate the Italian Soccer League.
The exam consists of a written part: a list of questions, either open questions or exercises, covering the entire syllabus. During the written exam, students may use only sheets covering the syntax of languages (such as cheatsheets). The outcome of the written part is a mark from 0 to 30.
The student can also submit a project, which is optional and earns a bonus of 0 to 3 points, added to the mark of the written part. The project consists of one significant data challenge chosen by the student. It is done individually and must use methods, languages and software tools seen during the course. The student will discuss the project on the day of the written exam, in at most 15 minutes, using a presentation on a personal laptop (bring adapters). The presentation must focus on the dataset used, the data questions, the analyses performed and the results obtained. Both the project and the presentation skills will be evaluated. Each student can discuss the project only once. If the student fails the written part, the project bonus remains valid.