Dry run \(\alpha\)

For each of the following scenarios:

  1. Write a conceptual (ER) model for the described universe
  2. Translate the ER model into a relational model including integrity constraints

University

A student is identified by an id. A course is identified by its name and the name of the faculty it belongs. An exam associates students to courses and records date and mark of the exam.

student(id)
course(name, faculty)
exam(student, course, faculty, date, mark)
exam(student) –> student(id)
exam(course, faculty) –> course(name, faculty)

Cars

A car is identified by a license plate. Each car has a unique current owner who bought the car in a given date. The owner is a person described by tax code and can own many cars. You want to keep track of the past owners of a car and the time interval (starting and ending dates) in which they owned the car (we assume a penson cannot buy the same car twice).

person(code)
car(plate, owner, date)
owned(person, car, starting_date, ending_date)
car(owner) –> person(code)
owned(person) –> person(code)
owned(car) –> car(plate)

Countries

A country is identified by a name. A country has a premier, which has a name and a date of election. A country possibly borders with other countries.

country(name, premier, election_date)
premier(name)
neighbour(country1, country2)
country(premier) –> premier(name)
neighbour(country1) –> country(name)
neighbour(country2) –> country(name)

Dry run \(\beta\)

Consider again the university relational schema above:

  1. Write consistent CSV files for each table
  2. Open RStudio and read the created CSV files into data frames with function read.csv()
  3. View the loaded data frames with function View()

Dry run \(\gamma\)

The nycflights13 dataset contains information about all flights that departed from New York City (e.g. EWR, JFK and LGA) in 2013 and related metadata (planes, airports, airlines and weather conditions). It includes the following entities:

The entities are associated with the following relationships:

Build an ER schema for the NYC flights dataset. Then translate the conceptual schema into a relational one. For simplicity, include only attributes relevant for keys and foreign keys.

flight(id, month, day, hour, origin, destination, tailnum, carrier, time_hour)
plane(tailnum)
airline(carrier)
airport(code)
weather(time_hour, origin)
flight(tailnum) –> plane(tailnum)
flight(carrier) –> airline(carrier)
flight(origin) –> airport(code)
flight(destination) –> airport(code)
flight(time_hour, origin) –> weather(time_hour, origin)
weather(origin) –> airport(code)

Dry run \(\delta\)

With respect to the nycflights13 dataset, write in relational algebra the following queries. First install and load the dataset and view column names as follows:

# install package(only once)
install.packages("nycflights13")

# load package
library(nycflights13)

# view column names of a table
names(flights)

# view table
View(flights)
  1. The tail number of flights manufactured by EMBRAER
  2. The flight number, the plane tail number, and plane manufacturer of flights manufactured by EMBRAER (mind that flights and planes tables share the year attribute with different semantics)
  3. Flight paths of length 2 (that is X, Y, and Z such that X –> Y –> Z)
  4. Flight paths of length 3 (that is X, Y, W, and Z such that X –> Y –> W –> Z)
  5. Can you find flight paths of arbitrary length in relational algebra? Why?

The tail number of flights manufactured by EMBRAER

\[ \pi(\sigma(planes, manufacturer = {\rm EMBRAER}), \{tailnum\}) \]

The flight number and the plane tail number of flights manufactured by EMBRAER (mind that flights and planes tables share the year attribute with different semantics)

\[ \begin{array}{l} \pi(\sigma(flights \Join \rho(planes, yr = year), manufacturer = {\rm EMBRAER}), \{flight, tailnum\}) \end{array} \]

\[ \begin{array}{l} \pi(flights \Join \sigma(\rho(planes, yr = year), manufacturer = {\rm EMBRAER}), \{flight, tailnum\}) \end{array} \]

Flight paths of length 2 (that is X, Y, and Z such that X –> Y –> Z)

\[ \begin{array}{l} \rho(\pi(flights, \{origin, dest\}), \{X = origin, Y = dest\}) \Join & \\ \rho(\pi(flights, \{origin, dest\}), \{Y = origin, Z = dest\}) \end{array} \]

Flight paths of length 3 (that is X, Y, W, and Z such that X –> Y –> W –> Z)

\[ \begin{array}{l} \rho(\pi(flights, \{origin, dest\}), \{X = origin, Y = dest\}) \Join & \\ \rho(\pi(flights, \{origin, dest\}), \{Y = origin, W = dest\}) \Join & \\ \rho(\pi(flights, \{origin, dest\}), \{W = origin, Z = dest\}) \end{array} \]

Can you find flight paths of arbitrary length in relational algebra? Why?

No. We would need a infinite number of join operations. But a query is a finite string.

Dry run \(\epsilon\)

Use relational algebra calculator Relax to create the following database. Then write and run a relational algebra query that retrieves the name of all employees managed by John.

group:Business

Employee = {
    Name:string, EmpId:number, DeptName:string
    'Harry'    , 3415        , 'Finance'      
    'Sally'    , 2241        , 'Sales'        
    'George'   , 3401        , 'Finance'      
    'Harriet'  , 2202        , 'Sales'        
    'Tim'      , 1123        , 'Executive'    
}

Department = {
    DeptName:string, Manager:string
    'Sales'        , 'Harriet'     
    'Executive'    , 'Charles'     
    'Finance'      , 'John'     
}

Try this query string:

π Name (σ Manager = ‘John’ ((Employee) ⨝ (Department)))