> "Hello World!"
[1] "Hello World!"
> 1 + 1
[1] 2
> (1 + (2 * 4) - 3) / 2
[1] 3
> # integer division
> 31 %/% 3
[1] 10
> # modulus
> 31 %% 3
[1] 1
> # exponents
> 2^10
[1] 1024
> # comparison
> 1 == 1
[1] TRUE
> 1 != 1
[1] FALSE
> 1 < 1
[1] FALSE
> 1 <= 1
[1] TRUE
> # logical comparison
> FALSE & TRUE
[1] FALSE
> FALSE && TRUE
[1] FALSE
> TRUE | FALSE
[1] TRUE
> TRUE || FALSE
[1] TRUE
> !TRUE
[1] FALSE
> xor(TRUE, FALSE)
[1] TRUE
> xor(TRUE, TRUE)
[1] FALSE
Conjunction and disjunction each have a shorter form (& and |) and a longer form (&& and ||). The shorter form works element by element on vectors, in much the same way as the arithmetic operators. The longer form expects single logical values, evaluates left to right, and stops as soon as the result is determined.
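The difference between the two forms is easiest to see on a small example. The following is a minimal sketch; the variable names are illustrative only, and the calls to stop are there solely to show that the right-hand operand is never evaluated.

```r
# Elementwise form: the two vectors are compared position by position.
elementwise <- c(TRUE, FALSE, TRUE) & c(TRUE, TRUE, FALSE)
print(elementwise)

# Short-circuit form: FALSE already determines the result of &&,
# so the right-hand operand (which would raise an error) is skipped.
short_and <- FALSE && stop("this is never reached")
print(short_and)

# Likewise, TRUE already determines the result of ||.
short_or <- TRUE || stop("this is never reached")
print(short_or)
```

The longer form is the usual choice inside if conditions, where a single TRUE or FALSE is expected.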
There are a few special values. The value NA (not available) represents a missing value; it should not be confused with NULL, which is the null object. The value Inf stands for positive infinity:
> 2^1024
[1] Inf
> 1 / 0
[1] Inf
The value NaN (not a number) is the result of a computation that makes no sense:
> 0 / 0
[1] NaN
> Inf - Inf
[1] NaN
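The special values can be detected with dedicated test functions from base R. The sketch below uses an illustrative vector with one missing entry; the variable names are assumptions made for the example.

```r
# A numeric vector with one missing entry (illustrative data).
x <- c(1, NA, 3)

missing_flags <- is.na(x)          # TRUE only where the value is missing
total <- sum(x)                    # NA: missing values propagate
total_clean <- sum(x, na.rm = TRUE)  # missing values are dropped first

# NULL is an empty object, not a missing value.
null_len <- length(NULL)
null_flag <- is.null(NULL)

# NaN and Inf have their own tests.
nan_flag <- is.nan(0 / 0)
finite_flag <- is.finite(1 / 0)

print(missing_flags)
print(total)
print(total_clean)
```

Note that comparing with == does not work for missing values, since NA == NA is itself NA; use is.na instead.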
Of course, you may use variables to store values. There are three ways to assign a value to a variable, all equivalent when typed at the prompt (we will use the first one):
> x = 1
> x <- 1
> 1 -> x
To print the value of a variable, just type it:
> x
[1] 1
In R, any number is in fact a vector of length 1. The [1] means that the first item displayed in that row has index 1. Vector indexes start at 1 (not 0). Construct longer vectors with the c (combine) function:
> c(0, 1, 1, 2, 3, 5, 8)
[1] 0 1 1 2 3 5 8
or with the : operator, which generates a sequence of consecutive integers:
> 0:99
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
[26] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
[51] 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74
[76] 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
Operations on two vectors are performed element by element:
> c(1, 2, 3, 4) + c(10, 20, 30, 40)
[1] 11 22 33 44
> c(1, 2, 3, 4) * c(10, 20, 30, 40)
[1]  10  40  90 160
If the two vectors have different lengths, the shorter one is recycled (repeated) until it matches the length of the longer one:
> c(1, 2, 3, 4) + 10
[1] 11 12 13 14
> c(1, 2, 3, 4) + c(10, 20)
[1] 11 22 13 24
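Recycling works silently when the longer length is a multiple of the shorter one; otherwise R still computes a result but issues a warning. The following sketch illustrates both cases; the variable names are made up for the example.

```r
# Length 6 is a multiple of length 2: the short vector is used three times.
clean <- 1:6 + c(10, 20)
print(clean)

# Length 5 is not a multiple of length 2: R warns about the partial reuse.
# tryCatch turns the warning into its message so that it can be inspected.
msg <- tryCatch(1:5 + c(10, 20),
                warning = function(w) conditionMessage(w))
print(msg)

# The result itself is still computed; here the warning is suppressed.
partial <- suppressWarnings(1:5 + c(10, 20))
print(partial)
```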
Character vectors are vectors of strings:
> c("This", "class", "is", "really", "terrific!")
[1] "This"      "class"     "is"        "really"    "terrific!"
You may refer to members of a vector in several ways:
a = 1:10 * 2
> a
 [1]  2  4  6  8 10 12 14 16 18 20
> a[5]
[1] 10
> a[c(1, 5, 10)]
[1]  2 10 20
> a[a > 10]
[1] 12 14 16 18 20
Notice that in the last example we indexed with a vector of Booleans:
> a > 10
 [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
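A few more indexing idioms build on the same idea. This is a minimal sketch using the vector a from above; the result names are illustrative.

```r
a <- 1:10 * 2

# which() converts a Boolean vector into the positions that are TRUE.
positions <- which(a > 10)
print(positions)

# Negative indexes exclude elements instead of selecting them.
without_first <- a[-1]
without_first_three <- a[-(1:3)]
print(without_first_three)

# Conditions can be combined with the elementwise operators & and |.
middle <- a[a > 10 & a < 18]
print(middle)
```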
An array is a multi-dimensional vector:
a = array(1:12, dim=c(3, 4))
# or, equivalently:
a = 1:12
dim(a) = c(3, 4)
> a
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> a[2,3]
[1] 8
> a[c(1,3), 2:4]
     [,1] [,2] [,3]
[1,]    4    7   10
[2,]    6    9   12
> a[1,]
[1]  1  4  7 10
> a[,1]
[1] 1 2 3
a = array(1:18, dim=c(3, 3, 2))
> a[, , 1]
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> a[, , 2]
     [,1] [,2] [,3]
[1,]   10   13   16
[2,]   11   14   17
[3,]   12   15   18
A matrix is a 2-dimensional array:
m = matrix(data = 1:12, nrow=3, ncol=4)
# or, equivalently
m = matrix(data = 1:12, nrow=3)
> m
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
m = matrix(data = 1:12, nrow=3, byrow=TRUE)
> m
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
A list is a data type that allows mixing data of different types:
l = list(thing="hat", size=8.25)
> l
$thing
[1] "hat"

$size
[1] 8.25

> l$thing
[1] "hat"
> l[["thing"]]
[1] "hat"
> l[[1]]
[1] "hat"
Mind that l[1] is a sub-list containing only the first component of list l:
> l[1]
$thing
[1] "hat"

> l[1]$thing
[1] "hat"
> l[1][[1]]
[1] "hat"
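The distinction between single and double brackets can be checked directly. The sketch below reuses the list from above; the component colour is a hypothetical addition made only for illustration.

```r
l <- list(thing = "hat", size = 8.25)

# Single brackets return a list, double brackets return the component itself.
sub_list <- l[1]
element <- l[[1]]
print(class(sub_list))
print(class(element))

# Single brackets can select several components at once.
both <- l[c("thing", "size")]

# Assigning to a new name adds a component (hypothetical field).
l$colour <- "red"
print(names(l))
```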
A data frame is a list of named vectors of the same length. A data frame is like a database table.
team = c("Inter", "Milan", "Roma", "Palermo")
score = c(59, 58, 53, 46)
win = c(17, 17, 15, 13)
tie = c(8, 7, 8, 7)
lost = c(3, 4, 5, 8)
league = data.frame(team, score, win, tie, lost)
> league
     team score win tie lost
1   Inter    59  17   8    3
2   Milan    58  17   7    4
3    Roma    53  15   8    5
4 Palermo    46  13   7    8
> league[1,]
   team score win tie lost
1 Inter    59  17   8    3
> league[,2]
[1] 59 58 53 46
> league[,"score"]
[1] 59 58 53 46
> league[1:3, c("team", "score")]
   team score
1 Inter    59
2 Milan    58
3  Roma    53
> league$score
[1] 59 58 53 46
> league$score == max(league$score)
[1]  TRUE FALSE FALSE FALSE
> league[league$score == max(league$score), ]
   team score win tie lost
1 Inter    59  17   8    3
> league$team
[1] Inter   Milan   Roma    Palermo
Levels: Inter Milan Palermo Roma
> as.vector(league$team)
[1] "Inter"   "Milan"   "Roma"    "Palermo"
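Notice that league$team is a factor. (Before R 4.0.0, data.frame converted character vectors to factors by default; from R 4.0.0 onward the column stays a character vector unless you pass stringsAsFactors = TRUE.) A factor is a vector whose items come from a small set of repeated values, called levels. Factors are implemented efficiently by mapping levels to integers, and they are typically used to store categorical data. For instance: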
poll.results = factor(c("Berlusconi", "Berlusconi", "Bersani", "Casini", "Bersani"))
> poll.results
[1] Berlusconi Berlusconi Bersani    Casini     Bersani
Levels: Berlusconi Bersani Casini
> levels(poll.results)
[1] "Berlusconi" "Bersani"    "Casini"
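The mapping from levels to integers can be inspected directly, and the table function counts how often each level occurs. A minimal sketch on the same data; the names codes and counts are illustrative.

```r
poll.results <- factor(c("Berlusconi", "Berlusconi", "Bersani", "Casini", "Bersani"))

# Each item is stored as the integer position of its level.
codes <- as.integer(poll.results)
print(codes)

# table() tallies the items by level, a common use of categorical data.
counts <- table(poll.results)
print(counts)
```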
R is in fact a programming language with both object-oriented and functional features, and it offers the usual control structures. Conditional statements take the form:
x = 49
> if (x %% 7 == 0) x else -x
[1] 49
Looping constructs include while and for:
x = 99
i = 2
while (i < x) {
  if (x %% i == 0) print(i)
  i = i + 1
}
[1] 3
[1] 9
[1] 11
[1] 33
x = 99
for (i in 2:(x-1)) {
  if (x %% i == 0) print(i)
}
[1] 3
[1] 9
[1] 11
[1] 33
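Because arithmetic and comparison operators work element by element, the same divisors can also be found without an explicit loop. This is an alternative sketch, not a replacement for the loops above; the names candidates and divisors are illustrative.

```r
x <- 99

# All candidate divisors between 2 and x - 1.
candidates <- 2:(x - 1)

# x %% candidates computes every remainder at once; the Boolean
# vector then selects the candidates that divide x exactly.
divisors <- candidates[x %% candidates == 0]
print(divisors)
```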
You may use built-in functions:
> log(128, 2)
[1] 7
> args(log)
function (x, base = exp(1))
NULL
> log(x=128, base=2)
[1] 7
> log(base=2, x=128)
[1] 7
> log(exp(1)^2)
[1] 2
Or define your own functions:
f = function(x=0, y=0) {sqrt(x^2 + y^2)}
> f
function(x=0, y=0) {sqrt(x^2 + y^2)}
> args(f)
function (x = 0, y = 0)
NULL
> f(1, 1)
[1] 1.414214
> f(1)
[1] 1
> f()
[1] 0
Functions may be recursive:
factorial = function(x) {
  if (x == 0) 1 else x * factorial(x-1)
}
> factorial(5)
[1] 120
You may use functions as arguments to other functions:
g = function(f, n) {
  sum = 0
  for (i in 0:n) sum = sum + f(i)
  return(sum)
}
> g(factorial, 5)
[1] 154
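Base R already provides functions that take other functions as arguments, such as sapply, which applies a function to each element of a vector. The sketch below rewrites g in that style; the name g2 is hypothetical, and factorial here refers to the built-in function from package base.

```r
# Apply f to each of 0, 1, ..., n and add up the results.
g2 <- function(f, n) sum(sapply(0:n, f))

# Sum of the factorials 0! + 1! + ... + 5!.
factorial_sum <- g2(factorial, 5)
print(factorial_sum)

# Anonymous functions can be passed as well: sum of squares 0^2 + ... + 3^2.
square_sum <- g2(function(i) i^2, 3)
print(square_sum)
```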
You may define your own binary operators using functions:
'%my%' = function(x, y) {2 * x + 2 * y}
> 1 %my% 5
[1] 12
You can store a script of commands in a file (possibly a remote one, given by its URL) and evaluate the script using the source command:
source("script.R")
R comes with a number of packages, some of which are loaded by default (like package base). To see all available packages, run:
(.packages(all.available=TRUE))
To see loaded packages:
(.packages())
To load an available package:
library(stats4)
Packages that are not part of the R installation can be found in repositories like CRAN and Bioconductor. They can be downloaded and installed from the R console (remember to load them after installation if you want to use them):
install.packages("igraph")
or from the command line:
R CMD INSTALL igraph_0.5.3.tar.gz
To remove a package:
remove.packages("igraph")
Getting help:
?log
?'+'
??"regression"
Clear the screen with the key combination CTRL+l.
Quit:
q()
When you save the workspace, R writes it to the files .RData (environment) and .Rhistory (command history). The global environment is the default working space. Use the function exists to check whether a name exists in the environment, the function objects to print all names, and the function remove to remove an object.
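The three functions can be tried on a throwaway variable. This is a minimal sketch; demo_value is a hypothetical name chosen for the example.

```r
# Create an object in the current environment (hypothetical name).
demo_value <- 42

# exists() takes the name as a string.
before <- exists("demo_value")

# objects() returns the names defined in the current environment.
listed <- "demo_value" %in% objects()

# remove() deletes the object; afterwards the name is no longer found.
remove(demo_value)
after <- exists("demo_value")

print(c(before, listed, after))
```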