courseTitle | chapterTitle | description | framework | mode |
---|---|---|---|---|
Introduction to R (v2) |
Matrices |
In this chapter, you'll learn how to work with matrices in R. By the end of the chapter you'll be able to create matrices, and understand how you can do basic computations with them. You'll analyze the box office numbers of Star Wars to illustrate the use of matrices in R. May to force be with you! |
datamind |
selfcontained |
In R, a matrix is a collection of elements of the same data type (numeric, character, or logical), that is arranged into a fixed number of rows and columns. Since you are only working with rows and columns, a matrix is called two-dimensional.
In R, you can construct a matrix with the matrix function, for example: matrix(1:9, byrow=TRUE, nrow=3)
.
The first argument, is the collection of elements that R will arrange into the rows and columns of the matrix. Here, we use 1:9
which constructs the vector c(1,2,3,4,5,6,7,8,9)
.
The argument byrow
indicates that the matrix is filled by the rows. If we want the vector to be filled by the columns, we just place bycol=TRUE
or byrow=FALSE
.
The third argument nrow
indicates that the matrix should have three rows.
*** =instructions
- Construct a matrix with 3 rows containing the numbers 1 upto 9 in the editor, the click Submit Answer button and look at the output in the console. Hint, use:
matrix(1:9, byrow=TRUE, nrow=3)
.
*** =hint Read the assigment carefully, the answer is already given ;-).
*** =sample_code
# Construction of a matrix with 3 rows containing the numbers 1 upto 9
*** =solution
matrix(1:9, byrow = TRUE, nrow = 3)
*** =sct
DM.result <- code_test("matrix(1:9, byrow=TRUE, nrow=3)")
*** =pre_exercise_code
It is time to get your hands dirty. In the following exercises you will analyze the box office numbers of the Star Wars franchisee. May the force be with you!
In the editor, three vectors are defined, each representing the box office numbers from the first three Star Wars movies. The first element of each vector indicates the US box office revenue, the second element of each vector refers to the Non-US box office (source: Wikipedia).
*** =instructions
- Construct a matrix with one row for each movie (thus with 3 rows), a column for the US box office revenue, and a second column for the non-US box office revenue. Name the matrix
star.wars.matrix
.
*** =hint
Remember that you can construct a matrix containing the numbers 1 upto 9 with: matrix(1:9, byrow=TRUE, nrow=3)
. In this case, you don't want the number 1 upto 9, but the elements of the 3 star wars movie: this means the input vector is thus: c(new.hope,empire.strikes,return.jedi)
.
*** =sample_code
# Box office Star Wars: In Millions (!) First element: US, Second
# element: Non-US
new.hope <- c(460.998007, 314.4)
empire.strikes <- c(290.475067, 247.9)
return.jedi <- c(309.306177, 165.8)
# Add your code below to construct the matrix
star.wars.matrix <-
# Show me the
star.wars.matrix
*** =solution
# Box office Star Wars: In Millions (!) First element: US, Second
# element: Non-US
new.hope <- c(460.998007, 314.4)
empire.strikes <- c(290.475067, 247.9)
return.jedi <- c(309.306177, 165.8)
# Add your code below to construct the matrix
star.wars.matrix <- matrix(c(new.hope, empire.strikes, return.jedi), nrow = 3,
byrow = TRUE)
# Show me the
star.wars.matrix
*** =sct
names <- "star.wars.matrix"
values <- "matrix( c(new.hope,empire.strikes,return.jedi),\nnrow=3, byrow=TRUE)"
DM.result <- closed_test(names, values)
*** =pre_exercise_code
To help you remember what's stored in star.wars.matrix, you'd like to add the names of the movies for the rows. Not only does this helps you to read the data, it is also useful to select certain elements from the matrix later.
Similar to vectors, you can add names for the rows and the columns of a matrix with:
rownames(my.matrix) <- row.names.vectors
colnames(my.matrix) <- col.names.vectors
*** =instructions
- Give the columns of
star.wars.matrix
the names"US"
and"non-US"
. - Give the rows of the matrix
star.wars.matrix
the names of the three movies. In case you are not a fan ;-), the movie names are: "A new hope", "The empire strikes back" and "Return of the Jedi".
*** =hint
Don't forget that R is case sensitive. The vector for the column names is thus: c("US","non-US")
and for the row names: c("A new hope","The empire strikes back","Return of the Jedi")
.
*** =sample_code
# Box office Star Wars: In Millions (!) First element: US, Second
# element: Non-US
new.hope <- c(460.998007, 314.4)
empire.strikes <- c(290.475067, 247.9)
return.jedi <- c(309.306177, 165.8)
# Construct matrix:
star.wars.matrix <- matrix(c(new.hope, empire.strikes, return.jedi), nrow = 3,
byrow = TRUE)
# Add your code here such that rows and columns of star.wars have a name!
# Print the matrix to the console:
star.wars.matrix
*** =solution
# Box office Star Wars: In Millions (!) First element: US, Second
# element: Non-US
new.hope <- c(460.998007, 314.4)
empire.strikes <- c(290.475067, 247.9)
return.jedi <- c(309.306177, 165.8)
# Construct matrix:
star.wars.matrix <- matrix(c(new.hope, empire.strikes, return.jedi), nrow = 3,
byrow = TRUE)
colnames(star.wars.matrix) <- c("US", "non-US")
rownames(star.wars.matrix) <- c("A new hope", "The empire strikes back", "Return of the Jedi")
# Print the matrix to the console:
star.wars.matrix
*** =sct
names <- c("star.wars.matrix", "colnames(star.wars.matrix)", "rownames(star.wars.matrix)")
values <- list("star.wars.matrix", "c(\"US\",\"non-US\")", "c(\"A new hope\", \"The empire strikes back\", \"Return of the Jedi\")")
DM.result <- closed_test(names, values, check.existence = c(T, F, F))
*** =pre_exercise_code
The single most important thing for a movie to become an instant legend in Tinseltown, is its worldwide box office figures.
To calculate the total box office revenue for the three Star Wars movies, you have to take the sum of the US revenue column and the non-US revenue column.
In R, the function rowSums()
conveniently calculates the totals for each row of a matrix. This function creates a new vector. sum.of.rows.vector<- rowSums(my.matrix)
.
*** =instructions
- Calculate the worldwide box office figures for the three movies and put these in the vector named
worldwide.vector
.
*** =hint
The rowSums()
function will calculate the total box office for each row of the star.wars.matrix
, if you supply star.wars.matrix
as an argument to that function by putting it between the brackets.
*** =sample_code
# Box office Star Wars: In Millions (!)
# Construct matrix:
box.office.all <- c(461, 314.4,290.5, 247.9,309.3,165.8)
movie.names <- c("A new hope","The empire strikes back","Return of the Jedi")
col.titles <- c("US","non-US")
star.wars.matrix <- matrix(box.office.all, nrow=3, byrow=TRUE,dimnames=list(movie.names,col.titles))
worldwide.vector <- #Your code here!
# Show me the
worldwide.vector
*** =solution
# Box office Star Wars: In Millions (!) Construct matrix:
box.office.all <- c(461, 314.4, 290.5, 247.9, 309.3, 165.8)
movie.names <- c("A new hope", "The empire strikes back", "Return of the Jedi")
col.titles <- c("US", "non-US")
star.wars.matrix <- matrix(box.office.all, nrow = 3, byrow = TRUE, dimnames = list(movie.names,
col.titles))
worldwide.vector <- rowSums(star.wars.matrix)
# Show me the
worldwide.vector
*** =sct
names <- c("star.wars.matrix", "worldwide.vector")
values <- c("star.wars.matrix", "rowSums(star.wars.matrix)")
DM.result <- closed_test(names, values)
*** =pre_exercise_code
In the previous exercise, you calculated the vector that contained the worldwide boxoffice receipt for each of the three star wars movies. However, this vector is not yet part of star.wars.matrix
...
When you want to add a column or multiple columns to a matrix. A good way to do this is cbind()
, which merges matrices and/or vectors together by column. For example: new.combined.matrix <- cbind( matrix1, matrix2, vector1, ... )
.
*** =instructions
- Add
worldwide.vector
as a new column to thestar.wars matrix
and assign toall.wars.matrix
. Use thecbind()
function.
*** =hint
Bind the worldwide.vector
to the star.wars.matrix
with the cbind()
function, with cbind( the.correct.matrix, the.correct.vector)
and assign to all.star.wars.matrix
.
*** =sample_code
# Box office Star Wars: In Millions (!) Construct matrix:
box.office.all <- c(461, 314.4, 290.5, 247.9, 309.3, 165.8)
movie.names <- c("A new hope", "The empire strikes back", "Return of the Jedi")
col.titles <- c("US", "non-US")
star.wars.matrix <- matrix(box.office.all, nrow = 3, byrow = TRUE, dimnames = list(movie.names,
col.titles))
worldwide.vector <- rowSums(star.wars.matrix)
# Bind the new variable total.per.movie as a column to star.wars
all.wars.matrix <-
# Show me the
all.wars.matrix
*** =solution
# Box office Star Wars: In Millions (!) Construct matrix:
box.office.all <- c(461, 314.4, 290.5, 247.9, 309.3, 165.8)
movie.names <- c("A new hope", "The empire strikes back", "Return of the Jedi")
col.titles <- c("US", "non-US")
star.wars.matrix <- matrix(box.office.all, nrow = 3, byrow = TRUE, dimnames = list(movie.names,
col.titles))
# Print the matrix to the console:
worldwide.vector <- rowSums(star.wars.matrix)
worldwide.vector # Print worldwide revenue per movie
# Bind the new variable total.per.movie as a column to star.wars
all.wars.matrix <- cbind(star.wars.matrix, worldwide.vector)
# Show me the:
all.wars.matrix
*** =sct
names <- c("star.wars.matrix", "worldwide.vector", "all.wars.matrix")
values <- c("star.wars.matrix", "rowSums(star.wars.matrix)", "cbind(star.wars.matrix,worldwide.vector)")
DM.result <- closed_test(names, values)
*** =pre_exercise_code
Just like every action has a reaction, every cbind()
has a rbind()
. (Ok we admit, we are pretty bad with metaphors)
Your R workspace now contains two matrices, the star.wars.matrix
we just used (the first trilogy) but also the star.wars.matrix2
for the second trilogy. Type the name of the matrices in the console and press enter, in case you want to have a closer look.
*** =instructions
- Assign to
all.wars.matrix
a new matrix withstar.wars.matrix
in the first three rows andstar.wars.matrix2
in the next three rows.
*** =hint
Bind the two matrices together in the following way: rbind( matrix1, matrix2 )
and assign to all.wars.matrix
.
*** =sample_code
# Box office Star Wars: In Millions (!)
star.wars.matrix # Matrix containing first trilogy box office
star.wars.matrix2 # Matrix containing second trilogy box office
# Combine the both Star Wars trilogies in one matrix
all.wars.matrix <-
# Show me the
all.wars.matrix
*** =solution
# Box office Star Wars: In Millions (!)
star.wars.matrix # Matrix containing first trilogy box office
star.wars.matrix2 # Matrix containing second trilogy box office
# Combine the both Star Wars trilogies in one matrix
all.wars.matrix <- rbind(star.wars.matrix, star.wars.matrix2)
# Show me the
all.wars.matrix
*** =sct
names <- c("star.wars.matrix", "star.wars.matrix2", "all.wars.matrix")
values <- c("star.wars.matrix", "star.wars.matrix2", "rbind(star.wars.matrix,star.wars.matrix2)")
DM.result <- closed_test(names, values)
*** =pre_exercise_code
# Construct matrix:
box.office.all <- c(461, 314.4, 290.5, 247.9, 309.3, 165.8)
movie.names <- c("A new hope", "The empire strikes back", "Return of the Jedi")
col.titles <- c("US", "non-US")
star.wars.matrix <- matrix(box.office.all, nrow = 3, byrow = TRUE, dimnames = list(movie.names,
col.titles))
# Construct matrix2:
box.office.all2 <- c(474.5, 552.5, 310.7, 338.7, 380.3, 468.5)
movie.names2 <- c("The Phantom Menace", "Attack of the Clones", "Revenge of the Sith")
star.wars.matrix2 <- matrix(box.office.all2, nrow = 3, byrow = TRUE, dimnames = list(movie.names2,
col.titles))
Just like every cbind()
has an rbind()
, every colSums()
has a rowSums()
. Your R workspace contains the all.wars.matrix
you constructed in the previous exercise (Type all.wars.matrix
in the console if you don't recall what it contains). Let's now calculate the total box office revenue for the entire saga.
*** =instructions
- Calculate the total revenue for the US and the non-US region and
assign total.revenue.vector
. You can use thecolSums()
function.
*** =hint
You should use the colSums()
function with star.wars.matrix
as argument to find the total box office per region.
*** =sample_code
# Print box office Star Wars: In Millions (!) for 2 trilogies
all.wars.matrix
total.revenue.vector
*** =solution
# Print box office Star Wars: In Millions (!) for 2 trilogies
all.wars.matrix
total.revenue.vector <- colSums(all.wars.matrix)
*** =sct
names <- c("all.wars.matrix", "total.revenue.vector")
values <- c("all.wars.matrix", "colSums( all.wars.matrix )")
DM.result <- closed_test(names, values)
*** =pre_exercise_code
# Construct matrix:
box.office.all <- c(461, 314.4, 290.5, 247.9, 309.3, 165.8)
movie.names <- c("A new hope", "The empire strikes back", "Return of the Jedi")
col.titles <- c("US", "non-US")
star.wars.matrix <- matrix(box.office.all, nrow = 3, byrow = TRUE, dimnames = list(movie.names,
col.titles))
# Construct matrix2:
box.office.all2 <- c(474.5, 552.5, 310.7, 338.7, 380.3, 468.5)
movie.names2 <- c("The Phantom Menace", "Attack of the Clones", "Revenge of the Sith")
star.wars.matrix2 <- matrix(box.office.all2, nrow = 3, byrow = TRUE, dimnames = list(movie.names2,
col.titles))
# Box office Star Wars: In Millions (!)
star.wars.matrix # Matrix containing first trilogy box office
star.wars.matrix2 # Matrix containing second trilogy box office
# Combine the both Star Wars trilogies in one matrix
all.wars.matrix <- rbind(star.wars.matrix, star.wars.matrix2)
Similar to vectors, use the square brackets [ ]
to select one or multiple elements from a matrix. Whereas vectors have 1 dimension, matrices have 2 dimensions, therefore use a comma to separate what to select from the rows and what from the columns. For example:
my.matrix[1,2]
selects from the first row the second elementmy.matrix[1:3,2:4]
selects rows 1,2,3 and columns 2,3,4.
If you want to select all elements of a row or column, no number is needed before or after the comma:
my.matrix[,1]
selects all elements of the first columnmy.matrix[1,]
selects all elements of the first row.
Back to Star Wars with this newly acquired knowledge:
*** =instructions
- Calculate the average Non-US revenue per movie. Assign this to the
non.us.all variable
(In other words: take the average of all elements of the second column). - Same question, but now only for the first two Star Wars movies.
Hint: Use the function mean()
to compute the average.
*** =hint
You can use the function mean()
to calculate the average of the inputs to the function. Remember that you can select all rows of a matrix for a specific column x, by typing my.matrix[,x]
. Non-US is the second column.
*** =sample_code
# Box office Star Wars: In Millions (!)
# Construct matrix:
box.office.all <- c(461, 314.4,290.5, 247.9,309.3,165.8)
movie.names <- c("A new hope","The empire strikes back","Return of the Jedi")
col.titles <- c("US","non-US")
star.wars.matrix <- matrix(box.office.all, nrow=3, byrow=TRUE,
dimnames=list(movie.names,col.titles))
star.wars.matrix
non.us.all <- #your code here
non.us.some <- #your code here
# Print both averages to console:
non.us.all
non.us.some
*** =solution
# Box office Star Wars: In Millions (!) Construct matrix:
box.office.all <- c(461, 314.4, 290.5, 247.9, 309.3, 165.8)
movie.names <- c("A new hope", "The empire strikes back", "Return of the Jedi")
col.titles <- c("US", "non-US")
star.wars.matrix <- matrix(box.office.all, nrow = 3, byrow = TRUE, dimnames = list(movie.names,
col.titles))
star.wars.matrix
non.us.all <- mean(star.wars.matrix[, 2])
non.us.some <- mean(star.wars.matrix[1:2, 2])
# Print to console both averages:
non.us.all
non.us.some
*** =sct
names <- c("star.wars.matrix", "non.us.all", "non.us.some")
values <- c("star.wars.matrix", "mean( star.wars.matrix[,2] )", "mean( star.wars.matrix[1:2,2] )")
DM.result <- closed_test(names, values)
*** =pre_exercise_code
Similar to what you learned with vectors, the standard operators like +
, -
, /
, *
, etc. work in an element-wise way on matrices in R.
For example: 2*my.matrix
multiplies each element of my.matrix
by two.
As a newly-hired data analyst for Lucasfilm, your job is to find out how many visitors went to each movie for each geographical area. You already have the total revenue figures in star.wars.matrix
. Assume that the price of a ticket was 5 dollars. Note that box office numbers divided by the ticket price gives you the number of visitors.
*** =instructions
- Assign to
visitors
the matrix with the estimated number of Non-US and US visitors for the three movies.
*** =hint
The number of visitors is the revenue (which is stored in star.wars.matrix
) divided by the price of ticket (assumed to be $5).
*** =sample_code
# Box office Star Wars: In Millions (!)
# Construct matrix:
box.office.all <- c(461, 314.4,290.5, 247.9,309.3,165.8)
movie.names <- c("A new hope","The empire strikes back","Return of the Jedi")
col.titles <- c("US","non-US")
star.wars.matrix <- matrix(box.office.all, nrow=3, byrow=TRUE,dimnames=list(movie.names,col.titles))
visitors <- #your code here!
# Show me (also in millions!) the
visitors
*** =solution
# Box office Star Wars: In Millions (!) Construct matrix:
box.office.all <- c(461, 314.4, 290.5, 247.9, 309.3, 165.8)
movie.names <- c("A new hope", "The empire strikes back", "Return of the Jedi")
col.titles <- c("US", "non-US")
star.wars.matrix <- matrix(box.office.all, nrow = 3, byrow = TRUE, dimnames = list(movie.names,
col.titles))
visitors <- star.wars.matrix/5
# Show me (also in millions!) the
visitors
*** =sct
names <- c("star.wars.matrix", "visitors")
values <- c("star.wars.matrix", "star.wars.matrix/5")
DM.result <- closed_test(names, values)
*** =pre_exercise_code
Just like 2*my.matrix
multiplied every element of my.matrix
by two, my.matrix1 * my.matrix2
creates a matrix where each element is the product of the corresponding elements in my.matrix1
and my.matrix2
.
After looking at the result of the previous exercise, big boss Lucas pointed out that the ticket prices went up over time with one dollar per movie. He asks to redo the analysis based on the prices you can find in ticket.prices.matrix
(source: imagination).
(Those familiar with matrices should note this is not the standard matrix multiplication for which you should use %*%
in R)
*** =instructions
- Assign to visitors the matrix with your estimated number of Non-US and US visitors for the three movies.
- Assign to
average.us.visitor
the average number of visitors in the US for a Star Wars movie. - Assign to
average.non.us.visitor
the average number of visitors in non-US areas.
*** =hint
- You can use the function
mean()
to calculate the average of the inputs to the function. - To get the number of visitors in the US, select the first column from
visitors
usingvisitors[ ,1]
*** =sample_code
# Box office Star Wars: In Millions (!)
# Construct matrix:
box.office.all <- c(461, 314.4,290.5, 247.9,309.3,165.8)
movie.names <- c("A new hope","The empire strikes back","Return of the Jedi")
col.titles <- c("US","non-US")
star.wars.matrix <- matrix(box.office.all, nrow=3, byrow=TRUE,dimnames=list(movie.names,col.titles))
ticket.prices.matrix <- matrix( c(5,5,6,6,7,7), nrow=3,byrow=TRUE,dimnames=list(movie.names,col.titles))
visitors <- #your code here
average.us.visitor <- #your code here
average.non.us.visitor <- #your code here
#Show me the (all in Millions)
visitors
average.us.visitor
average.non.us.visitor
*** =solution
# Box office Star Wars: In Millions (!) Construct matrix:
box.office.all <- c(461, 314.4, 290.5, 247.9, 309.3, 165.8)
movie.names <- c("A new hope", "The empire strikes back", "Return of the Jedi")
col.titles <- c("US", "non-US")
star.wars.matrix <- matrix(box.office.all, nrow = 3, byrow = TRUE, dimnames = list(movie.names,
col.titles))
ticket.prices.matrix <- matrix(c(5, 7, 6, 8, 7, 9), nrow = 3, byrow = TRUE,
dimnames = list(movie.names, col.titles))
visitors <- star.wars.matrix/ticket.prices.matrix
average.us.visitor <- mean(visitors[, 1])
average.non.us.visitor <- mean(visitors[, 2])
# Show me the (all in Millions)
visitors
average.us.visitor
average.non.us.visitor
*** =sct
names <- c("star.wars.matrix", "ticket.prices.matrix", "visitors", "average.us.visitor",
"average.non.us.visitor")
values <- c("star.wars.matrix", "ticket.prices.matrix", "star.wars.matrix/ticket.prices.matrix",
"mean( visitors[,1] )", "mean( visitors[,2] )")
DM.result <- closed_test(names, values)
*** =pre_exercise_code