-
Notifications
You must be signed in to change notification settings - Fork 0
/
R_Tutorial Presentation.Rpres
369 lines (293 loc) · 10.4 KB
/
R_Tutorial Presentation.Rpres
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
R Tutorial 101: Basic Introduction to R
========================================================
author: Parash Upreti
date:
autosize: true
.
Disclaimer: Most of the information may seem basic but I recommend quickly trying the codes and rejuvinating rusty skills.
Overview
========================================================
This is a very basic R Tutorial and assumes that you have a very little or no previous R knowledge. You can easily follow the slides with R your local installation of the software. Topics here include:
- Getting Started with R
- Working on R
- Workspace
- Commenting R Code
- Working Directory
- Working with data
- Vectors, list and Matrices
- R Data Structures
- Logical Operators and Truth values
- Programming in R
- Functions and packages
Getting Started with R
========================================================
- R is a statistical and mathematical computational software. It is available free of cost here [https://cran.r-project.org/bin/windows/base/].
- The popular IDE for R is RStudio which can be downloaded for free as well here [https://www.rstudio.com/products/RStudio/] Choose the desktop version.
- In RStudio, you will be working on either console or script windows. Inputs can be inserted in both windows but Console is where your results are displayed.
- RStudio is nicer since you can see two additional windows which can be helpful in many ways.
Lets Start working on R
========================================================
Do the following. Write type the following lines on your R script and hit ` control + enter` (there are other ways to run code in R) :
```{r, results="hide"}
1+3*3
```
```{r, results = "hide"}
3**5
3^5
```
As you can see, R has a intreprative command line. It can easily perform basic arithematic operations. Try some other calculations. (Note, you can directly run code in console)
Working on R
======================================================
Also, you can assign values to a variable we will use `"<-"` to assign values. (Can also use `"="` but we try to avoid.)
```{r}
x<- 7
x
y<- 3^2
x + y
```
If you check on the window to your right on the environment tab, aka workspace, you will see `x` and `y` as values. You can check what is on your workspace by `ls()`
Altering the workspace
=====================================
- You can alter the workspace by reassigning the value
```{r}
x<- 1000
x<- x *3
x
```
- You can also remove a particular value by
```{r}
rm(x)
````
- OR
```{r}
remove(y)
```
Commenting the code
========================================================
- Commenting is the most important part of writing code. Comments can be useful to give instructions or explain what particular line of code does.
- In R, you can write comments by beginning the line with ` # `
```{r}
# The following code adds two numbers
3+5 # 3 (first number) comes from x values and 5 (second) from y
```
The code runs with the command (which is ignored by R)
- If you need to clear your console you can press `cltr + l`. It clears the texts from the console but not from the workspace.
Working directory
========================================================
Working directory is the location on your hard drive thar R is currently accessing. This can be changed anytime. In RStudio, you can see this on the bottom right window under files tab.
```{r}
#getting current working directory
getwd()
```
- Set working directory by: `setwd(#location on your computer)`
- Can also view the working directory by:
```{r}
dir()
```
Working with data
========================================================
Data in R is in a vector form (for now). Concatinate (`c`) numbers as entries of vectors:
```{r}
x<- c(1,3,7,2,9,5)
y<- c(12,9,11, 14,7)
```
Check sombe basic information about your data
```{r}
length(x)
length(c(x,y))
```
Working with data
=======================================================
- Select particular data point in your vector
```{r}
x[5]
```
- Or remove a particular value
```{r}
y[-2]
```
- Change value or even add extra datapoints
```{r}
y[2] <- 10
y[6] <- 13 #earlier y only had 6 data points
```
Merging two vectors
========================================================
More on this later but just an example of what can be done with vectors
```{r}
xycol<- cbind(x,y) # x and y have to be same length
xycol
xyrow<- rbind (x,y)
xyrow
```
Create a list
=======================================================
Check to see what `list` does
```{r, results= "hide"}
list(4,3)
first_list<- list(xcol =x,ycol=y)
first_list
```
Notice the difference of the
```{r}
first_list[2]
first_list[[2]]
```
Matrices
========================================================
```{r}
X= matrix(c(1,2,4,2,2,3,2,1,2,4,2,4,1,3,4), ncol= 5, byrow =T)
X
```
By default, matrix is arranged by column. Also, you need to assign the number of columns (or rows).
Like vectors, matrix elements can be located or parsed by using:
```{r}
X[ ,3 ] #First entry is row and second is column index
X[2,3]
```
Table
==========================================================
Table commands tabulates the instances of each datapoint occurances
```{r}
table(c(1,2,4,2,2,3,2,1,2,4,2,4,1,3,4))
prop.table(table(X)) # converts the counts into probability
```
R Data Structure
==========================================================
R data comes in following formats:
- Numeric/Integers: Numeric values including integers and decimals
- Strings : Characters, (exclude if they are factors)
- Factors : Levels of classification. Eg. male/female, red/yellow/blue, etc
- Boolean : True/False (assigns 1 for true and 0 for false). Can add/subtract boolean in R
- Date/Time : Self explanatory. Try `sys.date()` and `sys.time()`
Operation on Matrices
========================================================
R can perform matrix operations. Try the following on your console
```{r}
x<-1:3
y<-4:6
z<-7:9
M<-cbind(x,y,z)
N<-matrix(c(10,11,12,50,51,52),3,2,byrow='true')
```
Verify the following:
```{r, results="hide"}
dim(N) #Returns the dimensions of N.
dim(N)[1] #Number of rows in N.
dim(N)[2] #Number of cols in N.
```
Operation on Matrices
========================================================
Try the following codes: Comments are provided as hints
```{r, results="hide"}
M*M #Elementwise multiplication
M%*%N #Matrix multiplication
M^2 #Square each element
N+7 #Add 7 to each element of M
M%*%M #Square M using matrix multiplication
t(M) #Transpose of A
diag(M) #Returns the diagonal of M as a vector.
diag(3) #Creates the 3x3 identity matrix.
```
Try creating more matrices and try other operations
What does following command do?
```{r, results ="hide"}
M^(-1)
```
Matrices "solve" command
=======================================================
What does the following "solve" command do?
```{r, results ="hide"}
A<- matrix(c(2,3,4,2), nrow =2)
solve(A)
```
- Hint: Try Solve command for a non- square matrix. Try it again for matrix M
More solve:
```{r, results="hide"}
solve(A)%*%A
B =solve(A,diag(2)) #remember what diag(2) does?
B #What is B?
```
- More hint: solve is solving something. Matrix multiply A and B to find out. Shouldn't be that hard.
`A^(-1)` is not same as `solve(A)`
Logical Operators
===========================================================
These are equivalent to your true false statements.
- Equal == [careful here, double equal sign]
- Not equal !=
- Greater than/equal >/>=
- Less than/ equal </<=
Run the code and see the results.
```{r, results= "hide"}
3==7
3!=7
3<7
3<=7
x=1:10
x==7
x<=7
x[x<=7] #Understand this line of code please
```
More Logical Operators
==========================================================
- And [`&`] -- Or [`|`] -- Not [`!`]
Does the word "truth table " ring a bell?
```{r}
TRUE & FALSE
TRUE | FALSE
!TRUE
```
Truth Table
=======================================================
```{r}
P = c(TRUE, TRUE, FALSE, FALSE)
Q = c(TRUE, FALSE, FALSE, TRUE)
cbind(P, Q, "P&Q"= P&Q, "P_or_Q" = P|Q, "Not_P" =!P)
```
Programming in R
======================================================
If else statements in R
```{r}
if(5>1){print('Yes, it is.')}
if(5>1){print('Yes, it is.')}else{print('No, it is not')}
if(5==1){print('Yes, it is.')}else{print('No, it is not')}
```
Using a Function
======================================================
We have already used many functions available in R.
- functions are in the form `function()` such as `cbind()` or `matrix()`
- such functions are available in `R base` integrated in plain R
- thre are many statistical functions in R such as `mean()`, `median()`, etc. (more on this later)
- you can create your own function as well. For eg.
```{r}
operation<- function(x,y){
sumops= x+y
difops= x-y
prodops=x*y
quotops=x/y
list(sum =sumops, diff = difops, prod = prodops, quot = quotops)
}
operation(3,22)$sum
```
Introuction to Statistical Functions
================================================================
R is equipped with many statistical functions. Other functions will be introduced as we move forward
```{r}
normal_data =rnorm(1000,100,15) #sample of size 1000 from a normal distribution with mu=100 and sigma=15
mean(normal_data) #Computes the sample mean.
sd(normal_data) #Computes the sample st dev.
sum(normal_data) #Computes the sum of the elements of x.
```
R Packages
========================================================
Since R is an open source software and many users in the community are contributing, it is growing. R packages are developed by a variety of people in the community, academia, industry, and software companies and storing them in CRAN repositories, GitHub or on some websites. R packages could be thought of as functions created by others that are not already availabe in base R.
So they have to be installed in the machine using `install.packages("package_name")` (only once) and then called in before every section using `library(package_name)`
Example of R pacakges:
- ggplot
- e1071
we will discuss packages and their use in the future
Up Next
========================================================
- Using dataset
- Dataset available in R base
- Plots