Chapter 2 Introduction to RStudio
In this chapter, we will go through some basic operations of RStudio.
2.1 What is R
R is a programming language that supports many tasks, including statistical computing (data cleaning, data management, statistics, machine learning) and graphics (static and interactive plots). You can also use it to create websites (like this course website), write papers, analyze text, and more. Most importantly, R is free and easy to use, which is why it has been applied in many fields.
2.2 What is RStudio
RStudio is an integrated development environment (IDE) for editing and running R code. It has many features that make R programming easier!
2.3 Install R + RStudio
To write and run R code comfortably, you should install both R and RStudio. You can write R code with only R installed, but RStudio makes coding much more convenient. In this course, we will use RStudio for all lectures and exercises, so please make sure you install both!
R can be downloaded here and RStudio can be downloaded here. Both Windows and macOS are supported, so please choose the installer that matches your own system. (If you have any questions about installing R or RStudio, please come to my office hours or ask IT for help.)
Alternatively, you can use the computers in the lab when no lecture is in session.
2.4 Getting familiar with the RStudio user interface
Below is a screenshot of the RStudio user interface. You will find several panes, each with a different purpose (Selvam 2019).
- Menu/Tool Bar
- Source: the pane where you write and edit your code.
- Environment/History: Environment lists all the variables that currently exist in your session. History shows the code you have run before.
- Console: the original R interactive window. You can run code and see the results here.
- Plot/Help: the Plot window shows output figures. The Help window shows documentation for the function or package you are looking up.
2.5 Create and save R file
There are three ways to create an R file in RStudio:
1. Menu -> File -> New File -> R Script
2. Shortcut: Ctrl + Shift + N
3. Tool Bar -> New file button
There are also three ways to save an R file:
1. Menu -> File -> Save
2. Shortcut: Ctrl + S
3. Tool Bar -> Save file button
2.6 Print Hello, world
Now, let’s write some code and run it! Let’s print the classic “Hello, world!” with the print() function.
We can run the code in several ways:
- Select the code, or put the cursor on the line you want to run, and click the Run button at the top right of the Source pane.
- Select the code, or put the cursor on the line you want to run, and use the shortcut Ctrl + Enter.
- You can also click the Re-run button next to the Run button to re-run the code you ran last time.
print('Hello, world!')
## [1] "Hello, world!"
Because what we want to output here is a string (text), we have to put it in quotation marks. Either single or double quotation marks work. Let’s see another example.
print(5928)
## [1] 5928
Here, 5928 is a number, so it does not need quotation marks. (Strictly speaking, R stores it as a double-precision numeric value; an integer literal would be written 5928L.)
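To check how R treats these two kinds of values, here is a minimal sketch. The variable names greeting_single, greeting_double, and number are illustrative and not part of the course material.

```r
# Single and double quotes create the same character string.
greeting_single <- 'Hello, world!'
greeting_double <- "Hello, world!"
identical(greeting_single, greeting_double)  # TRUE

# class() reports what kind of value is stored.
class(greeting_single)  # "character"

number <- 5928
class(number)    # "numeric"
typeof(number)   # "double"
typeof(5928L)    # "integer" (the L suffix requests an integer)
```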
2.7 Install and use R Packages
R is easy to use partly because it has many packages for different purposes. These packages can help you accomplish complex tasks with just a few lines of code.
Some packages are already installed and can be used directly; these are the base packages. However, most packages have to be installed before you can use them. There are a couple of ways to install a package. Let’s take the gbm package as an example.
1. Menu -> Tools -> Install Packages... -> Input the package name -> Click Install button
2. Use the code below:
install.packages("gbm")
After installing the package, you have to load it with the library() function before you can use its functions.
library(gbm)
## Loaded gbm 2.1.8
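Because install.packages() downloads the package again each time it is called, scripts often install a package only when it is missing. The sketch below shows one way to do this. The helper name install_if_missing is hypothetical and not part of R or of this course; it is demonstrated on stats, a base package, so that nothing is downloaded.

```r
# Hypothetical helper: install a package only when it is not yet available.
install_if_missing <- function(pkg) {
  # requireNamespace() returns FALSE if the package cannot be found.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs an internet connection and a CRAN mirror
  }
  # Report (invisibly) whether the package is available afterwards.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Base packages ship with R, so nothing is downloaded here.
install_if_missing("stats")

# Names of the base packages in this R installation.
base_pkgs <- rownames(installed.packages(priority = "base"))
"stats" %in% base_pkgs
```

For the course example, you would call install_if_missing("gbm") and then library(gbm) as before.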
We will spend more time in future classes exploring various R packages and what they are used for.
2.8 Make notes
It is important to write notes (comments) for your code. They help others, and even your future self, understand the code easily. Use the hash sign (#) to mark a comment. For example,
gbm1 <- gbm(AvgMet~PkAreaH+StpNumH+DisToMin, # formula
            data=MetM, # dataset
            var.monotone=c(+1, rep(0,10),rep(0,15)),
            distribution="gaussian", # see the help for other choices
            n.trees=5000, # number of trees
            shrinkage=0.001, # shrinkage or learning rate, 0.001 to 0.1 usually work
            interaction.depth=6, # 1: additive model, 2: two-way interactions, etc.
            bag.fraction = 0.5, # subsampling fraction, 0.5 is probably best
            n.minobsinnode = 10, # minimum total weight needed in each node
            cv.folds = 5)
R ignores everything after a hash sign on each line, so comments are never run as code.
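As a smaller illustration, the lines below behave exactly the same with or without their comments. The variable names and the numbers are made up for this sketch.

```r
# A whole-line comment: R skips this line entirely.
height_cm <- 172             # a trailing comment after code
height_m <- height_cm / 100  # convert centimetres to metres
# print("this line is commented out, so it never runs")
print(height_m)
## [1] 1.72
```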
Please try to write short but useful comments for your code. Make this a habit and you will thank yourself in the future.
2.9 Tips
- You can divide your code into sections by inserting a section header before each section with the shortcut Ctrl + Shift + R. This will help you organize your code.
- Use ? or the help() function to open the help page for a function. For example, to find the help page for the library() function, run
?library
or
help(library)
Both will open the help page you are looking for in the Help window.
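Putting both tips together, a sectioned script might look like the sketch below. In RStudio, a comment line ending in four or more dashes is treated as a section header. The section names and the variable scores are illustrative, and the help calls are left as comments so the script also runs outside an interactive session.

```r
# Load data ----
scores <- c(82, 91, 77)

# Summarise ----
mean_score <- mean(scores)
print(mean_score)

# Get help ----
# Run either of these interactively to open the help page for mean():
# ?mean
# help(mean)
```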