Chapter 2 Introduction to RStudio
In this chapter, we will cover some basic operations in RStudio.
2.1 What is R
R is a programming language that supports many tasks, including statistical computation (data cleaning, data management, statistics, machine learning) and graphics (static and interactive plots). You can also use it to create websites (like this course website), write papers, analyze texts, etc. Most importantly, R is free and easy to use, which is why it has been applied in so many fields.
2.2 What is RStudio
RStudio is an integrated development environment (IDE) for editing and running R code. It has many great features that make R programming easier!
2.3 Install R + RStudio
To write and run R code comfortably, you should install both R and RStudio. You can write R code with only R installed, but RStudio makes coding much more convenient. In this course, we will use RStudio for all lectures and exercises, so please make sure you install both!
R can be downloaded here and RStudio can be downloaded here (choose the free version). Both Windows and macOS are supported; choose the installer that matches your system. (If you have any questions about installing R or RStudio, please come to office hours or ask IT for help.)
2.4 Get familiar with the user interface of RStudio
Below is a screenshot of the user interface of RStudio. You will find several panes/windows with different uses (Selvam 2019).
- Menu/Tool Bar
- Source: The pane where you write and edit your code.
- Environment/History: Environment lists all the variables you are currently using. History shows the code you have run before.
- Console: The original R interactive window. You can run code and see the results here.
- Plot/Help: The Plot window shows output figures. The Help window shows the documentation of the function or package you look up.
2.5 Create and save R file
There are three ways to create an R file in RStudio:
1. Menu -> File -> New File -> R Script
2. Shortcut: Ctrl + Shift + N
3. Tool Bar -> New file button
There are also three ways to save an R file:
1. Menu -> File -> Save
2. Shortcut: Ctrl + S
3. Tool Bar -> Save file button
2.6 Print Hello, world
It’s time to write some code and see the output! Let’s print the classic “Hello, world!” with the print() function.
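The code chunk for this step is not shown here, so the line below is a minimal sketch of what it most likely looks like, assuming the standard print() function is used:

```r
# Print the classic greeting to the console
print("Hello, world!")
```

Type this line into the Source pane of a new R script so that you can run it with the options described next.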
After writing the code, you can run it in several ways:
- Select the code or place the cursor on the line you want to run, then click the Run button at the top right of the Source pane.
- Select the code or place the cursor on the line you want to run, then use the shortcut Ctrl + Enter.
- Click the Re-run button next to the Run button to re-run the code you ran last time.
## [1] "Hello, world!"
Because the value we want to output here is a string, we have to put it in quotation marks. Either single or double quotation marks work. Let’s see another example.
## [1] 5928
Here, 5928 is a number, so we do not need to put it in quotation marks.
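The difference between strings and numbers can be checked directly. The short sketch below is our own illustration (not part of the original chunks) and uses only base R functions:

```r
# Single and double quotes create exactly the same string
print("Hello, world!")
print('Hello, world!')
identical("Hello, world!", 'Hello, world!')  # TRUE

# A number is written without quotes
print(5928)
class(5928)    # "numeric"

# Adding quotes turns the same digits into a string
class("5928")  # "character"
```

As the last two lines suggest, quoting a number changes its type, so you could no longer do arithmetic with it.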
2.7 Install and use R Packages
R is easy to use because it has a huge number of packages for different purposes. These packages can help you accomplish complex tasks with just a few lines of code (another reason we like to use R).
Some packages, called base packages, are already installed and can be used directly. However, most packages have to be installed before they can be called in your code. There are a couple of ways to install a package. Let’s take the gbm package as an example.
1. Menu -> Tools -> Install Packages... -> Input the package name -> Click Install button
2. Use the code below: install.packages("gbm")
After installing the package, you have to load it with the library() function before you can use the functions in the package.
## Loaded gbm 2.1.8
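The install-then-load steps above can also be combined so that a package is only installed when it is missing. The sketch below is one possible way to do this; load_or_install is a hypothetical helper name of our own, not a function from R or RStudio, and the installation step assumes an internet connection and a configured CRAN mirror.

```r
# Hypothetical helper: install a package only if it is missing, then load it
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # only reached when the package is not installed
  }
  library(pkg, character.only = TRUE)  # load the package by its name
  invisible(pkg %in% loadedNamespaces())
}

# "stats" ships with R, so this call loads it without installing anything
load_or_install("stats")

# For the package used in this chapter, the call would be:
# load_or_install("gbm")
```

For a single package, typing install.packages("gbm") once and then library(gbm) at the top of each script works just as well.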
We will spend more time in future classes exploring the various R packages and their uses.
2.8 Make notes
It is important to write notes (comments) for your code. They help others, and even your future self, understand the code easily. Use the hash sign (#) to mark a note. For example,
gbm1 <- gbm(AvgMet~PkAreaH+StpNumH+DisToMin, # formula
            data=MetM, # dataset
            var.monotone=c(+1, rep(0,10),rep(0,15)),
            distribution="gaussian", # see the help for other choices
            n.trees=5000, # number of trees
            shrinkage=0.001, # shrinkage or learning rate, 0.001 to 0.1 usually work
            interaction.depth=6, # 1: additive model, 2: two-way interactions, etc.
            bag.fraction = 0.5, # subsampling fraction, 0.5 is probably best
            n.minobsinnode = 10, # minimum total weight needed in each node
            cv.folds = 5)
R does not run anything that follows a hash sign on a line.
Please try to write simple but necessary notes for your code. Make this a habit and you will thank yourself in the future.
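The example above needs the gbm package and a dataset, so here is a smaller sketch you can run right away. The variable names and the input value are made up for illustration:

```r
# Convert a temperature from Celsius to Fahrenheit
celsius <- 25                       # hypothetical input value
fahrenheit <- celsius * 9 / 5 + 32  # conversion formula
print(fahrenheit)

# The next line starts with a hash sign, so R never runs it:
# print("This text is not printed")
```

Running this prints only the converted temperature; the commented-out print() call is skipped.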
- You can divide your code into sections by inserting a section header before each section with the shortcut Ctrl + Shift + R. This will help you organize your code. You can run the code in the current section with the shortcut Ctrl + Alt + T.
- You can use the help() function (or the ? operator) to open the help page of a function or package. For example, to find the documentation of the library() function, run help(library) or ?library. Both will take you to the help page in the Help window, where you can find out how to use the function.
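Both tips can be tried in one small script. In the sketch below, the section names and the numbers are made up for illustration. A section header in RStudio is a comment line that ends with at least four dashes, which is what Ctrl + Shift + R inserts for you.

```r
# Prepare data ----
heights <- c(150, 160, 170)  # hypothetical values

# Summarise ----
total <- sum(heights)
print(total)

# Look up documentation ----
help(library)  # opens the help page for library()
?library       # shorthand for the same lookup
```

Place the cursor inside one of the sections and press Ctrl + Alt + T to run only that section.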