Using R for Data Analysis
Software Installation
- R
  - If you are using Windows, go to the website's homepage, click "Download R for Windows", then click "install R for the first time", and finally click "Download R 4.0.4 for Windows". After the download finishes, run the installer.
  - R itself is primarily command-line software, without a full graphical working environment.
- RStudio
  - Simply click the blue "Download" button, or choose a version for another operating system from the options at the bottom of the page. After downloading, complete the installation yourself.
Learning Resources
Online Resources (Recommended)
Books
Basic Data Types
R has several fundamental data types:
- Numeric
- Integer
- Complex
- Logical
- Character
Numeric
Numeric is the most basic data type in R. When we assign a numeric value to a variable, the variable's type becomes numeric:
> x = 11.15 # Assign the numeric value 11.15 to variable x
> x # Output the value of x
[1] 11.15
> class(x) # Output the type of x
[1] "numeric"
Both whole numbers and decimals can be stored as numeric values. A whole number assigned as shown above, such as x = 1, is still stored as a numeric (double-precision) value, not as an integer.
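As a small illustration of this point, the sketch below assigns a whole number without any special syntax and checks its type. The variable name k is only an example.

```r
# A whole number assigned without a suffix is still stored as numeric (double).
k = 1
class(k)        # "numeric"
is.integer(k)   # FALSE
```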
Integer
To create an integer variable, use the as.integer function:
> y = as.integer(3)
> y # Output the value of y
[1] 3
> class(y) # Output the type of y
[1] "integer"
> is.integer(y) # Is y an integer?
[1] TRUE
Apart from using the as.integer function, you can also append the L suffix to a number (for example, 3L) to create an integer.
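A minimal sketch of the L suffix, using an example variable y, could look like this:

```r
# The L suffix marks the literal as an integer.
y = 3L
class(y)                      # "integer"
is.integer(y)                 # TRUE
identical(y, as.integer(3))   # TRUE: same value and same type
```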
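To convert a decimal to an integer, you can use the as.integer function. Note that it truncates toward zero rather than rounding to the nearest integer.
as.integer can also parse a string that contains a number, truncating any decimal part in the same way.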
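The following sketch shows both conversions with arbitrary example values. The call to round is included only for comparison.

```r
# as.integer drops the fractional part (truncation toward zero).
as.integer(3.14)     # 3
as.integer(-3.99)    # -3, not -4
# A string containing a number is parsed first, then truncated.
as.integer("5.27")   # 5
# round() goes to the nearest whole number instead.
round(3.99)          # 4
```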
However, if the string is not a valid number, the result is NA and R issues a warning (not an error):
> as.integer("Joe") # Parsing a non-numeric string
[1] NA
Warning message:
NAs introduced by coercion
Like C, R maps the logical values TRUE and FALSE to the integers 1 and 0:
> as.integer(TRUE) # Integer value of TRUE
[1] 1
> as.integer(FALSE) # Integer value of FALSE
[1] 0
Complex
In R, complex values are written with the imaginary-unit suffix i (for example, 1 + 2i).
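As a brief sketch with an example variable z, a complex value can be created and inspected as follows. Note that sqrt(-1) on a plain numeric returns NaN with a warning, so the argument is converted to complex first.

```r
z = 1 + 2i
class(z)               # "complex"
Re(z)                  # real part: 1
Im(z)                  # imaginary part: 2
sqrt(as.complex(-1))   # 0+1i
```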
Elements of a vector can be accessed by index, and indexing starts at 1.
To create a sequence, use the : operator (for example, 1:5).
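A short sketch of 1-based indexing and the : operator follows. The vector v and its values are made up for illustration.

```r
v = c(10, 20, 30, 40)   # example vector
v[1]                    # first element: 10
v[4]                    # fourth element: 40
s = 1:5                 # the sequence 1 2 3 4 5
length(s)               # 5
```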
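Vectors support element-wise operations such as addition, subtraction, multiplication, and division:
> x = c(1, 2, 3)
> y = c(4, 5, 6)
> x + y # Element-wise addition
[1] 5 7 9
> x - y # Element-wise subtraction
[1] -3 -3 -3
> x * y # Element-wise multiplication
[1] 4 10 18
> x / y # Element-wise division
[1] 0.25 0.40 0.50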
Element-wise operations also apply to logical vectors:
> u = c(TRUE, FALSE, TRUE)
> v = c(FALSE, TRUE, FALSE)
> u & v # Element-wise logical AND
[1] FALSE FALSE FALSE
> u | v # Element-wise logical OR
[1] TRUE TRUE TRUE
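Naming Vector Elements
You can give each element of a vector a name, which makes the vector easier to read and work with.
A specific element can then be accessed by its name.
It can also still be accessed by its index.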
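As a sketch, the names function can attach labels to an existing vector. The vector scores and its labels are hypothetical.

```r
scores = c(90, 85, 77)                     # example values
names(scores) = c("math", "art", "music")  # one name per element
scores["art"]   # access by name: 85
scores[2]       # access by position: the same element
```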
Slicing Vectors
A subset of a vector can be obtained with a range of indices; this is called slicing.
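One way to sketch slicing is to pass an index range, or an arbitrary vector of positions, inside the square brackets. The vector x here is an example.

```r
x = c(5, 1, 3, 2, 4)
x[2:4]       # elements 2 through 4: 1 3 2
x[c(1, 5)]   # first and last elements: 5 4
```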
Concatenating Vectors
Two vectors can be concatenated into one.
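A minimal sketch, assuming two numeric example vectors x and y, uses the c function to join them:

```r
x = c(1, 2, 3)
y = c(4, 5, 6)
z = c(x, y)   # 1 2 3 4 5 6
length(z)     # 6
```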
Repeating Vectors
The rep function repeats a vector.
You can also specify how many times each element is repeated.
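The sketch below shows the times and each arguments of rep on an example vector x:

```r
x = c(1, 2, 3)
rep(x, times = 2)            # whole vector twice: 1 2 3 1 2 3
rep(x, each = 2)             # each element twice: 1 1 2 2 3 3
rep(x, times = c(3, 1, 2))   # per-element counts: 1 1 1 2 3 3
```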
Sorting Vectors
The sort function sorts a vector:
> x = c(5, 1, 3, 2, 4)
> sort(x) # Ascending order
[1] 1 2 3 4 5
> sort(x, decreasing=TRUE) # Descending order
[1] 5 4 3 2 1
Filtering Vectors
A logical vector can be used to filter a vector, keeping only the elements that satisfy a condition.
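As a sketch, a comparison on a vector produces a logical vector of the same length, which can then be used as an index. The names x and keep are examples.

```r
x = c(5, 1, 3, 2, 4)
keep = x > 2       # TRUE FALSE TRUE FALSE TRUE
x[keep]            # elements greater than 2: 5 3 4
x[x %% 2 == 0]     # even elements: 2 4
```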
These are the basic vector operations in R; they are used very often in data analysis and processing.
Combining Vectors
To combine two vectors, you can use the c function:
> n = c(2, 3, 5)
> s = c("aa", "bb", "cc", "dd", "ee")
> c(n, s)
[1] "2" "3" "5" "aa" "bb" "cc" "dd" "ee"
Note that when vectors of different data types are combined, all values are coerced to the most general type involved. In the example above, the numeric values are converted to character strings.
Basic Vector Operations
Let's assume we have two numeric vectors, a and b.
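The original definitions are not shown here. The values below are reconstructed from the results printed in this section, so treat them as an assumption that happens to reproduce those outputs:

```r
# Reconstructed from the printed results (5 * a and a + b), not from the source.
a = c(1, 3, 5, 7)
b = c(1, 2, 4, 8)
```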
Here are some basic operations on vectors:
> a + b
[1] 2 5 9 15
> a - b
[1] 0 1 1 -1
> 5 * a
[1] 5 15 25 35
> a * b
[1] 1 6 20 56
> a / b
[1] 1.000 1.500 1.250 0.875
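If the two vectors do not have the same number of elements, the shorter one is recycled (its elements are reused from the start) until it matches the longer one, so the result has the length of the longer vector. R issues a warning if the longer length is not a multiple of the shorter one.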
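A small sketch of this behavior, using example vectors u (length 3) and v (length 9):

```r
u = c(10, 20, 30)
v = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
# u is reused three times to match the length of v.
u + v   # 11 22 33 14 25 36 17 28 39
```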
Accessing Vectors
To retrieve elements from a vector, use square brackets [ ] with an index specifying which element to access, like [index]:
> s = c("aa", "bb", "cc", "dd", "ee")
> s[3] # Retrieve and print the value of the third element
[1] "cc"
If you put a negative sign before the index, such as [-3], the third element is excluded and the rest are returned.
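Using the same vector s as above, a negative index can be sketched like this:

```r
s = c("aa", "bb", "cc", "dd", "ee")
s[-3]         # everything except the third element: "aa" "bb" "dd" "ee"
s[-c(1, 2)]   # drop the first two elements: "cc" "dd" "ee"
```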
If the index exceeds the length of the vector, the result is NA rather than an error.
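For instance, indexing past the end of the five-element vector s can be sketched as follows:

```r
s = c("aa", "bb", "cc", "dd", "ee")
s[10]          # NA: there is no tenth element
is.na(s[10])   # TRUE
```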
[Updating...]
This post was translated using ChatGPT; please send feedback if you notice any omissions.