In this tutorial, you will learn about the vector, one of the data structures in the R programming language. As in other programming languages, a data structure is an organized way of storing data in a computer’s memory so that it can be referred to later. You will learn how to create a vector, how to manipulate its data, and which functions are used to examine it. Let us have a look.
In R, a vector is a basic data structure and is homogeneous in nature. Homogeneous means that a vector holds only data elements of the same data type. In other words, a vector is a data structure that stores a sequence of data elements of the same basic data type. Vectors in R use the basic data types we discussed in our previous tutorial. A vector can have a single element or a sequence of elements that belong to any one of the basic data types such as logical, integer, or numeric. Vectors are therefore classified by the type of their elements. The five commonly used classes, or atomic types, are listed below (R also has a sixth atomic type, raw, which is not covered here).
Vector Type | Description | Example |
---|---|---|
Logical | Takes either a TRUE or a FALSE value. | TRUE, FALSE |
Integer | Takes whole-number values (negative, zero, or positive), written with an L suffix. | 0L, 56L, 7990L |
Numeric | Takes both whole numbers and decimal values. | 0, 10, 0.009, 5.6 |
Complex | Takes values with real and imaginary parts. | 1+2i, -3+4i |
Character | Takes a single character or a string of characters. | "A", "HELLO" |
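As a minimal sketch of the table above, the snippet below builds one small vector per atomic type and inspects it with class() and typeof(). The variable names (v_logical, v_integer, and so on) are chosen here only for illustration.

```r
# One small vector per atomic type from the table above
# (variable names are illustrative, not from the tutorial)
v_logical   <- c(TRUE, FALSE)
v_integer   <- c(0L, 56L, 7990L)     # the L suffix marks integer literals
v_numeric   <- c(0, 10, 0.009, 5.6)
v_complex   <- c(1+2i, -3+4i)
v_character <- c("A", "HELLO")

class(v_logical)     # "logical"
class(v_integer)     # "integer"
class(v_numeric)     # "numeric"
typeof(v_numeric)    # "double": numeric values are stored as doubles
class(v_complex)     # "complex"
class(v_character)   # "character"
```

Note that a number written without the L suffix, such as 56, is stored as numeric rather than integer.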
The c() function combines a sequence of data elements of the same basic data type into a vector in R.
c(<value 1>, <value 2>, ..., <value n>)
c(5,45,19) #numeric
c(TRUE,FALSE) #logical
You can assign the created vector to a variable such as Vector1 or Vector2:
Vector1 = c(5,45,19)
Vector2 = c(TRUE,FALSE)
The variables you created in the previous tutorials, such as language <- "R programming " and "Variables in R" -> Tutorial, are themselves vectors, each holding a single element.
Vector1 = c(5,45,19) #created numeric data type vector
#assigned to variable named Vector1
Vector2 = c(TRUE,FALSE) #logical data type vector
#assigned to variable named Vector2
print(Vector1)
print(Vector2)
Output:
[1]  5 45 19
[1]  TRUE FALSE
a = c('john','sam','jeniffer','Alex','Paul') # vector a of character type
print(a)
a <- c(a, "Zain") #add element Zain at the end of vector a
print(a)
a <- c( "james",a) #add element james at the beginning of vector a
print(a)
A vector a of character data type is created, holding the values 'john', 'sam', 'jeniffer', 'Alex', and 'Paul'. Two elements are then appended to it, one at the end with a <- c(a, "Zain") and one at the beginning with a <- c( "james",a). When you execute the above code, it produces the result given below.
Output:
> print(a)
[1] "john"     "sam"      "jeniffer" "Alex"     "Paul"
> a <- c(a, "Zain")
> print(a)
[1] "john"     "sam"      "jeniffer" "Alex"     "Paul"     "Zain"
> a <- c( "james",a)
> print(a)
[1] "james"    "john"     "sam"      "jeniffer" "Alex"     "Paul"
[7] "Zain"
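As an alternative sketch, base R also provides append(), whose after argument lets you insert at an arbitrary position rather than only at the ends. The name "maria" below is a made-up value used only to illustrate inserting in the middle.

```r
a <- c("john", "sam", "jeniffer")

a <- append(a, "Zain")               # same effect as c(a, "Zain")
a <- append(a, "james", after = 0)   # after = 0 inserts before the first element
a <- append(a, "maria", after = 2)   # hypothetical value, inserted after position 2

print(a)
# [1] "james"    "john"     "maria"    "sam"      "jeniffer" "Zain"
```

Like c(), append() returns a new vector, so the result has to be assigned back to a for the change to persist.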
You can attach a name to each vector element in R by using the names() function. You can then refer to each element of the vector by the name associated with it.
Syntax to name a vector element
names(<vector_name1>) = <vector_name2>
Let us understand with an example that uses two vectors, number and colors. First, create the numeric vector number.
number <- c(1,2,3,4) #numeric data type vector
print(number)
Output:
[1] 1 2 3 4
Next, create another vector, colors, of character data type.
colors = c('pink','yellow','blue','green') #character data type vector
print(colors)
Output:
[1] "pink" "yellow" "blue" "green"
Use the names() function to assign names to the vector elements; in our example, names(number) = colors. The vector colors is assigned to names(number), which labels each number with a color name: "pink" for 1, "yellow" for 2, and so on.
Let us understand with a program
number <- c(1,2,3,4) #numeric data type vector
print(number)
colors = c('pink','yellow','blue','green') #character data type vector
print(colors)
names(number) = colors # names()
print(number)
Output:
[1] 1 2 3 4
[1] "pink"   "yellow" "blue"   "green"
  pink yellow   blue  green
     1      2      3      4
The same named vector can also be created without a separate call to names(), by supplying the names directly inside c(), as the following program shows.
labels <- c(1,2,3,4)
colors <- c('pink','yellow','blue','green')
names(labels)<- colors
print(labels)
labels <- c('pink'=1,'yellow'=2,'blue'=3,'green'=4)
print(labels)
labels <- c(pink=1,yellow=2,blue=3,green=4)
print(labels)
Output:
> labels <- c(1,2,3,4)
> colors <- c('pink','yellow','blue','green')
> names(labels)<- colors
> print(labels)
  pink yellow   blue  green
     1      2      3      4
> labels <- c('pink'=1,'yellow'=2,'blue'=3,'green'=4)
> print(labels)
  pink yellow   blue  green
     1      2      3      4
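Once a vector has names, its elements can be looked up by name instead of by position. The following is a small sketch of that idea using the labels vector from the program above; the variable picked and the replacement name "orange" are illustrative only.

```r
labels <- c(pink = 1, yellow = 2, blue = 3, green = 4)

labels["blue"]                          # the element named "blue" (value 3, name kept)
picked <- labels[c("pink", "green")]    # several elements at once, by name
unname(labels["blue"])                  # 3, with the name stripped

names(labels)                           # "pink" "yellow" "blue" "green"
names(labels)[2] <- "orange"            # rename only the second element
names(labels)                           # "pink" "orange" "blue" "green"
```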
The length() function determines the length of a vector.
Syntax to check length of vector
length(<vector_name>)
Example :
length(labels)
length(colors)
labels <- c(1,2,3,4,5,6,7,8,9,10)
print(length(labels))
colors <- c('pink','yellow','blue','green')
print(length(colors))
When the above code is executed, the length() function returns the number of elements in labels and in colors.
Output:
[1] 10
[1] 4
Extracting elements from a vector is also called subsetting a vector. The operator used for subsetting is [ ].
In R, vector elements are retrieved by providing the index number of the element inside square brackets, as in <name of vector>[index value], e.g. Vector1[1]. Indexing in R starts at 1, so index 0 does not refer to any element and returns an empty vector.
Vector1 = c(5,45,19) #created numeric data type vector
Vector2 = c('john','sam','jeniffer') # Vector2 of character type
Vector1[1]
Vector2[3]
Vector2[0]
Output:
> Vector1[1]
[1] 5
> Vector2[3]
[1] "jeniffer"
> Vector2[0]
character(0)
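Besides a single positive index, the [ ] operator also accepts a vector of positions, negative positions, and a logical condition. The sketch below shows these forms on Vector1; the helper variables (picked, dropped, large) are named here only for illustration.

```r
Vector1 <- c(5, 45, 19)

picked  <- Vector1[c(1, 3)]        # elements at positions 1 and 3: 5 19
dropped <- Vector1[-2]             # everything except position 2: 5 19
large   <- Vector1[Vector1 > 10]   # elements that satisfy a condition: 45 19

Vector1[2] <- 50                   # the same operator replaces an element
print(Vector1)
# [1]  5 50 19
```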
You can extract multiple elements from a vector by specifying a starting index and an ending index, as in Vector1[1:3], which returns the elements within that range, as shown below.
Vector1 = c(5,45,19) #created numeric data type vector
Vector2 = c('john','sam','jeniffer') # Vector2 of character type
Vector1[1:3]
Vector2[1:3]
Vector2[1:5]
Output:
> Vector1[1:3]
[1]  5 45 19
> Vector2[1:3]
[1] "john"     "sam"      "jeniffer"
> Vector2[1:5]
[1] "john"     "sam"      "jeniffer" NA         NA
Here Vector2 has only three elements, so indices 4 and 5 return NA. NA represents missing values, which we will discuss in coming tutorials.
In R, the built-in function is.vector() checks whether an object is a vector. The function returns TRUE if the object is a vector and FALSE otherwise.
is.vector(<vector_name>)
labels <- c(1,2,3,4,5,6,7,8,9,10)
print(is.vector(labels))
colors <- c('pink','yellow','blue','green')
print(is.vector(colors))
alphabets<- c('pink','yellow','blue','green')
print(is.vector(alphabets))
print(is.vector(names)) # names is a built-in function, not a vector
Output:
[1] TRUE
[1] TRUE
[1] TRUE
[1] FALSE
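A few more cases may help show what is.vector() actually tests. In this sketch, the matrix m and the list values are arbitrary examples; the optional mode argument and is.atomic() are part of base R.

```r
m <- matrix(1:4, nrow = 2)                  # arbitrary 2 x 2 matrix

is.vector(m)                                # FALSE: a matrix carries a dim attribute
is.vector(list(1, "a"))                     # TRUE: lists also count as vectors
is.vector(c(1, 2, 3), mode = "numeric")     # TRUE: vector of the requested mode
is.vector(c(1, 2, 3), mode = "character")   # FALSE: it is a vector, but not of this mode

is.atomic(c(1, 2, 3))                       # TRUE: all elements share one basic type
is.atomic(list(1, "a"))                     # FALSE: a list can mix types
```

If the goal is to check specifically for a homogeneous vector of the kind described in this tutorial, combining is.vector() with is.atomic() is one reasonable option.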
A vector is a data structure with a sequence of elements that belong to one of the atomic classes such as numeric, integer, complex, logical, or character. In R, a single vector is not allowed to mix these atomic classes. If values of different classes are combined, R performs coercion.
Coercion literally means “the practice of forcing someone to do something”. In the context of R vectors, it means that values of different data types are forced into a single common data type.
Consider a vector v3 in which values of three different atomic classes are stored: logical (FALSE), numeric (4.5), and integer (67L).
v3=c(FALSE,4.5,67L)
class(v3)
print(v3)
When you execute this code in RStudio, it produces the output shown below: FALSE is displayed as 0.0, 4.5 is kept as it is, and the integer 67L is converted to the decimal number 67.0. We can infer from the output that even when different data types are supplied while creating a vector, the elements are converted to a single type, here numeric.
Output:
> v3=c(FALSE,4.5,67L)
> class(v3)
[1] "numeric"
> print(v3)
[1]  0.0  4.5 67.0
Let us see what happens when the character values “HELLO” and “R” are added to the same code.
v3=c(FALSE,4.5,67L,"HELLO",'R')
class(v3)
print(v3)
When you execute the above code, in which character values are combined with other types such as FALSE (logical) and 4.5 (numeric), it produces the output shown below.
Output:
> v3=c(FALSE,4.5,67L,"HELLO",'R')
> class(v3)
[1] "character"
> print(v3)
[1] "FALSE" "4.5"   "67"    "HELLO" "R"
You can infer that once the code is executed, all the values of different data types are converted to character type and stored in vector v3 as character data. R always converts the elements to the most general type present, in the order logical, integer, numeric, complex, character.
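The coercion described above can be observed one step at a time by combining just two types in each call. The sketch below also shows the explicit conversion functions from base R (as.numeric(), as.integer(), as.character()); the specific values are arbitrary examples.

```r
# Implicit coercion: the more general type wins
class(c(TRUE, 2L))       # "integer"   (logical -> integer)
class(c(2L, 4.5))        # "numeric"   (integer -> numeric)
class(c(4.5, 1+2i))      # "complex"   (numeric -> complex)
class(c(1+2i, "R"))      # "character" (anything -> character)

# Explicit coercion with the as.*() functions
as.numeric("4.5")        # 4.5
as.integer(TRUE)         # 1
as.character(67L)        # "67"

# A value that cannot be converted becomes NA (R also issues a warning,
# which is suppressed here to keep the output clean)
suppressWarnings(as.numeric("HELLO"))   # NA
```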
Note
Function | Description |
---|---|
c() | To create a vector |
names() | To attach names to vector elements |
typeof() | To determine the data type of a vector |
length() | To check the length of a vector |
is.vector() | To check whether an object is a vector |
A vector, once created, can be used in various operations such as finding the mean and sd (standard deviation) or drawing a graph.
Vector1 = c(5,45,19) #created numeric data type vector
#assigned to variable named Vector1
mean(Vector1)
sd(Vector1)
barplot(Vector1)
Output:
> mean(Vector1)
[1] 23
> sd(Vector1)
[1] 20.29778
> barplot(Vector1)
[Bar plot of Vector1: three bars with heights 5, 45, and 19]
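Several other base R functions summarize or reorder a numeric vector in the same way as mean() and sd(). The following short sketch applies a few of them to Vector1; the selection of functions is only a sample.

```r
Vector1 <- c(5, 45, 19)

sum(Vector1)      # 69
min(Vector1)      # 5
max(Vector1)      # 45
median(Vector1)   # 19
sort(Vector1)     # 5 19 45 (ascending order)
rev(Vector1)      # 19 45 5 (original order reversed)
```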
Arithmetic operations on vectors are performed elementwise: each element of one vector is combined with the element at the same position in the other vector to give the result.
v1 = c(5,6,7)
v2= c(4,4,2)
We can perform addition of two vectors v1 and v2
v3=v1+v2
print(v3)
This produces the result
[1] 9 10 9
In a similar manner, subtraction, multiplication, and division can be performed on vectors, as summarized in the table below.
Operation | Code | Output |
---|---|---|
Subtraction | v3 = v1-v2 | [1] 1 2 5 |
Multiplication | v3 = v1*v2 | [1] 20 24 14 |
Division | v3 = v1/v2 | [1] 1.25 1.50 3.50 |
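The examples above use two vectors of equal length. When the lengths differ, R reuses (recycles) the shorter vector to match the longer one. The sketch below illustrates this with a single number and with a two-element vector; the variable names doubled, recycled, and bigger are illustrative only.

```r
v1 <- c(5, 6, 7)

doubled  <- v1 * 2                        # 2 is reused for every element: 10 12 14
recycled <- c(1, 2, 3, 4) + c(10, 20)     # c(10, 20) is reused twice: 11 22 13 24
bigger   <- v1 > 5                        # comparisons are elementwise too: FALSE TRUE TRUE

print(doubled)
print(recycled)
print(bigger)
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning, so it is safest to use either equal lengths or a single value.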