In this tutorial, you will learn what sequential data types are, how they are classified, and which operations they support in the Python programming language.
In Python, sequence types are one of the fundamental groups of built-in data types, alongside numeric, mapping, instance, and exception types.
A sequence can be defined as a collection of objects arranged in a particular order such that each of them follows the other.
Python has six sequential data types: String, List, Tuple, Range, Bytes, and Bytearray. Among these, the most important are String, List, and Tuple.
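As a rough illustration of the six sequence types named above, the sketch below builds one value of each type and shows that all of them support len() and integer indexing. The dictionary name samples and its contents are chosen for illustration only.

```python
# One example value for each built-in sequence type.
samples = {
    "str": "abc",
    "list": [1, 2, 3],
    "tuple": (1, 2, 3),
    "range": range(1, 4),
    "bytes": b"abc",
    "bytearray": bytearray(b"abc"),
}

for name, value in samples.items():
    # Every sequence has a length and can be indexed starting from 0.
    print(name, len(value), value[0])
```

Note that indexing bytes or a bytearray gives an integer (97 for the letter a), not a one-character string.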
In programming, a sequence of characters is called a string.
Characters can be letters, digits, symbols, spaces, or punctuation marks. Anything (visible or invisible) that can be typed on a computer is considered a character. For instance, Python is a string with six characters, all of which are alphabetic letters.
A computer, being a machine, understands only binary language (0 or 1), unlike human beings. Characters that are readable to the user therefore have to be converted to machine language. In the computer, every character that appears on the screen is implicitly converted into a binary combination of 0s and 1s. This conversion is termed encoding, and the conversion of binary back to characters is called decoding. Two commonly accepted encoding standards are ASCII and Unicode.
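The encoding and decoding round trip described above can be sketched in a few lines. This is only an illustration: the choice of the letter A and of the UTF-8 codec is an assumption made for the example.

```python
text = "A"

# ord() gives the Unicode code point of a single character.
code_point = ord(text)

# format(..., "08b") renders that number as an 8-digit binary string.
bits = format(code_point, "08b")

# Encoding turns a string into bytes; decoding turns bytes back into a string.
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")

print(code_point, bits, encoded, decoded)
```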
In the Python programming language, a string is an ordered sequence of Unicode characters. One important feature of the string is its immutability, which means the contents of a string cannot be altered after it is created.
Python strings are defined by enclosing text in either double quotes ("...") or single quotes ('...').
Str1 = 'Hello'
Str2 = "Welcome"
print(Str1)
print(Str2)
Output:
Hello
Welcome
In Python, triple quotes are used to represent multiline strings. We can use either three successive single quotes ('''...''') or three successive double quotes ("""...""") to span multiple lines.
Str3 = '''...
Python programming Language
multiline in three single quotes
... '''
Str4 =""".....
Multiline string example
in three double quotes
....."""
print(Str3)
print(Str4)
Output:
...
Python programming Language
multiline in three single quotes
...
.....
Multiline string example
in three double quotes
.....
Like many other programming languages, Python treats a string as an array-like sequence of Unicode characters. We can access characters either individually or collectively, using indexing or slicing.
Indexing is the process of numbering the position of each character in a sequence so that the character can be looked up easily. The index must always be an integer.
Because a string is a sequence of Unicode characters, each character position is indexed with a corresponding number, as in an array. Python supports two types of indexing: positive indexing, which counts from the head starting at 0, and negative indexing, which counts from the tail starting at -1.
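For instance, PYTHON is a string with an array of characters P, Y, T, H, O, and N. The length of the string is 6. Positive and negative indexing for the string PYTHON is shown in the table below.
Character | P | Y | T | H | O | N |
---|---|---|---|---|---|---|
Positive index | 0 | 1 | 2 | 3 | 4 | 5 |
Negative index | -6 | -5 | -4 | -3 | -2 | -1 |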
Using the index number, we can easily access characters individually with the index operator [ ]. The example below shows the extraction of a single character through positive indexing.
S = 'PYTHON'
print('S[0] = ',S[0])
print('S[4] = ',S[4])
Output:
S[0] = P
S[4] = O
Suppose we have a long string and want to locate a character near the end. Python supports backward counting from the tail to the head. Negative indexing always starts with -1.
S = 'PYTHON'
print('S[-1] = ',S[-1])
print('S[-5] = ',S[-5])
Output:
S[-1] = N
S[-5] = Y
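To make the relationship between the two numbering schemes concrete, the following sketch checks that S[-k] and S[len(S) - k] refer to the same character for every valid k. The list name pairs is a hypothetical helper used only for display.

```python
S = "PYTHON"

# For every valid k, the negative index -k names the same character
# as the positive index len(S) - k.
for k in range(1, len(S) + 1):
    assert S[-k] == S[len(S) - k]

# Each tuple holds (positive index, negative index, character).
pairs = [(i, i - len(S), S[i]) for i in range(len(S))]
print(pairs)
```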
Two common types of error encountered while indexing in Python are:
TypeError, raised when the index is a float, a complex number, or any other non-integer type.
IndexError, raised when attempting to access a position beyond the range of the string.
S = 'PYTHON'
print('S[4.5] = ',S[4.5]) # Raises TypeError because the index is a float
print('S[41] = ',S[41]) # Raises IndexError because the index is out of range
Output (each statement run on its own):
print('S[4.5] = ',S[4.5])
TypeError: string indices must be integers
print('S[41] = ',S[41])
IndexError: string index out of range
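When an index comes from user input or a calculation, both errors can be caught instead of letting the program stop. The helper char_at below is a hypothetical function written for this sketch, not part of Python.

```python
def char_at(text, index):
    """Return text[index], or None if the index is invalid (hypothetical helper)."""
    try:
        return text[index]
    except (TypeError, IndexError) as exc:
        # Report which of the two indexing errors occurred.
        print(type(exc).__name__, "-", exc)
        return None


S = "PYTHON"
print(char_at(S, 4))    # a valid integer index
print(char_at(S, 4.5))  # a float index triggers TypeError
print(char_at(S, 41))   # an out-of-range index triggers IndexError
```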
So far we have discussed how to access a single character from a string using the index operator []. We can also extract several characters at once from a string by using the range slice operator [:].
Slicing, as its name indicates, cuts a sequence into sections. For a Python string, the index operator [ ] is used to access substrings of length one, while the range slice operator [:] is used to access chunks of characters, that is, substrings of arbitrary length.
The syntax for Range Slice can be represented as
S[m : n]
where,
S: String
m: starting index
n: ending index
S[m:n] returns the substring from the index m to n, but excluding the index n.
For better understanding find below a visualization of Range slicing. Each character is placed between the indices. For example, character P is placed between 0 and 1, ‘PY’ is placed between 0 and 2.
The table below illustrates slicing with positive indexing for the string S = "python world".
Slicing expression | Length of substring | Output | Remarks |
---|---|---|---|
S[0:5] | 5 | pytho | returns the characters from index 0 to 4; index 5 is not included |
S[0:6] | 6 | python | returns the characters from index 0 to 5; index 6 is excluded |
S[7:11] | 4 | worl | returns the characters from index 7 to 10; index 11 is excluded |
S[6:12] | 6 | " world" | returns the characters from index 6 (the space) to 11; index 12 is omitted |
S[:8] | 8 | python w | by default starts from the head; index 8 is omitted |
S[8:] | 4 | orld | by default runs to the tail |
S ="python world"
print("S[0:5] = ",S[0:5])
print("S[0:6] = ",S[0:6])
print("S[7:11] = ",S[7:11])
print("S[6:12] = ",S[6:12])
print("S[:8] = ",S[:8])
print("S[8:] = ",S[8:])
Output:
S[0:5] = pytho
S[0:6] = python
S[7:11] = worl
S[6:12] = world
S[:8] = python w
S[8:] = orld
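A few properties of S[m:n] follow from the rule that index n is excluded. The sketch below checks them for the same example string; the variable names piece, k, and clipped are assumptions made for illustration.

```python
S = "python world"

# The length of S[m:n] is n - m when both indices lie inside the string.
m, n = 7, 11
piece = S[m:n]
print(piece, len(piece) == n - m)

# Splitting at any position k and rejoining gives back the original string.
k = 6
print(S[:k] + S[k:] == S)

# Unlike single-character indexing, a slice bound past the end is
# clipped to the string length instead of raising IndexError.
clipped = S[8:100]
print(clipped)
```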
The table below illustrates slicing with negative indexing for the same string.
Slicing expression | Length of substring | Output | Remarks |
---|---|---|---|
S[-12:-7] | 5 | pytho | returns the characters from position -12 to -8; position -7 is omitted |
S[-7:-3] | 4 | n wo | returns the characters from position -7 to -4; position -3 is omitted |
S[-12:] | 12 | python world | starts at position -12 (the head) and by default runs to the tail |
S[:-7] | 5 | pytho | by default starts from the head and stops before position -7 |
S ="python world"
print("S[-12:-7] = ",S[-12:-7])
print("S[-7:-3] = ",S[-7:-3])
print("S[-12:] = ",S[-12:])
print("S[:-7] = ",S[:-7])
Output:
S[-12:-7] = pytho
S[-7:-3] = n wo
S[-12:] = python world
S[:-7] = pytho
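A negative slice can always be rewritten with positive indices by adding the length of the string to each bound. The following sketch verifies this for one of the expressions above; the names size, neg, pos, and last_five are illustrative.

```python
S = "python world"
size = len(S)  # 12 characters

# S[-7:-3] selects the same characters as S[size - 7:size - 3], i.e. S[5:9].
neg = S[-7:-3]
pos = S[size - 7:size - 3]
print(repr(neg), repr(pos), neg == pos)

# A common use of negative slicing: take the last few characters.
last_five = S[-5:]
print(last_five)
```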
One of the special features of the string is its immutability. We cannot modify a string once it is created. However, we can update a variable by reassigning it to another string.
V = 'Python World'
print('Initially Variable V is assigned to :',V)
V = ' Python programming'
print("Variable V is reassigned to :",V)
Output:
Initially Variable V is assigned to : Python World
Variable V is reassigned to : Python programming
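The difference between modifying a string and reassigning a variable can be seen in a short sketch. It assumes we want to change the first letter of the string; the names v, w, and error_message are illustrative.

```python
v = "Python World"

try:
    # Strings do not support item assignment, so this raises TypeError.
    v[0] = "J"
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# The usual workaround is to build a new string from pieces of the old one.
w = "J" + v[1:]
print(w)
print(v)  # the original string is unchanged
```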
Similarly, we can delete an entire string with the del keyword, but removing individual characters from a string is not allowed in Python. The syntax for deleting a string is as follows:
del variablename
V = 'Python World'
del V
print('V =',V)
Output:
print('V =',V)
NameError: name 'V' is not defined
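The restriction on removing individual characters can be demonstrated as follows. This is a small sketch; the names delete_error and shorter are assumptions made for the example.

```python
V = "Python World"

try:
    # Deleting a single character is not allowed on an immutable string.
    del V[0]
except TypeError as exc:
    delete_error = str(exc)
    print("TypeError:", delete_error)

# To drop a character, create a new string that leaves it out.
shorter = V[1:]
print(shorter)
```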
Python has several built-in functions that perform specific tasks with strings. The most commonly used string function is len(). Apart from len(), the enumerate() function is also widely used; we will discuss it in later tutorials.
The len() function gives the length of the string by counting the number of characters present in it.
#String Function
text ='Python World' # avoid naming a variable str, which hides the built-in type
print('Length of the string is',len(text))
Output:
Length of the string is 12
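As a brief preview of the two built-in functions mentioned above, the sketch below uses len() to count characters and enumerate() to pair each character with its index. The variable names are illustrative.

```python
text = "Python"

# len() counts the characters in the string.
length = len(text)
print(length)

# enumerate() yields (index, character) pairs while iterating.
indexed = list(enumerate(text))
for position, character in indexed:
    print(position, character)
```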
Python strings support a variety of operations, each with its own operator. The most important and common string operators are explained in this section.
The membership operator is used to check whether a substring exists in a string. The result is either True or False. There are two membership operators in Python: in, which returns True if the substring exists and False otherwise, and not in, which returns the opposite.
#Membership operator
text = 'Python Member' # avoid naming a variable str, which hides the built-in type
A ='M' in text
B='Me' not in text
print(A)
print(B)
Output:
True
False
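Two details of the membership test are worth checking in practice: the comparison is case-sensitive, and the empty string is contained in every string. The dictionary name checks below is an assumption made for this sketch.

```python
text = "Python Member"

checks = {
    "'Mem' in text": "Mem" in text,          # exact match exists
    "'mem' in text": "mem" in text,          # case differs, so no match
    "'xyz' not in text": "xyz" not in text,  # substring is absent
    "'' in text": "" in text,                # empty string is always found
}

for expression, result in checks.items():
    print(expression, "->", result)
```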
One of the fundamental string operations is concatenation. Concatenation refers to the joining of two or more strings to form a new string. The concatenation operator in Python is the plus symbol (+). Note that with numeric operands the plus sign performs addition, while with strings it acts as a string joiner.
#String Concatenation using +
s1 = 'Python'
s2 = 'World'
print('String after concatenation :',s1+s2)
Output:
String after concatenation : PythonWorld
Similarly, using the * symbol we can repeat a string multiple times.
#String repetition using *
s1 = 'Python'
print('s1*3 =',s1*3)
Output:
s1*3 = PythonPythonPython
Note: As strings are immutable, concatenation creates a new string, which must be assigned to a variable in order to be stored.
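The note above can be checked with a short sketch: evaluating s1 + s2 without an assignment leaves both operands untouched, and the result survives only if it is stored. The names joined and repeated are illustrative.

```python
s1 = "Python"
s2 = "World"

# The result of this expression is discarded; s1 and s2 are unchanged.
s1 + s2
print(s1, s2)

# Assigning the result to a name keeps the new string.
joined = s1 + " " + s2
print(joined)

# The same applies to repetition with *.
repeated = "-" * 10
print(repeated)
```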
Another point to note is that Python does not perform implicit string conversion. Hence, concatenating a string with a non-string type such as a number or a Boolean results in a TypeError.
s1 = 'Python'
print( s1+3)
Output:
print( s1+3)
TypeError: can only concatenate str (not "int") to str
Note: Python can concatenate a string only with another string.
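One common way around this restriction is to convert the other value to a string explicitly. The sketch below shows two options, str() and an f-string; the variable names are chosen for illustration.

```python
s1 = "Python"
version = 3

# Option 1: convert the number with str() before concatenating.
with_str = s1 + str(version)

# Option 2: let an f-string perform the conversion.
with_fstring = f"{s1}{version}"

print(with_str, with_fstring)
```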
Let us recall that in Python a string is delimited by either single quotes or double quotes. What happens when we try to print plain text like It's a "python program", which already contains both a single quote and double quotes? A SyntaxError occurs when the text is interpreted.
One way to solve this problem is to use triple quotes, either three consecutive single quotes or three consecutive double quotes. The other option is to use escape characters.
An escape character, as its name indicates, escapes a special character such as a single or double quote in a string. The backslash (\) is the escape character in Python strings. In other words, an escape character lets you treat a special character as an ordinary character.
For instance,
'It\'s a \"python program\"' produces the text It's a "python program" in Python. In this example, each quote character is prefixed with a backslash to avoid the SyntaxError, which allows the quotes to be printed as part of the text.
#String formatting
print('''it's a "python program"!!!''') #Triple Quotes
print('it\'s a \"python program\"!!!') #Escape Character
Output:
it's a "python program"!!!
it's a "python program"!!!
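The SyntaxError described above can be observed safely by asking Python to compile a piece of source text rather than running it directly. This is only a sketch: the helper compiles and the two sample source strings are hypothetical names introduced for the example.

```python
def compiles(source):
    """Return True if source is valid Python, False on SyntaxError (hypothetical helper)."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# The apostrophe in It's ends the string early, so this source is invalid.
broken_source = "print('It's a program')"

# Escaping the apostrophe with a backslash makes the same text valid.
fixed_source = "print('It\\'s a program')"

print(compiles(broken_source))
print(compiles(fixed_source))
```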
The backslash is also used to denote whitespace characters such as tab, newline, and carriage return.
print('Straw\tBerry')
print("Mul\nBerry")
Output:
Straw   Berry
Mul
Berry
Additional escape sequences in Python are listed below.
Escape Formats | Specification |
---|---|
\' | Single quote |
\" | Double quote |
\n | Newline or linefeed |
\t | Horizontal tab |
\v | Vertical tab |
\r | Carriage return |
\b | Backspace |
\a | Bell |
\f | Formfeed |
\\ | Backslash |
\ooo | Character with octal value ooo |
\xhh | Character with hexadecimal value hh |
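A few of the escape sequences in the table can be checked directly. In the sketch below, the octal and hexadecimal forms both spell the letter H, and each escape sequence counts as a single character. The variable names are illustrative.

```python
# Octal 110 and hexadecimal 48 both equal decimal 72, the code for H.
octal_h = "\110"
hex_h = "\x48"
print(octal_h, hex_h, octal_h == hex_h)

# An escape sequence is written with two or more symbols
# but is stored as one character.
tab_length = len("\t")
backslash_length = len("\\")
print(tab_length, backslash_length)

# Tab and newline used together to lay out a small grid.
print("col1\tcol2\nrow1\trow2")
```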
Another feature of Python is the ability to treat a backslash as a normal character by using a raw string. A raw string is simply a normal string literal prefixed with the letter 'r' or 'R'. Unlike in an ordinary string, the backslash does not introduce an escape sequence in a raw string.
#Raw String Example
print("Hi\tWelcome To \n PYTHON \x48 WORLD! ")
print(R"Hi\tWelcome To \n PYTHON \xWORLD! ")
Output:
Hi      Welcome To
 PYTHON H WORLD!
Hi\tWelcome To \n PYTHON \xWORLD!
To understand raw strings clearly, let us examine the above example. The first string contains three escape sequences, \t, \n, and \x48, which denote a tab, a newline, and the character with hexadecimal value 48 (the letter H). You can see the result accordingly. The second string is marked as a raw string and also contains three backslash sequences: \t, \n, and \x. Here \x is incomplete and has no valid meaning. Even so, the program prints the string without raising any error, because a raw string does not process escape sequences in the literal. The case is different if the string is not marked as a raw string: it raises an error, as shown below.
print("Hi\tWelcome To \n PYTHON \x WORLD! ")
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 25-26: truncated \xXX escape
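The effect of the raw prefix is easiest to see by comparing lengths. In the sketch below, the Windows-style path is a hypothetical value chosen because it contains \n and \t, which an ordinary string would turn into a newline and a tab.

```python
# In a normal string, \n is one newline character.
normal = "a\nb"
# In a raw string, \n stays as two characters: a backslash and the letter n.
raw = r"a\nb"
print(len(normal), len(raw))

# Hypothetical path used for illustration; the raw prefix keeps
# every backslash as a literal character.
raw_path = r"C:\new\table.txt"
print(raw_path)
```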
To manipulate strings efficiently, Python provides a wide set of built-in methods and functions, which are tabulated below.
Sr.No. | Methods | Description |
---|---|---|
1 | capitalize() | Capitalizes first letter of string |
2 | center(width[, fillchar]) | Returns a string padded with fillchar (a space by default) with the original string centered. |
3 | count(str, beg, end) | Counts the occurrences of str in the string, or in a substring of the string if starting index beg and ending index end are given. |
4 | decode(encoding,errors) | Decodes bytes into a string using the codec registered for encoding. In Python 3 this is a method of bytes objects, not of str. |
5 | encode(encoding,errors) | Returns the string encoded as a bytes object; by default an encoding failure raises UnicodeEncodeError, a subclass of ValueError. |
6 | endswith(suffix, beg, end) | Determines and returns True if string or a substring of string (if starting index beg and ending index end are given) ends with suffix; otherwise False |
7 | expandtabs(tabsize) | Expands tabs in string to multiple spaces; by default tabsize expands to 8 spaces. |
8 | find(str, beg, end) | Returns the index of str if it is present in the string or its substring and -1 otherwise. |
9 | index(str, beg, end) | Returns the index like find(), but raises ValueError if str is not found. |
10 | isalnum() | Returns True if the string is non-empty and all of its characters are alphanumeric, and False otherwise. |
11 | isalpha() | Returns True if the string is non-empty and all of its characters are alphabetic, and False otherwise. |
12 | isdigit() | Returns True if the string is non-empty and all of its characters are digits, and False otherwise. |
13 | islower() | Returns True if the string has at least one cased character and all cased characters are lowercase, and False otherwise. |
14 | isnumeric() | Returns True if the string is non-empty and contains only numeric characters, and False otherwise. |
15 | isspace() | Returns True if the string is non-empty and contains only whitespace characters, and False otherwise. |
16 | istitle() | Returns True if the string is properly "titlecased", and False otherwise. |
17 | isupper() | Returns True if the string has at least one cased character and all cased characters are uppercase, and False otherwise. |
18 | join(iterable) | Joins the strings in iterable, using the string it is called on as the separator. |
19 | len(string) | Built-in function (not a method) that returns the number of characters in the string. |
20 | ljust(width[, fillchar]) | Returns the original string left-justified in a string of length width, padded with fillchar (a space by default). |
21 | lower() | Converts all uppercase letters in string to lowercase. |
22 | lstrip() | Removes all leading whitespace in the string. |
23 | maketrans() | Static method that returns a translation table to be used with translate(). |
24 | max(str) | Built-in function that returns the character with the highest code point in the string str. |
25 | min(str) | Built-in function that returns the character with the lowest code point in the string str. |
26 | replace(old, new[, count]) | Replaces all occurrences of old in string with new, or at most count occurrences if count is given. |
27 | rfind(str, beg, end) | Same as find(), but searches from the end and returns the highest matching index. |
28 | rindex(str, beg, end) | Same as index(), but searches from the end and returns the highest matching index. |
29 | rjust(width[, fillchar]) | Returns the original string right-justified in a string of length width, padded with fillchar (a space by default). |
30 | rstrip() | Removes all trailing whitespace of string. |
31 | split(sep=None, maxsplit=-1) | Splits the string at sep (runs of whitespace by default) and returns a list of the parts; at most maxsplit splits are made if maxsplit is given. |
32 | splitlines(keepends=False) | Splits the string at line boundaries and returns a list of lines; line breaks are removed unless keepends is True. |
33 | startswith(prefix, beg, end) | Returns True if the string, or the substring between beg and end if they are given, starts with prefix; otherwise False. |
34 | strip([chars]) | Performs both lstrip() and rstrip() on string. |
35 | swapcase() | Inverts case for all letters in string. |
36 | title() | Returns "titlecased" format of string, means, all words begin with uppercase and the remaining are lowercase. |
37 | translate(table) | Translates the string according to the translation table, typically one created by maketrans(); characters mapped to None are removed. |
38 | upper() | Converts all lowercase letters in string to uppercase. |
39 | zfill(width) | Returns the original string left-padded with zeros to a total of width characters; intended for numbers, zfill() keeps any leading sign in place. |
40 | isdecimal() | Returns True if the string is non-empty and contains only decimal characters, and False otherwise. |
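Two behaviours from the table are easy to confuse, so a short sketch may help: find() reports a missing substring with -1 while index() raises ValueError, and the is...() tests require every character to match, not just one. The variable names below are illustrative.

```python
text = "Python world"

# find() returns the position of the match, or -1 when there is none.
found = text.find("world")
missing = text.find("java")
print(found, missing)

# index() behaves like find() but raises ValueError when there is no match.
try:
    text.index("java")
    index_failed = False
except ValueError:
    index_failed = True
print(index_failed)

# The is...() tests check every character and are False for an empty string.
results = ("abc123".isalnum(), "abc 123".isalnum(), "".isalpha())
print(results)
```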
Examples of some of the most frequently used methods, such as lower(), upper(), join(), split(), format(), and replace(), are given below.
S = 'Python world'
print(S.lower())
Output:
python world
S = 'Python world'
print(S.upper())
Output:
PYTHON WORLD
S = 'Python world'
print(S.replace('world','program'))
Output:
Python program
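The remaining methods named above, split(), join(), and format(), can be sketched in the same way. The sample sentence and variable names are assumptions made for this illustration.

```python
sentence = "Python world of strings"

# split() with no arguments breaks the string at whitespace.
words = sentence.split()
print(words)

# join() glues the pieces back together with the chosen separator.
joined = "-".join(words)
print(joined)

# format() fills the {} placeholders in order with its arguments.
message = "{} has {} words".format("Sentence", len(words))
print(message)
```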