Data Types and Variables

Python is a computer programming language. Unlike the natural languages we use every day, which can be understood differently depending on context, a programming language must be unambiguous, because the computer carries out exactly what the program says. Every programming language therefore has its own syntax rules, and a compiler or interpreter is responsible for translating syntactically correct code into instructions the machine can execute. Python is no exception.

Python's syntax is relatively simple and uses indentation to structure code, resulting in code that looks like the following:

python
# print absolute value of an integer:
a = 100
if a >= 0:
    print(a)
else:
    print(-a)

Lines starting with # are comments. Comments are meant for humans to read and can contain any content; the interpreter ignores them. Every other line is a statement. When a statement ends with a colon :, the indented lines that follow are considered a block of code.
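As a rough sketch of how much the interpreter relies on indentation, the snippet below stores two small programs as text and passes them to the built-in compile() function. The names good_source, bad_source, and indentation_ok are illustrative only. The exact wording of the error message differs between Python versions, so it is printed rather than assumed.

```python
# Two versions of the same program, kept as text so they can be compiled safely.
good_source = "a = -5\nif a >= 0:\n    print(a)\nelse:\n    print(-a)\n"
bad_source = "a = -5\nif a >= 0:\nprint(a)\n"  # the body of the if is not indented

# The correctly indented version compiles without complaint.
compile(good_source, "<example>", "exec")

# The version with the missing indentation is rejected before it ever runs.
try:
    compile(bad_source, "<example>", "exec")
    indentation_ok = True
except IndentationError as err:
    indentation_ok = False
    print("IndentationError:", err.msg)
```

The block after the colon is defined purely by its indentation, so removing it changes the program from valid to invalid.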

Indentation: Pros and Cons

Advantages:

  1. Forces Formatted Code: Because indentation is part of the syntax, your code is always consistently laid out. Python itself does not dictate whether to indent with spaces or tabs, but by convention you should consistently use 4 spaces.

  2. Encourages Less Indentation: Indentation encourages writing code with fewer nested levels. You are more likely to break down long pieces of code into several functions, resulting in code with less indentation.

Disadvantages:

  1. Complicates Copy and Paste: This is the most frustrating aspect. When you refactor, any code you paste must be rechecked for correct indentation. In addition, IDEs cannot reformat Python code as easily as they reformat Java code.

Final Note: Python is case-sensitive. If you incorrectly use uppercase or lowercase letters, the program will throw an error.
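A minimal illustration of case sensitivity, using the hypothetical variable names name and lookup_failed: the two spellings name and Name refer to different variables, and only the first one is defined.

```python
name = 'Alice'

try:
    print(Name)  # capital N: a different name that was never defined
    lookup_failed = False
except NameError:
    lookup_failed = True

print(lookup_failed)  # True
```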

Best Practices for Indentation

  • Python uses indentation to organize code blocks. Always adhere to the conventional practice of using 4 spaces for indentation.
  • Set your text editor to convert tabs to 4 spaces automatically, so that you never mix tabs and spaces.

Data Types

As the name suggests, computers are machines that perform mathematical calculations, so computer programs can naturally handle numerical values. However, computers can process much more than numbers: text, graphics, audio, video, web pages, and many other kinds of data. Each kind of data needs its own data type. In Python, the data types that can be handled directly include the following:

Integers

Python can handle integers of any size, including negative integers. Their representation in programs is identical to their mathematical notation, such as 1, 100, -8080, 0, and so on.

Since computers use binary, sometimes it is more convenient to represent integers in hexadecimal. Hexadecimal numbers use the 0x prefix followed by digits 0-9 and letters a-f, for example: 0xff00, 0xa5b4c3d2, etc.

For very large numbers, such as 10000000000, it is hard to count the number of zeros. Python allows underscores _ to separate digits in numbers, so 10_000_000_000 and 10000000000 are exactly the same. Hexadecimal numbers can also be written as 0xa1b2_c3d4.
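The integer notations described above can be checked directly. This is a small sketch; the variable name big is illustrative.

```python
# A hexadecimal literal is an ordinary integer written in base 16.
print(0xff00)                     # 65280
print(0xff00 == 65280)            # True

# Underscores are purely visual separators and do not change the value.
big = 10_000_000_000
print(big == 10000000000)         # True
print(0xa1b2_c3d4 == 0xa1b2c3d4)  # True

# The built-in hex() converts an integer back to hexadecimal notation.
print(hex(255))                   # 0xff
```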

Floating-Point Numbers

Floating-point numbers, or decimals, are called "floating" because the position of the decimal point can vary when the number is written in scientific notation. For example, 1.23×10^9 and 12.3×10^8 are exactly equal. Floating-point numbers can be written in ordinary mathematical notation, such as 1.23, 3.14, and -9.01. For very large or very small floating-point numbers, scientific notation is more convenient, with e standing in for the base 10. For example, 1.23×10^9 is written as 1.23e9 or 12.3e8, and 0.000012 can be written as 1.2e-5.

Integers and floating-point numbers are stored differently in a computer's memory. Integer operations are always precise (is division precise too? Yes, as explained in "Explanation of Integer Division Precision" below), whereas floating-point operations may have rounding errors.
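The following sketch shows scientific notation and a typical rounding error. The variable name total is illustrative, and math.isclose is one common way to compare floats with a tolerance instead of using ==.

```python
import math

# The same value written with the decimal point in different positions.
print(1.23e9 == 12.3e8)          # True
print(1.2e-5 == 0.000012)        # True

# 0.1 and 0.2 cannot be stored exactly in binary, so the sum is slightly off.
total = 0.1 + 0.2
print(total == 0.3)              # False

# Comparing with a tolerance is the usual workaround.
print(math.isclose(total, 0.3))  # True
```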

Strings

Strings are arbitrary text enclosed in single quotes ' or double quotes ", such as 'abc', "xyz", etc. Note that '' or "" themselves are just notations and not part of the string. Therefore, the string 'abc' consists of the characters a, b, and c. If the single quote ' itself is a character, you can use double quotes to enclose the string, for example, "I'm OK" contains the characters I, ', m, space, O, and K.

Handling Quotes Inside Strings:

If a string contains both single ' and double " quotes, you can use the escape character \ to denote them, for example:

python
'I\'m \"OK\"!'

This represents the string:

I'm "OK"!

The escape character \ can escape many characters, such as \n for a newline, \t for a tab, and the backslash \ itself must also be escaped, so \\ represents \. You can use print() in Python's interactive command line to see the escaped strings:

python
>>> print('I\'m ok.')
I'm ok.
>>> print('I\'m learning\nPython.')
I'm learning
Python.
>>> print('\\\n\\')
\
\

If many characters in a string need to be escaped, adding all those \ characters becomes cumbersome. To simplify this, Python lets you write r'' to denote a raw string, in which backslashes are treated as ordinary characters and escape sequences are not interpreted. Try it yourself:

python
>>> print('\\\t\\')
\       \
>>> print(r'\\\t\\')
\\\t\\
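One place where raw strings are commonly used is a Windows-style file path. The path below is made up for illustration; it is a sketch of the difference, not a recommendation for how to handle paths.

```python
plain = 'C:\new\table'  # \n and \t are interpreted as a newline and a tab
raw = r'C:\new\table'   # every backslash is kept as an ordinary character

print(len(plain))                # 10
print(len(raw))                  # 12
print(raw == 'C:\\new\\table')   # True
```

The raw string is equivalent to the fully escaped version, only easier to read.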

If a string contains many line breaks, using \n in a single line can be hard to read. To simplify, Python allows the use of '''...''' to represent multi-line content. Try it yourself:

python
>>> print('''line1
... line2
... line3''')
line1
line2
line3

Note: When entering multi-line content in the interactive command line, the prompt changes from >>> to ..., indicating that you can continue inputting the next lines. Remember, ... is just a prompt indicator and not part of the code.

When the ending delimiter ''' and the closing parenthesis ) are entered, the statement is executed, and the result is printed.

If you write this as a program and save it as a .py file, it would look like this:

python
print('''line1
line2
line3''')

Multi-line strings '''...''' can also be prefixed with r to denote raw strings. Please test it yourself:

python
print(r'''hello,\n
world''')
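For reference, here is a small sketch of what to expect from the raw multi-line string above. The variable name text is illustrative. The \n typed inside the string stays as two literal characters, while the real line break in the source is still a line break.

```python
text = r'''hello,\n
world'''

print(text)
# hello,\n
# world

print(len(text.splitlines()))  # 2
print('\\n' in text)           # True: a backslash followed by the letter n
```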

Boolean Values

Boolean values are identical to those in Boolean algebra, having only two possible values: True or False. In Python, you can directly use True and False to represent Boolean values (note the capitalization), or they can be derived through Boolean operations:

python
>>> True
True
>>> False
False
>>> 3 > 2
True
>>> 3 > 5
False

Boolean values can be used with and, or, and not operations.

  • and Operation (Logical AND): The result is True only if all operands are True.

    python
    >>> True and True
    True
    >>> True and False
    False
    >>> False and False
    False
    >>> 5 > 3 and 3 > 1
    True
  • or Operation (Logical OR): The result is True if at least one operand is True.

    python
    >>> True or True
    True
    >>> True or False
    True
    >>> False or False
    False
    >>> 5 > 3 or 1 > 3
    True
  • not Operation (Logical NOT): This is a unary operator that inverts the Boolean value.

    python
    >>> not True
    False
    >>> not False
    True
    >>> not 1 > 2
    True

Boolean values are frequently used in conditional statements, such as:

python
age = 20  # example value
if age >= 18:
    print('adult')
else:
    print('teenager')

None (Null Value)

None is a special value in Python representing a null value. None is not the same as 0; 0 has a meaning, whereas None signifies the absence of a value.
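A brief sketch of how None differs from 0, using the illustrative variable name result. The is operator is the conventional way to test for None.

```python
result = None

print(result == 0)     # False: None is not the number zero
print(result is None)  # True
print(type(result))    # <class 'NoneType'>
```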

In addition, Python provides other data types, such as lists and dictionaries, and also lets you create custom data types. These are covered later.

Variables

The concept of variables in programming is essentially the same as variables in middle school algebra, except that in computer programs, variables can hold not only numbers but also any data type.

In a program, a variable is represented by a variable name. A variable name is made up of uppercase and lowercase English letters, digits, and underscores _, and it cannot start with a digit. For example:

python
a = 1

Variable a is an integer.

python
t_007 = 'T007'

Variable t_007 is a string.

python
Answer = True

Variable Answer is a Boolean value True.
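The naming rule can be explored with the string method str.isidentifier(), which reports whether a piece of text has the form of a valid name. The list of candidates below is made up for illustration. Note that isidentifier() checks only the form of the name, so it also returns True for reserved words such as if.

```python
candidates = ['a', 't_007', 'Answer', '007_t', 'my-name']

# Map each candidate to whether it has the form of a valid variable name.
validity = {name: name.isidentifier() for name in candidates}

for name, ok in validity.items():
    print(name, ok)
# a True
# t_007 True
# Answer True
# 007_t False
# my-name False
```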

In Python, the equals sign = denotes assignment. You can assign a value of any data type to a variable, and the same variable can be reassigned multiple times with values of different types. For example:

python
a = 123  # a is an integer
print(a)
a = 'ABC'  # a is now a string
print(a)

Languages in which a variable's type is not fixed are called dynamic (dynamically typed) languages. In contrast, static (statically typed) languages require you to specify the variable's type when defining it, and assigning a value of a different type results in an error. For example, Java is a static language, and its assignment statements look like this (// denotes a comment):

java
int a = 123; // a is an integer variable
a = "ABC"; // Error: cannot assign a string to an integer variable

Compared to static languages, dynamic languages are more flexible for this reason.
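As a small sketch of dynamic typing in Python, the built-in type() function shows that the type belongs to the value currently assigned, not to the variable name. The names first_type and second_type are illustrative.

```python
a = 123
first_type = type(a)   # the type of the value a currently refers to

a = 'ABC'              # the same name now refers to a string
second_type = type(a)

print(first_type.__name__, second_type.__name__)  # int str
```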

Important: Do not equate the assignment operator = in programming with the equality sign in mathematics. For example, consider the following code:

python
x = 10
x = x + 2

Mathematically, x = x + 2 is impossible. In programming, the assignment statement first evaluates the expression on the right side x + 2, resulting in 12, and then assigns it to the variable x. Since x was previously 10, after reassignment, x becomes 12.

Understanding Variables in Computer Memory

Lastly, it's crucial to understand how variables are represented in computer memory. When you write:

python
a = 'ABC'

The Python interpreter does two things:

  1. Creates a string 'ABC' in memory.
  2. Creates a variable named a in memory and points it to 'ABC'.

You can also assign one variable to another. This operation essentially makes the second variable point to the same data as the first variable. For example:

python
a = 'ABC'
b = a
a = 'XYZ'
print(b)

What will the last line print? Will b be 'ABC' or 'XYZ'? If you interpret it mathematically, you might incorrectly think that b and a are the same and both should be 'XYZ'. However, b retains the value 'ABC'. Let's execute the code line by line to see what happens:

  1. Execute a = 'ABC': The interpreter creates the string 'ABC' and the variable a, pointing a to 'ABC'.

    [Figure: step-1.png]

  2. Execute b = a: The interpreter creates the variable b and points it to the same string 'ABC' that a points to.

    [Figure: step-2.png]

  3. Execute a = 'XYZ': The interpreter creates a new string 'XYZ' and changes the pointer of a to 'XYZ', but b remains pointing to 'ABC'.

    [Figure: step-3.png]

Therefore, the final print(b) statement outputs ABC.
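The three steps above can be verified with the is operator, which checks whether two names point to the same object. The names same_before and same_after are illustrative.

```python
a = 'ABC'
b = a
same_before = b is a  # both names point to the same string object

a = 'XYZ'             # a now points to a new string; b is untouched
same_after = b is a

print(same_before, same_after)  # True False
print(b)                        # ABC
```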

Constants

Constants are variables whose values should not change once assigned. The mathematical constant π is a typical example. In Python, constants are conventionally written as variable names in all uppercase letters:

python
PI = 3.14159265359

However, PI is still a variable in reality. Python does not have any mechanism to prevent PI from being changed. Using all-uppercase variable names to represent constants is merely a conventional practice. If you choose to change the value of PI, Python will not stop you.
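A short demonstration that the uppercase name is only a convention: the reassignment below runs without any error or warning.

```python
PI = 3.14159265359
PI = 3     # nothing prevents this; the "constant" is silently replaced

print(PI)  # 3
```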

Explanation of Integer Division Precision

Lastly, let's explain why integer division is precise in Python. In Python, there are two types of division:

  1. Standard Division /:

    python
    >>> 10 / 3
    3.3333333333333335

    The result of / division is a floating-point number, even if two integers divide evenly:

    python
    >>> 9 / 3
    3.0
  2. Floor Division //:

    python
    >>> 10 // 3
    3

    Floor division // of two integers always returns an integer, even if the operands do not divide evenly; the result is rounded down. To keep the fractional part, use /.

Since floor division // keeps only the whole-number quotient, Python also provides the modulus operator % to obtain the remainder of the division:

python
>>> 10 % 3
1

Whether performing // division or modulus operations with integers, the results are always integers, ensuring that integer operations are always precise.
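The relationship between // and % can be sketched with the built-in divmod() function, which returns both results at once. The names quotient and remainder are illustrative. The example also shows that // rounds down (toward negative infinity), which matters for negative operands, and that a float operand produces a float result.

```python
quotient, remainder = divmod(10, 3)
print(quotient, remainder)              # 3 1

# The quotient and remainder always reconstruct the original integer exactly.
print(quotient * 3 + remainder == 10)   # True

# Floor division rounds down, so negative results move away from zero.
print(-10 // 3)                         # -4
print(-10 % 3)                          # 2

# If either operand is a float, the result is a float.
print(10 // 3.0)                        # 3.0
```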

Exercise

Please print the values of the following variables:

python
n = 123
f = 456.789
s1 = 'Hello, world'
s2 = 'Hello, \'Adam\''
s3 = r'Hello, "Bart"'
s4 = r'''Hello,
Bob!'''

print(???)

Summary

  • Python supports multiple data types. Internally, any data can be considered an "object," and variables are used in programs to point to these data objects. Assigning a value to a variable associates the data with the variable.

  • Assigning a value to a variable, such as x = y, makes the variable x point to the actual object that y points to. Subsequent assignments to y do not affect what x points to.

  • Note: Python integers have no size limit, whereas some languages like Java have size restrictions for integers based on their storage length. For example, Java limits 32-bit integers to the range -2147483648 to 2147483647.

  • Python's floating-point numbers, by contrast, do have a limited range (roughly ±1.8e308 on typical platforms). A literal beyond that range, such as 1e309, is represented as inf (infinity).
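The last two points can be checked with a short sketch. The names big and too_big are illustrative, and the largest float value is read from sys.float_info rather than assumed, since it depends on the platform.

```python
import math
import sys

# Integers grow as large as needed and stay exact.
big = 2 ** 100
print(big)                  # 1267650600228229401496703205376

# Floats have a largest representable value on a given platform.
print(sys.float_info.max)

# A literal beyond that range becomes inf.
too_big = 1e309
print(too_big)              # inf
print(math.isinf(too_big))  # True
```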
