Skip to content
On this page

Accessing Databases

When a program runs, data is stored in memory. Once the program terminates, it is often necessary to save this data to a disk, whether on a local drive or a remote server. The data ultimately ends up in a disk file.

Defining the storage format for the data is a significant concern. For example, if we want to save a grade report for a class of students:

NameScore
Michael99
Bob85
Bart59
Lisa87

We could store this data in a text file, with each student on a separate line and values separated by commas:

Michael,99
Bob,85
Bart,59
Lisa,87

Alternatively, we could use a JSON format:

json
[
    {"name": "Michael", "score": 99},
    {"name": "Bob", "score": 85},
    {"name": "Bart", "score": 59},
    {"name": "Lisa", "score": 87}
]

However, defining custom formats comes with problems:

  1. Storing and retrieving data requires custom implementations. While JSON is standardized, other formats can vary widely.
  2. Quick searches are not possible. All data must be loaded into memory for searching, which may not be feasible if the data size exceeds available memory.

To facilitate data saving, retrieval, and quick queries, specialized software called databases (Database) was developed.

Evolution of Databases

The history of database software dates back to the 1950s. After progressing through network and hierarchical databases, the relational database emerged in the 1970s, based on relational models.

While relational models are mathematically complex, their concepts are easy to understand. For instance, consider a school:

Suppose an elementary school has three grades. The grades can be represented in a table:

Grade
First
Second
Third

Each grade has multiple classes, which can be shown in another table:

Class IDGrade IDClass Name
111First Grade 1
121First Grade 2
131First Grade 3

This table relationship, where each row in the Grade table corresponds to multiple rows in the Class table, is the foundation of relational databases.

An SQL query to find all classes for a specific grade might look like:

sql
SELECT * FROM classes WHERE grade_id = '1';

The result is another table:

Grade IDClass IDName
111First Grade 1
112First Grade 2
113First Grade 3

Similarly, each row in the Class table can relate to multiple rows in a Student table.

For an in-depth introduction to relational databases and basic SQL, refer to an SQL tutorial.

NoSQL

You may have heard of NoSQL databases, which often claim to outperform relational databases in speed and scalability. However, understanding SQL is fundamental before diving into NoSQL, as both serve different purposes and have distinct use cases.

Types of Databases

When using a relational database, choosing a suitable one is necessary. The most commonly used relational databases include:

  • Commercial Databases (Paid):

    • Oracle: High-end and widely used.
    • SQL Server: Microsoft's product, optimized for Windows.
    • DB2: An IBM offering, often associated with enterprise solutions.
    • Sybase: Once a partner with Microsoft, but now less prevalent.

    Paid databases have the advantage of vendor support for troubleshooting, which may be essential in enterprise environments. However, for web applications requiring thousands of database servers, companies like Google, Facebook, and others often opt for free, open-source databases.

  • Free and Open-Source Databases:

    • MySQL: The most popular choice for general use.
    • PostgreSQL: Academically oriented but highly capable.
    • SQLite: An embedded database suitable for desktop and mobile apps.

For a Python developer, choosing between SQLite and MySQL is common. SQLite is ideal for embedded database use as it requires no installation. For production environments, MySQL or PostgreSQL is recommended.

Accessing Databases has loaded