How Databases Work: SQL, NoSQL, and Data Storage

What Is a Database?

A database is an organized collection of structured data stored and accessed electronically. Databases are managed by software systems called Database Management Systems (DBMS), which provide mechanisms for storing, retrieving, updating, and deleting data while enforcing consistency rules and controlling concurrent access by multiple users. Databases underpin virtually every modern software application — from web applications and mobile apps to enterprise resource planning systems and scientific data repositories. The global database management system market exceeded $80 billion in revenue in 2024, reflecting the central role databases play in the information economy.

Relational Databases and SQL

The relational database model, introduced by Edgar F. Codd at IBM in 1970, organizes data into tables (also called relations) consisting of rows (records) and columns (attributes). Relationships between tables are expressed through foreign keys — columns in one table that reference the primary key of another. This model enables complex queries that join data across multiple tables. SQL (Structured Query Language) is the standardized language used to interact with relational databases. Major relational DBMS products include MySQL, PostgreSQL, Microsoft SQL Server, and Oracle Database.

Core SQL Operations

SELECT: Retrieves rows from one or more tables based on specified conditions.
INSERT: Adds new rows to a table.
UPDATE: Modifies existing rows that match specified conditions.
DELETE: Removes rows matching specified conditions.
JOIN: Combines rows from two or more tables based on a related column (INNER JOIN, LEFT JOIN, etc.).
CREATE / ALTER / DROP: DDL statements that define, modify, or remove database schema objects.
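
The operations above can be tried end to end with Python's built-in sqlite3 module. This is a minimal sketch: the authors and books tables, their columns, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database; table and column names are hypothetical examples.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE (DDL): define two tables linked by a foreign key.
cur.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute(
    "CREATE TABLE books ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " author_id INTEGER REFERENCES authors(id))"
)

# INSERT: add rows, using placeholders rather than string formatting.
cur.execute("INSERT INTO authors (id, name) VALUES (?, ?)", (1, "Ada"))
cur.executemany(
    "INSERT INTO books (title, author_id) VALUES (?, ?)",
    [("First Book", 1), ("Second Book", 1)],
)

# UPDATE: modify rows that match a condition.
cur.execute(
    "UPDATE books SET title = ? WHERE title = ?",
    ("Revised Book", "Second Book"),
)

# DELETE: remove rows that match a condition.
cur.execute("DELETE FROM books WHERE title = ?", ("First Book",))

# SELECT with an INNER JOIN: combine rows through the related column.
rows = cur.execute(
    "SELECT books.title, authors.name"
    " FROM books INNER JOIN authors ON books.author_id = authors.id"
).fetchall()
conn.commit()

print(rows)  # [('Revised Book', 'Ada')]
```

Other database systems accept broadly similar statements, though data types and placeholder syntax differ between products.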

ACID Properties

Relational databases guarantee reliable transaction processing through the ACID properties, which ensure that database operations maintain data integrity even in the event of errors or system failures:

Property	Definition	Example Guarantee
Atomicity	A transaction is all-or-nothing; partial completion is not permitted	A bank transfer either fully completes or fully rolls back
Consistency	A transaction brings the database from one valid state to another	Foreign key constraints are never violated
Isolation	Concurrent transactions do not interfere with one another; at the strictest level (serializable) they behave as if executed one at a time	Two users updating the same row do not corrupt each other's data
Durability	Once committed, a transaction's effects persist even after a system crash	Data written to disk survives a power failure
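
Atomicity from the table above can be observed with a small bank-transfer sketch using Python's sqlite3 module. The accounts table, the CHECK constraint that forbids negative balances, and the transfer helper are assumptions made for this illustration. The credit is deliberately applied before the debit so that a failed debit shows the earlier credit being rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; nothing was committed.
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


first = transfer(conn, "alice", "bob", 30)    # succeeds
second = transfer(conn, "alice", "bob", 500)  # overdraft, rolled back
print(first, second, balances(conn))
# True False {'alice': 70, 'bob': 80}
```

The second transfer leaves no trace: the credit to bob is undone together with the failed debit, which is the all-or-nothing behavior that atomicity describes.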

How Databases Store Data

Relational databases store data on disk in pages — fixed-size blocks typically 8 KB or 16 KB in size. The database engine maintains a buffer pool (in-memory cache of frequently accessed pages) to reduce disk I/O. Data is organized within pages using structures such as heap files (unordered rows), B-tree indexes, or clustered indexes (where rows are physically sorted by a key). When a transaction modifies data, changes are first written to a write-ahead log (WAL) before being applied to data pages; this ensures durability and enables crash recovery by replaying the log after a failure.
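
The log-before-data ordering described above can be sketched as a toy key-value store. This is an illustration under simplifying assumptions, not how any particular engine implements its log: the TinyStore class, the JSON record format, and the file name are hypothetical, and real systems add checkpoints, log sequence numbers, and commit records.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.pages = {}  # stands in for data pages held in memory
        self.recover()

    def put(self, key, value):
        # 1. Append the change to the log and force it to disk first.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only then apply the change to the in-memory "page".
        self.pages[key] = value

    def recover(self):
        # Replay the log in order to rebuild state after a crash.
        self.pages = {}
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as log:
                for line in log:
                    record = json.loads(line)
                    self.pages[record["key"]] = record["value"]


log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyStore(log_path)
store.put("a", 1)
store.put("a", 2)
store.put("b", 3)

# Simulate a crash: the in-memory state is lost, the log file survives.
del store
recovered = TinyStore(log_path)
print(recovered.pages)  # {'a': 2, 'b': 3}
```

Because every change reaches the log before the in-memory state, replaying the log from the start reproduces the last acknowledged value for each key.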

Database Indexing

An index is an auxiliary data structure that allows the database engine to locate rows matching a query condition without scanning every row in a table. The most common index structure is the B-tree (a self-balancing tree), which supports efficient equality and range queries in O(log n) time. A hash index offers O(1) equality lookups on average but does not support range queries. Full-text indexes enable fast keyword searches within text columns. While indexes can dramatically accelerate read queries, they impose overhead on write operations (INSERT, UPDATE, DELETE) because each index must be updated whenever the underlying data changes. Query optimizers automatically evaluate available indexes when generating execution plans.
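
The effect of an index on an execution plan can be inspected with SQLite's EXPLAIN QUERY PLAN statement, shown here through Python's sqlite3 module. The users table, the idx_users_email index, and the plan helper are hypothetical names chosen for this sketch. The exact wording of the plan text varies between SQLite versions, so no output is reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)
conn.commit()


def plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


query = "SELECT name FROM users WHERE email = ?"
args = ("user500@example.com",)

# Without an index on email, the plan reports a scan of the whole table.
before = plan(conn, query, args)

# After creating a B-tree index, the plan reports a search using that index.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(conn, query, args)

print(before)
print(after)
```

The first plan should mention a scan of users, and the second should name idx_users_email, which is the optimizer choosing the index on its own once it exists.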

NoSQL Databases

NoSQL (Not only SQL) databases were developed to address scalability and flexibility requirements that relational databases handle less efficiently. NoSQL systems generally trade some ACID guarantees for horizontal scalability and schema flexibility. They are categorized by data model:

NoSQL Type	Data Model	Example Products	Typical Use Cases
Document store	JSON/BSON documents	MongoDB, Couchbase	Content management, catalogs
Key-value store	Simple key → value pairs	Redis, DynamoDB	Caching, session storage
Wide-column store	Rows with variable columns per row	Apache Cassandra, HBase	Time-series, IoT data
Graph database	Nodes and edges (relationships)	Neo4j, Amazon Neptune	Social networks, fraud detection
Time-series database	Timestamped data points	InfluxDB, TimescaleDB (a PostgreSQL extension)	Metrics, monitoring
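
To make the data models in the table more concrete, the sketch below mimics four of them with plain Python structures. It does not use any of the listed products or their APIs; every key, field, and value is a made-up example.

```python
import json

# Document store: one self-contained, nested document per entity.
document = {
    "_id": "user:42",
    "name": "Ada",
    "orders": [{"sku": "book-1", "qty": 2}],
}

# Key-value store: an opaque value that is looked up by its key only.
key_value = {
    "session:abc123": json.dumps({"user_id": 42, "cart_items": 2}),
}

# Wide-column store: each row key maps to its own set of columns,
# and different rows may hold different columns.
wide_column = {
    "sensor-7": {"2024-01-01T00:00": 20.5, "2024-01-01T00:01": 20.7},
    "sensor-8": {"2024-01-01T00:00": 19.9},
}

# Graph database: nodes plus labelled edges between them.
nodes = {"ada": {"kind": "person"}, "bob": {"kind": "person"}}
edges = [("ada", "FOLLOWS", "bob")]

# A graph-style question: who follows bob?
followers_of_bob = [
    src for src, label, dst in edges if label == "FOLLOWS" and dst == "bob"
]

# A key-value lookup: fetch the value by key, then decode it in the application.
session = json.loads(key_value["session:abc123"])

print(followers_of_bob, session["user_id"])  # ['ada'] 42
```

The shapes hint at why each model suits its typical use case: nested documents keep related data together, key-value pairs favor fast lookups by key, and edges make relationships directly queryable.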

The CAP Theorem

The CAP theorem, formulated by Eric Brewer in 2000 and formally proved in 2002, states that a distributed data store can provide at most two of the following three guarantees simultaneously: Consistency (all nodes see the same data at the same time), Availability (every request receives a response), and Partition tolerance (the system continues operating even when network partitions separate nodes). Since network partitions are unavoidable in distributed systems, designers must choose between prioritizing consistency (CP systems, e.g., HBase, ZooKeeper) or availability (AP systems, e.g., Cassandra, CouchDB) during a partition event. The later PACELC theorem extended this analysis to also consider latency vs. consistency trade-offs when the network is operating normally.
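
The choice between consistency and availability during a partition can be illustrated with a toy two-replica model. The Cluster and Replica classes and the "CP" and "AP" mode flags are hypothetical; real systems use quorums, conflict resolution, and repair mechanisms that this sketch leaves out.

```python
class Replica:
    """A single node holding its own copy of the data."""

    def __init__(self):
        self.data = {}


class Cluster:
    """Two replicas with a switchable network partition (toy model)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False

    def write(self, replica_index, key, value):
        """Try to write through one replica; return True if accepted."""
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for replica in self.replicas:
                replica.data[key] = value
            return True
        if self.mode == "CP":
            # Consistency first: refuse rather than let replicas diverge.
            return False
        # Availability first: accept locally; replicas diverge for now.
        self.replicas[replica_index].data[key] = value
        return True

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)


cp, ap = Cluster("CP"), Cluster("AP")
for cluster in (cp, ap):
    cluster.write(0, "x", 1)      # replicated to both nodes
    cluster.partitioned = True    # the network splits

cp_accepted = cp.write(0, "x", 2)  # rejected: stays consistent
ap_accepted = ap.write(0, "x", 2)  # accepted: replicas now disagree

print(cp_accepted, cp.read(0, "x"), cp.read(1, "x"))  # False 1 1
print(ap_accepted, ap.read(0, "x"), ap.read(1, "x"))  # True 2 1
```

The CP cluster gives up availability for the write but both replicas still agree, while the AP cluster stays writable at the cost of temporarily inconsistent reads.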

NewSQL and Distributed SQL

NewSQL databases attempt to combine the horizontal scalability of NoSQL with the full ACID guarantees of relational databases. Examples include Google Spanner (whose TrueTime API uses atomic clocks and GPS receivers to bound clock uncertainty, allowing it to order transactions consistently across data centers), CockroachDB, and YugabyteDB. These systems use consensus algorithms such as Raft or Paxos to achieve distributed agreement on transaction commits. NewSQL systems suit organizations that need both transactional integrity and the ability to scale across multiple data centers or cloud regions.
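
The core idea shared by consensus-based commits, that a change counts as committed only once a majority of nodes has recorded it, can be sketched as follows. This is not Raft or Paxos: leader election, terms, and log repair are all omitted, and the Node class and replicate function are hypothetical names used only for illustration.

```python
class Node:
    """A cluster member that may or may not be reachable."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.log = []

    def append(self, entry):
        """Record the entry and acknowledge it, if the node is reachable."""
        if not self.reachable:
            return False
        self.log.append(entry)
        return True


def replicate(nodes, entry):
    """Send an entry to every node; commit only on a majority of acks."""
    acks = sum(1 for node in nodes if node.append(entry))
    return acks > len(nodes) // 2


# Three nodes, one unreachable: two acknowledgements form a majority.
healthy = [Node("n1"), Node("n2"), Node("n3", reachable=False)]
committed = replicate(healthy, "tx-1")

# Three nodes, two unreachable: one acknowledgement is not a majority.
degraded = [Node("n1"), Node("n2", reachable=False), Node("n3", reachable=False)]
rejected = replicate(degraded, "tx-2")

print(committed, rejected)  # True False
```

Requiring a majority means any two successful commits share at least one node, which is what lets a cluster of 2f + 1 nodes tolerate f failures without losing committed data.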

