DATA ANALYSIS  

CRUD - Read: Retrieving Data

The Read operation in CRUD refers to retrieving data stored in a database. In relational databases, data retrieval is performed primarily using the SQL SELECT statement.

The SELECT command allows developers to query tables, filter rows, sort results, and combine data from multiple tables.

While most relational databases follow the ANSI SQL standard, systems like PostgreSQL extend it with additional features such as LIMIT and DISTINCT ON, making data retrieval more flexible and efficient.

Understanding these read operations is essential for building applications that rely on structured data stored in relational databases.

SQL keywords and concepts explained in the article:

  • SELECT – the fundamental statement used to retrieve data from a database.
  • WHERE – used for filtering records. For a more in-depth explanation, see the article Filtering Tables with WHERE.
  • ORDER BY – used to sort the result set in either ascending or descending order.
  • LIMIT / FETCH FIRST – used to restrict the number of rows returned by a query.
  • OFFSET – allows skipping a specified number of rows before returning the results.
  • GROUP BY – groups rows sharing the same values into summary rows, typically used with aggregate functions.
  • JOIN – combines rows from different tables based on a related column between them.
  • DISTINCT ON – a PostgreSQL-specific feature that retrieves unique rows based on specific columns.
  • RETURNING – a PostgreSQL extension that returns data from rows modified by an operation.

Basic SELECT Query

The simplest read operation retrieves all rows from a table.

SELECT * FROM table_name;
SELECT * FROM customer;

This query returns all columns and rows stored in the customer table.

However, using * is generally discouraged in production systems because it retrieves unnecessary data and can reduce performance.

Selecting Specific Columns

Instead of selecting all columns, it is better to specify exactly which fields are needed.

SELECT column1, column2 FROM table_name;
SELECT firstname, lastname, email FROM customer;

This query retrieves only the firstname, lastname and email columns from the table.

Filtering Data with WHERE

Often, applications need only a subset of records. The WHERE clause filters rows based on conditions.

SELECT column1, column2 FROM table_name WHERE condition;
SELECT address1, city FROM address WHERE city = 'Oxford';

This query returns only entries from address where the city is Oxford.

Logical operators such as AND, OR, and NOT can also be used to combine conditions.

Sorting Results with ORDER BY

Query results can be sorted using the ORDER BY clause.

SELECT column1 FROM table_name ORDER BY column1 ASC;
SELECT lastname, dateofbirth FROM customer ORDER BY dateofbirth DESC;

ASC sorts results in ascending order (default), while DESC sorts them in descending order.

Limiting the Number of Returned Rows

Large datasets often require returning only a subset of results.

PostgreSQL Syntax

PostgreSQL supports the LIMIT clause:

SELECT * FROM customer LIMIT 10;

This query returns only the first 10 rows.

SELECT * FROM customer LIMIT 10 OFFSET 20;

This retrieves 10 rows starting from the 21st row.

ANSI SQL Standard

The ANSI SQL standard uses FETCH FIRST instead of LIMIT.

SELECT * FROM customer FETCH FIRST 10 ROWS ONLY;

Many modern databases support both approaches, but LIMIT is commonly used in PostgreSQL.

Joining Data from Multiple Tables

Relational databases allow combining data from different tables using JOIN operations.

SELECT customer.lastname, "order".ordertimestamp, "order".total FROM customer JOIN "order" ON customer.id = "order".customer_id;

This query retrieves customer last names along with their corresponding orders and total amounts.

** Note: In PostgreSQL, "order" is a reserved keyword (used in ORDER BY), so it must be enclosed in double quotes when used as a table or column name.

Common join types include:

  • INNER JOIN — returns matching rows from both tables
  • LEFT JOIN — returns all rows from the left table and matches from the right
  • RIGHT JOIN — returns all rows from the right table and matches from the left
  • FULL JOIN — returns rows when there is a match in either table

PostgreSQL-Specific Read Features

While most read operations follow the ANSI SQL standard, PostgreSQL provides additional capabilities.

DISTINCT ON

PostgreSQL allows selecting unique rows based on a specific column using DISTINCT ON.

SELECT DISTINCT ON (customer_id) customer_id, ordertimestamp FROM "order" ORDER BY customer_id, ordertimestamp DESC;

This retrieves the most recent order for each customer. This feature is not part of the ANSI SQL standard.

** As noted before, the "order" table name must be quoted.

RETURNING Clause (PostgreSQL extension)

Although primarily used in INSERT, UPDATE, and DELETE, PostgreSQL allows returning rows affected by those operations.

UPDATE customer SET updated = NOW() WHERE id = 5 RETURNING id, updated;

This immediately returns the updated row.

Contact details:

+48 790-430-860

analysislessons@gmail.com