關係代數是數據庫查詢的理論基礎,MySQL的SQL語言實現了關係代數的各種操作。以下是詳細講解:

一、關係代數基礎與MySQL對應

1. 基本關係代數運算符

關係代數

MySQL對應

説明

σ (選擇)

WHERE

行選擇

π (投影)

SELECT

列選擇

× (笛卡爾積)

CROSS JOIN

所有組合

⋈ (連接)

JOIN

表連接

∪ (並)

UNION

合併結果

∩ (交)

INTERSECT (MySQL 8.0+)

交集

- (差)

EXCEPT (MySQL 8.0+)

差集

二、選擇操作 (σ - Selection)

基礎語法

-- 關係代數:σ_{條件}(R)
-- MySQL:SELECT * FROM R WHERE 條件

-- 示例:選擇年齡大於30的員工
SELECT * FROM employees WHERE age > 30;

複雜條件選擇

-- 多個條件的AND
SELECT * FROM employees 
WHERE department = 'Sales' AND salary > 50000;

-- 多個條件的OR
SELECT * FROM products 
WHERE category = 'Electronics' OR price < 100;

-- IN操作符(集合選擇)
SELECT * FROM customers 
WHERE country IN ('USA', 'Canada', 'UK');

-- BETWEEN範圍選擇
SELECT * FROM orders 
WHERE order_date BETWEEN '2023-01-01' AND '2023-12-31';

-- LIKE模式匹配
SELECT * FROM products 
WHERE product_name LIKE '%Laptop%';

三、投影操作 (π - Projection)

基礎語法

-- 關係代數:π_{列1,列2,...}(R)
-- MySQL:SELECT 列1, 列2, ... FROM R

-- 示例:選擇部分列
SELECT employee_id, first_name, last_name FROM employees;

-- 去重投影(DISTINCT)
SELECT DISTINCT department FROM employees;

-- 計算列投影
SELECT 
    product_name,
    price,
    price * 0.9 as discounted_price,
    CONCAT('Category: ', category) as description
FROM products;

投影與選擇的組合

-- π_{name, salary}(σ_{dept='IT'}(employees))
SELECT name, salary 
FROM employees 
WHERE department = 'IT';

四、笛卡爾積 (× - Cartesian Product)

基礎用法

-- 關係代數:R × S
-- MySQL:FROM R CROSS JOIN S 或 FROM R, S

-- 顯式笛卡爾積
SELECT * FROM table1 CROSS JOIN table2;

-- 隱式笛卡爾積(不推薦)
SELECT * FROM table1, table2;

-- 實際示例:生成所有組合
SELECT 
    s.size,
    c.color,
    p.product_name
FROM sizes s
CROSS JOIN colors c
CROSS JOIN products p;

五、連接操作 (⋈ - Join)

1. θ-連接(條件連接)

-- 關係代數:R ⋈_{條件} S
-- MySQL:FROM R JOIN S ON 條件

SELECT *
FROM employees e
JOIN departments d ON e.department_id = d.dept_id
WHERE e.salary > 50000;

2. 自然連接

-- MySQL沒有直接的自然連接,但可以模擬
-- 自動匹配相同列名
SELECT *
FROM employees e
JOIN departments d USING (department_id);

-- 或者
SELECT *
FROM employees e
JOIN departments d ON e.department_id = d.department_id;

3. 等值連接

-- 最常用的連接類型
SELECT 
    c.customer_name,
    o.order_date,
    o.amount
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id;

4. 外連接

-- 左外連接
SELECT 
    e.employee_name,
    d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.dept_id;

-- 右外連接
SELECT 
    d.department_name,
    e.employee_name
FROM departments d
RIGHT JOIN employees e ON d.dept_id = e.department_id;

-- 全外連接(MySQL模擬)
SELECT * FROM employees e
LEFT JOIN departments d ON e.department_id = d.dept_id
UNION
SELECT * FROM employees e
RIGHT JOIN departments d ON e.department_id = d.dept_id;

六、集合操作

1. 並操作 (UNION)

-- 關係代數:R ∪ S
-- MySQL:SELECT ... UNION SELECT ...

-- 基本並操作
SELECT product_name FROM current_products
UNION
SELECT product_name FROM archived_products;

-- UNION ALL(保留重複)
SELECT city FROM suppliers
UNION ALL
SELECT city FROM customers;

-- 帶條件的並
(SELECT employee_id, salary FROM employees WHERE department = 'Sales')
UNION
(SELECT employee_id, salary FROM employees WHERE salary > 70000)
ORDER BY salary DESC;

2. 交操作 (INTERSECT) - MySQL 8.0+

-- 關係代數:R ∩ S
-- MySQL 8.0+:SELECT ... INTERSECT SELECT ...

-- 查找同時存在於兩個表中的記錄
SELECT customer_id FROM online_orders
INTERSECT
SELECT customer_id FROM in_store_orders;

-- 在舊版本中模擬交集
SELECT DISTINCT o1.customer_id
FROM online_orders o1
INNER JOIN in_store_orders o2 ON o1.customer_id = o2.customer_id;

-- 或者使用IN
SELECT DISTINCT customer_id
FROM online_orders
WHERE customer_id IN (SELECT customer_id FROM in_store_orders);

3. 差操作 (EXCEPT) - MySQL 8.0+

-- 關係代數:R - S
-- MySQL 8.0+:SELECT ... EXCEPT SELECT ...

-- 查找只在第一個表中的記錄
SELECT product_id FROM all_products
EXCEPT
SELECT product_id FROM discontinued_products;

-- 在舊版本中模擬差集
SELECT product_id 
FROM all_products 
WHERE product_id NOT IN (
    SELECT product_id FROM discontinued_products
);

-- 使用LEFT JOIN
SELECT ap.product_id
FROM all_products ap
LEFT JOIN discontinued_products dp ON ap.product_id = dp.product_id
WHERE dp.product_id IS NULL;

七、複雜關係代數表達式實現

1. 複合表達式示例

-- 關係代數:π_{name, salary}(σ_{dept='IT'}(employees) ⋈ departments)
-- 查找IT部門員工的姓名和工資
SELECT 
    e.employee_name,
    e.salary,
    d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.dept_id
WHERE d.department_name = 'IT';

2. 除法操作實現

-- 關係代數:R ÷ S
-- 查找完成了所有必修課程的學生

-- 方法1:使用雙重NOT EXISTS
SELECT s.student_id, s.student_name
FROM students s
WHERE NOT EXISTS (
    SELECT c.course_id 
    FROM required_courses c
    WHERE NOT EXISTS (
        SELECT 1 
        FROM enrollments e 
        WHERE e.student_id = s.student_id 
        AND e.course_id = c.course_id
    )
);

-- 方法2:使用GROUP BY和HAVING
SELECT s.student_id, s.student_name
FROM students s
JOIN enrollments e ON s.student_id = e.student_id
JOIN required_courses c ON e.course_id = c.course_id
GROUP BY s.student_id, s.student_name
HAVING COUNT(DISTINCT c.course_id) = (
    SELECT COUNT(*) FROM required_courses
);

3. 關係代數優化示例

-- 原始查詢:σ_{price>100}(π_{name,price}(products))
SELECT name, price
FROM products
WHERE price > 100;

-- 優化:先選擇後投影(MySQL會自動優化)
-- 實際執行計劃:先WHERE過濾,再SELECT列

-- 查看執行計劃驗證
EXPLAIN SELECT name, price FROM products WHERE price > 100;

八、高級關係代數模式

1. 自連接(處理層次關係)

-- 查找員工及其經理
SELECT 
    e.employee_name as employee,
    m.employee_name as manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;

2. 多重連接

-- 連接三個表
SELECT 
    c.customer_name,
    o.order_date,
    p.product_name,
    oi.quantity,
    oi.unit_price
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id;

3. 使用WITH語句(公用表表達式)

-- 關係代數的臨時關係
WITH high_value_orders AS (
    -- σ_{amount>1000}(orders)
    SELECT * FROM orders WHERE amount > 1000
),
active_customers AS (
    -- π_{customer_id}(σ_{status='active'}(customers))
    SELECT customer_id FROM customers WHERE status = 'active'
)
-- 連接臨時關係
SELECT 
    c.customer_name,
    hvo.order_date,
    hvo.amount
FROM high_value_orders hvo
JOIN active_customers ac ON hvo.customer_id = ac.customer_id
JOIN customers c ON hvo.customer_id = c.customer_id;

九、關係代數在MySQL中的性能考慮

1. 選擇操作的優化

-- 好的實踐:使用索引友好的條件
SELECT * FROM users WHERE id = 100;  -- 使用索引
SELECT * FROM users WHERE name LIKE 'John%';  -- 前綴匹配使用索引
SELECT * FROM users WHERE created_at >= '2023-01-01';  -- 範圍查詢

-- 避免:全表掃描的查詢
SELECT * FROM users WHERE LOWER(name) = 'john';  -- 函數調用
SELECT * FROM users WHERE amount * 2 > 100;  -- 列運算

2. 連接操作的優化

-- 小表驅動大表(MySQL通常自動優化)
SELECT * 
FROM small_table s
JOIN large_table l ON s.id = l.small_id;

-- 使用合適的連接類型
-- INNER JOIN:等值連接,效率最高
-- LEFT JOIN:需要保留左表所有記錄時使用

3. 投影操作的優化

-- 只選擇需要的列
SELECT id, name FROM users;  -- 好的實踐

-- 避免SELECT *
SELECT * FROM users;  -- 不必要的IO

-- 使用覆蓋索引
SELECT id, name FROM users WHERE department = 'IT';
-- 如果 (department, id, name) 有索引,則效率極高

十、實踐練習

練習1:實現複雜關係代數

-- 查找購買了所有單價超過100的產品的客户
-- 關係代數:π_{customer_id}(orders) ÷ π_{product_id}(σ_{price>100}(products))

WITH expensive_products AS (
    SELECT product_id FROM products WHERE price > 100
),
customer_orders AS (
    SELECT DISTINCT o.customer_id, p.product_id
    FROM orders o
    JOIN order_items oi ON o.order_id = oi.order_id
    JOIN products p ON oi.product_id = p.product_id
)
SELECT c.customer_id, c.customer_name
FROM customers c
WHERE NOT EXISTS (
    SELECT ep.product_id 
    FROM expensive_products ep
    WHERE NOT EXISTS (
        SELECT 1 
        FROM customer_orders co 
        WHERE co.customer_id = c.customer_id 
        AND co.product_id = ep.product_id
    )
);

練習2:關係代數轉換

-- 原始SQL:查找每個部門的平均工資,然後找出高於公司平均工資的部門
SELECT department_id, AVG(salary) as dept_avg
FROM employees
GROUP BY department_id
HAVING AVG(salary) > (
    SELECT AVG(salary) FROM employees
);

-- 關係代數表示:
-- 設 R = employees
-- 1. G = γ_{department_id, AVG(salary)→dept_avg}(R)
-- 2. T = γ_{AVG(salary)→company_avg}(R)
-- 3. 結果 = σ_{dept_avg > company_avg}(G × T)

總結

MySQL通過SQL語句實現了關係代數的各種操作:

  1. 選擇WHERE 子句
  2. 投影SELECT 列列表
  3. 笛卡爾積CROSS JOIN
  4. 連接 → 各種 JOIN
  5. 集合運算UNION, INTERSECT, EXCEPT

理解關係代數有助於:

  • 編寫更高效的SQL查詢
  • 理解查詢優化器的工作原理
  • 設計更好的數據庫模式
  • 解決複雜的數據查詢問題