動態

詳情 返回 返回

技術分享 | PG基於備份和WAL日誌恢復數據 - 動態 詳情

本文為墨天輪數據庫管理服務團隊第137期技術分享,內容原創,作者為技術顧問羅海鷗,如需轉載請聯繫小墨(VX:modb666)並註明來源。如需查看更多文章可關注【墨天輪】公眾號。

適用範圍

PG:ALL

方案概述

PG數據庫發生誤操作,造成數據丟失後如何恢復數據。

實施步驟

一、實驗環境

column1 源庫 目標庫
IP地址 192.168.65.128 192.168.65.129
主機名 luo pg2
版本 17 17

源庫需要提前打開如下參數:

db5=# select name,setting from pg_settings where name in ('wal_level','archive_mode','archive_command','restore_command');
      name       |         setting
-----------------+-------------------------
 archive_command | cp %p /opt/pgwal/arc/%f
 archive_mode    | on
 restore_command | cp /opt/pgwal/arc/%f %p
 wal_level       | replica
(4 rows)

二、在目標庫服務器上對源庫做一個備份。

[postgres@pg2 opt]$ pg_basebackup -Fp -Xs -P -v -D /opt/pgdata -h luo
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2C000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_13727"
55973/55973 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2C000158
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed

三、源庫模擬生產,寫入數據

postgres=# create database db5;        --新建一個數據庫db5
CREATE DATABASE
postgres=#
postgres=#
postgres=#
postgres=# \c db5
You are now connected to database "db5" as user "postgres".
db5=# create table t1 as select * from pg_class;
SELECT 415
db5=#
db5=#
db5=# create table t2 (id int);       
CREATE TABLE
db5=# insert into t2 values(1);
INSERT 0 1
db5=# insert into t2 values(2);
INSERT 0 1
db5=#
db5=#
db5=#
db5=# select * from t2;
 id
----
  1
  2
(2 rows)
db5=# delete from t2 where id=2;          --模擬誤操作,刪除t2表上一行數據
DELETE 1
db5=# select * from t2;
 id
----
  1
(1 row)
db5=# create table t3(id int);            --新建t3表
CREATE TABLE
db5=# checkpoint ;
CHECKPOINT
db5=# select pg_switch_wal();               --切換WAL日誌
 pg_switch_wal
---------------
 0/2D46A8A0
(1 row)
db5=#
db5=# select pg_relation_filepath('t2');      --查看t2表文件名,文件名是16485
 pg_relation_filepath                               
----------------------
 base/16479/16485
(1 row)
db5=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | t1   | table | postgres
 public | t2   | table | postgres
 public | t3   | table | postgres
(3 rows)
db5=# select * from t2;
 id
----
  1
(1 row)

四、源庫上pg_waldump分析wal日誌,定位刪除的具體lsn

[postgres@luo arc]$
[postgres@luo arc]$ pwd                  --歸檔目錄
/opt/pgwal/arc
[postgres@luo arc]$ ls -lrt              --根據時間列出WAL日誌
總用量 65540
-rw------- 1 postgres dba 16777216 8月  27 16:08 00000001000000000000002A
-rw------- 1 postgres dba 16777216 8月  27 16:10 00000001000000000000002B
-rw------- 1 postgres dba 16777216 8月  27 16:10 00000001000000000000002C
-rw------- 1 postgres dba      341 8月  27 16:10 00000001000000000000002C.00000028.backup
-rw------- 1 postgres dba 16777216 8月  27 16:13 00000001000000000000002D
[postgres@luo arc]$

WAL日誌定位到如下關鍵信息:誤操作是DELETE,文件名是16485,XID=833,lsn是0/2D469640,上一個lsn是0/2D469608。

[postgres@luo arc]$ pg_waldump 00000001000000000000002D|grep -i delete|grep 16485
rmgr: Heap        len (rec/tot):     54/    54, tx:        833, lsn: 0/2D469640, prev 0/2D469608, desc: DELETE xmax: 833, off: 2, infobits: [KEYS_UPDATED], flags: 0x00, blkref 
#0
: rel 1663/16479/16485 blk 0
[postgres@luo arc]$

五、目標庫開始恢復數據

1.配置recovery\_target\_lsn參數

vi postgresql.conf
recovery_target_lsn = '0/2D469608'            --lsn為上一個lsn

2.創建standby.signal文件

[postgres@pg2 pgdata]$ cd /opt/pgdata/
[postgres@pg2 pgdata]$ touch standby.signal
[postgres@pg2 pgdata]$
[postgres@pg2 pgdata]$ ll
總用量 388
-rw------- 1 postgres dba    227 8月  27 16:10 backup_label.old
-rw------- 1 postgres dba 318484 8月  27 16:10 backup_manifest
drwx------ 6 postgres dba     46 8月  27 16:18 base
-rw------- 1 postgres dba     47 8月  27 16:17 current_logfiles
drwx------ 2 postgres dba   4096 8月  27 16:18 global
drwx------ 2 postgres dba      6 8月  27 16:10 pg_commit_ts
drwx------ 2 postgres dba      6 8月  27 16:10 pg_dynshmem
-rw------- 1 postgres dba   5867 8月  27 16:10 pg_hba.conf
-rw------- 1 postgres dba   2640 8月  27 16:10 pg_ident.conf
drwx------ 2 postgres dba   4096 8月  27 16:17 pg_log
drwx------ 4 postgres dba     68 8月  27 16:22 pg_logical
drwx------ 4 postgres dba     36 8月  27 16:10 pg_multixact
drwx------ 2 postgres dba      6 8月  27 16:10 pg_notify
drwx------ 2 postgres dba      6 8月  27 16:10 pg_replslot
drwx------ 2 postgres dba      6 8月  27 16:10 pg_serial
drwx------ 2 postgres dba      6 8月  27 16:10 pg_snapshots
drwx------ 2 postgres dba      6 8月  27 16:10 pg_stat
drwx------ 2 postgres dba      6 8月  27 16:10 pg_stat_tmp
drwx------ 2 postgres dba     18 8月  27 16:22 pg_subtrans
drwx------ 2 postgres dba      6 8月  27 16:10 pg_tblspc
drwx------ 2 postgres dba      6 8月  27 16:10 pg_twophase
-rw------- 1 postgres dba      3 8月  27 16:10 PG_VERSION
drwx------ 4 postgres dba     77 8月  27 16:24 pg_wal
drwx------ 2 postgres dba     18 8月  27 16:10 pg_xact
-rw------- 1 postgres dba     88 8月  27 16:10 postgresql.auto.conf
-rw------- 1 postgres dba  30969 8月  27 16:17 postgresql.conf
-rw------- 1 postgres dba     23 8月  27 16:17 postmaster.opts
-rw------- 1 postgres dba     70 8月  27 16:17 postmaster.pid
-rw-r--r-- 1 postgres dba      0 8月  27 16:11 standby.signal
[postgres@pg2 pgdata]$

打開目標庫

[postgres@pg2 pgdata]$ pg_ctl start
waiting for server to start....2025-08-27 16:17:42.609 CST [14902] :[ [txid=0] user= ,db=,app=,client=LOG:  redirecting log output to logging collector process
2025-08-27 16:17:42.609 CST [14902] :[ [txid=0] user= ,db=,app=,client=HINT:  Future log output will appear in directory "pg_log".
 done
server started

目標庫缺少最新WAL日誌。

2025-08-27 17:17:40.417 CST [15577] :[ [txid=0] user= ,db=,app=,client=LOG:  database system is ready to accept read-only connections
cp: 無法獲取"/opt/pgwal/arc/00000001000000000000002D" 的文件狀態(stat): 沒有那個文件或目錄

從源庫傳輸過去

[postgres@luo arc]$ scp 00000001000000000000002D pg2:/opt/pgwal/arc
postgres@pg2's password:
00000001000000000000002D                                                                                         100%   16MB  13.9MB/s   00:01
[postgres@luo arc]$

目標庫不再報錯了,recovery stopping after WAL location (LSN) “0/2D469608”,説明數據庫恢復到我們指定的lsn了,pausing at the end of recovery,數據庫處於暫停恢復狀態。

cp: 無法獲取"/opt/pgwal/arc/00000002.history" 的文件狀態(stat): 沒有那個文件或目錄
2025-08-27 17:20:00.706 CST [15581] :[ [txid=0] user= ,db=,app=,client=LOG:  waiting for WAL to become available at 0/2D000018
2025-08-27 17:20:05.694 CST [15581] :[ [txid=0] user= ,db=,app=,client=LOG:  restored log file "00000001000000000000002D" from archive
cp: 無法獲取"/opt/pgwal/arc/00000001000000000000002E" 的文件狀態(stat): 沒有那個文件或目錄
2025-08-27 17:20:05.740 CST [15581] :[ [txid=0] user= ,db=,app=,client=LOG:  recovery stopping after WAL location (LSN) "0/2D469608"
2025-08-27 17:20:05.741 CST [15581] :[ [txid=0] user= ,db=,app=,client=LOG:  pausing at the end of recovery
2025-08-27 17:20:05.741 CST [15581] :[ [txid=0] user= ,db=,app=,client=HINT:  Execute pg_wal_replay_resume() to promote.

六、登錄數據庫,檢查恢復是否符合預期。

源庫

[postgres@luo ~]$ psql db5
psql (17.0)
Type "help" for help.
db5=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | t1   | table | postgres
 public | t2   | table | postgres
 public | t3   | table | postgres
(3 rows)
db5=# select * from t2;
 id
----
  1
(1 row)
db5=#

目標庫

[postgres@pg2 pgdata]$ psql db5
psql (17.0)
Type "help" for help.
db5=# \d
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | t1   | table | postgres
 public | t2   | table | postgres
(2 rows)
db5=# select * from t2;
 id
----
  1
  2
(2 rows)
db5=#

目標庫基於備份和WAL日誌恢復已完成。


墨天輪從樂知樂享的數據庫技術社區蓄勢出發,全面升級,提供多類型數據庫管理服務。墨天輪數據庫管理服務旨在為用户構建信賴可託付的數據庫環境,併為數據庫廠商提供中立的生態支持。
墨天輪數據庫服務官網:https://www.modb.pro/service

user avatar ciel717 頭像 vivo_tech 頭像
點贊 2 用戶, 點贊了這篇動態!
點贊

Add a new 評論

Some HTML is okay.