The tricks for purge hydra useless data.

Intro

Ory Hydra is a hardened, OpenID Certified OAuth 2.0 Server and OpenID Connect Provider.

使用苦主在這兒也簡介了下 Hydra

問題是什麼

在 Hydra v1.3.* 的時期 Schema 依賴 foreign key 建立關聯來節省些事情

但也因為服務特性,只要被摸就會留紀錄的特性目前無法避免無效資料量的增長,所以就只能手動刪除了

好消息是 v2.0.* 後就完全找不到 foreign key 依賴的 Schema 了

這樣資料作業也會方便些。

Background

DB: MariaDB 10.4.15
Hydra v1.3.*

table space size

# ls -lhS | head
total 318G
-rw-rw---- 1 mysql mysql  63G Dec 21 19:53 hydra_oauth2_authentication_request.ibd
-rw-rw---- 1 mysql mysql  55G Dec 21 19:53 hydra_oauth2_oidc.ibd
-rw-rw---- 1 mysql mysql  50G Dec 21 19:53 hydra_oauth2_access.ibd
-rw-rw---- 1 mysql mysql  49G Dec 21 19:53 hydra_oauth2_code.ibd
-rw-rw---- 1 mysql mysql  42G Dec 21 19:53 hydra_oauth2_consent_request.ibd
-rw-rw---- 1 mysql mysql  35G Dec 21 19:53 hydra_oauth2_logout_request.ibd
-rw-rw---- 1 mysql mysql 9.4G Dec 21 19:53 hydra_oauth2_authentication_session.ibd
-rw-rw---- 1 mysql mysql 7.9G Dec 21 19:53 hydra_oauth2_consent_request_handled.ibd
-rw-rw---- 1 mysql mysql 6.6G Dec 21 19:53 hydra_oauth2_authentication_request_handled.ibd

table hydra_oauth2_authentication_request status

$ mysql -e "show table status like 'hydra_oauth2_authentication_request'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+-------------------------------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+------------------+-----------+
| Name                                | Engine | Version | Row_format | Rows     | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation       | Checksum | Create_options | Comment | Max_index_length | Temporary |
+-------------------------------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+------------------+-----------+
| hydra_oauth2_authentication_request | InnoDB |      10 | Dynamic    | 38617002 |           1085 | 41912156160 |               0 |  24018206720 | 250609664 |           NULL | 2022-04-30 23:53:04 | 2022-12-21 19:55:03 | NULL       | utf8_general_ci |     NULL |                |         |                0 | N         |
+-------------------------------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+------------------+-----------+

問題

產品清資料的語法大概長這樣

mysql> delete from `hydra_oauth2_consent_request` where `client_id` = 'vip' limit 10;

交代一下 Schema 的 FK 關聯多麻煩

$ mysql -e "show create table hydra_oauth2_consent_request\G"
mysql: [Warning] Using a password on the command line interface can be insecure.
*************************** 1. row ***************************
       Table: hydra_oauth2_consent_request
Create Table: CREATE TABLE `hydra_oauth2_consent_request` (
  `challenge` varchar(40) NOT NULL,
  `verifier` varchar(40) NOT NULL,
  `client_id` varchar(255) NOT NULL,
  `subject` varchar(255) NOT NULL,
  `request_url` text NOT NULL,
  `skip` tinyint(1) NOT NULL,
  `requested_scope` text NOT NULL,
  `csrf` varchar(40) NOT NULL,
  `authenticated_at` timestamp NULL DEFAULT NULL,
  `requested_at` timestamp NOT NULL DEFAULT current_timestamp(),
  `oidc_context` text NOT NULL,
  `forced_subject_identifier` varchar(255) DEFAULT '',
  `login_session_id` varchar(40) DEFAULT NULL,
  `login_challenge` varchar(40) DEFAULT NULL,
  `requested_at_audience` text NOT NULL,
  `acr` text NOT NULL,
  `context` text NOT NULL,
  PRIMARY KEY (`challenge`),
  UNIQUE KEY `hydra_oauth2_consent_request_veri_idx` (`verifier`),
  KEY `hydra_oauth2_consent_request_cid_idx` (`client_id`),
  KEY `hydra_oauth2_consent_request_sub_idx` (`subject`),
  KEY `hydra_oauth2_consent_request_idx_01` (`requested_at`),
  KEY `hydra_oauth2_consent_request_login_session_id_idx` (`login_session_id`),
  KEY `hydra_oauth2_consent_request_login_challenge_idx` (`login_challenge`),
  CONSTRAINT `hydra_oauth2_consent_request_client_id_fk` FOREIGN KEY (`client_id`) REFERENCES `hydra_client` (`id`) ON DELETE CASCADE,
  CONSTRAINT `hydra_oauth2_consent_request_login_challenge_fk` FOREIGN KEY (`login_challenge`) REFERENCES `hydra_oauth2_authentication_request` (`challenge`) ON DELETE SET NULL,
  CONSTRAINT `hydra_oauth2_consent_request_login_session_id_fk` FOREIGN KEY (`login_session_id`) REFERENCES `hydra_oauth2_authentication_session` (`id`) ON DELETE SET NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8

當然這張表也不小

$ mysql -e "show table status like 'hydra_oauth2_consent_request'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+------------------------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+------------------+-----------+
| Name                         | Engine | Version | Row_format | Rows     | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation       | Checksum | Create_options | Comment | Max_index_length | Temporary |
+------------------------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+------------------+-----------+
| hydra_oauth2_consent_request | InnoDB |      10 | Dynamic    | 21986247 |           1362 | 29961666560 |               0 |  14324645888 | 108003328 |           NULL | 2022-04-30 23:53:04 | 2022-12-21 20:01:01 | NULL       | utf8_general_ci |     NULL |                |         |                0 | N         |
+------------------------------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+----------------+---------+------------------+-----------+

先從 explain 來看的話會發生什麼事情

$ mysql -e "explain delete from hydra_oauth2_consent_request where client_id = 'vip' limit 10;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+------+-------------+------------------------------+-------+--------------------------------------+--------------------------------------+---------+------+---------+-------------+
| id   | select_type | table                        | type  | possible_keys                        | key                                  | key_len | ref  | rows    | Extra       |
+------+-------------+------------------------------+-------+--------------------------------------+--------------------------------------+---------+------+---------+-------------+
|    1 | SIMPLE      | hydra_oauth2_consent_request | range | hydra_oauth2_consent_request_cid_idx | hydra_oauth2_consent_request_cid_idx | 767     | NULL | 9292132 | Using where |
+------+-------------+------------------------------+-------+--------------------------------------+--------------------------------------+---------+------+---------+-------------+

接下來作業的時候發生了什麼狀況,很明顯的一根 replication delay

雖然不至於產生 HA 切換,但麻煩的是商務邏輯在讀寫分離的情況下產生的功能異常影響就大條了

區分 delete / select 在 explain 上的差異 ; 光 Using index condition / Using where 的差異就明顯解釋了效能差異

[wnlin@dba-desktop-1 ~]$ mysqlq dbm-m028 m028006 -e "explain select * from hydra_oauth2_consent_request where client_id = 'vip' limit 10;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+------+-------------+------------------------------+------+--------------------------------------+--------------------------------------+---------+-------+---------+-----------------------+
| id   | select_type | table                        | type | possible_keys                        | key                                  | key_len | ref   | rows    | Extra                 |
+------+-------------+------------------------------+------+--------------------------------------+--------------------------------------+---------+-------+---------+-----------------------+
|    1 | SIMPLE      | hydra_oauth2_consent_request | ref  | hydra_oauth2_consent_request_cid_idx | hydra_oauth2_consent_request_cid_idx | 767     | const | 9292105 | Using index condition |
+------+-------------+------------------------------+------+--------------------------------------+--------------------------------------+---------+-------+---------+-----------------------+

這個時候的 limit offset 對解決問題的幫助不大

所以解決問題的思維還是要從 PK 下手

但怎麼找出隨機的 PK 又符合 client_id 條件

order by random() 就不再提了

但如果在同一張表上 JOIN 處理過的 Data Set 就好辦了

STMT 大概會長這樣 ; 有沒有需要 ORDER 見仁見智,畢竟原生需求只要清資料

explain SELECT r1.challenge, r1.client_id
  FROM hydra_oauth2_consent_request AS r1 JOIN
       (SELECT CEIL(RAND() *
                     (SELECT MAX(challenge)
                        FROM hydra_oauth2_consent_request)) as id)
        AS r2
 WHERE r1.challenge >= r2.id
 AND r1.client_id = 'vip'
 -- ORDER BY r1.challenge ASC
 LIMIT 10;

再確認一下 explain 已經可以正常使用 index 了

mysql> explain SELECT r1.challenge, r1.client_id
    ->   FROM hydra_oauth2_consent_request AS r1 JOIN
    ->        (SELECT CEIL(RAND() *
    ->                      (SELECT MAX(challenge)
    ->                         FROM hydra_oauth2_consent_request)) as id)
    ->         AS r2
    ->  WHERE r1.challenge >= r2.id
    ->  AND r1.client_id = 'vip'
    ->  -- ORDER BY r1.challenge ASC
    ->  LIMIT 10;
+------+-------------+------------+--------+----------------------------------------------+--------------------------------------+---------+-------+---------+------------------------------+
| id   | select_type | table      | type   | possible_keys                                | key                                  | key_len | ref   | rows    | Extra                        |
+------+-------------+------------+--------+----------------------------------------------+--------------------------------------+---------+-------+---------+------------------------------+
|    1 | PRIMARY     | <derived2> | system | NULL                                         | NULL                                 | NULL    | NULL  | 1       |                              |
|    1 | PRIMARY     | r1         | ref    | PRIMARY,hydra_oauth2_consent_request_cid_idx | hydra_oauth2_consent_request_cid_idx | 767     | const | 9290919 | Using where; Using index     |
|    2 | DERIVED     | NULL       | NULL   | NULL                                         | NULL                                 | NULL    | NULL  | NULL    | No tables used               |
|    3 | SUBQUERY    | NULL       | NULL   | NULL                                         | NULL                                 | NULL    | NULL  | NULL    | Select tables optimized away |
+------+-------------+------------+--------+----------------------------------------------+--------------------------------------+---------+-------+---------+------------------------------+
4 rows in set, 1 warning (0.00 sec)

再看看這個時候取資料的效率 ; 各 offset limit 反應時間都可以維持在 0.5s 內

10 rows in set, 11 warnings (0.01 sec)
1000 rows in set, 1001 warnings (0.01 sec)
10000 rows in set, 10001 warnings (0.04 sec)
50000 rows in set, 50001 warnings (0.11 sec)

我不建議再把語法改寫成 DELETE INNER JOIN 作業

也因為取得的 dataset 內已含 challenge 這 PK 的話,那就是 loop 把 dataset 一個一個走完,原則上也就不擔心 DELETE STMT 產生的 replication delay 了.

MariaDB fails to start after wsrep_on = off

維運案例分享

Environment

OS: CentOS 7u9
DB: MariaDB 10.4.15 & Galera 4

架構需求:

Galera Cluster MariaDB -> MariaDB Stand alone

問題描述

從 Galera Cluster 恢復成單機執行時,服務恢復了,但 transaction 異常,且 InnoDB_trx, InnoDB_lock* 多了些奇怪的資訊

Background

  1. XA Transactions
    從官網 [1] xa-transactions overview 先初步了解一下

  2. MariaDB Galera Cluster XA transactions are not supported.
    官網也說了 Galera 不支援 xa transaction

實際維運案例分享

因為 Cluster 的 commit 緩慢問題正在排除,所以決定恢復單機執行,但 galera_new_cluster 問題依舊,故關閉 wsrep_on = off 後再啟動服務.

過程中我們會先得到哪些資訊

mysqld.log ; 因為 cluster 重啟動 & 恢復叢集的操作中,遺留了不少待處理的 Transaction

2022-12-12 15:37:06 0 [Note] InnoDB: Transaction 524271636 was in the XA prepared state.
2022-12-12 15:37:06 0 [Note] InnoDB: Transaction 524271502 was in the XA prepared state.
2022-12-12 15:37:06 0 [Note] InnoDB: 57 transaction(s) which must be rolled back or cleaned up in total 0 row operations to undo
2022-12-12 15:37:06 0 [Note] InnoDB: Trx id counter is 524289644
2022-12-12 15:37:07 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
==skip==
2022-12-12 15:37:07 0 [Note] InnoDB: Transaction 524272457 in prepared state after recovery
2022-12-12 15:37:07 0 [Note] InnoDB: Transaction contains changes to 1 rows
2022-12-12 15:37:07 0 [Note] InnoDB: 57 transactions in prepared state after recovery
2022-12-12 15:37:07 0 [Note] Found 57 prepared transaction(s) in InnoDB
2022-12-12 15:37:07 0 [Warning] Found 57 prepared XA transactions

slowquery & innodb_trx 得到相關不應該出現緩慢 stmt

*************************** 1. row ***************************
                    trx_id: 524282268
                 trx_state: LOCK WAIT
               trx_started: 2022-12-12 15:05:47
     trx_requested_lock_id: 524282268:6390:3:2
          trx_wait_started: 2022-12-12 15:05:47
                trx_weight: 4
       trx_mysql_thread_id: 2120
                 trx_query: insert ignore into tbl_name (agent_id) values(1)
       trx_operation_state: inserting
         trx_tables_in_use: 1
         trx_tables_locked: 2
          trx_lock_structs: 3
     trx_lock_memory_bytes: 1128
           trx_rows_locked: 1
         trx_rows_modified: 1
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
*************************** 2. row ***************************
                    trx_id: 524282226
                 trx_state: LOCK WAIT
               trx_started: 2022-12-12 15:05:34
     trx_requested_lock_id: 524282226:7020:4:4
          trx_wait_started: 2022-12-12 15:05:34
                trx_weight: 3
       trx_mysql_thread_id: 321
                 trx_query: update tbl_name set data ='XXX' where cfid ='YYY'
       trx_operation_state: fetching rows
         trx_tables_in_use: 1
         trx_tables_locked: 1
          trx_lock_structs: 3
     trx_lock_memory_bytes: 1128
           trx_rows_locked: 2
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: READ COMMITTED
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
          trx_is_read_only: 0
trx_autocommit_non_locking: 0

繼續追查還有對應 trx_mysql_thread_id: 0 的紀錄

*************************** 59. row ***************************
                    trx_id: 524271508
                 trx_state: RUNNING
               trx_started: 2022-12-12 14:50:58
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 2
       trx_mysql_thread_id: 0
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 1
          trx_lock_structs: 1
     trx_lock_memory_bytes: 1128
           trx_rows_locked: 0
         trx_rows_modified: 1
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
          trx_is_read_only: 0
trx_autocommit_non_locking: 0

從數量判斷 mysqld 處理 crash recover 的時候 XA transaction 也卡住了,接著手動處理 XA transaction (rollback 處理)

Solution

操作官方說明

Steps

可以先試試 --tc-heuristic-recover ROLLBACK 解決問題,如果無解的話

先從 mysql client 接入後 xa recover 呈現現在待處理的 XA transaction,結果從 data 欄位發現 wsrep 遺留的 transaction 包含亂碼,也不好繼續後面指令

XA RECOVER;
+----------+--------------+--------------+------+
| formatID | gtrid_length | bqual_length | data |
+----------+--------------+--------------+------+
|        1 |            4 |            0 | wsrep***** |
+----------+--------------+--------------+------+

轉換可讀語法,官方說法

MariaDB starting with 10.3.3
You can use XA RECOVER FORMAT='SQL' to get the data in a human readable form that can be directly copy-pasted into XA COMMIT or XA ROLLBACK. This is particularly useful for binary xid generated by some transaction coordinators.

所以就會產出

xa recover format='SQL';
+----------+--------------+--------------+-----------------------------------------------+
| formatID | gtrid_length | bqual_length | data                                          |
+----------+--------------+--------------+-----------------------------------------------+
|        1 |           11 |            7 | X'31320d3334093637763738',X'6162630a646566' |
+----------+--------------+--------------+-----------------------------------------------+

接著就可以下 rollback 指定了

xa rollback X'31320d3334093637763738',X'6162630a646566';

建議再重新啟動 MariaDB 觀察一下 mysqld.log 還有沒有其他異常的 Log 需要處理

References

How to Installing Percona XtraDB Cluster from Binary Tarball ver 5.7

因為需要重建手冊的關係,版本又已經從 5.7.18 的時期長大到 5.7.35 了

懶得處理 yum / apt 套件管理更新,所以就直接從 Binary Tarball 處理了

以順序論,建議 Percona XtraBackup 2.4 -> Percona XtraDB Cluster 5.7

具體你會先得到原因知道為什麼上面順序需要這樣安排

Error: Package: Percona-XtraDB-Cluster-server-57-5.7.18-29.20.1.el7.x86_64 (/Percona-XtraDB-Cluster-server-57-5.7.18-29.20.1.el7.x86_64)
           Requires: percona-xtrabackup-24 >= 2.4.4
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

所以從 here 把相關套件帶回來並安裝起來

# wget https://downloads.percona.com/downloads/Percona-XtraBackup-2.4/Percona-XtraBackup-2.4.4/binary/redhat/7/x86_64/Percona-XtraBackup-2.4.4-rdf58cf2-el7-x86_64-bundle.tar
# tar -xvf Percona-XtraBackup-2.4.4-rdf58cf2-el7-x86_64-bundle.tar
percona-xtrabackup-24-2.4.4-1.el7.x86_64.rpm
percona-xtrabackup-24-debuginfo-2.4.4-1.el7.x86_64.rpm
percona-xtrabackup-test-24-2.4.4-1.el7.x86_64.rpm

# yum install percona-xtrabackup-24-2.4.4-1.el7.x86_64.rpm

Installed:
  percona-xtrabackup-24.x86_64 0:2.4.4-1.el7

Dependency Installed:
  libev.x86_64 0:4.15-7.el7               mariadb-libs.x86_64 1:5.5.68-1.el7     perl-Compress-Raw-Bzip2.x86_64 0:2.061-3.el7     perl-Compress-Raw-Zlib.x86_64 1:2.061-4.el7
  perl-DBD-MySQL.x86_64 0:4.023-6.el7     perl-DBI.x86_64 0:1.627-4.el7          perl-IO-Compress.noarch 0:2.061-2.el7            perl-Net-Daemon.noarch 0:0.48-5.el7
  perl-PlRPC.noarch 0:0.2020-14.el7

Complete!

接著把 bundle 從官網帶回來,打開安裝起來便是

# wget https://downloads.percona.com/downloads/Percona-XtraDB-Cluster-57/Percona-XtraDB-Cluster-5.7.18-29.20/binary/redhat/7/x86_64/Percona-XtraDB-Cluster-5.7.18-29.20-r346-el7-x86_64-bundle.tar

# tar -xvf Percona-XtraDB-Cluster-5.7.18-29.20-r346-el7-x86_64-bundle.tar
Percona-XtraDB-Cluster-57-5.7.18-29.20.1.el7.x86_64.rpm
Percona-XtraDB-Cluster-57-debuginfo-5.7.18-29.20.1.el7.x86_64.rpm
Percona-XtraDB-Cluster-client-57-5.7.18-29.20.1.el7.x86_64.rpm
Percona-XtraDB-Cluster-devel-57-5.7.18-29.20.1.el7.x86_64.rpm
Percona-XtraDB-Cluster-full-57-5.7.18-29.20.1.el7.x86_64.rpm
Percona-XtraDB-Cluster-garbd-57-5.7.18-29.20.1.el7.x86_64.rpm
Percona-XtraDB-Cluster-server-57-5.7.18-29.20.1.el7.x86_64.rpm
Percona-XtraDB-Cluster-shared-57-5.7.18-29.20.1.el7.x86_64.rpm
Percona-XtraDB-Cluster-shared-compat-57-5.7.18-29.20.1.el7.x86_64.rpm
Percona-XtraDB-Cluster-test-57-5.7.18-29.20.1.el7.x86_64.rpm

# yum install *.rpm

接下來就是 database initial 的作業就先不贅述了

作業程序官網都有說了,我也覺得很有幫助就是了

How Big Can Your Galera with MariaDB Transactions Be

先說結論 Ref: wsrep_max_ws_size 官方文件寫得死死了 ; 看到 permitted the maximum to be set beyond 2GB, which was rejected by Galera.
就又再更心死了一次

2147483647 (2GB, >= MariaDB Galera 10.0.27, MariaDB 10.1.17)
1073741824 (1GB, <= MariaDB Galera 10.0.26, MariaDB 10.1.16)

為什麼這次要研究這個問題 ; 因為我們需要討論的是一大包 Transactions 的合理性,

改不改得動是商務邏輯的事,但從架構面能不能解決 ERROR 1180 看起來還是有點辦法的

MariaDB [test]> COMMIT;
ERROR 1180 (HY000): Got error 5 "Input/output error" during COMMIT

基於 Percona 對 MySQL 做的測試在這可以參考,我們這次的測試標的基本資訊如下

[root@localhost ~]# cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
[root@localhost ~]# mysql -e "\s" | grep -i "Server version"
Server version:         10.4.15-MariaDB-log MariaDB Server
[root@localhost ~]# mysql -e "show status like '%wsrep%'" | grep 'wsrep_local_state_comment\|wsrep_cluster_status\|wsrep_ready'
wsrep_local_state_comment       Synced
wsrep_cluster_status    Primary
wsrep_ready     ON
[root@localhost ~]# mysql -e "select @@wsrep_max_ws_size" # 因為需要容易觸發的條件,所以就把設定最小值帶上
+---------------------+
| @@wsrep_max_ws_size |
+---------------------+
|                1024 |
+---------------------+
[root@localhost ~]# mysql -e "select @@binlog_row_image"
+--------------------+
| @@binlog_row_image |
+--------------------+
| FULL               |
+--------------------+

測試經過語法 & schema 如下

CREATE TABLE t (
    a bigint primary key DEFAULT UUID_SHORT(),
    b text,
    d text default "123",
    e text default "123",
    f text default "123",
    g text default "123"
);

SET autocommit=0;
START TRANSACTION;
-- binlog_row_image = FULL 的情況下 b 僅能指定 300 個字元,再上去就 ERROR 1180
UPDATE t SET b='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
COMMIT;

-- 錯誤訊息就不另贅述了
MariaDB [test]> COMMIT;
ERROR 1180 (HY000): Got error 5 "Input/output error" during COMMIT

此時 COMMIT 成功的 binlog 長這樣,不相關的欄位也被一起帶進來了,所以合理推測 Transactions 內容也會把其他 column 帶入,如果是 blob / text 欄位的話

那踩到 wsrep_max_ws_size 上限也就合理了

#220220 20:19:55 server id 1  end_log_pos 4434 CRC32 0x68307979         Update_rows: table id 25 flags: STMT_END_F

BINLOG '
6zESYhMBAAAANgAAANAOAAAAABkAAAAAAAEABHRlc3QAAXQABgj8/Pz8/AUCAgICAj7xstL3
6zESYhgBAAAAggIAAFIRAAAAABkAAAAAAAEABv//wAEAANAREmIB9gB4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHgDADEyMwMAMTIzAwAxMjMDADEyM8ABAADQERJiASwBeHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4AwAxMjMDADEy
MwMAMTIzAwAxMjN5eTBo
'/*!*/;
### UPDATE `test`.`t`
### WHERE
###   @1=99662009468977153
###   @2='123'
###   @3='123'
###   @4='123'
###   @5='123'
###   @6='123'
### SET
###   @1=99662009468977153
###   @2='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
###   @3='123'
###   @4='123'
###   @5='123'
###   @6='123'
# Number of rows: 1
# at 4434
#220220 20:19:55 server id 1  end_log_pos 4465 CRC32 0x6dd7ab45         Xid = 11
COMMIT/*!*/;
# at 4465
#220220 20:22:56 server id 1  end_log_pos 4488 CRC32 0x8b4370c7         Stop
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

所以應該用 binlog_row_image=minimal 解

[root@localhost ~]# mysql -e "select @@wsrep_max_ws_size"
+---------------------+
| @@wsrep_max_ws_size |
+---------------------+
|                1024 |
+---------------------+
[root@localhost ~]# mysql -e "select @@binlog_row_image"
+--------------------+
| @@binlog_row_image |
+--------------------+
| MINIMAL            |
+--------------------+

[root@localhost ~]# mysql test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 22
Server version: 10.4.15-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [test]> SET autocommit=0;
Query OK, 0 rows affected (0.001 sec)

MariaDB [test]> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

-- binlog_row_image = MINIMAL 的情況下 b 僅能指定 448 個字元,再上去就 ERROR 1180

MariaDB [test]> UPDATE t SET b='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx';
Query OK, 0 rows affected (0.001 sec)
Rows matched: 1  Changed: 0  Warnings: 0

MariaDB [test]> COMMIT;
Query OK, 0 rows affected (0.000 sec)

此時 binlog 狀態長這樣

#220220 20:49:28 server id 1  end_log_pos 2106 CRC32 0x118f42b9         Update_rows: table id 25 flags: STMT_END_F

BINLOG '
2DgSYhMBAAAANgAAAEwGAAAAABkAAAAAAAEABHRlc3QAAXQABgj8/Pz8/AUCAgICAj5x9VP7
2DgSYhgBAAAA7gEAADoIAAAAABkAAAAAAAEABgEC/gAAAMM3EmIB/sABeHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4
eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eHh4eLlCjxE=
'/*!*/;
### UPDATE `test`.`t`
### WHERE
###   @1=99662172459630592
### SET
###   @2='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
# Number of rows: 1
# at 2106
#220220 20:49:28 server id 1  end_log_pos 2137 CRC32 0x3ffe0ba5         Xid = 27
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

Transactions 內容是多了點沒錯,但原則上還是應急用考量

畢竟 binlog_row_image 不在 FULL 條件下的 Point-in-time recovery (PITR) 也不好實作

畢竟一個 2GB 的 Transactions 種種考量下要 roll back 也不容易。

所以還是建議拆細 Transactions 粒度才是最佳解。

[DBA] How to fix the read lock generated by the restored database of mariabackup?

維運案例分享

OS: CentOS 7u9
DB: MariaDB 10.4

這世界上太多教學告訴你資料庫怎麼還原,Replication 怎麼建立 EX: here

但從來不會有人知道為什麼 mariabackup 打包後的資料庫還原建完 replication 後,
Slave IO Thread 正常,但 Slave SQL Thread 永遠沒辦法正常消化,
放著等也就 ERROR 1205 (HY000): Lock wait timeout exceeded 了

這次遇到的狀況是如此,程序都正常執行著

- mariabackup ->
- move backup ->
- mariabackup --prepare ->
- mariabackup --copy-back ->
- service started ->
- change master for config replication ->
- start slave ->
- replication delay

在說明檢查程序前,我們先了解一下 XA Transactions

在 ERROR 1205 的提醒下,我們可以找到 trx_mysql_thread_id = 0 的 dead lock

MariaDB [information_schema]> select * from INNODB_trx\G
*************************** 1. row ***************************
trx_id: 154142686
trx_state: RUNNING
trx_started: 2022-01-05 13:52:26
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 2
trx_mysql_thread_id: 0
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 1
trx_lock_structs: 1
trx_lock_memory_bytes: 1128
trx_rows_locked: 0
trx_rows_modified: 1
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ

接著把資料找出來

- mariabackup ->
MariaDB [information_schema]> xa recover format='SQL';
+----------+--------------+--------------+---------------------------------------------------------------------+
| formatID | gtrid_length | bqual_length | data                                                                |
+----------+--------------+--------------+---------------------------------------------------------------------+
|        1 |           32 |            0 | X'5753524550586965bddf8a8f51ef11eca1507e79643313a4e9db950400000000' |
+----------+--------------+--------------+---------------------------------------------------------------------+
1 row in set (0.000 sec)

Rollback 處理後,再觀察一次 Replication 的消化狀況

- mariabackup ->
MariaDB [information_schema]> xa rollback X'5753524550586965bddf8a8f51ef11eca1507e79643313a4e9db950400000000';
Query OK, 0 rows affected (0.058 sec)

總結:

我認為這狀況不常見,也不大合理

因為 mariabackup 原則上在 --prepare 下會把該做的 redo / undo 處理告一段落才是,不應該有還沒處理好的 transaction 遺留

所以也只能推測 mariabackup 的鍋

但我沒有證據

Ref: https://www.cnblogs.com/gjc592/p/11240811.html