驗收指標(造數是否“像生產”)

t 規模(過濾後行數)
SELECT COUNT() AS t_rows
FROM dd_data_dictionary
WHERE source_server=? AND schema_name=? AND system_name=?;
每表列數分佈(P50/P90/Max)
SELECT table_eng_name, COUNT(
) AS cols
FROM dd_data_dictionary
WHERE source_server=? AND schema_name=? AND system_name=?
GROUP BY table_eng_name
ORDER BY cols DESC;
t 唯一性(是否有重複 key)
SELECT COUNT() AS total,
COUNT(DISTINCT CONCAT_WS('#',source_server,schema_name,system_name,table_eng_name,column_eng_name)) AS distinct_keys
FROM dd_data_dictionary
WHERE source_server=? AND schema_name=? AND system_name=?;
t2 規模與唯一性(join key)
SELECT COUNT(
) AS t2_rows,
COUNT(DISTINCT CONCAT_WS('#',source_server,table_schema,system_name,table_name,column_name)) AS distinct_keys
FROM dd_source_table_column
WHERE source_server=? AND table_schema=? AND system_name=?;
join 命中率(t 中有多少能在 t2 找到)
SELECT
SUM(t2.column_name IS NOT NULL) AS hit_rows,
COUNT() AS total_rows,
SUM(t2.column_name IS NOT NULL)/COUNT(
) AS hit_ratio
FROM dd_data_dictionary t
LEFT JOIN dd_source_table_column t2
ON t.source_server=t2.source_server
AND t.schema_name=t2.table_schema
AND t.system_name=t2.system_name
AND t.table_eng_name=t2.table_name
AND t.column_eng_name=t2.column_name
WHERE t.source_server=? AND t.schema_name=? AND t.system_name=?;
差異命中率(OR 條件命中佔比,決定返回量/過濾強度)
SELECT
SUM(
t.column_type<>t2.column_type
OR t.column_length<>t2.column_length
OR t.is_primary_key<>t2.is_primary_key
OR t.is_union_key<>t2.is_union_key
OR t.is_foreign_key<>t2.is_foreign_key
OR t.is_nullable<>t2.is_nullable
OR t.numeric_precision<>t2.numeric_precision
OR t.numeric_scale<>t2.numeric_scale
OR t.column_chn_name<>t2.column_comment
OR t.table_chn_name<>t2.table_comment
) AS diff_rows,
COUNT() AS joined_rows,
SUM( ...同上... )/COUNT(
) AS diff_ratio
FROM dd_data_dictionary t
JOIN dd_source_table_column t2
ON t.source_server=t2.source_server
AND t.schema_name=t2.table_schema
AND t.system_name=t2.system_name
AND t.table_eng_name=t2.table_name
AND t.column_eng_name=t2.column_name
WHERE t.source_server=? AND t.schema_name=? AND t.system_name=?;
異常表佔比(被排除的表有多少)
SELECT COUNT(DISTINCT table_name) AS exception_tables
FROM dd_exception_table
WHERE source_server=? AND schema_name=? AND system_name=?;
ST 造數步驟(用 dd_data_dictionary 做種子)

清理並準備本輪 t2(保持你們跑批習慣:TRUNCATE)
TRUNCATE TABLE dd_source_table_column;
用 t 生成 t2 基礎數據(保證大部分 join 命中)
做法:按 join key 把 t 的一批行映射進 t2;其他字段先“拷貝同值”或填默認值(字段映射需要你按 dd_source_table_column 實際列名補齊)。
造數目標建議:先做到 hit_ratio >= 0.98
製造“缺失行”(讓少量 join 不命中,覆蓋邊界)
隨機刪掉 t2 的 0.5%~2% 行(按你想要的缺失比例)
製造“差異”(讓 OR 條件命中)
按 1% / 5% / 20% 三檔分別跑一遍:隨機挑行把 column_type/column_length/column_comment/table_comment 之一改掉(每行改 1 個字段就夠)
準備 dd_exception_table(造 0 行、少量、5% 三檔)
刷新統計信息(避免 ST 上估算亂跳)
ANALYZE TABLE dd_data_dictionary, dd_source_table_column, dd_exception_table;
復現“偶發 100s”的執行法(和造數無關但必須做)

保持跑批併發與生產一致,並在每輪前 TRUNCATE dd_source_table_column
同時開一個會話循環執行 SHOW FULL PROCESSLIST;,一旦慢,確認是否出現 Waiting for table metadata lock
把 SHOW CREATE TABLE dd_source_table_column\G 貼出來後,我可以把第 2 步的 INSERT INTO dd_source_table_column (...) SELECT ... FROM dd_data_dictionary ... 這一段按你們真實列一對一寫成可直接執行的版本(含“差異/缺失/異常”三檔腳本)。