Handling a common data type error when syncing MySQL to Doris with flink-cdc


1. The varchar type

The log shows the following error:

java.lang.IllegalArgumentException: Variable character string length must be between 1 and 2147483647 (both inclusive).
	at org.apache.flink.cdc.common.types.VarCharType.<init>(VarCharType.java:52) ~[?:?]
	at org.apache.flink.cdc.common.types.VarCharType.<init>(VarCharType.java:60) ~[?:?]
	at org.apache.flink.cdc.common.types.DataTypes.VARCHAR(DataTypes.java:150) ~[?:?]
	at org.apache.flink.cdc.connectors.mysql.utils.MySqlTypeUtils.convertFromColumn(MySqlTypeUtils.java:207) ~[?:?]
	at org.apache.flink.cdc.connectors.mysql.utils.MySqlTypeUtils.fromDbzColumn(MySqlTypeUtils.java:113) ~[?:?]
	at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlPipelineRecordEmitter.parseDDL(MySqlPipelineRecordEmitter.java:204) ~[?:?]
	at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlPipelineRecordEmitter.getSchema(MySqlPipelineRecordEmitter.java:136) ~[?:?]
	at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlPipelineRecordEmitter.generateCreateTableEvent(MySqlPipelineRecordEmitter.java:242) ~[?:?]
	at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlPipelineRecordEmitter.<init>(MySqlPipelineRecordEmitter.java:84) ~[?:?]
	at org.apache.flink.cdc.connectors.mysql.source.MySqlDataSource.lambda$getEventSourceProvider$22720ae2$1(MySqlDataSource.java:55) ~[?:?]
	at org.apache.flink.cdc.connectors.mysql.source.MySqlSource.createReader(MySqlSource.java:188) ~[?:?]
	at org.apache.flink.streaming.api.operators.SourceOperator.initReader(SourceOperator.java:316) ~[flink-dist-1.20.0.jar:1.20.0]
	at org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask.init(SourceOperatorStreamTask.java:94) ~[flink-dist-1.20.0.jar:1.20.0]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:800) ~[flink-dist-1.20.0.jar:1.20.0]
	at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:771) ~[flink-dist-1.20.0.jar:1.20.0]
	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:970) ~[flink-dist-1.20.0.jar:1.20.0]
	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:939) ~[flink-dist-1.20.0.jar:1.20.0]
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:763) ~[flink-dist-1.20.0.jar:1.20.0]
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) ~[flink-dist-1.20.0.jar:1.20.0]
	at java.lang.Thread.run(Thread.java:842) ~[?:?]
This is the error raised when flink-cdc syncs MySQL to Doris.

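The cause is that the length passed to VarCharType is outside the allowed range of 1 to 2147483647 ("Variable character string length must be between 1 and 2147483647"). A varchar column normally declares a length, but some columns may have been declared with a length of 0. MySQL accepts that, while other systems do not. The stack trace shows the exception being thrown in MySqlTypeUtils.convertFromColumn, that is, while flink-cdc converts the MySQL column definition into its own VarCharType.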
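To make the failing condition concrete, here is a minimal Python sketch of the range check that the exception message describes. It is an illustrative stand-in and not flink-cdc's actual code: the function names and the constant are assumptions, and only the bounds and the message text come from the log above.

```python
# Illustrative stand-in, NOT flink-cdc source code: it mimics the range check
# that the exception message in the log describes.
MAX_VARCHAR_LENGTH = 2147483647  # upper bound quoted in the error message


def check_varchar_length(length: int) -> int:
    """Return length if it lies in [1, 2147483647], otherwise raise ValueError."""
    if length < 1 or length > MAX_VARCHAR_LENGTH:
        raise ValueError(
            "Variable character string length must be between 1 and "
            f"{MAX_VARCHAR_LENGTH} (both inclusive)."
        )
    return length


def is_valid_varchar_length(length: int) -> bool:
    """True when the declared length would pass the check above."""
    try:
        check_varchar_length(length)
        return True
    except ValueError:
        return False
```

With this check, a column declared as varchar(0) fails because 0 is below the lower bound, while any length from 1 up to 2147483647 passes.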

You can run the following SQL statement to find the affected columns:

SELECT
    TABLE_SCHEMA AS 'Database',
    TABLE_NAME AS 'Table',
    COLUMN_NAME AS 'Column',
    DATA_TYPE AS 'Data type',
    CHARACTER_MAXIMUM_LENGTH AS 'Max length',
    COLUMN_TYPE AS 'Declared type'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE = 'varchar'
AND CHARACTER_MAXIMUM_LENGTH = 0
ORDER BY TABLE_SCHEMA, TABLE_NAME;


In short, varchar(0) is fine in MySQL but causes problems when syncing to Doris.
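One possible follow-up is to widen the affected columns in MySQL. The sketch below is a hypothetical helper, not part of flink-cdc or MySQL tooling: it takes the (schema, table, column) triples returned by a query on INFORMATION_SCHEMA.COLUMNS and builds ALTER TABLE statements for review. The helper names and the default length of 1 are assumptions. Note that MODIFY COLUMN redefines the whole column, so any NOT NULL, DEFAULT, COMMENT, or character set clause has to be added back by hand before running a statement.

```python
from typing import Iterable, List, Tuple

# Each row mirrors three columns of the query result:
# (TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME).
Row = Tuple[str, str, str]


def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_alter_statements(rows: Iterable[Row], new_length: int = 1) -> List[str]:
    """Build one ALTER TABLE statement per varchar(0) column.

    The statements are a starting point only: they carry no NOT NULL,
    DEFAULT, or COMMENT clause, so review each one before executing it.
    """
    if new_length < 1:
        raise ValueError("new_length must be at least 1")
    statements = []
    for schema, table, column in rows:
        statements.append(
            f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
            f"MODIFY COLUMN {quote_ident(column)} varchar({new_length});"
        )
    return statements
```

For example, build_alter_statements([("twms", "orders", "remark")]) yields a single statement that redefines the column as varchar(1); the table and column names here are made up for illustration.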


2. The int type

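The int type runs into a similar error. Because the database was not created carefully, many columns are still declared as int(255). MySQL accepts this, but Doris has problems with it.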

SELECT
    TABLE_SCHEMA AS 'Database',
    TABLE_NAME AS 'Table',
    COLUMN_NAME AS 'Column',
    DATA_TYPE AS 'Data type',
    -- NULL for int columns; kept so the output matches the varchar query
    CHARACTER_MAXIMUM_LENGTH AS 'Max length',
    COLUMN_TYPE AS 'Declared type'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE = 'int'
AND TABLE_SCHEMA = 'twms'
AND COLUMN_TYPE = 'int(255)'
ORDER BY TABLE_SCHEMA, TABLE_NAME;

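COLUMN_TYPE = 'int(255)' is the problem case in our environment. An int with a display width greater than 21 probably already counts as a problem, but I have not verified the exact limit, so please check it against your own setup.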
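Since the query above matches only the exact string 'int(255)', other oversized widths would slip through. As a rough sketch, the COLUMN_TYPE values can instead be checked in Python by parsing out the display width. The function names are hypothetical, and the default threshold of 21 is only the unverified guess from the paragraph above, so it is exposed as a parameter.

```python
import re
from typing import Optional

# Matches declarations such as "int(255)" or "int(11) unsigned".
# It deliberately does not match "bigint(20)" or a bare "int".
_INT_WIDTH = re.compile(r"^int\((\d+)\)", re.IGNORECASE)


def int_display_width(column_type: str) -> Optional[int]:
    """Return the display width of an int COLUMN_TYPE, or None if absent."""
    match = _INT_WIDTH.match(column_type.strip())
    return int(match.group(1)) if match else None


def is_suspicious_int(column_type: str, max_width: int = 21) -> bool:
    """True when the declared width exceeds max_width.

    max_width defaults to 21, an unverified assumption taken from the text;
    adjust it once the real limit has been confirmed.
    """
    width = int_display_width(column_type)
    return width is not None and width > max_width
```

Running is_suspicious_int over the COLUMN_TYPE values returned for DATA_TYPE = 'int' flags int(255) but leaves a typical int(11) alone.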
