Configuration document

Compatibility

  1. Compatible with v0.3.2 and later.

Document structure

  • config/application.properties
  1. Spring Boot configuration file. It sets, among other things, the active runtime environment (spring.profiles.active). For details, refer to the official Spring Boot documentation.
  2. The above parameters can also be specified via the startup.sh script, for example: startup.sh --logging.level.root=debug
  • config/application-${env}.properties
  1. Spring Boot configuration file for a specific runtime environment (${env}).
  • config/tasks/${env}/*.properties
  1. Task configuration file
  2. Tasks for different environments (${env}) are configured in different subdirectories of the config/tasks directory.
  • Log generation directory
  1. Spring Boot configuration parameter. Default : logging.file=${app.home}/logs/data-porter.log
  2. ${app.home} is set in the startup script and refers to the root directory of datas-porter.
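
As a rough illustration of how the files above fit together, a minimal config/application.properties might look like the sketch below. The profile name "dev" is an assumption chosen for the example; with it, Spring Boot would also load config/application-dev.properties, and the task files would be expected under config/tasks/dev/.

```properties
# config/application.properties (sketch; "dev" is a hypothetical environment name)
spring.profiles.active=dev

# Root log level; the same setting can be passed as
# startup.sh --logging.level.root=debug
logging.level.root=debug

# Log file location, shown here with the documented default value
logging.file=${app.home}/logs/data-porter.log
```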

Node number

  • porter.id
  1. Specifies the node number, which must be unique within the distributed environment. The node uses it to identify itself, to report heartbeats to ZooKeeper, and to implement distributed locks.
  2. e.g.
  3. porter.id=100

Statistic

  • porter.statistic.upload
  1. Whether to upload statistics, including logs and TPS metrics.
  2. Type : Boolean

Cluster

porter.cluster

  1. Distributed cluster implementation. Currently only ZooKeeper is supported.
  • porter.cluster.strategy
  1. Specifies the implementation strategy of the distributed cluster.
  2. e.g.
  3. porter.cluster.strategy=ZOOKEEPER
  • porter.cluster.client.url
  1. Cluster connection parameters.
  2. e.g.
  3. porter.cluster.client.url=127.0.0.1:2181
  • porter.cluster.client.sessionTimeout
  1. Cluster session timeout, in milliseconds.
  2. e.g.
  3. porter.cluster.client.sessionTimeout=timeout in milliseconds
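
Put together, the cluster settings above could be written as in the following sketch. The ZooKeeper address is the one used in this document; the 30000 ms timeout is only an assumed value for illustration, not a documented default.

```properties
# Cluster settings (sketch)
porter.cluster.strategy=ZOOKEEPER
porter.cluster.client.url=127.0.0.1:2181
# Assumed value: 30 seconds, expressed in milliseconds
porter.cluster.client.sessionTimeout=30000
```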

Alert

porter.alert

  1. Alert strategy driver. Currently only email is supported.
  • porter.alert.strategy
  1. Specifies the alert mode.
  2. e.g.
  3. porter.alert.strategy=EMAIL
  • porter.alert.frequencyOfSeconds
  1. How often, in seconds, a notification with the same content is sent.
  2. e.g.
  3. porter.alert.frequencyOfSeconds=60
  • porter.alert.client
  1. porter.alert.client.host=smtp server
  2. porter.alert.client.username=mail address
  3. porter.alert.client.password=password
  4. porter.alert.client.smtpAuth=true
  5. porter.alert.client.smtpStarttlsEnable=true
  6. porter.alert.client.smtpStarttlsRequired=false
  • porter.alert.receiver
  1. Global alert recipients.
  2. Type : ArrayList
  3. porter.alert.receiver[index].realName=name
  4. porter.alert.receiver[index].email=mail address
  5. porter.alert.receiver[index].phone=phone number
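
A complete alert section might look like the sketch below. All host names, addresses, names, and the password are placeholders invented for the example; [index] is replaced by a concrete list position starting at 0, which is how Spring Boot binds list properties.

```properties
# Alert settings (sketch; every value below is a placeholder)
porter.alert.strategy=EMAIL
porter.alert.frequencyOfSeconds=60

# SMTP client
porter.alert.client.host=smtp.example.com
porter.alert.client.username=alerts@example.com
porter.alert.client.password=changeit
porter.alert.client.smtpAuth=true
porter.alert.client.smtpStarttlsEnable=true
porter.alert.client.smtpStarttlsRequired=false

# Global recipients, one list entry per person
porter.alert.receiver[0].realName=Alice
porter.alert.receiver[0].email=alice@example.com
porter.alert.receiver[0].phone=0000000000
porter.alert.receiver[1].realName=Bob
porter.alert.receiver[1].email=bob@example.com
porter.alert.receiver[1].phone=0000000001
```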

Public connection pool

  1. Configure this when the sources and targets of multiple tasks share one database connection pool, which saves database connections.
  2. Type : Map

porter.source.${name}

  1. ${name} stands for a name you choose for the pool. Tasks reference it through sourceName.
  • porter.source.${name}.sourceType
  1. Source type
  2. Type : Enum
  3. Allowed values : JDBC
  • porter.source.${name}.url
  1. JDBC URL. Separate multiple URLs with ",".
  2. e.g.
  3. porter.source.${name}.url=jdbc link
  • porter.source.${name}.userName
  1. Account name
  2. e.g.
  3. porter.source.${name}.userName=account name
  • porter.source.${name}.password
  1. Password
  2. e.g.
  3. porter.source.${name}.password=password
  • porter.source.${name}.maxWait
  • porter.source.${name}.minPoolSize
  • porter.source.${name}.maxPoolSize
  • porter.source.${name}.initialPoolSize
  • porter.source.${name}.connectionErrorRetryAttempts
  1. Number of retries on connection errors.
  2. Type : Int
  • porter.source.${name}.dbType
  1. Type : Enum
  2. Allowed values : MYSQL, ORACLE
  • porter.source.${name}.makePrimaryKeyWhenNo (added in 2.0.1)
  1. Type : Boolean
  2. When the target table has no primary key, all fields together are treated as the primary key.
  3. Default : true
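
The sketch below defines one shared pool. The pool name "demo", the JDBC URL, the credentials, and the pool sizes are all assumptions made for illustration. maxWait is left out because this document does not state its unit.

```properties
# Public connection pool named "demo" (hypothetical name and values)
porter.source.demo.sourceType=JDBC
porter.source.demo.dbType=MYSQL
porter.source.demo.url=jdbc:mysql://127.0.0.1:3306/demo
porter.source.demo.userName=demo_user
porter.source.demo.password=changeit

# Pool sizing and retry behaviour (illustrative values)
porter.source.demo.initialPoolSize=1
porter.source.demo.minPoolSize=1
porter.source.demo.maxPoolSize=10
porter.source.demo.connectionErrorRetryAttempts=3

# Documented default, written out explicitly
porter.source.demo.makePrimaryKeyWhenNo=true
```

A task can then point at this pool by name instead of repeating the connection details.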

Task configuration

porter.task

  1. Node task
  2. Type : ArrayList
  • porter.task[index].taskId
  1. Task number
  2. Type : String
  3. e.g.
  4. porter.task[index].taskId=1
  • porter.task[index].receiver
  1. Alert recipients for the current task.
  2. Task alerts are sent to both porter.task[index].receiver and porter.alert.receiver.
  3. Type : ArrayList
  4. porter.task[index].receiver[index].realName=name
  5. porter.task[index].receiver[index].email=mail address
  6. porter.task[index].receiver[index].phone=phone number
  • porter.task[index].consumer
  1. Consumer (source side) configuration of the task.
  2. Type : DataConsumerConfig
  • porter.task[index].consumer.consumerName
  1. Consumer plugin
  2. Type : String
  3. Allowed values : CanalFetch, KafkaFetch
  • porter.task[index].consumer.converter
  1. Message converter
  2. Type : String
  3. Allowed values : canalRow (added in 1.0), oggJson
  • porter.task[index].consumer.source
  1. Consumer data source
  2. Type : Map
  3. CanalFetch (added in 1.0):
  4. porter.task[index].consumer.source.sourceType=CANAL
  5. porter.task[index].consumer.source.slaveId=mysql slaveId
  6. porter.task[index].consumer.source.address=ip:port
  7. porter.task[index].consumer.source.database=database
  8. porter.task[index].consumer.source.username=account
  9. porter.task[index].consumer.source.password=password
  10. porter.task[index].consumer.source.filter=regular expression for the subscribed tables
  11. KafkaFetch:
  12. porter.task[index].consumer.source.sourceType=KAFKA
  13. porter.task[index].consumer.source.servers=ip:port,ip:port
  14. porter.task[index].consumer.source.topics=topic
  15. porter.task[index].consumer.source.group=consumer group
  16. porter.task[index].consumer.source.autoCommit=true|false
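
For a concrete picture, a Kafka-based consumer for the first task could be configured as sketched below. The broker addresses, topic, and group name are placeholders, and pairing KafkaFetch with the oggJson converter is an assumption; this document lists both values but does not say which converter goes with which plugin.

```properties
# Consumer for task 0 reading from Kafka (sketch; values are placeholders)
porter.task[0].taskId=1
porter.task[0].consumer.consumerName=KafkaFetch
# Assumed pairing of converter and plugin
porter.task[0].consumer.converter=oggJson

porter.task[0].consumer.source.sourceType=KAFKA
porter.task[0].consumer.source.servers=127.0.0.1:9092,127.0.0.2:9092
porter.task[0].consumer.source.topics=demo_topic
porter.task[0].consumer.source.group=demo_group
porter.task[0].consumer.source.autoCommit=false
```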
  • porter.task[index].consumer.metaSource
  1. Metadata data source
  2. Type : Map
  3. If this is not configured, data is not compared between the source and the target (new rule in 1.1).
  4. Form 1:
  5. porter.task[index].consumer.metaSource.sourceName=public data source name
  6. Form 2:
  7. porter.task[index].consumer.metaSource.dbType
  8. porter.task[index].consumer.metaSource.url
  9. porter.task[index].consumer.metaSource.userName
  10. porter.task[index].consumer.metaSource.password
  11. porter.task[index].consumer.metaSource.maxWait
  12. porter.task[index].consumer.metaSource.minPoolSize
  13. porter.task[index].consumer.metaSource.maxPoolSize
  14. porter.task[index].consumer.metaSource.initialPoolSize
  15. porter.task[index].consumer.metaSource.connectionErrorRetryAttempts
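
The two forms can be contrasted as follows. The pool name "demo" and the connection values are hypothetical, and treating the two forms as alternatives (one or the other, not both) is an assumption, which is why Form 2 is commented out here.

```properties
# Form 1: reference a public data source by its name ("demo" is hypothetical)
porter.task[0].consumer.metaSource.sourceName=demo

# Form 2: define the connection inline instead (placeholder values)
#porter.task[0].consumer.metaSource.dbType=MYSQL
#porter.task[0].consumer.metaSource.url=jdbc:mysql://127.0.0.1:3306/demo
#porter.task[0].consumer.metaSource.userName=demo_user
#porter.task[0].consumer.metaSource.password=changeit
```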
  • porter.task[index].consumer.eventProcessor.className (added in 1.1)
  1. Custom extractor for synchronized data.
  2. Format : package.className
  • porter.task[index].consumer.eventProcessor.content (added in 1.1)
  1. Class path, JAR path, or source content.
  • porter.task[index].consumer.eventProcessor.emptyFetchNoticeSpan (added in 2.0.1)
  1. Interval between empty-fetch notifications, in seconds.
  2. Default : 3600
  • porter.task[index].consumer.eventProcessor.emptyFetchThreshold (added in 2.0.1)
  1. Time threshold for empty-fetch notifications, in seconds.
  2. -1 disables it. Default : 3600
  • porter.task[index].loader
  1. Task loader configuration
  2. Type : DataLoaderConfig
  • porter.task[index].loader.loaderName
  1. Target loader plugin
  2. Type : Enum
  3. Allowed values : JdbcBatch, JdbcSingle
  • porter.task[index].loader.source
  1. Target data source
  2. Type : Map
  3. Form 1:
  4. porter.task[index].loader.source.sourceName=public data source name
  5. Form 2:
  6. porter.task[index].loader.source.dbType
  7. porter.task[index].loader.source.url
  8. porter.task[index].loader.source.userName
  9. porter.task[index].loader.source.password
  10. porter.task[index].loader.source.maxWait
  11. porter.task[index].loader.source.minPoolSize
  12. porter.task[index].loader.source.maxPoolSize
  13. porter.task[index].loader.source.initialPoolSize
  14. porter.task[index].loader.source.connectionErrorRetryAttempts
  • porter.task[index].loader.insertOnUpdateError (added in 2.0)
  1. Switch that controls whether the row is inserted when an update on the target fails. Enabled by default.
  2. Type : Boolean
  • porter.task[index].mapper
  1. Mapping between source and target schemas, used when their naming differs.
  2. Type : List
  • porter.task[index].mapper[index].schema
  1. porter.task[index].mapper[index].schema=source schema,target schema
  • porter.task[index].mapper[index].table
  1. porter.task[index].mapper[index].table=source table name,target table name
  • porter.task[index].mapper[index].updateDate
  1. If this is missing or misconfigured, the data synchronization result check is not enabled.
  2. porter.task[index].mapper[index].updateDate=auto-updated time field of the source table,auto-updated time field of the target table
  • porter.task[index].mapper[index].column
  1. Field mapping. Optional.
  2. porter.task[index].mapper[index].column.source field name=target field name
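
To show how the loader and mapper settings combine, the sketch below writes task 0 into a shared pool and renames one schema, one table, and one column along the way. The pool name "demo" and all schema, table, and field names are invented for the example. Each mapper entry takes its own list position, independent of the task index.

```properties
# Loader for task 0 (sketch; "demo" is a hypothetical public data source)
porter.task[0].loader.loaderName=JdbcBatch
porter.task[0].loader.source.sourceName=demo
# Documented as enabled by default; written out explicitly here
porter.task[0].loader.insertOnUpdateError=true

# First mapping entry: source value first, target value second
porter.task[0].mapper[0].schema=src_db,dst_db
porter.task[0].mapper[0].table=t_order,t_order_copy
porter.task[0].mapper[0].updateDate=update_time,update_time
# Source field order_no is written to target field order_number
porter.task[0].mapper[0].column.order_no=order_number
```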