Pipeline Engine

This article describes how to configure, deploy, and use the pipeline engine (supported in version 1.1.0 and later).

Note: before compiling the pipeline engine, you need to compile the full Linkis project. Currently, the pipeline engine must be installed and deployed manually.

This engine plug-in is not included in the published installation and deployment package by default. You can deploy it by following this guide: https://linkis.apache.org/blog/2022/04/15/how-to-download-engineconn-plugin. Alternatively, compile and deploy it manually using the process below.

Compile the pipeline engine separately

  cd ${linkis_code_dir}/linkis-engineconn-plugins/engineconn-plugins/pipeline/
  mvn clean install

The engine package produced by the build above is located in:

  ${linkis_code_dir}/linkis-engineconn-plugins/engineconn-plugins/pipeline/target/out/pipeline

Upload the package to the engine plugin directory on the server:

  ${LINKIS_HOME}/lib/linkis-engineplugins

Then restart the Linkis engineplugin service to refresh the engine:

  cd ${LINKIS_HOME}/sbin
  sh linkis-daemon.sh restart cg-engineplugin
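The build, upload, and restart steps above can be combined into one helper. The following is a minimal sketch, not an official Linkis script. It assumes the build and the Linkis installation are on the same host (so the upload becomes a local copy) and that both arguments are absolute paths. The function name deploy_pipeline_engine and the DRY_RUN switch are hypothetical names introduced for this illustration.

```shell
# Sketch: build the pipeline engine plugin and deploy it on the same host.
# Set DRY_RUN=1 to print each command instead of running it.
# Arguments (hypothetical parameters of this helper, absolute paths):
#   $1  directory of the pipeline engine module in the Linkis source tree
#   $2  LINKIS_HOME of the target installation
deploy_pipeline_engine() (
  module_dir="$1"
  linkis_home="$2"
  run=""
  if [ -n "${DRY_RUN:-}" ]; then run="echo"; fi

  # Build the plugin; the package is written to target/out/pipeline.
  $run cd "$module_dir" || exit 1
  $run mvn clean install || exit 1

  # Copy the package into the engine plugin directory (assumes a local install).
  $run cp -r "${module_dir}/target/out/pipeline" \
    "${linkis_home}/lib/linkis-engineplugins/" || exit 1

  # Restart the engine plugin service so it picks up the new engine.
  $run cd "${linkis_home}/sbin" || exit 1
  $run sh linkis-daemon.sh restart cg-engineplugin
)
```

Running the helper with DRY_RUN=1 first lets you check the resolved paths before anything is built or copied. The function body runs in a subshell, so the cd calls do not change the caller's working directory.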

Alternatively, refresh through the engine interface. After the engine is placed in the corresponding directory, send a refresh request to the Linkis cg-engineplugin service through its HTTP interface.

  • Interface: http://${engineconn-plugin-server-IP}:${port}/api/rest_j/v1/rpc/receiveAndReply

  • Request method: POST

  {
    "method": "/enginePlugin/engineConn/refreshAll"
  }
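As an illustration, the refresh request can be sent with curl. This is a sketch under assumptions: the function names refresh_url and refresh_engine_plugins are hypothetical, the host and port in the usage comment are placeholders, and the Content-Type header is an assumption that the source does not state.

```shell
# Sketch: trigger an engine plugin refresh over HTTP.
# $1 = engineconn-plugin server IP, $2 = port (both supplied by the caller).
refresh_url() {
  printf 'http://%s:%s/api/rest_j/v1/rpc/receiveAndReply\n' "$1" "$2"
}

refresh_engine_plugins() {
  # POST the refreshAll method as a JSON body.
  # The Content-Type header is an assumption, not taken from the source.
  curl -s -X POST "$(refresh_url "$1" "$2")" \
    -H 'Content-Type: application/json' \
    -d '{"method": "/enginePlugin/engineConn/refreshAll"}'
}

# Usage (placeholder address):
#   refresh_engine_plugins 127.0.0.1 9103
```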

Check whether the engine was refreshed successfully: if you run into problems during the refresh and need to confirm whether it succeeded, check whether last_update_time in the linkis_cg_engine_conn_plugin_bml_resources table matches the time the refresh was triggered.

  -- Log in to the Linkis database, then run:
  select * from linkis_cg_engine_conn_plugin_bml_resources;
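To make the most recent refresh easier to spot, the rows can be sorted by last_update_time. The sketch below only builds the query text. The function name refresh_check_sql is hypothetical, and the connection settings and database name in the usage comment are placeholders.

```shell
# Sketch: print a query that lists the most recently updated plugin resources.
# $1 = number of rows to show (defaults to 5).
refresh_check_sql() {
  printf 'SELECT * FROM linkis_cg_engine_conn_plugin_bml_resources ORDER BY last_update_time DESC LIMIT %d;\n' "${1:-5}"
}

# Usage (placeholder connection settings and database name):
#   mysql -h 127.0.0.1 -u linkis -p linkis -e "$(refresh_check_sql 5)"
```

If the refresh worked, the top rows should show a last_update_time close to the moment the refresh was triggered.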

Linkis 1.x manages engines through labels, so the corresponding label data must be inserted into the database. The insertion method is described in:

EngineConnPlugin Engine Plug-in Installation

Linkis 1.0 provides a CLI for submitting tasks. You only need to specify the corresponding EngineConn and CodeType label types. The pipeline engine is used as follows:

  • Note that the -engineType value is the engine name followed by the engine version. If the pipeline version is 1, set it to pipeline-1.
  sh bin/linkis-cli -submitUser hadoop -engineType pipeline-1 -codeType pipeline -code "from hdfs:///000/000/000/A.dolphin to file:///000/000/000/B.csv"
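When many files need to be transferred, it can help to wrap the command in a small function. This is a minimal sketch that reuses only the flags shown above. The function names pipeline_code and submit_pipeline_job are hypothetical, and the wrapper assumes it is run from the Linkis installation directory so that bin/linkis-cli resolves.

```shell
# Sketch: build the pipeline statement from a source and a destination path.
pipeline_code() {
  printf 'from %s to %s' "$1" "$2"
}

# Sketch: submit the statement through linkis-cli.
# $1 = submit user, $2 = source path, $3 = destination path.
submit_pipeline_job() {
  sh bin/linkis-cli -submitUser "$1" -engineType pipeline-1 \
    -codeType pipeline -code "$(pipeline_code "$2" "$3")"
}

# Usage:
#   submit_pipeline_job hadoop hdfs:///000/000/000/A.dolphin file:///000/000/000/B.csv
```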

The statement from hdfs:///000/000/000/A.dolphin to file:///000/000/000/B.csv is explained in the syntax rules later in this article.

For specific use, please refer to: Linkis CLI Manual.

Because the pipeline engine is mainly used to import and export files, the examples below assume the most common case: importing a file from A to B.

Right-click the workspace module and create a new script of type storage.

Pipeline Engine - Figure 1

Syntax and file copy rules: files with the .dolphin suffix are result set files and can be converted to .csv or .xlsx files. Files of other types can only be copied from address A to address B, which is referred to as transport.

  # dolphin type
  from hdfs:///000/000/000/A.dolphin to file:///000/000/000/B.csv
  from hdfs:///000/000/000/A.dolphin to file:///000/000/000/B.xlsx
  # Other types
  from hdfs:///000/000/000/A.txt to file:///000/000/000/B.txt
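The rule above can be expressed as a small classifier, which is handy for checking a transfer list before submitting it. This is only a sketch of the rule as stated in this article, not the engine's own logic. The function name pipeline_action is hypothetical, and the case-sensitive suffix matching is an assumption.

```shell
# Sketch: report whether a transfer would be a conversion or a plain copy.
# $1 = source path, $2 = destination path.
pipeline_action() {
  case "$1" in
    *.dolphin)
      case "$2" in
        # Result set files can be converted to .csv or .xlsx.
        *.csv|*.xlsx) echo convert; return ;;
      esac ;;
  esac
  # Everything else is copied as-is.
  echo copy
}

# Usage:
#   pipeline_action hdfs:///000/000/000/A.dolphin file:///000/000/000/B.xlsx
```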

Script for importing file A into folder B

  from hdfs:///000/000/000/A.csv to file:///000/000/B/
  • from, to: syntax keywords
  • hdfs:///000/000/000/A.csv: Input file path
  • file:///000/000/B/: Output path

Script for exporting file B as file A

  from hdfs:///000/000/000/B.csv to file:///000/000/000/A.CSV
  • hdfs:///000/000/000/B.csv: Input file path
  • file:///000/000/000/A.CSV: Output file path

Pipeline Engine - Figure 2

Note: do not end the statement with a semicolon; otherwise, the syntax is incorrect.
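A quick local check can catch a trailing semicolon before a script is submitted. The following is a minimal sketch; the function name check_pipeline_statement is hypothetical, and the shape test (from ... to ...) is a simplification of whatever the engine's parser actually accepts.

```shell
# Sketch: reject statements that end with a semicolon or lack the
# "from <source> to <destination>" shape.
check_pipeline_statement() {
  case "$1" in
    *\;)
      echo "error: statement must not end with a semicolon" >&2
      return 1 ;;
    "from "*" to "*)
      return 0 ;;
    *)
      echo "error: expected: from <source> to <destination>" >&2
      return 1 ;;
  esac
}

# Usage:
#   check_pipeline_statement "from hdfs:///000/000/000/A.csv to file:///000/000/B/" && echo ok
```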

Progress

Pipeline Engine - Figure 3

History

Pipeline Engine - Figure 4