Import data

This document will guide you on how to import large amounts of data to MatrixOne.

Before you start

If you use the docker install, please make sure that you have a data directory mounted to the container. For example,

  1. docker run -d -p 6001:6001 -v ~/tmp/docker_loaddata_demo:/ssb-dbgen-path:rw --name matrixone matrixorigin/matrixone:0.5.1

This typical installation maps its local path ~/tmp/docker_loaddata_demo to a inner-container path /ssb-dbgen-path.

Method 1: Using the Source command in MySQL Client

You can execute an SQL script file using the source command or \. command in MySQL client.

  1. mysql> source file_name
  2. mysql> \. file_name

Usually source is used to execute many SQL statements, you can write your database&table creation, insert data in one SQL file and execute this file with source in MySQL Client. Each statement is a separate line, the lines starting with -- or wrapped with /* are considered as comments and will be ignored.

If your SQL is from mysqldump, please refer to this tutorial about how to modify the SQL file to adapt to MatrixOne format.

Method 2: Using the Load data command in MySQL Client

You can use Load Data to import data from big data files. Currently, MatrixOne only supports csv files.

  1. Before executing Load Data in MatrixOne, the table needs to be created in advance. For now, the data file is required to be at the same machine with MatrixOne server, a file transfer is necessary if they are in separate machines.

    Note

    If you are working with MatrixOne docker version, please put the data file in the directory mounted to the container, otherwise the container cann’t perceive.

  2. Launch the MySQL Client in the MatrixOne local server for accessing the local file system.

    1. mysql -h 127.0.0.1 -P 6001 -udump -p111
  3. Execute LOAD DATA with the corresponding file path in MySQL client.

    1. mysql> LOAD DATA INFILE '/tmp/xxx.csv'
    2. INTO TABLE xxxxxx
    3. FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY "\r\n";

    Note

    For the docker version, the file path needs to be in the mounted directory.

Example using Load data with docker version

We will walk through the whole process of loading data with MatrixOne 0.5.1 docker version.

  1. Download the dataset file and store the data in ~/tmp/docker_loaddata_demo/:

    1. cd ~/tmp/docker_loaddata_demo/
    2. wget https://community-shared-data-1308875761.cos.ap-beijing.myqcloud.com/lineorder_flat.tar.bz2
  2. Unzip the dataset:

    1. tar -jxvf lineorder_flat.tar.bz2
  3. Use Docker to launch MatrixOne, and mount the directory ~/tmp/docker_loaddata_demo/ that stores data files to a directory in the container. The container directory is /sb-dbgen-path as an example:

    1. docker run -d -p 6001:6001 -v ~/tmp/ docker_loaddata_demo:/ssb-dbgen-path:rw --name matrixone matrixorigin/matrixone:0.5.1
  4. Connect to MatrixOne server:

    1. mysql -h 127.0.0.1 -P 6001 -udump -p111
  5. Create lineorder_flat tables in MatrixOne, and import the dataset into MatriOne:

    1. mysql> create database if not exists ssb;
    2. mysql> use ssb;
    3. mysql> drop table if exists lineorder_flat;
    4. mysql> CREATE TABLE lineorder_flat(
    5. LO_ORDERKEY bigint key,
    6. LO_LINENUMBER int,
    7. LO_CUSTKEY int,
    8. LO_PARTKEY int,
    9. LO_SUPPKEY int,
    10. LO_ORDERDATE date,
    11. LO_ORDERPRIORITY char(15),
    12. LO_SHIPPRIORITY tinyint,
    13. LO_QUANTITY double,
    14. LO_EXTENDEDPRICE double,
    15. LO_ORDTOTALPRICE double,
    16. LO_DISCOUNT double,
    17. LO_REVENUE int unsigned,
    18. LO_SUPPLYCOST int unsigned,
    19. LO_TAX double,
    20. LO_COMMITDATE date,
    21. LO_SHIPMODE char(10),
    22. C_NAME varchar(25),
    23. C_ADDRESS varchar(25),
    24. C_CITY char(10),
    25. C_NATION char(15),
    26. C_REGION char(12),
    27. C_PHONE char(15),
    28. C_MKTSEGMENT char(10),
    29. S_NAME char(25),
    30. S_ADDRESS varchar(25),
    31. S_CITY char(10),
    32. S_NATION char(15),
    33. S_REGION char(12),
    34. S_PHONE char(15),
    35. P_NAME varchar(22),
    36. P_MFGR char(6),
    37. P_CATEGORY char(7),
    38. P_BRAND char(9),
    39. P_COLOR varchar(11),
    40. P_TYPE varchar(25),
    41. P_SIZE int,
    42. P_CONTAINER char(10)
    43. );
    44. mysql> load data infile '/ssb-dbgen-path/lineorder_flat.tbl' into table lineorder_flat FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';
  6. After the import is successful, you can run SQL statements to check the rows of imported data:

    1. select count(*) from lineorder_flat;
    2. /*
    3. expected results:
    4. */
    5. +----------+
    6. | count(*) |
    7. +----------+
    8. | 10272594 |
    9. +----------+