ETL - Import from RDBMS

Most of DBMSs support JDBC driver. All you need is to gather the JDBC driver and put it in classpath or simply in the $ORIENTDB_HOME/lib directory.

With the configuration below all the records from the table “Client” are imported in OrientDB from MySQL database.

Example importing a flat table

  1. {
  2. "config": {
  3. "log": "debug"
  4. },
  5. "extractor" : {
  6. "jdbc": { "driver": "com.mysql.jdbc.Driver",
  7. "url": "jdbc:mysql://localhost/mysqlcrm",
  8. "userName": "root",
  9. "userPassword": "",
  10. "query": "select * from Client" }
  11. },
  12. "transformers" : [
  13. { "vertex": { "class": "Client"} }
  14. ],
  15. "loader" : {
  16. "orientdb": {
  17. "dbURL": "plocal:/temp/databases/orientdbcrm",
  18. "dbAutoCreate": true
  19. }
  20. }
  21. }

Example loading records from 2 connected tables

With this example we want to import a database that contains Blog posts in the following tables:

  • Authors, in TABLE Author, with the following columns: id and name
  • Posts, in TABLE Post, with the following columns: author_id, title and text

To import them into OrientDB we’d need 2 ETL processes.

Importing of Authors

  1. {
  2. "config": {
  3. "log": "debug"
  4. },
  5. "extractor" : {
  6. "jdbc": { "driver": "com.mysql.jdbc.Driver",
  7. "url": "jdbc:mysql://localhost/mysql",
  8. "userName": "root",
  9. "userPassword": "",
  10. "query": "select * from Author" }
  11. },
  12. "transformers" : [
  13. { "vertex": { "class": "Author"} }
  14. ],
  15. "loader" : {
  16. "orientdb": {
  17. "dbURL": "plocal:/temp/databases/orientdb",
  18. "dbAutoCreate": true
  19. }
  20. }
  21. }

Importing of Posts

  1. {
  2. "config": {
  3. "log": "debug"
  4. },
  5. "extractor" : {
  6. "jdbc": { "driver": "com.mysql.jdbc.Driver",
  7. "url": "jdbc:mysql://localhost/mysql",
  8. "userName": "root",
  9. "userPassword": "",
  10. "query": "select * from Post" }
  11. },
  12. "transformers" : [
  13. { "vertex": { "class": "Post"} },
  14. { "edge": { "class": "Wrote", "direction" : "in",
  15. "joinFieldName": "author_id",
  16. "lookup":"Author.id", "unresolvedLinkAction":"CREATE"} }
  17. ],
  18. "loader" : {
  19. "orientdb": {
  20. "dbURL": "plocal:/temp/databases/orientdb",
  21. "dbAutoCreate": true
  22. }
  23. }
  24. }

Note the edge configuration has the direction as “in”, that means starts from the Author and finishes to Post.