Dump file format

This description uses the same conventions as the protocol description.

The dump file format is not final and is subject to change before EdgeDB 1.0.

General Structure

A dump file is structured as follows:

  1. Dump file format marker \xFF\xD8\x00\x00\xD8EDGEDB\x00DUMP\x00 (17 bytes)

  2. Format version number \x00\x00\x00\x00\x00\x00\x00\x01 (8 bytes)

  3. Header block

  4. Any number of data blocks
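The first two items of this layout can be checked before any block is read. The following is a minimal sketch in Python, not a reference implementation: the function name read_preamble is hypothetical, and it assumes the 8-byte version number is an unsigned big-endian integer, which matches the bytes shown above.

```python
import io
import struct

# Marker bytes copied from item 1 of the list above (17 bytes).
DUMP_MARKER = b"\xFF\xD8\x00\x00\xD8EDGEDB\x00DUMP\x00"


def read_preamble(stream):
    """Consume the format marker and version number; return the version.

    `read_preamble` is a hypothetical helper name, not part of any
    EdgeDB client library.
    """
    marker = stream.read(len(DUMP_MARKER))
    if marker != DUMP_MARKER:
        raise ValueError("not a dump file: bad format marker")
    raw_version = stream.read(8)
    if len(raw_version) != 8:
        raise ValueError("truncated dump file: missing version number")
    # Assumption: the version is an unsigned 64-bit big-endian integer.
    (version,) = struct.unpack(">Q", raw_version)
    return version
```

After a successful call the stream is positioned at the start of the header block.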

General Dump Block

Both header and data blocks are formatted as follows:

  struct DumpHeader {
    int8 mtype;

    // SHA1 hash sum of block data
    byte sha1sum[20];

    // Length of message contents in bytes,
    // including self.
    int32 message_length;

    // Block data. Should be treated in opaque way by a client.
    byte data[message_length];
  }

Upon receiving a protocol dump data message, the dump client should:

  • Replace packet type:

    • @ (0x40) → H (0x48)

    • = (0x3d) → D (0x44)

  • Write the SHA1 checksum of the block data immediately after the new type byte

  • Append the rest of the dump protocol message, that is, everything except its first byte (the message type).
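The conversion steps above can be sketched as a single function. This is an illustration under stated assumptions rather than the actual client code: the name to_dump_block is hypothetical, and the sketch assumes the SHA1 checksum covers the message contents that follow the 4-byte length field. The text only says "block data", so the exact range being hashed should be confirmed against a real client.

```python
import hashlib

# Protocol message type -> dump file block type, as listed above.
TYPE_MAP = {0x40: 0x48, 0x3D: 0x44}  # '@' -> 'H', '=' -> 'D'


def to_dump_block(message: bytes) -> bytes:
    """Turn one protocol dump message into one dump file block.

    `to_dump_block` is a hypothetical helper name.
    """
    if len(message) < 5:
        raise ValueError("message too short to hold a type and a length")
    new_type = TYPE_MAP.get(message[0])
    if new_type is None:
        raise ValueError(f"unexpected message type: {message[0]:#04x}")
    # Everything except the first byte: length field plus contents.
    rest = message[1:]
    # Assumption: the checksum covers the contents after the 4-byte length.
    digest = hashlib.sha1(rest[4:]).digest()
    return bytes([new_type]) + digest + rest
```

The result follows the general block layout: one type byte, 20 checksum bytes, then the original length field and contents unchanged.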

Header Block

Format:

  struct DumpHeader {
    // Message type ('H')
    int8 mtype = 0x48;

    // SHA1 hash sum of block data
    byte sha1sum[20];

    // Length of message contents in bytes,
    // including self.
    int32 message_length;

    // A set of message headers.
    Headers headers;

    // Protocol version of the dump
    int16 major_ver;
    int16 minor_ver;

    // Schema data
    string schema_ddl;

    // Type identifiers
    int32 num_types;
    TypeInfo types[num_types];

    // Object descriptors
    int32 num_descriptors;
    ObjectDesc descriptors[num_descriptors];
  };

  struct TypeInfo {
    string type_name;
    string type_class;
    byte type_id[16];
  }

  struct ObjectDesc {
    byte object_id[16];
    bytes description;
    int16 num_dependencies;
    byte dependency_id[num_dependencies][16];
  }
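As an illustration of how the nested structures might be decoded, the sketch below parses a single TypeInfo record from a byte buffer. It is not taken from any client implementation. The names TypeInfo (as a Python class), _read_string and parse_type_info are hypothetical, and the sketch assumes that a string is encoded as a 32-bit big-endian length followed by UTF-8 bytes, which is how the protocol conventions are understood here.

```python
import struct
import uuid
from typing import NamedTuple, Tuple


class TypeInfo(NamedTuple):
    type_name: str
    type_class: str
    type_id: uuid.UUID


def _read_string(buf: bytes, pos: int) -> Tuple[str, int]:
    """Read one length-prefixed string; return it and the next offset."""
    # Assumption: 32-bit big-endian length, then that many UTF-8 bytes.
    (length,) = struct.unpack_from(">I", buf, pos)
    start = pos + 4
    end = start + length
    if end > len(buf):
        raise ValueError("truncated string")
    return buf[start:end].decode("utf-8"), end


def parse_type_info(buf: bytes, pos: int = 0) -> Tuple[TypeInfo, int]:
    """Parse one TypeInfo record starting at `pos`."""
    type_name, pos = _read_string(buf, pos)
    type_class, pos = _read_string(buf, pos)
    if pos + 16 > len(buf):
        raise ValueError("truncated type_id")
    type_id = uuid.UUID(bytes=buf[pos:pos + 16])
    return TypeInfo(type_name, type_class, type_id), pos + 16
```

Returning the next offset alongside the record lets a caller loop num_types times over the same buffer.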

Known headers:

  • 101 BLOCK_TYPE – block type, always “I”

  • 102 SERVER_TIME – server time when the dump started, as a stringified floating-point Unix timestamp

  • 103 SERVER_VERSION – full server version, as a string

Data Block

Format:

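  struct DumpBlock {
    // Message type ('D'); the protocol message type '=' (0x3d)
    // is replaced when the block is written to the dump file.
    int8 mtype = 0x44;

    // SHA1 hash sum of block data
    byte sha1sum[20];

    // Length of message contents in bytes,
    // including self.
    int32 message_length;

    // A set of message headers.
    Headers headers;
  }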

Known headers:

  • 101 BLOCK_TYPE – block type, always “D”

  • 110 BLOCK_ID – block identifier (a 16-byte UUID)

  • 111 BLOCK_NUM – block index, as a stringified integer

  • 112 BLOCK_DATA – the actual block data
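Since a data block consists entirely of headers, reading one comes down to decoding the header set and picking out the known codes. The sketch below shows one way to do that. It is hedged on an assumption about the Headers encoding taken from the protocol conventions: a 16-bit count followed by entries of a 16-bit code and a 32-bit length-prefixed value, all big-endian. The function names parse_headers and describe_data_block are hypothetical.

```python
import struct
import uuid
from typing import Dict, Tuple

# Header codes from the list above.
BLOCK_TYPE = 101
BLOCK_ID = 110
BLOCK_NUM = 111
BLOCK_DATA = 112


def parse_headers(buf: bytes, pos: int = 0) -> Dict[int, bytes]:
    """Decode a header set into a {code: raw value} mapping."""
    # Assumption: 16-bit count, then (16-bit code, 32-bit length, value).
    (count,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    headers = {}
    for _ in range(count):
        code, length = struct.unpack_from(">HI", buf, pos)
        pos += 6
        value = buf[pos:pos + length]
        if len(value) != length:
            raise ValueError("truncated header value")
        headers[code] = value
        pos += length
    return headers


def describe_data_block(headers: Dict[int, bytes]) -> Tuple[uuid.UUID, int, bytes]:
    """Return (block id, block number, block data) for a data block."""
    if headers.get(BLOCK_TYPE) != b"D":
        raise ValueError("not a data block")
    block_id = uuid.UUID(bytes=headers[BLOCK_ID])
    block_num = int(headers[BLOCK_NUM].decode("ascii"))
    return block_id, block_num, headers[BLOCK_DATA]
```

Unknown header codes are kept in the mapping and ignored by describe_data_block, so headers added in a later format version would not break this reader.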