TsFile Hierarchy

Here is a brief introduction to the structure of a TsFile.

Variable Storage

  • Big Endian

    • For example, the int 0x8 is stored as 00 00 00 08, not 08 00 00 00.
  • String with Variable Length

    • The format is an int size followed by the String literal. The size can be zero.

    • The size is the number of bytes the string occupies, which may differ from the number of characters in the string.

    • For example, “sensor_1” is stored as 00 00 00 08 followed by the ASCII encoding of “sensor_1”.

    • Note that for the “Magic String” (file signature) “TsFilev0.8.0”, the size (12) and the encoding (ASCII) are fixed, so the size is not stored before this string literal.

  • Data Type Hardcode

    • 0: BOOLEAN
    • 1: INT32 (int)
    • 2: INT64 (long)
    • 3: FLOAT
    • 4: DOUBLE
    • 5: TEXT (String)
  • Encoding Type Hardcode

    • 0: PLAIN
    • 1: PLAIN_DICTIONARY
    • 2: RLE
    • 3: DIFF
    • 4: TS_2DIFF
    • 5: BITMAP
    • 6: GORILLA
    • 7: REGULAR
  • Compression Type Hardcode

    • 0: UNCOMPRESSED
    • 1: SNAPPY
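
The storage rules above can be sketched in a few lines of Python. This is a minimal illustration, not the TsFile implementation: the function names and the DataType enum class are made up for this example, and UTF-8 is assumed as the string encoding (it is identical to ASCII for ASCII-only text such as “sensor_1”).

```python
import struct
from enum import IntEnum


class DataType(IntEnum):
    """Data type codes as listed above (class name is illustrative)."""
    BOOLEAN = 0
    INT32 = 1
    INT64 = 2
    FLOAT = 3
    DOUBLE = 4
    TEXT = 5


def encode_int(value: int) -> bytes:
    """Encode a 4-byte signed integer in big-endian order."""
    return struct.pack(">i", value)


def encode_string(text: str) -> bytes:
    """Encode a string as a 4-byte big-endian byte count followed by the bytes."""
    raw = text.encode("utf-8")  # assumption: UTF-8 encoding
    return encode_int(len(raw)) + raw


def decode_string(buf: bytes, offset: int = 0):
    """Return (text, next_offset) for a length-prefixed string at offset."""
    (size,) = struct.unpack_from(">i", buf, offset)
    start = offset + 4
    return buf[start:start + size].decode("utf-8"), start + size


print(encode_int(0x8).hex(" "))            # 00 00 00 08
print(encode_string("sensor_1").hex(" "))  # 00 00 00 08 73 65 6e 73 6f 72 5f 31
```

Because the size prefix counts bytes rather than characters, a string containing non-ASCII characters gets a prefix larger than its character count under this assumed encoding.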

TsFile Overview

[Figure: overview of the TsFile structure]

TsFile Breakdown

Magic String

There is a 12-byte magic string:

TsFilev0.8.0

It appears at both the beginning and the end of a TsFile as the file signature.
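
A quick signature check could look like the following sketch. The function name is hypothetical, and the check only looks at the first and last 12 bytes; it says nothing about whether the content in between is well formed.

```python
MAGIC = b"TsFilev0.8.0"  # 12 ASCII bytes, stored without a length prefix


def has_valid_magic(data: bytes) -> bool:
    """Check that the buffer starts and ends with the magic string."""
    # A complete file holds two separate copies: one at the head, one at the tail.
    if len(data) < 2 * len(MAGIC):
        return False
    return data.startswith(MAGIC) and data.endswith(MAGIC)
```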

Data

The content of a TsFile can be divided into two parts: data and metadata. A byte 0x02 serves as the marker between data and metadata.

The data section is an array of ChunkGroup. Each ChunkGroup represents a device.

ChunkGroup

A ChunkGroup consists of an array of Chunk, followed by a byte 0x00 as the marker and a ChunkGroupFooter.

Chunk

A Chunk represents a sensor. It starts with a byte 0x01 as the marker, followed by a ChunkHeader and an array of Page.

ChunkHeader

Member Description                       | Member Type
The name of this sensor (measurementID)  | String
Size of this chunk                       | int
Data type of this chunk                  | short
Number of pages                          | int
Compression Type                         | short
Encoding Type                            | short
Max Tombstone Time                       | long

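
As a rough sketch, a ChunkHeader can be decoded with Python's struct module. The code assumes that the fields are laid out back to back in the order listed above, all big-endian, and that offset points just past the 0x01 marker. The ChunkHeader tuple and the function name are illustrative, not part of any TsFile API.

```python
import struct
from typing import NamedTuple


class ChunkHeader(NamedTuple):
    measurement_id: str
    data_size: int
    data_type: int
    num_pages: int
    compression: int
    encoding: int
    max_tombstone_time: int


def parse_chunk_header(buf: bytes, offset: int = 0):
    """Parse a ChunkHeader at offset; returns (header, next_offset)."""
    # Length-prefixed sensor name.
    (name_len,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    name = buf[offset:offset + name_len].decode("utf-8")
    offset += name_len
    # int, short, int, short, short, long; ">" means big-endian, no padding.
    fixed = struct.Struct(">ihihhq")
    size, dtype, pages, comp, enc, tomb = fixed.unpack_from(buf, offset)
    header = ChunkHeader(name, size, dtype, pages, comp, enc, tomb)
    return header, offset + fixed.size
```
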
Page

A Page represents some data in a Chunk. It contains a PageHeader and the actual data (the encoded time-value pairs).

PageHeader Structure

Member Description                               | Member Type
Data size before compressing                     | int
Data size after compressing (if SNAPPY is used)  | int
Number of values                                 | int
Minimum time stamp                               | long
Maximum time stamp                               | long
Minimum value of the page                        | Type of the page
Maximum value of the page                        | Type of the page
First value of the page                          | Type of the page
Last value of the page                           | Type of the page
Sum of the Page                                  | double

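
Because four of the PageHeader fields take the type of the page, the header width depends on the data type. The sketch below builds the struct layout from the data type code. It handles only the fixed-width types (TEXT values would be length-prefixed strings), and it assumes both size fields are always present and that the fields follow each other in the listed order. All names are illustrative.

```python
import struct

# struct format character for each fixed-width data type code
VALUE_FORMATS = {0: "?", 1: "i", 2: "q", 3: "f", 4: "d"}

FIELD_NAMES = ("uncompressed_size", "compressed_size", "num_values",
               "min_time", "max_time", "min_value", "max_value",
               "first_value", "last_value", "sum")


def parse_page_header(buf: bytes, data_type: int, offset: int = 0):
    """Parse a PageHeader for a fixed-width type; returns (dict, next_offset)."""
    if data_type not in VALUE_FORMATS:
        raise ValueError("this sketch handles fixed-width data types only")
    v = VALUE_FORMATS[data_type]
    # 3 ints, 2 longs, 4 values of the page type, 1 double (big-endian).
    layout = struct.Struct(">iiiqq" + v * 4 + "d")
    fields = layout.unpack_from(buf, offset)
    return dict(zip(FIELD_NAMES, fields)), offset + layout.size
```
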
ChunkGroupFooter

Member Description           | Member Type
Deviceid                     | String
Data size of the ChunkGroup  | long
Number of chunks             | int
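
Putting the markers together, the data section can be walked as sketched below. This rests on two assumptions that the text does not state: that "Size of this chunk" in the ChunkHeader is the byte length of the pages following the header, and that all fields appear in the listed order. The function names and the event tuples are made up for this illustration.

```python
import struct


def read_string(buf: bytes, offset: int):
    """Return (text, next_offset) for a length-prefixed string."""
    (size,) = struct.unpack_from(">i", buf, offset)
    start = offset + 4
    return buf[start:start + size].decode("utf-8"), start + size


def walk_data_section(buf: bytes, offset: int):
    """Yield ("chunk", sensor) and ("footer", device, num_chunks) events.

    offset must point at the first marker byte after the leading magic string.
    Stops at the 0x02 marker that separates data from metadata.
    """
    header_tail = struct.Struct(">ihihhq")  # ChunkHeader fields after the name
    footer_tail = struct.Struct(">qi")      # data size (long), number of chunks (int)
    while offset < len(buf):
        marker = buf[offset]
        offset += 1
        if marker == 0x01:    # Chunk
            name, offset = read_string(buf, offset)
            chunk_size = header_tail.unpack_from(buf, offset)[0]
            # Assumption: chunk_size counts the page bytes after the header.
            offset += header_tail.size + chunk_size
            yield ("chunk", name)
        elif marker == 0x00:  # ChunkGroupFooter
            device, offset = read_string(buf, offset)
            _, num_chunks = footer_tail.unpack_from(buf, offset)
            offset += footer_tail.size
            yield ("footer", device, num_chunks)
        elif marker == 0x02:  # start of the metadata section
            return
        else:
            raise ValueError(f"unexpected marker 0x{marker:02x} at {offset - 1}")
    raise ValueError("data section ended without the 0x02 marker")
```

Since the footer comes after its chunks, a reader that streams forward only learns which device a run of chunks belongs to when it reaches the footer event.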

Metadata

TsDeviceMetaData

The first part of the metadata is TsDeviceMetaData.

Member Description      | Member Type
Start time              | long
End time                | long
Number of chunk groups  | int

An array of ChunkGroupMetaData follows TsDeviceMetaData.

ChunkGroupMetaData

Member Description              | Member Type
Deviceid                        | String
Start offset of the ChunkGroup  | long
End offset of the ChunkGroup    | long
Version                         | long
Number of ChunkMetaData         | int

Then there is an array of ChunkMetaData for each ChunkGroupMetaData.

ChunkMetaData

Member Description            | Member Type
Measurementid                 | String
Start offset of ChunkHeader   | long
Number of data points         | long
Start time                    | long
End time                      | long
Data type                     | short
Number of statistics          | int
The statistics of this chunk  | TsDigest

TsDigest

There are five statistics: min, last, sum, first, and max.

Each statistic is stored as a name-value pair. The name is a string (remember that the length comes before the literal).

The value also has a size integer before its data, even if it is not a string. For example, if the min is 3, it is stored as 3 “min” 4 3 in the TsFile: the name size (3), the name, the value size (4), and the value (3).
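
The min example can be reproduced byte for byte with a short sketch. The function name is hypothetical, and the value is assumed to belong to an INT32 series, which is why it occupies 4 bytes.

```python
import struct


def encode_statistic(name: str, value: bytes) -> bytes:
    """Encode one TsDigest entry: length-prefixed name, then length-prefixed value."""
    raw_name = name.encode("utf-8")
    return (struct.pack(">i", len(raw_name)) + raw_name
            + struct.pack(">i", len(value)) + value)


# min = 3 for an INT32 series: the value itself is a 4-byte big-endian int.
entry = encode_statistic("min", struct.pack(">i", 3))
print(entry.hex(" "))  # 00 00 00 03 6d 69 6e 00 00 00 04 00 00 00 03
```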

File Metadata

After the array of ChunkGroupMetaData comes the last part of the metadata.

Member Description                         | Member Type
Number of Devices                          | int
Array of DeviceIndexMetadata               | DeviceIndexMetadata
Number of Measurements                     | int
Array of Measurement name and schema       | String, MeasurementSchema pair
Current Version (3 for now)                | int
Author byte                                | byte
Author (if author byte is 0x01)            | String
File Metadata size (not including itself)  | int

DeviceIndexMetadata

Member Description                                                               | Member Type
Deviceid                                                                         | String
Start offset of ChunkGroupMetaData (or TsDeviceMetaData if it’s the first one)   | long
length                                                                           | int
Start time                                                                       | long
End time                                                                         | long

MeasurementSchema

Member Description  | Member Type
Measurementid       | String
Data type           | short
Encoding            | short
Compressor          | short
Size of props       | int

If the size of props is greater than 0, an array of key-value pairs follows as the properties of this measurement.

For example, the pair “max_point_number” and “2”.
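
A MeasurementSchema reader might be sketched as follows. It assumes that "size of props" is the number of pairs and that each key and each value is a length-prefixed string; neither detail is spelled out above, so treat the layout as a guess. The function names are illustrative.

```python
import struct


def read_string(buf: bytes, offset: int):
    """Return (text, next_offset) for a length-prefixed string."""
    (size,) = struct.unpack_from(">i", buf, offset)
    start = offset + 4
    return buf[start:start + size].decode("utf-8"), start + size


def parse_measurement_schema(buf: bytes, offset: int = 0):
    """Parse one MeasurementSchema; returns (schema_dict, next_offset)."""
    measurement_id, offset = read_string(buf, offset)
    # Three shorts (data type, encoding, compressor) and the props count (int).
    fixed = struct.Struct(">hhhi")
    data_type, encoding, compressor, num_props = fixed.unpack_from(buf, offset)
    offset += fixed.size
    props = {}
    for _ in range(num_props):
        # Assumption: each property is a pair of length-prefixed strings.
        key, offset = read_string(buf, offset)
        value, offset = read_string(buf, offset)
        props[key] = value
    schema = {"measurement_id": measurement_id, "data_type": data_type,
              "encoding": encoding, "compressor": compressor, "props": props}
    return schema, offset
```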

Done

After the FileMetaData, there is another Magic String, and you have finished the journey of discovering TsFile!
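
Since the File Metadata size is the last member before the trailing Magic String, a reader can in principle locate the File Metadata from the end of the file. The sketch below assumes that the size field occupies the 4 bytes directly before the trailing magic string and counts the File Metadata bytes that precede it. The function name is hypothetical.

```python
import struct

MAGIC = b"TsFilev0.8.0"


def locate_file_metadata(data: bytes):
    """Return (start, size) of the File Metadata block, found from the file tail."""
    if not data.endswith(MAGIC):
        raise ValueError("missing trailing magic string")
    size_pos = len(data) - len(MAGIC) - 4
    if size_pos < len(MAGIC):
        raise ValueError("file too short")
    (size,) = struct.unpack_from(">i", data, size_pos)
    start = size_pos - size
    if size < 0 or start < len(MAGIC):
        raise ValueError("corrupt File Metadata size")
    return start, size
```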

You can also use /tsfile/example/TsFileSequenceRead to read and validate a TsFile.