TsFile Hierarchy
This section briefly introduces the structure of a TsFile.
Variable Storage
Big Endian
- For example, the int 0x8 will be stored as 00 00 00 08, not 08 00 00 00
String with Variable Length
The format is an int size plus a String literal. The size can be zero. The size equals the number of bytes the string takes, which may differ from the number of characters in the string.
For example, “sensor_1” will be stored as 00 00 00 08 plus the encoding (ASCII) of “sensor_1”.
Note that for the “Magic String” (file signature) “TsFilev0.8.0”, the size (12) and the encoding (ASCII) are fixed, so there is no need to put the size before this string literal.
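The two storage rules above can be sketched in a few lines of Python. This is only an illustration, not the TsFile implementation: the helper names `encode_int`, `encode_string`, and `decode_string` are hypothetical, and the sketch assumes ASCII text as in the “sensor_1” example.

```python
import struct
from typing import Tuple


def encode_int(value: int) -> bytes:
    """Pack a 32-bit signed integer in big-endian byte order."""
    return struct.pack(">i", value)


def encode_string(text: str) -> bytes:
    """Write the byte count as a big-endian int, then the encoded bytes."""
    payload = text.encode("ascii")  # assumption: ASCII, as in the example
    return encode_int(len(payload)) + payload


def decode_string(buf: bytes, offset: int = 0) -> Tuple[str, int]:
    """Return the decoded string and the offset just past it."""
    (size,) = struct.unpack_from(">i", buf, offset)
    start = offset + 4
    return buf[start:start + size].decode("ascii"), start + size
```

Here `encode_int(8)` yields the bytes 00 00 00 08, and `encode_string("sensor_1")` yields the same four bytes followed by the eight ASCII bytes of the name.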
Data Type Hardcode
- 0: BOOLEAN
- 1: INT32 (int)
- 2: INT64 (long)
- 3: FLOAT
- 4: DOUBLE
- 5: TEXT (String)
Encoding Type Hardcode
- 0: PLAIN
- 1: PLAIN_DICTIONARY
- 2: RLE
- 3: DIFF
- 4: TS_2DIFF
- 5: BITMAP
- 6: GORILLA
- 7: REGULAR
Compressing Type Hardcode
- 0: UNCOMPRESSED
- 1: SNAPPY
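When writing a reader, the three code tables above can be mirrored as enumerations so that the numeric codes found in the file map back to readable names. The class names below are hypothetical; only the numeric values and member names come from the lists above.

```python
from enum import IntEnum


class DataType(IntEnum):
    BOOLEAN = 0
    INT32 = 1
    INT64 = 2
    FLOAT = 3
    DOUBLE = 4
    TEXT = 5


class Encoding(IntEnum):
    PLAIN = 0
    PLAIN_DICTIONARY = 1
    RLE = 2
    DIFF = 3
    TS_2DIFF = 4
    BITMAP = 5
    GORILLA = 6
    REGULAR = 7


class Compression(IntEnum):
    UNCOMPRESSED = 0
    SNAPPY = 1
```

Calling `DataType(5)` returns `DataType.TEXT`, and an unknown code raises `ValueError`, which is a convenient early signal of a corrupt or unsupported file.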
TsFile Overview
Here is a graph about the TsFile structure.

Magic String
There is a 12-byte magic string:
TsFilev0.8.0
It appears at both the beginning and the end of a TsFile as the file signature.
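A quick sanity check for a candidate file is to compare its first and last 12 bytes with the magic string. The sketch below works on an in-memory byte string; the function name `has_valid_magic` is hypothetical.

```python
MAGIC = b"TsFilev0.8.0"  # 12 ASCII bytes, stored without a size prefix


def has_valid_magic(data: bytes) -> bool:
    """Check that the signature is present at both ends of the file content."""
    # The length check rejects a buffer where one copy would serve as both ends.
    return (
        len(data) >= 2 * len(MAGIC)
        and data.startswith(MAGIC)
        and data.endswith(MAGIC)
    )
```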
Data
The content of a TsFile can be divided into two parts: data and metadata. A byte 0x02 serves as the marker between data and metadata.
The data section is an array of ChunkGroup; each ChunkGroup represents a device.
ChunkGroup
A ChunkGroup consists of an array of Chunk, followed by a byte 0x00 as the marker and a ChunkGroupFooter.
Chunk
A Chunk represents a sensor. It starts with a byte 0x01 as the marker, followed by a ChunkHeader and an array of Page.
ChunkHeader
| Member Description | Member Type |
|---|---|
| The name of this sensor(measurementID) | String |
| Size of this chunk | int |
| Data type of this chunk | short |
| Number of pages | int |
| Compression Type | short |
| Encoding Type | short |
| Max Tombstone Time | long |
Page
A Page represents some data in a Chunk. It contains a PageHeader and the actual data (the encoded time-value pairs).
PageHeader Structure
| Member Description | Member Type |
|---|---|
| Data size before compressing | int |
| Data size after compressing (if SNAPPY is used) | int |
| Number of values | int |
| Minimum time stamp | long |
| Maximum time stamp | long |
| Minimum value of the page | Type of the page |
| Maximum value of the page | Type of the page |
| First value of the page | Type of the page |
| Last value of the page | Type of the page |
| Sum of the Page | double |
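Because four PageHeader members take the type of the page, the header width depends on the data type. The sketch below covers only the fixed-width numeric types and rests on several assumptions that the table does not settle: both size fields are always present, widths follow Java conventions (int = 4 bytes, long = 8 bytes, float = 4 bytes, double = 8 bytes), and values are big-endian. BOOLEAN and TEXT pages are left out. All names are hypothetical.

```python
import struct
from typing import Dict, Tuple

# Assumed struct codes per data type code: INT32, INT64, FLOAT, DOUBLE.
_VALUE_CODES = {1: "i", 2: "q", 3: "f", 4: "d"}

PAGE_HEADER_FIELDS = (
    "uncompressed_size", "compressed_size", "num_values",
    "min_time", "max_time",
    "min_value", "max_value", "first_value", "last_value",
    "sum",
)


def page_header_struct(data_type: int) -> struct.Struct:
    """Build the layout: 3 ints, 2 longs, 4 typed values, 1 double."""
    code = _VALUE_CODES[data_type]
    return struct.Struct(">iiiqq" + code * 4 + "d")


def read_page_header(buf: bytes, data_type: int, offset: int = 0) -> Tuple[Dict[str, float], int]:
    """Parse a PageHeader and return it with the offset just past it."""
    layout = page_header_struct(data_type)
    values = layout.unpack_from(buf, offset)
    return dict(zip(PAGE_HEADER_FIELDS, values)), offset + layout.size
```

Under these assumptions an INT32 page header occupies 52 bytes and a DOUBLE page header 68 bytes.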
ChunkGroupFooter
| Member Description | Member Type |
|---|---|
| Deviceid | String |
| Data size of the ChunkGroup | long |
| Number of chunks | int |
Metadata
TsDeviceMetaData
The first part of the metadata is TsDeviceMetaData.
| Member Description | Member Type |
|---|---|
| Start time | long |
| End time | long |
| Number of chunk groups | int |
An array of ChunkGroupMetaData follows TsDeviceMetaData.
ChunkGroupMetaData
| Member Description | Member Type |
|---|---|
| Deviceid | String |
| Start offset of the ChunkGroup | long |
| End offset of the ChunkGroup | long |
| Version | long |
| Number of ChunkMetaData | int |
Each ChunkGroupMetaData is followed by an array of ChunkMetaData.
ChunkMetaData
| Member Description | Member Type |
|---|---|
| Measurementid | String |
| Start offset of ChunkHeader | long |
| Number of data points | long |
| Start time | long |
| End time | long |
| Data type | short |
| Number of statistics | int |
| The statistics of this chunk | TsDigest |
TsDigest
There are five statistics: min, last, sum, first, max.
Each statistic is stored as a name-value pair. The name is a String (remember that its size precedes the literal).
The value is also preceded by a size integer, even when it is not a String. For example, if min is 3, it will be stored as 3 “min” 4 3 in the TsFile: the size of the name (3), the name itself, the size of the value (4), and the value.
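The min example can be reproduced byte for byte. This is a minimal sketch assuming the value is an INT32 (hence the value size of 4) and that all integers are big-endian; `encode_statistic` is a hypothetical helper name.

```python
import struct


def encode_statistic(name: str, value: bytes) -> bytes:
    """Write size-prefixed name bytes followed by size-prefixed value bytes."""
    name_bytes = name.encode("ascii")
    return (
        struct.pack(">i", len(name_bytes)) + name_bytes
        + struct.pack(">i", len(value)) + value
    )


# min = 3 stored as an INT32: 3 "min" 4 3
encoded_min = encode_statistic("min", struct.pack(">i", 3))
```

The result is 15 bytes long: 4 for the name size, 3 for “min”, 4 for the value size, and 4 for the value.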
File Metadata
After the array of ChunkGroupMetaData comes the last part of the metadata.
| Member Description | Member Type |
|---|---|
| Number of Devices | int |
| Array of DeviceIndexMetadata | DeviceIndexMetadata |
| Number of Measurements | int |
| Array of Measurement name and schema | String, MeasurementSchema pair |
| Current Version(3 for now) | int |
| Author byte | byte |
| Author(if author byte is 0x01) | String |
| File Metadata size(not including itself) | int |
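Since the size field is the last member of the File Metadata and only the closing magic string follows it, a reader can locate the File Metadata by working backwards from the end of the file. The sketch below assumes that the size counts every File Metadata member before the size field, that the size is a 4-byte big-endian int, and that the file ends with the 12-byte magic string; `file_metadata_span` is a hypothetical name.

```python
import struct
from typing import Tuple

MAGIC_LEN = 12  # length of the trailing magic string
SIZE_LEN = 4    # width of the File Metadata size field


def file_metadata_span(data: bytes) -> Tuple[int, int]:
    """Return (start, end) offsets of the File Metadata, excluding its size field."""
    size_pos = len(data) - MAGIC_LEN - SIZE_LEN
    (size,) = struct.unpack_from(">i", data, size_pos)
    return size_pos - size, size_pos
```

Slicing the file content with the returned pair gives the bytes that hold the device index, the measurement schemas, the version, and the author fields.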
DeviceIndexMetadata
| Member Description | Member Type |
|---|---|
| Deviceid | String |
| Start offset of ChunkGroupMetaData (or TsDeviceMetaData if it is the first one) | long |
| Length | int |
| Start time | long |
| End time | long |
MeasurementSchema
| Member Description | Member Type |
|---|---|
| Measurementid | String |
| Data type | short |
| Encoding | short |
| Compressor | short |
| Size of props | int |
If the size of props is greater than 0, an array of String key-value pairs follows, holding the properties of this measurement, such as “max_point_number” “2”.
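A MeasurementSchema entry can be read in the same way as the other structures. The sketch below assumes that “size of props” is the number of pairs, that each property is a pair of variable-length Strings as the “max_point_number” “2” example suggests, and that shorts are 2 bytes and ints 4 bytes in big-endian order. The names `MeasurementSchema`, `_read_string`, and `read_measurement_schema` are hypothetical.

```python
import struct
from typing import Dict, NamedTuple, Tuple


class MeasurementSchema(NamedTuple):
    measurement_id: str
    data_type: int
    encoding: int
    compressor: int
    props: Dict[str, str]


def _read_string(buf: bytes, offset: int) -> Tuple[str, int]:
    """Read a size-prefixed ASCII string and return it with the next offset."""
    (size,) = struct.unpack_from(">i", buf, offset)
    start = offset + 4
    return buf[start:start + size].decode("ascii"), start + size


def read_measurement_schema(buf: bytes, offset: int = 0) -> Tuple[MeasurementSchema, int]:
    """Parse one MeasurementSchema and return it with the offset just past it."""
    name, offset = _read_string(buf, offset)
    # Three shorts (data type, encoding, compressor) and the props count.
    data_type, encoding, compressor, num_props = struct.unpack_from(">hhhi", buf, offset)
    offset += struct.calcsize(">hhhi")
    props = {}
    for _ in range(num_props):
        key, offset = _read_string(buf, offset)
        value, offset = _read_string(buf, offset)
        props[key] = value
    return MeasurementSchema(name, data_type, encoding, compressor, props), offset
```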
Done
After the FileMetaData, there will be another Magic String and you have finished the journey of discovering TsFile!
You can also use /tsfile/example/TsFileSequenceRead to read and validate a TsFile.
