Data Model and Terminology

To make this manual more practical, we use a specific scenario to illustrate how to operate IoTDB at every stage of use. See this page for a look at the scenario. For convenience, we also provide a sample data file from a real scenario that you can import into IoTDB for trial operation.

Download file: IoTDB-SampleData.txt.

According to the data attribute layers described in the sample data, we can express them as an attribute hierarchy based on the coverage of the attributes and the subordinate relationships between them, as shown in Figure 2.1 below. The hierarchy is: power group layer - power plant layer - device layer - sensor layer. ROOT is the root node, and each node of the sensor layer is called a leaf node. When using IoTDB, you can connect the attributes on the path from the ROOT node to a leaf node with “.”, which forms the name of a timeseries in IoTDB. For example, the left-most path in Figure 2.1 generates a timeseries named ROOT.ln.wf01.wt01.status.

**Figure 2.1 Attribute hierarchy structure**
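
The naming rule described above can be sketched in a few lines of Python. This is only an illustration of joining layer names with “.”; the helper `timeseries_name` is hypothetical and is not part of IoTDB.

```python
def timeseries_name(*layers: str) -> str:
    """Join the attribute names from the root node to a leaf node with '.'."""
    if not layers or any(not layer for layer in layers):
        raise ValueError("every layer needs a non-empty name")
    return ".".join(layers)


# Left-most path in Figure 2.1: root -> power group -> power plant -> device -> sensor.
name = timeseries_name("ROOT", "ln", "wf01", "wt01", "status")
print(name)
```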

After getting the name of the timeseries, we need to set up the storage group according to the actual scenario and the scale of the data. In the scenario of this chapter, data usually arrives in units of groups (i.e., data may span electric fields and devices). To avoid frequent IO switching when writing data, and to meet the user’s requirement of physically isolating data by group, we set the storage group at the group layer.

Here are the basic concepts of the model involved in IoTDB:

  • Device

A device is an installation equipped with sensors in real scenarios. In IoTDB, all sensors should have their corresponding devices.

  • Sensor

A sensor is a detection device in a real scenario. It senses the information to be measured, transforms it into an electrical signal or another desired form of output, and sends it to IoTDB. In IoTDB, all stored data and paths are organized in units of sensors.

  • Storage Group

Storage groups are used to let users define how to organize and isolate different time series data on disk. Time series belonging to the same storage group will be continuously written to the same file in the corresponding folder. The file may be closed due to user commands or system policies, and hence the data coming next from these sensors will be stored in a new file in the same folder. Time series belonging to different storage groups are stored in different folders.

Users can set any prefix path as a storage group. Suppose there are four time series: root.vehicle.d1.s1, root.vehicle.d1.s2, root.vehicle.d2.s1, and root.vehicle.d2.s2. The two devices d1 and d2 under the path root.vehicle may belong to the same owner or the same manufacturer, so d1 and d2 are closely related. In this case, the prefix path root.vehicle can be designated as a storage group, which makes IoTDB store all devices under it in the same folder. Devices added under root.vehicle later will also belong to this storage group.

Note: A full path (root.vehicle.d1.s1 as in the above example) is not allowed to be set as a storage group.

Setting a reasonable number of storage groups can lead to performance gains. With too many storage files (or folders), the system slows down because of frequent IO switching, which also takes up a lot of memory and results in frequent memory-file switching. With too few storage files (or folders), write commands block each other, which reduces concurrency.

Users should balance the number of storage groups against their own data size and usage scenarios to achieve better system performance. (Officially provided storage group scale and performance test reports will be available in the future.)

Note: The prefix of a time series must belong to a storage group. Before creating a time series, the user must set which storage group the series belongs to. Only the time series whose storage group is set can be persisted to disk.

Once a prefix path is set as a storage group, the storage group settings cannot be changed.

After a storage group is set, all parent and child layers of the corresponding prefix path are not allowed to be set up again (for example, after root.ln is set as the storage group, the root layer and root.ln.wf01 are not allowed to be set as storage groups).
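
The rules above (a time series belongs to the storage group that prefixes it, and parent or child layers of an existing storage group cannot be set again) can be modeled with a small in-memory sketch. The class `StorageGroupRegistry` and its methods are hypothetical names used only for illustration; they are not IoTDB's implementation or API.

```python
class StorageGroupRegistry:
    """Toy model of the storage group rules; not IoTDB code."""

    def __init__(self):
        self._groups = set()  # each group is stored as a tuple of layer names

    def set_storage_group(self, prefix_path):
        layers = tuple(prefix_path.split("."))
        for group in self._groups:
            shared = min(len(group), len(layers))
            # Equal leading layers mean the new path is the same as, a parent
            # of, or a child of an existing storage group.
            if group[:shared] == layers[:shared]:
                raise ValueError(
                    f"{prefix_path} conflicts with storage group {'.'.join(group)}"
                )
        self._groups.add(layers)

    def storage_group_of(self, timeseries):
        layers = tuple(timeseries.split("."))
        for group in self._groups:
            if len(group) < len(layers) and layers[: len(group)] == group:
                return ".".join(group)
        raise LookupError(f"no storage group is set for {timeseries}")


registry = StorageGroupRegistry()
registry.set_storage_group("root.ln")
registry.set_storage_group("root.vehicle")
print(registry.storage_group_of("root.vehicle.d1.s1"))
```

In this sketch, calling `set_storage_group("root")` or `set_storage_group("root.ln.wf01")` after `root.ln` has been set raises `ValueError`, which mirrors the restriction described above.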

  • Path

In IoTDB, a path is an expression that conforms to the following constraints:

  1. path: LayerName (DOT LayerName)+
  2. LayerName: Identifier | STAR

Here, STAR is “*” and DOT is “.”.

We call the part of a path between two “.” a layer, so root.A.B.C is a path with four layers.

Note that root is a reserved word in a path. It is only allowed to appear at the beginning of a path, as in the timeseries paths described below. If root appears in any other layer, the path cannot be parsed and an error is reported.
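
As a rough illustration, the two grammar rules and the restriction on root can be checked with a regular expression. This sketch assumes that an Identifier consists of letters, digits and underscores, which the text above does not specify; `is_valid_path` is a hypothetical helper, not an IoTDB function, and it covers only the constraints stated in this item.

```python
import re

# Assumption: an Identifier is made of letters, digits and underscores.
LAYER = r"(?:[A-Za-z0-9_]+|\*)"
# path: LayerName (DOT LayerName)+
PATH_RE = re.compile(rf"{LAYER}(?:\.{LAYER})+")


def is_valid_path(path: str) -> bool:
    if PATH_RE.fullmatch(path) is None:
        return False
    layers = path.split(".")
    # root is reserved: it may only appear as the first layer.
    return "root" not in layers[1:]


for candidate in ("root.A.B.C", "root.vehicle.*.sensor1", "root", "root.a.root", "root..a"):
    print(candidate, is_valid_path(candidate))
```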

  • Timeseries Path

The timeseries path is the core concept in IoTDB. A timeseries path can be thought of as the complete path of a sensor that produces the time series data. All timeseries paths in IoTDB must start with root and end with the sensor. A timeseries path can also be called a full path.

For example, if device1 of the vehicle type has a sensor named sensor1, its timeseries path can be expressed as: root.vehicle.device1.sensor1.

Note: In the current version of IoTDB, a timeseries path must have at least four layers (this minimum will be changed to two in the future).

  • Prefix Path

The prefix path refers to the path where the prefix of a timeseries path is located. A prefix path contains all timeseries paths prefixed by the path. For example, suppose that we have three sensors: root.vehicle.device1.sensor1, root.vehicle.device1.sensor2, and root.vehicle.device2.sensor1. The prefix path root.vehicle.device1 contains the two timeseries paths root.vehicle.device1.sensor1 and root.vehicle.device1.sensor2, while root.vehicle.device2.sensor1 is excluded.
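
The containment rule can be sketched as a layer-by-layer comparison. The function `is_under_prefix` is a hypothetical helper for illustration, and it assumes that a prefix path is strictly shorter than the timeseries paths it contains.

```python
def is_under_prefix(timeseries: str, prefix: str) -> bool:
    """Return True if every layer of `prefix` matches the leading layers of `timeseries`."""
    ts_layers = timeseries.split(".")
    prefix_layers = prefix.split(".")
    return (
        len(ts_layers) > len(prefix_layers)
        and ts_layers[: len(prefix_layers)] == prefix_layers
    )


sensors = [
    "root.vehicle.device1.sensor1",
    "root.vehicle.device1.sensor2",
    "root.vehicle.device2.sensor1",
]
contained = [s for s in sensors if is_under_prefix(s, "root.vehicle.device1")]
print(contained)
```

Comparing whole layers rather than raw string prefixes keeps a path such as root.vehicle.device10.sensor1 from being treated as part of root.vehicle.device1.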

  • Path With Star

To make it easier and faster to express multiple timeseries paths or prefix paths, IoTDB provides the path with star. * can appear in any layer of the path. According to the position where * appears, paths with star can be divided into two types:

* appears at the end of the path;

* appears in the middle of the path;

When * appears at the end of the path, it represents (*)+, which is one or more layers of *. For example, root.vehicle.device1.* represents all paths prefixed by root.vehicle.device1 that have four or more layers, like root.vehicle.device1.*, root.vehicle.device1.*.*, root.vehicle.device1.*.*.*, etc.

When * appears in the middle of the path, it represents * itself, i.e., a layer. For example, root.vehicle.*.sensor1 represents a 4-layer path which is prefixed with root.vehicle and suffixed with sensor1.

Note1: * cannot be placed at the beginning of the path.

Note2: A path with * at the end has the same meaning as a prefix path, e.g., root.vehicle.* and root.vehicle are the same.
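
The two cases can be sketched as a small matcher: a trailing * stands for one or more layers, while a * in the middle stands for exactly one layer. The function `matches` is a hypothetical helper written for this illustration and does not reflect how IoTDB resolves paths internally.

```python
def matches(pattern: str, path: str) -> bool:
    """Check a concrete path against a path with star."""
    wanted = pattern.split(".")
    layers = path.split(".")
    if wanted[-1] == "*":
        # Trailing star: (*)+, so the path needs at least one extra layer.
        wanted = wanted[:-1]
        if len(layers) <= len(wanted):
            return False
        layers = layers[: len(wanted)]
    elif len(layers) != len(wanted):
        # Without a trailing star the layer counts must be equal.
        return False
    # A star in the middle matches exactly one layer, whatever its name.
    return all(w == "*" or w == got for w, got in zip(wanted, layers))


print(matches("root.vehicle.device1.*", "root.vehicle.device1.sensor1"))
print(matches("root.vehicle.*.sensor1", "root.vehicle.device2.sensor1"))
```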

  • Timestamp

The timestamp is the time point at which data is produced. It includes absolute timestamps and relative timestamps.

  • Absolute timestamp

Absolute timestamps in IoTDB are divided into two types: LONG and DATETIME (including DATETIME-INPUT and DATETIME-DISPLAY). When inputting a timestamp, users can use either a LONG type timestamp or a DATETIME-INPUT type timestamp. The supported formats of the DATETIME-INPUT type timestamp are shown in the table below:

**Supported formats of DATETIME-INPUT type timestamp**

| Format |
| --- |
| yyyy-MM-dd HH:mm:ss |
| yyyy/MM/dd HH:mm:ss |
| yyyy.MM.dd HH:mm:ss |
| yyyy-MM-dd'T'HH:mm:ss |
| yyyy/MM/dd'T'HH:mm:ss |
| yyyy.MM.dd'T'HH:mm:ss |
| yyyy-MM-dd HH:mm:ssZZ |
| yyyy/MM/dd HH:mm:ssZZ |
| yyyy.MM.dd HH:mm:ssZZ |
| yyyy-MM-dd'T'HH:mm:ssZZ |
| yyyy/MM/dd'T'HH:mm:ssZZ |
| yyyy.MM.dd'T'HH:mm:ssZZ |
| yyyy/MM/dd HH:mm:ss.SSS |
| yyyy-MM-dd HH:mm:ss.SSS |
| yyyy.MM.dd HH:mm:ss.SSS |
| yyyy/MM/dd'T'HH:mm:ss.SSS |
| yyyy-MM-dd'T'HH:mm:ss.SSS |
| yyyy.MM.dd'T'HH:mm:ss.SSS |
| yyyy-MM-dd HH:mm:ss.SSSZZ |
| yyyy/MM/dd HH:mm:ss.SSSZZ |
| yyyy.MM.dd HH:mm:ss.SSSZZ |
| yyyy-MM-dd'T'HH:mm:ss.SSSZZ |
| yyyy/MM/dd'T'HH:mm:ss.SSSZZ |
| yyyy.MM.dd'T'HH:mm:ss.SSSZZ |
| ISO8601 standard time format |
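
To show how such strings relate to LONG timestamps, the following client-side sketch converts a handful of the formats above to milliseconds since the Unix epoch with Python's `datetime.strptime`. It is not how IoTDB parses input. The helper `to_epoch_millis` is hypothetical, only four of the formats are mapped, and timestamps without a zone offset are assumed to be UTC purely for the sake of the example.

```python
from datetime import datetime, timedelta, timezone

# strptime equivalents for a few of the DATETIME-INPUT patterns.
FORMATS = [
    "%Y-%m-%d %H:%M:%S",       # yyyy-MM-dd HH:mm:ss
    "%Y/%m/%d %H:%M:%S",       # yyyy/MM/dd HH:mm:ss
    "%Y.%m.%dT%H:%M:%S",       # yyyy.MM.dd'T'HH:mm:ss
    "%Y-%m-%dT%H:%M:%S.%f%z",  # yyyy-MM-dd'T'HH:mm:ss.SSSZZ
]
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(text: str) -> int:
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
        if parsed.tzinfo is None:
            # Assumption for this sketch: no offset means UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        # Integer division of timedeltas avoids floating-point rounding.
        return (parsed - EPOCH) // timedelta(milliseconds=1)
    raise ValueError(f"unsupported timestamp format: {text!r}")


print(to_epoch_millis("1970-01-01 00:00:01"))
print(to_epoch_millis("1970-01-01T08:00:00.500+08:00"))
```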

IoTDB can support LONG types and DATETIME-DISPLAY types when displaying timestamps. The DATETIME-DISPLAY type can support user-defined time formats. The syntax of the custom time format is shown in the table below:

**The syntax of the custom time format**

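| Symbol | Meaning | Presentation | Examples |
| --- | --- | --- | --- |
| G | era | era | era |
| C | century of era (>=0) | number | 20 |
| Y | year of era (>=0) | year | 1996 |
| x | weekyear | year | 1996 |
| w | week of weekyear | number | 27 |
| e | day of week | number | 2 |
| E | day of week | text | Tuesday; Tue |
| y | year | year | 1996 |
| D | day of year | number | 189 |
| M | month of year | month | July; Jul; 07 |
| d | day of month | number | 10 |
| a | halfday of day | text | PM |
| K | hour of halfday (0~11) | number | 0 |
| h | clockhour of halfday (1~12) | number | 12 |
| H | hour of day (0~23) | number | 0 |
| k | clockhour of day (1~24) | number | 24 |
| m | minute of hour | number | 30 |
| s | second of minute | number | 55 |
| S | fraction of second | millis | 978 |
| z | time zone | text | Pacific Standard Time; PST |
| Z | time zone offset/id | zone | -0800; -08:00; America/Los_Angeles |
| ' | escape for text | delimiter | |
| '' | single quote | literal | |
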
  • Relative timestamp

A relative timestamp expresses a time as an offset from either the current server time now() or a DATETIME timestamp.

Syntax:

  1. Duration = (Digit+ ('Y'|'MO'|'W'|'D'|'H'|'M'|'S'|'MS'|'US'|'NS'))+
  2. RelativeTime = (now() | DATETIME) ((+|-) Duration)+

**The syntax of the duration unit**

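| Symbol | Meaning | Presentation | Examples |
| --- | --- | --- | --- |
| y | year | 1y=365 days | 1y |
| mo | month | 1mo=30 days | 1mo |
| w | week | 1w=7 days | 1w |
| d | day | 1d=1 day | 1d |
| h | hour | 1h=3600 seconds | 1h |
| m | minute | 1m=60 seconds | 1m |
| s | second | 1s=1 second | 1s |
| ms | millisecond | 1ms=1000_000 nanoseconds | 1ms |
| us | microsecond | 1us=1000 nanoseconds | 1us |
| ns | nanosecond | 1ns=1 nanosecond | 1ns |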

For example:

  1. now() - 1d2h //1 day and 2 hours earlier than the current server time
  2. now() - 1w //1 week earlier than the current server time

Note: There must be spaces on the left and right of ‘+’ and ‘-’.
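
The arithmetic behind relative timestamps can be sketched as follows, using the unit lengths from the duration table. The functions `duration_ns` and `resolve` are hypothetical helpers, not IoTDB functions. The sketch only handles expressions that start with now(), takes the current time as an explicit nanosecond value, and relies on the spaces around ‘+’ and ‘-’ to split the expression.

```python
import re

DAY_NS = 86_400 * 10**9
# Unit lengths in nanoseconds, taken from the duration unit table.
UNIT_NS = {
    "y": 365 * DAY_NS, "mo": 30 * DAY_NS, "w": 7 * DAY_NS, "d": DAY_NS,
    "h": 3_600 * 10**9, "m": 60 * 10**9, "s": 10**9,
    "ms": 10**6, "us": 10**3, "ns": 1,
}
# Two-letter units come first so that "1ms" is not read as "1m".
TERM = re.compile(r"(\d+)(mo|ms|us|ns|y|w|d|h|m|s)", re.IGNORECASE)


def duration_ns(text: str) -> int:
    """Sum a duration such as '1d2h' in nanoseconds."""
    pos, total = 0, 0
    while pos < len(text):
        term = TERM.match(text, pos)
        if term is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += int(term.group(1)) * UNIT_NS[term.group(2).lower()]
        pos = term.end()
    if pos == 0:
        raise ValueError("empty duration")
    return total


def resolve(expr: str, now_ns: int) -> int:
    """Evaluate 'now() (+|-) Duration ...' against a given current time."""
    tokens = expr.split()
    if len(tokens) < 3 or len(tokens) % 2 == 0 or tokens[0] != "now()":
        raise ValueError(f"unsupported expression: {expr!r}")
    result = now_ns
    for sign, duration in zip(tokens[1::2], tokens[2::2]):
        if sign not in ("+", "-"):
            raise ValueError(f"expected '+' or '-', got {sign!r}")
        offset = duration_ns(duration)
        result += offset if sign == "+" else -offset
    return result


print(duration_ns("1d2h"))
print(resolve("now() - 1w", now_ns=10**18))
```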