9.4. Persistence risks

Practical implementations will persist data to a non-volatile storage medium. Data will be serialized when stored and deserialized when retrieved, although the details of the serialization format will be user-agent specific. User agents are likely to change their serialization format over time. For example, the format may be updated to handle new data types, or to improve performance. To satisfy the operational requirements of this specification, implementations must therefore continue to read data written in older serialization formats. Improper handling of older data can result in security issues. In addition to basic serialization concerns, serialized data could encode assumptions that no longer hold in newer versions of the user agent.

A practical example of this is the RegExp type. The StructuredSerializeForStorage operation allows serializing RegExp objects. A typical user agent will compile a regular expression into native machine instructions, with assumptions about how the input data is passed and results returned. If this internal state were serialized as part of the data stored to the database, various problems could arise when the internal representation was later deserialized by a newer version of the user agent. For example, the calling convention by which input is passed to the compiled code could have changed. Security bugs in the compiler output could have been identified and fixed in updates to the user agent, but would survive in the serialized internal state and be reintroduced on deserialization.

User agents must identify and handle older data appropriately. One approach is to include version identifiers in the serialization format, and to reconstruct any internal state from script-visible state (for a RegExp, its source text and flags) when older data is encountered.
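The following is an added illustration of that approach, not code from this specification or from any user agent. It serializes a RegExp by recording only its script-visible state (`source` and `flags`) together with a format version, and on deserialization maps every version it recognizes onto that state and rebuilds the object with the RegExp constructor, so the engine recompiles the pattern under its current rules. The version numbers, the record layout and the "version 1" legacy record are assumptions constructed for the example.

```javascript
// Added example (not from the spec or any user agent): a versioned
// serializer for RegExp values. Only script-visible state is stored;
// anything the engine compiled from it is deliberately left out so the
// reading user agent regenerates it.

const FORMAT_VERSION = 2; // assumed current version of this example's format

// Serialize a RegExp for storage. `source` and `flags` are the complete
// script-visible state; flags come back in canonical order from `re.flags`.
function serializeRegExp(re) {
  if (!(re instanceof RegExp)) throw new TypeError("not a RegExp");
  return JSON.stringify({
    v: FORMAT_VERSION,
    type: "RegExp",
    source: re.source,
    flags: re.flags,
  });
}

// Constructed sample of a hypothetical "version 1" record, as an older
// build of this example might have written it: flags were stored as
// separate booleans and only g/i/m existed.
const LEGACY_V1_RECORD = JSON.stringify({
  v: 1,
  type: "RegExp",
  source: "a.b",
  global: true,
  ignoreCase: false,
  multiline: true,
});

// Map the version-1 boolean layout onto a modern flags string.
function flagsFromV1(rec) {
  return (
    (rec.global ? "g" : "") +
    (rec.ignoreCase ? "i" : "") +
    (rec.multiline ? "m" : "")
  );
}

// Deserialize a stored record. Each supported version is normalized to
// (source, flags) and the RegExp is rebuilt from that state rather than
// from any stored internal representation. Unknown versions are refused
// rather than guessed at.
function deserializeRegExp(text) {
  const rec = JSON.parse(text);
  if (rec === null || typeof rec !== "object" || rec.type !== "RegExp") {
    throw new TypeError("not a serialized RegExp");
  }
  let source;
  let flags;
  switch (rec.v) {
    case 1:
      source = rec.source;
      flags = flagsFromV1(rec);
      break;
    case 2:
      source = rec.source;
      flags = rec.flags;
      break;
    default:
      throw new RangeError(`unsupported serialization version ${rec.v}`);
  }
  if (typeof source !== "string" || typeof flags !== "string") {
    throw new TypeError("malformed RegExp record");
  }
  // The constructor recompiles the pattern with the current engine and
  // throws SyntaxError if the stored text is no longer a valid pattern.
  return new RegExp(source, flags);
}

// Round trip a value through storage and return what comes back.
function roundTrip(re) {
  return deserializeRegExp(serializeRegExp(re));
}
```

Refusing unknown version numbers matters as much as accepting old ones: a record written by a newer user agent should produce a clear failure rather than be reinterpreted under stale assumptions.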