11.139. Release 0.102

Unicode Support

All string functions have been updated to support Unicode. The functions assume that the string contains valid UTF-8 encoded code points. There are no explicit checks for valid UTF-8, and the functions may return incorrect results on invalid UTF-8. Invalid UTF-8 data can be corrected with from_utf8().
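As a rough illustration of what correcting invalid UTF-8 means, the sketch below uses Python rather than Presto SQL. It decodes a byte string containing one invalid byte and substitutes the Unicode replacement character U+FFFD. The helper name repair_utf8 is hypothetical, and the assumption that from_utf8() substitutes invalid sequences in a comparable way should be checked against the function's documentation.

```python
# Illustrative only: Python's decoder stands in for the idea behind from_utf8().
# repair_utf8 is a hypothetical helper name, not a Presto function.

def repair_utf8(data: bytes) -> str:
    """Decode bytes as UTF-8, replacing each invalid sequence with U+FFFD."""
    return data.decode("utf-8", errors="replace")


# "caf\xc3\xa9" is valid UTF-8 for "café"; the lone 0xFF byte is never valid UTF-8.
raw = b"caf\xc3\xa9 \xff ok"
repaired = repair_utf8(raw)

# The repaired string is well-formed again and round-trips through UTF-8.
round_tripped = repaired.encode("utf-8").decode("utf-8")
```

After this kind of repair, code-point-based string functions have well-formed input to work on, which is the precondition the paragraph above describes.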

Additionally, the functions operate on Unicode code points and not user-visible characters (or grapheme clusters). Some languages combine multiple code points into a single user-perceived character, the basic unit of a writing system for a language, but the functions will treat each code point as a separate unit.
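The difference between code points and user-perceived characters can be seen in a short sketch. It is written in Python, whose len() also counts code points, so it only mirrors the behavior described above and does not call any Presto function. The variable names are illustrative.

```python
import unicodedata

# One user-perceived character written two ways:
decomposed = "e\u0301"                                 # 'e' followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)    # single precomposed code point U+00E9

# Both render as the same character, but the code point counts differ.
decomposed_length = len(decomposed)   # counts 2 code points
composed_length = len(composed)       # counts 1 code point

# The UTF-8 byte length is a third, separate measure.
composed_bytes = len(composed.encode("utf-8"))
```

A function that counts or slices by code point would therefore treat the decomposed form as two units, even though a reader sees one character.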

Regular Expression Functions

All Regular Expression Functions have been rewritten to improve performance. The new versions are often twice as fast and in some cases can be many orders of magnitude faster (due to removal of quadratic behavior). This change introduced some minor incompatibilities that are explained in the documentation for the functions.

General Changes

  • Add support for partitioned right outer joins, which allows for larger tables to be joined on the inner side.
  • Add support for full outer joins.
  • Support returning booleans as numbers in the JDBC driver.
  • Fix contains() to return NULL if the value was not found but a NULL element was.
  • Fix nested ROW rendering in DESCRIBE.
  • Add array_join().
  • Optimize map subscript operator.
  • Add from_utf8() and to_utf8() functions.
  • Add task_writer_count session property to set task.writer-count.
  • Add cast from ARRAY(F) to ARRAY(T).
  • Extend implicit coercions to ARRAY element types.
  • Implement implicit coercions in VALUES expressions.
  • Fix potential deadlock in scheduler.
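
The contains() fix in the list above follows SQL's three-valued logic: if the value is not found but a NULL was encountered, the answer is unknown rather than false. A minimal Python model of that rule is sketched below, with None standing in for NULL. It is an illustration of the semantics as stated in the note, not Presto's implementation.

```python
from typing import Any, Optional, Sequence


def contains(values: Sequence[Any], target: Any) -> Optional[bool]:
    """Three-valued membership test; None models SQL NULL."""
    saw_null = False
    for value in values:
        if value is None:
            # A NULL element could be equal to anything, so remember it.
            saw_null = True
        elif value == target:
            # A definite match wins regardless of any NULLs.
            return True
    # No match: the result is unknown (None) if a NULL was seen, else False.
    return None if saw_null else False


found = contains([1, None, 2], 2)
unknown = contains([1, None, 3], 2)
absent = contains([1, 3], 2)
```

Here found is a definite match, unknown models the NULL result described in the fix, and absent is a definite miss.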

Hive Changes

  • Collect more metrics from PrestoS3FileSystem.
  • Retry when seeking in PrestoS3FileSystem.
  • Ignore InvalidRange error in PrestoS3FileSystem.
  • Implement rename and delete in PrestoS3FileSystem.
  • Fix assertion failure when running SHOW TABLES FROM schema.
  • Fix S3 socket leak when reading ORC files.