String type

All string literals are of the type string. A string in Nim is very similar to a sequence of characters. However, strings in Nim are both zero-terminated and have a length field. One can retrieve the length with the built-in len procedure; the length never counts the terminating zero.
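As a minimal sketch of the length behavior described above (the variable names greeting and chars are illustrative, not part of the manual):

```nim
# `len` reports the number of chars in the string; the terminating
# zero is never included in that count.
let greeting = "hello"
doAssert greeting.len == 5
doAssert "".len == 0

# Like a sequence of characters, a string can be iterated char by char.
var chars: seq[char] = @[]
for c in greeting:
  chars.add c
doAssert chars.len == greeting.len
doAssert chars[0] == 'h'
```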

The terminating zero cannot be accessed unless the string is converted to the cstring type first. The terminating zero ensures that this conversion can be done in O(1) and without any allocations.
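The conversion can be sketched as follows. This assumes the C backend, where a cstring is a pointer into the string's own buffer; the names s and cs are illustrative.

```nim
let s = "abc"

# Explicit conversion; no characters are copied, so `cs` is only
# valid for as long as `s` is alive and unmodified.
let cs = cstring(s)

doAssert cs.len == 3       # computed by scanning for the zero byte
doAssert cs[0] == 'a'
doAssert cs[3] == '\0'     # the terminator is reachable through the cstring
```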

The assignment operator for strings always copies the string. The & operator concatenates strings.
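A short illustration of copy-on-assignment and concatenation (variable names are illustrative):

```nim
var a = "abc"
var b = a            # `b` receives its own copy of the contents
b.add "def"          # modifying `b` leaves `a` untouched

doAssert a == "abc"
doAssert b == "abcdef"

# `&` builds a new string from its operands.
let c = a & "-" & b
doAssert c == "abc-abcdef"
```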

Most native Nim types support conversion to strings with the special $ proc. When calling the echo proc, for example, the built-in stringify operation for the parameter is called:

  echo 3 # calls `$` for `int`

When a user defines a custom object type, implementing this procedure for the type provides its string representation.

  type
    Person = object
      name: string
      age: int

  proc `$`(p: Person): string = # `$` always returns a string
    result = p.name & " is " &
            $p.age & # we *need* the `$` in front of p.age which
                     # is natively an integer to convert it to
                     # a string
            " years old."

While $p.name can also be used, the $ operation on a string does nothing. Note that we cannot rely on automatic conversion from an int to a string like we can for the echo proc.
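A self-contained usage sketch of a user-defined `$`, with hypothetical field values, might look like this:

```nim
type
  Person = object
    name: string
    age: int

proc `$`(p: Person): string =
  # `&` only accepts strings and chars, so the int needs an explicit `$`.
  p.name & " is " & $p.age & " years old."

let p = Person(name: "Ada", age: 36)   # hypothetical example values

doAssert $p == "Ada is 36 years old."
doAssert $p.name == p.name             # `$` on a string returns it unchanged

echo p      # echo applies `$` to its argument for us
```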

Strings are compared by their lexicographical order. All comparison operators are available. Strings can be indexed like arrays (lower bound is 0). Unlike arrays, they can be used in case statements:

  case paramStr(i)
  of "-v": incl(options, optVerbose)
  of "-h", "-?": incl(options, optHelp)
  else: write(stdout, "invalid command line option!\n")
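The comparison, indexing, and case behavior can be tried in isolation. The sketch below uses a hypothetical helper, describe, in place of the option-set handling shown in the fragment above.

```nim
# Comparison is lexicographic on the underlying chars.
doAssert "apple" < "banana"
doAssert "abc" < "abd"
doAssert "abc" < "abcd"       # a proper prefix sorts first
doAssert "Zebra" < "apple"    # 'Z' (90) sorts before 'a' (97)

# Indexing starts at 0, as with arrays.
let s = "nim"
doAssert s[0] == 'n'
doAssert s[s.high] == 'm'

# Strings as case selectors (hypothetical helper for illustration).
proc describe(flag: string): string =
  case flag
  of "-v": result = "verbose"
  of "-h", "-?": result = "help"
  else: result = "invalid command line option"

doAssert describe("-?") == "help"
```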

By convention, all strings are UTF-8 strings, but this is not enforced. For example, when reading strings from binary files, they are merely a sequence of bytes. The index operation s[i] means the i-th char (byte) of s, not the i-th Unicode character. The iterator runes from the unicode module can be used for iteration over all Unicode characters.
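The difference between byte indexing and rune iteration can be sketched as follows. The sample text is an assumption chosen for illustration; it is written with byte escapes so the example does not depend on the source file's encoding.

```nim
import std/unicode

# "héllo": the character U+00E9 is encoded in UTF-8 as the bytes C3 A9.
let s = "h\xC3\xA9llo"

doAssert s.len == 6          # len counts bytes
doAssert s.runeLen == 5      # runeLen counts Unicode code points
doAssert s[1] == '\xC3'      # s[1] is only the first byte of U+00E9

# `runes` decodes the string one Unicode character at a time.
var count = 0
for r in s.runes:
  inc count
doAssert count == 5
```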