Chapter 2 The module system

This chapter introduces the module system of OCaml.

2.1 Structures

A primary motivation for modules is to package together related definitions (such as the definitions of a data type and associated operations over that type) and enforce a consistent naming scheme for these definitions. This avoids running out of names or accidentally confusing names. Such a package is called a structure and is introduced by the struct…end construct, which contains an arbitrary sequence of definitions. The structure is usually given a name with the module binding. Here is, for instance, a structure packaging together a type of priority queues and their operations:

  module PrioQueue =
    struct
      type priority = int
      type 'a queue = Empty | Node of priority * 'a * 'a queue * 'a queue
      let empty = Empty
      let rec insert queue prio elt =
        match queue with
          Empty -> Node(prio, elt, Empty, Empty)
        | Node(p, e, left, right) ->
            if prio <= p
            then Node(prio, elt, insert right p e, left)
            else Node(p, e, insert right prio elt, left)
      exception Queue_is_empty
      let rec remove_top = function
          Empty -> raise Queue_is_empty
        | Node(prio, elt, left, Empty) -> left
        | Node(prio, elt, Empty, right) -> right
        | Node(prio, elt, (Node(lprio, lelt, _, _) as left),
                          (Node(rprio, relt, _, _) as right)) ->
            if lprio <= rprio
            then Node(lprio, lelt, remove_top left, right)
            else Node(rprio, relt, left, remove_top right)
      let extract = function
          Empty -> raise Queue_is_empty
        | Node(prio, elt, _, _) as queue -> (prio, elt, remove_top queue)
    end;;
  module PrioQueue :
    sig
      type priority = int
      type 'a queue = Empty | Node of priority * 'a * 'a queue * 'a queue
      val empty : 'a queue
      val insert : 'a queue -> priority -> 'a -> 'a queue
      exception Queue_is_empty
      val remove_top : 'a queue -> 'a queue
      val extract : 'a queue -> priority * 'a * 'a queue
    end

Outside the structure, its components can be referred to using the “dot notation”, that is, identifiers qualified by a structure name. For instance, PrioQueue.insert is the function insert defined inside the structure PrioQueue and PrioQueue.queue is the type queue defined in PrioQueue.

  PrioQueue.insert PrioQueue.empty 1 "hello";;
  - : string PrioQueue.queue =
  PrioQueue.Node (1, "hello", PrioQueue.Empty, PrioQueue.Empty)
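
As a slightly larger illustration of the dot notation, the sketch below fills a queue from a list and then empties it in priority order. The helpers of_list and drain are hypothetical names introduced here, not part of PrioQueue; the structure itself is repeated so that the fragment is self-contained, with unused pattern variables replaced by _.

```ocaml
(* The PrioQueue structure from the text, repeated for self-containment.
   Unused pattern variables are written _ here. *)
module PrioQueue =
  struct
    type priority = int
    type 'a queue = Empty | Node of priority * 'a * 'a queue * 'a queue
    let empty = Empty
    let rec insert queue prio elt =
      match queue with
        Empty -> Node(prio, elt, Empty, Empty)
      | Node(p, e, left, right) ->
          if prio <= p
          then Node(prio, elt, insert right p e, left)
          else Node(p, e, insert right prio elt, left)
    exception Queue_is_empty
    let rec remove_top = function
        Empty -> raise Queue_is_empty
      | Node(_, _, left, Empty) -> left
      | Node(_, _, Empty, right) -> right
      | Node(_, _, (Node(lprio, lelt, _, _) as left),
                   (Node(rprio, relt, _, _) as right)) ->
          if lprio <= rprio
          then Node(lprio, lelt, remove_top left, right)
          else Node(rprio, relt, left, remove_top right)
    let extract = function
        Empty -> raise Queue_is_empty
      | Node(prio, elt, _, _) as queue -> (prio, elt, remove_top queue)
  end

(* Hypothetical helper: insert every (priority, element) pair of a list. *)
let of_list pairs =
  List.fold_left
    (fun q (prio, elt) -> PrioQueue.insert q prio elt)
    PrioQueue.empty pairs

(* Hypothetical helper: extract elements until the queue is empty.
   The exception case only covers the call to PrioQueue.extract. *)
let rec drain q =
  match PrioQueue.extract q with
  | (_, elt, rest) -> elt :: drain rest
  | exception PrioQueue.Queue_is_empty -> []

(* Elements come out by increasing priority: ["a"; "b"; "c"]. *)
let sorted = drain (of_list [ (3, "c"); (1, "a"); (2, "b") ])
```

Every component, including the exception Queue_is_empty, is reached through the PrioQueue. prefix.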

Another possibility is to open the module, which brings all identifiers defined inside the module into the scope of the current structure.

  open PrioQueue;;

  insert empty 1 "hello";;
  - : string PrioQueue.queue = Node (1, "hello", Empty, Empty)

Opening a module enables lighter access to its components, at the cost of making it harder to identify in which module an identifier has been defined. In particular, opened modules can shadow identifiers present in the current scope, potentially leading to confusing errors:

  let empty = []
  open PrioQueue;;
  val empty : 'a list = []

  let x = 1 :: empty ;;
  Error: This expression has type 'a PrioQueue.queue
         but an expression was expected of type int list

A partial solution to this conundrum is to open modules locally, making the components of the module available only in the concerned expression. This can also make the code easier to read (the open statement is closer to where it is used) and to refactor (the code fragment is more self-contained). Two constructions are available for this purpose:

  let open PrioQueue in
  insert empty 1 "hello";;
  - : string PrioQueue.queue = Node (1, "hello", Empty, Empty)

and

  PrioQueue.(insert empty 1 "hello");;
  - : string PrioQueue.queue = Node (1, "hello", Empty, Empty)

In the second form, when the body of a local open is itself delimited by parentheses, braces or brackets, the parentheses of the local open can be omitted. For instance,

  PrioQueue.[empty] = PrioQueue.([empty]);;
  - : bool = true

  PrioQueue.[|empty|] = PrioQueue.([|empty|]);;
  - : bool = true

  PrioQueue.{ contents = empty } = PrioQueue.({ contents = empty });;
  - : bool = true

Likewise, a list containing the earlier expression can be written without the extra parentheses:

  PrioQueue.[insert empty 1 "hello"];;
  - : string PrioQueue.queue list = [Node (1, "hello", Empty, Empty)]
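
To illustrate how a local open confines shadowing to a single expression, here is a minimal sketch with a hypothetical module Vec2 (not part of the priority queue example) that redefines the + operator. Outside the local opens, + keeps its usual integer meaning.

```ocaml
(* Hypothetical module, used only for this illustration. *)
module Vec2 = struct
  type t = { x : float; y : float }
  let zero = { x = 0.; y = 0. }
  let make x y = { x; y }
  (* Shadows integer addition for anyone who opens Vec2. *)
  let ( + ) a b = { x = a.x +. b.x; y = a.y +. b.y }
end

(* Outside any open, (+) is integer addition. *)
let three = 1 + 2

(* Inside Vec2.( ... ), make and (+) refer to the Vec2 definitions. *)
let v = Vec2.(make 1. 2. + make 3. 4.)

(* The same effect with the let open ... in form. *)
let w =
  let open Vec2 in
  zero + make 0.5 0.5

(* Once the local scope ends, integer addition is visible again. *)
let four = three + 1
```

Had Vec2 been opened globally, the definition of four would be rejected, because + would then expect Vec2.t arguments.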

It is also possible to copy the components of a module inside another module by using an include statement. This can be particularly useful to extend existing modules. As an illustration, we could add functions that return an optional value rather than raising an exception when the priority queue is empty.

  module PrioQueueOpt =
    struct
      include PrioQueue
      let remove_top_opt x =
        try Some(remove_top x) with Queue_is_empty -> None
      let extract_opt x =
        try Some(extract x) with Queue_is_empty -> None
    end;;
  module PrioQueueOpt :
    sig
      type priority = int
      type 'a queue =
        'a PrioQueue.queue =
          Empty
        | Node of priority * 'a * 'a queue * 'a queue
      val empty : 'a queue
      val insert : 'a queue -> priority -> 'a -> 'a queue
      exception Queue_is_empty
      val remove_top : 'a queue -> 'a queue
      val extract : 'a queue -> priority * 'a * 'a queue
      val remove_top_opt : 'a queue -> 'a queue option
      val extract_opt : 'a queue -> (priority * 'a * 'a queue) option
    end
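
The same pattern applies to any existing module. As a small sketch, the hypothetical module ListExt below includes the standard List module and adds one option-returning function, hd_opt (a name chosen here for illustration), in the spirit of extract_opt.

```ocaml
(* Hypothetical extension of the standard List module. *)
module ListExt = struct
  include List
  (* List.hd raises Failure on the empty list; hd_opt returns None. *)
  let hd_opt l = try Some (hd l) with Failure _ -> None
end

let first = ListExt.hd_opt [1; 2; 3]           (* Some 1 *)
let nothing = ListExt.hd_opt ([] : int list)   (* None *)

(* Components copied by include remain available under the new name. *)
let n = ListExt.length [1; 2; 3]               (* 3 *)
```

Inside ListExt, the included function hd is in scope without qualification, just as remove_top and extract are inside PrioQueueOpt.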

2.2 Signatures

Signatures are interfaces for structures. A signature specifies which components of a structure are accessible from the outside, and with which type. It can be used to hide some components of a structure (e.g. local function definitions) or export some components with a restricted type. For instance, the signature below specifies the three priority queue operations empty, insert and extract, but not the auxiliary function remove_top. Similarly, it makes the queue type abstract (by not providing its actual representation as a concrete type).

  module type PRIOQUEUE =
    sig
      type priority = int         (* still concrete *)
      type 'a queue               (* now abstract *)
      val empty : 'a queue
      val insert : 'a queue -> int -> 'a -> 'a queue
      val extract : 'a queue -> int * 'a * 'a queue
      exception Queue_is_empty
    end;;
  module type PRIOQUEUE =
    sig
      type priority = int
      type 'a queue
      val empty : 'a queue
      val insert : 'a queue -> int -> 'a -> 'a queue
      val extract : 'a queue -> int * 'a * 'a queue
      exception Queue_is_empty
    end

Restricting the PrioQueue structure by this signature results in another view of the PrioQueue structure where the remove_top function is not accessible and the actual representation of priority queues is hidden:

  module AbstractPrioQueue = (PrioQueue : PRIOQUEUE);;
  module AbstractPrioQueue : PRIOQUEUE

  AbstractPrioQueue.remove_top ;;
  Error: Unbound value AbstractPrioQueue.remove_top

  AbstractPrioQueue.insert AbstractPrioQueue.empty 1 "hello";;
  - : string AbstractPrioQueue.queue = <abstr>

The restriction can also be performed during the definition of the structure, as in

  module PrioQueue = (struct ... end : PRIOQUEUE);;

An alternate syntax is provided for the above:

  module PrioQueue : PRIOQUEUE = struct ... end;;
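
As a compact, complete instance of this syntax, consider the following sketch. The signature COUNTER and the structure Counter are hypothetical and unrelated to the priority queue; they only show a structure being restricted at its definition, with one type made abstract and one helper hidden.

```ocaml
(* Hypothetical signature: t is abstract, step is not mentioned. *)
module type COUNTER = sig
  type t
  val zero : t
  val incr : t -> t
  val value : t -> int
end

(* The restriction is applied at the point of definition. *)
module Counter : COUNTER = struct
  type t = int
  let step = 1                (* hidden: absent from COUNTER *)
  let zero = 0
  let incr n = n + step
  let value n = n
end

let two = Counter.(value (incr (incr zero)))   (* 2 *)

(* Both of the following would be rejected by the type checker:
     Counter.step          (unbound value)
     Counter.zero + 1      (Counter.t is not known to be int)  *)
```

Clients can only build and inspect counters through zero, incr and value, so the representation could later change without affecting them.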

As with structures, it is possible to include a signature to copy its components inside the current signature. For instance, we can extend the PRIOQUEUE signature with the extract_opt function:

  module type PRIOQUEUE_WITH_OPT =
    sig
      include PRIOQUEUE
      val extract_opt : 'a queue -> (int * 'a * 'a queue) option
    end;;
  module type PRIOQUEUE_WITH_OPT =
    sig
      type priority = int
      type 'a queue
      val empty : 'a queue
      val insert : 'a queue -> int -> 'a -> 'a queue
      val extract : 'a queue -> int * 'a * 'a queue
      exception Queue_is_empty
      val extract_opt : 'a queue -> (int * 'a * 'a queue) option
    end
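
An extended signature is used like any other. The sketch below applies the same idea to a hypothetical stack interface (STACK, STACK_WITH_OPT and ListStack are names invented for this example) and gives a structure that implements the extended signature.

```ocaml
(* Hypothetical base signature. *)
module type STACK = sig
  type 'a t
  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  exception Empty_stack
  val pop : 'a t -> 'a * 'a t
end

(* Extended signature: everything in STACK, plus pop_opt. *)
module type STACK_WITH_OPT = sig
  include STACK
  val pop_opt : 'a t -> ('a * 'a t) option
end

(* A list-based implementation restricted by the extended signature. *)
module ListStack : STACK_WITH_OPT = struct
  type 'a t = 'a list
  let empty = []
  let push x s = x :: s
  exception Empty_stack
  let pop = function
    | [] -> raise Empty_stack
    | x :: s -> (x, s)
  let pop_opt s = try Some (pop s) with Empty_stack -> None
end

let top =
  match ListStack.(pop_opt (push 1 empty)) with
  | Some (x, _) -> Some x
  | None -> None

let empty_is_none = ListStack.(pop_opt (empty : int t)) = None
```

Because STACK_WITH_OPT includes STACK, the type 'a t mentioned in pop_opt is the same abstract type used by push and pop.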

2.3 Functors

Functors are “functions” from modules to modules. Functors let you create parameterized modules and then provide other modules as parameter(s) to get a specific implementation. For instance, a Set module implementing sets as sorted lists could be parameterized to work with any module that provides an element type and a comparison function compare (such as the OrderedString structure defined below):

  type comparison = Less | Equal | Greater;;
  type comparison = Less | Equal | Greater

  module type ORDERED_TYPE =
    sig
      type t
      val compare: t -> t -> comparison
    end;;
  module type ORDERED_TYPE = sig type t val compare : t -> t -> comparison end

  module Set =
    functor (Elt: ORDERED_TYPE) ->
      struct
        type element = Elt.t
        type set = element list
        let empty = []
        let rec add x s =
          match s with
            [] -> [x]
          | hd::tl ->
             match Elt.compare x hd with
               Equal   -> s         (* x is already in s *)
             | Less    -> x :: s    (* x is smaller than all elements of s *)
             | Greater -> hd :: add x tl
        let rec member x s =
          match s with
            [] -> false
          | hd::tl ->
              match Elt.compare x hd with
                Equal   -> true     (* x belongs to s *)
              | Less    -> false    (* x is smaller than all elements of s *)
              | Greater -> member x tl
      end;;
  module Set :
    functor (Elt : ORDERED_TYPE) ->
      sig
        type element = Elt.t
        type set = element list
        val empty : 'a list
        val add : Elt.t -> Elt.t list -> Elt.t list
        val member : Elt.t -> Elt.t list -> bool
      end

By applying the Set functor to a structure implementing an ordered type, we obtain set operations for this type:

  module OrderedString =
    struct
      type t = string
      let compare x y = if x = y then Equal else if x < y then Less else Greater
    end;;
  module OrderedString :
    sig type t = string val compare : 'a -> 'a -> comparison end

  module StringSet = Set(OrderedString);;
  module StringSet :
    sig
      type element = OrderedString.t
      type set = element list
      val empty : 'a list
      val add : OrderedString.t -> OrderedString.t list -> OrderedString.t list
      val member : OrderedString.t -> OrderedString.t list -> bool
    end

  StringSet.member "bar" (StringSet.add "foo" StringSet.empty);;
  - : bool = false
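
The same functor can be applied to any other ordered type. The sketch below repeats the definitions above so that it is self-contained, then applies Set to a hypothetical structure OrderedInt (not part of the original example) to obtain sets of integers.

```ocaml
type comparison = Less | Equal | Greater

module type ORDERED_TYPE =
  sig
    type t
    val compare : t -> t -> comparison
  end

(* The Set functor from the text: sets as sorted lists without duplicates. *)
module Set =
  functor (Elt : ORDERED_TYPE) ->
    struct
      type element = Elt.t
      type set = element list
      let empty = []
      let rec add x s =
        match s with
          [] -> [x]
        | hd :: tl ->
            match Elt.compare x hd with
              Equal -> s
            | Less -> x :: s
            | Greater -> hd :: add x tl
      let rec member x s =
        match s with
          [] -> false
        | hd :: tl ->
            match Elt.compare x hd with
              Equal -> true
            | Less -> false
            | Greater -> member x tl
    end

(* Hypothetical ordered type for integers. *)
module OrderedInt =
  struct
    type t = int
    let compare (x : int) (y : int) =
      if x = y then Equal else if x < y then Less else Greater
  end

module IntSet = Set(OrderedInt)

(* Inserting 3, 1, 2, 3 yields the sorted list [1; 2; 3]. The list is
   visible only because the representation is not hidden here. *)
let s = List.fold_left (fun acc x -> IntSet.add x acc) IntSet.empty [3; 1; 2; 3]
let has_two = IntSet.member 2 s     (* true *)
let has_five = IntSet.member 5 s    (* false *)
```

Note that defining a module named Set shadows the standard library module of the same name for the rest of the file, as it does in the text.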

2.4 Functors and type abstraction

As in the PrioQueue example, it would be good style to hide the actual implementation of the type set, so that users of the structure will not rely on sets being lists, and we can switch later to another, more efficient representation of sets without breaking their code. This can be achieved by restricting Set by a suitable functor signature:

  module type SETFUNCTOR =
    functor (Elt: ORDERED_TYPE) ->
      sig
        type element = Elt.t      (* concrete *)
        type set                  (* abstract *)
        val empty : set
        val add : element -> set -> set
        val member : element -> set -> bool
      end;;
  module type SETFUNCTOR =
    functor (Elt : ORDERED_TYPE) ->
      sig
        type element = Elt.t
        type set
        val empty : set
        val add : element -> set -> set
        val member : element -> set -> bool
      end

  module AbstractSet = (Set : SETFUNCTOR);;
  module AbstractSet : SETFUNCTOR

  module AbstractStringSet = AbstractSet(OrderedString);;
  module AbstractStringSet :
    sig
      type element = OrderedString.t
      type set = AbstractSet(OrderedString).set
      val empty : set
      val add : element -> set -> set
      val member : element -> set -> bool
    end

  AbstractStringSet.add "gee" AbstractStringSet.empty;;
  - : AbstractStringSet.set = <abstr>

In an attempt to write the type constraint above more elegantly, one may wish to name the signature of the structure returned by the functor, then use that signature in the constraint:

  module type SET =
    sig
      type element
      type set
      val empty : set
      val add : element -> set -> set
      val member : element -> set -> bool
    end;;
  module type SET =
    sig
      type element
      type set
      val empty : set
      val add : element -> set -> set
      val member : element -> set -> bool
    end

  module WrongSet = (Set : functor(Elt: ORDERED_TYPE) -> SET);;
  module WrongSet : functor (Elt : ORDERED_TYPE) -> SET

  module WrongStringSet = WrongSet(OrderedString);;
  module WrongStringSet :
    sig
      type element = WrongSet(OrderedString).element
      type set = WrongSet(OrderedString).set
      val empty : set
      val add : element -> set -> set
      val member : element -> set -> bool
    end

  WrongStringSet.add "gee" WrongStringSet.empty ;;
  Error: This expression has type string but an expression was expected of type
           WrongStringSet.element = WrongSet(OrderedString).element

The problem here is that SET specifies the type element abstractly, so that the type equality between element in the result of the functor and t in its argument is forgotten. Consequently, WrongStringSet.element is not the same type as string, and the operations of WrongStringSet cannot be applied to strings. As demonstrated above, it is important that the type element in the signature SET be declared equal to Elt.t; unfortunately, this cannot be written in the definition of SET, since SET is defined in a context where Elt does not exist. To overcome this difficulty, OCaml provides a with type construct over signatures that allows enriching a signature with extra type equalities:

  module AbstractSet2 =
    (Set : functor(Elt: ORDERED_TYPE) -> (SET with type element = Elt.t));;
  module AbstractSet2 :
    functor (Elt : ORDERED_TYPE) ->
      sig
        type element = Elt.t
        type set
        val empty : set
        val add : element -> set -> set
        val member : element -> set -> bool
      end

As in the case of simple structures, an alternate syntax is provided for defining functors and restricting their result:

  module AbstractSet2(Elt: ORDERED_TYPE) : (SET with type element = Elt.t) =
    struct ... end;;
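
For reference, here is one way the alternate syntax could look when written out in full. This is only a sketch: the functor body simply reuses the sorted-list implementation of Set given earlier, the supporting definitions are repeated for self-containment, and compare in OrderedString is annotated to take strings (in the text it is polymorphic).

```ocaml
type comparison = Less | Equal | Greater

module type ORDERED_TYPE =
  sig
    type t
    val compare : t -> t -> comparison
  end

module type SET =
  sig
    type element
    type set
    val empty : set
    val add : element -> set -> set
    val member : element -> set -> bool
  end

(* Functor definition and result restriction in a single declaration. *)
module AbstractSet2(Elt: ORDERED_TYPE) : (SET with type element = Elt.t) =
  struct
    type element = Elt.t
    type set = element list
    let empty = []
    let rec add x s =
      match s with
        [] -> [x]
      | hd :: tl ->
          match Elt.compare x hd with
            Equal -> s
          | Less -> x :: s
          | Greater -> hd :: add x tl
    let rec member x s =
      match s with
        [] -> false
      | hd :: tl ->
          match Elt.compare x hd with
            Equal -> true
          | Less -> false
          | Greater -> member x tl
  end

module OrderedString =
  struct
    type t = string
    let compare (x : string) y =
      if x = y then Equal else if x < y then Less else Greater
  end

module StringSet2 = AbstractSet2(OrderedString)

(* Accepted: StringSet2.element is known to be equal to string. *)
let s = StringSet2.add "gee" (StringSet2.add "foo" StringSet2.empty)
let has_gee = StringSet2.member "gee" s   (* true *)
let has_bar = StringSet2.member "bar" s   (* false *)
```

Unlike with WrongSet, string literals are accepted as elements, while the type StringSet2.set remains abstract.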

Abstracting a type component in a functor result is a powerful technique that provides a high degree of type safety, as we now illustrate. Consider an ordering over character strings that is different from the standard ordering implemented in the OrderedString structure. For instance, we compare strings without distinguishing upper and lower case.

  module NoCaseString =
    struct
      type t = string
      let compare s1 s2 =
        OrderedString.compare (String.lowercase_ascii s1) (String.lowercase_ascii s2)
    end;;
  module NoCaseString :
    sig type t = string val compare : string -> string -> comparison end

  module NoCaseStringSet = AbstractSet(NoCaseString);;
  module NoCaseStringSet :
    sig
      type element = NoCaseString.t
      type set = AbstractSet(NoCaseString).set
      val empty : set
      val add : element -> set -> set
      val member : element -> set -> bool
    end

  NoCaseStringSet.add "FOO" AbstractStringSet.empty ;;
  Error: This expression has type
           AbstractStringSet.set = AbstractSet(OrderedString).set
         but an expression was expected of type
           NoCaseStringSet.set = AbstractSet(NoCaseString).set

Note that the two types AbstractStringSet.set and NoCaseStringSet.set are not compatible, and values of these two types do not match. This is the correct behavior: even though both set types contain elements of the same type (strings), they are built upon different orderings of that type, and different invariants need to be maintained by the operations (the underlying list must be strictly increasing for the standard ordering in one case, and for the case-insensitive ordering in the other). Applying operations from AbstractStringSet to values of type NoCaseStringSet.set could give incorrect results, or build lists that violate the invariants of NoCaseStringSet.
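
To make the difference between the two orderings concrete, the following sketch builds one set with each and asks the same membership question of both. For brevity, AbstractSet is defined here directly with the with type syntax rather than through SETFUNCTOR, and OrderedString.compare is annotated to take strings; the module name StringSet is chosen for this example.

```ocaml
type comparison = Less | Equal | Greater

module type ORDERED_TYPE =
  sig
    type t
    val compare : t -> t -> comparison
  end

module type SET =
  sig
    type element
    type set
    val empty : set
    val add : element -> set -> set
    val member : element -> set -> bool
  end

(* Sets as sorted lists, with the representation hidden. *)
module AbstractSet(Elt: ORDERED_TYPE) : (SET with type element = Elt.t) =
  struct
    type element = Elt.t
    type set = element list
    let empty = []
    let rec add x s =
      match s with
        [] -> [x]
      | hd :: tl ->
          match Elt.compare x hd with
            Equal -> s
          | Less -> x :: s
          | Greater -> hd :: add x tl
    let rec member x s =
      match s with
        [] -> false
      | hd :: tl ->
          match Elt.compare x hd with
            Equal -> true
          | Less -> false
          | Greater -> member x tl
  end

module OrderedString =
  struct
    type t = string
    let compare (x : string) y =
      if x = y then Equal else if x < y then Less else Greater
  end

module NoCaseString =
  struct
    type t = string
    let compare s1 s2 =
      OrderedString.compare
        (String.lowercase_ascii s1) (String.lowercase_ascii s2)
  end

module StringSet = AbstractSet(OrderedString)
module NoCaseStringSet = AbstractSet(NoCaseString)

(* "FOO" and "foo" are distinct elements under the standard ordering... *)
let case_sensitive = StringSet.(member "FOO" (add "foo" empty))

(* ...but the same element under the case-insensitive ordering. *)
let case_insensitive = NoCaseStringSet.(member "FOO" (add "foo" empty))
```

Here case_sensitive is false and case_insensitive is true, which is why the type checker keeps the two set types apart.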

2.5 Modules and separate compilation

All examples of modules so far have been given in the context of the interactive system. However, modules are most useful for large, batch-compiled programs. For these programs, it is a practical necessity to split the source into several files, called compilation units, that can be compiled separately, thus minimizing recompilation after changes.

In OCaml, compilation units are special cases of structures and signatures, and the relationship between the units can be explained easily in terms of the module system. A compilation unit A comprises two files:

  • the implementation file A.ml, which contains a sequence of definitions, analogous to the inside of a struct…end construct;
  • the interface file A.mli, which contains a sequence of specifications, analogous to the inside of a sig…end construct.

These two files together define a structure named A as if the following definition was entered at top-level:

  module A: sig (* contents of file A.mli *) end
          = struct (* contents of file A.ml *) end;;

The files that define the compilation units can be compiled separately using the ocamlc -c command (the -c option means “compile only, do not try to link”); this produces compiled interface files (with extension .cmi) and compiled object code files (with extension .cmo). When all units have been compiled, their .cmo files are linked together using the ocamlc command. For instance, the following commands compile and link a program composed of two compilation units Aux and Main:

  $ ocamlc -c Aux.mli                     # produces Aux.cmi
  $ ocamlc -c Aux.ml                      # produces Aux.cmo
  $ ocamlc -c Main.mli                    # produces Main.cmi
  $ ocamlc -c Main.ml                     # produces Main.cmo
  $ ocamlc -o theprogram Aux.cmo Main.cmo

The program behaves exactly as if the following phrases were entered at top-level:

  module Aux: sig (* contents of Aux.mli *) end
            = struct (* contents of Aux.ml *) end;;
  module Main: sig (* contents of Main.mli *) end
             = struct (* contents of Main.ml *) end;;
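
To make the correspondence concrete, the sketch below writes out such top-level phrases with invented contents. The values greeting, prefix and message are hypothetical; each sig stands for what a .mli file might contain and each struct for the matching .ml file.

```ocaml
(* Stand-in for the unit Aux; the contents are hypothetical. *)
module Aux : sig
  (* what Aux.mli might contain *)
  val greeting : string -> string
end = struct
  (* what Aux.ml might contain *)
  let prefix = "Hello, "              (* not exported by the interface *)
  let greeting name = prefix ^ name
end

(* Stand-in for the unit Main, which refers to Aux by dot notation. *)
module Main : sig
  (* what Main.mli might contain *)
  val message : string
end = struct
  (* what Main.ml might contain *)
  let message = Aux.greeting "world"
end
```

As with any signature restriction, Aux.prefix is not visible from Main, because the interface does not export it.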

In particular, Main can refer to Aux: the definitions and declarations contained in Main.ml and Main.mli can refer to definitions in Aux.ml, using the Aux.ident notation, provided these definitions are exported in Aux.mli.

The order in which the .cmo files are given to ocamlc during the linking phase determines the order in which the module definitions occur. Hence, in the example above, Aux appears first and Main can refer to it, but Aux cannot refer to Main.

Note that only top-level structures can be mapped to separately-compiled files; functors and module types cannot. However, all module-class objects can appear as components of a structure, so the solution is to put the functor or module type inside a structure, which can then be mapped to a file.
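
The standard library follows this approach: the Set unit, for example, contains the module type Set.OrderedType and the functor Set.Make. A minimal sketch of the same layout is given below, where the structure Pairing plays the role of a hypothetical file pairing.ml; all names in it are invented for illustration.

```ocaml
(* Stand-in for a hypothetical compilation unit pairing.ml: a structure
   whose components are a module type and a functor. *)
module Pairing = struct
  module type SHOWABLE = sig
    type t
    val show : t -> string
  end

  module Make (A : SHOWABLE) (B : SHOWABLE) = struct
    type t = A.t * B.t
    let show (a, b) = "(" ^ A.show a ^ ", " ^ B.show b ^ ")"
  end
end

(* Client code, as it might appear in another unit. *)
module ShowInt = struct
  type t = int
  let show = string_of_int
end

module ShowBool = struct
  type t = bool
  let show = string_of_bool
end

module IntBool = Pairing.Make (ShowInt) (ShowBool)

let s = IntBool.show (1, true)   (* "(1, true)" *)
```

Clients reach the functor and the module type through the enclosing structure, as Pairing.Make and Pairing.SHOWABLE.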