Sets

Sets can also be implemented in terms of cons cells. In fact, you can treat any list as a set—Common Lisp provides several functions for performing set-theoretic operations on lists. However, you should bear in mind that because of the way lists are structured, these operations get less and less efficient the bigger the sets get.

That said, using the built-in set functions makes it easy to write set-manipulation code. And for small sets they may well be more efficient than the alternatives. If profiling shows you that these functions are a performance bottleneck in your code, you can always replace the lists with sets built on top of hash tables or bit vectors.

To build up a set, you can use the function **ADJOIN**. **ADJOIN** takes an item and a list representing a set and returns a list representing the set containing the item and all the items in the original set. To determine whether the item is present, it must scan the list; if the item isn’t found, **ADJOIN** creates a new cons cell holding the item and pointing to the original list and returns it. Otherwise, it returns the original list.

**ADJOIN** also takes :key and :test keyword arguments, which are used when determining whether the item is present in the original list. Like **CONS**, **ADJOIN** has no effect on the original list—if you want to modify a particular list, you need to assign the value returned by **ADJOIN** to the place where the list came from. The modify macro **PUSHNEW** does this for you automatically.

  1. CL-USER> (defparameter *set* ())
  2. *SET*
  3. CL-USER> (adjoin 1 *set*)
  4. (1)
  5. CL-USER> *set*
  6. NIL
  7. CL-USER> (setf *set* (adjoin 1 *set*))
  8. (1)
  9. CL-USER> (pushnew 2 *set*)
  10. (2 1)
  11. CL-USER> *set*
  12. (2 1)
  13. CL-USER> (pushnew 2 *set*)
  14. (2 1)

You can test whether a given item is in a set with **MEMBER** and the related functions **MEMBER-IF** and **MEMBER-IF-NOT**. These functions are similar to the sequence functions **FIND**, **FIND-IF**, and **FIND-IF-NOT** except they can be used only with lists. And instead of returning the item when it’s present, they return the cons cell containing the item—in other words, the sublist starting with the desired item. When the desired item isn’t present in the list, all three functions return **NIL**.

The remaining set-theoretic functions provide bulk operations: **INTERSECTION**, **UNION**, **SET-DIFFERENCE**, and **SET-EXCLUSIVE-OR**. Each of these functions takes two lists and :key and :test keyword arguments and returns a new list representing the set resulting from performing the appropriate set-theoretic operation on the two lists: **INTERSECTION** returns a list containing all the elements found in both arguments. **UNION** returns a list containing one instance of each unique element from the two arguments.3 **SET-DIFFERENCE** returns a list containing all the elements from the first argument that don’t appear in the second argument. And **SET-EXCLUSIVE-OR** returns a list containing those elements appearing in only one or the other of the two argument lists but not in both. Each of these functions also has a recycling counterpart whose name is the same except with an N prefix.

Finally, the function **SUBSETP** takes two lists and the usual :key and :test keyword arguments and returns true if the first list is a subset of the second—if every element in the first list is also present in the second list. The order of the elements in the lists doesn’t matter.

  1. CL-USER> (subsetp '(3 2 1) '(1 2 3 4))
  2. T
  3. CL-USER> (subsetp '(1 2 3 4) '(3 2 1))
  4. NIL