layout: post
title: “We don’t need no stinking UML diagrams”
description: “A comparison of code vs UML”

categories: [“DDD”]

In my talk on functional DDD, I often use this slide (in context):

We don't need no stinking UML diagrams

Which is of course a misquote of this famous scene. Oops, I mean this one.

Ok, I might be exaggerating a bit. Some UML diagrams are useful (I like sequence diagrams for example) and in general, I do think a good picture or diagram can be worth 1000 words.

But I believe that, in many cases, using UML for class diagrams is not necessary.

Instead, a concise language like F# (or OCaml or Haskell) can convey the same meaning
in a way that is easier to read, easier to write, and most important, easier to turn into working code!

With UML diagrams, you need to translate them to code, with the possibility of losing something in translation.
But if the design is documented in your programming language itself, there is no translation phase, and so the design must always be in sync with the implementation.

To demonstrate this in practice, I decided to scour the internet for some good (and not-so-good) UML class diagrams, and convert them into F# code. You can compare them for yourselves.

Regular expressions

Let’s start with a classic one: regular expressions (source)

Here’s the UML diagram:

[Figure 2: UML class diagram for regular expressions]

And here’s the F# equivalent:

    type RegularExpression =
        | Literal of string
        | Sequence of RegularExpression list
        | Alternation of RegularExpression * RegularExpression
        | Repetition of RegularExpression

    // An interpreter takes a string input and a RegularExpression
    // and returns a value of some kind
    type Interpret<'a> = string -> RegularExpression -> 'a

That’s quite straightforward.
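To show that the Interpret signature can actually be implemented, here is a minimal sketch of one possible interpreter. It assumes that "interpreting" means checking whether the whole input matches, and it cheats by translating the tree into a .NET regex pattern. The names toPattern and isMatch are hypothetical and not part of the original diagram.

```fsharp
open System.Text.RegularExpressions

type RegularExpression =
    | Literal of string
    | Sequence of RegularExpression list
    | Alternation of RegularExpression * RegularExpression
    | Repetition of RegularExpression

type Interpret<'a> = string -> RegularExpression -> 'a

// Hypothetical helper: render the tree as a .NET regex pattern.
// Non-capturing groups keep alternation and repetition correctly scoped.
let rec toPattern regex =
    match regex with
    | Literal text -> Regex.Escape text
    | Sequence parts -> parts |> List.map toPattern |> String.concat ""
    | Alternation (left, right) -> sprintf "(?:%s|%s)" (toPattern left) (toPattern right)
    | Repetition inner -> sprintf "(?:%s)*" (toPattern inner)

// One possible interpreter (an assumption): does the entire input match?
let isMatch : Interpret<bool> =
    fun input regex -> Regex.IsMatch(input, sprintf @"\A(?:%s)\z" (toPattern regex))
```

For example, `isMatch "ababc" (Sequence [Repetition (Literal "ab"); Literal "c"])` checks the input against zero or more copies of "ab" followed by "c".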

Student enrollment

Here’s another classic one: enrollment (source).

Here’s the UML diagram:

[Figure 3: UML class diagram for student enrollment]

And here’s the F# equivalent:

    type Student = {
        Name: string
        Address: string
        PhoneNumber: string
        EmailAddress: string
        AverageMark: float
        }

    type Professor = {
        Name: string
        Address: string
        PhoneNumber: string
        EmailAddress: string
        Salary: int
        }

    type Seminar = {
        Name: string
        Number: string
        Fees: float
        TaughtBy: Professor option
        WaitingList: Student list
        }

    type Enrollment = {
        Student : Student
        Seminar : Seminar
        Marks: float list
        }

    type EnrollmentRepository = Enrollment list

    // ==================================
    // activities / use-cases / scenarios
    // ==================================
    type IsEligibleToEnroll = Student -> Seminar -> bool
    type GetSeminarsTaken = Student -> EnrollmentRepository -> Seminar list
    type AddStudentToWaitingList = Student -> Seminar -> Seminar

The F# mirrors the UML diagram, but I find that by writing functions for all the activities rather than drawing pictures, holes in the original requirements are revealed.

For example, in the GetSeminarsTaken method in the UML diagram, where is the list of seminars stored?
If it is in the Student class (as implied by the diagram) then we have a mutual recursion between Student and Seminar
and the whole tree of every student and seminar is interconnected and must be loaded at the same time unless hacks are used.

Instead, for the functional version, I created an EnrollmentRepository to decouple the two classes.

Similarly, it’s not clear how enrollment actually works, so I created an EnrollStudent function to make it clear what inputs are needed.

    type EnrollStudent = Student -> Seminar -> Enrollment option

Because the function returns an option, it is immediately clear that enrollment might fail (e.g. the student is not eligible to enroll, or is enrolling twice by mistake).
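As a rough sketch of what an implementation might look like, here is one way to fill in that signature. The types are trimmed-down stand-ins for the ones above, the eligibility rule (a minimum average mark) is invented purely for illustration, and passing the repository as the first parameter is my assumption about how the "enrolling twice" check would get its data.

```fsharp
// Trimmed-down stand-ins for the types above, just enough for the sketch
type Student = { Name: string; AverageMark: float }
type Seminar = { Name: string; Number: string }
type Enrollment = { Student: Student; Seminar: Seminar; Marks: float list }
type EnrollmentRepository = Enrollment list

type EnrollStudent = Student -> Seminar -> Enrollment option

// Assumed rule, for illustration only: a minimum average mark
let isEligible (student: Student) (_seminar: Seminar) : bool =
    student.AverageMark >= 50.0

// The repository comes first, so partially applying it
// yields a function with the EnrollStudent shape
let enrollStudent (existing: EnrollmentRepository) : EnrollStudent =
    fun student seminar ->
        let alreadyEnrolled =
            existing |> List.exists (fun e -> e.Student = student && e.Seminar = seminar)
        if isEligible student seminar && not alreadyEnrolled then
            Some { Student = student; Seminar = seminar; Marks = [] }
        else
            None
```

Calling `enrollStudent []` gives back a plain `EnrollStudent` function, so callers never need to know where the existing enrollments come from.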

Order and customer

Here’s another one (source).

[Figure 4: UML class diagram for orders and customers]

And here’s the F# equivalent:

    open System

    type Customer = {name:string; location:string}

    type NormalOrder = {date: DateTime; number: string; customer: Customer}
    type SpecialOrder = {date: DateTime; number: string; customer: Customer}

    type Order =
        | Normal of NormalOrder
        | Special of SpecialOrder

    // these three operations work on all orders
    type Confirm = Order -> Order
    type Close = Order -> Order
    type Dispatch = Order -> Order

    // this operation only works on Special orders
    type Receive = SpecialOrder -> SpecialOrder

I’m just copying the UML diagram, but I have to say that I hate this design. It’s crying out to have more fine-grained states.

In particular, the Confirm and Dispatch functions are horrible — they give no idea of what else is needed as input or what the effects will be.
This is where writing real code can force you to think a bit more deeply about the requirements.
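Here is a sketch of what more fine-grained states could look like. None of this is in the diagram: the state names, the confirmation date, and the tracking number are all assumptions, chosen only to show how each function signature can spell out what it needs and what it produces.

```fsharp
open System

type Customer = { name: string; location: string }
type OrderInfo = { date: DateTime; number: string; customer: Customer }

// Hypothetical states: one type per stage of the order's life
type UnconfirmedOrder = UnconfirmedOrder of OrderInfo
type ConfirmedOrder = ConfirmedOrder of info: OrderInfo * confirmedOn: DateTime
type DispatchedOrder = DispatchedOrder of info: OrderInfo * trackingNumber: string

// The signatures now document the extra inputs and the state change
type Confirm = DateTime -> UnconfirmedOrder -> ConfirmedOrder
type Dispatch = string -> ConfirmedOrder -> DispatchedOrder

let confirm : Confirm =
    fun now (UnconfirmedOrder info) -> ConfirmedOrder (info, now)

let dispatch : Dispatch =
    fun trackingNumber (ConfirmedOrder (info, _)) -> DispatchedOrder (info, trackingNumber)
```

With this shape, trying to dispatch an unconfirmed order is a compile-time error rather than something to check at runtime.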

Order and customer, version 2

Here’s a much better version of orders and customers (source).

[Figure 5: UML class diagram for orders and customers, version 2]

And here’s the F# equivalent:

    type Date = System.DateTime

    // == Customer related ==

    type Customer = {
        name:string
        address:string
        }

    // == Item related ==

    type [<Measure>] grams

    type Item = {
        shippingWeight: int<grams>
        description: string
        }

    type Qty = int
    type Price = decimal

    // == Payment related ==

    type PaymentMethod =
        | Cash
        | Credit of number:string * cardType:string * expDate:Date
        | Check of name:string * bankID: string

    type Payment = {
        amount: decimal
        paymentMethod : PaymentMethod
        }

    // == Order related ==

    type TaxStatus = Taxable | NonTaxable
    type Tax = decimal

    type OrderDetail = {
        item: Item
        qty: int
        taxStatus : TaxStatus
        }

    type OrderStatus = Open | Completed

    type Order = {
        date: Date
        customer: Customer
        status: OrderStatus
        lines: OrderDetail list
        payments: Payment list
        }

    // ==================================
    // activities / use-cases / scenarios
    // ==================================
    type GetPriceForQuantity = Item -> Qty -> Price

    type CalcTax = Order -> Tax
    type CalcTotal = Order -> Price
    type CalcTotalWeight = Order -> int<grams>

I’ve done some minor tweaking: adding units of measure for the weight, and creating types to represent Qty and Price.
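To give a feel for what the unit of measure buys you, here is a small sketch of a total-weight calculation. It uses cut-down versions of Item and OrderDetail, and the calcTotalWeight function takes the order lines directly rather than a whole Order; both simplifications are assumptions made to keep the example short.

```fsharp
[<Measure>] type grams

// Cut-down versions of the types above
type Item = { shippingWeight: int<grams>; description: string }
type OrderDetail = { item: Item; qty: int }

// Multiplying int<grams> by a plain int keeps the unit,
// so the result is checked by the compiler to be a weight
let calcTotalWeight (lines: OrderDetail list) : int<grams> =
    lines |> List.sumBy (fun line -> line.item.shippingWeight * line.qty)

// A plain int is not accepted where int<grams> is expected:
// let wrong : int<grams> = 5     // does not compile
```

The unit exists only at compile time, so it documents and checks the design without adding anything to the runtime representation.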

Again, this design might be improved with more fine-grained states,
such as creating a separate AuthorizedPayment type (to ensure that an order can only be paid with authorized payments)
and a separate PaidOrder type (e.g. to stop you paying for the same order twice).

Here’s the kind of thing I mean:

    // Try to authorize a payment. Note that it might fail
    type Authorize = UnauthorizedPayment -> AuthorizedPayment option

    // Pay an unpaid order with an authorized payment.
    type PayOrder = UnpaidOrder -> AuthorizedPayment -> PaidOrder
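A minimal sketch of how those types might be filled in follows. The wrapper types are deliberately simplified, and the authorization rule (positive amounts up to a fixed limit) and the authorization code are made up for illustration; a real system would presumably call out to a payment provider.

```fsharp
type Payment = { amount: decimal }

// Single-case unions make each state a distinct type
type UnauthorizedPayment = UnauthorizedPayment of Payment
type AuthorizedPayment = AuthorizedPayment of payment: Payment * authCode: string

type UnpaidOrder = UnpaidOrder of amountDue: decimal
type PaidOrder = PaidOrder of amountPaid: decimal * paidWith: AuthorizedPayment

type Authorize = UnauthorizedPayment -> AuthorizedPayment option
type PayOrder = UnpaidOrder -> AuthorizedPayment -> PaidOrder

// Assumed rule, for illustration only: approve positive amounts up to a limit
let authorize : Authorize =
    fun (UnauthorizedPayment payment) ->
        if payment.amount > 0m && payment.amount <= 1000m then
            Some (AuthorizedPayment (payment, "AUTH-0001"))
        else
            None

let payOrder : PayOrder =
    fun (UnpaidOrder amountDue) authorized -> PaidOrder (amountDue, authorized)
```

Because payOrder only accepts an UnpaidOrder, passing it a PaidOrder a second time simply does not type check.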

Hotel Booking

Here’s one from the JetBrains IntelliJ documentation (source).

[Figure 6: UML class diagram for hotel booking]

Here’s the F# equivalent:

    type Date = System.DateTime

    type User = {
        username: string
        password: string
        name: string
        }

    type Hotel = {
        id: int
        name: string
        address: string
        city: string
        state: string
        zip: string
        country: string
        price: decimal
        }

    type CreditCardInfo = {
        card: string
        name: string
        expiryMonth: int
        expiryYear: int
        }

    type Booking = {
        id: int
        user: User
        hotel: Hotel
        checkinDate: Date
        checkoutDate: Date
        creditCardInfo: CreditCardInfo
        smoking: bool
        beds: int
        }

    // What are these?!? And why are they in the domain?
    type EntityManager = unit
    type FacesMessages = unit
    type Events = unit
    type Log = unit

    type BookingAction = {
        em: EntityManager
        user: User
        hotel: Hotel
        booking: Booking
        facesMessages : FacesMessages
        events: Events
        log: Log
        bookingValid: bool
        }

    type ChangePasswordAction = {
        user: User
        em: EntityManager
        verify: string
        booking: Booking
        changed: bool
        facesMessages : FacesMessages
        }

    type RegisterAction = {
        user: User
        em: EntityManager
        facesMessages : FacesMessages
        verify: string
        registered: bool
        }

I have to stop there, sorry. The design is driving me crazy. I can’t even.

What are these EntityManager and FacesMessages fields? And logging is important of course, but why is Log a field in the domain object?

By the way, in case you think that I am deliberately picking bad examples of UML design, all these diagrams come from the top results in an image search for “uml class diagram”.

Library

This one is better, a library domain (source).

[Figure 7: UML class diagram for the library domain]

Here’s the F# equivalent. Note that because it is code, I can add comments to specific types and fields, which is doable but awkward with UML.

Note also that I can say ISBN: string option to indicate an optional ISBN rather than the awkward [0..1] syntax.

    type Date = System.DateTime

    type Author = {
        name: string
        biography: string
        }

    type Book = {
        ISBN: string option
        title: string
        author: Author
        summary: string
        publisher: string
        publicationDate: Date
        numberOfPages: int
        language: string
        }

    type Library = {
        name: string
        address: string
        }

    // Each physical library item - book, tape cassette, CD, DVD, etc. could have its own item number.
    // To support it, the items may be barcoded. The purpose of barcoding is
    // to provide a unique and scannable identifier that links the barcoded physical item
    // to the electronic record in the catalog.
    // Barcode must be physically attached to the item, and barcode number is entered into
    // the corresponding field in the electronic item record.
    // Barcodes on library items could be replaced by RFID tags.
    // The RFID tag can contain item's identifier, title, material type, etc.
    // It is read by an RFID reader, without the need to open a book cover or CD/DVD case
    // to scan it with barcode reader.
    type BookItem = {
        barcode: string option
        RFID: string option
        book: Book
        /// Library has some rules on what could be borrowed and what is for reference only.
        isReferenceOnly: bool
        belongsTo: Library
        }

    type Catalogue = {
        belongsTo: Library
        records : BookItem list
        }

    type Patron = {
        name: string
        address: string
        }

    type AccountState = Active | Frozen | Closed

    type Account = {
        patron: Patron
        library: Library
        number: int
        opened: Date

        /// Rules are also defined on how many books could be borrowed
        /// by patrons and how many could be reserved.
        history: History list

        state: AccountState
        }

    and History = {
        book : BookItem
        account: Account
        borrowedOn: Date
        returnedOn: Date option
        }

Since the Search and Manage interfaces are undefined, we can just use placeholders (unit) for the inputs and outputs.

    type Librarian = {
        name: string
        address: string
        position: string
        }

    /// Either a patron or a librarian can do a search
    type SearchInterfaceOperator =
        | Patron of Patron
        | Librarian of Librarian

    type SearchRequest = unit // to do
    type SearchResult = unit // to do
    type SearchInterface = SearchInterfaceOperator -> Catalogue -> SearchRequest -> SearchResult

    type ManageRequest = unit // to do
    type ManageResult = unit // to do

    /// Only librarians can do management
    type ManageInterface = Librarian -> Catalogue -> ManageRequest -> ManageResult

Again, this might not be the perfect design. For example, it’s not clear that only Active accounts could borrow a book, which I might represent in F# as:

    type Account =
        | Active of ActiveAccount
        | Closed of ClosedAccount

    /// Only ActiveAccounts can borrow
    type Borrow = ActiveAccount -> BookItem -> History
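To see how a caller would work with that split, here is a hedged sketch. The record types are reduced to a field or two, borrow takes today's date as an extra first parameter, and tryBorrow is a hypothetical helper that is not in the diagram; it shows where the "is this account active?" decision ends up.

```fsharp
type Date = System.DateTime

// Reduced stand-ins for the types above
type BookItem = { barcode: string }
type AccountInfo = { number: int }

type ActiveAccount = ActiveAccount of AccountInfo
type ClosedAccount = ClosedAccount of AccountInfo

type Account =
    | Active of ActiveAccount
    | Closed of ClosedAccount

type History = {
    book: BookItem
    account: ActiveAccount
    borrowedOn: Date
    returnedOn: Date option
    }

/// Only ActiveAccounts can borrow
type Borrow = ActiveAccount -> BookItem -> History

// Supplying the date first leaves a function of the Borrow shape
let borrow (today: Date) : Borrow =
    fun account book ->
        { book = book; account = account; borrowedOn = today; returnedOn = None }

// Hypothetical helper: the caller must deal with the closed case explicitly
let tryBorrow (today: Date) (account: Account) (book: BookItem) : History option =
    match account with
    | Active active -> Some (borrow today active book)
    | Closed _ -> None
```

The check for a closed account happens once, at the edge, and everything downstream of borrow can rely on the account being active.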

If you want to see a more modern approach to modelling this domain using CQRS and event sourcing, see this post.

Software licensing

The final example is from a software licensing domain (source).

[Figure 8: UML class diagram for the software licensing domain]

Here’s the F# equivalent.

    open System

    type Date = System.DateTime
    type String50 = string
    type String5 = string

    // ==========================
    // Customer-related
    // ==========================

    type AddressDetails = {
        street : string option
        city : string option
        postalCode : string option
        state : string option
        country : string option
        }

    type CustomerIdDescription = {
        CRM_ID : string
        description : string
        }

    type IndividualCustomer = {
        idAndDescription : CustomerIdDescription
        firstName : string
        lastName : string
        middleName : string option
        email : string
        phone : string option
        locale : string option // default : "English"
        billing : AddressDetails
        shipping : AddressDetails
        }

    type Contact = {
        firstName : string
        lastName : string
        middleName : string option
        email : string
        locale : string option // default : "English"
        }

    type Company = {
        idAndDescription : CustomerIdDescription
        name : string
        phone : string option
        fax : string option
        contact: Contact
        billing : AddressDetails
        shipping : AddressDetails
        }

    type Customer =
        | Individual of IndividualCustomer
        | Company of Company

    // ==========================
    // Product-related
    // ==========================

    /// Flags can be ORed together
    /// (an enum with explicit power-of-two values is needed for that)
    [<Flags>]
    type LockingType =
        | HL = 1
        | SL_AdminMode = 2
        | SL_UserMode = 4

    type Rehost =
        | Enable
        | Disable
        | LeaveAsIs
        | SpecifyAtEntitlementTime

    type BatchCode = {
        id : String5
        }

    type Feature = {
        id : int
        name : String50
        description : string option
        }

    type ProductInfo = {
        id : int
        name : String50
        lockingType : LockingType
        rehost : Rehost
        description : string option
        features: Feature list
        batchCode: BatchCode
        }

    type Product =
        | BaseProduct of ProductInfo
        | ProvisionalProduct of ProductInfo * baseProduct:Product

    // ==========================
    // Entitlement-related
    // ==========================

    type EntitlementType =
        | HardwareKey
        | ProductKey
        | ProtectionKeyUpdate

    type Entitlement = {
        EID : string
        entitlementType : EntitlementType
        startDate : Date
        endDate : Date option
        neverExpires: bool
        comments: string option
        customer: Customer
        products: Product list
        }

This diagram is just pure data and no methods, so there are no function types. I have a feeling that there are some important business rules that have not been captured.

For example, if you read the comments in the source, you’ll see that there are some interesting constraints around EntitlementType and LockingType.
Only certain locking types can be used with certain entitlement types.

That might be something that we could consider modelling in the type system, but I haven’t bothered. I’ve just tried to reproduce the UML as is.

Summary

I think that’s enough to get the idea.

My general feeling about UML class diagrams is that they are OK for a sketch, if a bit heavyweight compared to a few lines of code.

For detailed designs, though, they are not nearly detailed enough. Critical things like context and dependencies are not at all obvious.
In my opinion, none of the UML diagrams I’ve shown have been good enough to write code from, even as a basic design.

Even more seriously, a UML diagram can be very misleading to non-developers. It looks “official” and can give the impression that the design has been thought about deeply,
when in fact the design is actually shallow and unusable in practice.

Disagree? Let me know in the comments!