inaka

7 Heuristics for Development

Brujo Benavides wrote this on May 31, 2016 under development, erlang, oop.

Introduction

So, we recently had a Tech Day at our offices, but one of the talks could not be recorded. In that talk, Hernán presented 7 Heuristics for Object Oriented Design. And since we couldn't have him on record to share all that wonderful stuff with everybody else in the world, I thought I would write a blog post about it. That way, at least some of it will not be lost.

But, while I was thinking about what to write in this blog post, I realised that:

  • I'm not exactly an OOP programmer anymore
  • The 7 heuristics and especially the ideas behind them can be easily applied to other paradigms. In particular, to functional programming.

So, I decided to write this blog post with those 7 heuristics in mind, but instead of applying them to OOP, I'll show you how to apply them in Erlang. Let's see how that goes…

Abstract Data Types

Before we move on, I'll introduce a concept that is nothing new for Haskell developers, but that may seem somewhat strange to several Erlangers: Abstract Data Types.

In computer science, an abstract data type (ADT) is a mathematical model for data types where a data type is defined by its behavior

The idea here is to define the entities in your system by describing how they are used and not how they are implemented underneath. There is a lot to say about ADTs, but for the purpose of this post, I'll mainly focus on a good practice we encourage here at Inaka: Keep your models in one module without exposing their internal structure to the world. Let me put it another way by using one of the most loved/hated structures in the Erlang world: records.

  • Why do we love[d] them?
    • because they perfectly and easily describe and allow us to use complex structures.
  • Why do we hate[d] them?
    • because they cause what I call nightmares.hrl. When a record needs to be used in multiple modules (as is usually the case) you have to put its definition in a shared header file and that's when you lose control of it.

So, records are fine, if you use them in just one module. But you want to use them to represent your system entities, therefore you want to use them in multiple modules. The key misconception here is that what you really want is not to use the same record in multiple modules, but the same entity. That's when you can create an ADT, put it in just one module and export functions that manage it while not exporting the record with which you want to implement your entity.

For example, let's say your system deals with invoices, and you just created a fantastic #invoice record to represent them:

-record( invoice
       , { id       :: binary()
         , date     :: calendar:datetime()
         , customer :: binary()
         , amount   :: number()
         }
       ).

Where would you write that record definition? If you immediately think an hrl file, think again. This #invoice record represents an entity in your system and what I'm proposing here is that, in that case, it deserves a module of its own, and also an opaque type, like this:

-module(invoices).

-record( invoice
       , { id       :: binary()
         , date     :: calendar:datetime()
         , customer :: binary()
         , amount   :: number()
         }
       ).

-opaque invoice() :: #invoice{}.
-export_type([invoice/0]).

But then, how do we create an invoice? That's easy, we add a function to our invoices module for that. We can even assign some default values to it, look:

-export([new/2]).

-spec new(binary(), number()) -> invoice().
new(Customer, Amount) ->
    #invoice{ id       = uuid:new()
            , date     = calendar:universal_time()
            , customer = Customer
            , amount   = Amount
            }.

And then, of course, you'll eventually want to use some particular field for an invoice. Let's say that, given an invoice, you want to obtain the amount. This is the function you need to implement and export:

-spec amount(invoice()) -> number().
amount(#invoice{amount = Amount}) -> Amount.

This way, outside of the invoices module, everybody can create invoices, retrieve data from them and (if you allow them to) update them as well. In other words, every module can work with invoices:invoice(), but no one knows anything about #invoice{}.

This concept may sound new to you, but remember we've been doing the same thing for a long time. Just check the dict or sets modules (and a couple of others) in Erlang/OTP.
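To make that concrete, here is a small sketch of what using those OTP modules looks like from the outside. The module name opaque_demo is made up for this illustration; the dict and sets calls are the standard ones. Notice that the caller never needs to know how a dict or a set is built internally:

```erlang
%% Illustration only: opaque_demo is a hypothetical module name.
-module(opaque_demo).
-export([run/0]).

%% Builds a dict and a set through their public APIs and reads them back.
%% At no point do we pattern-match on (or even know) their internal structure.
-spec run() -> {binary(), boolean()}.
run() ->
    D0 = dict:new(),
    D1 = dict:store(customer, <<"acme">>, D0),
    S  = sets:add_element(42, sets:new()),
    {dict:fetch(customer, D1), sets:is_element(42, S)}.
```

That is exactly the relationship we want between the rest of the system and invoices:invoice().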

What's the benefit of this approach? Well, besides preventing hrl hell, data type abstraction also allows you to change your underlying implementation more easily. Let's say you suddenly think that a map is a better structure to represent your invoices than a record. The only module that you need to change is invoices. The other ones will never know what happened.
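As a rough sketch of that switch, this is how a map-backed invoices module could look while keeping the same new/2 and amount/1 functions. Two assumptions here: the `:=` syntax in the type needs OTP 19 or later, and make_id/0 is a hypothetical stand-in for uuid:new() so that the snippet compiles on its own:

```erlang
%% Sketch: same public API as the record-based version, backed by a map.
-module(invoices).

-export([new/2, amount/1]).

-opaque invoice() ::
    #{ id       := binary()
     , date     := calendar:datetime()
     , customer := binary()
     , amount   := number()
     }.
-export_type([invoice/0]).

-spec new(binary(), number()) -> invoice().
new(Customer, Amount) ->
    #{ id       => make_id()
     , date     => calendar:universal_time()
     , customer => Customer
     , amount   => Amount
     }.

-spec amount(invoice()) -> number().
amount(#{amount := Amount}) -> Amount.

%% Hypothetical replacement for uuid:new(), only here to keep the
%% example self-contained.
-spec make_id() -> binary().
make_id() ->
    integer_to_binary(erlang:unique_integer([positive])).
```

Callers that only ever used invoices:new/2 and invoices:amount/1 keep working without any change.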

The Heuristics

With ADTs in mind, let's start analysing the 7 heuristics proposed by Hernán. Before we start, let's remember two things Hernán said:

  • These are heuristics, not rules. Because, as it's been proven multiple times in software history, there is no silver bullet, nothing can really be a rule here, nothing applies to all scenarios. Every piece of advice must be contextualized. So, these heuristics are just that: pieces of advice. You have to check if they apply to what you're working on, or not, on your own.
  • All these heuristics are based on a much more general piece of advice, namely that:

better code is the one that better models the problem at hand and not the one that just performs better

In other words, Hernán (and I wholeheartedly agree) sees software development as a modeling process where what we create are computable representations of what we see in the real world, as opposed to writing instructions to tell the machine what to do.

Hernán's way of representing the entities in the world is by defining objects. I'll show how to do it by defining ADTs. Now, with that in mind as well, let's finally delve into the 7 heuristics:

1. Reality-Model Equivalence

This is, to me, the most important of the 7 heuristics and it's all about how you model your system and not about the code you actually end up writing. What Hernán recommends here is for you to have exactly one model in your system for each entity in your problem's domain. The easiest way to show you what that means is by presenting real-life examples of what should not happen if you want to build better systems. The following 4 scenarios show 4 problems you should try to avoid as much as you can:

Entities in your problem domain that can't be represented as Models in your system


For instance: in our example above, our invoices have no other data than a customer and a total amount. But invoices in real life generally have lists of purchased items. If those items are important in your problem's domain, they have to be represented in your system. And they should be represented where they belong (i.e. inside your invoice ADT and not in a separate place that invoices know nothing about). Even if they are stored separately (in another table or bucket or whatever), that should not condition the way you represent them. You should always aim at representing your entities accurately with your ADTs. Persistence is a problem you should be able to deal with later.

Models in your system that represent Entities that don't exist in reality


It's funny how we Erlangers tend to disregard Java as an ugly/bad language, but for this example Hernán exhibits a Java class for which we do have the equivalent module in OTP: Calendar. The question here is: what does Calendar represent? (i.e. What entities are represented by Calendar?). The short answer is: none. Calendar in Java doesn't represent a calendar, nor does it represent a date. calendar in Erlang has the same problem. On the other hand, calendar is (I think) not designed to be an ADT. But, should we have an ADT for dates? I think it would be a very good idea. Why? Because one of the main problems with Java's Calendar is that it lets you create an object representing the following date: 2016-02-31. Then, and only if you ask really nicely, it tells you that date is invalid.

In our Erlang world, we have no proper definition of what a date is, but most of us will read this as a date {2016,2,12} and this as a datetime {{2016,2,12},{10,0,12}}. As a matter of fact, we do have type definitions for those things in the calendar module:

%%----------------------------------------------------------------------
%% Types
%%----------------------------------------------------------------------

-export_type([date/0, time/0, datetime/0, datetime1970/0]).

-type year()     :: non_neg_integer().
-type year1970() :: 1970..10000.
-type month()    :: 1..12.
-type day()      :: 1..31.
-type hour()     :: 0..23.
-type minute()   :: 0..59.
-type second()   :: 0..59.
-type daynum()   :: 1..7.
-type ldom()     :: 28 | 29 | 30 | 31.
-type weeknum()  :: 1..53.

-type date() :: {year(),month(),day()}.
-type time() :: {hour(),minute(),second()}.
-type datetime() :: {date(),time()}.

But, again, according to those definitions this is a perfectly valid datetime: {{2016,2,31},{0,0,0}}. And if this doesn't seem that bad to you, check heuristics 3 and 4 below.
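Just to give an idea of what a date ADT could look like, here is a minimal sketch. The module name dates, its functions and the {date, ...} wrapper are all invented for this example; the only real OTP call is calendar:valid_date/3, which does the actual checking:

```erlang
%% Hypothetical date ADT: a date can only exist if it is a valid one.
-module(dates).

-export([new/3, to_tuple/1]).

-opaque date() :: {date, calendar:date()}.
-export_type([date/0]).

%% Fails with an invalid_date error instead of returning a bogus value.
-spec new(integer(), integer(), integer()) -> date().
new(Year, Month, Day) ->
    case calendar:valid_date(Year, Month, Day) of
        true  -> {date, {Year, Month, Day}};
        false -> erlang:error({invalid_date, {Year, Month, Day}})
    end.

%% Hands the plain tuple back for interoperating with calendar functions.
-spec to_tuple(date()) -> calendar:date().
to_tuple({date, Tuple}) -> Tuple.
```

With something like this, dates:new(2016, 2, 31) crashes right where the mistake is made, instead of letting {2016,2,31} travel through the system.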

Entities in your problem domain that are represented by multiple Models in your system


Back again to our previous example with the invoice items. If your system lets you represent those items in the air (i.e. not actually tied to the invoice to which they belong), what happens if you first create them and then you forget/fail to create the invoice itself? What do those items represent then? What real life entity will they be modeling?

Models in your system that represent multiple Entities in the real world


Hernán in his talk used the number 0 as an example for this. In the sense that 0 can be 0 meters, 0 items, 0 invoices, false, etc. And, since 0 by itself doesn't carry any other information about what it is being used for, when you're trying to debug the system you need to check the context in which that value is used/created/etc. to see what that 0 actually means. That, most of the time, is actually pretty complex, and many times it is just impossible.

In Erlang, I've seen that happening countless times. Even dialyzer is often confused about what your [] actually is. If you have a function that receives lists of invoices ([invoices:invoice()]) and you're incorrectly calling that function with lists of users ([users:user()]), dialyzer won't complain since an empty list of users is also a totally valid list of invoices.

Many of those empty-list scenarios are simply unavoidable, but some other similar problems are not. In real life, an invoice is actually an invoice once it's fully written; before that it's just a piece of paper, maybe an invoice template. Your system should properly model that reality: You should not use a single ADT for both the invoices and the invoice_templates. If you keep those two things in independent modules, you'll not need to check the status of each invoice everywhere to see if you can use its amount or if it's still not ready.

2. Immutability

This one is nothing new to my fellow Erlangers, but it's always good to see that people from OOP World also praise Immutability like we do. As a matter of fact, in support of immutability, Hernán exhibited the same benefits and reasonings we're constantly expressing to the world. By keeping objects immutable, we never need to know in which context this object produced that error. Hernán expressed this in a very nice way:

By using immutable objects you don't need to consider the passage of time when you're building or debugging your system.

The remaining guidelines in this list will help you achieve this goal as well.

3. Complete Models

Remember: our models represent entities in real-life. And many things in our world cannot be created with missing parts. To continue with our invoice example, real-life invoices are required to have both a customer and a total. Therefore, there can be no invoices with those fields missing.

Imagine that we change our ADT to allow you to build up an invoice field by field, like this:


AnInvoice = invoices:new(),
AnInvoiceWithCustomer =
  invoices:customer(AnInvoice, Customer),
AFullInvoice =
  invoices:amount(
    AnInvoiceWithCustomer, Amount),

That might seem silly in this example; but imagine if those 3 steps are executed in different processes or, at least, different functions. Believe me, this example is not as far-fetched as it looks.

The problem here is that, between the call to invoices:new/0 and the call to invoices:amount/2, what do we really have? What real-life entity are AnInvoice and AnInvoiceWithCustomer representing? The answer is certainly none. And that's bad.

The proper implementation is the one we have above: the invoices' constructor takes all the arguments it needs and generates a 100% complete invoice instance that can already be used as an invoice anywhere.

4. Valid Models

Even when you have complete entities, another way to fail in representing the real world entities in your system is to allow the user to create invalid instances of your models.

Let's say that invoices in real life can't have amounts lower than or equal to 0. Our current implementation of the invoices model will allow invalid invoices to be created. In that case, two things might happen: either the rest of the system is aware of that behaviour and the devs have to add unneeded checks everywhere to verify that what we have is actually a proper invoice by checking that amount > 0, or the rest of the system assumes that invoices are all valid and it fails at runtime if that's not true.

The much better way to implement our model is to do it this way:

-spec new(binary(), number()) -> invoice().
new(Customer, Amount) when Amount > 0 ->
    #invoice{ id       = uuid:new()
            , date     = calendar:universal_time()
            , customer = Customer
            , amount   = Amount
            }.

Here, our users will simply not be able to create an invalid invoice and we'll never need to deal with those things anywhere. Erlang allows you to filter those invalid parameters right there in the function head, which is really nice. And sometimes (like with integers) you can even limit the type specification using types like pos_integer() or non_neg_integer(), thereby letting dialyzer help you identify misuses of your functions.
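A small sketch of that combination of guard and narrower type follows. It assumes, only for this example, that amounts are whole numbers (say, cents) so that pos_integer() applies; the module name valid_invoices is also made up, and the record is trimmed down to two fields to keep things short:

```erlang
%% Hypothetical, trimmed-down invoice module to show guard + pos_integer().
-module(valid_invoices).

-export([new/2, amount/1]).

-record(invoice, { customer :: binary()
                 , amount   :: pos_integer()
                 }).

-opaque invoice() :: #invoice{}.
-export_type([invoice/0]).

%% The spec lets dialyzer flag obviously wrong callers; the guard
%% rejects whatever still slips through at runtime.
-spec new(binary(), pos_integer()) -> invoice().
new(Customer, Amount)
  when is_binary(Customer), is_integer(Amount), Amount > 0 ->
    #invoice{customer = Customer, amount = Amount}.

-spec amount(invoice()) -> pos_integer().
amount(#invoice{amount = Amount}) -> Amount.
```

Calling valid_invoices:new(<<"acme">>, 0) fails with a function_clause error, so an invalid invoice never comes into existence.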

5. No Nulls

This one, I think, is the hardest to actually translate from OOP to Erlang, mostly because the way Hernán proposes to work around the need for nulls is sometimes close to impossible to use in the Erlang world (at least). The general idea is that you should try to avoid using null values (or, in our case, undefined) like the plague. Why? Well, for starters, they don't represent anything at all. In any case, if you need to represent the absence of something you should be able to do so in a particular-to-that-model way.

To provide an example, let's say our invoices may have a customer address or not (in other words, it's an optional field). We have no other way to store that information than allowing undefined as a possible value in our record field:


         , customer :: binary()
         , address  :: undefined | binary()
         , amount   :: number()

One might be tempted to expose that field directly as a function, too…

-spec address(invoice()) -> undefined|binary().
address(#invoice{address = Address}) -> Address.

The problem with that approach is that every user that wants to just use the invoice address will have to deal with the case in which it's actually undefined. What Hernán recommended for dealing with such optional fields in your objects was to use methods like the following one (in a truly Smalltalk-y way):

amount =
  anInvoice getAmountIfNone: [ block of code ]

That's something that is actually pretty hard to accomplish in Erlang and sometimes it's plainly impossible. In Smalltalk you have what Hernán calls full closures, so that in that block of code you can actually include a return statement to return from the function where the block is created. That's something you can't do in Erlang, as far as I know. So, I wasn't able to find a parallel structure here; I'm open to suggestions from our readers on that.

One thing to notice is that, even though 'undefined' is unavoidable when you're using records, you can easily avoid it with maps (i.e., just don't put the field in the map). And in OTP 19 you'll even be able to specify that fact in your type definitions, so that dialyzer can check for required vs. optional map keys.
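Here is a rough sketch of how that could look, assuming OTP 19 or later for the `:=` (required) and `=>` (optional) key syntax in types. The module name map_invoices and the address/2 function with a caller-supplied default are my own invention for this example, and this is not a replacement for the Smalltalk block trick; maps:get/3 is the standard function:

```erlang
%% Hypothetical map-backed invoice with an optional address key.
-module(map_invoices).

-export([new/2, new/3, address/2]).

-opaque invoice() ::
    #{ customer := binary()   % required key
     , amount   := number()   % required key
     , address  => binary()   % optional key: absent instead of undefined
     }.
-export_type([invoice/0]).

-spec new(binary(), number()) -> invoice().
new(Customer, Amount) ->
    #{customer => Customer, amount => Amount}.

-spec new(binary(), number(), binary()) -> invoice().
new(Customer, Amount, Address) ->
    #{customer => Customer, amount => Amount, address => Address}.

%% The caller decides what "no address" means in its own context.
-spec address(invoice(), Default) -> binary() | Default.
address(Invoice, Default) ->
    maps:get(address, Invoice, Default).
```

The invoice itself never stores undefined; the absence of an address is simply the absence of the key.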

6. No Setters

Continuing with our support for immutability and all that we've learned from items 1 to 4, it's easy to see why setters should be almost nonexistent in our models, right? If instances of our models are immutable, complete and valid, why would you need to change any of their properties?

But, there are some cases where the entities you are modelling actually change over time. Let's say you need your system to let you add a mark to the invoice once it's stored in your archives. To keep that info, you might add this field to your record definition:


    , amount :: number()
    , archived :: undefined|calendar:datetime()
    }

I usually use dates instead of booleans for that kind of mark, so you can have a little bit more information there which might prove valuable later.

Now it's time to allow your users to mark the invoices. You can add a setter for that:

-spec archived(invoice(), calendar:datetime())
      -> invoice().
archived(Invoice, Date) ->
  Invoice#invoice{archived = Date}.

But that has a couple of problems:

  • It exposes your internal representation of that mark. After that, you won't be able to change it.
  • It allows users to generate invalid invoices, thus violating heuristic #4. Users can provide invalid dates, maybe dates that are in the future or dates that are older than the invoice's date. You can add validations for that, but is it really the way?

A better way to implement the same functionality is to give meaning to your function. Instead of creating a setter, just create a function that reflects the behaviour of your ADT. A function that implements the action of archiving the invoice:

-spec archive(invoice()) -> invoice().
archive(Invoice) ->
  Invoice#invoice{
    archived = calendar:universal_time()}.

7. Update Objects by using other Objects

And this final one is a problem that mostly applies only to OOP languages. The issue arises when you want to replace an immutable object with another one. Let's say you have multiple references to theLastInvoice, which is an invoice, and you want to replace it with another one. Let's say that, for some reason, you need to keep that exact same object but with a whole new set of values. What Hernán recommended, instead of having multiple setters and calling them one by one to update all the properties (therefore having to add proper validation in each of the setters), is to have a single method, called syncWith:anInvoice, that will take an already complete and valid invoice and sync the current one with it.

I've never faced such a problem in Erlang or Haskell so I don't really know if it even makes sense to try to map this heuristic to the functional paradigm. In any case, if there is a reader out there with experience on this front that wants to give us some insight, please do it in the comments below.

Final Words

As I stated above, these heuristics are just that: heuristics, pieces of advice. They are valuable not as rules, but in the sense that they may open your mind and get you thinking about how you architect and develop your systems. They may also lead you (as they led me a couple of years ago) to review your approach to programming in general.

Seeing programming as something much closer to an art or science than engineering has helped me in many ways and it's something that made me a better developer, I'm sure of that. Hernán played a huge role in that and I'll always be grateful to him. I hope this article helps open the doors for you to walk that same path. And, if you're already on that path, I hope you've found something to move you a step forward :)

Appendix A

This is the final version of invoices.erl:

-module(invoices).

-record(invoice,
    { id       :: binary()
    , date     :: calendar:datetime()
    , customer :: binary()
    , address  :: undefined|binary()
    , amount   :: number()
    , archived :: undefined|calendar:datetime()
    }).

-opaque invoice() :: #invoice{}.
-export_type([invoice/0]).

-export(
  [new/2, amount/1, address/1, archive/1]).

-spec new(binary(), number()) -> invoice().
new(Customer, Amount) when Amount > 0 ->
  #invoice{ id      = uuid:new()
          , date    = calendar:universal_time()
          , customer= Customer
          , amount  = Amount
          }.

-spec amount(invoice()) -> number().
amount(#invoice{amount = Amount}) -> Amount.

-spec address(invoice()) -> undefined|binary().
address(#invoice{address=Address}) -> Address.

-spec archive(invoice()) -> invoice().
archive(Invoice) ->
  Invoice#invoice{
    archived = calendar:universal_time()}.