Journey to an Enterprise Service Bus Integration – Part 5

This article is the fifth in a series dedicated to the ESB integration architecture. The initial article is located here.

Canonical Models

In a preceding article, I talked about 3 limitations of a traditional integration:

  1. Lack of scalability and fault tolerance.
  2. There are no economies of scale.
  3. Sender and receiver are tied by a contract.

In my previous article, I presented Publish-Subscribe asynchronous messaging as a way to alleviate limitation #2 by leveraging topics, through which multiple receivers can subscribe to the same message, effectively creating economies of scale when the same information must be disseminated across different systems.

In this article, I am going to discuss how to untie senders and receivers at the data level (schema and content), through the use of Canonical Models.

Defining canonical models allows enterprises to create business objects that are independent from any specific application or system. The model contains all the data points required to holistically describe a business entity (e.g. an order, an invoice, an order status) across all connected systems and partners.

The use of canonical models solves limitation #3 by decoupling senders and receivers at the data level. A translator layer, sometimes referred to as an Enterprise Layer, is created to handle the data exchange between systems and partners. In this layer, fields are standardized to match enterprise definitions rather than application-specific definitions.
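
To make the translator layer concrete, here is a minimal sketch in Java. The series does not prescribe a language or product, so everything below is illustrative: the class names, fields and mapping rules are hypothetical. The point is only that each application maps to and from the enterprise definition, never directly to another application's format.

```java
// Hypothetical canonical model: the enterprise-wide definition of an account.
public class CanonicalAccount {
    public String enterpriseAccountId;   // enterprise-wide identifier
    public String legalName;
    public String countryCode;           // e.g. ISO 3166-1 alpha-2
}

// Hypothetical application-specific representation used by an e-Commerce site.
class EcommerceCustomer {
    String customerNumber;
    String displayName;
    String country;
}

// Translator living in the enterprise (canonical) layer: it maps the application's
// fields to the enterprise definitions, so the ERP or Customer MDM never needs to
// know how the e-Commerce site represents an account.
class EcommerceAccountTranslator {
    CanonicalAccount toCanonical(EcommerceCustomer customer) {
        CanonicalAccount account = new CanonicalAccount();
        account.enterpriseAccountId = "ECOM-" + customer.customerNumber; // hypothetical numbering rule
        account.legalName = customer.displayName;
        account.countryCode = customer.country;
        return account;
    }
}
```

A symmetrical mapping would exist on the receiving side (canonical to ERP, for example), so connecting a new system only requires one new pair of mappings rather than one mapping per existing counterpart.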

Advantages

  1. The data and data schema are independent from any specific application, facilitating a common understanding throughout the enterprise. For example, an e-Commerce site will represent an account differently (a different identifier, potentially a different level of granularity) than an ERP or a Customer MDM. Materializing the enterprise definition of an account into a canonical model helps alleviate ambiguity, miscommunication and errors of interpretation.
  2. The transformation layer is centralized, allowing for better governance of changes. This also makes it possible to create a central business rules engine that is application independent.
  3. Total Cost of Ownership is reduced. This one could be controversial because another layer is added on top of existing applications, which adds cost. However, in my experience, enterprise systems tend to be COTS products (Commercial Off-The-Shelf, like ERPs for example) for which the cost of customizing can become quite high, especially when these customizations have to be carried through product upgrades. Moving customizations from a commercial product into a custom platform represents a huge cost avoidance in the long run.

Limitations

As an abstracted layer sitting on top of existing interface definitions, the use of canonical models brings additional complexity in governance, architecture, implementation and maintenance that may not be justifiable in all circumstances. This complexity often translates into additional discussions, time spent and degraded performance.

  1. The use of canonical models requires a greater level of coordination between development teams and a greater involvement from all parties, especially from the enterprise architecture team. Not all IT teams are sized to support that.
  2. Because it covers a larger set of requirements than a data schema for a single interface, a canonical model tends to contain lots of optional attributes. Some of these may even refer to legacy systems that have been decommissioned and are no longer used.
  3. Because of the additional step in mappings (source -> canonical -> destination), expect a slight degradation in processing performance.

When to use

Canonical models are typically used for data definitions that require an additional level of governance, like customer data for example.

Domain Driven Design (DDD) recommends introducing the concept of bounded context to alleviate canonical model limitations. In a nutshell, and for our purpose, a canonical model is created only for the reduced set of attributes that are common among different contexts. I recommend reading Eric Evans' book Domain-Driven Design: Tackling Complexity in the Heart of Software for more information.
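
As an illustration of the bounded-context idea (again a hypothetical Java sketch of my own, not something prescribed by the book or this series): only the attributes genuinely shared across contexts live in the canonical model, while each context keeps its own richer, local model around it.

```java
// Hypothetical canonical subset: only the attributes every context agrees on.
public class CanonicalCustomerReference {
    public String enterpriseCustomerId;
    public String legalName;
}

// Sales bounded context: its own vocabulary and attributes, anchored to the shared subset.
class SalesProspect {
    CanonicalCustomerReference customer;
    String salesRegion;
    String accountExecutive;
}

// Billing bounded context: same shared reference, different local concerns.
class BillingAccount {
    CanonicalCustomerReference customer;
    String paymentTerms;
    String taxRegistrationNumber;
}
```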

Conclusion

We have reached the end of this series of articles dedicated to the ESB integration architecture.

As a conclusion, I would like to highlight that, as in many other disciplines, there is no cookie-cutter approach to integration. Every approach has its advantages and limitations. It is the architect's role to weigh one approach against another, and to apply them pragmatically, in order to deliver an integration that is optimized for the needs of the organization.

I hope you enjoyed these articles. If you have any questions, feel free to leave a comment below or contact us here.

Journey to an Enterprise Service Bus Integration – Part 4

This article is the fourth in a series dedicated to the ESB integration architecture. The initial article is located here.

Publish-Subscribe asynchronous messaging

In a preceding article, I talked about 3 limitations of a traditional integration:

  1. Lack of scalability and fault tolerance.
  2. There are no economies of scale.
  3. Sender and receiver are tied by a contract.

In my previous article, I presented Point-to-Point asynchronous messaging as a way to alleviate limitation #1 by adding a persistent, scalable and fault-tolerant layer to decouple the flow between a sender and a receiver.

In this article, I am going to discuss another form of asynchronous messaging called Publish-Subscribe, often referred to as Pub/Sub.

Pub/Sub is an asynchronous communication method in which messages are exchanged between applications without either party knowing the identity of the other.

Pub/Sub messaging solves limitation #2. This is One-to-Many delivery: several receivers (referred to as subscribers) can receive a message from a sender (referred to as a publisher) without having to duplicate the data flow. Rather than using queues like Point-to-Point messaging, Pub/Sub makes use of topics, which allow several receivers to process the same message. In this case, the message is removed from the topic after all subscribers have processed it.
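
As a minimal illustration, here is what publishing and subscribing to a topic can look like with the JMS 1.1 API. This is an assumption of mine for the sketch, not a technology mandated by the article: the connection factory would typically come from the broker or JNDI, and the topic name and payload are hypothetical.

```java
import javax.jms.*;

public class OrderStatusPubSub {

    // Publisher side: sends one message to a topic. Every active subscriber
    // receives its own copy; the publisher never knows who they are.
    public static void publish(ConnectionFactory factory) throws JMSException {
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("enterprise.order.status");
        MessageProducer publisher = session.createProducer(topic);
        publisher.send(session.createTextMessage("{\"orderId\":\"42\",\"status\":\"SHIPPED\"}"));
        connection.close();
    }

    // Subscriber side: each interested system registers its own subscription on
    // the same topic, without any change to the publisher or the inbound flow.
    public static void subscribe(ConnectionFactory factory, String systemName) throws JMSException {
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("enterprise.order.status");
        MessageConsumer subscriber = session.createConsumer(topic);
        subscriber.setMessageListener(message -> System.out.println(systemName + " received " + message));
        connection.start();   // the connection stays open so the listener keeps receiving
    }
}
```

A warehouse system and a CRM, for example, would each call subscribe with their own connection; the same published message reaches both without duplicating the inbound flow.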

Advantages

  1. There is no more compounded complexity when multiple systems or applications need to receive the same data, as they can simply subscribe to the topic.
  2. As the inbound flows (publisher to bus) become reusable, there are economies of scale.
  3. As with Point-to-Point messaging, there is loose coupling between publishers and subscribers, allowing them to operate independently of each other.
  4. Scalability is achieved through multi-threading, message caching, etc., while reliability and fault tolerance are achieved through clustering and load balancing.

Limitations

One of the biggest limitations of this model lies in its main advantage. As publishers and subscribers are not aware of each other, the architecture itself does not guarantee delivery: it guarantees the availability of a message for a subscriber to consume, but not the fact that the subscriber will consume it. Therefore, designs outside of the architecture must be put in place if guaranteed delivery is a requirement.
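
One common design for this, shown here as a hedged JMS sketch (client ID, subscription and topic names are hypothetical), is to combine a durable subscription with explicit client acknowledgement: messages published while this subscriber is offline are retained for it, and a message is only discarded for this subscriber once it has been acknowledged.

```java
import javax.jms.*;

public class DurableWarehouseSubscriber {

    public static void consumeOne(ConnectionFactory factory) throws JMSException {
        Connection connection = factory.createConnection();
        connection.setClientID("warehouse-app");           // identifies this durable subscriber to the broker
        Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        Topic topic = session.createTopic("enterprise.order.status");
        TopicSubscriber subscriber = session.createDurableSubscriber(topic, "warehouse-order-status");
        connection.start();

        Message message = subscriber.receive(5000);         // wait up to 5 seconds
        if (message != null) {
            // Process the message, then acknowledge so the broker can discard this subscriber's copy.
            message.acknowledge();
        }
        connection.close();
    }
}
```

This mitigates the delivery problem but does not remove it entirely: the broker still has to retain messages for slow or offline durable subscribers, which feeds directly into the resource exhaustion scenario described next.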

Similar to Point-to-Point messaging, the scalability of the message bus is relative. Slow subscribers or large messages may overwhelm the message broker to the point of exhausting its available resources and bringing it down. I experienced these limitations with a slow subscriber combined with a verbose publisher, leading to an accumulation of messages beyond what the broker could handle. Mitigation actions can include purging all the messages (at the cost of information loss), or routing messages to a disaster recovery area specifically designed to handle large volumes of messages. These messages might have to be “re-played” once the situation returns to normal, depending on the business case.

When to use

This model is typically used in scenarios where the publisher does not need to know whether a message has been successfully delivered to a subscriber, because by design the publisher is not aware of the subscribers.

Next

In the next article, I will discuss alleviating limitation #3 through the use of Canonical Models.

If you have any questions, feel free to leave a comment below or contact us here.

Journey to an Enterprise Service Bus Integration – Part 3

This article is the third in a series dedicated to the ESB integration architecture. The initial article is located here.

Point-to-Point asynchronous messaging

In my previous article, I talked about 3 limitations of a traditional integration:

  1. Lack of scalability and fault tolerance.
  2. There are no economies of scale.
  3. Sender and receiver are tied by a contract.

In this article, I will describe a form of asynchronous messaging called Point-to-Point.

Point-to-Point asynchronous messaging solves limitation #1. It does so by introducing a messaging layer, often referred to as a message bus or event bus, and by using queues. It is still one-to-one delivery though, similar to traditional integration.
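
As an illustration, here is a minimal Point-to-Point exchange over a queue using the JMS 1.1 API (an assumption of mine for the sketch, not a technology mandated by the article; broker setup, queue name and payload are hypothetical). The sender returns as soon as the broker has accepted the message; the receiver consumes it later, at its own pace.

```java
import javax.jms.*;

public class InvoiceQueueExample {

    // Sender side: hands the message to the message bus and moves on.
    public static void send(ConnectionFactory factory) throws JMSException {
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("enterprise.invoice.inbound");
        MessageProducer sender = session.createProducer(queue);
        sender.setDeliveryMode(DeliveryMode.PERSISTENT);    // keep the message across a broker restart
        sender.send(session.createTextMessage("{\"invoiceId\":\"INV-1001\"}"));
        connection.close();
    }

    // Receiver side: pulls the message whenever it is ready, even long after it was sent.
    public static void receive(ConnectionFactory factory) throws JMSException {
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("enterprise.invoice.inbound");
        MessageConsumer receiver = session.createConsumer(queue);
        connection.start();
        Message message = receiver.receive(5000);           // wait up to 5 seconds
        System.out.println("Received: " + message);
        connection.close();
    }
}
```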

Advantages

A couple of advantages of this model:

  1. Senders and receivers are now physically loosely coupled. For example, if the receiver is down, or slows down (it is still working but cannot accept messages as quickly as they are sent, for example because it is not sized appropriately), the messaging layer can still receive messages and hold them until the receiver is ready to consume them, effectively playing a regulating role.
  2. Scalability and fault tolerance are built into the model: a persistence layer (file or database), vertical scalability since the bus runs on a separate set of servers that can be scaled separately from the sender or receiver, and horizontal scalability through the ability to create multiple reception threads.

Limitations

However, this form of messaging still does not allow for economies of scale, as there can be only one receiver for a given message. It also ties all parties to a data schema, and any change to this schema often requires all parties to be involved.

When to use

This model is typically used to facilitate communication between 2 components (whether they are tightly coupled or written in different languages/technology platforms), and/or when you need to ensure a message is processed by one receiver only. If multiple receivers point to the same queue, only one of them will dequeue each message, and the message is removed once acknowledged.
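
To illustrate the one-receiver-only behavior, here is a hedged sketch of competing consumers on the same queue (same hypothetical JMS setup as the previous sketch): the broker dispatches each message to exactly one of the registered consumers and removes it once acknowledged.

```java
import javax.jms.*;

public class CompetingConsumers {

    public static void start(ConnectionFactory factory, String workerName) throws JMSException {
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("enterprise.invoice.inbound");
        MessageConsumer consumer = session.createConsumer(queue);
        // Each message on the queue is delivered to only one of the registered consumers,
        // and removed once that consumer acknowledges it (automatic with AUTO_ACKNOWLEDGE).
        consumer.setMessageListener(message -> System.out.println(workerName + " processed " + message));
        connection.start();
    }
}
```

Starting this twice, e.g. start(factory, "worker-1") and start(factory, "worker-2"), splits the messages between the two workers; neither ever sees a message the other has already dequeued.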

Next

In the next article, I will discuss alleviating limitation #2 above through Publish-Subscribe asynchronous messaging.

If you have any questions, feel free to leave a comment below or contact us here.