Journey to an Enterprise Service Bus Integration – Part 5

This article is the fifth in a series dedicated to the ESB integration architecture. The initial article is located here.

Canonical Models

In a preceding article, I described three limitations of traditional integration:

  1. Lack of scalability and fault tolerance.
  2. There are no economies of scale.
  3. Sender and receiver are tied by a contract.

In my previous article, I presented Publish-Subscribe asynchronous messaging as a way to alleviate limitation #2. It relies on topics, to which multiple receivers can subscribe for the same message, effectively allowing economies of scale when the same information has to be disseminated across different systems.

In this article, I am going to discuss how to untie senders and receivers at the data level (schema and content), through the use of Canonical Models.

Defining canonical models allows enterprises to create business objects that are independent of any specific application or system. The model contains all the data points required to holistically describe a business entity (e.g. an order, an invoice or an order status) so that it can be exchanged with all systems and partners.

The use of canonical models solves limitation #3 by decoupling senders and receivers at the data level. A translator layer, sometimes referred to as the Enterprise Layer, is created to handle the data exchange between systems and partners. In this layer, fields are standardized to match enterprise definitions rather than application-specific definitions.
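To make the translator layer more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `CanonicalAccount` fields, the e-commerce field names (`customerNo`, `displayName`, `country`) and the ERP field names (`ACCT_NUM`, `ACCT_NAME`, `CTRY`) are hypothetical and do not come from any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalAccount:
    """Enterprise-wide definition of an account (hypothetical fields)."""
    account_id: str
    legal_name: str
    country_code: str  # assumed here to be a two-letter upper-case code


def ecommerce_to_canonical(record: dict) -> CanonicalAccount:
    """Translate a (hypothetical) e-commerce record into the canonical model."""
    return CanonicalAccount(
        account_id=str(record["customerNo"]),
        legal_name=record["displayName"].strip(),
        country_code=record["country"].upper(),
    )


def canonical_to_erp(account: CanonicalAccount) -> dict:
    """Translate the canonical model into a (hypothetical) ERP layout."""
    return {
        "ACCT_NUM": account.account_id,
        "ACCT_NAME": account.legal_name.upper(),
        "CTRY": account.country_code,
    }


# source -> canonical -> destination
source = {"customerNo": 1042, "displayName": " Acme Corp ", "country": "us"}
canonical = ecommerce_to_canonical(source)
erp_record = canonical_to_erp(canonical)
```

In this sketch, neither side knows the other's field names: each application only needs a translation to or from the canonical model, so a schema change in the e-commerce site would only touch `ecommerce_to_canonical`.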

Advantages

  1. The data and data schema are independent of any specific application, facilitating a common understanding throughout the enterprise. For example, an e-Commerce site will have a different way to represent an account (different number, potentially different level of granularity) than an ERP or a Customer MDM. Materializing the enterprise definition of an account into a canonical model helps alleviate ambiguity, miscommunication and errors of interpretation.
  2. The transformation layer is centralized, allowing for better governance of changes. It also makes it possible to create a central, application-independent business rules engine.
  3. Total Cost of Ownership is reduced. This one could be controversial because another layer is added on top of existing applications, which adds cost. However, in my experience, enterprise systems tend to be COTS products (Commercial Off-The-Shelf, like ERPs for example) for which the cost of customizing can become quite high, especially when these customizations have to be carried through product upgrades. Moving customizations from a commercial product into a custom platform avoids significant cost in the long run.

Limitations

As an abstracted layer sitting on top of existing interface definitions, the use of canonical models brings additional complexity in governance, architecture, implementation and maintenance that may not be justifiable in all circumstances. This complexity often translates into additional discussions, more time spent and degraded performance.

  1. The use of canonical models requires a greater level of coordination between development teams and a greater involvement from all parties, especially from the enterprise architecture team. Not all IT teams are sized to support that.
  2. Because it covers a larger set of requirements than a data schema for a single interface, a canonical model tends to contain lots of optional attributes. Some of these could even refer to legacy systems that have been decommissioned, leaving the attributes unused.
  3. Because of the additional step in mappings (source -> canonical -> destination), expect a slight degradation of performance in processing.

When to use

Canonical models are typically used for data definitions that require an additional level of governance, such as customer data.

Domain Driven Design (DDD) recommends introducing the concept of bounded context to alleviate canonical model limitations. In a nutshell, and for our purpose, a canonical model is created for a reduced set of attributes that are common among different contexts. I recommend reading Eric Evans’s book Domain-Driven Design: Tackling Complexity in the Heart of Software for more information.

Conclusion

We have reached the end of my series of articles dedicated to the ESB integration architecture.

In conclusion, I would like to highlight that, as in many other disciplines, there is no cookie-cutter approach to integration. Every approach has its advantages and limitations. It is the architect’s role to weigh one approach against another, and to be pragmatic in their use, in order to deliver an integration that is optimized for the needs of an organization.

I hope you enjoyed these articles. If you have any questions, feel free to leave a comment below or contact us here.

Journey to an Enterprise Service Bus Integration – Part 4

This article is the fourth in a series dedicated to the ESB integration architecture. The initial article is located here.

Publish-Subscribe asynchronous messaging

In a preceding article, I described three limitations of traditional integration:

  1. Lack of scalability and fault tolerance.
  2. There are no economies of scale.
  3. Sender and receiver are tied by a contract.

In my previous article, I presented Point-to-Point asynchronous messaging as a way to alleviate limitation #1 by adding a persistent, scalable and fault-tolerant layer to decouple the flow between a sender and a receiver.

In this article, I am going to discuss another form of asynchronous messaging called Publish-Subscribe, often referred to as Pub/Sub.

Pub/Sub is an asynchronous communication method in which messages are exchanged between applications without the sender and the recipients knowing each other's identity.

Pub/Sub messaging solves limitation #2. This is a One-to-Many delivery: several receivers (referred to as subscribers) can receive a message from a sender (referred to as a publisher) without having to duplicate the data flow. Rather than using queues as Point-to-Point messaging does, Pub/Sub makes use of topics, which allow several receivers to process the same message. In this case, the message is removed from the topic after all subscribers have processed it.
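The One-to-Many delivery described above can be sketched in a few lines of Python. This is a deliberately simplified, in-memory illustration under stated assumptions: the `MessageBus` class, its methods and the topic name `customer.registered` are hypothetical, and delivery is synchronous here, whereas a real broker would persist messages and deliver them asynchronously.

```python
from collections import defaultdict
from typing import Any, Callable


class MessageBus:
    """Toy in-memory bus: one topic, many subscribers (illustration only)."""

    def __init__(self) -> None:
        # topic name -> list of subscriber callbacks
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> int:
        """Hand the same message to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = MessageBus()
warehouse_inbox: list = []
finance_inbox: list = []

# Two independent systems subscribe to the same topic.
bus.subscribe("customer.registered", warehouse_inbox.append)
bus.subscribe("customer.registered", finance_inbox.append)

# The publisher sends once and does not know who is listening.
delivered = bus.publish("customer.registered", {"customer_id": "C-1"})
```

Adding a third receiver in this sketch only requires another `subscribe` call; the publisher's code does not change, which is where the economy of scale comes from.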

Advantages

  1. There is no more compounded complexity when multiple systems or applications need to receive the same data, as they can simply subscribe to the topic.
  2. As the inbound flows (publisher to bus) become reusable, there is economy of scale.
  3. As with Point-to-Point messaging, there is loose coupling between publishers and subscribers, allowing them to operate independently of each other.
  4. Scalability is achieved through multi-threading, message caching, etc., and reliability / fault tolerance through clustering and load balancing.

Limitations

One of the biggest limitations of this model stems from its main advantage. As publishers and subscribers are not aware of each other, the architecture itself does not guarantee delivery: it guarantees the availability of a message for a subscriber to consume, but not the fact that the subscriber will consume it. Therefore, designs outside of the architecture must be in place if guaranteed delivery is a requirement.

As with Point-to-Point messaging, the scalability of the message bus is relative. Slow subscribers or large messages may overwhelm the message broker to the point of exhausting its available resources and bringing it down. I experienced these limitations with a slow subscriber combined with a verbose publisher, leading to an accumulation of messages beyond what the broker could handle. Mitigation actions can include purging all the messages (at the cost of information loss), or routing messages to a disaster recovery area specifically designed to handle large volumes of messages. These messages might have to be “re-played” once the situation returns to normal, depending on the business case.
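The overflow-and-replay mitigation can be illustrated with a small sketch. It is only one possible shape for such a design, under assumptions of my own: the `BoundedBacklog` class and its method names are hypothetical, the overflow list stands in for the disaster recovery area, and message ordering is not guaranteed if new messages keep arriving while a replay is pending.

```python
from collections import deque


class BoundedBacklog:
    """Toy backlog with a fixed capacity and an overflow area (illustration only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.backlog: deque = deque()   # what the broker can safely hold
        self.overflow: list = []        # stands in for the disaster recovery area

    def accept(self, message) -> None:
        """Keep the message if there is room, otherwise route it to overflow."""
        if len(self.backlog) < self.capacity:
            self.backlog.append(message)
        else:
            self.overflow.append(message)

    def consume(self):
        """Return the oldest backlog message, or None if the backlog is empty."""
        return self.backlog.popleft() if self.backlog else None

    def replay_overflow(self) -> int:
        """Move overflow messages back while there is room; return how many moved."""
        replayed = 0
        while self.overflow and len(self.backlog) < self.capacity:
            self.backlog.append(self.overflow.pop(0))
            replayed += 1
        return replayed


broker = BoundedBacklog(capacity=2)
for msg in ["m1", "m2", "m3", "m4"]:    # verbose publisher, nobody consuming yet
    broker.accept(msg)

overflowed = list(broker.overflow)       # messages that did not fit
first, second = broker.consume(), broker.consume()  # slow subscriber catches up
replayed = broker.replay_overflow()      # situation back to normal: replay
```

The point of the sketch is that the broker's own backlog stays bounded, so a slow subscriber degrades into delayed messages rather than an exhausted broker.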

When to use

This model is typically used for scenarios where the publisher does not need to know whether a message has been successfully distributed to a subscriber, because by design the publisher is not aware of the subscribers.

Next

In the next article, I will discuss alleviating limitation #3 through the use of Canonical Models.

If you have any questions, feel free to leave a comment below or contact us here.

Journey to an Enterprise Service Bus Integration – Part 3

This article is the third in a series dedicated to the ESB integration architecture. The initial article is located here.

Point-to-Point asynchronous messaging

In my previous article, I described three limitations of traditional integration:

  1. Lack of scalability and fault tolerance.
  2. There are no economies of scale.
  3. Sender and receiver are tied by a contract.

In this article, I will describe a form of asynchronous messaging called Point-to-Point.

Point-to-Point asynchronous messaging solves limitation #1. It does so by introducing a messaging layer, often referred to as a message bus or event bus, and by using queues. It is still a one-to-one delivery though, as in traditional integration.

Advantages

A couple of advantages of this model:

  1. Senders and receivers are now physically loosely coupled. For example, if the receiver is down or slows down (it is still working but cannot accept messages as quickly as they are sent, for example because it is not sized appropriately), the messaging layer can still receive messages and hold them until the receiver is ready to consume them, effectively playing a regulating role.
  2. Scalability and fault tolerance are built into the model: a persistence layer (file or database), vertical scalability since the bus runs on a separate set of servers that can be scaled independently of the sender or receiver, and horizontal scalability through the ability to create multiple reception threads.

Limitations

However, this form of messaging still does not allow for economies of scale, as there can be only one receiver for a given message. It also ties all parties to a data schema, and any change to this schema often requires all parties to be involved.

When to use

This model is typically used to facilitate communication between 2 components (whether they are tightly coupled or written in a different language/technology platform), and/or when you need to ensure a message is processed by one receiver only. If multiple receivers point to the same queue, only one of them will dequeue a given message, and the message will be removed once acknowledged.
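The "one receiver only" behavior can be demonstrated with Python's standard `queue` and `threading` modules. This is a minimal in-process sketch, not a real message broker: the function name `run_competing_receivers` and the message names are made up for the example, and `task_done()` merely stands in for an acknowledgment.

```python
import queue
import threading


def run_competing_receivers(messages, receiver_count=2):
    """Let several receivers drain one queue; each message is handled once."""
    q: queue.Queue = queue.Queue()
    for message in messages:
        q.put(message)  # the sender enqueues before any receiver starts

    processed = []      # (receiver name, message) pairs
    lock = threading.Lock()

    def receiver(name: str) -> None:
        while True:
            try:
                message = q.get_nowait()  # only one receiver gets a given message
            except queue.Empty:
                return                    # nothing left to consume
            with lock:
                processed.append((name, message))
            q.task_done()                 # stands in for the acknowledgment

    threads = [
        threading.Thread(target=receiver, args=(f"receiver-{i}",))
        for i in range(receiver_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed


processed = run_competing_receivers(["order-1", "order-2", "order-3"])
```

Which receiver handles which message depends on thread scheduling and is not predictable, but every message appears exactly once in `processed`.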

Next

In the next article, I will discuss alleviating limitation #2 above through Publish-Subscribe asynchronous messaging.

If you have any questions, feel free to leave a comment below or contact us here.

Journey to an Enterprise Service Bus Integration – Part 2

This article is the second in a series dedicated to the ESB integration architecture. The initial article is located here.

Traditional data integration

In my previous article, I explained why an ESB architecture is, in my opinion, the go-to architecture for tackling an enterprise’s challenges in today’s world.

But in order to understand why I value an ESB, I find it essential to first explain the shortcomings of a traditional/point-to-point data integration, as well as some of the pushback I initially experienced.

In a traditional data integration, interfaces are built between applications and systems through direct connections, as a Point-to-Point integration. As such, the number of interfaces tends to grow much faster than the number of systems (roughly quadratically when every system has to talk to every other) as the data needs increase and/or the landscape of applications and systems expands.

Some refer to this architecture as a “spaghetti architecture”, because all systems have to be interconnected with each other in order to speak to each other.
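A quick back-of-the-envelope calculation shows how fast the spaghetti grows. The sketch below assumes the worst case, where every system exchanges data with every other system through one interface per pair, and compares it with a layout where each system has a single connection to a central integration layer. The function names are mine, chosen for the illustration.

```python
def point_to_point_interfaces(systems: int) -> int:
    """One interface per pair of systems: n * (n - 1) / 2."""
    return systems * (systems - 1) // 2


def hub_connections(systems: int) -> int:
    """One connection per system to a central integration layer."""
    return systems


for n in (5, 10, 20):
    print(f"{n} systems: {point_to_point_interfaces(n)} direct interfaces, "
          f"{hub_connections(n)} connections through a central layer")
# 5 systems: 10 direct interfaces, 5 connections through a central layer
# 10 systems: 45 direct interfaces, 10 connections through a central layer
# 20 systems: 190 direct interfaces, 20 connections through a central layer
```

Real landscapes are rarely fully interconnected, so actual numbers will be lower, but the trend is the same: doubling the number of systems roughly quadruples the number of direct interfaces.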

Traditional interfaces are individually easy and cheap to build. By individually I mean: each interface taken separately, without looking at the system as a whole in terms of flexibility and scalability.

Pushback

Below are some of the arguments I faced when talking about the need to introduce an integration layer:

  1. There are only 2 teams involved. No need for a third “integration team” that needs to be brought up to speed on each system’s requirements.
  2. There are only 2 systems involved in each interface, limiting the points of failure and troubleshooting efforts.
  3. Sender and receiver teams can quickly discuss the data requirements and come up with a format satisfying the needs of the interface as well as a transport method satisfying each system’s technical limitations.
  4. With this type of interfacing, the sender and/or the receiver remain in control of the data and transformations, so they can troubleshoot issues faster without having to rely, once again, on a “third team”.

Limitations

The points above are very valid when you operate in a world where there are few senders and receivers, when you mostly need to send data once a day/week/month in a large batch always at the same time, and when the data exchanged only needs to go to one place.

However, there are also very valid limitations to this model:

  1. Lack of scalability and fault tolerance. For example, an overloaded receiver could start running slower or lose connections, leading to delays in processing or, worse, message loss.
  2. There are no economies of scale. If (when) 2 receivers are interested in the same information, the integration needs to be duplicated either by the sender, or by one of the receivers to be then forked to the other, or in a staging area which adds another point of failure (and defeats one of the “pro traditional” arguments). All these duplications cause compounded complexity.
  3. Sender and receiver are tied by a contract (a.k.a. the data schema) that cannot be changed unless both agree on the terms.

Through these limitations, you can see how 2 systems interfaced using this model are tightly coupled, and how a change in one system may require the intervention of all parties involved.

Another consequence: because of the lack of governance that this model entails, all these disparate interfaces, built over time by different developers with different naming conventions, start to weigh heavily on the support side, creating pockets of expertise. These pockets become a liability for an enterprise due to the amount of effort and training required to maintain knowledge.

When to use

I personally would not use Traditional Integration end-to-end. However, system limitations sometimes force the use of traditional integration as a step in an overall integration, for example when a third-party customer or supplier is only able to send a file through [s]FTP. In this case, the receiver’s first task would be to decouple the flow, as I will discuss in the next articles.

Next

In the next article, I will discuss a first step in alleviating Limitation #1 above: Point-to-Point asynchronous messaging.

If you have any questions, feel free to leave a comment below or contact us here.

Journey to an Enterprise Service Bus Integration

Introduction

In the 1990s, Enterprise Resource Planning (ERP) solutions grew tremendously in popularity. At the time, they were the new “grail”, concentrating every single one of an enterprise’s needs in a packaged solution. It did not take long, roughly 10 years, for executives to determine that their bloated, overly customized and expensive-to-upgrade solutions would never bring the reductions in complexity and the savings they had been promised.

Worse, each new change would compound the existing complexity: changing Chart of Accounts, product codification, pricing structure, customer structure, etc.

An idea started to form: rather than concentrating everything in one place, a better approach would be to leave best-of-breed applications alone and integrate them. How?

The Enterprise Service Bus is arguably the “grail” of data integration solutions, especially when it comes to Enterprise data integrations.

In my experience, there are several reasons for that:

  • For the last 20 years, enterprise growth has been greatly accelerated through acquisitions, creating the need to integrate, quickly and in a repeatable way, different business models built on different IT systems. But what if I’m on Oracle Applications and they’re on SAP, now what?
  • The speed at which companies are doing business, and the number of actors involved, created the need to disseminate crucial information in real time across numerous disparate IT systems. How do I enable customer registration in my e-commerce portals, and propagate this information in real time into my order entry systems, my warehouse management systems, financial systems and Salesforce CRM so the customer can start placing orders right away?
  • “Data is the new gold”: the ability to gain insights by inspecting and analyzing, in real time, the data passing through enterprise systems in order to make on-the-spot operational decisions. How do I route a customer order so it is served by a location that will provide the fastest service, at the cheapest cost?

At 50,000 ft, an ESB architecture looks a lot like a Hub & Spoke:

However, the distributed architecture in the central Hub alleviates the bottleneck challenges that the Hub & Spoke model introduced. I will come back to this aspect later.

To that extent, I see the ESB architecture as a natural progression of the Hub & Spoke model to better face today’s enterprise challenges.

In this series of articles, I will review various data integration models and how they address the shortcomings of a Traditional integration, paving the way for the utilization of an Enterprise Service Bus architecture. I will not enter into deep technical details; rather, I will provide a practical review of each model and its applicability to an enterprise.

The next article in the series can be found here.

If you have any questions, feel free to leave a comment below or contact us here.