Building Geographically Distributed Enterprise Applications — Part 1
Any business organisation that plans to expand its operations across multiple countries has to consider some key technology aspects:
- Latency in serving customers in different countries
- Minimizing the effect on other regions when a regional IT system fails
- Data privacy laws in different countries and regions
- Ability to offer country/region specific services
The main approach to addressing these concerns is to deploy IT systems in (or close to) most of the operating countries, which results in a multi-region architecture.
Generic multi-region architecture
In such an architecture, one region usually acts as the main active data center, hosting all components of the IT system. The other regional data centers contain only the components that handle transactional data (e.g. purchase requests, inventory searches, etc.) or privacy-related data (e.g. names, addresses, contact numbers, health records, etc.).
Furthermore, one or more regional data centers can contain complete replicas of the main deployment in order to act as a failover/disaster-recovery (DR) site. Such failover data centers may or may not handle normal traffic. Usually, this deployment has to be backed by data stores, which may need to be replicated across data centers. This architecture is shown in Figure 1.
In Figure 1, two systems, system A and system B, are exposed to users in two regions. Both systems depend on a database. System A’s full deployment as well as its minimal deployment need to read from and write to database 1. As two-way database replication among regions can introduce certain complications, both deployments of system A (i.e. in region 1 and region 2) access a shared deployment of database 1 in region 1.
However, system B’s minimal deployment only needs read access to database 2. Therefore, database 2 is replicated (one-way replication) to region 2 and system B’s minimal deployment is given read access to the replica of database 2.
Furthermore, full deployments of both system A and system B are available in region 3 with replicas of both databases. Therefore, region 3 can act as a failover region whenever needed.
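The database access rules of Figure 1 can be written down as a small routing table. The following is only a minimal sketch: the names `DbEndpoint`, `ENDPOINTS` and `resolve_endpoint` are hypothetical, and treating the region 3 replicas as read-only until they are promoted during a failover is an assumption, not something stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbEndpoint:
    database: str
    region: str      # region where the database instance physically runs
    writable: bool   # False for one-way replicas


# Hypothetical topology mirroring Figure 1: (system, caller region) -> endpoint.
ENDPOINTS = {
    # System A needs read/write access, so both deployments share region 1's database 1.
    ("system_a", "region1"): DbEndpoint("database1", "region1", True),
    ("system_a", "region2"): DbEndpoint("database1", "region1", True),
    # System B's minimal deployment only reads, so it uses the local one-way replica.
    ("system_b", "region1"): DbEndpoint("database2", "region1", True),
    ("system_b", "region2"): DbEndpoint("database2", "region2", False),
    # Failover region: replicas of both databases (assumed read-only until promoted).
    ("system_a", "region3"): DbEndpoint("database1", "region3", False),
    ("system_b", "region3"): DbEndpoint("database2", "region3", False),
}


def resolve_endpoint(system: str, region: str, needs_write: bool) -> DbEndpoint:
    """Return the database endpoint a deployment should use, rejecting writes to replicas."""
    endpoint = ENDPOINTS[(system, region)]
    if needs_write and not endpoint.writable:
        raise PermissionError(
            f"{system} in {region} only has a read replica of {endpoint.database}"
        )
    return endpoint
```

For example, `resolve_endpoint("system_a", "region2", needs_write=True)` points at region 1, which makes the cross-region hop for system A's writes explicit, while system B's reads in region 2 stay local.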
Enterprise features for multi-region applications
As we have discussed in previous articles, most modern business systems are designed by composing multiple backend systems, data sources and microservices. Therefore, an integration layer has to be incorporated into such applications. Further, it is necessary to have an API layer to expose business functionality to external applications and systems in a controlled manner. In addition, users of modern business applications expect sophisticated authentication, authorization and account management capabilities. In the rest of this article and in a few upcoming articles, we will discuss how we can introduce such API management, integration and Identity and Access Management (IAM) features to multi-region business applications.
The API layer has two main components: the API management plane and the API runtime component. According to the generic architecture discussed earlier, we can deploy both the management plane and the runtime components in the main region’s data center. It is then possible to deploy API runtime components in each region that we want to serve.
To make things more concrete, we focus on the WSO2 platform in this article. The WSO2 API management product has multiple modules, namely: API Gateway (GW), API Publisher, Developer Portal, Key Manager (KM), Traffic Manager (TM) and Analytics (Figure 2).
Among these modules, the API Gateway is the main runtime component (data plane), which acts as a proxy between backend systems and API clients. It enforces all policies (security, rate limiting, content validation, etc.) and collects data for analytics.
The Traffic Manager (TM) module receives metadata from gateways for each API call and evaluates rate limiting policies against API traffic. If a rate limiting policy is violated, it informs the gateways so that they can take the necessary action. As the TM is also involved in each API call, we may have to deploy it along with the runtime components in multi-region deployments, as we discuss later in this article.
All other modules can be considered management plane components, which handle API creation, policy creation, API subscriptions, key generation and API usage analysis.
All modules except Analytics can be deployed as a single binary (all-in-one profile) or as separate binaries as necessary. Details about all these modules can be found in the WSO2 documentation.
Active-Active deployment (without region failover support)
Figure 3 shows a deployment of the API layer in two regions (region 1 and region 2). Here, region 1 acts as the main region. Both the API management plane (APIM all-in-one and Analytics) and the runtime components (GW and TM) are deployed in the main region. Only the runtime components (GW and TM) are deployed in region 2. In addition to acting as the throttling policy evaluation point, the main region’s TM acts as an event hub for notifying gateways about API-related updates. Therefore, all gateways in the deployment (in both region 1 and region 2) are subscribed to region 1’s TM to receive API update events.
Propagating API and subscription updates to gateways
When an API is published by the management plane, it is saved in the APIM database and a notification is sent to the main region’s TM. This TM sends events about the API update to all subscribed gateways. Once this event is received, each gateway fetches the updated API artifacts from the main region’s TM. A similar event-based update happens when the subscription details for an API are modified in the developer portal.
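The notify-then-fetch flow described above can be illustrated with a small in-memory sketch. This is not WSO2 code: `EventHub` and `Gateway` are hypothetical stand-ins for the main region's TM and a regional gateway, and keeping artifacts in a dictionary instead of the APIM database is a simplification.

```python
class EventHub:
    """Stands in for the main region's TM acting as an event hub (hypothetical)."""

    def __init__(self):
        self._artifacts = {}     # api name -> latest artifact (simplified storage)
        self._subscribers = []   # callbacks registered by gateways

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish_api(self, api_name, artifact):
        # Step 1: store the artifact, then notify every subscribed gateway.
        self._artifacts[api_name] = artifact
        for notify in self._subscribers:
            notify(api_name)

    def fetch_artifact(self, api_name):
        return self._artifacts[api_name]


class Gateway:
    """A regional gateway that pulls artifacts after being notified (hypothetical)."""

    def __init__(self, region, hub):
        self.region = region
        self.hub = hub
        self.deployed = {}
        hub.subscribe(self.on_api_updated)   # the gateway initiates the subscription

    def on_api_updated(self, api_name):
        # Step 2: the event carries only the API name; the gateway fetches the artifact.
        self.deployed[api_name] = self.hub.fetch_artifact(api_name)


hub = EventHub()
gateways = [Gateway("region1", hub), Gateway("region2", hub)]
hub.publish_api("API_1", {"version": "1.0.0", "context": "/api1"})
```

After the `publish_api` call, both gateways hold the same artifact for `API_1`, even though only the hub was told about the update.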
Enforcing rate limiting policies
In the scenarios discussed in this article, we assume that rate limiting policies are enforced regionally, i.e. if we have a policy allowing 100 API calls per minute for API_1, each region allows 100 calls per minute for API_1. Each gateway sends metadata about each API call to its regional TM. The regional TM evaluates API call counts against the enabled rate limiting policies and notifies gateways if a rate limit has been exceeded. Once such a notification is received, gateways can block further API calls.
This type of region-wise rate limiting is useful in many situations for two reasons:
- Usually, API users are bound to regions. This allows API and application developers to easily decide on a per-region rate.
- Region-wise rate limiting policies are evaluated by regional TMs, which minimizes communication delays to gateways. If it is necessary to enforce rate limiting policies globally, all rate limiting decisions have to be taken by the main region’s TM, which could introduce a communication delay in propagating decisions to regional gateways. Such delays could cause some API calls to go through even after the rate limit has been exceeded.
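To make the per-region behaviour concrete, here is a minimal sketch of a regional counter. It assumes a simple fixed one-minute window and a synchronous check, whereas the deployment described above has gateways sending metadata to the TM and being notified afterwards. `RegionalTrafficManager` is a hypothetical class, not part of the WSO2 product.

```python
from collections import defaultdict


class RegionalTrafficManager:
    """Counts calls per API in fixed one-minute windows for a single region."""

    def __init__(self, limit_per_minute):
        self.limit = limit_per_minute
        self._counts = defaultdict(int)   # (api, window index) -> calls seen

    def record_call(self, api, timestamp):
        """Record one call; return True while the API stays within its limit."""
        window = int(timestamp // 60)     # timestamp is in seconds
        self._counts[(api, window)] += 1
        return self._counts[(api, window)] <= self.limit


# One TM per region, each enforcing the same "100 calls per minute" policy independently.
region1_tm = RegionalTrafficManager(limit_per_minute=100)
region2_tm = RegionalTrafficManager(limit_per_minute=100)
```

Because each region has its own counter, exhausting the limit for API_1 in region 1 has no effect on callers in region 2, which is exactly the trade-off described above.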
Each gateway publishes metadata about each API call to the Analytics module deployed in the main region. This metadata publication happens asynchronously, so that it does not affect the performance of API calls made via gateways.
Furthermore, similar to the global rate limiting scenario, this introduces a communication delay in sending API metadata from regional gateways to the main region’s Analytics module. However, such delays can be tolerated, as we don’t take any time-critical action based on analytics metadata other than updating dashboards.
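The asynchronous publication described above is commonly implemented with a queue and a background worker. The sketch below shows that general pattern under stated assumptions: `AnalyticsPublisher` and the `send` callable are hypothetical, and dropping failed events is a simplification that a real publisher would replace with retries or logging.

```python
import queue
import threading


class AnalyticsPublisher:
    """Queues API call metadata and ships it to the main region off the request path."""

    def __init__(self, send):
        self._send = send                 # hypothetical callable that delivers one event
        self._events = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def publish(self, event):
        """Called while handling an API request; only enqueues, never waits on the network."""
        self._events.put(event)

    def _drain(self):
        while True:
            event = self._events.get()
            try:
                self._send(event)
            except Exception:
                pass                      # best effort in this sketch; see lead-in
            finally:
                self._events.task_done()

    def flush(self):
        """Block until every queued event has been handed to `send` (useful in tests)."""
        self._events.join()
```

The gateway's request handler only ever calls `publish`, so a slow link to the main region delays dashboards rather than API responses.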
As you may have already noticed, this same architecture can be used for hybrid or multi-cloud deployments. In such deployments, we can deploy the full API Manager deployment in a cloud infrastructure (e.g. AWS) and additional runtime (GW and TM) deployments in on-premise data centers or in the infrastructure of another cloud provider (e.g. Azure). Such a hybrid/multi-cloud deployment is shown in Figure 4.
An important property of this architecture is that all inbound connections are made to the full deployment hosted in the cloud (AWS in this case). Although the arrows for API and subscription updates are drawn from the AWS cloud to the other deployments, these updates occur via an AMQP connection initiated by the other deployments. Therefore, the other data centers (especially on-premise data centers) do not have to allow inbound connections from the cloud.
Benefits of the architecture
In this architecture, all authentication and token generation requests are directed to the main region. The assumption here is that the frequency of such requests is low (one per user session) compared to API calls. Furthermore, as a user performs only one authentication and token generation request per session, some delay in such requests can be tolerated. If these assumptions hold, this architecture allows us to deploy a multi-region active-active API layer without any cross-region database replication and with minimal licensing and infrastructure costs.
However, if the above assumptions do not hold for a certain deployment (e.g. one that has thousands of authentication requests per minute), it may be necessary to deploy the API management plane (or IAM cluster) in each region as well and replicate databases. We will discuss this scenario in detail in the next article.
Now let’s check which of the requirements listed at the beginning of this article can be satisfied with this architecture:
Latency in serving customers in different countries/regions
API calls made within a region go through that region’s gateway, so API call latency is minimized.
Minimizing the effect on other regions when a regional IT system fails
A failure of the main region can affect the entire deployment, so this requirement is not satisfied. We will discuss how to satisfy this requirement in the next article.
Data privacy laws in different countries/regions
Message contents sent via API calls do not cross regional boundaries. Therefore, this requirement is satisfied for API call data. However, if users are authenticated by the API management plane, user data has to be maintained in the main region. We can follow the approaches below to overcome this, depending on the business use case:
- If client applications maintain their own user bases, users can be authenticated within the client application itself (this is common for external systems having their own user stores). In such cases, client applications can authenticate with the API management plane just by providing application level credentials. (For more information, please refer to different API authentication mechanisms discussed in this article).
- If users have to be authenticated with the API management plane (e.g. for a customer portal) and privacy laws don’t permit maintaining user data in a central region, the management plane can be replicated in each region with separate user stores.
Ability to offer country/region specific services
As each region is served by a separate gateway cluster, it is possible to deploy region-specific APIs to any given region. Furthermore, as all regions are served by the main region’s management plane, it is possible to govern all common and region-specific APIs with a single API portal.
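One simple way to picture region-specific deployment is a central catalogue that records which regions each API should be served from, with every gateway cluster deploying only its own subset. The sketch below is purely illustrative: the API names, `API_REGIONS` and `apis_for_gateway` are hypothetical and do not reflect how any particular product models this.

```python
# Hypothetical catalogue kept by the single management plane:
# each API lists the regions whose gateway clusters should host it.
API_REGIONS = {
    "orders": {"region1", "region2"},             # common API
    "inventory-search": {"region1", "region2"},   # common API
    "eu-health-records": {"region2"},             # region-specific API
}


def apis_for_gateway(region):
    """Return the names of the APIs that a gateway cluster in `region` should deploy."""
    return sorted(name for name, regions in API_REGIONS.items() if region in regions)
```

With this catalogue, `apis_for_gateway("region1")` yields only the two common APIs, while region 2 additionally gets `eu-health-records`, and all three APIs remain visible in one place for governance.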