OpenStack User-Facing Services and Internal Services

Estimated reading time: 13 minutes.

Objective

Describe the generic internal architecture of OpenStack services and how they use Kubernetes networking and storage.

General Architecture of OpenStack Services

Up to this point, we presented OpenStack services such as Nova and Neutron as black boxes. We hinted that each of these services has a more complex internal structure: they have components running on the OpenStack control plane and components running on OpenStack compute nodes, and they also need support from other services, such as database and messaging servers.

Before you inspect the Kubernetes API resources that represent those OpenStack services, for example, to troubleshoot performance and availability issues of those services, you must understand how their internal components relate to each other and to other OpenStack services.

An Example: OpenStack Nova

Let’s use OpenStack Nova as an example, not only because it is the most important, but also because it includes some complexity that you might not see in one of the simpler OpenStack services.

Figure 1: The external and internal views of the architecture of OpenStack Nova.

The figure shows the external and internal views of the architecture of OpenStack Nova. From the external view, Nova interacts with other OpenStack services: Cinder, Glance, Neutron, and Placement. It also interacts with Keystone, as every OpenStack service must.

These interactions happen over HTTP; in fact, they use the same OpenStack APIs that the OpenStack client and Horizon use to communicate with these OpenStack services. For our purposes, it doesn’t matter which of the internal components of Nova communicate with which OpenStack services: they all exist inside the black box of Nova.

From the internal view, opening the black box, all components of Nova communicate with each other using AMQP message passing, including the components that run outside of OpenShift, on RHEL compute nodes. Components running on the OpenStack control plane can also communicate with a relational database server to store data for API resource instances.

The following are the main components of OpenStack Nova. The OpenStack community documentation calls them "Nova services", and you will see that "services" is a very overloaded term in both OpenStack and OpenShift, so be sure you know which kind of service you are referring to at any given moment.

Nova API

Provides the API entry point for all OpenStack clients, services, and users that need to use services from Nova. No component external to Nova communicates with any Nova component other than this API entry point.

Nova Scheduler

Selects a compute node to run new server instances, taking into account the availability of compute capacity, storage, and network connectivity.

Nova Conductor

Mediates communication between Compute components running on individual compute nodes and the relational database server, which could easily become overloaded with too many connections from too many compute nodes.

Nova Compute

Unlike the other components, which run on the OpenStack control plane, that is, as containers on OpenShift, the Compute component runs on RHEL compute nodes. It is the component that interacts with the local KVM hypervisor to start virtual machines and attach them to virtual network and storage resources.

OpenStack Nova may include other components not shown in the figure, and Red Hat OpenStack Services on OpenShift runs a few of them. For example, there’s a component which mediates remote console sessions using the VNC protocol. We omit these components for simplicity.

Internal Scalability of OpenStack Services

The previous figure shows a single icon for each internal component of OpenStack Nova, but there could be multiple instances of each component, which could run on different control plane servers. That is, there could be multiple containers on OpenShift for each of the components of Nova, potentially running on different nodes of an OpenShift cluster. This enables Nova to handle a large number of compute nodes running an even larger number of virtual machines.

The fact that each component of Nova can be scaled to multiple containers independently of the other components enables OpenStack Administrators to handle different scenarios efficiently. For example: Is the bottleneck caused by too many messages from too many Nova Compute instances, each on a compute node? Run more Nova Conductor instances. Is the bottleneck caused by a CI/CD system that provisions and tears down many server instances in a short period of time? Scale the Nova API instances. Is the bottleneck caused by complex attributes of compute nodes, host aggregates, and a large number of server flavors? Scale the Nova Scheduler instances.
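To make this concrete, the following is a minimal, illustrative sketch of how per-component replica counts could appear in an OpenStackControlPlane custom resource managed by the OpenStack add-on operator. The nested field names are simplified placeholders, not the authoritative schema, so consult the CRD on your cluster for the exact spelling.

    # Illustrative fragment only: API version and nested field names are assumptions.
    apiVersion: core.openstack.org/v1beta1
    kind: OpenStackControlPlane
    metadata:
      name: openstack-control-plane
      namespace: openstack
    spec:
      nova:
        api:
          replicas: 3        # more Nova API instances for bursty CI/CD provisioning
        scheduler:
          replicas: 2        # more schedulers for complex placement decisions
        conductor:
          replicas: 3        # more conductors for a large number of compute nodes

The point is the shape: one resource with independent replica counts per Nova component, which the operator reconciles into separate sets of containers.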

Other OpenStack User-Facing Services

Other OpenStack services are expected to follow a similar architecture to Nova. They don’t have to, but by convention and by design best practices from the OpenStack community, it is expected that they do. Unless you are a developer working on the internal implementation of OpenStack services themselves, you do not need to care about occasional inconsistencies between different OpenStack services.

So you expect that services interact with each other only by using their public APIs over HTTP. You also expect that the internal components of any service interact with each other using AMQP message passing, and that they all store API resources in a relational database. Finally, you expect that all OpenStack services can run multiple containers for scalability and high availability.

Some OpenStack services may run entirely on the control plane, for example, Swift, Glance, and Heat. A few services require a component running on compute nodes, for example, Neutron and Cinder. Components running on compute nodes communicate with components running on the control plane by using AMQP message passing. If these components also need to interact with other OpenStack services, they use HTTP.

The service-oriented architecture of OpenStack does not force all its services into the same mold. Each service can make different architectural and implementation decisions, if it needs to. This leads to occasional inconsistencies, but it is a small price to pay for increased flexibility and scalability.

Internal Services of OpenStack Clusters

Some internal services of Red Hat OpenStack Services on OpenShift are shared by multiple OpenStack user-facing services. User-facing services do not need to share internal infrastructure services, but it is simpler to manage fewer instances of each service.

For example, you could have a different database server for each OpenStack service, but there is no reason not to share the same database server among all of them, at least until the database itself becomes overloaded with too many queries.

In a similar vein, Nova could have dedicated RabbitMQ instances for its cells, independent of the RabbitMQ instances used by Neutron and Cinder, but you usually configure the same RabbitMQ instances for all services. The fact that you could do it differently does not mean you should do it.

Other OpenStack services may require specialized internal services to fulfill their functionality. For example, we already mentioned that Nova needs remote console servers to enable access to server instance consoles. Neutron needs a software-defined networking layer. A few OpenStack services need clients for network file or block storage.

OpenStack Compute Cell Services

Most user-facing OpenStack services included with Red Hat OpenStack Services on OpenShift require a relational database to store API resource instances, and an AMQP message passing server. These two internal services define the scope of an OpenStack cluster compute cell and impact the scalability of the entire OpenStack control plane.

Red Hat OpenStack Services on OpenShift includes the following internal infrastructure services to address the need for a relational database and AMQP messaging:

MariaDB

It is an open source relational database based on the original codebase of MySQL.

Galera

It clusters multiple MySQL and compatible databases in active-active mode with synchronous data replication.

RabbitMQ

It is an AMQP messaging server which replicates messages between its instances, so no message is lost and message delivery is guaranteed.

You can deploy proof-of-concept Red Hat OpenStack Services on OpenShift clusters running a single instance of RabbitMQ and a single instance of MariaDB, without using Galera. For production clusters, Red Hat recommends running multiple instances of each, configured as a database cluster and as a messaging cluster. The good news is that the OpenStack add-on operator handles the clustered, multi-instance deployment of RabbitMQ and MariaDB for an OpenStack compute cell.
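Continuing the same illustrative sketch, the database and messaging clusters for a compute cell could be requested with replica counts in the infrastructure sections of the control plane resource. Again, the field names below are simplified placeholders rather than the authoritative schema.

    # Illustrative fragment only: simplified placeholder field names.
    spec:
      galera:
        replicas: 3      # three MariaDB instances clustered by Galera
      rabbitmq:
        replicas: 3      # three RabbitMQ instances with replicated message queues

With both counts set to 1, you get the proof-of-concept layout described above; raising them asks the operator to deploy the clustered, production layout.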

OVN Networking

Another key internal service of Red Hat OpenStack Services on OpenShift is the OVN networking layer, which runs components on both the control plane and compute nodes. OVN creates virtual networks by tunneling packets between OpenStack compute nodes and enables strong network isolation between workloads running on OpenStack clusters, without the need for external networking gear.

OVN distributes its network flow databases to compute nodes, so that the processing of packet flows is spread across the compute nodes instead of overloading a few network control nodes. OVN handles the replication and high availability of these network flow databases, including running multiple instances of the main flow databases on the OpenStack control plane.

OVN is so powerful that more recent releases of Red Hat OpenShift also use OVN to implement Kubernetes networking and to extend it for more advanced use cases, which were not originally supported by standard Kubernetes. The OVN instances running on the OpenStack control plane are independent of the OVN instances running on the OpenShift control plane, that is: Kubernetes networking and OpenStack networking are completely independent of each other.

Child Operators of the OpenStack Add-On Operator

Now that you have a glimpse of the internal structure of OpenStack services, you realize that each individual service needs its own management, scaling, and configuration. The OpenStack add-on operator handles that by relying on a number of child add-on operators.

In fact, Red Hat OpenStack Services on OpenShift includes specialized operators to manage each of the user-facing and internal infrastructure services: There is a Nova operator, a Neutron operator, a RabbitMQ operator, an OVN operator, and so on.

OpenShift has the concept of a meta-operator, which is an add-on operator that manages a set of child add-on operators. The OpenStack add-on operator is a meta-operator, and the External Data Plane Management add-on operator, which uses Ansible to manage OpenStack compute nodes, is also a child operator of the OpenStack add-on operator.
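As a rough illustration of the meta-operator pattern, a single top-level resource could carry one section per OpenStack service, each section reconciled by the corresponding child operator. The section names and enabled flags below are simplified placeholders.

    # Illustrative fragment only: one section per child operator.
    spec:
      keystone:
        enabled: true    # reconciled by the Keystone child operator
      nova:
        enabled: true    # reconciled by the Nova child operator
      neutron:
        enabled: true    # reconciled by the Neutron child operator
      swift:
        enabled: false   # a service you choose not to deploy

In this sketch, disabling a section simply means that the corresponding child operator has nothing to deploy.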

OpenShift Storage and Networking for OpenStack Services

The storage and networking services of OpenStack, such as Cinder and Neutron, provide capabilities for workloads running as VMs on OpenStack compute nodes. They do not provide such capabilities for other OpenStack services. OpenStack services, at least their components running on OpenShift, must use networking and storage capabilities from Kubernetes.

Data Storage Requirements of OpenStack Services

The Kubernetes storage requirements of Red Hat OpenStack Services on OpenShift come mainly from its internal infrastructure services. User-facing services focus on data storage for compute nodes, which run outside of OpenShift.

OpenStack services which require Kubernetes storage are expected to handle data consistency and availability by themselves, without requiring high-availability or replication features from the storage backend. This means you do not need OpenShift Data Foundation (ODF) or any other kind of enterprise storage backend for an OpenShift cluster which runs Red Hat OpenStack Services on OpenShift. The storage capabilities offered by the Logical Volume Manager Storage (LVMS) add-on operator, which enables access to the local storage of OpenShift cluster nodes, are sufficient.
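As an illustration, the following is a minimal sketch of the LVMCluster resource that the LVMS add-on operator manages. The volume group name and thin pool settings are assumptions that you would adapt to the local disks of your cluster nodes.

    apiVersion: lvm.topolvm.io/v1alpha1
    kind: LVMCluster
    metadata:
      name: lvmcluster               # assumed name
      namespace: openshift-storage
    spec:
      storage:
        deviceClasses:
        - name: vg1                  # volume group created from local disks on each node
          default: true
          thinPoolConfig:
            name: thin-pool-1
            sizePercent: 90          # share of the volume group given to the thin pool
            overprovisionRatio: 10

The operator then creates a storage class (typically named lvms-vg1) that OpenStack internal services can reference from their persistent volume claims.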

OpenStack Administrators are not required to use the LVMS add-on operator: they could use any Container Storage Interface (CSI) driver which is certified for Red Hat OpenShift, with whatever backend storage that driver supports. They just have no need to deploy such CSI drivers and their add-on operators on OpenShift.

Some OpenStack services may run remote storage clients, for example, Ceph librbd, from their own containers to manage or access data in the same storage backends that compute nodes use on behalf of their workloads. They do not use Kubernetes storage at all for managing Cinder volumes and Glance images. OpenStack workloads and OpenShift workloads do not require access to the same storage backends.

Later in this course, we will present the Kubernetes concepts and API resources required to manage storage for OpenStack services, as well as which OpenStack internal services require Kubernetes storage.

Network Connectivity Requirements of OpenStack Services

The Kubernetes networking requirements of Red Hat OpenStack Services on OpenShift come from the need to expose OpenStack APIs over HTTP to OpenStack Operators and Administrators, and also to expose those APIs and AMQP messaging (RabbitMQ) to compute plane nodes.

Additional requirements come from common data center design patterns of network isolation, which require that components of OpenStack services running on OpenShift, and also compute nodes, have connectivity to multiple isolated networks, something that standard Kubernetes alone cannot provide.

To provide network connectivity capabilities beyond standard Kubernetes, OpenShift includes a cluster operator and two add-on operators, illustrated by the combined sketch after this list:

Multus

Enables containers to attach to secondary networks, which map to additional network interfaces on OpenShift cluster nodes. Multus enables the containers of OpenStack services to connect to the same isolated networks that compute nodes connect to, as long as the OpenShift cluster nodes are connected to those same networks.

NMState

Enables declarative configuration of network devices on OpenShift cluster nodes, which may be easier than configuring all devices at OpenShift cluster deployment time, especially for more complex setups such as bonded interfaces. The NMState add-on operator also enables day-2 changes to the networking configuration of those nodes, for example, to enable access to new VLANs over existing trunk interfaces.

MetalLB

Standard Kubernetes enables external access to load balancers only on cloud provider platforms. If Kubernetes is deployed on physical servers or a traditional hypervisor, then Kubernetes load balancers only work inside the cluster. The MetalLB add-on operator adds a resource controller for Kubernetes load balancers that enables external access on non-cloud platforms.

The use of the NMState add-on operator is not required by Red Hat OpenStack Services on OpenShift, but it is recommended in most cases. The Multus cluster operator and the MetalLB add-on operator are required.
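To make these roles concrete, the following sketch shows one minimal custom resource for each operator: an NMState NodeNetworkConfigurationPolicy that adds a VLAN interface on the cluster nodes, a Multus NetworkAttachmentDefinition that lets OpenStack service containers attach to that isolated network, and a MetalLB IPAddressPool that provides externally reachable addresses for load balancer services. The interface names, VLAN ID, network names, and address ranges are assumptions for illustration only.

    apiVersion: nmstate.io/v1
    kind: NodeNetworkConfigurationPolicy
    metadata:
      name: internalapi-vlan          # assumed policy name
    spec:
      desiredState:
        interfaces:
        - name: enp6s0.20             # assumed trunk interface and VLAN ID
          type: vlan
          state: up
          vlan:
            base-iface: enp6s0
            id: 20
    ---
    apiVersion: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    metadata:
      name: internalapi               # assumed secondary network name
      namespace: openstack
    spec:
      config: |
        {
          "cniVersion": "0.3.1",
          "name": "internalapi",
          "type": "macvlan",
          "master": "enp6s0.20",
          "ipam": { "type": "whereabouts", "range": "172.17.0.0/24" }
        }
    ---
    apiVersion: metallb.io/v1beta1
    kind: IPAddressPool
    metadata:
      name: internalapi               # assumed pool name
      namespace: metallb-system
    spec:
      addresses:
      - 172.17.0.80-172.17.0.90       # assumed range for load balancer services

In a real deployment, MetalLB also needs an L2Advertisement or a BGP configuration that announces the pool outside the cluster.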

Later in this course, we will present the Kubernetes APIs and concepts, including the custom resources from NMState, MetalLB, and Multus, that are required to manage network connectivity for OpenStack services, and how user-facing and internal services use those Kubernetes API resources.