This article examines the role of microservices architecture in modern application development and the main challenges microservice applications face in the Day 1 (initial deployment) and Day 2 (continuous operations) phases, paying particular attention to the impact of data storage components on component creation efficiency, data baseline preparation and recovery, observability, scalability, and high availability. Taking OpenIM as an example, it then walks through how KubeBlocks enables efficient deployment and operations of microservice applications on Kubernetes, covering environment preparation, KubeBlocks installation, data storage component initialization, OpenIM application deployment, and runtime verification. The same approach can serve as a reference for deploying and operating other microservice applications.
The evolution of application architecture is a process of continuously adapting to changing business needs. In the early stages of application development, monolithic architecture dominated with its simple, straightforward design: all functional modules live in a single deployment unit and share a unified technology stack, which keeps the development threshold low and deployment simple, and it effectively supported the growth of enterprise software. However, as business scale expanded rapidly, the shortcomings of the monolithic architecture in development efficiency and scalability became increasingly prominent, pushing the industry to explore new system architectures and leading to the emergence of service-oriented architecture (SOA). SOA splits an application into multiple independent services that communicate through an enterprise service bus (ESB), standardizing communication protocols while preserving system flexibility. By breaking through the bottlenecks of the monolithic architecture to a certain extent, it became widely used in critical business systems such as bank account management, transaction processing, and carrier metering and billing.
Despite the great success of service-oriented architecture, the centralized enterprise service bus gradually exposed performance bottlenecks, and coupling between services limited further business growth. Against the backdrop of exploding global Internet traffic, the microservices architecture emerged. Compared with SOA, microservices use finer-grained service splitting, with each service focusing on a single business capability, further reducing system coupling. More importantly, the microservices architecture abandons centralized components such as the enterprise service bus entirely; services communicate directly through lighter-weight mechanisms, greatly improving system throughput.
With its superior performance in large-scale and high-concurrency scenarios, the microservice architecture has become the preferred architecture of Internet giants such as Google, Meta, and Amazon. The core benefits of this architectural pattern are reflected in several dimensions:
Microservices support precise, on-demand scaling of specific services, avoiding the resource waste of traditional architectures
Each team can independently choose its technology stack and release cadence according to the characteristics of its service, achieving truly independent evolution
Effective isolation between services ensures that localized failures do not ripple outward, significantly improving overall availability
The independent deployment mechanism provides a technical foundation for refined operations such as grayscale releases and A/B testing
Together, these features constitute the core competitiveness of the microservice architecture to deal with complex business scenarios.
While microservices bring many benefits, they also introduce new complexity. The challenges inherent in distributed systems, such as service-to-service communication, data consistency, and distributed transactions, require more granular design and management. At the infrastructure level, a complete support system must be built, covering service registration and discovery, configuration centers, and monitoring and alerting. Among these many changes, the challenges in operating data storage components are particularly noteworthy.
The unique needs and challenges of development and test environments
Complex component dependencies and orchestration in microservice architectures
The complexity of initial data preparation
Using the CI/CD process for a development and test environment as an example, modern microservices applications face many challenges when they are first deployed. Dev-test environments have unique characteristics: test environments need to be created and destroyed frequently to support rapid iteration, multi-branch parallel development requires environment isolation to avoid mutual interference, and automated tests need rapid feedback to improve development efficiency.
The difficulties presented by a microservices architecture at this stage show up first in the complexity of component dependencies. A complete application may consist of 10-20 microservices and rely on multiple data storage components: MySQL for user data, Redis for caching, MongoDB for log data, and Kafka for asynchronous messages. Each component has its own version, configuration, topology, and startup order, and bringing them up requires complex orchestration logic and error-handling mechanisms.
Preparing initial data has also become harder. The data store components that each microservice depends on need to be initialized separately, and the data sets are often interrelated. Desensitizing sensitive data must guarantee security and compliance while preserving the integrity of data relationships. Data isolation between test environments requires each environment to have its own data set, which significantly increases storage costs and management complexity.
Highly differentiated customer environments that require adaptation
Difficulty in root cause analysis due to insufficient observability
Difficulty ensuring the scalability and high availability of data storage components
When an application enters production, the challenges of long-term O&M become more severe. Commercial applications, for example, face highly differentiated customer environments: some customers use public clouds while others insist on private deployments, and each environment has its own network and storage types that must be adapted to and tuned for. At the same time, production environments demand extremely high stability and security and must be guaranteed 24×7 without interruption.
Observability is the most immediate challenge in operating a production environment. Root cause analysis needs to track the complete call chain of a user request across multiple services, and components built on different technology stacks generate metrics and logs in varied formats that must be collected, processed, stored, and queried in a unified way. In practice, however, application components and data storage components are often poorly coordinated in collection granularity, aggregation, and presentation.
Scalability presents challenges in production along multiple dimensions. Scaling data storage components is particularly difficult: database read/write splitting has to account for replication lag, and shard scaling has to handle data migration and traffic-balancing strategies. Different types of data storage components also demand different resources: some are CPU-intensive, some memory-intensive, and some I/O-intensive, each requiring a targeted scaling scheme. Cost optimization requires the system to adjust resource allocation dynamically based on actual load, avoiding the waste caused by over-provisioning.
Beyond observability and scalability, ensuring high availability is a challenge in its own right, especially for the data storage layer. Microservices architectures often use multiple data storage components built on different persistence technologies, each with its own failure modes and recovery mechanisms, which makes fault detection and automated recovery extremely complex. Worse, the failure of a data storage component often cascades: a temporarily unavailable database can push multiple microservices into an abnormal state, which in turn affects the stability of the entire system. In addition, failure recovery times vary widely across data storage components, making it difficult to guarantee system-wide RTO and RPO.
OpenIM is an open-source instant messaging application that provides users with complete real-time communication capabilities, including one-on-one chats, group chats, audio and video calls, file transfers, and more. As a high-performance, high-availability, and easy-to-scale communication platform, OpenIM is widely used in scenarios such as internal collaboration, customer service system integration, social application backend, and online education. OpenIM adopts a microservices architecture, which splits complex instant messaging functions into multiple independent services, where:
API Gateway serves as a unified access layer
User Service manages user authentication and profiles
The Friend Service and Group Service handle friend relationships and group management, respectively
The Message Service is responsible for message routing and delivery
Push Service enables multi-channel push
File Service manages file storage
In addition to application components, OpenIM uses a variety of data storage components, including:
MySQL stores structured data such as user information, friendships, and groups
Redis provides caching to maintain presence and session information
MongoDB stores unstructured data such as chat message logs
Both the complete deployment of OpenIM in a CI/CD pipeline and the observation and tuning of a running OpenIM instance in production are highly complex. OpenIM therefore recommends using KubeBlocks in a Kubernetes environment to address these challenges.
Before you start deploying OpenIM, you need to ensure that your Kubernetes cluster meets the following basic requirements:
| Requirement | Specification |
| --- | --- |
| Kubernetes version | 1.22 or later |
| Worker nodes | 3 |
| Per-node configuration | At least 4 CPU cores, 8 GB RAM, 50 GB storage |
| Network plug-in (CNI) | Running normally (Calico, Flannel, or Cilium recommended) |
| Storage plug-in (CSI) | Running normally (OpenEBS or another CSI driver recommended) |
| External access (optional) | LoadBalancer or Ingress Controller available |
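Before moving on, it can save time to confirm these prerequisites directly against the cluster. The following is a minimal pre-flight sketch; adjust the checks to your own environment:
# Confirm node count and Kubernetes version
kubectl get nodes -o wide
# Confirm per-node allocatable CPU and memory
kubectl describe nodes | grep -A 5 Allocatable
# Confirm that at least one CSI-backed StorageClass is available
kubectl get storageclass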
The above environment will be used to create the following components:
| Component | Specification |
| --- | --- |
| MySQL cluster | Per replica: 2 CPU cores, 4 GB RAM, 20 GB storage; topology: 1 primary + 1 replica, with high availability |
| Redis cluster | Per replica: 1 CPU core, 2 GB RAM, 10 GB storage; topology: 3-node cluster |
| MongoDB cluster | Per replica: 2 CPU cores, 4 GB RAM, 30 GB storage; topology: 3-node replica set |
| OpenIM application components | API Gateway: 1 CPU core, 1 GB RAM, 3 replicas; other services: 0.5 CPU cores, 512 MB RAM, 2 replicas each |
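The cluster creation commands in the next section do not encode these sizes explicitly. Depending on the kbcli release, they can usually be supplied at creation time; the following is a hedged sketch that assumes the engine-specific create subcommands accept --cpu, --memory (Gi), and --storage (Gi) flags, so confirm with kbcli cluster create mysql --help before using it:
# Assumed sizing flags -- verify they exist in your kbcli version
kbcli cluster create mysql mysql-cluster \
  --cpu 2 --memory 4 --storage 20 \
  --termination-policy WipeOut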
It is recommended to use Helm to install KubeBlocks, which is more flexible and easy to manage:
# Install Helm on Linux/macOS
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Verify the Helm installation
helm version
# Add the official KubeBlocks Helm repository
helm repo add kubeblocks https://apecloud.github.io/helm-charts
# Update the Helm repository index
helm repo update
# List available versions
helm search repo kubeblocks/kubeblocks
# Install the CRDs
kubectl create -f https://github.com/apecloud/kubeblocks/releases/download/v1.0.0/kubeblocks_crds.yaml
# Create a dedicated namespace and install KubeBlocks
helm install kubeblocks kubeblocks/kubeblocks \
--namespace kb-system \
--create-namespace \
--wait
# If you need to pin a specific version
helm install kubeblocks kubeblocks/kubeblocks \
--namespace kb-system \
--create-namespace \
--version=1.0.0 \
--wait
# Install kbcli
curl -fsSL https://kubeblocks.io/installer/install_cli.sh | bash
# Add kbcli to PATH (adjust the path according to the installer output)
export PATH="$PATH:/usr/local/bin"
# Verify the kbcli installation
kbcli version
# Check the overall KubeBlocks status
kbcli kubeblocks status
# List the Pods in the kb-system namespace
kubectl get pods -n kb-system
# Check the KubeBlocks Operator logs
kubectl logs -n kb-system deployment/kubeblocks -f
# List all available addons
kbcli addon list
# Check the StorageClasses
kubectl get storageclass
# Verify the default StorageClass
kubectl get storageclass -o jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}'
# Check network connectivity (creates a test Pod)
kubectl run test-pod --image=busybox --rm -it --restart=Never -- nslookup kubernetes.default
# Create a MySQL primary-replica cluster
kbcli cluster create mysql mysql-cluster \
--version 8.0.39 \
--termination-policy WipeOut
# Create a Redis cluster
kbcli cluster create redis redis-cluster \
--version 7.2.7 \
--mode replication \
--replicas 2 \
--termination-policy WipeOut
# Create a MongoDB replica set
kbcli cluster create mongodb mongodb-cluster \
--version 6.0.22 \
--mode replicaset \
--replicas 3 \
--termination-policy WipeOut
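Provisioning takes a few minutes. Before initializing any databases, wait until all three clusters report a Running status, for example:
# Wait until all three clusters show a Running status
kbcli cluster list
# Inspect an individual cluster in more detail if it is not ready yet
kbcli cluster describe mongodb-cluster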
# Create the MongoDB database and user
kubectl exec -it pods/mongodb-cluster-mongodb-0 -- /bin/bash
root@mongodb-cluster-mongodb-0:/# mongosh "mongodb://$MONGODB_USER:$MONGODB_ROOT_PASSWORD@mongodb-cluster-mongodb-mongodb:27017/admin"
mongodb-cluster-mongodb [direct: primary] admin> use openim_v3
mongodb-cluster-mongodb [direct: primary] openim_v3> db.createUser({
user: 'openim',
pwd: 'openimPassword123',
roles: [{ role: 'readWrite', db: 'openim_v3' }]
});
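The MySQL configuration below connects as root, but the openim_v3 database itself still needs to exist. The following is a minimal sketch that reuses the root credential from the KubeBlocks-generated secret; the pod name mysql-cluster-mysql-0 is an assumption, so confirm it with kubectl get pods first:
# Create the openim_v3 database in MySQL (pod name is an assumption)
MYSQL_ROOT_PASSWORD=$(kubectl get secret mysql-cluster-conn-credential -o jsonpath='{.data.password}' | base64 -d)
kubectl exec -it mysql-cluster-mysql-0 -- \
  mysql -uroot -p"$MYSQL_ROOT_PASSWORD" -e "CREATE DATABASE IF NOT EXISTS openim_v3;"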
# openim-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: openim-config
namespace: default
data:
config.yaml: |
mysql:
address: mysql-cluster-mysql.default.svc.cluster.local:3306
username: root
password: "$(MYSQL_ROOT_PASSWORD)"
database: openim_v3
maxOpenConn: 100
maxIdleConn: 10
maxLifeTime: 5
redis:
address: redis-cluster-redis.default.svc.cluster.local:6379
password: ""
mongo:
uri: mongodb://openim:openimPassword123@mongodb-cluster-mongodb.default.svc.cluster.local:27017/openim_v3?authSource=openim_v3
address: mongodb-cluster-mongodb.default.svc.cluster.local:27017
database: openim_v3
username: openim
password: openimPassword123
api:
openImApiPort: [ 10002 ]
rpcport:
openImUserPort: [ 10110 ]
openImFriendPort: [ 10120 ]
openImMessagePort: [ 10130 ]
openImGroupPort: [ 10150 ]
openImAuthPort: [ 10160 ]
openImPushPort: [ 10170 ]
openImConversationPort: [ 10180 ]
openImRtcPort: [ 10190 ]
# Retrieve the MySQL root password
MYSQL_ROOT_PASSWORD=$(kubectl get secret mysql-cluster-conn-credential -o jsonpath='{.data.password}' | base64 -d)
# Create a Secret containing the database password
kubectl create secret generic openim-secret \
--from-literal=mysql-root-password="$MYSQL_ROOT_PASSWORD"
# Apply the configuration file
kubectl apply -f openim-config.yaml
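Before deploying the application, it is worth confirming that both objects were created:
# Confirm that the ConfigMap and Secret exist
kubectl get configmap openim-config
kubectl get secret openim-secret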
# openim-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: openim-api
labels:
app: openim-api
spec:
replicas: 2
selector:
matchLabels:
app: openim-api
template:
metadata:
labels:
app: openim-api
spec:
containers:
- name: openim-api
image: openim/openim-server:v3.5.0
command: ['/openim/bin/openim-api']
ports:
- containerPort: 10002
env:
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: openim-secret
key: mysql-root-password
volumeMounts:
- name: config
mountPath: /openim/config
readOnly: true
resources:
requests:
memory: '512Mi'
cpu: '500m'
limits:
memory: '1Gi'
cpu: '1000m'
volumes:
- name: config
configMap:
name: openim-config
---
apiVersion: v1
kind: Service
metadata:
name: openim-api-service
spec:
selector:
app: openim-api
ports:
- port: 10002
targetPort: 10002
type: ClusterIP
# Apply the OpenIM Deployment configuration
kubectl apply -f openim-deployment.yaml
# Watch the Pod startup status
kubectl get pods -l app=openim-api -w
# Check the Service status
kubectl get svc openim-api-service
# Check the status of the database clusters
echo "=== Database Clusters Status ==="
kbcli cluster list
# Check the OpenIM application status
echo "=== OpenIM Application Status ==="
kubectl get pods -l app=openim-api
kubectl get svc openim-api-service
# Check OpenIM API health
kubectl exec -it deployment/openim-api -- curl -f http://localhost:10002/healthz
# View application logs
kubectl logs -l app=openim-api --tail=50
# Port-forward for local testing
kubectl port-forward service/openim-api-service 10002:10002 &
# Test the API endpoint
curl http://localhost:10002/healthz
Use the preceding verification steps to ensure that OpenIM and its dependent database services are running properly and have the ability to process actual business requests.
The KubeBlocks microservice deployment and operations solution helps OpenIM solve the core challenges of containerized deployment. On Day 1, KubeBlocks simplifies the deployment of highly available data clusters through declarative APIs and command-line tools, with built-in best-practice configurations for a wide range of data storage components, greatly reducing learning costs and the probability of error. Its lifecycle management and backup/recovery mechanisms integrate deeply with CI/CD pipelines, supporting the creation and updating of data baselines and the rapid restoration of environments from them. In the Day 2 phase, KubeBlocks provides purpose-built monitoring metrics and alerting rules for each data storage component, forming a complete observability system together with the application components' metrics and making cross-component troubleshooting more efficient. KubeBlocks supports fine-grained auto-scaling policies and can dynamically adjust the topology and replica counts of MySQL, Redis, and MongoDB to scale on demand. For high availability, KubeBlocks offers multi-AZ deployment options, significantly improving the disaster recovery posture of data storage components.
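As a concrete illustration of the Day 2 scaling workflow, the Redis cluster created earlier could be scaled out with kbcli. This is a hedged sketch only: the exact subcommand and flag names differ between kbcli versions, so check kbcli cluster hscale --help (or use an OpsRequest) before relying on it:
# Assumed syntax -- verify with: kbcli cluster hscale --help
kbcli cluster hscale redis-cluster --components redis --replicas 3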
The KubeBlocks solution is also adding support for more types of data storage components and expanding its application scenarios. Beyond popular components such as MySQL, Redis, and MongoDB, it already covers key components such as the InfluxDB time-series database, ZooKeeper for configuration management, and the RabbitMQ and RocketMQ message queues. This allows KubeBlocks to provide complete deployment and O&M capabilities for more microservice scenarios, such as IoT data collection, financial transaction processing, e-commerce recommendation systems, and real-time data analysis.