
Service Mesh and Edge Computing - Considerations

Image Credit: Photonphoto/Bigstockphoto.com

How are Service Mesh & Edge Computing Related? Can they help each other?

Figure 1: Latency Sensitivity Across various Edges (source)

With latency being the primary driver for edge computing, applications and services need to be developed to leverage the capabilities of the edge compute platform. These applications are expected to be cloud native and microservices based, leveraging helper services available on the platform while providing secure multi-tenant configurations. The partitioning of functionality must leverage resource awareness in the platform and facilitate performance, scalability, and security. Mobility and federation will be important considerations, allowing use of services across multiple MEC domains.

Another success factor will be portability across a range of device capabilities, some with hardware acceleration and some without. The paradigm of edge-native applications clearly describes the characteristics developers must build into applications in order to satisfy service offerings at the edge [1]. The reality, however, is proving far more complex, as application writers must satisfy a plethora of requirements [2].

Where Kubernetes CNI is limited...

Microservice architectures decouple applications from maintaining and managing infrastructure operations, so that applications perform only the required business logic. Infrastructure tasks such as moving data, health checks, and rate limiting are being decoupled from orchestrators like Kubernetes and taken up by service meshes using sidecar proxies, forming a software-defined data plane for microservices.

While Kubernetes has become the orchestrator of choice for microservices, data plane networking has evolved with the advent of Container Network Interface (CNI) plugins such as Calico, Cilium, and OVN4NFV. Applications running as microservices still perform routine infrastructure management tasks such as rate limiting, health checking, L4-to-L7 connection management, and TLS termination, which each microservice must replicate.
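To make that duplication concrete, here is a minimal Python sketch (all names and parameters invented for illustration) of the retry and rate-limiting boilerplate that each microservice must otherwise carry itself; this is exactly the kind of logic a sidecar proxy can absorb:

```python
import time

class RateLimiter:
    """Tiny token-bucket rate limiter: boilerplate each service
    must otherwise reimplement when no mesh is present."""
    def __init__(self, rate_per_sec, burst):
        self.rate, self.capacity = rate_per_sec, burst
        self.tokens, self.last = burst, time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens based on elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

def call_with_retries(fn, attempts=3, backoff=0.1):
    """Retry with exponential backoff: another infrastructure
    concern a sidecar proxy typically takes over."""
    for i in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if i == attempts - 1:
                raise
            time.sleep(backoff * 2 ** i)
```

With a service mesh in place, both behaviors move out of the application and into mesh policy, leaving only the business logic in the pod.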

…Service mesh addresses the needs

Service mesh comes to the rescue, decoupling all of these operations at the networking layer by introducing a sidecar proxy and a reverse proxy. This enables application developers to focus on their business logic while offloading crucial operations to the sidecar proxy [3], [4].

The attributes of edge-aware applications map directly onto the functionality of a typical service mesh, which introduces a sidecar proxy into each application pod. The following attributes of edge-native functions can be satisfied with minimal enabling of a service mesh deployed at the edge:

A. Awareness and discovery:

ETSI MEC specifies that edge applications be registered and discoverable, or have the capability to discover each other's services [6]. Applications can offer an enhanced user experience if network services such as location, QoS, or traffic influence can be used in conjunction with these services.

Figure 2: Service Discovery with Envoy Proxy (source)

The traditional function of a sidecar proxy within a service mesh is to identify the capabilities of its associated service and the set of services it can reach or is allowed to communicate with. Application developers can now leverage the sidecar proxy to discover the set of available services within a cluster, along with additional metadata such as QoS, location, or traffic requirements that the application utilizes at run time. Additional constructs such as API gateways can be used in conjunction with sidecar proxies to scale awareness and discovery services across edge clusters, and handling during network or connectivity failures can be offloaded as well.
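A minimal sketch of metadata-aware discovery, with a hypothetical in-memory registry (the service names, endpoints, and QoS/zone fields are all invented for illustration; a real mesh would serve this from its control plane):

```python
# Hypothetical registry a sidecar might consult: each service carries
# edge metadata (QoS class, edge zone) alongside its endpoint.
REGISTRY = {
    "video-analytics": {"endpoint": "10.0.1.4:8080", "qos": "low-latency", "zone": "access-edge"},
    "object-store":    {"endpoint": "10.0.2.9:9000", "qos": "best-effort", "zone": "network-edge"},
}

def discover(qos=None, zone=None):
    """Return the services whose metadata matches the requested
    QoS class and/or edge zone; None means 'any'."""
    return {
        name: meta for name, meta in REGISTRY.items()
        if (qos is None or meta["qos"] == qos)
        and (zone is None or meta["zone"] == zone)
    }
```

An application can then ask for, say, only low-latency services in its own zone, rather than hard-coding peer endpoints.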

B. Resiliency:

One of the primary tenets of a service mesh is its ability to self-heal and remain resilient across application restarts, heavy load, and unpredictable failure situations [7]. The service mesh control plane leverages a set of user-defined policies to constantly monitor the health of sidecar proxies and application SLAs, rerouting traffic to prevent degradation of overall performance.

Figure 3 : Circuit breaking in a Service Mesh (source)

Much of the logic for health checks, traffic rerouting, and circuit breaking is handled by the mesh control plane, freeing application developers and their applications from carrying their own resiliency logic. This aspect of a service mesh saves significant development, testing, and deployment cycles, letting developers focus on the functionality and services offered.
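The circuit-breaking pattern the mesh applies can be sketched in a few lines of Python (a deliberately minimal illustration, not any particular mesh's implementation; the thresholds are invented):

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: after `max_failures` consecutive
    failures the circuit opens and calls fail fast, protecting the
    struggling backend; after `reset_after` seconds one call is
    allowed through again (half-open)."""
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures, self.reset_after = max_failures, reset_after
        self.failures, self.opened_at = 0, None

    def call(self, fn):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at, self.failures = None, 0  # half-open: try again
        try:
            result = fn()
        except ConnectionError:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result
```

In a mesh, this state machine lives in the sidecar and its thresholds come from control-plane policy, so no application ever writes this code itself.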

C. Scalability

Applications and services at the edge need the ability to scale on demand with traffic and load conditions to satisfy end-user requirements [6]. While this can be a controlled setting in a private-wireless edge deployment, applications and services usually need to scale to satisfy clients' or end devices' requests. A service mesh can meet these requirements by thresholding traffic surges and rerouting traffic to application pods that can handle the increased requests [3]. Service meshes are tightly coupled with orchestrators such as Kubernetes in order to scale services to meet SLAs.
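The threshold-and-reroute idea can be illustrated with a small Python sketch (the backend names, load values, and the 0.8 surge threshold are assumptions for this example, not mesh defaults):

```python
def pick_backend(backends, threshold=0.8):
    """Route to the least-loaded backend whose load is below the surge
    threshold; if every backend is saturated, fall back to the globally
    least-loaded one rather than rejecting the request outright."""
    healthy = [b for b in backends if b["load"] < threshold]
    pool = healthy or backends
    return min(pool, key=lambda b: b["load"])["name"]
```

A real mesh feeds this decision with live metrics from the sidecars and can also signal the orchestrator to add replicas when the whole pool is running hot.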

Figure 4: Global control in Tanzu Service Mesh at Scale

D. Low-latency offloads

Edge compute infrastructure should be elastic enough to scale services and offload them from end devices to the edge environment, or on to cloud compute infrastructure, as needed to satisfy low-latency requirements.

Figure 5: SmartNIC Offloads example (source)

While a service mesh can facilitate interaction between clusters to provide elastic compute, it also provides the ability to utilize hardware offload constructs, such as offloading to a SmartNIC or utilizing acceleration hardware for secure transactions, in order to complete compute-intensive operations within given latency requirements [8]. This can reduce the overall Round Trip Time (RTT) of an end user's request by leveraging hardware offload capabilities.

E. Security and privacy

Protecting the privacy of every end device and end user through secure communication is one of the most important aspects of edge computing. Necessary security network functions are introduced across the edges (IoT edge, on-premises edge, access edge, network edge) to provide secure boundaries across the edge-to-cloud communication continuum. Edge-native applications, however, need to communicate across these boundaries without complicating the application developer's experience while still ensuring low-latency communication. A service mesh can help here by offloading communication security tasks, such as TLS termination and IPsec offloads, to sidecar proxies [9]. The mesh control plane can then scale this across the cluster's services, or among multiple service meshes across clusters. This keeps applications independent of infrastructure scaling requirements while ensuring security in service-to-service interactions.
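The mutual-TLS policy a sidecar enforces on the application's behalf can be sketched with Python's standard `ssl` module (a minimal illustration; the file names are placeholders, and in a real mesh the workload certificates are provisioned and rotated by the mesh's certificate authority, not loaded by the application):

```python
import ssl

def mesh_style_mtls_context(ca_file=None):
    """Build a server-side TLS context that requires the peer to present
    a valid client certificate (mutual TLS), which is the policy a sidecar
    proxy enforces so the application never handles TLS itself."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED          # peer must authenticate too
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # trust only the mesh CA
    # ctx.load_cert_chain("svc.crt", "svc.key")  # placeholder workload cert
    return ctx
```

Because both ends of every service-to-service hop terminate in sidecars holding mesh-issued certificates, the application code stays free of key handling entirely.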

What Do You Do with This?

As the saying goes, "The future is yours for the taking." Edge computing presents enormous opportunities for enabling various service deployment models with service meshes. A few items worth exploring:

  • Service mesh deployment models suitable for the various types of edges (see 4 Types of Edge Computing - Broadly Classified)
  • Latency impact of the sidecar proxy model for microservices customized for edge computing
  • Multi-cluster management with service meshes, as edges consist of a multitude of clusters dispersed across geographic locations
  • FCAPS implications
  • and so on.

As we progress in the evolution toward 6G networks, time will tell if and how service meshes prove impactful for edge computing.

[1] M. Satyanarayanan, G. Klas, M. Silva and S. Mangiante, “The Seminal Role of Edge-Native Applications,” 2019 IEEE International Conference on Edge Computing (EDGE), 2019, pp. 33–40, doi: 10.1109/EDGE.2019.00022.

[2] Shadi A. Noghabi and John Kolb and Peter Bodik and Eduardo Cuervo, “Steel: Simplified Development and Deployment of Edge-Cloud Applications”, 10th {USENIX} Workshop on Hot Topics in Cloud Computing (HotCloud 18), 2018, USENIX.

[3] W. Li, Y. Lemieux, J. Gao, Z. Zhao and Y. Han, “Service Mesh: Challenges, State of the Art, and Future Research Opportunities,” 2019 IEEE International Conference on Service-Oriented System Engineering (SOSE), 2019, pp. 122–1225, doi: 10.1109/SOSE.2019.00026.

[4] K. Balaji, "What Is a Service Mesh and Do You Need One?", MuleSoft

[5] B. Chess, E. Sigler, "Scaling Kubernetes to 7500 Nodes", Research, OpenAI.com

[6] ETSI GS, "Multi-access Edge Computing (MEC); Framework and Reference Architecture", ETSI GS MEC 003 V2.1.1 (2019-01)

[7] R. Chandramouli, Z. Butcher, "Building Secure Microservices-based Applications Using Service-Mesh Architecture", NIST Special Publication 800-204A, https://doi.org/10.6028/NIST.SP.800-204A

[8] Y. Jiang, et al., "Service mesh offload to network devices", USPTO publication number 20210243247

[9] P. Gupta, "Mutual TLS: Securing Microservices in Service Mesh", The New Stack

Author

Sunku Ranganath is a Solutions Architect for Edge Compute at Intel. For the last few years, his area of focus has been on enabling solutions for the Telecom domain, including designing, building, integrating, and benchmarking NFV based reference architectures using Kubernetes & OpenStack components. Sunku is an active contributor to multiple open-source initiatives.  He serves as a maintainer for CNCF Service Mesh Performance & CollectD Projects and participated on the Technical Steering Committee for OPNFV (now Anuket). He is an invited speaker to many industry events, authored multiple publications and contributed to IEEE Future Networks Edge Service Platform & ETSI ENI standards.  He is a senior member of the IEEE.
