Securing Microservices & APIs in a Zero Trust Environment

Sep 26, 2023

So you’re building a Zero Trust solution, and you want to lock down the microservices…

If the development team keeps raising any of the following objections, you might want to look at a Policy-as-Code solution. It might be able to help:

Your Zero Trust architecture will make our microservices much less robust
Your Zero Trust architecture will be too expensive at scale
Your Zero Trust architecture will make our microservices much too slow
We don’t have time to be implementing all those zero-trust gateways in our microservices
Your Zero Trust changes are not a priority for the development team
All the roles and capabilities you keep adding to support Zero Trust are causing havoc in the RBAC infrastructure
All of our services are in <Environment K>, and your Zero Trust solution needs to support that

Policy-as-Code is not just about managing infrastructure

That’s probably where most people have heard of it: creating rules about how Terraform deploys new infrastructure. And Policy-as-Code can do that. But some of the implementations can do a lot more, including acting as a Policy Decision Point in your Zero Trust architecture. Here are the two most well-established ones:

AWS Cedar
Open Policy Agent

(Note: if you’re building a general-purpose Policy-as-Code solution that I haven’t mentioned, please let me know!)

AWS Cedar

AWS Cedar is a policy-as-code solution that is (you guessed it) focused on AWS environments. Cedar is a language built specifically for managing Policy-Based Access Control (PBAC). If you have problems with the growing complexity of your RBAC implementation, AWS Cedar might be a great fit. Permit.io provides a containerized solution for Cedar, so you can use it in a variety of places (Kubernetes, etc.). But note: it is focused on answering a yes/no question about access by Principals to Resources. It doesn’t (yet) provide much capability beyond that.

Open Policy Agent (OPA) & Rego

OPA is the most general of the PaC solutions, and also the one with the best overall design. OPA is an open source agent, implemented in Go, which can be run as a standalone executable, as a sidecar in Kubernetes, or as a library you embed in your existing Go services. (This richness in deployment options generally addresses the concerns about environments.)

But the advantages go beyond just environmental flexibility:

You can put the agent very close to the microservices.

Let’s say the microservices are in Kubernetes — you can add OPA as a sidecar. The calls from your microservices will be quite fast, and there’s no single point of failure. (I mean, obviously if the sidecar somehow crashes, that’s a brief outage, but K8S can manage that easily). This generally addresses the concerns of robustness and speed.

You can deploy them everywhere at minimal cost

The agents are free, the management channel is not particularly chatty, and the memory and CPU footprint is quite low. You’re not looking at a substantial performance hit, except perhaps in the most exceptionally resource-starved implementations. This generally addresses the concerns about cost.

The development team just has to implement an HTTP call to get a decision

OPA expects all of its queries to be in JSON format, sent over HTTP/HTTPS. This is well supported by almost all languages and platforms at this point. And by pushing the decision to OPA, you’re freeing up the developers from having to implement this policy logic in their native code. It’s a win-win! What’s more, if you need to adjust your policies, in many cases that can be done without touching a line of the microservice code. Let the security policy developers focus on the security issues, and let the business developers focus on business value. This addresses the concerns about the priorities of the development team, and also should mute some of the concerns about gateway implementations.
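
To give a feel for how small that HTTP call is, here is a minimal Python sketch against OPA’s Data API (POST to /v1/data/<path> with the query under an "input" key). The address localhost:8181, the policy path httpapi/authz/allow, and the shape of the input document are all assumptions for illustration, not something your deployment will necessarily match. The sketch fails closed: any error or undefined result is treated as a deny.

```python
import json
import urllib.request

# Assumptions for illustration: OPA runs as a sidecar on localhost:8181 and the
# policy exposes a boolean rule at the hypothetical path "httpapi/authz/allow".
OPA_URL = "http://localhost:8181"
POLICY_PATH = "httpapi/authz/allow"


def build_request(input_doc, base_url=OPA_URL, policy_path=POLICY_PATH):
    """Build a POST to OPA's Data API; the query goes under the "input" key."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/data/{policy_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_decision(raw_body):
    """Return True only when OPA answers {"result": true}.

    OPA omits "result" when the rule is undefined, so that case is a deny.
    """
    doc = json.loads(raw_body)
    return isinstance(doc, dict) and doc.get("result") is True


def is_allowed(input_doc, timeout=0.5):
    """Ask OPA for a decision, failing closed on network or parse errors."""
    try:
        request = build_request(input_doc)
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return parse_decision(resp.read())
    except (OSError, ValueError):
        return False
```

A microservice would then guard a handler with something like is_allowed({"user": "alice", "method": "GET", "path": ["orders", "42"]}), where the field names are whatever your policy expects.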

Policy-based Access Control is a new way to think about authorization and access

One of the challenges we’ve seen is the “explosion” of RBAC configurations. As you add new functionality to your system, you are obliged to add more and more roles and capabilities. If you need to support limits (for example, purchase order approval limits), that can make the RBAC system awkward and push more responsibilities back onto the developers.

With a policy-based solution, much of that can go away. These policy rules can be implemented in more “normal” ways, such as “Anyone with the purchase order approval role who has been with the organization for 5 years can approve up to <X> dollars.” That’s just one example of the type of ‘intuitive’ access control you can get from policy logic, instead of role membership.
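
To make the purchase order example concrete, here is the same rule sketched as a plain Python function. In a real OPA deployment this logic would live in a Rego policy rather than in service code; Python is used here only to show the shape of the decision. The role name po_approver, the Employee fields, and the limit passed in by the caller are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Tenure threshold taken from the example rule in the text.
MIN_TENURE_YEARS = 5


@dataclass
class Employee:
    roles: frozenset   # hypothetical role names, e.g. {"po_approver"}
    hire_date: date


def years_of_service(emp, today):
    """Whole years between the hire date and today."""
    years = today.year - emp.hire_date.year
    # Subtract one if this year's anniversary has not happened yet.
    if (today.month, today.day) < (emp.hire_date.month, emp.hire_date.day):
        years -= 1
    return years


def can_approve(emp, amount, limit, today):
    """Role membership is one condition among several, not the whole answer."""
    return (
        "po_approver" in emp.roles
        and years_of_service(emp, today) >= MIN_TENURE_YEARS
        and amount <= limit
    )
```

Notice that no new role was needed to express the tenure or dollar conditions. Changing the threshold or the limit is a policy change, not a change to the role model.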

Rich functionality to support Zero Trust objectives

Rego (the policy language) is designed specifically for processing rules, and its declarative, functional-style model makes it faster and safer than procedural options. There is no chance of an infinite loop, every decision can be logged, and you have out-of-the-box support for TLS-encrypted communications, bearer tokens, cryptographic hashes, JWT parsing and certificate parsing. People complain about Rego being complex, but it’s not really the Rego that’s complex. It’s the classic challenge of writing functional code in a readable style, and that’s something that can be addressed with a modest amount of training.

Tools that manage large-scale OPA deployments already exist

Both Styra DAS and Permit.io’s OPAL are management tools that can help you keep track of fleets of agents. DAS is a proprietary solution, built by the creators of OPA, that is available as SaaS or on-premises. OPAL is an open-source management tool that can also be used in a (subscription) SaaS model or on-premises. Both can be integrated into monitoring solutions. This partially addresses concerns about the complexity of deploying and managing the policy-as-code infrastructure.

Wrapping Up

If any of this has resonated with you and your Zero Trust journey, you really ought to take a look at Policy-as-Code. Many of the common implementation challenges the development team will encounter in the Zero Trust process are already well-supported by Rego. If you can implement your Zero Trust initiatives while taking the development teams off of the critical path, that seems like a win. If you want to talk about this at more length, feel free to contact me at: johnbr@paclabs.io

