A brief introduction to Policy-as-Code

Oct 03, 2023

You may have heard the term “Policy-as-Code”, but I am guessing most people don’t know what it means. So let’s start from the beginning.

Policy: A policy is a rule or guideline that describes how to make decisions. For example, a “No Solicitors” sign at a home is a policy. “No dogs allowed in the common area” is a policy.

In the context of a business, these policies are often a little more complex: “No contractors can enter the building before 7 am.” “The server room is off-limits except for authorized personnel.”

And then there’s policy embedded in software. One of the first things that most multi-user applications (e.g. enterprise services or SaaS) need is some sort of authentication. Once the users are authenticated (i.e. once we know who they are), the system then needs to determine what each user is authorized to do. The rules that govern this are known as authorization policy, and they are implemented in software. So here we come to the most basic and least interesting definition: policy-as-code means the organizational rules, implemented in software. Early on, when software needed to enforce organizational policy, it was often literally just that: additional logic in the application or service code that implemented the policy. But embedding this logic directly into the code is challenging to get right, and leads to lots of maintenance problems. So almost immediately, people started looking for a better way.

Using RBAC for Authorization Policy

One common, less fragile form of authorization policy is “Role-Based Access Control”. RBAC is more or less the standard approach to how most organizations define authorization: “If a user has the appropriate role, the software should allow that user to do a set of things”. This approach worked well for many authorization problems for quite a while. You define a corporate policy, “Certain people are allowed to do a certain thing”, and then you determine which roles represent those “certain people”. Later on, when a request comes in to your systems, the system looks at the request, identifies the role(s) associated with the policies that govern that request, and finally determines whether the requesting entity has one of those roles.
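
To make the RBAC flow concrete, here is a minimal sketch in Python. The role names, user names and permission strings are all hypothetical, invented for illustration; a real system would load them from a directory or identity provider rather than hard-coding them.

```python
# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "developer": {"vm:create", "vm:delete"},
    "auditor": {"vm:list"},
}

# Hypothetical user-to-role assignments.
USER_ROLES = {
    "alice": {"developer"},
    "bob": {"auditor"},
}


def is_allowed(user: str, action: str) -> bool:
    """Return True if any role held by the user grants the action."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

Note that the answer is a plain yes or no: the check knows which roles grant an action, and nothing else about the request.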

Over time, however, it has become harder to match policies directly to roles, because as our organizations have grown more sophisticated, in many cases there’s an additional factor. It’s no longer “Can a user do a thing?”, but more “How much of a particular thing can be done by a user?”

For example: In modern cloud-based infrastructure, developers might need to be able to allocate VMs or other resources from the cloud provider, so the developers can perform their work. But if you don’t govern this carefully, the developers might allocate too many resources, or allocate far more expensive resources than they really need. In the exciting era of “Infrastructure-as-Code”, this becomes even more of a challenge — when a developer can create an automated task to allocate 100 expensive VMs, use them for an hour and then tear it all down — this can lead to skyrocketing infrastructure costs without anyone noticing until they get the bill at the end of the month.

RBAC doesn’t really help here: how do you create roles that properly enforce the corporate policy that ‘people with role ABC can allocate four servers of size X, but no more than two servers of size X+1’?
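
One way to see the gap is to write that policy down. The sketch below is an assumption on my part about how such a rule could be expressed: it reads the policy as a per-size cap for each role, and the LIMITS table and within_limits function are hypothetical names. The point is that the decision now depends on the contents of the request, not just on who is asking.

```python
from collections import Counter

# Hypothetical limits: role -> maximum number of servers allowed per size.
LIMITS = {"ABC": {"X": 4, "X+1": 2}}


def within_limits(role: str, requested_sizes: list[str]) -> bool:
    """Check a request (a list of server sizes) against the role's per-size caps."""
    caps = LIMITS.get(role, {})
    counts = Counter(requested_sizes)
    # A size with no explicit cap is treated as not allowed (cap of 0).
    return all(n <= caps.get(size, 0) for size, n in counts.items())
```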

Manual Code Inspection

The short-term answer to this challenge has been quite human-intensive: the ‘infrastructure documents’ that the developers create to request resources have to be approved by other humans before they’re allowed to be executed. This helps contain cost in some ways, but it also creates busywork for the approvers, who could otherwise be adding value.

What if there was a way that these ‘infrastructure documents’ could be analyzed by software, in such a way that the software could make the approval decisions automatically, following the corporate policy? I’m sure you’ve already guessed where I’m going: the software that reviews infrastructure documents and enforces corporate policy about the size of infrastructure requests is an example of ‘Policy-as-Code’. But this is different from the fragile policy-as-code we mentioned before. The great insight behind the newer approach is this: your applications and systems need to know whether an action is allowed by policy, but they don’t have to be responsible for implementing that policy.

Policy-As-Code

The general concept is that a Policy-as-Code solution does four things:

1. It understands the contents and format of an incoming request.
2. It uses “policy code” to understand what policies are associated with the request, and then to determine whether that request is authorized.
3. It responds to the request in a structured and consistent way.
4. It does all of this in a way that allows the policies to change without having to change the ‘business logic’.
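
The four responsibilities above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product works: PolicyEngine, Decision and max_vm_count are hypothetical names, the request is assumed to be JSON with a "kind" field, and denying requests that no policy covers is my own design choice.

```python
import json
from dataclasses import dataclass, field
from typing import Callable

# A policy takes the parsed request and returns a list of violation messages.
# An empty list means the policy is satisfied.
Policy = Callable[[dict], list[str]]


@dataclass
class Decision:
    allowed: bool
    reasons: list[str] = field(default_factory=list)


def max_vm_count(limit: int) -> Policy:
    """Build a hypothetical policy that caps the number of VMs in one request."""
    def check(request: dict) -> list[str]:
        count = len(request.get("vms", []))
        return [f"requested {count} VMs, limit is {limit}"] if count > limit else []
    return check


class PolicyEngine:
    def __init__(self, policies: dict[str, list[Policy]]):
        # Policies are keyed by request kind. They can be replaced without
        # touching the code that calls evaluate().
        self.policies = policies

    def evaluate(self, raw_request: str) -> Decision:
        # Understand the contents and format of the incoming request.
        request = json.loads(raw_request)
        # Find the policies associated with this kind of request.
        applicable = self.policies.get(request.get("kind"), [])
        if not applicable:
            # Assumption of this sketch: deny anything no policy covers.
            return Decision(False, ["no policy covers this request kind"])
        reasons = [msg for policy in applicable for msg in policy(request)]
        # Respond in a structured and consistent way.
        return Decision(allowed=not reasons, reasons=reasons)
```

A caller would construct the engine once, for example PolicyEngine({"vm_allocation": [max_vm_count(4)]}), and pass each request to evaluate(). Raising the limit later means replacing the entry in the policies table; the calling code stays the same.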

Realistically, Policy-as-Code is more correctly called something like Policy-As-Code-via-Modular-Agent-Networks, but that acronym might get you sued.

I gave you an example of using Policy-as-Code to manage infrastructure documents, but Policy-as-Code can be used for a lot more than just Kubernetes or Terraform DevOps-style policy enforcement. Consider the modern architectural model of microservices.

Microservices

Microservices are a fairly hot new thing in software architectures. In the most common microservice approach, different domain teams are responsible for implementing a set of services in their own preferred way, often using their own preferred language and infrastructure. This provides a number of benefits: the experts in the various domains get to decide how they solve their own problems, and as long as the solutions are robust and fast, none of the other service teams care how it’s implemented.

One of the interesting challenges in the microservice world is the way services start to talk to each other, especially as their capabilities increase over time. For example, when ‘Microservice A’ requests something from ‘Microservice B’, how do we know whether the request is allowed? RBAC is one way, but again, RBAC struggles with questions of quantity, expense, time, or any other more sophisticated policy. “The microservice developers can implement the quantity enforcement in the service code” is pretty much the standard answer these days, which basically harkens back to the era before RBAC, when the policy logic was directly embedded into the service’s business logic. And that is a disaster waiting to happen, for one obvious reason.

What if the policy changes?

When a policy changes, every piece of software that implements that policy has to be reviewed. And if that policy change affects many domains across the organization? Now you have to get several different domain development teams to schedule the change. You have to get that change tested, verified and deployed, and, except in the most mature of organizations, you have to hope that the change doesn’t break some pre-existing workflow. You’re looking at quarters, if not years, before your policy change is properly implemented.

In a world where Policy-as-Code (PaC) solutions provide the policy enforcement for your microservices, this doesn’t have to happen. In this world, the services don’t do any hard-coded authorization. They pass all authorization requests off to a PaC system, which looks at the request, applies the appropriate policy rules, and approves or denies it. If the policy changes, you change the rule in the PaC system, one time, in one place. All of the services pick up the change at the same time, with little or no involvement from the domain teams.
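
Here is a small sketch of that arrangement, under some loud assumptions: PolicyService is an in-process stand-in for what would normally be a separate PaC system reached over the network, and the service names, action strings and limits are all hypothetical. What it shows is that the services ask the question but never contain the rule.

```python
class PolicyService:
    """In-process stand-in for a central PaC system (hypothetical)."""

    def __init__(self, limits: dict[tuple[str, str], int]):
        # (caller, action) -> maximum quantity allowed per request
        self.limits = limits

    def authorize(self, caller: str, action: str, quantity: int) -> bool:
        # Assumption of this sketch: anything without a rule is denied.
        limit = self.limits.get((caller, action))
        return limit is not None and quantity <= limit


class InventoryService:
    def __init__(self, pac: PolicyService):
        self.pac = pac

    def reserve(self, caller: str, quantity: int) -> str:
        # No limits are hard-coded here; the service only asks for a decision.
        if not self.pac.authorize(caller, "inventory:reserve", quantity):
            return "denied"
        return f"reserved {quantity}"


class ShippingService:
    def __init__(self, pac: PolicyService):
        self.pac = pac

    def schedule(self, caller: str, quantity: int) -> str:
        if not self.pac.authorize(caller, "shipping:schedule", quantity):
            return "denied"
        return f"scheduled {quantity}"
```

If the organization later decides that the hypothetical "orders" caller may reserve 50 items instead of 10, the only change is the entry in the limits table. Neither InventoryService nor ShippingService is edited or redeployed.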

Note: I’m a software architect by trade, and if I were in the shoes of another architect, reading the paragraph above, I’d have a bunch of architectural concerns. Discussing those is beyond the scope of this document, so I’ve produced an essay about the operational aspects of Policy-as-Code:

Architectural Concerns with Policy-as-Code

John Brothers

October 3, 2023

APIs

Even if you don’t use a lot of microservices, if you provide APIs for third parties to interact with your business, Policy-as-Code can be used as the authorization mechanism. As with microservices, this allows you to develop and manage your policies in one place, rather than distributing the implementation and maintenance burden to the API teams. If you want to let certain types of users access more of your resources, why does the development team need to be involved?

Compliance

You have policies. And you have auditors asking you “How do you know that your policies are being enforced properly by your applications?” In a pre-Policy-as-Code world, this is often frustrating, and sometimes near-impossible. In a PaC world, you show them (the auditors) the policies, you show them the policy code, and you show them that the applications use the PaC system for authorization. How much of your organization’s time have you just freed up?

Rules Engines

There are examples of businesses using Policy-as-Code solutions for more sophisticated policy analysis, such as determining whether an applicant is approved for a loan. If you think about it, the rules in a Rules Engine are just another example of organizational policy, and in some cases, perhaps many, using a Policy-as-Code solution to implement those rules can be a significant improvement over other approaches.
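
As a rough illustration of the loan example, the sketch below treats each business rule as a named check that lives in a table, separate from the code that evaluates it. Everything here is hypothetical: the Applicant fields, the rule descriptions and especially the thresholds are invented for the example and are not real lending criteria.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    credit_score: int
    annual_income: float
    requested_amount: float


# Each rule is a (description, predicate) pair. The thresholds are invented
# for illustration only.
RULES = [
    ("credit score of at least 650", lambda a: a.credit_score >= 650),
    ("loan no larger than half of annual income",
     lambda a: a.requested_amount <= 0.5 * a.annual_income),
]


def evaluate_loan(applicant: Applicant) -> tuple[bool, list[str]]:
    """Return (approved, descriptions of the rules that failed)."""
    failed = [desc for desc, passes in RULES if not passes(applicant)]
    return (not failed, failed)
```

Because the decision comes back with the list of failed rules, the same structure can explain a denial as well as make it.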

Wrapping Up

I hope this essay has given you a better understanding of what Policy-As-Code is, and some of the ways it can help your organization adapt and evolve with the ongoing changes to software, infrastructure and APIs. If I can leave you with one takeaway, perhaps it should be this:

Organizational policy implementation and enforcement should not be embedded directly into your organization’s software and development teams.

Operations teams should be able to make the decisions about operational policy. Product teams should make the decisions about product/customer policy. And security teams should make the decisions about compliance and security policy.

Thank you for your time

If you have questions, think I’ve made too many weak assertions, or you would just like to discuss further, please feel free to contact us at: info@paclabs.io.
