Yuriy Yunikov on Engineering, February 13, 2019

Building a fine-grained permission system in a distributed environment: Architecture

At Very Good Security (VGS), our seasoned engineering team works hard to solve complex technical challenges while keeping security our top priority. One such challenge is controlling access to the resources we store in the system.

Problem

Consider a real-world scenario: a document management system. Each document can have multiple collaborators with different permission levels. Your user account has access to multiple documents, and on each of them you have a different permission level: read, write or admin (managing other users’ access to the document). This structure is similar to how sharing permissions work in Google Docs.

Let’s take a look at a simple, specific example: Does Alice have access to “read” document #123?
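As a minimal sketch of the question being asked, the sharing model can be pictured as a per-document map from user to permission level. The names `PERMISSIONS`, `LEVELS` and `has_access` are illustrative, and the ordering of levels (admin implies write, write implies read) is an assumption that the scenario above does not state:

```python
# Hypothetical in-memory model: document id -> {user -> permission level}.
PERMISSIONS = {
    "123": {"Alice": "read", "Bob": "admin"},
    "456": {"Alice": "write"},
}

# Assumption: levels are ordered, so "admin" implies "write" implies "read".
LEVELS = {"read": 1, "write": 2, "admin": 3}


def has_access(user: str, document_id: str, required: str) -> bool:
    """Return True if `user` holds at least the `required` level on the document."""
    granted = PERMISSIONS.get(document_id, {}).get(user)
    if granted is None:
        return False  # unknown document or no grant: deny by default
    return LEVELS[granted] >= LEVELS[required]


print(has_access("Alice", "123", "read"))   # True
print(has_access("Alice", "123", "write"))  # False
```

The rest of the post is about answering this same question when the data and the check no longer live inside a single process.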

[Figure: fine-grained permission system]

We need to evaluate access quickly, and the implementation shouldn't be too complicated for other engineers to use.

Let’s also set some technical conditions we want to achieve:

  • There are multiple documents we need to restrict access to
  • Access needs to be restricted on a per-user basis
  • Documents' permissions need to be evaluated by multiple instances of a service
  • Easy to deploy and use
  • The access evaluation mechanism needs to be fast

Existing solutions

Typical authentication and authorization mechanisms are based on OAuth 2.0 / OpenID Connect, but these protocols are not designed to solve this type of problem. OAuth 2.0 is a delegation protocol: it lets a resource owner grant a client application limited access to a resource, rather than deciding which user may do what to which resource. OpenID Connect just adds an identity layer on top of OAuth 2.0. To achieve our goal, we need to look beyond these solutions.

Access control lists, RBAC and ABAC concepts were designed to solve these exact problems. If you’re designing a monolithic application, the described problem can be resolved by simply storing all the lists in a single database and evaluating permissions any time the system needs it. When thinking about this problem in terms of a distributed environment, however, things begin to get more complicated.
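For the monolithic case, a minimal sketch could keep the access control list in a single table and answer each check with one query. The table and column names are assumptions for illustration, and an in-memory SQLite database stands in for whatever database the application already uses:

```python
import sqlite3

# In-memory SQLite stands in for the application's single database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE document_acl (
        document_id TEXT NOT NULL,
        user_id     TEXT NOT NULL,
        permission  TEXT NOT NULL,  -- 'read', 'write' or 'admin'
        PRIMARY KEY (document_id, user_id, permission)
    )
    """
)
conn.executemany(
    "INSERT INTO document_acl VALUES (?, ?, ?)",
    [("123", "Alice", "read"), ("123", "Bob", "admin")],
)


def is_allowed(user_id: str, document_id: str, permission: str) -> bool:
    """Return True if an ACL row grants exactly this permission."""
    row = conn.execute(
        "SELECT 1 FROM document_acl "
        "WHERE document_id = ? AND user_id = ? AND permission = ?",
        (document_id, user_id, permission),
    ).fetchone()
    return row is not None


print(is_allowed("Alice", "123", "read"))   # True
print(is_allowed("Alice", "123", "admin"))  # False
```

This stays simple only as long as one application owns both the table and the check.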

It is very difficult to manage decision criteria and policies when access control logic is embedded into each service. Each service needs to be updated whenever a policy changes, and the evaluation logic needs to be shared among different applications. Since services can be written in different languages, you’d need to maintain a library for each language. Decoupling access decisions into a separate service allows policies to be updated just once while affecting all clients simultaneously. It also makes the mechanism language independent.

Looking at the standards that address such problems, User-Managed Access (UMA), which is built on top of OAuth 2.0, and XACML, an OASIS standard, are the first that come to mind. The purpose of UMA, in its specification, is defined as:

"enable a resource owner to control the authorization of data sharing and other protected-resource access made between online services on the owner's behalf or with the owner's authorization by an autonomous requesting party".

UMA doesn’t define the policy format, but instead defines the communication mechanism. The benefit of having UMA in place is that it’s compatible with OAuth 2.0. However, combining the regular OAuth Authorization Grant Flow with UMA is complicated. Neither the “OAuth dance” nor the “UMA dance” is easy. Take a look at the diagrams of the full UMA flow: if you’re not familiar with it, it can be painful to implement or even to use.

Implementing XACML and storing its policies is also a complicated task. Even a simple XACML request looks similar to this one:


<xacml-ctx:Request ReturnPolicyIdList="true" CombinedDecision="false" xmlns:xacml-ctx="urn:oasis:names:tc:xacml:3.0:core:schema:wd-17">
   <xacml-ctx:Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:action" >
      <xacml-ctx:Attribute AttributeId="actionId" IncludeInResult="true">
         <xacml-ctx:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">view</xacml-ctx:AttributeValue>
      </xacml-ctx:Attribute>
   </xacml-ctx:Attributes>
   <xacml-ctx:Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:resource" >
      <xacml-ctx:Attribute AttributeId="resource-id" IncludeInResult="true">
         <xacml-ctx:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">doc#123</xacml-ctx:AttributeValue>
      </xacml-ctx:Attribute>
   </xacml-ctx:Attributes>
   <xacml-ctx:Attributes Category="urn:oasis:names:tc:xacml:1.0:subject-category:access-subject" >
      <xacml-ctx:Attribute AttributeId="user.identifier" IncludeInResult="true">
         <xacml-ctx:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">Alice</xacml-ctx:AttributeValue>
      </xacml-ctx:Attribute>
   </xacml-ctx:Attributes>
</xacml-ctx:Request>

As you can see, XACML is verbose, and even its JSON profile doesn’t help much.
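To make that verbosity concrete: the request above carries only three values (action, resource and subject). A small sketch using Python's standard `xml.etree.ElementTree` extracts them; the `summarize` helper is illustrative and is not part of any XACML library:

```python
import xml.etree.ElementTree as ET

# Namespace of the XACML 3.0 request shown above.
NS = {"x": "urn:oasis:names:tc:xacml:3.0:core:schema:wd-17"}

REQUEST_XML = """
<xacml-ctx:Request ReturnPolicyIdList="true" CombinedDecision="false" xmlns:xacml-ctx="urn:oasis:names:tc:xacml:3.0:core:schema:wd-17">
   <xacml-ctx:Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:action" >
      <xacml-ctx:Attribute AttributeId="actionId" IncludeInResult="true">
         <xacml-ctx:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">view</xacml-ctx:AttributeValue>
      </xacml-ctx:Attribute>
   </xacml-ctx:Attributes>
   <xacml-ctx:Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:resource" >
      <xacml-ctx:Attribute AttributeId="resource-id" IncludeInResult="true">
         <xacml-ctx:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">doc#123</xacml-ctx:AttributeValue>
      </xacml-ctx:Attribute>
   </xacml-ctx:Attributes>
   <xacml-ctx:Attributes Category="urn:oasis:names:tc:xacml:1.0:subject-category:access-subject" >
      <xacml-ctx:Attribute AttributeId="user.identifier" IncludeInResult="true">
         <xacml-ctx:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">Alice</xacml-ctx:AttributeValue>
      </xacml-ctx:Attribute>
   </xacml-ctx:Attributes>
</xacml-ctx:Request>
"""


def summarize(xml_text: str) -> dict:
    """Map each category's short name to the attribute value it carries."""
    root = ET.fromstring(xml_text.strip())
    summary = {}
    for attributes in root.findall("x:Attributes", NS):
        # Keep only the last URN segment, e.g. "action" or "resource".
        category = attributes.get("Category").rsplit(":", 1)[-1]
        for attribute in attributes.findall("x:Attribute", NS):
            summary[category] = attribute.find("x:AttributeValue", NS).text
    return summary


print(summarize(REQUEST_XML))
# {'action': 'view', 'resource': 'doc#123', 'access-subject': 'Alice'}
```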

Beyond the issues described above, the main problem is that few authorization servers support UMA or XACML, and where they do, the implementation is often incomplete. Moreover, building these types of solutions on your own is complicated.

Another problem is the barrier to entry for engineers who aren’t familiar with these standards. What if you just wanted to build a simple microservice and needed to evaluate permissions like “Does Alice have access to read document #123?” You would need a significant amount of knowledge up front to express even a simple check like the one above.

Luckily, there are other solutions to this problem.

How can we do it better?

In recent years, the service mesh architecture has become popular and is often used alongside Kubernetes. A service mesh relies on the concept of a sidecar: a proxy or a small service that sits in front of your service. What if, as a developer, I could simply put a sidecar in front of my service to evaluate access?

[Figure: access control sidecar]

This looks like a much more scalable solution, which could be applied across multiple services. It would also be fast: because the sidecar is deployed right next to the service, evaluating access requires no long network calls. Another benefit is that the access control sidecar encapsulates all the complex evaluation logic. Development of access control can be separated from the service, which also fits well with how engineering teams are structured: a team that manages identities and access (IAM) can work on the sidecar while another team works on the business logic of the service.

Building such a sidecar on your own would take quite some time, as you’d need to define the right architecture, code it, and make it production ready. However, there’s already a ready-to-use product available, called Open Policy Agent (OPA). It’s open source, small, and fast because it keeps policies and data in memory. With OPA you write your policy in the Rego language, and it would look like this:


package httpapi.authz

# Rego v1 syntax; requires OPA 0.59 or later.
import rego.v1

default allow := false

# Is Alice allowed to read the document with id 123?
allow if {
  input.method == "GET"
  input.path == ["document", "123"]
  input.user == "Alice"
}

Writing a policy for each user in the system would be too cumbersome. That’s why the concept of data exists in OPA. Data is a simple JSON file that can be read by the policy. Here is an example of documents.json:

[
  {
    "id": "123",
    "users": [
      {
        "id": "Alice",
        "permission": "read"
      }
    ]
  },
  ...
]

A more generic policy would look similar to this one:


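package httpapi.authz

# Rego v1 syntax; requires OPA 0.59 or later.
import rego.v1

import input as http_api

# Assumes the contents of documents.json are loaded under data.documents.
import data.documents

default allow := false

# Is the user allowed to read the document with document_id?
allow if {
  http_api.method == "GET"

  # Unification binds document_id to the second path segment.
  http_api.path = ["document", document_id]

  # Iterate over every document and every user listed on it.
  document := documents[_]
  document_user := document.users[_]

  document.id == document_id
  document_user.id == http_api.user
  document_user.permission == "read"
}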

As you can see, it certainly looks more readable than XACML. The only thing to learn here is how to write policies in Rego. Reading the documentation for 10-20 minutes should give you enough understanding to write simple policy files.
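For readers new to Rego, the generic rule can be read as a search for one matching (document, user) pair. The following is only a rough plain-Python paraphrase of that logic, not how OPA evaluates policies; the function name and the inline sample data are illustrative:

```python
# Sample data in the same shape as documents.json.
DOCUMENTS = [
    {"id": "123", "users": [{"id": "Alice", "permission": "read"}]},
]


def allow(request: dict, documents: list) -> bool:
    """Paraphrase of the Rego rule: True if any document/user pair matches."""
    if request.get("method") != "GET":
        return False
    path = request.get("path", [])
    # Mirrors `path = ["document", document_id]`: exactly two segments.
    if len(path) != 2 or path[0] != "document":
        return False
    document_id = path[1]
    return any(
        document["id"] == document_id
        and user["id"] == request.get("user")
        and user["permission"] == "read"
        for document in documents
        for user in document["users"]
    )


alice = {"user": "Alice", "path": ["document", "123"], "method": "GET"}
bob = {"user": "Bob", "path": ["document", "123"], "method": "GET"}
print(allow(alice, DOCUMENTS))  # True
print(allow(bob, DOCUMENTS))    # False
```

The Rego version expresses the same search declaratively: every line in the rule body is a condition, and the rule holds if some combination of document and user satisfies all of them.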

The service-side implementation is also trivial. From the service’s point of view, policy evaluation is a simple HTTP call that returns an allow/deny response:


curl -X POST -H 'Content-Type: application/json' -d '{"input":{"user": "Alice", "path": ["document", "123"], "method": "GET"}}' localhost:8181/v1/data/httpapi/authz

{
  "result": {
    "allow": true
  }
}

This can be coded as an HTTP filter, which would evaluate each request to the service.
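One way such a filter could look is a small WSGI middleware, sketched here under several assumptions: OPA is reachable on localhost:8181 through its Data API under `/v1/data/...`, the caller has already been authenticated, and the user name arrives in a hypothetical `X-User` header. The names `query_opa` and `AuthzMiddleware` are illustrative:

```python
import json
import urllib.request

# Assumption: the OPA sidecar listens locally and serves the httpapi.authz package.
OPA_URL = "http://localhost:8181/v1/data/httpapi/authz"


def query_opa(opa_input: dict, url: str = OPA_URL) -> bool:
    """POST the input document to OPA and return the `allow` decision."""
    body = json.dumps({"input": opa_input}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=1) as response:
        result = json.load(response).get("result", {})
    return result.get("allow", False) is True


class AuthzMiddleware:
    """WSGI filter that asks `decide` before passing a request to the app."""

    def __init__(self, app, decide=query_opa):
        self.app = app
        self.decide = decide

    def __call__(self, environ, start_response):
        opa_input = {
            # Assumption: an upstream layer authenticated the caller and set
            # this (hypothetical) header with the user name.
            "user": environ.get("HTTP_X_USER", ""),
            "path": [p for p in environ.get("PATH_INFO", "").split("/") if p],
            "method": environ.get("REQUEST_METHOD", ""),
        }
        try:
            allowed = self.decide(opa_input)
        except (OSError, ValueError):
            allowed = False  # fail closed if the sidecar is unreachable or replies with garbage
        if not allowed:
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return self.app(environ, start_response)
```

Wrapping an existing WSGI application is then a single line, `app = AuthzMiddleware(app)`. The `decide` parameter exists so the filter can be exercised in tests without a running OPA instance, and denying on errors is a design choice for this sketch rather than something OPA prescribes.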

Summary

Solving resource permission problems is not easy. Existing standards are complicated (such as User-Managed Access, which is built on top of OAuth 2.0) or verbose (such as XACML), and there are not many production-ready implementations of them.

Access control sidecars scale well in a distributed environment and simplify usage by separating access control logic at the deployment level. This approach can be implemented with Open Policy Agent (OPA), which provides everything needed. Policies are less verbose than in XACML, and the application developer doesn’t need to focus on sophisticated access control logic.

This post still doesn’t cover a few open questions related to the described architecture:

  • Where should you store and manage data files?
  • How would this scenario work, and how would the data become eventually consistent, if multiple services share the same data source?
  • Where should you store and manage policy files?
  • What needs to be changed in the code or the deployment of the service itself?

Stay tuned for updates on the Very Good Security Blog.
