Protect What Matters Most: The Data — Part 2, Data-Centric Security

Welcome to Part 2 of Protect What Matters Most: The Data. If you didn’t catch Part 1, you can find it here.

A while back, my colleague Gerry Gebel and I delivered a webinar on the very topic of data-centric authorization. Gerry started off by reviewing existing techniques. In reality, most data-centric solutions out there focus on securing the container rather than the data itself. These techniques relate to database security, such as:

  • Default accounts
  • Users and roles
  • Exposed passwords
  • Patching
  • Privileges and permissions
  • Parameter settings
  • Password management
  • Profiles
  • Auditing
  • Listener security

But what we really want to do is focus on data security, i.e. the ability to dynamically filter and mask data:

  • Filter entire data entries (rows)
  • Mask entire columns
  • Mask / tokenize data inside cells

This is exactly what data-centric security focuses on, and it is how our dynamic authorization products for data, ADAF MD and SmartGuard, work.

Both products have a similar architecture. The key difference lies in their focus. ADAF MD focuses on traditional relational databases, whether on-premises or in the cloud (Oracle, MS SQL…). SmartGuard focuses on Big Data systems (Apache Spark, Hadoop, Databricks…).

Architecture

At the core of the products lies what is known as the SQL Proxy, an interceptor similar to what API Gateways are for APIs. It intercepts the SQL statements, processes them, and eventually sends them to the backend database. The proxy is transparent to both the calling application and the protected database.

One of the steps the SQL Proxy can perform is to invoke Axiomatics’ authorization service to retrieve a set of filters that must be applied to the original statement. The SQL Proxy then rewrites the original statement using those filters, so that only the modified statement is passed to the backend and only entitled data is retrieved from it.
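To make the rewrite step more concrete, here is a minimal sketch of the idea in Python. It is not how the SQL Proxy is implemented. The function names (fetch_filters, rewrite, proxy_execute), the accounts table, and the region rule are all made up for illustration, and an in-memory SQLite database stands in for the protected backend. The sketch wraps the original statement in a subquery and applies the filters on the outside, which avoids having to parse the SQL.

```python
import sqlite3


def fetch_filters(user):
    """Hypothetical stand-in for the call to the authorization service.

    Returns SQL conditions plus their bound parameters. The rule used here
    (users only see rows from their own region) is an invented example.
    """
    return ["region = ?"], [user["region"]]


def rewrite(statement, filters):
    """Wrap the original statement so the filters apply to its result set."""
    if not filters:
        return statement
    return f"SELECT * FROM ({statement}) AS entitled WHERE " + " AND ".join(filters)


def proxy_execute(conn, user, statement):
    """Intercept a statement, rewrite it, and only then send it to the backend."""
    filters, params = fetch_filters(user)
    return conn.execute(rewrite(statement, filters), params).fetchall()


# Toy backend standing in for the protected database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [(1, "EU"), (2, "US"), (3, "EU")]
)

alice = {"name": "Alice", "region": "EU"}
rows = proxy_execute(conn, alice, "SELECT id, region FROM accounts")
print(sorted(rows))  # [(1, 'EU'), (3, 'EU')]
```

The calling application still sends its original statement, and only the rewritten one reaches the database. Note that this naive wrapping only works when the filtered column is part of the original projection; a real proxy would need to understand the statement rather than wrap it blindly.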

The diagram depicts the information flow between the client and the backend.


Preemptive Authorization

I like to mention that this model achieves preemptive authorization. What I mean is that the data retrieved from the database is the set of data the user is allowed to access. There is no need for further filtering or masking. Whatever comes out of the DB is what the user can work with.

Dynamic Data Filtering

One of the terms I use is Dynamic Data Filtering. This refers to the fact that your DBAs do not need to create filters beforehand. The filters are dynamically generated based on policy. All you have to do is write a policy e.g.

  • A user with role customer representative can approve a transaction if the amount is less than their approval limit and if the owner of the transaction is not the user (segregation of duty check).

ADAF MD and SmartGuard will be capable of:

  1. Retrieving the necessary metadata e.g. the user’s role and approval limit
  2. Generating SQL filters based on the policy and the metadata.

The result will be akin to: SELECT * FROM transactions WHERE amount < 5000 AND owner NOT IN ('Alice').
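The two steps above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the policy is hard-coded in a hypothetical build_filter function rather than evaluated by a policy engine, the user attributes come from a plain dictionary, and the transactions table and its rows are invented sample data in an in-memory SQLite database.

```python
import sqlite3


def build_filter(user):
    """Translate the approval policy into a WHERE clause for this user.

    Returns (clause, params), or None when the role check fails.
    """
    if "customer representative" not in user["roles"]:
        return None
    # Amount below the user's approval limit, and the user is not the owner
    # (segregation of duty check).
    return "amount < ? AND owner NOT IN (?)", [user["approval_limit"], user["name"]]


def approvable_transactions(conn, user):
    """Run the filtered query; a user without the role gets no rows."""
    result = build_filter(user)
    if result is None:
        return []
    clause, params = result
    sql = f"SELECT id, amount, owner FROM transactions WHERE {clause} ORDER BY id"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, amount INTEGER, owner TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, 1200, "Bob"), (2, 7000, "Bob"), (3, 300, "Alice")],
)

# Metadata that would normally be retrieved at runtime (step 1).
alice = {"name": "Alice", "roles": ["customer representative"], "approval_limit": 5000}
auditor = {"name": "Carol", "roles": ["auditor"], "approval_limit": 0}

rows = approvable_transactions(conn, alice)
print(rows)  # [(1, 1200, 'Bob')]
```

Transaction 2 is dropped because it exceeds Alice’s limit, and transaction 3 is dropped because Alice owns it. Nobody had to define that filter ahead of time; it falls out of the policy and the user’s attributes.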

Dynamic Data Masking, Anonymization, and Tokenization

Filtering is only the first step towards comprehensive data-centric security. Sometimes specific use cases call for masking or tokenization instead. For instance, PCI-DSS requires that credit card numbers not be displayed in full:

“3.3 Mask PAN when displayed; the first six and last four digits are the maximum number of digits you may display. Not applicable for authorized people with a legitimate business need to see the full PAN. Does not supersede stricter requirements in place for displays of cardholder data such as on a point-of-sale receipt.” (Source: www.pcisecuritystandards.org)

Similarly, several laws dictate how social security numbers are to be processed. By using dynamic data masking, we can tell the SQL Proxy to query the database in such a way that only the last four digits of an SSN are returned. The process is exactly the same as for dynamic data filtering. The policy is augmented with additional statements, e.g.:

  • A user with role customer representative can approve a transaction if the amount is less than their approval limit and if the owner of the transaction is not the user (segregation of duty check).
    • Mask the credit card number and only show the last 4 digits

Data can be:

  • Masked e.g. replaced with a NULL value or default value
  • Tokenized
  • Encrypted / decrypted – so long as the target database can handle that functionality
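As a rough illustration of the “only show the last 4 digits” rule, the sketch below rewrites the column list of a query so that the masking happens inside the database. Everything here is an assumption made for the example: the MASKS mapping, the masked_projection helper, the card_number column, and the use of SQLite’s substr function and || operator as the masking expression. Other databases would need their own dialect.

```python
import sqlite3

# Hypothetical masking rules: column name -> SQL expression that replaces it.
# substr(x, -4) returns the last four characters in SQLite.
MASKS = {"card_number": "'************' || substr(card_number, -4)"}


def masked_projection(columns, masks):
    """Build a SELECT list in which masked columns are swapped for expressions."""
    return ", ".join(f"{masks[c]} AS {c}" if c in masks else c for c in columns)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, owner TEXT, card_number TEXT)")
conn.execute("INSERT INTO transactions VALUES (1, 'Bob', '4111111111111111')")

select_list = masked_projection(["id", "owner", "card_number"], MASKS)
rows = conn.execute(f"SELECT {select_list} FROM transactions").fetchall()
print(rows)  # [(1, 'Bob', '************1111')]
```

Because the expression is evaluated by the database, the full card number never leaves the backend for this user, which is in keeping with the preemptive model described earlier.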

What does the future hold?

So far, Axiomatics has been successful at applying dynamic authorization to relational databases and emerging data stores. We are looking into expanding to other databases / flavors / dialects. We just released SmartGuard for Data – Spark SQL Edition. This version works with Apache Spark anywhere it’s installed (in the cloud or on-prem) and works well with tools like Databricks, Cloudera and more. This opens the door to fine-grained authorization on resources such as AWS S3. But that’s not all. The same principle could be applied to non-SQL sources such as Elasticsearch. The sky’s the limit. What’s your flavor du jour?


