Read contexts
The read context is the most important aspect of Antimatter's data control functionality. As the name implies, it is intended to capture all the nuances of a context in which data is read. The reasons you may create a read context are up to you, but commonly they are used to capture:
- Different teams that need to see a different subset of the data (e.g. data science vs fraud vs customer support)
- Different use cases that may require different transformations of the data (e.g. model training vs report generation)
- A use case specifically called out in your Data Processing Agreement
A read context lets you:
- Control who is allowed to access which data for a particular reason (access control)
- Create an audit trail with an explicit reason-for-access
- Keep your data transformation rules cleanly partitioned
- Delegate management of data transformation rules for a context
A read context consists of two parts:
- A collection of data policies bound to the read context that decide how data is to be treated
- A list of hooks that are required to support the policies. Typically these hooks have already been run as part of the write context, which is the most efficient place to run them. If a read context policy requires a hook that has not yet been run on the data (because the data was written in a write context that did not require that hook), the hook is run just-in-time on the read path.
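The just-in-time behaviour described above amounts to a set difference between the hooks a read context requires and the hooks already run at write time. The following is a minimal sketch of that idea only; the function name and the hook names are hypothetical and are not part of the Antimatter API.

```python
def hooks_to_run_on_read(required_hooks, hooks_already_run):
    """Return the required hooks that have not yet run, preserving order."""
    already = set(hooks_already_run)
    return [hook for hook in required_hooks if hook not in already]


# Hypothetical hook names, for illustration only.
required = ["pii-classifier", "language-detector"]
ran_at_write = ["pii-classifier"]

# Only the hook that the write context skipped is left for the read path.
pending = hooks_to_run_on_read(required, ran_at_write)
```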
A data policy rule makes its decision based on the information it references. The types of information that can be referenced in a data policy rule are:
- Tags that the capsule or pieces of data within the capsule may have attached to them. These tags may have come from write-context classifying hooks, or they may have been provided by the user at encapsulate time.
- Capabilities that the authenticated user may have as part of their domain identity. Capabilities allow you to normalize the information that exists in external identity providers (such as a user's groups, roles, or team). See Domain Identity for more information about capabilities.
- Facts that may exist or not exist in the domain. Facts allow you to capture supplemental information that simplifies the expression of policy rules, such as which customers have opted-in to a feature. See Facts for more information about facts.
Simple read context rules
A simple read context rule might redact all email addresses whenever data is read in that read context. You would create such a rule as follows:
Python:
import antimatter as am
amr = am.Session.from_api_key(domain_id="dm-xxxxxxxx", api_key="xxxxxxxxx")
amr.add_read_context_rules("my_read_ctx",
    rule_builder=am.ReadContextRuleBuilder().
        add_match_expression(am.Source.Tags,
            key="tag.antimatter.io/email",
            operator=am.Operator.Exists).
        set_action(am.Action.Redact)
)
CLI:
am data-policy create \
  --name my_policy \
  --description "Example data policy redacting emails"
POLICY_ID=$(am data-policy list | yq '.policies[] | select(.name == "my_policy") | .id')
am data-policy rule create \
  --policy-id ${POLICY_ID} \
  --effect Redact \
  --clause '{"operator": "AllOf", "tags": [{"operator": "Exists", "name": "tag.antimatter.io/email"}]}'
am data-policy binding set-read-context-attachment \
  --policy-id ${POLICY_ID} \
  --read-context-id default \
  --attachment Attached
Advanced data policy rules
See the documentation for your language for the exact syntax of a rule; all the bindings share the same components. Within a rule, all the clauses are ANDed together, and within a clause, the operator (AllOf, NotAllOf, AnyOf, NotAnyOf, or Always) determines how the parameters are combined.
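To make the combination logic concrete, here is a small Python sketch of how clause operators and rule-level ANDing could be modelled. The operator semantics are an assumption inferred from the operator names alone (for example, that NotAnyOf means "no parameter matches"), and the function names are hypothetical; this is not Antimatter's implementation.

```python
from typing import Callable, Dict, List, Tuple

# Assumed semantics, inferred from the operator names only.
CLAUSE_OPERATORS: Dict[str, Callable[[List[bool]], bool]] = {
    "AllOf": lambda results: all(results),
    "NotAllOf": lambda results: not all(results),
    "AnyOf": lambda results: any(results),
    "NotAnyOf": lambda results: not any(results),
    "Always": lambda results: True,
}


def clause_matches(operator: str, parameter_results: List[bool]) -> bool:
    """Combine the per-parameter match results using the clause operator."""
    return CLAUSE_OPERATORS[operator](parameter_results)


def rule_matches(clauses: List[Tuple[str, List[bool]]]) -> bool:
    """A rule matches only if every one of its clauses matches (AND)."""
    return all(clause_matches(op, results) for op, results in clauses)
```

For instance, under these assumptions a rule with one AnyOf clause and one NotAnyOf clause matches only when at least one parameter in the first clause matches and none in the second does.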
Clauses
A data policy rule clause matches against a tag, capability, fact, or read parameter. Here are some example rules:
This rule allows all access to data if the user has the team-admin capability.
am data-policy rule create \
  --policy-id ${POLICY_ID} \
  --effect Allow \
  --clause '
{
  "operator": "AllOf",
  "capabilities": [
    {
      "name": "team-admin",
      "operator": "Exists"
    }
  ]
}
'
This rule denies access to the capsule if it does not have the tag mycompany.com/clean-data.
am data-policy rule create \
  --policy-id ${POLICY_ID} \
  --effect DenyCapsule \
  --clause '
{
  "operator": "AllOf",
  "tags": [
    {
      "name": "mycompany.com/clean-data",
      "operator": "NotExists"
    }
  ]
}
'
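Hand-writing the JSON passed to --clause is easy to get wrong. One option is to build it programmatically; the sketch below uses only the Python standard library to produce a tag clause in the shape used by the CLI examples in this section. The helper name tag_clause is hypothetical, and the clause shape is copied from those examples rather than from a published schema.

```python
import json
import shlex


def tag_clause(name, operator, values=None, clause_operator="AllOf"):
    """Build a single-tag clause dict in the shape used by the CLI examples."""
    tag = {"name": name, "operator": operator}
    if values is not None:
        tag["values"] = list(values)
    return {"operator": clause_operator, "tags": [tag]}


clause = tag_clause("mycompany.com/clean-data", "NotExists")

# shlex.quote makes the JSON safe to paste after --clause in a POSIX shell.
clause_arg = shlex.quote(json.dumps(clause))
```

The resulting clause_arg string can be placed after --clause in the am data-policy rule create command shown above.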
This rule assumes that data has been tagged according to NIST SP 800-60. It will permit a capsule to be read, but if a record contains any data tagged with an impact level of moderate or high, it will remove the entire record from the output.
am data-policy rule create \
  --policy-id ${POLICY_ID} \
  --effect DenyRecord \
  --clause '
{
  "operator": "AllOf",
  "tags": [
    {
      "name": "nist.gov/sp800-60/impact",
      "operator": "In",
      "values": [
        "moderate",
        "high"
      ]
    }
  ]
}
'
Another way of handling the above problem is to return all records but redact the fields or phrases within them that are tagged as moderate or high impact:
am data-policy rule create \
  --policy-id ${POLICY_ID} \
  --effect Redact \
  --clause '
{
  "operator": "AllOf",
  "tags": [
    {
      "name": "nist.gov/sp800-60/impact",
      "operator": "In",
      "values": [
        "moderate",
        "high"
      ]
    }
  ]
}
'
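The difference between the DenyRecord and Redact effects can be illustrated with a toy model. The sketch below is not how Antimatter processes capsules: the record layout (each field mapped to a value and its tags), the function names, and the "{redacted}" placeholder are all assumptions made for illustration.

```python
IMPACT_TAG = "nist.gov/sp800-60/impact"
BLOCKED_LEVELS = {"moderate", "high"}


def is_sensitive(tags):
    """True if the field carries a moderate or high impact tag."""
    return tags.get(IMPACT_TAG) in BLOCKED_LEVELS


def apply_deny_record(records):
    """Drop every record that contains at least one sensitive field."""
    return [
        {field: value for field, (value, _tags) in record.items()}
        for record in records
        if not any(is_sensitive(tags) for _value, tags in record.values())
    ]


def apply_redact(records, placeholder="{redacted}"):
    """Keep every record, replacing only the sensitive field values."""
    return [
        {
            field: (placeholder if is_sensitive(tags) else value)
            for field, (value, tags) in record.items()
        }
        for record in records
    ]


# Hypothetical records: each field maps to (value, tags).
records = [
    {"name": ("Ada", {}), "ssn": ("123-45-6789", {IMPACT_TAG: "high"})},
    {"name": ("Bob", {}), "city": ("Oslo", {IMPACT_TAG: "low"})},
]
```

In this model apply_deny_record returns only Bob's record, while apply_redact returns both records with Ada's ssn value replaced by the placeholder.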
Fact rules
Facts allow you to store supplementary information that can simplify your policy rules. See Facts for more information about facts and when to use them. Your rule can reference facts and relate them back to tags and domain identity. The rule will look for a fact that has arguments matching the arguments provided in the rule. Each argument can be matched against:
- A literal: match against the provided string
- A capability: match against the value of the named capability the subject has (not meaningful for unary capabilities). Does not match if the capability is not present in the subject's identity.
- A read parameter: match against the value of the provided read parameter key. Does not match if the read parameter is not present.
- A tag: match against the value of the provided tag name. Does not match if the tag is not present on the data.
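The matching behaviour listed above can be sketched as a small lookup routine. This is a toy model under stated assumptions: facts are represented as (name, argument tuple) pairs, rule arguments as (source, key) pairs, and the fact name opted_in, the tag name mycompany.com/customer-id, and all function names are hypothetical rather than taken from the Antimatter API.

```python
def resolve_argument(arg, capabilities, read_params, tags):
    """Resolve one rule argument to a string, or None if its source is absent."""
    source, key = arg
    if source == "literal":
        return key
    lookup = {
        "capability": capabilities,
        "read_parameter": read_params,
        "tag": tags,
    }[source]
    return lookup.get(key)


def fact_rule_matches(facts, fact_name, rule_args, capabilities, read_params, tags):
    """True if a fact exists whose arguments equal the resolved rule arguments."""
    resolved = [
        resolve_argument(arg, capabilities, read_params, tags) for arg in rule_args
    ]
    # A missing capability, read parameter, or tag means the rule does not match.
    if any(value is None for value in resolved):
        return False
    return (fact_name, tuple(resolved)) in facts


# Hypothetical fact: customer cust-42 has opted in to the analytics feature.
facts = {("opted_in", ("cust-42", "analytics"))}
rule_args = [("tag", "mycompany.com/customer-id"), ("literal", "analytics")]
```

With tags of {"mycompany.com/customer-id": "cust-42"} the rule matches in this model; if the tag is absent from the data, it does not.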