Your Data Is Gold: Data Platform Security And Best Practices

Joseph Ojo
Jan 26, 2026
Data is the new Gold
If data is the new gold, why don't we protect it like gold?

Let's think about where physical gold is stored for a second.

A stranger cannot simply walk into the building. Anyone entering will likely need to pass some form of authentication at the entrance, which might even be multi-layered. Even with that access, a person may still be restricted to certain floors, with some areas requiring a specific clearance level. These buildings also log who came in and when, alongside many more sophisticated measures to prevent breaches.

This level of care exists because of the value of what the building holds, which is gold in this case.

So if data truly carries the same value, then our data platforms should reflect that same level of care.

If you look at many data platforms today, that level of care is often missing.

Access is usually granted for convenience. Someone needs to run a query, so they get broad permissions. Someone needs to ship something quickly, so access is copied from another role. Over time, more people and systems can touch sensitive data than anyone originally intended.

This is how platforms quietly drift into being wide open.

Data platform security is not about slowing people down or adding unnecessary processes. It is about being intentional. Just like with physical gold, access should be deliberate.

You should be able to answer simple questions at any point in time. Who can access this data? Why do they have access? What can they do with it? And how would we know if something went wrong?

Here is what intentional security looks like in practice.

1. Authentication and Access Controls

At the most basic level, a data platform should know who is accessing it and what they are allowed to do.

Being able to log in should not automatically translate to broad access. Identity and permissions should be treated as separate concerns. Access should be explicitly granted based on need, and reviewed as those needs change.

In Snowflake, this means using role-based access control (RBAC) as the foundation. Privileges should be granted to roles, and users should inherit access only through those roles rather than being granted permissions directly. For example, an analyst role might have SELECT access on curated analytics tables but no access to raw ingestion schemas.
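
The same idea can be sketched in plain Python. This is a minimal illustration of role-based access, not Snowflake's API: the role names, user names, schema names, and the `can` helper are all hypothetical. Privileges are attached to roles, and a user's access is resolved only through role membership.

```python
# Hypothetical model of role-based access; names are illustrative,
# not real Snowflake objects.
ROLE_PRIVILEGES = {
    "analyst": {("SELECT", "analytics.curated")},
    "ingestion_engineer": {
        ("SELECT", "raw.ingestion"),
        ("INSERT", "raw.ingestion"),
    },
}

# Users hold roles, never privileges directly.
USER_ROLES = {
    "ada": {"analyst"},
    "grace": {"ingestion_engineer"},
}


def can(user: str, privilege: str, schema: str) -> bool:
    """Return True only if one of the user's roles carries the privilege."""
    for role in USER_ROLES.get(user, set()):
        if (privilege, schema) in ROLE_PRIVILEGES.get(role, set()):
            return True
    return False


print(can("ada", "SELECT", "analytics.curated"))  # True
print(can("ada", "SELECT", "raw.ingestion"))      # False
```

Because access is only ever looked up through roles, answering "who can touch this schema, and why?" becomes a question about role membership rather than a hunt through individual grants.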

Note: If access feels slightly restrictive, you are probably doing it right.

2. Encryption

Data should be protected even when it is not actively being queried.

Encryption ensures that data stored or moved between systems is unreadable without the appropriate keys. This applies to data at rest in warehouses and data in transit across pipelines and tools.

Most modern data platforms support encryption, but it should not be assumed. Some systems historically required encryption to be explicitly enabled, and misconfigurations can still leave data exposed. This makes encryption defaults an important factor when evaluating and choosing a data platform.
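
One way to avoid assuming is to check. The sketch below scans a hand-written inventory of data stores and reports any that lack encryption at rest or in transit. The `DataStore` fields and the store names are hypothetical; in a real platform these values would have to come from each system's own configuration or metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    """Hypothetical inventory record for one storage system."""
    name: str
    encrypted_at_rest: bool
    encrypted_in_transit: bool


def find_encryption_gaps(stores):
    """Return (store name, missing protection) pairs for unencrypted stores."""
    gaps = []
    for store in stores:
        if not store.encrypted_at_rest:
            gaps.append((store.name, "at rest"))
        if not store.encrypted_in_transit:
            gaps.append((store.name, "in transit"))
    return gaps


# Illustrative inventory, not real systems.
inventory = [
    DataStore("warehouse", encrypted_at_rest=True, encrypted_in_transit=True),
    DataStore("legacy_export_bucket", encrypted_at_rest=False, encrypted_in_transit=True),
]

for name, gap in find_encryption_gaps(inventory):
    print(f"{name}: not encrypted {gap}")
```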

3. Auditing and Logging

It should be possible to see who accessed what data and when.

Audit logs provide visibility into platform activity and enable investigation of incidents, unusual behavior, or compliance questions. 

Logs are only useful if they exist and are reviewed.
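
Reviewing does not have to be elaborate. As a rough sketch, assuming audit events are available as simple records of user, table, and timestamp, a short script can surface reads of sensitive tables by anyone not expected to read them. The event format, table names, and the `EXPECTED_READERS` mapping are hypothetical; real platforms expose access history in their own formats.

```python
from datetime import datetime

# Hypothetical audit events for illustration.
AUDIT_LOG = [
    {"user": "ada", "table": "analytics.orders", "at": datetime(2026, 1, 20, 9, 15)},
    {"user": "grace", "table": "raw.customers", "at": datetime(2026, 1, 20, 23, 40)},
    {"user": "ada", "table": "raw.customers", "at": datetime(2026, 1, 21, 2, 5)},
]

# Hypothetical record of who is expected to read each sensitive table.
EXPECTED_READERS = {"raw.customers": {"grace"}}


def unexpected_accesses(events, expected_readers):
    """Return events where a user read a sensitive table they are not expected to read."""
    return [
        event
        for event in events
        if event["table"] in expected_readers
        and event["user"] not in expected_readers[event["table"]]
    ]


for event in unexpected_accesses(AUDIT_LOG, EXPECTED_READERS):
    print(f'{event["user"]} read {event["table"]} at {event["at"].isoformat()}')
```

Tables that are not listed as sensitive are ignored here, so the usefulness of a check like this depends on keeping that mapping current.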

4. Data Masking

Not everyone who can access a table should see raw values.

Data masking limits exposure by hiding or obfuscating sensitive fields such as personal or financial information, even when users are authorized to query the dataset. This can be implemented statically or dynamically. Some columns can be permanently masked, while others are revealed only to users with the appropriate role.

An analyst may see redacted values (******), while a data owner or trusted role querying the same table sees the real data.
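
Dynamic masking of this kind can be sketched in a few lines of Python. This illustrates the idea only and is not any platform's masking feature; the column names, role names, and the `mask_row` helper are hypothetical.

```python
# Hypothetical sensitive columns and trusted roles for illustration.
SENSITIVE_COLUMNS = {"email", "card_number"}
UNMASKED_ROLES = {"data_owner"}


def mask_row(row: dict, role: str) -> dict:
    """Return a copy of the row with sensitive values redacted for untrusted roles."""
    if role in UNMASKED_ROLES:
        return dict(row)
    return {
        column: "******" if column in SENSITIVE_COLUMNS else value
        for column, value in row.items()
    }


row = {"customer_id": 42, "email": "ada@example.com", "card_number": "4111111111111111"}

print(mask_row(row, "analyst"))     # email and card_number shown as ******
print(mask_row(row, "data_owner"))  # real values
```

Both roles query the same row; only the role decides whether the sensitive columns come back redacted.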

Masking reduces risk by ensuring that access does not automatically translate to full visibility.

Most data platform issues do not come from a single bad decision. 

They come from many small decisions made over time in the name of speed and convenience. When those decisions are never revisited, risk quietly builds up.

If data is truly the new gold, then protecting it should not be reactive or optional. It should be part of how platforms are designed, operated, and maintained from the start, not something that only becomes important after something goes wrong.

Being intentional early is far cheaper than fixing trust later.
