What is Access Control?

ness-intricity101
12 Jan 2021 · 05:20

Summary

TLDR: Jared Hillam discusses the evolution and importance of access control in data security. As organizations grow, access control becomes crucial for managing who can see what data, reducing irrelevant information and ensuring sensitive data protection. With the rise of cloud data platforms, centralized data requires flexible access control, necessitating a shift from traditional BI tools to embedding security in the data architecture itself. This approach allows for consistent access across various tools and ensures data is relevant to the audience, promoting a loosely coupled architecture for independent upgrades and business continuity.

Takeaways

  • 🔒 Data security is not only about external threats like hackers but also about internal access control to ensure sensitive information is protected and irrelevant data is not unnecessarily exposed.
  • 📈 As organizations grow, access control becomes increasingly important to manage the visibility of data across different segments of the organization.
  • 🚧 The traditional approach to access control, which was tied to business intelligence (BI) tools, is no longer sufficient for modern, large-scale data architectures.
  • 🗂️ The shift towards centralizing data in data warehouses and data lakes has necessitated a reevaluation of where and how access control is implemented.
  • 💡 Access control should be an integral part of the data architecture itself, not just an afterthought in the BI layer, to ensure flexibility and scalability.
  • 🔄 The move towards cloud data platforms has enabled scalable querying of centralized data sets, which in turn requires a more flexible, tool-independent approach to access control.
  • 🔗 Loose coupling is a key architectural principle that can be facilitated by embedding access control in the data architecture, allowing for independent upgrades and interchangeability of analytics tools.
  • 🛡️ Deep access controls are essential for ensuring that data is segmented and relevant to the audience consuming it, regardless of the tool used to access the data.
  • 👥 Large organizations may have multiple BI tools and diverse audiences interacting with data, necessitating a more granular and centralized approach to access control.
  • 📝 Active governance, auditability, and sustainable standards are crucial for managing roles and data segment grants in complex cloud data architectures.

Q & A

  • What is the primary concern with data security beyond hacking and theft?

    -The primary concern with data security beyond hacking and theft is access control, which defines what segments of data internal parties can see.

  • Why is access control important in large organizations?

    -Access control is important in large organizations because it not only protects sensitive data but also minimizes noise by ensuring that only relevant data is accessible to specific organizational parties.

  • How has the approach to data consumption changed over the past 20 years?

    -Over the past 20 years, data consumption shifted from one-off business intelligence reporting applications to more centralized data warehousing, and then to cloud data platforms that allow scalable querying of centralized data sets.

  • What was the traditional location for access control in data architectures?

    -Traditionally, access control was located in the business intelligence layer, even after data had been centralized in a data warehouse.

  • Why did the approach to access control need to change around 2015?

    -The approach to access control needed to change around 2015 due to the advent of cheap storage methods, which led to further centralization of data and a need for more flexible data access across a growing number of tools and audiences.

  • Why can't access control be tightly coupled with reports and dashboards in modern data platforms?

    -Access control cannot be tightly coupled with reports and dashboards in modern data platforms because organizations may use multiple business intelligence tools and have various audiences interacting with data directly, so access rules defined inside any one tool cannot cover every path to the data.

  • Where should security be implemented in modern data architectures to ensure flexibility and consistency?

    -Security should be implemented back into the architecture as an aspect of the data itself, ideally at the data lake level, to ensure that any access to the data can be segmented and relevant to the audience consuming it.

  • What is the benefit of pushing access control back to the data lake?

    -Pushing access control back to the data lake allows every point of data to be served to every user with no concern about the tool issuing the query, meeting the expectations of business stakeholders for a flexible data architecture.

  • What is the key to successfully implementing deep access controls in a data architecture?

    -The key to successfully implementing deep access controls is active governance, auditability, and sustainable standards, with administrators being highly aware of roles and grants for data segments.

  • What is the title of the white paper mentioned in the script that discusses potential issues with access control?

    -The title of the white paper mentioned is 'How to Botch Your Snowflake Deployment in Three Easy Steps.'

Outlines

00:00

🔐 The Importance of Access Control in Data Security

Jared Hillam discusses the critical role of access control in data security, emphasizing its significance beyond just preventing data breaches. Access control is crucial for defining which segments of data internal parties can see, which is particularly important as organizations grow. The necessity for access control arises not only from the sensitivity of data like salaries but also to minimize irrelevant data exposure, reducing 'noise' within the organization. Hillam points out that the placement of access control logic has significant implications for data architecture, and it has become a major challenge in recent years. Historically, access control was managed through business intelligence (BI) tools, but with the advent of cloud data platforms and centralized data storage, a more flexible approach is needed. The video suggests that access control should be integrated into the data architecture itself, ideally at the data lake level, to ensure consistent and relevant access for various tools and audiences.

05:00

📚 Access Control Challenges and Solutions

The second paragraph introduces a white paper that addresses common pitfalls in access control, specifically within Snowflake deployments. The paper, titled 'How to Botch Your Snowflake Deployment in Three Easy Steps,' is available in the video description. Additionally, the paragraph offers viewers the opportunity to connect with a specialist for personalized advice on their data security situations. The video suggests that while the ideal of flexible, accessible data architecture is clear, achieving it requires active governance, auditability, and sustainable standards. It also highlights the expertise of Intricity in implementing cloud data architectures for large organizations, indicating their experience in navigating the complexities of access control in modern data environments.

Keywords

💡Data Security

Data security refers to the protection of data from unauthorized access or corruption. In the video, it is highlighted as both the prevention of external threats (like hackers) and the internal management of data access. The focus here is on access control, ensuring that only the right people within an organization can access certain data.

💡Access Control

Access control is the process of defining which individuals or systems are permitted to view or use specific data. In the video, this is presented as a critical element of data security, especially in large organizations. It ensures sensitive data, like salaries, remains restricted, and irrelevant data is hidden from certain users to reduce noise.
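As a rough illustration of this definition, the sketch below filters rows by the data segments granted to a role. Everything in it is hypothetical: the `ROWS` data, the `GRANTS` mapping, the role names, and the `visible_rows` helper are invented for the example and do not come from the video or from any particular platform.

```python
# Hypothetical rows in a centralized data set; "segment" is an illustrative column.
ROWS = [
    {"segment": "sales", "region": "west", "amount": 1200},
    {"segment": "sales", "region": "east", "amount": 950},
    {"segment": "hr", "employee": "A. Example", "salary": 88000},
]

# Hypothetical grants: which role may see which data segments.
GRANTS = {
    "sales_analyst": {"sales"},
    "hr_partner": {"hr"},
    "executive": {"sales", "hr"},
}


def visible_rows(role, rows=ROWS, grants=GRANTS):
    """Return only the rows whose segment has been granted to the role."""
    allowed = grants.get(role, set())  # roles without grants see nothing
    return [row for row in rows if row["segment"] in allowed]


# A sales analyst sees the sales rows but never the salary row.
for row in visible_rows("sales_analyst"):
    print(row)
```

Defaulting unknown roles to an empty set keeps the sketch "deny by default", which also serves the noise-reduction goal: a user only ever receives segments that were explicitly granted.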

💡Centralized Logic

Centralized logic refers to keeping key data logic (such as the complex queries behind reports) in one shared place rather than scattering it across individual reporting applications. The video explains that the need for more centralized logic drove wider adoption of data warehousing, and that the question of where logic lives has major downstream impacts on access control.

💡Data Warehouse

A data warehouse is a centralized repository where data is stored and organized for analysis and reporting. In the video, it is discussed as the step that centralized data and logic, although access control still mostly lived in the business intelligence layer at that time. The video argues that security now needs to sit even further back than the warehouse, in the base data lake.

💡Data Lake

A data lake is a centralized storage system for raw, unprocessed data, often used in modern data architectures. The video mentions the shift toward implementing access control within the data lake rather than only in the business intelligence layer. This allows for more flexible and scalable access control across different tools and audiences.

💡Loose Coupling

Loose coupling refers to a design principle where system components remain independent and easily upgradeable. The video emphasizes that access control should be loosely coupled from business intelligence tools to avoid dependency on specific platforms, allowing organizations to switch tools without disrupting access control.
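The idea can be sketched as follows, assuming a hypothetical `GovernedDataLayer` class that applies the grants itself, plus two stand-in consumer functions (`dashboard_tool` and `notebook_tool`). None of these names come from the video or a real product; they only show that the tools carry no access rules of their own, so either one could be swapped out.

```python
class GovernedDataLayer:
    """Hypothetical data layer that enforces access control itself."""

    def __init__(self, rows, grants):
        self._rows = rows
        self._grants = grants  # role -> set of segments the role may see

    def query(self, role):
        """Return the rows visible to the role, regardless of the calling tool."""
        allowed = self._grants.get(role, set())
        return [row for row in self._rows if row["segment"] in allowed]


def dashboard_tool(layer, role):
    """Stand-in for a BI dashboard: totals the amounts it is allowed to see."""
    return sum(row["amount"] for row in layer.query(role))


def notebook_tool(layer, role):
    """Stand-in for a data science notebook: lists the segments it can see."""
    return sorted({row["segment"] for row in layer.query(role)})


layer = GovernedDataLayer(
    rows=[
        {"segment": "sales", "amount": 1200},
        {"segment": "sales", "amount": 950},
        {"segment": "finance", "amount": 40000},
    ],
    grants={"sales_analyst": {"sales"}, "controller": {"sales", "finance"}},
)

# Both tools receive the same segment of data for the same credentials.
sales_total = dashboard_tool(layer, "sales_analyst")
sales_segments = notebook_tool(layer, "sales_analyst")
```

Because the filtering happens in `query`, replacing `dashboard_tool` with a different consumer would not require re-implementing any security rules.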

💡Business Intelligence (BI) Tools

BI tools are software applications that process data to generate reports, dashboards, and other analytics for decision-making. The video highlights how many organizations use multiple BI tools and explains why access control should not be embedded in these tools but instead be part of the deeper data architecture.

💡Scalability

Scalability refers to a system's ability to handle growing amounts of work or data efficiently. The video discusses how cloud data platforms have enabled scalable queries on centralized data sets, but stresses the importance of scalable access control to handle the growing number of tools and audiences interacting with the data.

💡Active Governance

Active governance refers to the continuous monitoring and management of data access policies and controls. The video suggests that active governance is necessary to ensure that access control remains effective, as organizations face evolving challenges in managing roles, grants, and data segments in complex cloud environments.

💡Auditability

Auditability is the ability to track and review who accessed data and what changes were made to it. In the video, auditability is mentioned as part of ensuring effective governance in modern data architectures, where administrators need to know who has access to data and how it's being used to maintain security and compliance.
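A minimal sketch of what this could look like, assuming a hypothetical `AuditedGrants` class that records every grant, revoke, and access check in an append-only list. The class, its method names, and the event fields are illustrative assumptions, not the audit mechanism of any specific cloud platform.

```python
from datetime import datetime, timezone


class AuditedGrants:
    """Hypothetical grant registry that logs every change and access check."""

    def __init__(self):
        self._grants = {}    # role -> set of data segments
        self.audit_log = []  # append-only list of event dictionaries

    def _record(self, action, role, segment, actor):
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
            "role": role,
            "segment": segment,
        })

    def grant(self, role, segment, actor):
        self._grants.setdefault(role, set()).add(segment)
        self._record("grant", role, segment, actor)

    def revoke(self, role, segment, actor):
        self._grants.get(role, set()).discard(segment)
        self._record("revoke", role, segment, actor)

    def can_access(self, role, segment, actor):
        allowed = segment in self._grants.get(role, set())
        self._record("allowed" if allowed else "denied", role, segment, actor)
        return allowed

    def roles_with_access(self, segment):
        """Answer the administrator's question: who can see this segment now?"""
        return sorted(r for r, segs in self._grants.items() if segment in segs)


acl = AuditedGrants()
acl.grant("hr_partner", "salaries", actor="admin")
acl.grant("executive", "salaries", actor="admin")
acl.revoke("executive", "salaries", actor="admin")
first_check = acl.can_access("hr_partner", "salaries", actor="alice")
second_check = acl.can_access("executive", "salaries", actor="bob")
actions = [event["action"] for event in acl.audit_log]
```

Logging denied checks alongside grants and revokes gives administrators both views the paragraph describes: who currently holds access, and how that access has actually been used.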

Highlights

Data security involves not only preventing hackers from acquiring sensitive data but also managing access control within organizations.

Access control is crucial as organizations grow, to manage who can see different segments of data.

Data may be irrelevant to certain organizational parties, emphasizing the need for access control to minimize noise.

Access control's placement can lead to either enablement or chaos in large data architectures.

20 years ago, data was primarily consumed through business intelligence reporting applications.

Centralized logic in data warehousing became more prevalent as organizations sought to centralize data.

Access control initially resided in the business intelligence layer, even with data centralized in a warehouse.

Around 2015, cheap storage led to a centralization of data, changing data consumption patterns.

Cloud data platforms have enabled scalable querying of deeply centralized data sets.

Organizations need flexible data access for a growing number of tools and audiences.

Tightly coupling access control to reports and dashboards is no longer suitable for modern data platforms.

Security should be integrated into the data architecture, closer to the base data lake where centralized data resides.

Deep access controls ensure consistent and relevant data access for any tool or audience.

Loose coupling in architecture is promoted by housing security in the data architecture, allowing interchangeability of analytics tools.

The end goal is to serve every point of data to every user without concern for the querying tool's modality.

Achieving this requires active governance, auditability, and sustainable standards.

Administrators must be aware of roles and grants for data segments, which can be complex in cloud platforms.

Intricity has experience implementing cloud data architectures and offers insights in a white paper on avoiding access control pitfalls.

Transcripts

00:01

Hi, I'm Jared Hillam. When it comes to data security, most people think of hackers in hoodies trying to acquire sensitive data. And while this is a very important side of security, there's also a practical side to data security, and that's access control. Access control is what defines the segments of data that internal parties can actually see, and the larger an organization gets, the more important access control becomes.

00:31

Now, this isn't just because some data is sensitive, like people's salaries, but because data might be completely irrelevant to certain organizational parties. So, simply for the benefit of minimizing noise, being able to isolate data through security is a super important function.

00:51

But the question becomes: where should the definition of this access control occur? If you recall from one of our previous videos that asks "Where should logic live?", we learned that the answer to that question has massive downstream impacts. When it comes to access control, it can be the difference between enablement or total chaos. In recent years, access control has become a primary stumbling block in large data architectures.

01:20

Now let me give you some contextual history to explain why. 20 years ago, data was mostly consumed through one-off business intelligence reporting applications. These systems housed all the complex queries for producing formatted reports. However, organizations began to see that the data itself needed more centralized logic. This was the time where wider adoption of data warehousing came into the picture. Access control still mostly lived in the business intelligence layer, even though the data had been somewhat centralized in a warehouse. In those early years, any consumption of the data outside of the data warehouse and BI layer was often considered a rogue query.

02:04

But this all began to change around 2015. This is because cheap storage methods ushered in a centralization of data. In more recent years, cloud data platforms have now made it possible to scalably query these deeply centralized data sets. With all this centralization, organizations need their data to be flexible to an ever-growing number of tools and audiences. This means they can't just tightly couple all the access control into some reports and dashboards and then call it a day. Large organizations may end up having 10 business intelligence tools due to acquisitions and different end-user needs. Additionally, they may have dozens of different audiences that interact with the data directly and not through some sanctioned analytics tool. So having access control nested into the BI layer no longer is a proposition that can suit today's large modern data platforms.

03:02

Instead, security needs to go back into the architecture as an aspect of the data itself. And this doesn't mean the data warehouse, but rather even further back into the base data lake, where all the core centralized data resides. This is the only way to ensure that any access to the data can be segmented and relevant to the audience consuming it. Additionally, this makes it possible for any business intelligence, analytics, data science, or reporting layer to plug into the environment and get consistent access based on credentials, and do so no matter if they queried the data warehouse or the data lake.

03:42

Another important reason for setting up these deep access controls is loose coupling. If you recall from our video titled "What is Tight Coupling?", the correct practice in any architecture is to loosely couple each component so they're independently upgradeable. By housing the security in the data architecture, the analytics tool becomes highly interchangeable without impacting the business.

04:10

The end result of pushing access control all the way back to the data lake is the ability to serve up every point of data to every user with no concern about the modality of the tool issuing the query. This, ultimately, is what business stakeholders expect from their data architecture in the first place. But delivering on this is much easier said than done. But it is doable and worth it. The key here is active governance, auditability, and sustainable standards. Administrators need to be highly aware of roles and grants for data segments, which can be tricky for some of these cloud platforms.

04:48

Now, Intricity has spent a lot of time with this topic, as it has implemented hundreds of cloud data architectures for large organizations. We recently wrote a white paper diving into some of the critical points that can go wrong with access control, titled "How to Botch Your Snowflake Deployment in Three Easy Steps". We've included a link to that white paper in the video description. Also, if you'd like to talk with a specialist about your specific situation, you'll see a link for that as well.
