What is the difference between authentication and authorization?
What are some different ways to manage permissions? What are the advantages and drawbacks of each?
What are some advantages of token-based auth? Why are most organizations adopting it? Are there any drawbacks?
For each of the following, is it a username + password method or a token method? PAM, LDAP, Kerberos, SAML, OIDC/OAuth
Introduction
The goal of auth is to balance two needs: everybody should be able to do their work, but nobody should be able to access something they’re not supposed to
People leave, join, change roles frequently. Having one person with the key to every room is impractical (and unsafe)
In “least privilege”, people only get access to the things they need and nothing they don’t
Auth is all about systems that balance these needs
LDAP/AD
Authentication = knowing who is requesting access to something they need
Authorization = checking if that person should have access to the thing they’re asking for
Lightweight Directory Access Protocol (LDAP) or Active Directory (AD) centralizes security so that everyone needs only one “key” for every room (one set of username-password credentials)
Authentication is improved, but authorization is not
The communication between the device and the server is not guaranteed to be secure
You still need to check credentials at each room repeatedly
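The two checks above can be sketched in a few lines of Python. This is only an illustration, not how LDAP or AD work internally: the in-memory DIRECTORY and ROOM_ACCESS tables, the user "alice", and all function names are made up for the example. Note that the credentials are checked again on every room, which is the drawback described above.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical stand-in for a central directory (a real one would be an LDAP/AD server).
_salt = os.urandom(16)
DIRECTORY = {"alice": (_salt, hash_password("correct horse", _salt))}

# Hypothetical list of which rooms each person may enter.
ROOM_ACCESS = {"alice": {"lab"}}


def authenticate(username: str, password: str) -> bool:
    """Authentication: is this person who they claim to be?"""
    record = DIRECTORY.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


def authorize(username: str, room: str) -> bool:
    """Authorization: may this (already identified) person enter this room?"""
    return room in ROOM_ACCESS.get(username, set())


def enter_room(username: str, password: str, room: str) -> bool:
    """Without tokens, both checks are repeated at every door."""
    return authenticate(username, password) and authorize(username, room)
```

For example, enter_room("alice", "correct horse", "lab") succeeds, while the same credentials are refused at a room that is not on alice's list.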
Single Sign On (SSO)
SSO is like getting a key card at the front desk for the day
The card is given to you if your credentials are correct (authenticated)
Card maintains your credentials throughout the day
Card already knows which rooms you need access to (authorized)
Managed through tokens held by the browser:
Security Assertion Markup Language (SAML 2.0, XML-based)
OpenID Connect (OIDC, built on OAuth 2.0, JSON-based)
External services like Okta, OneLogin, Azure Active Directory are vendors
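The key-card idea can be sketched with a signed, expiring token. This is a simplified illustration using an HMAC signature from the Python standard library; it is not SAML or OIDC, and SECRET_KEY, issue_token, and check_token are hypothetical names chosen for the example. The front desk issues the token once, and each door only verifies the signature, the expiry time, and the room list.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key known only to the "front desk" and the "doors".
SECRET_KEY = b"demo-only-secret"


def issue_token(username, rooms, ttl_seconds=8 * 3600, now=None):
    """Issue a signed key card after the user has been authenticated."""
    now = time.time() if now is None else now
    claims = {"sub": username, "rooms": sorted(rooms), "exp": now + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def check_token(token, room, now=None):
    """Check the card at a door: no password is needed, only a valid token."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return False  # not in "payload.signature" form
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False  # payload was altered or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return now < claims["exp"] and room in claims["rooms"]
```

A token issued for {"lab"} opens the lab until it expires, and editing the room list inside the payload invalidates the signature.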
Permissions
Simple permissions management is just a list e.g. Access Control List (ACL)
Role Based Access Control (RBAC) defines groups of permissions by role, e.g. manager, intern, executive, and assigns people to those roles
More flexible and simple at first
Complexity creep for each person who thinks they’re “special”
Attribute Based Access Control (ABAC) defines permissions for combinations of the person, task, data, etc.
e.g. AWS Identity and Access Management (IAM) aims to balance complexity with security
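The three models above can be compared side by side in a small sketch. All users, roles, resources, and attribute names here are hypothetical, and the ABAC rule is just one possible policy; real systems such as AWS IAM express policies very differently.

```python
# ACL: a plain list of who may touch each resource.
ACL = {"sales_db": {"alice", "bob"}}


def acl_allows(user, resource):
    return user in ACL.get(resource, set())


# RBAC: permissions are attached to roles, and people are assigned roles.
ROLE_PERMISSIONS = {
    "intern": {("sales_db", "read")},
    "manager": {("sales_db", "read"), ("sales_db", "write")},
}
USER_ROLES = {"alice": {"manager"}, "bob": {"intern"}}


def rbac_allows(user, resource, action):
    return any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


# ABAC: a rule evaluated over attributes of the person, the data, and the task.
def abac_allows(user_attrs, resource_attrs, action):
    same_department = user_attrs.get("department") == resource_attrs.get("department")
    if action == "read":
        return same_department
    if action == "write":
        # Example policy: only managers may write, and never to data containing PII.
        return (
            same_department
            and user_attrs.get("title") == "manager"
            and not resource_attrs.get("contains_pii", False)
        )
    return False
```

The ACL is the shortest but must be edited per person; RBAC moves that editing to role assignments; the ABAC rule handles combinations the other two cannot express, at the cost of more logic to maintain.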
Why should we care?
Your org should be able to equip you with the tools required to be a creative and effective data scientist without jeopardizing the org’s privacy/security
Data access, of course
Internal data sources often only need LDAP-like credentials (not SSO)
Some orgs use Kerberos to create SSO-like tokens for data
Others use JSON Web Tokens (JWT) for data access; this is rare, as the approach is still new
IAM is often used for cloud-to-cloud security
Service accounts are non-human accounts for software that needs permission to do things by itself
e.g. you don’t want to have to sign in yourself for every user who wants to use your secure Shiny app
Learning Objectives
What is the difference between authentication and authorization?
Authentication = who is asking?
Authorization = what are you allowed to do?
What are some different ways to manage permissions? What are the advantages and drawbacks of each?
In order of complexity: ACL, RBAC, ABAC
What are some advantages of token-based auth? Why are most organizations adopting it? Are there any drawbacks?
Reduces complexity by providing one “handshake”, but doesn’t solve everything (e.g. data access, token management)
For each of the following, is it a username + password method or a token method? PAM, LDAP, Kerberos, SAML, OIDC/OAuth