Learning objectives
- What is the difference between authentication and authorization?
- What are some different ways to manage permissions? What are the advantages and drawbacks of each?
- What are some advantages of token-based auth? Why are most organizations adopting it? Are there any drawbacks?
- For each of the following, is it a username + password method or a token method? PAM, LDAP, Kerberos, SAML, OIDC/OAuth
Introduction
- The goal of Auth is to manage two desires: everybody should be able to do their work, but nobody should be able to work on something they’re not supposed to
- People leave, join, change roles frequently. Having one person with the key to every room is impractical (and unsafe)
- In “least privilege”, people only get access to the things they need and nothing they don’t
- Auth is all about systems that balance these needs
LDAP/AD
- Authentication = knowing who is requesting access to something they need
- Authorization = checking if that person should have access to the thing they’re asking for
- Lightweight Directory Access Protocol (LDAP) or Active Directory (AD) centralizes security so that everyone needs only one “key” for every room (one set of username-password credentials)
- Authentication is improved, but authorization is not
- The communication between the device and the server is not guaranteed to be secure
- You still need to check credentials at each room repeatedly
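The two checks above can be sketched as separate functions. This is a minimal illustration, not how an LDAP server is implemented: the `DIRECTORY` and `ROOM_ACCESS` tables, the user `alice`, and the resource names are all hypothetical, and the fixed salt is for demonstration only.

```python
import hashlib
import hmac

SALT = b"demo-salt"  # fixed only for illustration; real systems use a random salt per user


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash so the directory never stores plain text."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical central directory: one set of credentials per person.
DIRECTORY = {"alice": hash_password("correct horse", SALT)}

# Hypothetical per-resource permissions, kept separately from credentials.
ROOM_ACCESS = {"finance-db": {"alice"}, "hr-db": set()}


def authenticate(username: str, password: str) -> bool:
    """Authentication: is this person who they claim to be?"""
    stored = DIRECTORY.get(username)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hash_password(password, SALT))


def authorize(username: str, resource: str) -> bool:
    """Authorization: may this (already identified) person use the resource?"""
    return username in ROOM_ACCESS.get(resource, set())


def request_access(username: str, password: str, resource: str) -> str:
    # Credentials are re-checked on every request, as with one key for every room.
    if not authenticate(username, password):
        return "denied: unknown identity"
    if not authorize(username, resource):
        return "denied: not permitted"
    return "granted"
```

Calling `request_access("alice", "correct horse", "hr-db")` passes authentication but fails authorization, which is the distinction the notes draw.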
Single Sign On (SSO)
- SSO is like getting a key card at the front desk for the day
- The card is given to you if your credentials are correct (authenticated)
- Card maintains your credentials throughout the day
- Card already knows which rooms you need access to (authorized)
- Managed through tokens held by the browser:
- Security Assertion Markup Language (SAML 2.0, in XML)
- OpenID Connect (OIDC, built on OAuth 2.0, in JSON)
- External services like Okta, OneLogin, Azure Active Directory are vendors
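The key card idea can be sketched as a signed token that is issued once and then checked at each door without asking for the password again. This is a toy format under stated assumptions, not SAML or OIDC: the `SECRET_KEY`, the claim names, and the `issue_token` / `verify_token` helpers are hypothetical, and real identity providers use richer claims and usually public-key signatures.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"demo-signing-key"  # hypothetical; known only to the "front desk"


def _sign(payload: str) -> str:
    """Return a URL-safe HMAC-SHA256 signature over the payload."""
    digest = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode()


def issue_token(username, rooms, ttl_seconds=8 * 3600, now=None) -> str:
    """Issue the 'key card' after the user has authenticated once."""
    now = time.time() if now is None else now
    claims = {"sub": username, "rooms": sorted(rooms), "exp": now + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    return f"{payload}.{_sign(payload)}"


def verify_token(token: str, room: str, now=None) -> bool:
    """Check the card at a door: valid signature, not expired, room listed."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return False
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return False  # the card was forged or altered
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return now < claims["exp"] and room in claims["rooms"]
```

The expiry claim is what makes the card good "for the day": once `exp` has passed, the user has to go back to the front desk and authenticate again.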
Permissions
- Simple permissions management is just a list e.g. Access Control List (ACL)
- Role Based Access Control (RBAC) groups permissions by role, e.g. manager, intern, executive, and assigns each person to a role
- More flexible and simple at first
- Complexity creep for each person who thinks they’re “special”
- Attribute Based Access Control (ABAC) defines permissions for combinations of the person, task, data, etc.
- e.g. AWS Identity and Access Management (IAM) aims to balance complexity with security
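The three models above differ mainly in what the permission check looks at. The sketch below is illustrative only: every user, role, resource, and rule in it is made up, and it does not reflect how AWS IAM or any other product evaluates policies.

```python
from dataclasses import dataclass

# ACL: an explicit list of who may touch each resource.
ACL = {"sales-report": {"alice", "bob"}}


def acl_allows(user: str, resource: str) -> bool:
    return user in ACL.get(resource, set())


# RBAC: permissions attach to roles, and each person is assigned a role.
ROLE_PERMISSIONS = {"intern": {"read"}, "manager": {"read", "write"}}
USER_ROLES = {"alice": "manager", "carol": "intern"}


def rbac_allows(user: str, action: str) -> bool:
    role = USER_ROLES.get(user)
    return action in ROLE_PERMISSIONS.get(role, set())


# ABAC: the decision combines attributes of the person, the task, and the data.
@dataclass
class Request:
    department: str        # attribute of the person
    action: str            # attribute of the task
    data_department: str   # attribute of the data
    data_sensitivity: str  # "public", "internal", or "restricted"


def abac_allows(req: Request) -> bool:
    same_dept = req.department == req.data_department
    if req.action == "read":
        return same_dept or req.data_sensitivity == "public"
    if req.action == "write":
        return same_dept and req.data_sensitivity != "restricted"
    return False
```

The ACL needs a new entry for every person and resource, RBAC needs a new role whenever someone's needs differ from an existing one, and ABAC moves that complexity into the rules themselves.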
Why should we care?
- Your org should be able to equip you with the tools required to be a creative and effective data scientist without jeopardizing their privacy/security
- Data access, of course
- Internal data sources often only need LDAP-like credentials (not SSO)
- Some orgs use Kerberos to create SSO-like tokens for data
- Others use JSON Web Tokens (JWT); this is rare so far, but the approach is still new
- IAM is often used for cloud-to-cloud security
- Service accounts are non-human accounts that give software permission to do things by itself
- e.g. you don’t want to have to sign in yourself for every user who wants to use your secure Shiny app
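A service account usually shows up in code as a credential the app loads for itself rather than one a person types in. The sketch below assumes the credential is a bearer token supplied through an environment variable; the variable name `APP_SERVICE_TOKEN` and the helper `service_auth_headers` are hypothetical.

```python
import os


def service_auth_headers(env_var: str = "APP_SERVICE_TOKEN") -> dict:
    """Build request headers from the app's own credential, not a user's."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"{env_var} is not set; the app has no service identity")
    # "Bearer" is the standard HTTP scheme for presenting a token.
    return {"Authorization": f"Bearer {token}"}
```

The app would attach these headers to its own requests for data, so every visitor is served under the service account's permissions without the author signing in.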
Learning Objectives
- What is the difference between authentication and authorization?
- Authentication = who is asking?
- Authorization = what are you allowed to do?
- What are some different ways to manage permissions? What are the advantages and drawbacks of each?
- In order of complexity: ACL, RBAC, ABAC
- What are some advantages of token-based auth? Why are most organizations adopting it? Are there any drawbacks?
- Reduces complexity by providing one “handshake”, but doesn’t solve everything (e.g. data access, token management)
- For each of the following, is it a username + password method or a token method? PAM, LDAP, Kerberos, SAML, OIDC/OAuth
- PAM = username + password
- LDAP = username + password
- Kerberos = token
- SAML = token
- OIDC/OAuth = token