Auth in Enterprise

Learning objectives

  • What is the difference between authentication and authorization?
  • What are some different ways to manage permissions? What are the advantages and drawbacks of each?
  • What is some advantages of token-based auth? Why are most organizations adopting it? Are there any drawbacks?
  • For each of the following, is it a username + password method or a token method? PAM, LDAP, Kerberos, SAML, ODIC/OAuth

Introduction

  • The goal of Auth is to manage two desires: everybody should be able to do their work, but nobody should be able to work on something they’re not supposed to
  • People leave, join, change roles frequently. Having one person with the key to every room is impractical (and unsafe)
  • In “least privilege”, people only get access to the things they need and nothing they don’t
  • Auth is all about systems that balance these needs

LDAP/AD

  • Authentication = knowing who is requesting access to something they need
  • Authorization = checking if that person should have access to the thing they’re asking for
  • Lightweight Directory Access Protocol (LDAP) or Access Directory (AD) centralizes security by having everyone need only one “key” for every room (one set of username-password credentials)
    • Authentication is improved, but authorization is not
    • The communication between the device and the server is not guaranteed to be secure
    • You still need to check credentials at each room repeatedly

Single Sign On (SSO)

  • SSO is like getting a key card at the front desk for the day
    • The card is given to you if your credentials are correct (authenticated)
    • Card maintains your credentials throughout the day
    • Card already knows which rooms you need access to (authorized)
  • Managed by browser’s tokens:
    • Security Assertion Markup Language (SAML 2.0, in XML)
    • Open Identity Connect (OIDC, OAuth2.0, in JSON)
  • External services like Okta, OneLogin, Azure Active Directory are vendors

Permissions

  • Simple permissions management is just a list e.g. Access Control List (ACL)
  • Role Based Access Control (RBAC) defines groups of permissions by person e.g. manager, intern, executive
    • More flexible and simple at first
    • Complexity creep for each person who thinks they’re “special”
  • Attribute Based Access Control (ABAC) defines permissions for combinations of the person, task, data, etc.
    • e.g. AWS Identity and Access Management (IAM) aims to balance complexity with security

Why should we care?

  • Your org should be able to equip you with the tools required to be a creative and effective data scientist without jeopardizing their privacy/security
  • Data access, of course
    • Internal data sources often only need LDAP-like credentials (not SSO)
    • Some orgs use Kerberos to create SSO-like tokens for data
    • Others still use JSON Web Token (JWT) — rare but still new
    • IAM is often used for cloud-to-cloud security
  • Service accounts are software that needs permission to do things by itself
    • e.g. you don’t want to have to sign in yourself for every user who wants to use your secure Shiny app

Learning Objectives

  • What is the difference between authentication and authorization?
    • Authentication = who is asking?
    • Authorization = what do you want to do?
  • What are some different ways to manage permissions? What are the advantages and drawbacks of each?
    • In order of complexity: ACL, RBAC, ABAC
  • What is some advantages of token-based auth? Why are most organizations adopting it? Are there any drawbacks?
    • Reduces complexity by providing one “handshake”, but doesn’t solve everything (e.g. data access, token management)
  • For each of the following, is it a username + password method or a token method? PAM, LDAP, Kerberos, SAML, ODIC/OAuth

PAM = username + password

LDAP = username + password

Kerberos = Token

SAML = Token

ODIC/OAuth = Token

Meeting Videos

Cohort 1