Why containers for Data Science?
Containers are mostly used for:
- Packaging an environment for someone else to use
- Packaging a finished product (project/app/whatever) for archiving, reproducibility, or production
Potential examples:
- When publishing, instead of only providing code and data and describing the environment you used, you can include a Dockerfile that rebuilds that environment, so anyone can pick up exactly where you left off
- You have a project at work that someone needs to run every week, and it has to behave the same way no matter who runs it or when
- You’re publishing an R package and want to test specific features across different base R or package versions
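To make the publishing example concrete, here is a minimal sketch of what such a Dockerfile might look like. It assumes the `rocker/r-ver` images from the Rocker Project as the base. The R version tag, the package names, the `/project` path, and the `analysis.R` script are all hypothetical placeholders, not part of any particular project.

```dockerfile
# Pin the base R version so everyone builds the same environment.
# Assumption: rocker/r-ver as the base image; the 4.3.1 tag is only an example.
FROM rocker/r-ver:4.3.1

# Install the packages the analysis needs (placeholder package names).
RUN R -e "install.packages(c('dplyr', 'ggplot2'))"

# Copy the project's code and data into the image (hypothetical path).
WORKDIR /project
COPY . /project

# Run the analysis when the container starts (hypothetical script name).
CMD ["Rscript", "analysis.R"]
```

Someone else could then build and run it with `docker build -t my-analysis .` followed by `docker run --rm my-analysis`. Changing only the tag on the `FROM` line is one way to try the same code against a different base R version.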