Club meetings
Welcome
Welcome to the DevOps for Data Science book club!
Book: do4ds.com
These slides are available at dslc.io/do4ds.
We follow the Data Science Learning Community Code of Conduct.
Book club meetings
Volunteer leads discussion of a chapter
This is the best way to learn the material.
Presentations:
Review of material
Questions you have
Maybe live demo
How to edit the slides:
More info about editing: this GitHub repo
Ideally convert existing Rmd to qmd as we go
Recorded, available on the Data Science Learning Community YouTube Channel (DSLC.video)
Pace
Goal: 1 chapter/week
Ok to split overwhelming chapters
Ok to combine short chapters
Meet every week except holidays, etc.
We will meet even if the scheduled presenter is unavailable
Slack is a good place for offline discussion/troubleshooting
Learning objectives (LOs)
Students who study with LOs in mind retain more
Tips:
Think “After today’s session, you will be able to {LO}”
Very roughly 1 per heading
Group introductions
If you feel comfortable sharing:
Who are you?
Where are you calling in from? (If you’re not comfortable sharing, skip)
How long have you been using R?
What was your introduction to R?
What are you most looking forward to during the club?
Learning objectives
Recognize the history of DevOps.
Differentiate between DevOps (knowledge, practices, and tools) and IT/Admins (people and roles).
Recognize red flags about IT/Admin functions and what they might indicate.
Organize the content that will be covered in this book.
DevOps?
Grew out of Agile software development (2001).
Deliver small units, collect feedback, iterate.
Needed a similar process to get the iterations deployed.
DevOps (~2010) is that system/discipline.
DevOps vs IT/Admins
DevOps
Knowledge, practices, & tools
Put things into prod
Safely & easily
IT/Admins
People/roles who manage the servers, etc.
Many names:
Information Technology (IT)
SysAdmin
Site Reliability Engineering (SRE)
DevOps
Red Flags about IT/Admins
Subdivided (security, databases, networking, etc.)
Pros: Super-deep expertise.
Cons: Hard to find the right person.
Outsourced
Pros: Companies can get competence fast.
Cons: Scheduling, often high turnover.
Nobody
Pros: Freedom!
Cons: It’s all up to you!
This Book
Section 1: Patterns & principles to grease the path to production.
Section 2: Vocab & beginnings of DIY.
Section 3: Hands-on DIY. Still very very in progress.
DevOps Lessons for Data Science
Learning objectives:
Describe the core principles of DevOps.
Apply DevOps best practices to data science.
The 5 Tenets of DevOps
Code should be well-tested and tests should be automated.
Updates should be frequent and low-risk.
Security concerns should be considered up front as part of architecture.
Production systems should have monitoring and logging.
Frequent opportunities for review, change, and updating should be built into the system, both culturally and technically.
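To make the first tenet concrete, here is a minimal sketch of what "well-tested, with automated tests" could look like for a small data science helper. The `standardize` function and both test names are hypothetical, made up for illustration; they do not come from the book. Functions named `test_*` are picked up automatically by a test runner such as pytest, which is what lets a CI job run them on every change.

```python
import math


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (population SD)."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0:
        raise ValueError("values must not all be equal")
    sd = math.sqrt(variance)
    return [(v - mean) / sd for v in values]


def test_standardize_centers_and_scales():
    # After standardizing, the mean is 0 and the mean of squares is 1.
    result = standardize([1.0, 2.0, 3.0])
    assert math.isclose(sum(result), 0.0, abs_tol=1e-12)
    assert math.isclose(sum(r * r for r in result) / len(result), 1.0)


def test_standardize_rejects_constant_input():
    # Constant input has zero variance, so scaling is undefined.
    try:
        standardize([5.0, 5.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for constant input")


if __name__ == "__main__":
    # Without a test runner, the tests can still be called directly.
    test_standardize_centers_and_scales()
    test_standardize_rejects_constant_input()
    print("all tests passed")
```

The point is less the function itself than the habit: each behavior you rely on gets a check that runs without anyone remembering to run it.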
DevOps for Data Science (1/2)
Use CI/CD ➡️ Code Promotion and Integration Processes
Structure output so moving to prod or updating is easy.
Infrastructure as Code ➡️ Manage Environments as Code
Reproducible & secure environments are… reproducible and secure!
Microservices ➡️ Data Science Project Components
Figure out how to subdivide things to make updating less painful.
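As a rough illustration of "Manage Environments as Code" from the list above, the sketch below records the packages installed in the current Python environment as pinned `name==version` lines, so the environment can be stored in version control and rebuilt later. The function names `snapshot_environment` and `to_requirements` are hypothetical, and this is only one possible approach under that assumption; dedicated tools (for example `renv` in R or `pip freeze` in Python) do this more thoroughly.

```python
from importlib import metadata


def snapshot_environment():
    """Return {package name: version} for the current environment, sorted by name."""
    pins = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            pins[name.lower()] = dist.version
    return dict(sorted(pins.items()))


def to_requirements(pins):
    """Render pins as requirements-style lines, one 'name==version' per line."""
    return "\n".join(f"{name}=={version}" for name, version in pins.items())


if __name__ == "__main__":
    # Print the snapshot; in practice this text would be committed with the project.
    print(to_requirements(snapshot_environment()))
```

Committing the resulting text alongside the project code means a change to the environment shows up as a reviewable diff rather than as a surprise in production.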
DevOps for Data Science (2/2)
Monitoring & Logging ➡️ Monitoring & Logging
Data science doesn’t do enough of this, but the author will tell us how we should!
Other Things ➡️ Other Things 🙃
Chapter about Docker for Data Science here, because it deserves its own chapter.
Section 2 will be all about things like communication, collaboration, and review practices.
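For the Monitoring & Logging item above, here is a minimal sketch of a data science job that logs what it did, using Python's standard `logging` module. The job itself (`run_job`, the logger name `scoring_job`, and the returned summary keys) is hypothetical and invented for illustration; the book may recommend different tools or conventions.

```python
import logging
import time

logger = logging.getLogger("scoring_job")  # hypothetical job name


def run_job(values):
    """Average the non-missing values, logging counts and duration along the way."""
    start = time.perf_counter()
    logger.info("job started: %d input rows", len(values))

    kept = [v for v in values if v is not None]
    dropped = len(values) - len(kept)
    if dropped:
        # A warning makes silent data loss visible to whoever reads the logs.
        logger.warning("dropped %d rows with missing values", dropped)

    mean = sum(kept) / len(kept) if kept else None
    elapsed = time.perf_counter() - start
    logger.info("job finished in %.3f s, mean=%s", elapsed, mean)

    # The summary doubles as a set of metrics a monitoring system could track.
    return {"rows_in": len(values), "rows_dropped": dropped, "mean": mean}


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    print(run_job([1.0, None, 3.0]))
```

Logging answers "what happened in this run?", while tracking the returned counts over time is a simple form of monitoring: a sudden jump in `rows_dropped` is worth an alert even if the job did not crash.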
Architect 👷🏻‍♀️ vs Archaeologist 🤠
You are a software engineer.