dplyr or pandas for data transformation.
Source: DuckDB Labs
How much data?
Despite being in-memory, DuckDB allows analyses to “spill over” onto the hard disk. What are the limits of this fallback?
Which process?
What counts as a “process” for in-process? The authors discuss querying S3 files from a Cloud VM instance “in process”. Is this a serverless deployment of Python? What’s the “process” for the DuckDB CLI?
varchar, numeric, etc.
SELECT * instead of enumerating every column name.
The repo contains excellent instructions for updating the repository using R and the usethis package.
If you do not use R, the gist is to create a personal fork of the repository, make your changes to that fork, and then push your changes back to the main repository.
Make sure your fork is up to date with the main repository!
.qmd file for the chapter you are presenting under /slides/xx.qmd.
YAML header:
/_site/slides/xx.html
To render the book chapters on your local machine, you need to load the R dependencies listed in the DESCRIPTION file. You need these even if your presentation has no R code in it.
There are no Python dependencies right now, but once we hit that point we will record them in pyproject.toml in the repository.
pak and renv:
## Install pak and renv
if (!requireNamespace("pak", quietly = TRUE)) {
  install.packages("pak", repos = sprintf(
    "https://r-lib.github.io/p/pak/stable/%s/%s/%s",
    .Platform$pkgType,
    R.Version()$os,
    R.Version()$arch
  ))
}
if (!requireNamespace("renv", quietly = TRUE)) {
  pak::pkg_install("renv")
}
## Configure renv to use pak to install packages
renv::config$pak.enabled(TRUE)
## Configure renv to snapshot dependencies from the DESCRIPTION file
renv::settings$snapshot.type("explicit")
## Install the project dependencies
pak::pkg_install(renv::dependencies(path = "DESCRIPTION"))
DESCRIPTION and commit the changes
pyproject.toml (UV does this automatically)
pyproject.toml and DESCRIPTION, but not changes to renv.lock and uv.lock
or, in Python
Then, to create the chunk:
The book covers a lot of different technologies, and we may tire of troubleshooting dependencies in the main repo.
Quarto’s “freeze” option is a quick fix to this issue. It will tell the rendering pipeline in the main repo not to render that chapter and instead pull the .html file from _freeze/slides/xx/.
In the top level YAML, use:
_freeze/slides/xx to the repository!
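As a rough illustration of the freeze setting described above, the sketch below uses Quarto's documented `execute: freeze` option in a chapter's YAML header. It assumes the option is set per chapter and that `true` is the intended value, since that matches the "do not render, reuse the stored output" behavior described here; the `title` value is a hypothetical placeholder.

```yaml
---
title: "Chapter xx"   # hypothetical placeholder title
execute:
  # true: never re-run this chapter's code during a project render;
  # the stored results under _freeze/ are reused instead.
  # Quarto also accepts "auto", which re-runs only when the source file changes.
  freeze: true
---
```

With this in place, the chapter's computations run when the file is rendered on its own, and the results written to `_freeze/` are what the project render picks up.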