class: center, middle, inverse, title-slide # Chapter 21: Translating R Code ## Pavitra Chakravarty ### R4DS Reading Group --- <style> .remark-code-line { font-size: 11px; } </style> <style> hide { display: none; } .remark-slide-content h1 { font-size: 45px; } h1 { font-size: 2em; margin-block-start: 0.67em; margin-block-end: 0.67em; } .remark-slide-content { font-size: 16px } .remark-code { font-size: 14px; } code.r { font-size: 14px; } pre { margin-top: 0px; margin-bottom: 0px; } .red { color: #FF0000; } .footnote { color: #800020; font-size: 9px; } </style> # Translating R code {#translation} ## Introduction The combination of first-class environments, lexical scoping, and metaprogramming gives us a powerful toolkit for translating R code into other languages. One fully-fledged example of this idea is dbplyr, which powers the database backends for dplyr, allowing you to express data manipulation in R and automatically translate it into SQL. You can see the key idea in `translate_sql()` which takes R code and returns the equivalent SQL: ```r translate_sql(x ^ 2) ``` ``` ## <SQL> POWER(`x`, 2.0) ``` ```r translate_sql(x < 5 & !is.na(x)) ``` ``` ## <SQL> `x` < 5.0 AND NOT(((`x`) IS NULL)) ``` ```r translate_sql(!first %in% c("John", "Roger", "Robert")) ``` ``` ## <SQL> NOT(`first` IN ('John', 'Roger', 'Robert')) ``` ```r translate_sql(select == 7) ``` ``` ## <SQL> `select` = 7.0 ``` --- ## S3 CLASSES The easiest way to do this is to create an S3 class that distinguishes between regular text (that needs escaping) and HTML (that doesn't). ```r html <- function(x) structure(x, class = "advr_html") print.advr_html <- function(x, ...) { out <- paste0("<HTML> ", x) cat(paste(strwrap(out), collapse = "\n"), "\n", sep = "") } html("1") ``` ``` ## <HTML> 1 ``` --- We then write an escape generic. It has two important methods: * `escape.character()` takes a regular character vector and returns an HTML vector with special characters (`&`, `<`, `>`) escaped. * `escape.advr_html()` leaves already escaped HTML alone. ```r escape <- function(x) UseMethod("escape") escape.character <- function(x) { x <- gsub("&", "&", x) x <- gsub("<", "<", x) x <- gsub(">", ">", x) html(x) } escape.advr_html <- function(x) x ``` Some examples ```r escape("This is some text.") ``` ``` ## <HTML> This is some text. ``` ```r escape("x > 1 & y < 2") ``` ``` ## <HTML> x > 1 & y < 2 ``` ```r escape(escape("This is some text. 1 > 2")) ``` ``` ## <HTML> This is some text. 1 > 2 ``` ```r escape(html("<hr />")) ``` ``` ## <HTML> <hr /> ``` --- ## Basic tag functions Now, lets look at the other tags in a html doc `<p>`- HTML tag can have both attributes (e.g., id or class) and children (like `<b>` or `<i>`). Given that attributes are named and children are not, it seems natural to use named and unnamed arguments for them respectively. For example, a call to `p()` might look like: ```r p("Some text. ", b(i("some bold italic text")), class = "mypara") ``` We could list all the possible attributes of the `<p>` tag in the function definition, but that's hard because there are many. Instead, we'll use `...` and separate the components based on whether or not they are named. With this in mind, we create a helper function that wraps around `rlang::list2()` --- ## Lets refer back to 19.6 ... (dot-dot-dot) One place we could use `list2()` is to create a wrapper around `attributes()` that allows us to set attributes flexibly: ```r numeric <- function(...) { dots <- list2(...) num <- as.numeric(dots) set_names(num, names(dots)) } x<-numeric(a=1,b=2,c=3,d=4) x ``` ``` ## a b c d ## 1 2 3 4 ``` ```r test <- "q" numeric( 1,!!!x,!!test:=9) ``` ``` ## a b c d q ## 1 1 2 3 4 9 ``` --- ## Getting back to getting attributes ```r dots_partition <- function(...) { dots <- list2(...) if (is.null(names(dots))) { is_named <- rep(FALSE, length(dots)) } else { is_named <- names(dots) != "" } list( named = dots[is_named], unnamed = dots[!is_named] ) } str(dots_partition(a = 1, 2, b = 3, 4)) ``` ``` ## List of 2 ## $ named :List of 2 ## ..$ a: num 1 ## ..$ b: num 3 ## $ unnamed:List of 2 ## ..$ : num 2 ## ..$ : num 4 ``` --- # `<p></p>` ```r p <- function(...) { dots <- dots_partition(...) attribs <- html_attributes(dots$named) children <- map_chr(dots$unnamed, escape) html(paste0( "<p", attribs, ">", paste(children, collapse = ""), "</p>" )) } p("Some text", id = "myid") ``` ``` ## <HTML> <p id='myid'>Some text</p> ``` ```r p("Some text", class = "important", `data-value` = 10) ``` ``` ## <HTML> <p class='important' data-value='10'>Some text</p> ``` --- ## Now for the children tags ```r tag <- function(tag) { new_function( exprs(... = ), expr({ dots <- dots_partition(...) attribs <- html_attributes(dots$named) children <- map_chr(dots$unnamed, escape) html(paste0( !!paste0("<", tag), attribs, ">", paste(children, collapse = ""), !!paste0("</", tag, ">") )) }), caller_env() ) } tag("b") ``` ``` ## function (...) ## { ## dots <- dots_partition(...) ## attribs <- html_attributes(dots$named) ## children <- map_chr(dots$unnamed, escape) ## html(paste0("<b", attribs, ">", paste(children, collapse = ""), ## "</b>")) ## } ``` --- ## Void Tags ```r p <- tag("p") b <- tag("b") i <- tag("i") p("Some text. ", b(i("some bold italic text")), class = "mypara") ``` ``` ## <HTML> <p class='mypara'>Some text. <b><i>some bold italic ## text</i></b></p> ``` ```r void_tag <- function(tag) { new_function( exprs(... = ), expr({ dots <- dots_partition(...) if (length(dots$unnamed) > 0) { abort(!!paste0("<", tag, "> must not have unnamed arguments")) } attribs <- html_attributes(dots$named) html(paste0(!!paste0("<", tag), attribs, " />")) }), caller_env() ) } img <- void_tag("img") img ``` ``` ## function (...) ## { ## dots <- dots_partition(...) ## if (length(dots$unnamed) > 0) { ## abort("<img> must not have unnamed arguments") ## } ## attribs <- html_attributes(dots$named) ## html(paste0("<img", attribs, " />")) ## } ``` ```r img(src = "myimage.png", width = 100, height = 100) ``` ``` ## <HTML> <img src='myimage.png' width='100' height='100' /> ``` --- ## Lets pull in all tags ```r tags <- c("a", "abbr", "address", "article", "aside", "audio", "b","bdi", "bdo", "blockquote", "body", "button", "canvas", "caption","cite", "code", "colgroup", "data", "datalist", "dd", "del","details", "dfn", "div", "dl", "dt", "em", "eventsource","fieldset", "figcaption", "figure", "footer", "form", "h1", "h2", "h3", "h4", "h5", "h6", "head", "header", "hgroup", "html", "i","iframe", "ins", "kbd", "label", "legend", "li", "mark", "map","menu", "meter", "nav", "noscript", "object", "ol", "optgroup", "option", "output", "p", "pre", "progress", "q", "ruby", "rp","rt", "s", "samp", "script", "section", "select", "small", "span", "strong", "style", "sub", "summary", "sup", "table", "tbody", "td", "textarea", "tfoot", "th", "thead", "time", "title", "tr", "u", "ul", "var", "video" ) void_tags <- c("area", "base", "br", "col", "command", "embed", "hr", "img", "input", "keygen", "link", "meta", "param", "source", "track", "wbr" ) ``` --- ## A lot of these tags are the same as base R function So do not include in global environment or in a package ```r html_tags <- c( tags %>% set_names() %>% map(tag), void_tags %>% set_names() %>% map(void_tag) ) ``` --- ## Finally the DSL ```r with_html <- function(code) { code <- enquo(code) eval_tidy(code, html_tags) } ``` Lets execute our DSL ```r with_html( body( h1("A heading", id = "first"), p("Some text &", b("some bold text.")), img(src = "myimg.png", width = 100, height = 100) ) ) ``` ``` ## <HTML> <body><h1 id='first'>A heading</h1><p>Some text &<b>some ## bold text.</b></p><img src='myimg.png' width='100' height='100' ## /></body> ``` ```r with_html( body( title("facebook", lang = "en", dir = "ltr"), p("This is &", b("Facebook")), img(src = "myimg.png", width = 100, height = 100) ) ) ``` ``` ## <HTML> <body><title lang='en' dir='ltr'>facebook</title><p>This is ## &<b>Facebook</b></p><img src='myimg.png' width='100' height='100' ## /></body> ```