There are several ways to construct strings in R, such as paste(). However, when the string to be constructed is complex, using paste() can be a pain.
For example, suppose we want to put some parameters into the title of a plot.
region = c(1, 2)
value = 4
name = "name"
str = paste("region = (", region[1], ", ", region[2], "), value = ", value,
", name = '", name, "'", sep = "")
cat(str)
## region = (1, 2), value = 4, name = 'name'
As you can see, the code is hard to read and it is very easy to make mistakes. (Syntax highlighting may help to match brackets, but it is still quite annoying to see so many commas and quotes.)
In Perl, variable interpolation is the usual way to construct complex strings: variables start with special marks (sigils) and are replaced with their values. This package aims to implement variable interpolation in R. The idea is rather simple: use special marks to identify variables and then replace them with their values. The function here is qq(), which is named after the qq operator in Perl (it stands for "double quote"). Using variable interpolation, the example above can be written as:
library(GetoptLong)
str = qq("region = (@{region[1]}, @{region[2]}), value = @{value}, name = '@{name}'")
cat(str)
## region = (1, 2), value = 4, name = 'name'
Or use the shortcut function qqcat():
## region = (1, 2), value = 4, name = 'name'
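As a sketch, the line above can be produced by passing the same template to qqcat(), which interpolates and prints in one step. The trailing newline is an addition for tidier console output and is not required.

```r
library(GetoptLong)

region = c(1, 2)
value = 4
name = "name"

# qqcat() interpolates the template and prints the result directly
qqcat("region = (@{region[1]}, @{region[2]}), value = @{value}, name = '@{name}'\n")
```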
One feature of qqcat() is that you can set a global prefix for the messages with qq.options("cat_prefix"), either as a string or as a function. If it is set as a function, the prefix is generated at call time by executing the function.
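qq.options("cat_prefix" = "[INFO] ")  # a fixed string as the global prefix
qqcat("This is a message")
## [INFO] This is a message
qq.options("cat_prefix" = function() format(Sys.time(), "[%Y-%m-%d %H:%M:%S] "))
qqcat("This is a message")
## [2024-10-24 02:57:56] This is a message
Sys.sleep(2)  # wait so that the time stamp changes
qqcat("This is a message after 2 seconds")
## [2024-10-24 02:57:58] This is a message after 2 seconds
qq.options(RESET = TRUE)  # back to the defaults, i.e. no prefix
qqcat("This is a message")
## This is a message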
You can turn off all messages produced by qqcat() with qq.options("cat_verbose" = FALSE).
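A minimal sketch of switching the messages off and back on. It assumes that setting cat_verbose back to TRUE restores normal printing; the message texts are illustrative.

```r
library(GetoptLong)

# with cat_verbose switched off, qqcat() prints nothing
qq.options("cat_verbose" = FALSE)
qqcat("This message is not printed\n")

# switch printing back on
qq.options("cat_verbose" = TRUE)
qqcat("This message is printed again\n")
```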
You can also set a prefix that only takes effect for a single call.
qq.options(RESET = TRUE)
qq.options("cat_prefix" = "[DEBUG] ")
qqcat("This is a message", cat_prefix = "[INFO] ")
## [INFO] This is a message
qqcat("This is a message")
## [DEBUG] This is a message
From version 1.1.2, qq.options() can work in a local mode, in which a copy of the options only takes effect within a local chunk.
qq.options("cat_prefix" = "[DEBUG] ")
qq.options(LOCAL = TRUE)
qq.options("cat_prefix" = "[INFO] ")
qqcat("This is the first message")
## [INFO] This is the first message
## [INFO] This is the second message
## [DEBUG] This is the third message
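One way to confine the [INFO] prefix to a chunk is to switch to local mode inside a function. The sketch below assumes that local mode started inside a function ends when that function returns, and that qqcat() calls made within it see the local copy; the wrapper f() and the trailing newlines are illustrative.

```r
library(GetoptLong)

qq.options("cat_prefix" = "[DEBUG] ")

# f() is a hypothetical wrapper used only to delimit the local chunk
f = function() {
    qq.options(LOCAL = TRUE)              # work on a local copy of the options
    qq.options("cat_prefix" = "[INFO] ")  # assumed to change only the local copy
    qqcat("This is the first message\n")
    qqcat("This is the second message\n")
}
f()

# outside f(), the global prefix is expected to apply again
qqcat("This is the third message\n")
```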
Reset the options so that they do not affect the example code in the rest of the vignette.
qq.options(RESET = TRUE)
Not only simple scalars but also pieces of code can be interpolated:
## There is 1 dog.
## There are 2 dogs.
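For instance, the two lines above can be produced by a template that embeds ifelse() calls. This is a sketch; the variable name n is illustrative.

```r
library(GetoptLong)

# each @{...} holds an R expression that is evaluated and pasted into the text
n = 1
qqcat("There @{ifelse(n == 1, 'is', 'are')} @{n} dog@{ifelse(n == 1, '', 's')}.\n")

n = 2
qqcat("There @{ifelse(n == 1, 'is', 'are')} @{n} dog@{ifelse(n == 1, '', 's')}.\n")
```

Note that the embedded code must not contain a closing brace, since the default pattern stops at the first one.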
If the text is too long, it can be wrapped into lines.
qq.options("cat_strwrap" = TRUE)
qqcat("one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty.")
## one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen,
## fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty.
There can be multiple templates:
## command -a 1 \
## -b 2 \
## -c 3
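Output of this shape can be obtained by passing several templates to qq() and joining them with a custom separator. The sketch assumes that the sep argument of qq() is the separator placed between templates; the variable names are illustrative, and any indentation of the continuation lines would have to be added to the separator.

```r
library(GetoptLong)

first = 1
second = 2
third = 3

# the templates are joined with " \<newline>" before interpolation
cmd = qq("command -a @{first}",
         "-b @{second}",
         "-c @{third}",
         sep = " \\\n")
cat(cmd, "\n")
```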
NOTE: qq is a short name that other packages may also use (e.g., lattice has a qq() function as well). If there is such a conflict, you can make sure that qq() in your working environment refers to the function in GetoptLong by:
qq = GetoptLong::qq
In the examples above, @{} is used to mark variables. The variable names are then extracted from these marks and replaced with their values.
The code pattern can take any form, but it should be easy to distinguish from the rest of the string. You can set the code pattern with the code.pattern argument of qq(). The default pattern is @\\{CODE\\}, because CODE is only permitted to return simple vectors and @ is the sigil that represents arrays in Perl.
In the following example, the code pattern is #{}.
## x = 1
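A call along these lines gives the output above. It is a sketch in which the pattern is passed per call through the code.pattern argument.

```r
library(GetoptLong)

x = 1

# "#\\{CODE\\}" makes #{...} the marker for this call only
qqcat("x = #{x}\n", code.pattern = "#\\{CODE\\}")
```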
Or set it in qq.options() as a global setting:
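A sketch of the global form, assuming the option is named code.pattern like the argument. The options are reset afterwards so that later calls use the default pattern again.

```r
library(GetoptLong)

x = 1

# every qq()/qqcat() call now uses #{...} unless told otherwise
qq.options("code.pattern" = "#\\{CODE\\}")
qqcat("x = #{x}\n")

# restore the default @{...} pattern
qq.options(RESET = TRUE)
```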
As you can guess, CODE in @\\{CODE\\} is replaced with .*? to construct a regular expression that matches the variable names in the string. So if your code.pattern contains special characters, make sure to escape them. Some candidate values for code.pattern are:
code.pattern = "@\\{CODE\\}" # default style
code.pattern = "@\\[CODE\\]"
code.pattern = "@\\(CODE\\)"
code.pattern = "%\\{CODE\\}"
code.pattern = "%\\[CODE\\]"
code.pattern = "%\\(CODE\\)"
code.pattern = "\\$\\{CODE\\}"
code.pattern = "\\$\\[CODE\\]"
code.pattern = "\\$\\(CODE\\)"
code.pattern = "#\\{CODE\\}"
code.pattern = "#\\[CODE\\]"
code.pattern = "#\\(CODE\\)"
code.pattern = "\\[%CODE%\\]" # Template Toolkit (Perl module) style :)
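The matching step described above can be imitated in base R. This sketch only illustrates the idea and is not the package's actual implementation; the variable names are illustrative.

```r
code.pattern = "@\\{CODE\\}"

# turn the pattern into a regular expression with a lazy wildcard
regex = gsub("CODE", ".*?", code.pattern, fixed = TRUE)

text = "region = (@{region[1]}, @{region[2]})"

# all marked pieces in the text
marks = regmatches(text, gregexpr(regex, text, perl = TRUE))[[1]]

# drop the leading "@{" and the trailing "}" to obtain the code itself
codes = substr(marks, 3, nchar(marks) - 1)

print(marks)
print(codes)
```

Because the wildcard is lazy, each match ends at the first closing delimiter it meets.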
Since CODE is simply replaced with .*?, the function only matches up to the first closing parenthesis or bracket. (In Perl, I always use a recursive regular expression to extract such paired parentheses, but this seems difficult in R.) For example, if you use @\\[CODE\\] and your string is "@[a[1]]", qq() fails to extract the correct variable name and extracts only a[1, which then generates an error when a[1 is executed. In such cases, use a pattern style that does not contain [].
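The failure can be reproduced with base R alone. As before, this is an illustration of the lazy match and not the package's own code.

```r
regex = gsub("CODE", ".*?", "@\\[CODE\\]", fixed = TRUE)
text = "@[a[1]]"

# the lazy match stops at the first "]"
matched = regmatches(text, regexpr(regex, text, perl = TRUE))
code = substr(matched, 3, nchar(matched) - 1)
print(code)

# the truncated code is not valid R, so parsing it fails
result = tryCatch(parse(text = code), error = function(e) "parse error")
print(result)
```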
Finally, I suggest a safer code pattern style with which you do not need to worry about nested parentheses or brackets:
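One candidate that fits this description is a pair of backticks, which rarely appear inside ordinary R expressions. That this is the intended style is an assumption, and the variable below is illustrative.

```r
library(GetoptLong)

a = c(10, 20)

# brackets inside the code no longer clash with the delimiters
str = qq("first element: `a[1]`", code.pattern = "`CODE`")
cat(str, "\n")
```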
qq() first looks up variables in the invoking environment, then along the search path. Users can also pass the values of variables as a list:
## x = a, y = b
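A sketch of a call that gives the output above, assuming the list is passed through the envir argument. The global x and y are defined only to show that they are not used.

```r
library(GetoptLong)

x = 1
y = 2

# values are taken from the list, not from the global x and y
qqcat("x = @{x}, y = @{y}\n", envir = list(x = "a", y = "b"))
```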
If variables are passed as a list, qq() only looks them up in that list.
qq() only allows the interpolated code to return vectors. The whole string is interpolated once for each element of the longest vector (shorter vectors are recycled), and the results are finally concatenated into a single long string.
## 1 is an odd number. 2 is an even number. 3 is an odd number. 4 is an even number. 5 is an
## odd number. 6 is an even number.
## 1, a, A 2, b, B 3, a, C 4, b, D 5, a, E 6, b, A
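Calls of the following shape give repeated output like the lines above. This is a sketch with illustrative variable names; with the trailing newline and without line wrapping, each repetition is printed on its own line.

```r
library(GetoptLong)

x = 1:6
# the template is repeated once for every element of x
qqcat("@{x} is an @{ifelse(x %% 2, 'odd', 'even')} number.\n")

# y and z are shorter than x and are recycled
y = c("a", "b")
z = c("A", "B", "C", "D", "E")
qqcat("@{x}, @{y}, @{z}\n")
```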
This feature is especially useful for generating reports, for example the rows of an HTML table:
## <tr><td>a</td><td>1</td><tr> <tr><td>b</td><td>2</td><tr> <tr><td>c</td><td>3</td><tr>
## <tr><td>d</td><td>4</td><tr>
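A sketch of how such rows can be generated from two parallel vectors. The variable names are illustrative, and each row is closed with </tr> here.

```r
library(GetoptLong)

name = letters[1:4]
value = 1:4

# one table row per element of name and value
rows = qq("<tr><td>@{name}</td><td>@{value}</td></tr>", collapse = FALSE)
cat(rows, sep = "\n")
```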
The returned value can also be kept as a vector instead of being collapsed into one string:
## [1] 6
## [1] "1, a, A" "2, b, B" "3, a, C" "4, b, D" "5, a, E" "6, b, A"
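The two results above are consistent with a call that sets collapse = FALSE. In this sketch the vectors x, y and z are illustrative choices that reproduce the printed values.

```r
library(GetoptLong)

x = 1:6
y = c("a", "b")
z = c("A", "B", "C", "D", "E")

# keep one string per repetition instead of pasting them together
str = qq("@{x}, @{y}, @{z}", collapse = FALSE)
length(str)
str
```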
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
## [4] LC_COLLATE=C LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] markdown_1.13 GetoptLong_1.1.0 knitr_1.48
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.37 R6_2.5.1 GlobalOptions_0.1.2 fastmap_1.2.0
## [5] xfun_0.48 rjson_0.2.23 maketools_1.3.1 cachem_1.1.0
## [9] htmltools_0.5.8.1 rmarkdown_2.28 buildtools_1.0.0 lifecycle_1.0.4
## [13] cli_3.6.3 sass_0.4.9 jquerylib_0.1.4 compiler_4.4.1
## [17] highr_0.11 sys_3.4.3 tools_4.4.1 evaluate_1.0.1
## [21] bslib_0.8.0 yaml_2.3.10 crayon_1.5.3 jsonlite_1.8.9
## [25] rlang_1.1.4