How to declare a dependency on an R package from which you only use S3/S4 methods, but no exports?
Question
Currently my package DESCRIPTION declares a dependency on dbplyr:

    Imports:
        dbplyr,
        dplyr

dbplyr is useful almost solely because of the S3 methods it defines: https://github.com/tidyverse/dbplyr/blob/main/NAMESPACE. The actual functions you call to use dbplyr are almost entirely from dplyr.

By putting dbplyr in my Imports, it should automatically get loaded, but not attached, which should be enough to register its S3 methods: https://r-pkgs.org/dependencies-mindset-background.html#sec-dependencies-attach-vs-load.

This seems to work fine, but whenever I run R CMD check, it tells me:

    N  checking dependencies in R code (10.8s)
       Namespace in Imports field not imported from: ‘dbplyr’
         All declared Imports should be used.

Firstly, why does R CMD check even check this, considering that it often makes sense to load packages without importing them? Secondly, how am I supposed to satisfy R CMD check without loading things into my namespace that I don't want or need?
Answer 1
Score: 3
I am pretty sure two of your assumptions are false.

First, putting Imports: dbplyr into your DESCRIPTION file won't load it, so its methods won't be loaded from that alone. Basically, the Imports field in the DESCRIPTION file just guarantees that dbplyr is available to be loaded when requested. If you import something via the NAMESPACE file, that will cause it to be loaded. If you evaluate dbplyr::something, that will cause it to be loaded. Executing loadNamespace("dbplyr") is another way, and there are a few others. You may also load some other package that loads it.
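The difference between loading and attaching can be checked directly. The following is a minimal sketch, not taken from the original answer: it uses the splines package (which ships with R) as a stand-in for dbplyr, on the assumption that it is not already loaded in a fresh session.

```r
# "splines" ships with R and stands in for dbplyr here (an assumption
# made only so the sketch runs without installing anything).
pkg <- "splines"

loaded_before <- isNamespaceLoaded(pkg)

# Load the namespace without attaching the package to the search path.
invisible(loadNamespace(pkg))

loaded_after <- isNamespaceLoaded(pkg)
attached     <- paste0("package:", pkg) %in% search()

# S3 methods registered by the namespace should now be found by dispatch,
# e.g. the predict() method that splines registers for class "bs".
method_found <- !is.null(getS3method("predict", "bs", optional = TRUE))

print(c(loaded_before = loaded_before, loaded_after = loaded_after,
        attached = attached, method_found = method_found))
```

In a fresh session one would expect the namespace to be loaded afterwards but absent from search(), which is the state the question wants for dbplyr.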
Second, I think you have misinterpreted the message. It isn't saying that you loaded it without importing it (though it would complain about that too); it is saying that it can't detect any use of it in your package, so maybe it shouldn't be a requirement for installing your package.

Unfortunately, the code to detect uses is fallible, so it sometimes misses uses. Examples I've heard about are:
- if the package is only used in the default value for a function argument. This has been fixed in R-devel.
- if the package is only used during the build to construct some object, e.g. code like someclass <- R6::R6Class( ... ) needs R6, but the check code won't see it because it looks at someclass, not at the source code that created it.
- if the use of the package is hidden by specifying the name of the package in a character variable.
- if the need for the package is indirect, e.g. you need to use ggplot2::geom_hex. That needs the hexbin package, but ggplot2 only declares it as "Suggested".

These examples come from this discussion: https://github.com/hadley/r-pkgs/issues/828#issuecomment-1421353457.
The recommended workaround there is to create an object that refers to the imported package explicitly, e.g. putting the line

    dummy_r6 <- function() R6::R6Class

into your package is enough to suppress the note without actually loading R6. (It will be loaded if you ever call this function.)
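The reason the dummy function does not load anything is that R does not evaluate a function body until the function is called. Here is a small sketch of that behaviour, using stats4 (shipped with R) as a stand-in for R6; the name dummy_stats4 is hypothetical and only for illustration.

```r
# stats4 ships with R; it stands in for R6 or dbplyr in this sketch.
# Defining the function only stores the expression stats4::mle.
dummy_stats4 <- function() stats4::mle

# The definition above does not touch the namespace.
before_call <- isNamespaceLoaded("stats4")

# Calling the function evaluates stats4::mle, which loads the namespace.
result <- dummy_stats4()
after_call <- isNamespaceLoaded("stats4")

print(c(before_call = before_call, after_call = after_call))
```

If stats4 was not already loaded in the session, before_call should be FALSE and after_call TRUE, which matches the parenthetical remark above.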
However, your requirement is stronger: you do need to make sure dbplyr is loaded if you want its methods to be used. I'd put something in your .onLoad() function that triggers the load. For example,

    .onLoad <- function(lib, pkg) {
      # Make sure the dbplyr methods are loaded
      loadNamespace("dbplyr")
    }
EDITED TO ADD: As pointed out in the comments, there's a bug in the check code that means it won't detect this as being a use of dbplyr. You really need to do both things, e.g.

    .onLoad <- function(lib, pkg) {
      # Make sure the dbplyr methods are loaded
      loadNamespace("dbplyr")
      # Work around bug in code checking in R 4.2.2 for use of packages
      dummy <- function() dbplyr::across_apply_fns
    }

The function used in the dummy construction is arbitrary; it probably doesn't even need to exist, but I chose one that does.