获取参数化查询的结果集,使用R的`DBI`将其直接合并到数据库中。

huangapple go评论66阅读模式
英文:

Get the results sets of a parametrized query `rbind`ed *and* directly in the database using R's `DBI`

问题

使用package:DBI,我需要:

  1. 运行一个带有不同参数(即参数向量)的参数化查询;
  2. 获取连接的结果集(根据R术语的rbind或根据SQL术语的union进行连接);
  3. 并在数据库中获取生成的表以进行进一步操作。

dbBind()/dbGetquery()满足了要求1和2,但然后我需要使用dbWriteTable()将生成的数据帧写入数据库,这是低效的:

library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "iris", iris)

res <- dbGetQuery(con,
                  "select * from iris where Species = ?",
                  params = list(c("setosa", "versicolor")))

dbWriteTable(con, "mytable", res)

相反,dbExecute()满足要求3,但我认为它没有"rbind特性"。当然,这会引发错误,因为表将被覆盖:

dbExecute(con,
          "create table mytable as select * from iris where Species = ?",
          params = list(c("setosa", "versicolor")))

如何以最高效/推荐的方式进行操作?

备注:

  • 我不是数据库管理员,只能通过R访问数据库。
  • 我的示例过于简单,可以在单个查询中实现。我的用例确实需要多次使用不同参数运行参数化查询。
  • 我必须使用Oracle,但我对即使不适用于Oracle的解决方案也感兴趣。
英文:

Using package:DBI, I need to:

  1. run a parametrized query with different parameters (i.e. a vector of parameters);
  2. get the results sets concatenated (i.e. rbinded as per R terminology or unioned as per SQL terminology);
  3. and get the resulting table in the database for further manipulation.

dbBind()/dbGetquery() fullfils requirements 1 and 2, but I then need to write the resulting data frame to the database using dbWriteTable(), which is ineficient:

library(DBI)
con &lt;- dbConnect(RSQLite::SQLite(), &quot;:memory:&quot;)
dbWriteTable(con, &quot;iris&quot;, iris)

res &lt;- dbGetQuery(con,
                  &quot;select * from iris where Species = ?&quot;,
                  params = list(c(&quot;setosa&quot;, &quot;versicolor&quot;)))

dbWriteTable(con, &quot;mytable&quot;, res)

Conversely, dbExecute() fulfils requirement 3, but I don't think it has the "rbind feature". Of course, this throw an error because the table would get overwritten:

dbExecute(con,
          &quot;create table mytable as select * from iris where Species = ?&quot;,
          params = list(c(&quot;setosa&quot;, &quot;versicolor&quot;)))

What is the most efficient/recommended way of doing so?

Notes:

  • I am not the DBA and can only access the database through R.
  • My example is too trivial and could be achieved in a single query. My use case really requires a parametrized query to be run multiple times with different parameters.
  • I have to use Oracle, but I am interested in a solution even if it don't works with Oracle.

答案1

得分: 1

  1. 使用第一个参数创建表格,然后将其他参数插入其中。
library(RSQLite)

con <- dbConnect(SQLite())
dbWriteTable(con, "iris", iris)

parms <- c("setosa", "versicolor")

dbExecute(con, "create table mytable as
  select * from iris where Species = ?",
  params = parms[1])
for (p in parms[-1]) {
  dbExecute(con, "insert into mytable
    select * from iris where Species = ?",
    params = p)
}

# 检查
res <- dbGetQuery(con, "select * from mytable")
str(res)
  1. 交替生成执行所有操作的SQL语句。sqldf引入了RSQLite和gsubfn,后者提供了fn$,以实现文本替换。
library(sqldf)

con <- dbConnect(SQLite())
dbWriteTable(con, "iris", iris)

parms <- c("setosa", "versicolor")
parmString <- toString(sprintf("'%s'", parms))
fn$dbExecute(con, "create table mytable as
  select * from iris where Species in ($parmString)")

# 检查
res <- dbGetQuery(con, "select * from mytable")
str(res)
  1. (2)的变种是插入适当数量的问号。
library(sqldf)

con <- dbConnect(SQLite())
dbWriteTable(con, "iris", iris)

params <- list("setosa", "versicolor")
quesString <- toString(rep("?", length(params))

fn$dbExecute(con, "create table mytable as
  select * from iris where Species in ($quesString)", params = params)

# 检查
res <- dbGetQuery(con, "select * from mytable")
str(res)
英文:

1) Create the table with the first parameter and then insert each of the others into it.

library(RSQLite)

con &lt;- dbConnect(SQLite())
dbWriteTable(con, &quot;iris&quot;, iris)

parms &lt;- c(&quot;setosa&quot;, &quot;versicolor&quot;)

dbExecute(con, &quot;create table mytable as
  select * from iris where Species = ?&quot;,
  params = parms[1])
for (p in parms[-1]) {
  dbExecute(con, &quot;insert into mytable
    select * from iris where Species = ?&quot;,
    params = p)
}

# check
res &lt;- dbGetQuery(con, &quot;select * from mytable&quot;)
str(res)

2) Alternately generate the text of an SQL statement to do it all. sqldf pulls in RSQLite and gsubfn which supplies fn$ that enables the text substitution.

library(sqldf)

con &lt;- dbConnect(SQLite())
dbWriteTable(con, &quot;iris&quot;, iris)

parms &lt;- c(&quot;setosa&quot;, &quot;versicolor&quot;)
parmString &lt;- toString(sprintf(&quot;&#39;%s&#39;&quot;, parms))
fn$dbExecute(con, &quot;create table mytable as
  select * from iris where Species in ($parmString)&quot;)

# check
res &lt;- dbGetQuery(con, &quot;select * from mytable&quot;)
str(res)

3) A variation of (2) is to insert the appropriate number of question marks.

library(sqldf)

con &lt;- dbConnect(SQLite())
dbWriteTable(con, &quot;iris&quot;, iris)

params &lt;- list(&quot;setosa&quot;, &quot;versicolor&quot;)
quesString &lt;- toString(rep(&quot;?&quot;, length(params)))

fn$dbExecute(con, &quot;create table mytable as
  select * from iris where Species in ($quesString)&quot;, params = params)

# check
res &lt;- dbGetQuery(con, &quot;select * from mytable&quot;)
str(res)

答案2

得分: 0

根据 @r2evans 的评论和 @G.Grothendieck 的回答,我使用了一个参数化查询,直接将数据插入到表中。

首先,我创建了一个具有适当列的表来收集结果:

library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

create_table <-
"
CREATE TABLE
  warpbreaks2 (
    breaks real,
    wool text,
    tension text
);
"

dbExecute(con, create_table)

然后,我执行了一个 INSERT INTO 步骤:

dbWriteTable(con, "warpbreaks", warpbreaks)

insert_into <-
"
INSERT INTO
  warpbreaks2
SELECT
  warpbreaks.breaks,
  warpbreaks.wool,
  warpbreaks.tension
FROM
  warpbreaks
WHERE
  tension = ?;
"

dbExecute(con, insert_into, params = list(c("L", "M")))

这是一个示例用于说明的虚拟示例。可以更直接地实现,例如:

direct_query <-
"
CREATE TABLE
  warpbreaks3 AS
SELECT
  *
FROM
  warpbreaks
WHERE
  tension IN ('L', 'M');
"

dbExecute(con, direct_query)
英文:

Based on @r2evans comment and @G.Grothendieck answer, instead of query/download/combine/upload, I used a parameterized query that inserts directly into a table.

First, I created the table with the appropriate columns to collect the results:

library(DBI)
con &lt;- dbConnect(RSQLite::SQLite(), &quot;:memory:&quot;)

create_table &lt;-
&quot;
CREATE TABLE
  warpbreaks2 (
    breaks real,
    wool text,
    tension text
);
&quot;

dbExecute(con, create_table)

Then I executed an INSERT INTO step:

dbWriteTable(con, &quot;warpbreaks&quot;, warpbreaks)

insert_into &lt;-
&quot;
INSERT INTO
  warpbreaks2
SELECT
  warpbreaks.breaks,
  warpbreaks.wool,
  warpbreaks.tension
FROM
  warpbreaks
WHERE
  tension = ?;
&quot;

dbExecute(con, insert_into, params = list(c(&quot;L&quot;, &quot;M&quot;)))

This is a dummy example for illustration purpose. It could be achieve more directly with e.g.:

direct_query &lt;-
&quot;
CREATE TABLE
  warpbreaks3 AS
SELECT
  *
FROM
  warpbreaks
WHERE
  tension IN (&#39;L&#39;, &#39;M&#39;);
&quot;

dbExecute(con, direct_query )

huangapple
  • 本文由 发表于 2023年2月6日 02:53:01
  • 转载请务必保留本文链接:https://go.coder-hub.com/75354724.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定