英文:
How to load an encrypted sqlite file in RStudio?
问题
问题: 基本上,我可以加载和解密文件,但然后我不知道如何连接到sqlite数据库并对其进行查询。
问题: 你能否提出一种聪明而优雅的方法来实现我的目标,请?
英文:
DESCRIPTION: I am using RStudio and I have a zip file called databse.zip
. The file is encrypted by a password. My goal is to be able to load the file, decrypt it, extract the sqlite database inside (database.db
) and be able to make a query to it.
So far I tried the following:
# Load required packages
library(dotenv)
library(Hmisc)
library(RSQLite)
# Get password
password = Sys.getenv("PASSWORD")
# Load .zip file and decrypt it
test = getZip("databse.zip", password=password)
# Connect to the SQLite database
con = dbConnect(RSQLite::SQLite(), test) # ERROR! Error in path.expand(path) : invalid 'path' argument
# Get list of items from table in the database
my_data = dbGetQuery(con, "SELECT column_name FROM table") # I do not even reach this point
PROBLEM: Basically, I am able to load and decrypt the file but then I have no idea how to connect to the sqlite database and make a query to it.
QUESTION: Would you be able to suggest a smart and elegant way to achieve my goal please? Feel free to use different packages if needed.
答案1
得分: 1
getZip()
用于从 zip 存档中流式传输单个文件的内容,而不需要首先将其存储在磁盘上,它返回连接对象。这将适用于像 zipped csv 文件这样的扁平文件,在使用 read.csv()
时也适用。对于 DBI / RSQLite,通常需要在您的磁盘上拥有一个文件。
由于 R 不能本地处理受密码保护的 zip 存档,因此我们使用 7z,尽管任何其他合适的命令行提取工具也可以使用(例如,在 Windows 中使用 rtools 的 unzip)。
在这里,我们首先创建受密码保护的 zip 存档以创建可重现的示例,然后将其提取到 R 会话临时目录中的 extracted/
并连接到提取的 SQLite 数据库。
library(DBI)
library(glue)
# 7z 可执行文件
sevenz <- "C:/Program Files/7-Zip/7z.exe"
archive_pw <- "pw123"
tmp_dir <- tempdir()
# 在文件管理器中打开临时文件夹:
# browseURL(tmp_dir)
# 准备 reprex ----------------------------------------------------------
# 将 mtcars 存储在 sqlite 中,创建受密码保护的 zip 存档
sqlite_f <- file.path(tmp_dir, "mtcars.sqlite")
con <- dbConnect(RSQLite::SQLite(), sqlite_f)
data("mtcars")
dbWriteTable(con, "mtcars", mtcars)
dbDisconnect(con)
# 用双引号处理路径中的空格
system(glue('"{sevenz}" a "{sqlite_f}.zip" "{sqlite_f}" -p"{archive_pw}"' ))
#> [1] 0
# 提取并加载受密码保护的 zip 存档中的 sqlite 数据库 ------------------
dest <- file.path(tmp_dir, "extracted")
system(glue('"{sevenz}" e "{sqlite_f}.zip" -o"{dest}" -p"{archive_pw}"' ))
#> [1] 0
con <- dbConnect(RSQLite::SQLite(), file.path(dest, "mtcars.sqlite"))
dbGetQuery(con, "SELECT * FROM mtcars LIMIT 5")
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
dbDisconnect(con)
创建于 2023-01-10,使用 reprex v2.0.2
英文:
getZip()
is used to stream a content of a single file form zip archive without storing it on a disk first, it returns connection object. This would work with flat files like zipped csv-s while using it with e.g. read.csv()
. For DBI / RSQLite you (generally) need to have a file on your disk.
As R can't handle password-protected zip archives natively, we use 7z, though any other suitable command line extractor would work as well (e.g. unzip from rtools in case of Win).
Here we first create password-protected zip arhive to create a reproducible example, then extract it to extracted/
in R session tmp. directory and connect to extracted SQLite database.
library(DBI)
library(glue)
# 7z executable
sevenz <- "C:/Program Files/7-Zip/7z.exe"
archive_pw <- "pw123"
tmp_dir <- tempdir()
# open temp folder in file manager :
# browseURL(tmp_dir)
# prepare reprex ----------------------------------------------------------
# store mtcars in sqlite, create pasword-protected zip archive
sqlite_f <- file.path(tmp_dir, "mtcars.sqlite")
con <- dbConnect(RSQLite::SQLite(), sqlite_f)
data("mtcars")
dbWriteTable(con, "mtcars", mtcars)
dbDisconnect(con)
# double-quotes to handle spaces in paths
system(glue('"{sevenz}" a "{sqlite_f}.zip" "{sqlite_f}" -p"{archive_pw}"' ))
#> [1] 0
# extract and load sqlite DB from password-protected zip ------------------
dest <- file.path(tmp_dir, "extracted")
system(glue('"{sevenz}" e "{sqlite_f}.zip" -o"{dest}" -p"{archive_pw}"' ))
#> [1] 0
con <- dbConnect(RSQLite::SQLite(), file.path(dest, "mtcars.sqlite"))
dbGetQuery(con, "SELECT * FROM mtcars LIMIT 5")
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
dbDisconnect(con)
<sup>Created on 2023-01-10 with reprex v2.0.2</sup>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论