将数据加载到Oracle数据库中,通过扁平文件。

huangapple go评论57阅读模式
英文:

Load data to Oracle db through flat file

问题

有没有一种方法可以将数据从一个扁平文件加载到Oracle表中。我正在使用以下Python代码,但文件太大,脚本在一段时间后停止(由于丢失的数据库连接)。

from tqdm import tqdm

insert_sty = "insert into MRSTY (CUI,TUI,STN,STY,ATUI,CVF) values (:0,:1,:2,:3,:4,:5)"
records=[]

file_path = "../umls_files/umls-2023AA-metathesaurus-full/2023AA/META/MRSTY.RRF"
num_lines = sum(1 for line in open(file_path))

with open(file_path, 'r') as f:
    for line in tqdm(f, total=num_lines, desc="Processing file"):
        line = line.strip()
        records.append(line.split("|"))

    for sublist in records:  
        if sublist:
            sublist.pop()
    
for i in tqdm(records, desc="Inserting records"):
    try:
        cur.execute(insert_sty,i)
        print ("record inserted")       
    except Exception as e:
        print (i)
        print("Error: ",str(e))
        
conn.commit()

希望这可以帮助您加载数据到Oracle表中。

英文:

Is there a way to load data from a flat file to an oracle table. I am using following python code but the file is too big and the script stops after sometime (due to lost db connection).

from tqdm import tqdm

insert_sty = "insert into MRSTY (CUI,TUI,STN,STY,ATUI,CVF) values (:0,:1,:2,:3,:4,:5)"
records=[]

file_path = "../umls_files/umls-2023AA-metathesaurus-full/2023AA/META/MRSTY.RRF"
num_lines = sum(1 for line in open(file_path))

with open(file_path, 'r') as f:
    for line in tqdm(f, total=num_lines, desc="Processing file"):
        line = line.strip()
        records.append(line.split("|"))

    for sublist in records:  
        if sublist:
            sublist.pop()
    
for i in tqdm(records, desc="Inserting records"):
    try:
        cur.execute(insert_sty,i)
        print ("record inserted")       
    except Exception as e:
        print (i)
        print("Error: ",str(e))
        
conn.commit()

答案1

得分: 1

使用Oracle时,我建议使用SQL*Loader。它是一个可以非常快速加载数据的实用程序。

简而言之,它的工作原理如下:

  • 你有一个要加载的文件
  • 如果你的计算机上安装了Oracle数据库(或安装了Oracle客户端),你就有了SQL*Loader实用程序
  • 创建一个控制文件(按照我提供的链接中描述的说明进行操作)
    • 它会告诉实用程序在哪里找到源数据,将数据加载到哪个数据库表中,是否有其他选项需要包括等等。
  • 运行它
    • 通过在操作系统命令提示符下调用sqlldr可执行文件来完成
    • 这可能涉及到Python - 只是为了调用实用程序
  • 等待直到完成;查看日志文件以获取更多信息

你可以在这里查看一个简短的概述。

附言:我忘了提到:如果你发布了表格描述并提供了几行示例数据,我们可以帮助你编写控制文件,并演示它的工作原理。

英文:

As you use Oracle, I'd suggest <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/sutil/oracle-sql-loader.html">SQL*Loader</a>. It is utility which loads data really, really fast.

In a few words, how it works:

  • you have a file to be loaded
  • if you have Oracle database installed on your PC (or installed Oracle Client), you have SQL*Loader utility
  • create a control file (by following instructions described in link I posted)
    • it instructs the utility where to find source data, which database table is the target, are there any other options you want to include, etc.
  • run it
    • it is done by calling the sqlldr executable at the operating system command prompt
    • this is where Python might be involved - just to call the utility
  • wait until it is over; check the log file for more info

Have a look at short overview <a href="https://www.oracletutorial.com/oracle-administration/oracle-sqlloader/">here</a>.

P.S. Forgot to mention: if you posted table description and provided several rows of sample data, we could've helped with composing the control file and demonstrate how it would've worked.

huangapple
  • 本文由 发表于 2023年5月29日 13:50:34
  • 转载请务必保留本文链接:https://go.coder-hub.com/76354939.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定