Optimize insert operation LOB dblink

Question

I am trying to insert the rows of a table that is accessible via a db_link and contains a LOB column. However, the performance is pretty bad. I have tried using cursors and bulk collect, but they don't seem to work with LOBs in remote databases. Is there any other way to optimize it?

This is my query. It works without the last two filter conditions, since operations on the LOB column are not allowed in the insert query, but ideally we would like to include it within the same operation.

-- Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
-- PL/SQL Release 12.1.0.2.0 - Production
-- "CORE 12.1.0.2.0 Production"
-- TNS for Linux: Version 12.1.0.2.0 - Production
-- NLSRTL Version 12.1.0.2.0 - Production


INSERT INTO raw_data (
    "URL",
    tipo,
    fecha_evento,
    fecha_registro,
    usuario,
    cliente,
    contrato,
    referer,
    correlation_id,
    session_id,
    tracking_cookie,
    "JSON",
    "APPLICATION"
)
    SELECT
        r.*
    FROM
        schema.remote_table@dblink r
    WHERE
        fecha_evento BETWEEN trunc(SYSDATE - INTERVAL '1' MONTH, 'MONTH') AND trunc(SYSDATE, 'MONTH') - INTERVAL '1' DAY
        AND url IN (
            'a',
            'b',
            'c',
            'd'
        )
        AND NOT JSON_EXISTS ( r."JSON", '$.response.pasofin.codoferta' )
        AND "URL" NOT IN (
            'e',
            'f'
        );

Answer 1

Score: 1

LOB operations over dblinks have long been a problem, though they have improved in more recent versions. Internally, when you copy a LOB via INSERT ... SELECT or CTAS, Oracle loops for each row over internal dbms_lob calls (or something underneath them), incrementally fetching bytes into a buffer and writing them to the target LOB segment. Think of it as a nested loop: for each row, an inner loop runs as many times as it takes to assemble the LOB, and again for each additional LOB in the row. This is clearly far more work than a straight pull of a simple rowset. The buffer that each LOB piece is copied into has historically been quite small, which means many round trips, so network latency adds up and pulling LOBs over the link can be very slow. It would be nice to control the buffer size so that fewer round trips are needed, but unless there is an underscore parameter for this, I don't think we have any control over it. I have noticed that LOB movement has improved in recent versions, but I don't recall whether it had already improved in 12.1.
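
As a rough back-of-the-envelope model of why the round trips dominate (my own simplification, not something Oracle documents): let N be the number of rows, S the typical LOB size, B the internal piece buffer size and t_rtt the network round-trip time. All four symbols, and the numbers in the example, are assumptions for illustration only.

```latex
T_{\text{lob}} \approx N \cdot \left\lceil \frac{S}{B} \right\rceil \cdot t_{\mathrm{rtt}}
% Illustrative numbers only: N = 100000 rows, S = 64 KB, B = 8 KB, t_rtt = 1 ms
% T_lob \approx 100000 \cdot 8 \cdot 0.001\,\mathrm{s} = 800\,\mathrm{s}
```

The model ignores bandwidth and per-row fetch cost, but it shows why shrinking either the number of LOB pieces or the latency paid per piece is what the workarounds aim at.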

There are several workarounds to speed this up.

  1. Pull all rows with a LOB length of < 4000 bytes, casting them to varchar2(4000) so they are transferred over the link as varchar2. All uses of dbms_lob while doing this must be the remote @dblink versions, not the local ones. Because of character set mismatch issues you might need to make this varchar2(2000) or even varchar2(1000). This is very fast. Then pull the longer ones over as LOBs, which hopefully is far fewer rows. If most of your LOBs are less than 4K, this can really speed things up. If most of your LOBs are greater than 4K, however, you get no benefit.

  2. You can create a procedure on the remote database that takes a column name and a table name and reads the LOBs from all rows locally in a deterministic row order, converting them into a collection of varchar2(32767) records (this can then be converted to a BLOB and on to a collection of RAW, and LZ-compressed if desired, for example with utl_compress.lz_compress). Return the collection via an OUT parameter, along with another collection of byte offsets showing where each row's LOB starts. The calling database receives the output collections and reverses the process locally, reconstructing the original LOBs and writing them to the target. This is rather complicated, but it works (I've done it successfully) and it is a lot faster than native LOB movement over a dblink. Because of its complexity and its liability to bugs, however, it really isn't an approach I would highly recommend. But it is an option.

  3. Use multiple processes (e.g. via dbms_scheduler) to break up the table extract into roughly equal portions and tackle the problem with brute force. Reassemble the per-thread work tables (or better, partitions of one table) at the end to collect your final result. Ten threads will move those LOBs almost 10x faster than a single session can (you do have the overhead of writing the data twice, but that is very small compared to the benefit of parallelizing the network pull). This can be done either by itself or in combination with the other techniques mentioned above.
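
To make workarounds 1 and 3 more concrete, here are two sketches built on the query from the question. They are untested outlines under stated assumptions, not drop-in code: I assume "JSON" is the only LOB column, that it is a CLOB, and that the remote column names match the local ones (the question only shows r.*).

The first sketch is workaround 1. Short LOBs travel as VARCHAR2, and both dbms_lob calls go through the link so they run on the remote side.

```sql
-- Pass 1: rows whose LOB fits in a VARCHAR2. Lower the 4000 cutoff
-- (e.g. to 2000 or 1000) if the character sets differ or multibyte data is involved.
INSERT INTO raw_data (
    "URL", tipo, fecha_evento, fecha_registro, usuario, cliente, contrato,
    referer, correlation_id, session_id, tracking_cookie, "JSON", "APPLICATION"
)
SELECT
    r."URL", r.tipo, r.fecha_evento, r.fecha_registro, r.usuario, r.cliente,
    r.contrato, r.referer, r.correlation_id, r.session_id, r.tracking_cookie,
    dbms_lob.substr@dblink(r."JSON", 4000, 1),   -- remote call, returns VARCHAR2
    r."APPLICATION"
FROM
    schema.remote_table@dblink r
WHERE
    r.fecha_evento BETWEEN trunc(SYSDATE - INTERVAL '1' MONTH, 'MONTH')
                       AND trunc(SYSDATE, 'MONTH') - INTERVAL '1' DAY
    AND r."URL" IN ( 'a', 'b', 'c', 'd' )
    -- NVL keeps rows with a NULL LOB in this pass
    AND nvl(dbms_lob.getlength@dblink(r."JSON"), 0) <= 4000;

-- Pass 2: the remaining long LOBs are pulled the slow, native way.
INSERT INTO raw_data (
    "URL", tipo, fecha_evento, fecha_registro, usuario, cliente, contrato,
    referer, correlation_id, session_id, tracking_cookie, "JSON", "APPLICATION"
)
SELECT
    r.*
FROM
    schema.remote_table@dblink r
WHERE
    r.fecha_evento BETWEEN trunc(SYSDATE - INTERVAL '1' MONTH, 'MONTH')
                       AND trunc(SYSDATE, 'MONTH') - INTERVAL '1' DAY
    AND r."URL" IN ( 'a', 'b', 'c', 'd' )
    AND dbms_lob.getlength@dblink(r."JSON") > 4000;

COMMIT;
```

The second sketch is workaround 3. The procedure name pull_raw_data_slice, the job names, the choice of correlation_id as the split key and the bucket count of 10 are all hypothetical. Any column that is populated and reasonably evenly distributed would do as the split key.

```sql
-- Hypothetical helper: pulls one slice of the remote rows.
CREATE OR REPLACE PROCEDURE pull_raw_data_slice (
    p_bucket  IN PLS_INTEGER,   -- slice to pull, 0 .. p_buckets - 1
    p_buckets IN PLS_INTEGER    -- total number of slices
) AS
BEGIN
    INSERT INTO raw_data (
        "URL", tipo, fecha_evento, fecha_registro, usuario, cliente, contrato,
        referer, correlation_id, session_id, tracking_cookie, "JSON", "APPLICATION"
    )
    SELECT
        r.*
    FROM
        schema.remote_table@dblink r
    WHERE
        r.fecha_evento BETWEEN trunc(SYSDATE - INTERVAL '1' MONTH, 'MONTH')
                           AND trunc(SYSDATE, 'MONTH') - INTERVAL '1' DAY
        AND r."URL" IN ( 'a', 'b', 'c', 'd' )
        -- assumed split key; each row lands in exactly one bucket
        AND mod(ora_hash(r.correlation_id), p_buckets) = p_bucket;

    COMMIT;
END pull_raw_data_slice;
/

-- Launch one scheduler job per slice; the jobs run in separate sessions.
BEGIN
    FOR b IN 0 .. 9 LOOP
        dbms_scheduler.create_job(
            job_name   => 'PULL_RAW_DATA_' || b,
            job_type   => 'PLSQL_BLOCK',
            job_action => 'BEGIN pull_raw_data_slice(' || b || ', 10); END;',
            enabled    => TRUE
        );
    END LOOP;
END;
/
```

For brevity this version has every job insert straight into raw_data with conventional inserts. The variant described in point 3 instead writes each slice to its own work table or partition and reassembles them at the end, which costs a second write but keeps the slices isolated from each other.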

huangapple
  • Published on August 9, 2023, 16:14:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76865803.html