Learning about MarkLogic CoRB processing

Question

I'm currently learning the MarkLogic database and looking into CoRB batch processing (running it on Windows).
I've created 10 sample documents in the Documents database, such as the one below:

/customer1~10.xml

<?xml version="1.0" encoding="UTF-8"?>
<customer>
  <name>Customer1</name>
  <address>123 Main St</address>
  <company>A Corporation</company>
  <emailAddress>customer1@mail.com</emailAddress>
</customer>

The <address> and <company> values are the same in the other 9 documents.

Now my goal is to go through all the /Customer*.xml documents with Uri.xqy
and change each <emailAddress> from customer1@mail.com to customer1@changed.mail.com.

I've been trying the queries below, but they don't return anything.

URI

xquery version "1.0-ml";

declare variable $target as xs:string external;

let $uris := cts:uris((), (), cts:directory-query($target, "infinity"))
return(
  fn:count($uris),
  $uris
)

PROCESS

xquery version "1.0-ml";

declare variable $URI as xs:string external;

let $docnew := cts:search(fn:doc(), cts:document-query(fn:tokenize($URI, ";")))
for $doc in $docnew
let $new-email := fn:replace($doc//emailAddress/text(), "customer.*@mail.com", "customer*@changed.mail.com")
return
  xdmp:node-replace($doc//emailAddress, $new-email)

I run them with the CoRB command below.
I've created an XDBC app server on port 9000 that uses "Documents" as its content database and "Modules" as its modules database.

CoRB command
java -cp marklogic-xcc-<version>.jar;corb.jar com.marklogic.developer.corb.ModuleExecutor ^
-DXCC-CONNECTION-URI=xcc://user:password@localhost:9000/Modules ^
-DXCC-MODULE=/process.xqy ^
-DXCC-MODULE-ROOT=C:/corb/ ^
-DURIS-MODULE=uris.xqy ^
-DPROCESS-TASK=com.example.CustomTask ^
-DTHREAD-COUNT=4

Answer 1

Score: 1

Before adding CoRB into the mix, validate that your logic works in Query Console (and ideally do this as the same user as your CoRB process, using an invoke function). There is no value in trying to automate bulk processing if you cannot validate that the logic works with the same user on a control document.

It is also helpful to log/trace so that you know what your code is doing.
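As a minimal sketch of what such logging could look like in Query Console: xdmp:log writes to the server error log, and xdmp:trace writes only when the named trace event is enabled for the group. The URI "/customer1.xml" and the event name "corb-email-fix" are assumptions made for illustration.

```xquery
xquery version "1.0-ml";

(: Assumption: "/customer1.xml" is one of the sample documents. :)
let $uri := "/customer1.xml"
return (
  (: Always written to the error log at "info" level. :)
  xdmp:log("processing URI: " || $uri, "info"),
  (: Written only if the hypothetical trace event "corb-email-fix" is enabled. :)
  xdmp:trace("corb-email-fix", "document exists: " || fn:doc-available($uri))
)
```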

Tab 1: validate your URIS process - does it return what you expect?

Tab 2: validate your PROCESS function. (Replace the external $URI with a sample control document.)
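For Tab 1, one way to check the URIS logic is to run it in Query Console with the external variable replaced by a literal. This is only a sketch: the directory "/" is an assumption about where the sample documents live, and cts:uris needs the URI lexicon enabled on the database.

```xquery
xquery version "1.0-ml";

(: Assumption: the sample documents live under the root directory "/". :)
let $target := "/"
let $uris := cts:uris((), (), cts:directory-query($target, "infinity"))
return (
  (: The count comes first, followed by the URIs, as in the question's module. :)
  fn:count($uris),
  $uris
)
```

If this returns a count of 0, the PROCESS module never gets anything to work on, whatever it contains.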

With your configuration, each call to the PROCESS module receives one URI, and up to 4 calls run at the same time (THREAD-COUNT=4).
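Given one URI per call, a PROCESS module does not need to tokenize or search. The following is a sketch under that assumption, not a tested drop-in; the element path /customer/emailAddress follows the sample document, and the rule "only addresses ending in @mail.com change" is an assumption about the intended result.

```xquery
xquery version "1.0-ml";

(: The URI for this call arrives in the external variable $URI. :)
declare variable $URI as xs:string external;

for $email in fn:doc($URI)/customer/emailAddress
let $old := fn:string($email)
(: Assumption: only addresses ending in "@mail.com" should change. :)
let $new := fn:replace($old, "@mail\.com$", "@changed.mail.com")
where $new ne $old
(: Replace the element with an element, not with a bare string. :)
return xdmp:node-replace($email, <emailAddress>{$new}</emailAddress>)
```

The where clause keeps the module from rewriting documents whose address already has the new form, so a rerun leaves them untouched.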

I think you will see that your code is not executing as expected. I suggest the following to get you started:

(:PROCESS:)
xquery version "1.0-ml";
xdmp:invoke-function(function(){
  (: control document; in CoRB this would be the external $URI variable :)
  let $URI := "some URI you know and trust"
  let $_ := xdmp:log("processing URI: " || $URI)
  let $doc := fn:doc($URI)
  (: capture the part before "@" and reuse it as $1, so customer1 stays customer1 :)
  let $new-email := fn:replace($doc//emailAddress/text(), "(customer.*)@mail\.com", "$1@changed.mail.com")
  let $_ := xdmp:log("new email address(proves we got the doc and element): " || $new-email)
  let $new-email-element := <emailAddress>{$new-email}</emailAddress>
  return xdmp:node-replace($doc/customer/emailAddress, $new-email-element)
}, map:entry("userId", xdmp:user("your-user-here")))

huangapple
  • Published on 2023-04-13 15:52:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76002959.html