PowerQuery导入XML数据 – 将嵌套表格列合并为一个单元格

huangapple go评论63阅读模式
英文:

PowerQuery XML data import - combine nested table columns into one cell

问题

需要将包含表格的最后两列合并到一个单元格中,当前导入的格式如下:
标题,链接,描述,发布日期,guid,toExpand
str,str,str,str,Table,Table

我需要的格式是:
str,str,str,str,[str,str,..],[str,str,..]

提前感谢!

查询:

let
Source = Xml.Tables(Web.Contents("https://patentscope.wipo.int/search/en/92ec30a9-24ed-4383-b1aa-7b76748f6266/rss.xml")),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Attribute:version", Int64.Type}}),
channel = #"Changed Type"{0}[channel],
#"Changed Type1" = Table.TransformColumnTypes(channel,{{"title", type text}, {"link", type text}, {"description", type text}, {"language", type text}, {"pubDate", type datetime}, {"lastBuildDate", type datetime}}),
item = #"Changed Type1"{0}[item],
#"Changed Type2" = Table.TransformColumnTypes(item,{{"title", type text}, {"link", type text}, {"description", type text}, {"pubDate", type datetime}})
in
#"Changed Type2";

我尝试展开表格,但那只返回每一行作为一个表格。

英文:

I need the last 2 columns containing tables to be combined into a cell, currently the format of the import is:
title, link, description, pubDate, guid, toExpand
str, str, str, str, Table, Table

and I need
str, str, str, str, [str, str,..], [str, str, ..]

thanks in advance!

Query:

let
    Source = Xml.Tables(Web.Contents("https://patentscope.wipo.int/search/en/92ec30a9-24ed-4383-b1aa-7b76748f6266/rss.xml")),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Attribute:version", Int64.Type}}),
    channel = #"Changed Type"{0}[channel],
    #"Changed Type1" = Table.TransformColumnTypes(channel,{{"title", type text}, {"link", type text}, {"description", type text}, {"language", type text}, {"pubDate", type datetime}, {"lastBuildDate", type datetime}}),
    item = #"Changed Type1"{0}[item],
    #"Changed Type2" = Table.TransformColumnTypes(item,{{"title", type text}, {"link", type text}, {"description", type text}, {"pubDate", type datetime}})
in
    #"Changed Type2"

I tried expanding the tables but that returns just each row as a table

答案1

得分: 0

以下是代码部分的翻译:

let
    Source = Xml.Tables(Web.Contents("https://patentscope.wipo.int/search/en/92ec30a9-24ed-4383-b1aa-7b76748f6266/rss.xml")),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Attribute:version", Int64.Type}}),
    channel = #"Changed Type"{0}[channel],
    #"Changed Type1" = Table.TransformColumnTypes(channel,{{"title", type text}, {"link", type text}, {"description", type text}, {"language", type text}, {"pubDate", type datetime}, {"lastBuildDate", type datetime}}),
    item = #"Changed Type1"{0}[item],
    #"Changed Type2" = Table.TransformColumnTypes(item,{{"title", type text}, {"link", type text}, {"description", type text}, {"pubDate", type datetime}}),

    //假设只有一个GUID在“Table”中
    //如果可能有多个,需要一些代码更改
    #"Expand GUID" = Table.TransformColumns(#"Changed Type2",{
        {"guid", each _[#"Element:Text"]{0}}}),
    
    //提取创建者
    Creator = Table.AddColumn(#"Expand GUID", "Creator", each let 
            #"Creator" = try Table.Column([#"http://purl.org/dc/elements/1.1/"],"creator"){0} otherwise null ,

        //如果是表格则展开
            #"Is Table" = Value.Is(#"Creator",type table),
            Result = if #"Is Table" then Text.Combine(#"Creator"[#"Element:Text"]," ; ")
                        else #"Creator"
        in Result, type text),
    Contributor = Table.AddColumn(Creator, "Contributor", each let 
            #"Contributor" = try Table.Column([#"http://purl.org/dc/elements/1.1/"],"contributor"){0} otherwise null ,

        //如果是表格则展开
            #"Is Table" = Value.Is(#"Contributor",type table),
            Result = if #"Is Table" then Text.Combine(#"Contributor"[#"Element:Text"]," ; ")
                        else #"Contributor"
        in Result),
    Identifier = Table.AddColumn(Contributor, "Identifier", each let 
            #"Identifier" = try Table.Column([#"http://purl.org/dc/elements/1.1/"],"identifier"){0} otherwise null ,

        //如果是表格则展开
            #"Is Table" = Value.Is(#"Identifier",type table),
            Result = if #"Is Table" then Text.Combine(#"Identifier"[#"Element:Text"]," ; ")
                        else #"Identifier"
        in Result, type text),
    #"Removed Columns" = Table.RemoveColumns(Identifier,{#"http://purl.org/dc/elements/1.1/"})
in
    #"Removed Columns"

最后的列
PowerQuery导入XML数据 – 将嵌套表格列合并为一个单元格

英文:

Just expanding those tables won't work. You've got to "tease out" the relevant elements, and then combine them into a text string.

Perhaps something like this:

Note that I separated items with a semicolon ; but you might want to use a linefeed if you want the results to wrap in end result table

let
    Source = Xml.Tables(Web.Contents("https://patentscope.wipo.int/search/en/92ec30a9-24ed-4383-b1aa-7b76748f6266/rss.xml")),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Attribute:version", Int64.Type}}),
    channel = #"Changed Type"{0}[channel],
    #"Changed Type1" = Table.TransformColumnTypes(channel,{{"title", type text}, {"link", type text}, {"description", type text}, {"language", type text}, {"pubDate", type datetime}, {"lastBuildDate", type datetime}}),
    item = #"Changed Type1"{0}[item],
    #"Changed Type2" = Table.TransformColumnTypes(item,{{"title", type text}, {"link", type text}, {"description", type text}, {"pubDate", type datetime}}),

//assuming only a single GUID in the "Table"
//If there might be more than one, need some code changes
    #"Expand GUID" = Table.TransformColumns(#"Changed Type2",{
        {"guid", each _[#"Element:Text"]{0}}}),
    
//Extract Creator
    Creator = Table.AddColumn(#"Expand GUID", "Creator", each let 
            #"Creator" = try Table.Column([#"http://purl.org/dc/elements/1.1/"],"creator"){0} otherwise null ,

        //expand if table
            #"Is Table" = Value.Is(#"Creator",type table),
            Result = if #"Is Table" then Text.Combine(#"Creator"[#"Element:Text"],"; ")
                        else #"Creator"
        in Result, type text),
    Contributor = Table.AddColumn(Creator, "Contributor", each let 
            #"Contributor" = try Table.Column([#"http://purl.org/dc/elements/1.1/"],"contributor"){0} otherwise null ,

        //expand if table
            #"Is Table" = Value.Is(#"Contributor",type table),
            Result = if #"Is Table" then Text.Combine(#"Contributor"[#"Element:Text"],"; ")
                        else #"Contributor"
        in Result),
    Identifier = Table.AddColumn(Contributor, "Identifier", each let 
            #"Identifier" = try Table.Column([#"http://purl.org/dc/elements/1.1/"],"identifier"){0} otherwise null ,

        //expand if table
            #"Is Table" = Value.Is(#"Identifier",type table),
            Result = if #"Is Table" then Text.Combine(#"Identifier"[#"Element:Text"],"; ")
                        else #"Identifier"
        in Result, type text),
    #"Removed Columns" = Table.RemoveColumns(Identifier,{"http://purl.org/dc/elements/1.1/"})
in
    #"Removed Columns"

Last Columns<br>
PowerQuery导入XML数据 – 将嵌套表格列合并为一个单元格

huangapple
  • 本文由 发表于 2023年6月26日 07:18:44
  • 转载请务必保留本文链接:https://go.coder-hub.com/76552768.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定