Split the HTML tags of an article into separate columns using Apps Script in Google Sheets
Question
Using Apps Script, I need to break an article down into columns.
The first column should hold the `<h1>`, the second column the first `<h2>`, and the next column all the `<p>` paragraphs under that heading. The following column holds the second `<h2>`, the one after that its `<p>` paragraphs, and so on until the seventh `<h2>` is reached. If part of the article still remains at that point, I want to dump the rest of it into the last column. So for an article with more than seven `<h2>` headings, everything that remains goes into that last column.
I am using the Apps Script code below to do this.
But **this code has two issues.**
First, it doesn't put the `<p>` paragraphs that come after the last `<h2>` (`<h2>Conclusion</h2>`) of an article into the last column.
Second, the script throws this error on some articles:
TypeError: Cannot read properties of null (reading 'match')
(anonymous)	@ Code.gs:44
(anonymous)	@ Code.gs:43
myFunction	@ Code.gs:41
[Here][2] is the spreadsheet that contains the articles and the script. Row 3 shows the output I want, and Row 4 shows what the script actually produces. The only difference is that in Row 4 the last column is missing the `<p>` paragraphs that come after the last `<h2>` (`<h2>Conclusion</h2>`) of the article.
```
function onOpen() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var pasteSheet = [ {name: "Run Script", functionName: "myFunction"}];
  ss.addMenu("Break Down Article", pasteSheet);
}
function myFunction() {
  const sheetName = "Sheet1"; // Please set the sheet name.
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(sheetName);
  const range = sheet.getRange("A2:A" + sheet.getLastRow());
  const temp1 = range.getValues().map(([a]) =>
    a.split("\n").reduce((o, e, i, a) => {
      if (e) {
        let tagMatch = e.match(/^<(.*?)>/);
        let tag = tagMatch ? tagMatch[1].toUpperCase() : "";
        if (tag != "P") {
          if (o.temp.length > 0) {
            o.res.push(o.temp);
            o.temp = [];
          }
          o.res.push([e]);
        } else if (i == a.length - 1) {
          if (tag == "P") {
            o.temp.push(e);
            o.res.push(o.temp);
          } else {
            o.res.push([e]);
          }
        } else {
          o.temp.push(e);
        }
      }
      return o;
    }, { res: [], temp: [] }).res
  ).map(e => e.map(f => f.join("\n")));
  const maxLen = Math.max(...temp1.map(r => r.length));
  // I modified below script.
  const values = temp1.map(r => [...r, ...Array(maxLen - r.length).fill(null)]);
  const res = values.map(r => {
    let n = 0;
    const idx = r.map(e => {
      let tagMatch = e.match(/^<(.*?)>/);
      let tag = tagMatch ? tagMatch[1].toUpperCase() : "";
      return tag && tag.includes("H2") ? ++n : 0;
    }).indexOf(7);
    return [...r.splice(0, idx), r.join("\n")];
  });
  range.offset(0, 2, res.length, res[0].length).setValues(res);
}
```
Answer 1
Score: 2
Based on the sample Spreadsheet you provided, how about the following sample script?
In your script, each row was padded with null before e.match was called on its cells, which is what raises the TypeError. So I moved the step that adjusts the length of each row of the 2-dimensional array to run after the array processing, and I added a condition that checks the value returned by e.match(/^<(.*?)>/).
Sample script:
function myFunction() {
  const sheetName = "Sheet1"; // Please set the sheet name.
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(sheetName);
  const range = sheet.getRange("A3:A" + sheet.getLastRow());
  const temp1 = range.getValues().map(([a]) =>
    a.split("\n").reduce((o, e, i, a) => {
      let m = e.match(/^<(.*?)>/);
      let tag = "";
      if (m) {
        tag = m[1].toUpperCase();
      }
      if (tag != "P") {
        if (o.temp.length > 0) {
          o.res.push(o.temp);
          o.temp = [];
        }
        o.res.push([e]);
      } else if (i == a.length - 1) {
        if (tag == "P") {
          o.temp.push(e);
          o.res.push(o.temp);
        } else {
          o.res.push([e]);
        }
      } else {
        o.temp.push(e);
      }
      return o;
    }, { res: [], temp: [] }).res
  ).map(e => e.map(f => f.join("\n")));
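  const res = temp1.map(r => {
    let n = 0;
    // Number the cells whose text contains "H2" (1, 2, 3, ...) and locate the 7th one.
    const idx = r.map(e => e ? (e.toUpperCase().includes("H2") ? ++n : 0) : 0).indexOf(7);
    // Keep the cells before the 7th H2 (or all cells if there is none); join whatever is left into one last cell.
    return [...r.splice(0, idx > -1 ? idx : r.length), r.join("\n").trim()];
  });
  // Pad every row to the same length only after the processing above.
  const maxLen = Math.max(...res.map(r => r.length));
  const values = res.map(r => [...r, ...Array(maxLen - r.length).fill(null)]);
  range.offset(0, 2, values.length, values[0].length).setValues(values);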
}
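To see why the order of the padding step matters, here is a minimal sketch that can be run outside of Sheets. The sample row and the helper names `tagsUnguarded` and `tagsGuarded` are hypothetical and only illustrate the failure; they are not part of the script above.

```javascript
// Hypothetical sample row: real cells followed by null padding,
// like the padding produced by Array(n).fill(null).
const paddedRow = ["<h1>Title</h1>", "<h2>Intro</h2>", "<p>Text</p>", null, null];

// Unguarded: calling .match on a null cell throws a TypeError.
function tagsUnguarded(row) {
  return row.map(e => {
    const m = e.match(/^<(.*?)>/);
    return m ? m[1].toUpperCase() : "";
  });
}

// Guarded: null or empty cells simply yield an empty tag name.
function tagsGuarded(row) {
  return row.map(e => {
    const m = e ? e.match(/^<(.*?)>/) : null;
    return m ? m[1].toUpperCase() : "";
  });
}

let threw = false;
try {
  tagsUnguarded(paddedRow);
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw);                  // true
console.log(tagsGuarded(paddedRow)); // [ 'H1', 'H2', 'P', '', '' ]
```

Padding after the tag processing, as the sample script does, avoids the null cells altogether. The guard is an extra safety net for cells that are empty for other reasons.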
Note:
- Going by your statement "I have 5 sample values. That's all for now. I don't have any other.", this sample script is written for your 5 sample values. Please be careful about this.
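If the articles later vary more than the 5 samples, it may help to test the splitting logic as a plain function before wiring it to the sheet. The following is only a sketch under some assumptions: the function name `splitArticle`, the `maxH2` parameter, and the sample article are hypothetical, each tag is assumed to start its own line, and, unlike the script above, no empty trailing cell is added when there are fewer than `maxH2` headings.

```javascript
// Return the tag name at the start of a line, e.g. "H2", or "" if there is none.
function tagOf(line) {
  const m = line.match(/^<([a-z0-9]+)/i);
  return m ? m[1].toUpperCase() : "";
}

// Split one article (lines separated by "\n") into cells: each heading gets
// its own cell, consecutive <p> lines are grouped into one cell, and
// everything from the maxH2-th <h2> onward is joined into the last cell.
function splitArticle(html, maxH2 = 7) {
  const cells = [];
  let paragraphs = [];
  const flush = () => {
    if (paragraphs.length > 0) {
      cells.push(paragraphs.join("\n"));
      paragraphs = [];
    }
  };
  for (const line of html.split("\n")) {
    if (!line.trim()) continue; // skip blank lines, including a trailing one
    if (tagOf(line) === "P") {
      paragraphs.push(line);
    } else {
      flush();
      cells.push(line);
    }
  }
  flush(); // always emit the final paragraph group

  // Find the cell holding the maxH2-th <h2>, if there is one.
  let seen = 0;
  const cut = cells.findIndex(c => tagOf(c) === "H2" && ++seen === maxH2);
  if (cut === -1) return cells;
  return [...cells.slice(0, cut), cells.slice(cut).join("\n")];
}

// Hypothetical article that ends with a blank line.
const article = [
  "<h1>T</h1>", "<h2>A</h2>", "<p>a1</p>", "<p>a2</p>", "<h2>B</h2>", "<p>b1</p>", ""
].join("\n");

console.log(splitArticle(article, 2));
// [ '<h1>T</h1>', '<h2>A</h2>', '<p>a1</p>\n<p>a2</p>', '<h2>B</h2>\n<p>b1</p>' ]
```

Because the final paragraph group is flushed after the loop, the paragraphs after the last heading are kept even when the cell ends with a blank line. In Apps Script the rows returned by such a function would still need to be padded to a common length before calling setValues, as the sample script does.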