Divide the HTML tags of an article into separate columns using Apps Script in Google Sheets
Question
Using Apps Script, I need to break an article down into columns.
The first column should hold the `<h1>`, the second column the first `<h2>`, and the next column all the `<p>` paragraphs under that heading. The following column holds the second `<h2>`, the one after it holds all of its `<p>` paragraphs, and so on up to the seventh `<h2>`. If the article still continues at that point, I want to dump the rest of it into the last column. So for an article with more than seven `<h2>` headings, everything remaining goes into that last column.
I am using the Apps Script code below to get the job done, but **this code has two issues.**
First, it doesn't dump the `<p>` paragraphs that come after the last `<h2>` (`<h2>Conclusion</h2>`) of an article into the last column.
Second, the script throws this error on some articles:

    TypeError: Cannot read properties of null (reading 'match')
    (anonymous) @ Code.gs:44
    (anonymous) @ Code.gs:43
    myFunction @ Code.gs:41

[Here][2] is the spreadsheet link, which has the articles and the script. Row 3 shows the output I want; Row 4 shows the output the script actually produces. The only difference between them is that Row 4 lacks, in the last column, the `<p>` paragraphs that come after the last `<h2>` (`<h2>Conclusion</h2>`) of the article.
```
function onOpen() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var pasteSheet = [{ name: "Run Script", functionName: "myFunction" }];
  ss.addMenu("Break Down Article", pasteSheet);
}

function myFunction() {
  const sheetName = "Sheet1"; // Please set the sheet name.
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(sheetName);
  const range = sheet.getRange("A2:A" + sheet.getLastRow());
  const temp1 = range.getValues().map(([a]) =>
    a.split("\n").reduce((o, e, i, a) => {
      if (e) {
        let tagMatch = e.match(/^<(.*?)>/);
        let tag = tagMatch ? tagMatch[1].toUpperCase() : "";
        if (tag != "P") {
          if (o.temp.length > 0) {
            o.res.push(o.temp);
            o.temp = [];
          }
          o.res.push([e]);
        } else if (i == a.length - 1) {
          if (tag == "P") {
            o.temp.push(e);
            o.res.push(o.temp);
          } else {
            o.res.push([e]);
          }
        } else {
          o.temp.push(e);
        }
      }
      return o;
    }, { res: [], temp: [] }).res
  ).map(e => e.map(f => f.join("\n")));
  const maxLen = Math.max(...temp1.map(r => r.length));
  // I modified below script.
  const values = temp1.map(r => [...r, ...Array(maxLen - r.length).fill(null)]);
  const res = values.map(r => {
    let n = 0;
    const idx = r.map(e => {
      let tagMatch = e.match(/^<(.*?)>/);
      let tag = tagMatch ? tagMatch[1].toUpperCase() : "";
      return tag && tag.includes("H2") ? ++n : 0;
    }).indexOf(7);
    return [...r.splice(0, idx), r.join("\n")];
  });
  range.offset(0, 2, res.length, res[0].length).setValues(res);
}
```
Answer 1
Score: 2
Based on the sample Spreadsheet you provided, how about the following sample script?
Based on your sample values, I moved the step that pads each row of the 2-dimensional array to a common length so that it runs after the array processing. I also added a check on the value returned by `e.match(/^<(.*?)>/)`, which is `null` when a line does not start with a tag.
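To illustrate why the order matters: in the script from the question, the rows are padded with `null` before the pass that looks for `<h2>` tags, so `e.match` ends up being called on `null` whenever one article produces fewer cells than another. The following is only a minimal sketch of that failure and of a guard against it. The names `padded`, `tagOfUnsafe`, and `tagOf` are hypothetical and do not appear in either script.

```javascript
// Hypothetical row, shaped like one row of `values` in the question after padding.
const padded = ["<h1>Title</h1>", "<h2>Intro</h2>", null];

// Unguarded lookup, as in the question: fails on the padded null cell.
function tagOfUnsafe(cell) {
  const m = cell.match(/^<(.*?)>/);
  return m ? m[1].toUpperCase() : "";
}

// Guarded lookup: null, empty, and untagged cells all yield "".
function tagOf(cell) {
  const m = typeof cell === "string" ? cell.match(/^<(.*?)>/) : null;
  return m ? m[1].toUpperCase() : "";
}

let caught = null;
try {
  padded.map(tagOfUnsafe);
} catch (err) {
  caught = err; // TypeError, because null has no match method
}

const tags = padded.map(tagOf); // ["H1", "H2", ""]
```

Either change alone should avoid the error on such rows: padding after the tag pass means no `null` cell is ever inspected, and the guard makes inspecting one harmless.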
Sample script:

    function myFunction() {
      const sheetName = "Sheet1"; // Please set the sheet name.
      const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(sheetName);
      const range = sheet.getRange("A3:A" + sheet.getLastRow());

      // Split each article into cells: every non-<p> line gets its own cell,
      // and consecutive <p> lines are grouped into one cell.
      const temp1 = range.getValues().map(([a]) =>
        a.split("\n").reduce((o, e, i, a) => {
          // m is null when the line does not start with a tag.
          let m = e.match(/^<(.*?)>/);
          let tag = "";
          if (m) {
            tag = m[1].toUpperCase();
          }
          if (tag != "P") {
            // A non-paragraph line closes the current paragraph group.
            if (o.temp.length > 0) {
              o.res.push(o.temp);
              o.temp = [];
            }
            o.res.push([e]);
          } else if (i == a.length - 1) {
            // The last line is a paragraph: add it and flush the group.
            o.temp.push(e);
            o.res.push(o.temp);
          } else {
            o.temp.push(e);
          }
          return o;
        }, { res: [], temp: [] }).res
      ).map(e => e.map(f => f.join("\n")));

      // idx is the index of the 7th cell whose text contains "H2" (case-insensitive).
      // Cells before idx are kept; the rest are merged into one last cell.
      const res = temp1.map(r => {
        let n = 0;
        const idx = r.map(e => e ? (e.toUpperCase().includes("H2") ? ++n : 0) : 0).indexOf(7);
        return [...r.splice(0, idx > -1 ? idx : r.length), r.join("\n").trim()];
      });

      // Pad the rows to a common length only after the processing above.
      const maxLen = Math.max(...res.map(r => r.length));
      const values = res.map(r => [...r, ...Array(maxLen - r.length).fill(null)]);
      range.offset(0, 2, values.length, values[0].length).setValues(values);
    }

Note:
- You said "I have 5 sample values. That's all for now. I don't have any other.", so this sample script is written for your 5 sample values. Please be careful about this.
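
For values beyond those 5 samples, it may help to test the splitting logic outside the spreadsheet first. The following is only a sketch under stated assumptions, not the sample script itself: `splitArticle`, `leadingTag`, and the `maxH2` parameter are hypothetical names, each HTML element is assumed to sit on its own line, and blank lines are dropped. It flushes pending paragraphs once after the loop instead of relying on the last line's index, so trailing `<p>` lines are kept even when the cell ends with a line break.

```javascript
// Returns the upper-cased name of the tag a line starts with, or "" if there is none.
function leadingTag(line) {
  const m = line.match(/^<\s*([a-z0-9]+)/i);
  return m ? m[1].toUpperCase() : "";
}

// Splits one article into cells: each heading gets its own cell, consecutive
// <p> lines share a cell, and everything from the maxH2-th <h2> onward is
// merged into one final cell.
function splitArticle(html, maxH2 = 7) {
  const cells = [];
  let paragraphs = [];
  const flush = () => {
    if (paragraphs.length > 0) {
      cells.push(paragraphs.join("\n"));
      paragraphs = [];
    }
  };

  const lines = String(html).split("\n").filter(line => line.trim() !== "");
  let h2Count = 0;
  for (let i = 0; i < lines.length; i++) {
    const tag = leadingTag(lines[i]);
    if (tag === "H2") {
      h2Count++;
      if (h2Count === maxH2) {
        flush();
        cells.push(lines.slice(i).join("\n")); // dump the rest of the article
        return cells;
      }
    }
    if (tag === "P") {
      paragraphs.push(lines[i]);
    } else {
      flush();
      cells.push(lines[i]);
    }
  }
  flush(); // paragraphs after the last heading
  return cells;
}

// Made-up article for illustration, ending with a line break.
const article = [
  "<h1>Title</h1>",
  "<h2>First</h2>",
  "<p>a</p>",
  "<p>b</p>",
  "<h2>Conclusion</h2>",
  "<p>c</p>",
  "",
].join("\n");

const cells = splitArticle(article);     // 5 cells, the last one is "<p>c</p>"
const capped = splitArticle(article, 2); // 4 cells, the last one starts at the 2nd <h2>
```

To write such results to a sheet, the rows would still need to be padded to a common length, as in the sample script, because `setValues` expects a rectangular 2-dimensional array.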