How to get every text into an array with Cheerio in Google Apps Script

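Question

I want to store every piece of text in an array. However, some elements have class names that contain " " (a space) or "(", so I can't select them with find(). How can I get all of the texts into an array? Thank you for any help.

HTML source code: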

    <div id="main-0-QuoteHeader-Proxy">
      <div class="Mb($m-module-24)">
        <div class="D(f) Ai(c) Mb(6px)">
          <h1 class="C($c-link-text) Fw(b) Fz(24px) Mend(8px)">1.name</h1>
          <span class="C($c-icon) Fz(24px) Mend(20px)">2.id</span>
          <div class="Flxg(2)">
            <a class="Td(n) Px(8px) Py(3px) Fz(12px) Fw(b) Bdrs(11px) C(#188fff) Bgc($tag-bg-blue) Bgc($tag-bg-blue-hover):h" href="/class-quote?sectorId=40&amp;exchange=TAI">3.cag</a>
          </div>
          <button class="Ff(buttonFont) O(n) Bd Bxsh(n) Trsdu(.3s) Whs(nw) C(#fff) Bdc($c-button) Bgc($c-button) Cur(p) D(f) Ai(c) Fx(n) Bdrs(100px) Px(15px) Py(3px) Lh(20px) Fz(14px) Fw(b) Mstart(8px) Bdw(2px)">
            <svg class="Cur(p)" width="16" style="fill:#fff;stroke:#fff;stroke-width:0;vertical-align:bottom" height="16" viewBox="0 0 24 24" data-icon="star"><path d="M8.485 7.83l-6.515.21c-.887.028-1.3 1.117-.66 1.732l4.99 4.78-1.414 6.124c-.2 1.14.767 1.49 1.262 1.254l5.87-3.22 5.788 3.22c.48.228 1.464-.097 1.26-1.254l-1.33-6.124 4.962-4.78c.642-.615.228-1.704-.658-1.732l-6.486-.21-2.618-6.22c-.347-.815-1.496-.813-1.84.003L8.486 7.83zm7.06 6.05l1.11 5.11-4.63-2.576L7.33 18.99l1.177-5.103-4.088-3.91 5.41-.18 2.19-5.216 2.19 5.216 5.395.18-4.06 3.903z"></path></svg>
            <span class="Mstart(8px)">4.add</span>
          </button>
        </div>
        <div class="D(f) Jc(sb) Ai(fe)">
          <div class="D(f) Fld(c) Ai(fs)">
            <div class="D(f) Ai(fe) Mb(4px)">
              <span class="Fz(32px) Fw(b) Lh(1) Mend(16px) D(f) Ai(c) C($c-trend-down)">5.num</span>
              <span class="Fz(20px) Fw(b) Lh(1.2) Mend(4px) D(f) Ai(c) C($c-trend-down)">
                <span class="Mend(4px) Bds(s)" style="border-color:#00ab5e transparent transparent transparent;border-width:9px 6.5px 0 6.5px"></span>
                6.up
              </span>
              <span class="Jc(fe) Fz(20px) Lh(1.2) Fw(b) D(f) Ai(c) C($c-trend-down)">7.down</span>
            </div>
            <span class="C(#6e7780) Fz(12px) Fw(b)">8.2023/02/17 13:30</span>
          </div>
          <div class="D(f)">
            <div class="D(f) Fld(c) Ai(c) Fw(b) Pend(8px) Bdendc($bd-primary-divider) Bdends(s) Bdendw(1px)">
              <span class="Fz(16px) C($c-link-text) Mb(4px)">9.total num</span>
              <span class="Fz(12px) C($c-icon)">10.total</span>
            </div>
            <div class="D(f) Fld(c) Ai(c) Fw(b) Px(8px) Bdendc($bd-primary-divider) Bdends(s) Bdendw(1px)">
              <span class="Fz(16px) C($c-link-text) Mb(4px)">11.other num</span>
              <span class="Fz(12px) C($c-icon)">12.other</span>
            </div>
            <div class="D(f) Fld(c) Ai(c) Fw(b) Pstart(8px)">
              <span class="Fz(16px) Mb(4px) C($c-trend-up)">
                <div class="D(f)">
                  13.(
                  <span class="D(f) Ai(c)">
                    <span class="Mend(4px) Bds(s)" style="border-color:transparent transparent #ff333a transparent;border-width:0 5px 7px 5px"></span>
                    14.100%
                  </span>
                  15.)
                </div>
              </span>
              <span class="Fz(12px) C($c-icon)">16.count</span>
            </div>
          </div>
        </div>
      </div>
    </div>

My code:

    function getTextToArray()
    {
      var URL = "https://tw.stock.yahoo.com/d/s/major_2330.html";
      var source = UrlFetchApp.fetch(URL);
      var html = source.getContentText();

      const $ = Cheerio.load(html, { decodeEntities: false });

      Logger.log($('#main-0-QuoteHeader-Proxy').find('div').first().text());
    }

This only logs a single line: "1.name2.id3.cag4.add5.num6.up7.down8.2023/02/17 13:309.total num10.total11.other num12.other13.(14.100%15.)"

I also can't use the class names with find(); doing so throws "Error: Unmatched selector: ($c-link-text) Fw(b) Fz(24px) Mend(8px)".

I want the result to be an array, like array[0] = 1.name, array[1] = 2.id, array[2] = 3.cag, and so on.
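On the "Unmatched selector" error: characters such as `(`, `)`, `$`, and `#` are not allowed unescaped in a CSS class selector, and a space separates class names rather than being part of one. The sketch below shows two ways to build a selector string from such class names. `escapeClassName`, `classSelector`, and `classTokenSelector` are hypothetical helper names, not part of Cheerio, and it is an assumption (not verified against the Cheerio build used in the question) that the resulting selectors are accepted by `$()` and `find()`.

```javascript
// Hypothetical helper: backslash-escape every character that is not a letter,
// digit, hyphen, or underscore, so the name can follow "." in a selector.
// Limitation: a class name that starts with a digit needs a different escape
// and is not handled here.
function escapeClassName(className) {
  return className.replace(/[^a-zA-Z0-9_-]/g, (ch) => '\\' + ch);
}

// Turn a whole class attribute value (space-separated names) into a compound
// selector that requires ALL of the listed classes.
function classSelector(classAttribute) {
  return classAttribute
    .trim()
    .split(/\s+/)
    .map((name) => '.' + escapeClassName(name))
    .join('');
}

// Alternative: match one whitespace-separated class token through an
// attribute selector, where only quotes and backslashes need escaping.
function classTokenSelector(className) {
  return '[class~="' + className.replace(/(["\\])/g, '\\$1') + '"]';
}

console.log(classSelector('C($c-link-text) Fw(b)'));
console.log(classTokenSelector('Mb(6px)'));
```

With the script from the question, something like `$(classSelector('C($c-link-text) Fw(b) Fz(24px) Mend(8px)')).text()` should then target the `<h1>`, under the assumption stated above.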
# Answer 1

**Score**: 0

It looks like you want:

    $('h1,span').get().map(el => $(el).text())

This selects every `<h1>` and `<span>` element on the page, reads the text content of each one, and returns those texts as an array.
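One thing to keep in mind with a tag-based selector is that text sitting in other tags (such as the `<a>` holding "3.cag") is skipped, and a `<span>` nested inside another `<span>` contributes its text twice. A possible alternative, sketched below, is to walk the subtree and collect each non-empty text node. `collectTexts` is a hypothetical helper, and the node shape (`type`, `data`, `children`) is assumed to be the htmlparser2-style objects that Cheerio returns from `.get()`; the sample tree is hand-built so the sketch runs on its own.

```javascript
// Hypothetical helper: depth-first walk that collects every non-empty text
// node, trimmed, in document order.
// Assumed node shape: text nodes are { type: 'text', data: '...' };
// element nodes carry a `children` array.
function collectTexts(node, out = []) {
  if (node.type === 'text') {
    const text = node.data.trim();
    if (text) out.push(text);
  } else if (Array.isArray(node.children)) {
    for (const child of node.children) collectTexts(child, out);
  }
  return out;
}

// Hand-built stand-in for a small part of the HTML in the question.
const sampleTree = {
  type: 'tag', name: 'div', children: [
    { type: 'tag', name: 'h1', children: [{ type: 'text', data: '1.name' }] },
    { type: 'text', data: '\n' },
    { type: 'tag', name: 'a', children: [{ type: 'text', data: '3.cag' }] },
    { type: 'tag', name: 'span', children: [
      { type: 'tag', name: 'span', children: [] },
      { type: 'text', data: '\n6.up\n' },
    ] },
  ],
};

console.log(collectTexts(sampleTree)); // [ '1.name', '3.cag', '6.up' ]
```

In the script from the question this would presumably be called as `collectTexts($('#main-0-QuoteHeader-Proxy').get(0))`, giving one array entry per text node.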

huangapple
  • Published on 2023-02-18 09:41:30
  • Please keep this link when reposting: https://go.coder-hub.com/75490659.html