Parsing RSS feed in Go

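Question

I am trying to write a podcast downloader in Go. The following code parses an RSS feed, but the channel's link is empty when I print the parsed data to standard output. I don't know why. Any suggestions? I am new to Go.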

package main

import (
	"encoding/xml"
	"fmt"
	"net/http"
)

type Enclosure struct {
	Url    string `xml:"url,attr"`
	Length int64  `xml:"length,attr"`
	Type   string `xml:"type,attr"`
}

type Item struct {
	Title     string    `xml:"title"`
	Link      string    `xml:"link"`
	Desc      string    `xml:"description"`
	Guid      string    `xml:"guid"`
	Enclosure Enclosure `xml:"enclosure"`
	PubDate   string    `xml:"pubDate"`
}

type Channel struct {
	Title string `xml:"title"`
	Link  string `xml:"link"`
	Desc  string `xml:"description"`
	Items []Item `xml:"item"`
}

type Rss struct {
	Channel Channel `xml:"channel"`
}

func main() {
	resp, err := http.Get("http://www.bbc.co.uk/programmes/p02nrvz8/episodes/downloads.rss")
	if err != nil {
		fmt.Printf("Error GET: %v\n", err)
		return
	}
	defer resp.Body.Close()

	rss := Rss{}

	decoder := xml.NewDecoder(resp.Body)
	err = decoder.Decode(&rss)
	if err != nil {
		fmt.Printf("Error Decode: %v\n", err)
		return
	}

	fmt.Printf("Channel title: %v\n", rss.Channel.Title)
	fmt.Printf("Channel link: %v\n", rss.Channel.Link)

	for i, item := range rss.Channel.Items {
		fmt.Printf("%v. item title: %v\n", i, item.Title)
	}
}

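Answer 1

Score: 4

The XML from the RSS feed has a channel element with two child elements whose local name is 'link': 'link' and 'atom:link'. Even though one has a namespace prefix, the Go XML unmarshaller sees a conflict: the tag `xml:"link"` names no namespace, so it matches on the local name alone, both elements are decoded into Channel.Link, and the empty 'atom:link' that comes later overwrites the value. See also "local name collisions fail" and the related issue on GitHub.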

<?xml version="1.0" encoding="UTF-8"?>
...
<channel>
<title>Forum - Sixty Second Idea to Improve the World</title>
<link>http://www.bbc.co.uk/programmes/p02nrvz8</link>
...
<atom:link href="http://www.bbc.co.uk/..." />
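
One possible workaround, sketched here under a few assumptions, is to collect every 'link' element into a slice and pick the one without a namespace. The `Link` type, the `RSSLink` method, the `parseChannelLink` helper and the inline `sampleFeed` are hypothetical names introduced for illustration. The sample also assumes the feed declares the `atom` prefix as `http://www.w3.org/2005/Atom`, which the excerpt above does not show.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Link captures any <link> element, whatever its namespace.
// XMLName records the element's resolved name, including the namespace URI.
type Link struct {
	XMLName xml.Name
	Href    string `xml:"href,attr"`
	Text    string `xml:",chardata"`
}

type Channel struct {
	Title string `xml:"title"`
	Links []Link `xml:"link"` // matches both <link> and <atom:link>
}

type Rss struct {
	Channel Channel `xml:"channel"`
}

// RSSLink returns the text of the first <link> that has no namespace,
// i.e. the plain RSS channel link rather than atom:link.
func (c Channel) RSSLink() string {
	for _, l := range c.Links {
		if l.XMLName.Space == "" {
			return l.Text
		}
	}
	return ""
}

// sampleFeed is a trimmed stand-in for the real feed (assumed structure).
const sampleFeed = `<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Forum - Sixty Second Idea to Improve the World</title>
    <link>http://www.bbc.co.uk/programmes/p02nrvz8</link>
    <atom:link href="http://www.bbc.co.uk/..." />
  </channel>
</rss>`

// parseChannelLink decodes the feed and returns the plain channel link.
func parseChannelLink(data []byte) (string, error) {
	var rss Rss
	if err := xml.Unmarshal(data, &rss); err != nil {
		return "", err
	}
	return rss.Channel.RSSLink(), nil
}

func main() {
	link, err := parseChannelLink([]byte(sampleFeed))
	if err != nil {
		fmt.Printf("Error Decode: %v\n", err)
		return
	}
	fmt.Printf("Channel link: %v\n", link)
}
```

Keeping all links in a slice avoids relying on how a particular Go version resolves two fields that share a local name, and the `Href` attribute of 'atom:link' stays available if it is needed later.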

Answer 2

Score: 0

Alternatively, use a library like go-rss or a tool like informado to read various RSS feeds.

Posted by huangapple on January 24, 2016, 20:33:35