Unmarshal multiple XML items
Question
I am trying to unmarshal multiple items contained in nodes with an identical structure for further processing, but I don't seem to be able to access the data and I am not sure why. The XML data is structured as follows (I am trying to access all of the item elements):
<?xml version="1.0" encoding="ISO-8859-1" ?>
<datainfo>
  <origin>NOAA/NOS/CO-OPS</origin>
  <producttype> Annual Tide Prediction </producttype>
  <IntervalType>High/Low Tide Predictions</IntervalType>
  <data>
    <item>
      <date>2015/12/31</date>
      <day>Thu</day>
      <time>03:21 AM</time>
      <predictions_in_ft>5.3</predictions_in_ft>
      <predictions_in_cm>162</predictions_in_cm>
      <highlow>H</highlow>
    </item>
    <item>
      <date>2015/12/31</date>
      <day>Thu</day>
      <time>09:24 AM</time>
      <predictions_in_ft>2.4</predictions_in_ft>
      <predictions_in_cm>73</predictions_in_cm>
      <highlow>L</highlow>
    </item>
  </data>
</datainfo>
My code is:
package main

import (
	"encoding/xml"
	"fmt"
	"io/ioutil"
	"os"
)

// TideData stores a series of tide predictions
type TideData struct {
	Tides []Tide `xml:"data>item"`
}

// Tide stores a single tide prediction
type Tide struct {
	Date         string  `xml:"date"`
	Day          string  `xml:"day"`
	Time         string  `xml:"time"`
	PredictionFt float64 `xml:"predictions_in_ft"`
	PredictionCm float64 `xml:"predictions_in_cm"`
	HighLow      string  `xml:"highlow"`
}

func (t Tide) String() string {
	return t.Date + " " + t.Day + " " + t.Time + " " + t.HighLow
}

func main() {
	xmlFile, err := os.Open("9414275 Annual.xml")
	if err != nil {
		fmt.Println("Error opening file:", err)
		return
	}
	defer xmlFile.Close()
	b, _ := ioutil.ReadAll(xmlFile)
	var tides TideData
	xml.Unmarshal(b, &tides)
	fmt.Println(tides)
	for _, datum := range tides.Tides {
		fmt.Printf("\t%s\n", datum)
	}
}
When I run it, the output is empty, which leads me to think that the data is not being unmarshalled. The output is:
{[]}
Answer 1
Score: 6
You are ignoring the error returned by xml.Unmarshal. By slightly modifying your program to print that error, we can see what is going on:
xml: encoding "ISO-8859-1" declared but Decoder.CharsetReader is nil
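The answer does not show the modification itself. A minimal sketch of it, assuming all that is needed is to capture and report the error from xml.Unmarshal, might look like the following. The helper name unmarshalTides, the trimmed Tide struct, and the inline sample document are illustrative stand-ins, not part of the original program.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// TideData mirrors the struct from the question.
type TideData struct {
	Tides []Tide `xml:"data>item"`
}

// Tide is trimmed to two fields to keep the sketch short.
type Tide struct {
	Date    string `xml:"date"`
	HighLow string `xml:"highlow"`
}

// sample is a cut-down stand-in for the real file; note the ISO-8859-1 declaration.
const sample = `<?xml version="1.0" encoding="ISO-8859-1" ?>
<datainfo><data><item><date>2015/12/31</date><highlow>H</highlow></item></data></datainfo>`

// unmarshalTides (hypothetical helper) returns the error instead of discarding it.
func unmarshalTides(b []byte) (TideData, error) {
	var tides TideData
	err := xml.Unmarshal(b, &tides)
	return tides, err
}

func main() {
	if _, err := unmarshalTides([]byte(sample)); err != nil {
		// With no CharsetReader configured, the declared encoding is rejected here.
		fmt.Println("Error unmarshalling XML:", err)
	}
}
```

The same check applies to the ignored error from ioutil.ReadAll in the original program.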
Looking through the documentation, we find that by default the package only supports XML encoded in UTF-8:
// CharsetReader, if non-nil, defines a function to generate
// charset-conversion readers, converting from the provided
// non-UTF-8 charset into UTF-8. If CharsetReader is nil or
// returns an error, parsing stops with an error. One of the
// CharsetReader's result values must be non-nil.
CharsetReader func(charset string, input io.Reader) (io.Reader, error)
So it seems you need to provide your own character set conversion routine. You can inject it by modifying your code something like this:
decoder := xml.NewDecoder(xmlFile)
decoder.CharsetReader = makeCharsetReader
if err := decoder.Decode(&tides); err != nil {
	fmt.Println("Error decoding XML:", err)
	return
}
(Note that we are now decoding from an io.Reader rather than a byte slice, so the ReadAll logic can be removed.) The golang.org/x/text/encoding family of packages might help you implement your makeCharsetReader function. Something like this might work:
import (
	"fmt"
	"io"

	"golang.org/x/text/encoding/charmap"
)

// makeCharsetReader wraps input in a reader that converts the declared charset to UTF-8.
func makeCharsetReader(charset string, input io.Reader) (io.Reader, error) {
	if charset == "ISO-8859-1" {
		// Windows-1252 covers all printable ISO-8859-1 characters, so it should do here
		return charmap.Windows1252.NewDecoder().Reader(input), nil
	}
	return nil, fmt.Errorf("unknown charset: %s", charset)
}
You should then be able to decode the XML.