Unmarshal multiple XML items

Question

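I am trying to unmarshal multiple items contained in nodes with an identical structure for further processing, but I can't seem to access the data and I'm not sure why. The XML data is structured as follows (I am trying to access all of the item elements):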

<?xml version="1.0" encoding="ISO-8859-1" ?>
<datainfo>
  <origin>NOAA/NOS/CO-OPS</origin>
  <producttype> Annual Tide Prediction </producttype>
  <IntervalType>High/Low Tide Predictions</IntervalType>
  <data>
    <item>
      <date>2015/12/31</date>
      <day>Thu</day>
      <time>03:21 AM</time>
      <predictions_in_ft>5.3</predictions_in_ft>
      <predictions_in_cm>162</predictions_in_cm>
      <highlow>H</highlow>
    </item>
    <item>
      <date>2015/12/31</date>
      <day>Thu</day>
      <time>09:24 AM</time>
      <predictions_in_ft>2.4</predictions_in_ft>
      <predictions_in_cm>73</predictions_in_cm>
      <highlow>L</highlow>
    </item>
  </data>
</datainfo>

My code is:

package main

import (
	"encoding/xml"
	"fmt"
	"io/ioutil"
	"os"
)

// TideData stores a series of tide predictions
type TideData struct {
	Tides []Tide `xml:"data>item"`
}

// Tide stores a single tide prediction
type Tide struct {
	Date         string  `xml:"date"`
	Day          string  `xml:"day"`
	Time         string  `xml:"time"`
	PredictionFt float64 `xml:"predictions_in_ft"`
	PredictionCm float64 `xml:"predictions_in_cm"`
	HighLow      string  `xml:"highlow"`
}

func (t Tide) String() string {
	return t.Date + " " + t.Day + " " + t.Time + " " + t.HighLow
}

func main() {
	xmlFile, err := os.Open("9414275 Annual.xml")
	if err != nil {
		fmt.Println("Error opening file:", err)
		return
	}
	defer xmlFile.Close()

	b, _ := ioutil.ReadAll(xmlFile)

	var tides TideData
	xml.Unmarshal(b, &tides)

	fmt.Println(tides)
	for _, datum := range tides.Tides {
		fmt.Printf("\t%s\n", datum)
	}
}

When run, the output is empty, which leads me to think that the data is not being unmarshalled. The output is:

{[]}

Answer 1

Score: 6

You are ignoring the error returned by xml.Unmarshal. By slightly modifying your program to print it, we can see what is going on:

xml: encoding "ISO-8859-1" declared but Decoder.CharsetReader is nil
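
A minimal sketch of that modification, assuming a shortened inline document in place of the file from the question. The names `sample` and `parseTides` are hypothetical and only serve the illustration; the point is that the error from xml.Unmarshal is returned and printed rather than discarded.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// TideData and Tide follow the question's structs, trimmed to one field.
type TideData struct {
	Tides []Tide `xml:"data>item"`
}

type Tide struct {
	Date string `xml:"date"`
}

// sample is a hypothetical, shortened stand-in for "9414275 Annual.xml".
const sample = `<?xml version="1.0" encoding="ISO-8859-1" ?>
<datainfo><data><item><date>2015/12/31</date></item></data></datainfo>`

// parseTides hands the Unmarshal error back to the caller instead of dropping it.
func parseTides(b []byte) (TideData, error) {
	var tides TideData
	err := xml.Unmarshal(b, &tides)
	return tides, err
}

func main() {
	if _, err := parseTides([]byte(sample)); err != nil {
		// With the declared ISO-8859-1 encoding and no CharsetReader,
		// this branch is taken and the error is printed.
		fmt.Println("Error unmarshalling:", err)
	}
}
```

The same check applies to the ignored error from ioutil.ReadAll in the original program.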

Poking around in the documentation, we find that by default the package only supports XML encoded in UTF-8:

// CharsetReader, if non-nil, defines a function to generate
// charset-conversion readers, converting from the provided
// non-UTF-8 charset into UTF-8. If CharsetReader is nil or
// returns an error, parsing stops with an error. One of the
// the CharsetReader's result values must be non-nil.
CharsetReader func(charset string, input io.Reader) (io.Reader, error)

So it seems you need to provide your own character set conversion routine. You can inject it by modifying your code along these lines:

decoder := xml.NewDecoder(xmlFile)
decoder.CharsetReader = makeCharsetReader
err := decoder.Decode(&tides)

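(Note that we are now decoding from an io.Reader rather than a byte slice, so the ReadAll logic can be removed.) The golang.org/x/text/encoding family of packages might help you implement your makeCharsetReader function. Something like this might work: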

import (
    "fmt"
    "io"

    "golang.org/x/text/encoding/charmap"
)

func makeCharsetReader(charset string, input io.Reader) (io.Reader, error) {
    if charset == "ISO-8859-1" {
        // Windows-1252 matches ISO-8859-1 for all printable characters, so it should do here
        return charmap.Windows1252.NewDecoder().Reader(input), nil
    }
    return nil, fmt.Errorf("Unknown charset: %s", charset)
}

You should then be able to decode the XML.

  • Posted on 2016-01-11 07:26:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/34712015.html