How to parse ISO6709 coordinates in golang?

Question

Some examples from Wikipedia on ISO 6709:

Atlantic Ocean +00-025/
France +46+002/
Paris +48.52+002.20/
Eiffel Tower +48.8577+002.295/
Mount Everest +27.5916+086.5640+8850CRSWGS_84/
North Pole +90+000/
Pacific Ocean +00-160/
South Pole -90+000+2800CRSWGS_84/
United States +38-097/
New York City +40.75-074.00/
Statue of Liberty +40.6894-074.0447/

How should this be parsed, given that there is no consistent delimiting character? With a regex? By reading and parsing it byte by byte?

To clarify: the desired output is a pair of float32 values, the latitude and the longitude. For example:

input: +40.6894-074.0447/
output: 40.6894 and -074.0447

Answer 1

Score: 4

I'm not sure which pieces you want to extract, but the following regex selects all of them in your examples:

(\+|-)\d+\.?\d+(\+|-)\d+\.?[\d]+(\+|-)?[^/]*

It matches the coordinate piece by piece and depends on the trailing / as a terminator. If the slash is not there, there are other ways around it:

(\+|-)\d+\.?\d+(\+|-)\d+\.?\d+(\+|-)?[A-Z_\d]*

This version does not rely on the / terminator.

To provide a perfect answer, the context of the coordinates would be required.

Here is code that implements this, given a string as input:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	toSearch := "Atlantic Ocean +00-025/\nFrance +46+002/\nParis +48.52+002.20/\nEiffel Tower +48.8577+002.295/\nMount Everest +27.5916+086.5640+8850CRSWGS_84/\nNorth Pole +90+000/\nPacific Ocean +00-160/\nSouth Pole -90+000+2800CRSWGS_84/\nUnited States +38-097/\nNew York City +40.75-074.00/\nStatue of Liberty +40.6894-074.0447/"
	// Two signed numbers back to back, then an optional altitude/CRS suffix.
	ISOCoord := regexp.MustCompile(`(\+|-)\d+\.?\d+(\+|-)\d+\.?\d+(\+|-)?[A-Z_\d]*`)
	// -1 returns every match instead of capping the count at 11.
	result := ISOCoord.FindAllString(toSearch, -1)
	for _, v := range result {
		fmt.Println(v)
	}
}

This prints:

+00-025
+46+002
+48.52+002.20
+48.8577+002.295
+27.5916+086.5640+8850CRSWGS_84
+90+000
+00-160
-90+000+2800CRSWGS_84
+38-097
+40.75-074.00
+40.6894-074.0447

Given the clarification that you want the two separate coordinates, this approach works:

package main

import (
	"fmt"
	"regexp"
	"strconv"
)

type coord struct {
	lat, long float64
}

func main() {
	toSearch := "Atlantic Ocean +00-025/\nFrance +46+002/\nParis +48.52+002.20/\nEiffel Tower +48.8577+002.295/\nMount Everest +27.5916+086.5640+8850CRSWGS_84/\nNorth Pole +90+000/\nPacific Ocean +00-160/\nSouth Pole -90+000+2800CRSWGS_84/\nUnited States +38-097/\nNew York City +40.75-074.00/\nStatue of Liberty +40.6894-074.0447/"
	// Matches a latitude immediately followed by a longitude.
	ISOCoord := regexp.MustCompile(`((\+|-)\d+\.?\d*){2}`)
	result := ISOCoord.FindAllString(toSearch, -1)
	// Matches a single signed number inside such a pair.
	INDCoord := regexp.MustCompile(`(\+|-)\d+\.?\d*`)
	// Size the slice from the matches rather than hard-coding 11.
	answer := make([]coord, len(result))

	for i, v := range result {
		temp := INDCoord.FindAllString(v, 2)
		// Conversion errors are ignored here for brevity.
		lat, _ := strconv.ParseFloat(temp[0], 64)
		lon, _ := strconv.ParseFloat(temp[1], 64)
		answer[i] = coord{lat, lon}
	}
	fmt.Println(answer)
}

The regex is applied twice (once to find each latitude/longitude pair, once to split the pair) so that it is a little more robust. A single pass would be faster, if the input allows it.
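One way to fold the two passes into one is to use capture groups. The following is only a sketch: the pattern, `pairRe`, and `splitPairs` are names introduced here for illustration, and the pattern assumes the decimal-degree forms shown in the question.

```go
package main

import (
	"fmt"
	"regexp"
)

// pairRe captures both numbers in a single pass:
// group 1 is the latitude, group 2 the longitude.
var pairRe = regexp.MustCompile(`([+-]\d+(?:\.\d+)?)([+-]\d+(?:\.\d+)?)`)

// splitPairs returns the latitude and longitude substrings
// of every coordinate pair found in s.
func splitPairs(s string) [][2]string {
	var out [][2]string
	for _, m := range pairRe.FindAllStringSubmatch(s, -1) {
		out = append(out, [2]string{m[1], m[2]})
	}
	return out
}

func main() {
	in := "Paris +48.52+002.20/\nMount Everest +27.5916+086.5640+8850CRSWGS_84/\nUnited States +38-097/"
	for _, p := range splitPairs(in) {
		fmt.Println(p[0], p[1])
	}
}
```

The altitude in the Mount Everest line is skipped because a lone signed number with no second signed number after it cannot satisfy the pattern.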

The code should also check the errors returned by the conversions, but you can add that.
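As a rough sketch of what that error checking could look like, the helper below converts one pair and reports failures instead of discarding them. `coord32` and `toCoord` are hypothetical names, the float32 fields follow the question's stated output type, and the range check (latitude within ±90, longitude within ±180) is an addition that the original code does not perform.

```go
package main

import (
	"fmt"
	"strconv"
)

// coord32 holds a latitude/longitude pair as float32 values.
type coord32 struct {
	lat, lon float32
}

// toCoord converts two signed decimal strings such as "+40.6894" and "-074.0447".
// It returns an error on a failed conversion or an out-of-range value.
func toCoord(latStr, lonStr string) (coord32, error) {
	lat, err := strconv.ParseFloat(latStr, 32)
	if err != nil {
		return coord32{}, fmt.Errorf("latitude %q: %w", latStr, err)
	}
	lon, err := strconv.ParseFloat(lonStr, 32)
	if err != nil {
		return coord32{}, fmt.Errorf("longitude %q: %w", lonStr, err)
	}
	// Added sanity check, not part of the original answer.
	if lat < -90 || lat > 90 || lon < -180 || lon > 180 {
		return coord32{}, fmt.Errorf("coordinate out of range: %s %s", latStr, lonStr)
	}
	return coord32{float32(lat), float32(lon)}, nil
}

func main() {
	c, err := toCoord("+40.6894", "-074.0447")
	fmt.Println(c, err)

	_, err = toCoord("+40.6894", "abc")
	fmt.Println(err)
}
```

Passing 32 as the bit size makes `strconv.ParseFloat` return a float64 that converts to float32 without changing its value.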

It is also worth noting that the conversion trims leading and trailing zeros. If you need to preserve them, i.e. if 012.1 is not the same as 12.1 for your purposes, you can skip the conversion to float and work with the strings.

As floats, the code produces:

[{0 -25} {46 2} {48.52 2.2} {48.8577 2.295} {27.5916 86.564} {90 0} {0 -160} {-90 0} {38 -97} {40.75 -74} {40.6894 -74.0447}]

or, as strings:

[{+00 -025} {+46 +002} {+48.52 +002.20} {+48.8577 +002.295} {+27.5916 +086.5640} {+90 +000} {+00 -160} {-90 +000} {+38 -097} {+40.75 -074.00} {+40.6894 -074.0447}]
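The question also asks whether the string could be read byte by byte instead. This is a different technique from the regex approach above, sketched here under two assumptions: the input is a single coordinate string with the place name already stripped, and it uses the decimal-degree form only. `parseISO6709` and `isSign` are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
)

// isSign reports whether b starts a new ISO 6709 component.
func isSign(b byte) bool { return b == '+' || b == '-' }

// parseISO6709 scans a string such as "+40.6894-074.0447/" byte by byte.
// Everything after the longitude (altitude, CRS identifier, slash) is ignored.
func parseISO6709(s string) (lat, lon float64, err error) {
	if len(s) == 0 || !isSign(s[0]) {
		return 0, 0, fmt.Errorf("%q: must start with + or -", s)
	}
	// The latitude runs from the leading sign up to the next sign.
	i := 1
	for i < len(s) && !isSign(s[i]) {
		i++
	}
	if i == len(s) {
		return 0, 0, fmt.Errorf("%q: no longitude found", s)
	}
	// The longitude runs from that sign across the digits and dots after it.
	j := i + 1
	for j < len(s) && (s[j] == '.' || (s[j] >= '0' && s[j] <= '9')) {
		j++
	}
	if lat, err = strconv.ParseFloat(s[:i], 64); err != nil {
		return 0, 0, err
	}
	if lon, err = strconv.ParseFloat(s[i:j], 64); err != nil {
		return 0, 0, err
	}
	return lat, lon, nil
}

func main() {
	lat, lon, err := parseISO6709("+40.6894-074.0447/")
	fmt.Println(lat, lon, err)
}
```

ISO 6709 also defines degrees-and-minutes and degrees-minutes-seconds forms. This sketch would read those as plain numbers rather than converting them to degrees, so they would need separate handling.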

huangapple
  • Posted on April 16, 2017 at 02:53:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/43430004.html