Parsing string to time with unknown layout

Question

I have a CSV file and want to read:

  1. Header names
  2. Field types

So I wrote the following:

    package main

    import (
        "encoding/csv"
        "fmt"
        "log"
        "os"
        "reflect"
        "strconv"
    )

    func main() {
        filePath := "./file.csv"
        headerNames := make(map[int]string)
        headerTypes := make(map[int]string)

        // Load the CSV file and stop if it cannot be opened.
        f, err := os.Open(filePath)
        checkError("Could not open the file: ", err)
        defer f.Close()

        // Create a new reader.
        r := csv.NewReader(f)

        // Read the first row only (the header).
        header, err := r.Read()
        checkError("Some other error occurred: ", err)

        // Add mapping: record index --> column/property name
        for i, v := range header {
            headerNames[i] = v
        }

        // Read the second row (the first data record).
        record, err := r.Read()
        checkError("Some other error occurred: ", err)

        // Check the type of each field in the record.
        for i, v := range record {
            var value interface{}
            if value, err = strconv.Atoi(v); err != nil {
                if value, err = strconv.ParseFloat(v, 64); err != nil {
                    if value, err = strconv.ParseBool(v); err != nil {
                        // Placeholder: ParseBool stands in for the missing time check.
                        if value, err = strconv.ParseBool(v); err != nil { // <== How to do this with unknown layout
                            // Value is a string
                            headerTypes[i] = "string"
                            value = v
                            fmt.Println(reflect.TypeOf(value), reflect.ValueOf(value))
                        } else {
                            // Value is a timestamp
                            headerTypes[i] = "time"
                            fmt.Println(reflect.TypeOf(value), reflect.ValueOf(value))
                        }
                    } else {
                        // Value is a bool
                        headerTypes[i] = "bool"
                        fmt.Println(reflect.TypeOf(value), reflect.ValueOf(value))
                    }
                } else {
                    // Value is a float
                    headerTypes[i] = "float"
                    fmt.Println(reflect.TypeOf(value), reflect.ValueOf(value))
                }
            } else {
                // Value is an int
                headerTypes[i] = "int"
                fmt.Println(reflect.TypeOf(value), reflect.ValueOf(value))
            }
        }

        for i := range header {
            fmt.Printf("Header: %v \tis\t %v\n", headerNames[i], headerTypes[i])
        }
    }

    // checkError logs the message and the error, then exits, if err is non-nil.
    func checkError(message string, err error) {
        if err != nil {
            log.Fatal(message, err)
        }
    }

With the following CSV file:

    name,age,developer
    "Hasan","46.4","true"

I got this output:

    Header: name is string
    Header: age is float
    Header: developer is bool

The output is correct.

What I could not do is check whether a field is a time, because I do not know what layout the field might use.

I am aware that I can parse a string to a time using the layouts described at https://go.dev/src/time/format.go, and that I can build a custom parser, something like:

    test, err := fmtdate.Parse("MM/DD/YYYY", "10/15/1983")
    if err != nil {
        panic(err)
    }

But as far as I know, this works only if I know the layout.

So, again, my question is: how can I parse a time, or what should I do to be able to parse it, if I do not know the layout?

Answer 1

Score: 0

Thanks to the comment by Burak, I found the solution by using this package: github.com/araddon/dateparse

    // Normal parse. Same timezone rules as time.Parse().
    t, err := dateparse.ParseAny("3/1/2014")

    // Strict parse: returns an error on ambiguous mm/dd vs dd/mm dates.
    t, err = dateparse.ParseStrict("3/1/2014")
    // > returns error

    // Return a string that represents the layout to parse the given date-time.
    layout, err := dateparse.ParseFormat("May 8, 2009 5:57:51 PM")
    // > "Jan 2, 2006 3:04:05 PM"
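If adding a dependency is not an option, one rough alternative is to try a fixed list of candidate layouts with the standard library's `time.Parse` and slot that check into the type-detection chain from the question. The sketch below is only an illustration under assumptions: `candidateLayouts`, `parseTimeAny`, and `detectType` are made-up names, and the layout list is an arbitrary sample that would need to match the data actually expected.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// candidateLayouts is an illustrative, hypothetical list; extend or reorder it
// to match the formats that actually occur in your data.
var candidateLayouts = []string{
	time.RFC3339,
	"2006-01-02 15:04:05",
	"2006-01-02",
	"01/02/2006", // assumes month comes first in slash-separated dates
	"Jan 2, 2006",
}

// parseTimeAny returns the first successful parse among the candidate layouts.
func parseTimeAny(s string) (time.Time, bool) {
	for _, layout := range candidateLayouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t, true
		}
	}
	return time.Time{}, false
}

// detectType applies the same order as the question's nested ifs:
// int, then float, then bool, then time, and finally string.
func detectType(s string) string {
	if _, err := strconv.Atoi(s); err == nil {
		return "int"
	}
	if _, err := strconv.ParseFloat(s, 64); err == nil {
		return "float"
	}
	if _, err := strconv.ParseBool(s); err == nil {
		return "bool"
	}
	if _, ok := parseTimeAny(s); ok {
		return "time"
	}
	return "string"
}

func main() {
	for _, v := range []string{"Hasan", "46.4", "true", "10/15/1983", "2022-07-25"} {
		fmt.Printf("%q is %s\n", v, detectType(v))
	}
}
```

Two caveats apply. The order of checks matters: a compact date such as 20220725 is reported as an int because the integer check runs first. A slash-separated date is also inherently ambiguous between mm/dd and dd/mm, so the list above simply picks one reading. To use the package from the answer instead, the body of `parseTimeAny` could call `dateparse.ParseAny`.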

Posted by huangapple on July 25, 2022 at 21:18:32
Source link: https://go.coder-hub.com/73109905.html