How to read the last lines of a big file with Go every 10 seconds

Question

How can I read the last two lines of a big log file without loading it into memory completely?

I need to read it every 10 seconds (on a Windows machine), and I'm stuck trying to read the last lines.

package main

import (
	"fmt"
	"time"
	"os"
)

const MYFILE = "logfile.log"

func main() {
	c := time.Tick(10 * time.Second)
	for now := range c {
		readFile(MYFILE)
	}
}

func readFile(fname string){
	file, err:=os.Open(fname)
	if err!=nil{
		panic(err)
	}
	buf:=make([]byte, 32)
	c, err:=file.ReadAt(32, ????)
	fmt.Printf("%s\n", c)
}

The log file is something like:

07/25/2013 11:55:42.400, 0.559
07/25/2013 11:55:52.200, 0.477
07/25/2013 11:56:02.000, 0.463
07/25/2013 11:56:11.800, 0.454
07/25/2013 11:56:21.600, 0.424
07/25/2013 11:56:31.400, 0.382
07/25/2013 11:56:41.200, 0.353
07/25/2013 11:56:51.000, 0.384
07/25/2013 11:57:00.800, 0.393
07/25/2013 11:57:10.600, 0.456

Thanks!

Answer 1

Score: 19

You can use file.Seek() or file.ReadAt() to jump to a position near the end of the file and then read forward. Unless you know that 2 lines = x bytes, you can only estimate where to start.

You can get the file length with os.Stat(name).

Here is an example based on ReadAt, Stat, and your sample log file:

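package main

import (
	"fmt"
	"io"
	"os"
	"time"
)

const MYFILE = "logfile.log"

// tailSize covers two sample log lines of 31 bytes each
// (30 characters plus "\n"); this assumes LF line endings.
const tailSize = 62

func main() {
	c := time.Tick(10 * time.Second)
	for range c {
		readFile(MYFILE)
	}
}

func readFile(fname string) {
	file, err := os.Open(fname)
	if err != nil {
		panic(err)
	}
	defer file.Close()

	stat, err := os.Stat(fname)
	if err != nil {
		panic(err)
	}

	// Start tailSize bytes before the end, or at offset 0 for a shorter file.
	start := stat.Size() - tailSize
	if start < 0 {
		start = 0
	}
	buf := make([]byte, stat.Size()-start)
	n, err := file.ReadAt(buf, start)
	if err != nil && err != io.EOF {
		panic(err)
	}
	fmt.Printf("%s\n", buf[:n])
}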
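
When the line length is not known in advance, one option is to read a generous chunk from the end and keep only the last few lines of it. The following is a minimal sketch of that idea, not part of the original answer: the names lastLinesOf and tailFile and the 4096-byte chunk size are assumptions chosen for illustration. If the chunk happens to start in the middle of a line and fewer than n full lines fit in it, the first returned line is partial.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// lastLinesOf (hypothetical helper) returns up to n trailing lines of chunk,
// without their "\n" or "\r\n" terminators.
func lastLinesOf(chunk []byte, n int) []string {
	chunk = bytes.TrimRight(chunk, "\r\n") // drop the final line terminator
	if len(chunk) == 0 || n <= 0 {
		return nil
	}
	lines := bytes.Split(chunk, []byte("\n"))
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	out := make([]string, len(lines))
	for i, l := range lines {
		out[i] = string(bytes.TrimRight(l, "\r")) // tolerate Windows line endings
	}
	return out
}

// tailFile (hypothetical helper) reads at most maxBytes from the end of the
// file and returns the last n lines found in that chunk.
func tailFile(name string, n int, maxBytes int64) ([]string, error) {
	f, err := os.Open(name)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	st, err := f.Stat()
	if err != nil {
		return nil, err
	}
	start := st.Size() - maxBytes
	if start < 0 {
		start = 0 // the file is smaller than the chunk: read all of it
	}
	buf := make([]byte, st.Size()-start)
	read, err := f.ReadAt(buf, start)
	if err != nil && err != io.EOF {
		return nil, err
	}
	return lastLinesOf(buf[:read], n), nil
}

func main() {
	// 4096 is an assumed chunk size, far larger than two 31-byte log lines.
	lines, err := tailFile("logfile.log", 2, 4096)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Splitting the chunk in memory keeps the file access to a single ReadAt call per poll, at the cost of reading a little more than strictly needed.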

Answer 2

Score: 18

Some people will come to this page looking for an efficient way to read the last line of a log file (like the tail command-line tool).

Here is my version for reading the last line of a big file. It uses two of the previous suggestions (Seek and the file's Stat).

It reads the file backward, byte by byte (no need to set a buffer size), until it finds the beginning of a line or the beginning of the file.

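// getLastLineWithSeek returns the last line of the file without its line
// terminator. It needs the "io" and "os" imports.
func getLastLineWithSeek(filepath string) string {
	fileHandle, err := os.Open(filepath)
	if err != nil {
		panic("Cannot open file")
	}
	defer fileHandle.Close()

	stat, err := fileHandle.Stat()
	if err != nil {
		panic(err)
	}
	filesize := stat.Size()

	var line []byte
	char := make([]byte, 1)
	// cursor is a negative offset from the end of the file; the loop stops
	// once it has reached the beginning (or at once for an empty file).
	for cursor := int64(-1); cursor >= -filesize; cursor-- {
		if _, err := fileHandle.Seek(cursor, io.SeekEnd); err != nil {
			panic(err)
		}
		if _, err := fileHandle.Read(char); err != nil {
			panic(err)
		}

		if char[0] == '\n' || char[0] == '\r' {
			if len(line) == 0 {
				continue // skip the terminator at the very end of the file
			}
			break // stop: we reached the end of the previous line
		}

		line = append([]byte{char[0]}, line...) // there are more efficient ways
	}

	return string(line)
}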

Answer 3

Score: 3

I think a combination of File.Seek(0, 2) and File.Read() should work.

The Seek call takes you to the end of the file (whence 2 is io.SeekEnd). You can Seek to a position a bit before EOF to get the last few lines. Then you Read until EOF and just sleep in your goroutine for 10 seconds; the next Read has a chance to get you more data.

You can borrow the idea (and the scan-back logic for initially showing the last few lines) from GNU tail's source.

Answer 4

Score: 2

Well, this is only a rough idea and maybe not the best way; you should check and improve it, but it seems to work...

I hope that experienced Go users can contribute too.

With Stat you can get the size of the file, and from it the offset to use with ReadAt:

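// readLastLine needs the "fmt", "io" and "os" imports.
func readLastLine(fname string) {
	file, err := os.Open(fname)
	if err != nil {
		panic(err)
	}
	defer file.Close()

	fi, err := file.Stat()
	if err != nil {
		fmt.Println(err)
		return
	}

	// Read the last 32 bytes, or the whole file if it is shorter than that.
	size := fi.Size()
	if size > 32 {
		size = 32
	}
	buf := make([]byte, size)
	n, err := file.ReadAt(buf, fi.Size()-size)
	if err != nil && err != io.EOF {
		fmt.Println(err)
		return
	}
	fmt.Printf("%s", buf[:n])
}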

Answer 5

Score: 2

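I used tail for a smaller footprint. I'm not sure how it compares performance-wise.

// Both functions need the "fmt", "os/exec" and "strings" imports.

// Use "-1" as count for just the last line.
func printLastLines(count, path string) {
	c := exec.Command("tail", count, path)
	output, err := c.Output()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(output))
}

For Windows you have to do something like this:

// The command is split on spaces, so a path containing spaces will not work.
func printLastWindows(count, path string) {
	ps, err := exec.LookPath("powershell.exe")
	if err != nil {
		fmt.Println(err)
		return
	}
	args := strings.Split(fmt.Sprintf(`Get-Content %s | Select-Object -last %s`, path, count), " ")
	c := exec.Command(ps, args...)
	output, err := c.Output()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(output))
}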

Answer 6

Score: 1

Here's the code I wrote for reading a large byte slice in reverse line order. It doesn't break on trailing whitespace.

The code loops over the bytes in reverse and counts the bytes it has seen. When it detects a newline character, it loops forward over that many bytes to copy the line, appends it to the resulting []byte, and resets the counter. It repeats this until maxLine lines have been written.

This is overly complicated; if you just want to read bytes from a specific line, there might be a better way. The variable names have been lengthened for easier reading.

func ReverseByte(fileByte []byte, maxLine int) []byte {
	// This is the byte code for a newline, "\n"
	nl := byte(10)

	var reverseFileByte []byte
	var lineLen, lineWritten int

	byteIndex := len(fileByte) - 1
	// Also stop at the start of the data so that byteIndex never goes negative.
	for lineWritten < maxLine && byteIndex >= 0 {
		if fileByte[byteIndex] == nl {
			currentLine := make([]byte, lineLen)
			byteLineIndex := byteIndex
			var currentLineIndex int
			for currentLineIndex < lineLen {
				currentLine[currentLineIndex] = fileByte[byteLineIndex]
				byteLineIndex++
				currentLineIndex++
			}
			reverseFileByte = append(reverseFileByte, currentLine...)
			lineLen = 0
			lineWritten++
		}
		lineLen++
		byteIndex--
	}
	return reverseFileByte
}

https://go.dev/play/p/qKDFxiJQAfF
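
As a possible simpler alternative for data that is already in memory, the standard library's bytes.LastIndexByte can locate each newline directly. This is only a sketch under assumptions: reverseLines is a hypothetical name, and unlike ReverseByte it returns the lines as separate strings, last line first and without terminators.

```go
package main

import (
	"bytes"
	"fmt"
)

// reverseLines (hypothetical helper) returns up to maxLines lines from data,
// last line first, without their "\n" or "\r\n" terminators.
func reverseLines(data []byte, maxLines int) []string {
	data = bytes.TrimRight(data, "\r\n") // ignore the final line terminator
	if len(data) == 0 {
		return nil
	}
	var lines []string
	for len(lines) < maxLines {
		i := bytes.LastIndexByte(data, '\n') // -1 when only one line is left
		lines = append(lines, string(bytes.TrimRight(data[i+1:], "\r")))
		if i < 0 {
			break // reached the first line of the data
		}
		data = data[:i] // continue with everything before that newline
	}
	return lines
}

func main() {
	// Sample lines taken from the log file in the question.
	sample := []byte("07/25/2013 11:56:51.000, 0.384\n" +
		"07/25/2013 11:57:00.800, 0.393\n" +
		"07/25/2013 11:57:10.600, 0.456\n")
	for _, line := range reverseLines(sample, 2) {
		fmt.Println(line)
	}
}
```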

huangapple
  • Posted on July 26, 2013, 00:36:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/17863821.html