How to read the last lines of a big file with Go every 10 seconds
Question
How can I read the last two lines of a big log file without loading it into memory completely?
I need to read it every 10 seconds (on a Windows machine), and I'm stuck trying to read the last lines.
package main

import (
	"fmt"
	"os"
	"time"
)

const MYFILE = "logfile.log"

func main() {
	c := time.Tick(10 * time.Second)
	for now := range c {
		readFile(MYFILE)
	}
}

func readFile(fname string) {
	file, err := os.Open(fname)
	if err != nil {
		panic(err)
	}
	buf := make([]byte, 32)
	c, err := file.ReadAt(32, ????)
	fmt.Printf("%s\n", c)
}
The log file is something like:
07/25/2013 11:55:42.400, 0.559
07/25/2013 11:55:52.200, 0.477
07/25/2013 11:56:02.000, 0.463
07/25/2013 11:56:11.800, 0.454
07/25/2013 11:56:21.600, 0.424
07/25/2013 11:56:31.400, 0.382
07/25/2013 11:56:41.200, 0.353
07/25/2013 11:56:51.000, 0.384
07/25/2013 11:57:00.800, 0.393
07/25/2013 11:57:10.600, 0.456
Thanks!
Answer 1
Score: 19
You can use file.Seek() or file.ReadAt() to jump to a position near the end of the file and then read forward. Unless you know that 2 lines = x bytes, you can only estimate where to start.
You can get the file length with os.Stat(name).
Here is an example based on ReadAt, Stat, and your sample log file:
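package main

import (
	"fmt"
	"os"
	"time"
)

const MYFILE = "logfile.log"

// tailSize covers 2 lines of the sample log: 30 characters plus "\n" each.
// With "\r\n" line endings each line takes 32 bytes, so adjust as needed.
const tailSize = 62

func main() {
	c := time.Tick(10 * time.Second)
	for range c {
		readFile(MYFILE)
	}
}

func readFile(fname string) {
	file, err := os.Open(fname)
	if err != nil {
		panic(err)
	}
	defer file.Close()

	stat, err := file.Stat()
	if err != nil {
		panic(err)
	}
	if stat.Size() < tailSize {
		return // not enough data yet; a negative offset would make ReadAt fail
	}
	buf := make([]byte, tailSize)
	_, err = file.ReadAt(buf, stat.Size()-tailSize)
	if err == nil {
		fmt.Printf("%s\n", buf)
	}
}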
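If the line length is not known in advance, one option is to read a generous chunk from the end of the file and split it into lines afterwards. The following is a minimal sketch under that assumption, not part of the answer above; readTail, lastLines and the chunk parameter are hypothetical names chosen for illustration.

```go
package main

import (
	"bytes"
	"io"
	"os"
)

// readTail returns up to chunk bytes from the end of the file at path.
// chunk is only a guess: it has to be large enough to hold the lines you want.
func readTail(path string, chunk int64) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		return nil, err
	}
	start := fi.Size() - chunk
	if start < 0 {
		start = 0 // the file is shorter than the chunk: read all of it
	}
	buf := make([]byte, fi.Size()-start)
	n, err := f.ReadAt(buf, start)
	if err != nil && err != io.EOF {
		return nil, err
	}
	return buf[:n], nil
}

// lastLines returns the last n lines found in buf, oldest first.
// It accepts both "\n" and "\r\n" line endings and ignores trailing newlines.
func lastLines(buf []byte, n int) []string {
	buf = bytes.TrimRight(buf, "\r\n")
	if len(buf) == 0 || n <= 0 {
		return nil
	}
	lines := bytes.Split(buf, []byte("\n"))
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	out := make([]string, len(lines))
	for i, l := range lines {
		out[i] = string(bytes.TrimRight(l, "\r"))
	}
	return out
}
```

For the sample log, calling lastLines(buf, 2) on the bytes returned by readTail("logfile.log", 256) would give the last two records, provided 256 bytes cover at least two lines. If the chunk starts in the middle of a line and holds fewer than n complete lines, the first returned line may be partial.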
Answer 2
Score: 18
Some people will come to this page looking for an efficient way to read the last line of a log file (like the tail command-line tool).
Here is my version for reading the last line of a big file. It uses two of the previous suggestions (Seek and the file's Stat).
It reads the file backward, byte by byte (no need to set a buffer size), until it finds the beginning of a line or the beginning of the file.
// Requires the "io" and "os" imports.
func getLastLineWithSeek(filepath string) string {
	fileHandle, err := os.Open(filepath)
	if err != nil {
		panic("Cannot open file")
	}
	defer fileHandle.Close()

	line := ""
	var cursor int64 = 0
	stat, err := fileHandle.Stat()
	if err != nil {
		panic(err)
	}
	filesize := stat.Size()
	char := make([]byte, 1)
	for cursor > -filesize { // stop at the beginning of the file; also covers an empty file
		cursor -= 1
		if _, err := fileHandle.Seek(cursor, io.SeekEnd); err != nil {
			panic(err)
		}
		if _, err := fileHandle.Read(char); err != nil {
			panic(err)
		}
		if char[0] == 10 || char[0] == 13 { // "\n" or "\r"
			if line == "" {
				continue // line terminator at the very end of the file: skip it
			}
			break // stop once we reach the end of the previous line
		}
		line = string(char) + line // there are more efficient ways
	}
	return line
}
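The comment in the code hints that there are more efficient ways. One possibility, sketched here as an assumption rather than as the answerer's method, is to read the file backward in blocks through io.ReaderAt instead of issuing one Seek and one Read per byte. The names lastLine, blockSize and demo are hypothetical; demo uses strings.NewReader as an in-memory stand-in for a file.

```go
package main

import (
	"bytes"
	"errors"
	"io"
	"strings"
)

// lastLine returns the final non-empty line readable from r, which holds
// size bytes. It reads blocks of blockSize bytes from the end toward the start.
func lastLine(r io.ReaderAt, size int64, blockSize int) (string, error) {
	if blockSize <= 0 {
		return "", errors.New("blockSize must be positive")
	}
	var tail []byte // bytes collected so far, in file order
	end := size
	for end > 0 {
		start := end - int64(blockSize)
		if start < 0 {
			start = 0
		}
		block := make([]byte, end-start)
		n, err := r.ReadAt(block, start)
		if err != nil && err != io.EOF {
			return "", err
		}
		tail = append(block[:n], tail...)
		end = start

		// Ignore line terminators at the very end of the data, then look
		// for the newline that precedes the last line.
		trimmed := bytes.TrimRight(tail, "\r\n")
		if i := bytes.LastIndexByte(trimmed, '\n'); i >= 0 {
			return string(trimmed[i+1:]), nil
		}
	}
	// No newline found: the whole input is a single line.
	return string(bytes.TrimRight(tail, "\r\n")), nil
}

// demo runs lastLine on two records of the sample log with "\r\n" endings.
// The 16-byte block size is deliberately small so several blocks are read.
func demo() (string, error) {
	data := "07/25/2013 11:57:00.800, 0.393\r\n07/25/2013 11:57:10.600, 0.456\r\n"
	return lastLine(strings.NewReader(data), int64(len(data)), 16)
}
```

With a real file you would pass the *os.File and the size from its Stat(), for example lastLine(f, fi.Size(), 4096). Larger blocks mean fewer system calls, at the cost of reading a little more than strictly needed.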
Answer 3
Score: 3
I think a combination of File.Seek(0, 2) and File.Read() should work.
The Seek call gets you to the end of the file (2 is the value of io.SeekEnd). You can Seek to a position a bit before EOF to get the last few lines. Then you Read until EOF and just sleep in your goroutine for 10 seconds; the next Read has a chance to get you more data.
You can borrow the idea (and the scan-back logic for initially showing the last few lines) from the GNU tail source.
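A rough sketch of that follow loop, under a few assumptions that are not in the answer: the file is reopened on every poll instead of keeping one handle open, and a file that has become shorter than the remembered offset is treated as truncated. The names follower, newFollower, poll and demo are hypothetical; demo appends to a temporary file to stand in for the logger.

```go
package main

import (
	"io"
	"os"
)

// follower remembers how far into the file it has already read.
type follower struct {
	path   string
	offset int64
}

// newFollower starts at the current end of the file, like Seek(0, io.SeekEnd).
func newFollower(path string) (*follower, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return nil, err
	}
	return &follower{path: path, offset: fi.Size()}, nil
}

// poll returns whatever was appended since the previous call.
func (f *follower) poll() ([]byte, error) {
	file, err := os.Open(f.path)
	if err != nil {
		return nil, err
	}
	defer file.Close()

	fi, err := file.Stat()
	if err != nil {
		return nil, err
	}
	if fi.Size() < f.offset {
		f.offset = 0 // assumption: a shrinking file was truncated, so start over
	}
	if _, err := file.Seek(f.offset, io.SeekStart); err != nil {
		return nil, err
	}
	data, err := io.ReadAll(file)
	f.offset += int64(len(data))
	return data, err
}

// demo polls before and after one line is appended to a temporary file.
func demo() (string, string, error) {
	tmp, err := os.CreateTemp("", "follow-*.log")
	if err != nil {
		return "", "", err
	}
	defer os.Remove(tmp.Name())
	defer tmp.Close()

	if _, err := tmp.WriteString("old line\n"); err != nil {
		return "", "", err
	}
	fl, err := newFollower(tmp.Name())
	if err != nil {
		return "", "", err
	}
	first, err := fl.poll() // nothing has been appended yet
	if err != nil {
		return "", "", err
	}
	if _, err := tmp.WriteString("new line\n"); err != nil {
		return "", "", err
	}
	second, err := fl.poll()
	return string(first), string(second), err
}
```

In the question's program, poll would be called once per tick of time.Tick(10 * time.Second), printing whatever it returns.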
Answer 4
Score: 2
Well, this is only a rough idea and maybe not the best way. You should check and improve it, but it seems to work...
I hope experienced Go users can contribute too.
With Stat you can get the size of the file, and from that the offset to use with ReadAt:
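// Requires the "fmt", "io" and "os" imports.
func readLastLine(fname string) {
	file, err := os.Open(fname)
	if err != nil {
		panic(err)
	}
	defer file.Close()

	fi, err := file.Stat()
	if err != nil {
		fmt.Println(err)
		return
	}
	buf := make([]byte, 32)
	// Clamp the offset so a file shorter than the buffer does not give a negative offset.
	offset := fi.Size() - int64(len(buf))
	if offset < 0 {
		offset = 0
	}
	n, err := file.ReadAt(buf, offset)
	if err != nil && err != io.EOF {
		fmt.Println(err)
		return
	}
	buf = buf[:n]
	fmt.Printf("%s", buf)
}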
Answer 5
Score: 2
I used tail for a smaller footprint. I'm not sure how it compares performance-wise.
// use "-1" as count for just the last line
func printLastLines(count, path string) {
	c := exec.Command("tail", count, path)
	output, err := c.Output()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(output))
}
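On Windows you have to do something like this (here count is "1" for just the last line):
func printLastWindows(count, path string) {
	ps, err := exec.LookPath("powershell.exe")
	if err != nil {
		fmt.Println(err)
		return
	}
	// Pass the pipeline as one -Command argument and quote the path,
	// so a path containing spaces is not split into separate arguments.
	script := fmt.Sprintf(`Get-Content '%s' | Select-Object -Last %s`, path, count)
	c := exec.Command(ps, "-NoProfile", "-Command", script)
	output, _ := c.Output()
	fmt.Println(string(output))
}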
Answer 6
Score: 1
https://go.dev/play/p/qKDFxiJQAfF
Here's the code I wrote for reading a large byte slice in reverse line order. It doesn't break on trailing whitespace.
The code loops over the bytes in reverse and counts the bytes it has encountered. When it detects a newline character, it goes back over that many bytes to get the line, append()s it to the resulting []byte, and then resets the count. It does this until the maxLine variable is satisfied.
This is overly complicated; if you just want to read bytes from a specific line, there might be a better way. The variable names are long for easier reading.
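func ReverseByte(fileByte []byte, maxLine int) []byte {
	// Byte value of the newline character "\n".
	nl := byte(10)
	var reverseFileByte []byte
	// lineLen counts the bytes seen since the last newline.
	var lineLen, lineWritten int
	byteIndex := len(fileByte) - 1
	// Skip one trailing newline so it is not counted as an empty line.
	if byteIndex >= 0 && fileByte[byteIndex] == nl {
		byteIndex--
	}
	// The byteIndex check keeps a short input from indexing below zero.
	for lineWritten < maxLine && byteIndex >= 0 {
		if fileByte[byteIndex] == nl {
			// The line is the lineLen bytes that follow this newline.
			start := byteIndex + 1
			reverseFileByte = append(reverseFileByte, fileByte[start:start+lineLen]...)
			reverseFileByte = append(reverseFileByte, nl)
			lineLen = 0
			lineWritten++
		} else {
			lineLen++
		}
		byteIndex--
	}
	// The first line of the data has no newline in front of it.
	if lineWritten < maxLine && lineLen > 0 {
		reverseFileByte = append(reverseFileByte, fileByte[:lineLen]...)
		reverseFileByte = append(reverseFileByte, nl)
	}
	return reverseFileByte
}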