Strip consecutive empty lines in a golang writer

Question

I've got a Go text/template that renders a file; however, I've found it difficult to structure the template cleanly while preserving the line breaks in the output.

I'd like to have additional, unnecessary newlines in the template to make it more readable, but strip them from the output. Any group of newlines more than a normal paragraph break should be condensed to a normal paragraph break, e.g.

lines with



too many breaks should become lines with

normal paragraph breaks.

The string is potentially too large to store safely in memory, so I want to keep it as an output stream.

My first attempt:

type condensingWriter struct {
	writer          io.Writer
	lastLineIsEmpty bool
}

func (c condensingWriter) Write(b []byte) (n int, err error) {
	thisLineIsEmpty := strings.TrimSpace(string(b)) == ""
	defer func() {
		c.lastLineIsEmpty = thisLineIsEmpty
	}()
	if c.lastLineIsEmpty && thisLineIsEmpty {
		return 0, nil
	}
	return c.writer.Write(b)
}

This doesn't work because I naively assumed that it would buffer on newline characters, but it doesn't.

Any suggestions on how to get this to work?

Answer 1

Score: 1

Inspired by zmb's approach, I've come up with the following package:

// Package striplines strips runs of consecutive empty lines from an output stream.
package striplines

import (
  "io"
  "strings"
)

// Striplines wraps an output stream, stripping runs of consecutive empty lines.
// You must call Close before the output stream will be complete.
// Implements io.WriteCloser.
type Striplines struct {
  Writer      io.Writer
  lastLine    []byte
  currentLine []byte
}

// Write buffers p until at least one complete line is available, then
// forwards the complete lines. On success it reports len(p), as the
// io.Writer contract requires, even when bytes were buffered or dropped.
func (w *Striplines) Write(p []byte) (int, error) {
  if !strings.Contains(string(p), "\n") {
    // No newline yet: keep accumulating the current line.
    w.currentLine = append(w.currentLine, p...)
    return len(p), nil
  }
  cur := string(append(w.currentLine, p...))
  lastN := strings.LastIndex(cur, "\n")
  for _, line := range strings.Split(cur[:lastN], "\n") {
    _, err := w.writeLn(line + "\n")
    w.lastLine = []byte(line)
    if err != nil {
      return 0, err
    }
  }
  // Whatever follows the final newline is an incomplete line; keep it for later.
  w.currentLine = []byte(cur[lastN+1:])
  return len(p), nil
}

// Close flushes the last of the output into the underlying writer.
func (w *Striplines) Close() error {
  _, err := w.writeLn(string(w.currentLine))
  w.currentLine = nil
  return err
}

// writeLn forwards line unless both it and the previous line are blank.
func (w *Striplines) writeLn(line string) (n int, err error) {
  if strings.TrimSpace(string(w.lastLine)) == "" && strings.TrimSpace(line) == "" {
    return 0, nil
  }
  return w.Writer.Write([]byte(line))
}

See it in action here: http://play.golang.org/p/t8BGPUMYhb

Answer 2

Score: 0


The general idea is you'll have to look for consecutive newlines anywhere in the input slice and if such cases exist, skip over all but the first newline character.

Additionally, you have to track whether the last byte written was a newline, so the next call to Write will know to eliminate a newline if necessary. You were on the right track by adding a bool to your writer type. However, you'll want to use a pointer receiver instead of a value receiver here, otherwise you'll be modifying a copy of the struct.

You would want to change

func (c condensingWriter) Write(b []byte)

to

func (c *condensingWriter) Write(b []byte)

You could try something like this. You'll have to test with larger inputs to make sure it handles all cases correctly.

package main

import (
	"io"
	"os"
)

const Newline = byte('\n')

// ReduceNewlinesWriter collapses runs of three or more newlines into two,
// leaving at most one empty line between paragraphs.
type ReduceNewlinesWriter struct {
	w io.Writer
	// newlines counts the consecutive newlines at the end of the output so far.
	// A counter rather than a bool means a run that is split across two
	// Write calls is treated the same as a run inside a single call, and an
	// input that is empty or ends in a newline no longer indexes out of range.
	newlines int
}

func (r *ReduceNewlinesWriter) Write(b []byte) (int, error) {
	start := 0 // start of the pending chunk that has not been written yet
	for i, c := range b {
		if c != Newline {
			r.newlines = 0
			continue
		}
		r.newlines++
		if r.newlines > 2 {
			// surplus newline: write what precedes it, then skip over it
			if start < i {
				if _, err := r.w.Write(b[start:i]); err != nil {
					return start, err
				}
			}
			start = i + 1
		}
	}
	if start < len(b) {
		if _, err := r.w.Write(b[start:]); err != nil {
			return start, err
		}
	}
	// report the whole input as consumed, including the skipped newlines
	return len(b), nil
}

func main() {
	r := ReduceNewlinesWriter{
		w: os.Stdout,
	}
	io.WriteString(&r, "this\n\n\n\n\n\n\nhas\nmultiple\n\n\nnewline\n\n\n\ncharacters")
}

huangapple
  • Posted on February 6, 2015, 04:21:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/28353313.html