2015年7月10日 13:49:20go评论81阅读模式

英文:

Faster input scanning

问题

我正在尝试解决SPOJ问题，可以在这里找到。

以下是我的解决方案：

package main

import "fmt"
import "bufio"
import "os"

func main() {
    var n, k int
    var num int
    var divisible int

    in := bufio.NewReader(os.Stdin)

    fmt.Fscan(in, &n)
    fmt.Fscan(in, &k)

    for n > 0 {
        fmt.Fscan(in, &num)

        if num%k == 0 {
            divisible++
        }

        n--
    }

    fmt.Println(divisible)
}

代码运行正常。问题在于当我在SPOJ上执行时，会出现超时的问题。

我最初只使用了fmt.Scan，但后来我看到了这个帖子，建议我改用bufio来进行更快的输入扫描。

但我仍然遇到超时问题。我只是在循环中获取所有的输入，并在循环内部确定输入是否可被整除。所以，我认为问题不在于循环，而是在于输入扫描上花费的时间。我该如何改进以更快地读取输入？或者问题出在其他地方吗？

英文:

I am attempting to solve the SPOJ question that can be found here

Following is my solution:

package main

import &quot;fmt&quot;
import &quot;bufio&quot;
import &quot;os&quot;

func main() {
	var n, k int
	var num int
	var divisible int

	in := bufio.NewReader(os.Stdin)

	fmt.Fscan(in, &amp;n)
	fmt.Fscan(in, &amp;k)

	for n &gt; 0 {
		fmt.Fscan(in, &amp;num)

		if num%k == 0 {
			divisible++
		}

		n--
	}

	fmt.Println(divisible)
}

The code works fine. The issue here is that I get a timeout when I execute it in SPOJ.

I was first using only fmt.Scan but I then came across this thread that suggested that I use bufio instead for faster input scanning.

But I still get a timeout issue. I am only looping to get all the inputs and within this loop itself I determine whether the input is divisible or not. So, I believe that its not the loop but the input scanning that's taking time. How can I improve this to read the input faster? Or is the issue somewhere else?

答案1

得分: 16

你可以使用bufio.Scanner从输入中读取行。

由于我们总是读取数字，我们可以创建一个高度优化的转换器来获取数字。我们应该避免使用Scanner.Text()，它会创建一个string，而我们可以直接从Scanner.Bytes()返回的原始字节中获取数字。Scanner.Text()返回与Scanner.Bytes()相同的标记，但它首先转换为string，显然更慢，并且会生成“垃圾”并给垃圾回收器带来额外的工作。

所以这里有一个从原始字节中获取int的转换器函数：

func toInt(buf []byte) (n int) {
    for _, v := range buf {
        n = n*10 + int(v-'0')
    }
    return
}

这个toInt()函数之所以有效，是因为[]byte包含数字的十进制表示的UTF-8编码字节序列，其中只包含'0'..'9'范围内的数字，其UTF-8编码字节是一对一映射的（一个字节用于一个数字）。从数字到字节的映射只是一个偏移：'0' -> 48，'1' -> 49等等。

使用这个转换器的完整应用程序如下：

package main

import (
    "bufio"
    "fmt"
    "os"
)

func main() {
    var n, k, c int
    scanner := bufio.NewScanner(os.Stdin)

    scanner.Scan()
    fmt.Sscanf(scanner.Text(), "%d %d", &n, &k)

    for ; n > 0; n-- {
        scanner.Scan()
        if toInt(scanner.Bytes())%k == 0 {
            c++
        }
    }

    fmt.Println(c)
}

func toInt(buf []byte) (n int) {
    for _, v := range buf {
        n = n*10 + int(v-'0')
    }
    return
}

这个解决方案比例如调用strconv.Atoi()要快大约4倍。

注意：

在上面的解决方案中，我假设输入是有效的，即它总是包含有效的数字，并且在第一行之后至少包含n行（给出了n和k）。

如果输入在第n+1行后关闭，我们可以使用简化的for循环（我们甚至不需要递减并依赖于n）：

for scanner.Scan() {
    if toInt(scanner.Bytes())%k == 0 {
        c++
    }
}

英文:

You can use bufio.Scanner to read lines from the input.

And since we're always reading numbers, we can create a highly optimized converter to get the number. We should avoid using Scanner.Text() which creates a string as we can obtain the number just from the raw bytes returned by Scanner.Bytes(). Scanner.Text() returns the same token as Scanner.Bytes() but it first converts to string which is obviously slower and generates "garbage" and work for the gc.

So here is a converter function which obtains an int from the raw bytes:

func toInt(buf []byte) (n int) {
	for _, v := range buf {
		n = n*10 + int(v-&#39;0&#39;)
	}
	return
}

This toInt() works because the []byte contains the UTF-8 encoded byte sequence of the string representation of the decimal format of the number, which contains only digits in the range of '0'..'9' whose UTF-8 encoded bytes are mapped one-to-one (one byte is used for one digit). The mapping from digit to byte is simply a shift: '0' -> 48, '1' -> 49 etc.

Using this your complete application:

package main

import (
	&quot;bufio&quot;
	&quot;fmt&quot;
	&quot;os&quot;
)

func main() {
	var n, k, c int
	scanner := bufio.NewScanner(os.Stdin)

	scanner.Scan()
	fmt.Sscanf(scanner.Text(), &quot;%d %d&quot;, &amp;n, &amp;k)

	for ;n &gt; 0; n-- {
		scanner.Scan()
		if toInt(scanner.Bytes())%k == 0 {
			c++
		}
	}

	fmt.Println(c)
}

func toInt(buf []byte) (n int) {
	for _, v := range buf {
		n = n*10 + int(v-&#39;0&#39;)
	}
	return
}

This solution is about 4 times faster than calling strconv.Atoi() for example.

Notes:

In the above solution I assumed input is valid, that is it always contains valid numbers and contains at least n lines after the first (which gives us n and k).

If the input is closed after n+1 lines, we can use a simplified for (and we don't even need to decrement and rely on n):

for scanner.Scan() {
	if toInt(scanner.Bytes())%k == 0 {
		c++
	}
}

答案2

得分: 2

尝试使用bufio.Scanner（如你提到的线程中建议的）：

fmt.Scan(&n)
fmt.Scan(&k)

scanner := bufio.NewScanner(os.Stdin)
for n > 0 {
    scanner.Scan()
    k, _ := strconv.Atoi(scanner.Text())
    ...
}

英文:

Try using bufio.Scanner (as suggested in the thread you mentioned):

fmt.Scan(&amp;n)
fmt.Scan(&amp;k)

scanner := bufio.NewScanner(os.Stdin)
for n &gt; 0 {
    scanner.Scan()
    k, _ := strconv.Atoi(scanner.Text())
    ...

答案3

得分: 1

我编写了3个版本来进行比较。
第一个版本使用fmt.Scanf("%d", &v)，第二个版本使用字节转换数字（像@icza一样），第三个版本使用strconv.Atoi进行转换。为了使用这些函数，我初始化了扫描器的方式如下：

scanner := bufio.NewScanner(os.Stdin)
scanner.Split(bufio.ScanWords)

因此，每次调用scanner.Scan，它都会返回一个由空格分隔的标记。
以下是这些函数的实现：

func scanFromBytes(scanner *bufio.Scanner) (n int) {
    scanner.Scan()

    buf := scanner.Bytes()
    for _, v := range buf {
        n = n*10 + int(v-'0')
    }

    return
}

和：

func scanAtoi(scanner *bufio.Scanner) (n int) {
    scanner.Scan()

    n, _ = strconv.Atoi(scanner.Text())

    return
}

我已经使用一个大文件（40k个测试），每个测试读取大约8个整数。
fmt.Scanf的解决方案大约需要1.9秒，这是预期的（比其他解决方案更长）。
在这两个函数中，我得到了大约0.8秒的时间。但是scanAtoi总是比scanFromBytes快大约0.05秒，除了第一次（可能发生了一些缓存）。

英文:

I coded 3 versions to compare them.
The first using fmt.Scanf("%d", &v), the second converting numbers from bytes (like @icza), and the third converting using strconv.Atoi. To use the functions I initializated scanner in that way:

scanner := bufio.NewScanner(os.Stdin)
scanner.Split(bufio.ScanWords)

So, every time I call scanner.Scan, it returns a token splitted by spaces.
And the functions follow:

func scanFromBytes(scanner *bufio.Scanner) (n int) {
    scanner.Scan()

    buf := scanner.Bytes()
    for _, v := range buf {
        n = n*10 + int(v-&#39;0&#39;)
    }

    return
}

And:

func scanAtoi(scanner *bufio.Scanner) (n int) {
    scanner.Scan()

    n, _ = strconv.Atoi(scanner.Text())

    return
}

I have tested with a big file (40k tests), reading about 8 integers per test.
The fmt.Scanf solution, takes about 1.9s, as expected (more than the others).
In the two functions I got about 0.8s. But the scanAtoi always takes about 0.05s less than scanFromBytes, except for the very first time (maybe some caching occurs).

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

更快的输入扫描

问题

答案1

答案2

答案3

修改正在运行的golang Docker容器，使用正在运行的shell。

将通道的所有元素消耗到一个切片中。

无法找到 /oauth/device/code Auth0 Java API。

将字符串转换为Float64：在单值上下文中的多值strconv.ParseFloat()。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论