英文:
Faster input scanning
问题
我正在尝试解决SPOJ问题,可以在这里找到。
以下是我的解决方案:
package main
import "fmt"
import "bufio"
import "os"
func main() {
var n, k int
var num int
var divisible int
in := bufio.NewReader(os.Stdin)
fmt.Fscan(in, &n)
fmt.Fscan(in, &k)
for n > 0 {
fmt.Fscan(in, &num)
if num%k == 0 {
divisible++
}
n--
}
fmt.Println(divisible)
}
代码运行正常。问题在于当我在SPOJ上执行时,会出现超时的问题。
我最初只使用了fmt.Scan
,但后来我看到了这个帖子,建议我改用bufio
来进行更快的输入扫描。
但我仍然遇到超时问题。我只是在循环中获取所有的输入,并在循环内部确定输入是否可被整除。所以,我认为问题不在于循环,而是在于输入扫描上花费的时间。我该如何改进以更快地读取输入?或者问题出在其他地方吗?
英文:
I am attempting to solve the SPOJ question that can be found here
Following is my solution:
package main
import "fmt"
import "bufio"
import "os"
func main() {
var n, k int
var num int
var divisible int
in := bufio.NewReader(os.Stdin)
fmt.Fscan(in, &n)
fmt.Fscan(in, &k)
for n > 0 {
fmt.Fscan(in, &num)
if num%k == 0 {
divisible++
}
n--
}
fmt.Println(divisible)
}
The code works fine. The issue here is that I get a timeout when I execute it in SPOJ.
I was first using only fmt.Scan
but I then came across this thread that suggested that I use bufio
instead for faster input scanning.
But I still get a timeout issue. I am only looping to get all the inputs and within this loop itself I determine whether the input is divisible or not. So, I believe that its not the loop but the input scanning that's taking time. How can I improve this to read the input faster? Or is the issue somewhere else?
答案1
得分: 16
你可以使用bufio.Scanner
从输入中读取行。
由于我们总是读取数字,我们可以创建一个高度优化的转换器来获取数字。我们应该避免使用Scanner.Text()
,它会创建一个string
,而我们可以直接从Scanner.Bytes()
返回的原始字节中获取数字。Scanner.Text()
返回与Scanner.Bytes()
相同的标记,但它首先转换为string
,显然更慢,并且会生成“垃圾”并给垃圾回收器带来额外的工作。
所以这里有一个从原始字节中获取int
的转换器函数:
func toInt(buf []byte) (n int) {
for _, v := range buf {
n = n*10 + int(v-'0')
}
return
}
这个toInt()
函数之所以有效,是因为[]byte
包含数字的十进制表示的UTF-8编码字节序列,其中只包含'0'..'9'
范围内的数字,其UTF-8编码字节是一对一映射的(一个字节用于一个数字)。从数字到字节的映射只是一个偏移:'0' -> 48
,'1' -> 49
等等。
使用这个转换器的完整应用程序如下:
package main
import (
"bufio"
"fmt"
"os"
)
func main() {
var n, k, c int
scanner := bufio.NewScanner(os.Stdin)
scanner.Scan()
fmt.Sscanf(scanner.Text(), "%d %d", &n, &k)
for ; n > 0; n-- {
scanner.Scan()
if toInt(scanner.Bytes())%k == 0 {
c++
}
}
fmt.Println(c)
}
func toInt(buf []byte) (n int) {
for _, v := range buf {
n = n*10 + int(v-'0')
}
return
}
这个解决方案比例如调用strconv.Atoi()
要快大约4倍。
注意:
在上面的解决方案中,我假设输入是有效的,即它总是包含有效的数字,并且在第一行之后至少包含n
行(给出了n
和k
)。
如果输入在第n+1
行后关闭,我们可以使用简化的for
循环(我们甚至不需要递减并依赖于n
):
for scanner.Scan() {
if toInt(scanner.Bytes())%k == 0 {
c++
}
}
英文:
You can use bufio.Scanner
to read lines from the input.
And since we're always reading numbers, we can create a highly optimized converter to get the number. We should avoid using Scanner.Text()
which creates a string
as we can obtain the number just from the raw bytes returned by Scanner.Bytes()
. Scanner.Text()
returns the same token as Scanner.Bytes()
but it first converts to string
which is obviously slower and generates "garbage" and work for the gc.
So here is a converter function which obtains an int
from the raw bytes:
func toInt(buf []byte) (n int) {
for _, v := range buf {
n = n*10 + int(v-'0')
}
return
}
This toInt()
works because the []byte
contains the UTF-8 encoded byte sequence of the string representation of the decimal format of the number, which contains only digits in the range of '0'..'9'
whose UTF-8 encoded bytes are mapped one-to-one (one byte is used for one digit). The mapping from digit to byte is simply a shift: '0' -> 48
, '1' -> 49
etc.
Using this your complete application:
package main
import (
"bufio"
"fmt"
"os"
)
func main() {
var n, k, c int
scanner := bufio.NewScanner(os.Stdin)
scanner.Scan()
fmt.Sscanf(scanner.Text(), "%d %d", &n, &k)
for ;n > 0; n-- {
scanner.Scan()
if toInt(scanner.Bytes())%k == 0 {
c++
}
}
fmt.Println(c)
}
func toInt(buf []byte) (n int) {
for _, v := range buf {
n = n*10 + int(v-'0')
}
return
}
This solution is about 4 times faster than calling strconv.Atoi()
for example.
Notes:
In the above solution I assumed input is valid, that is it always contains valid numbers and contains at least n
lines after the first (which gives us n
and k
).
If the input is closed after n+1
lines, we can use a simplified for
(and we don't even need to decrement and rely on n
):
for scanner.Scan() {
if toInt(scanner.Bytes())%k == 0 {
c++
}
}
答案2
得分: 2
尝试使用bufio.Scanner
(如你提到的线程中建议的):
fmt.Scan(&n)
fmt.Scan(&k)
scanner := bufio.NewScanner(os.Stdin)
for n > 0 {
scanner.Scan()
k, _ := strconv.Atoi(scanner.Text())
...
}
英文:
Try using bufio.Scanner
(as suggested in the thread you mentioned):
fmt.Scan(&n)
fmt.Scan(&k)
scanner := bufio.NewScanner(os.Stdin)
for n > 0 {
scanner.Scan()
k, _ := strconv.Atoi(scanner.Text())
...
答案3
得分: 1
我编写了3个版本来进行比较。
第一个版本使用fmt.Scanf("%d", &v)
,第二个版本使用字节转换数字(像@icza一样),第三个版本使用strconv.Atoi
进行转换。为了使用这些函数,我初始化了扫描器的方式如下:
scanner := bufio.NewScanner(os.Stdin)
scanner.Split(bufio.ScanWords)
因此,每次调用scanner.Scan,它都会返回一个由空格分隔的标记。
以下是这些函数的实现:
func scanFromBytes(scanner *bufio.Scanner) (n int) {
scanner.Scan()
buf := scanner.Bytes()
for _, v := range buf {
n = n*10 + int(v-'0')
}
return
}
和:
func scanAtoi(scanner *bufio.Scanner) (n int) {
scanner.Scan()
n, _ = strconv.Atoi(scanner.Text())
return
}
我已经使用一个大文件(40k个测试),每个测试读取大约8个整数。
fmt.Scanf
的解决方案大约需要1.9秒,这是预期的(比其他解决方案更长)。
在这两个函数中,我得到了大约0.8秒的时间。但是scanAtoi
总是比scanFromBytes
快大约0.05秒,除了第一次(可能发生了一些缓存)。
英文:
I coded 3 versions to compare them.
The first using fmt.Scanf("%d", &v)
, the second converting numbers from bytes (like @icza), and the third converting using strconv.Atoi
. To use the functions I initializated scanner in that way:
scanner := bufio.NewScanner(os.Stdin)
scanner.Split(bufio.ScanWords)
So, every time I call scanner.Scan, it returns a token splitted by spaces.
And the functions follow:
func scanFromBytes(scanner *bufio.Scanner) (n int) {
scanner.Scan()
buf := scanner.Bytes()
for _, v := range buf {
n = n*10 + int(v-'0')
}
return
}
And:
func scanAtoi(scanner *bufio.Scanner) (n int) {
scanner.Scan()
n, _ = strconv.Atoi(scanner.Text())
return
}
I have tested with a big file (40k tests), reading about 8 integers per test.
The fmt.Scanf
solution, takes about 1.9s, as expected (more than the others).
In the two functions I got about 0.8s. But the scanAtoi
always takes about 0.05s less than scanFromBytes
, except for the very first time (maybe some caching occurs).
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论