For each Word in a File, Find If current Word is Present More than Once
Question
I'm very new to Golang and I'm having some issues trying to find and print all the lines in a file that contain the same value.

My file is structured like this:

    index text
    index text
    .
    .
    .
    index text

where `index` is ALWAYS 6 digits long and `text` is ALWAYS 16 digits long.

> I need to find and print all the lines that contain the same `text` value.

This is what I have tried so far:

    func main() {
        // Array to contain common texts
        found := make([]string, 6)
        r, _ := os.Open("store.txt")
        scanner := bufio.NewScanner(r)
        // Split into words
        scanner.Split(bufio.ScanWords)
        // Loop over all words in the file
        for scanner.Scan() {
            line := scanner.Text()
            // If the current word is 16 digits long
            if utf8.RuneCountInString(line) == 16 {
                currLine := line
                // Search the same file for all the 16-digit texts
                for scanner.Scan() {
                    searchLine := scanner.Text()
                    // If the same text is found
                    if utf8.RuneCountInString(searchLine) == 16 {
                        // Append it to the found array
                        if currLine == searchLine {
                            found = append(found, currLine)
                        }
                    }
                }
            }
        }
        // Print the found array
        fmt.Println(found)
        // Close the file
        r.Close()
    }

Then I would like to use `found` to print all the lines that match the current `found[i]` element.

The code above works only for the very first step. For instance, if the very first line of my file contains 1234567890123456 (e.g. at index 1), it is checked and appended only once; the loop does not run for the remaining n-1 words.

- How can I fix the first issue?
- Do you think adding the duplicate texts to an array and then printing the matching lines based on it is a bad idea?

Thanks in advance.

Answer 1
Score: 1
The first issue occurs because you use the same scanner both to read the file and to check for duplicates. When the inner `for` loop reaches the end of the file, the outer `for` checks whether there is anything left to scan, finds EOF, and exits.

The easiest way to solve your problem is to keep a slice of every text you see for the first time; when a text value is already present, print that line as a duplicate. Something like this:

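    // Note: the scanner must read whole lines here (the default split
    // function), not words as in the question.
    duplicates := make([]string, 0)
    for scanner.Scan() {
        line := scanner.Text()
        // Skip lines too short to hold the 6-digit index
        if len(line) < 6 {
            continue
        }
        // Everything after the index is the text; trimming assumes the
        // two columns are separated by whitespace
        text := strings.TrimSpace(line[6:])
        // Do your checks here (e.g. that text is 16 digits long)
        if !contains(duplicates, text) {
            // First time this text is seen: remember it
            duplicates = append(duplicates, text)
        } else {
            // Seen before: print the duplicate
            fmt.Println(line)
        }
    }
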
And here is the `contains` implementation:

    func contains(s []string, e string) bool {
        for _, a := range s {
            if a == e {
                return true
            }
        }
        return false
    }