英文:
CS50 speller.c segmentation fault with hash table in load function
问题
我正在学习 CS50,目前在解决问题集 'speller.c'。抱歉,如果这只是一个简单的修复,但我对 C 语言的理解还很肤浅。
我已经编写了所有的函数,但当我尝试运行程序时,出现了段错误。使用 debug50 调试后,它告诉我段错误与哈希表有关,或者是一个叫做 node 的自定义数据类型的数组。Debug50 告诉我,当我尝试运行一个 if 语句时发生了段错误,即 "if (table[hashnum]->next == NULL)",这在 'load' 函数中。
我在网上查找了段错误可能发生的原因,但据我所了解,它发生在访问已释放的指针时,当我没有足够的空间时,或者当我试图访问我不允许访问的内存,或者尝试写入内存的“只读”部分。另外,根据我对初始化全局数组的了解,程序开头创建的数组应允许我在其中读取和写入,所以我不确定我做错了什么。
感谢任何帮助和解释,谢谢
以下是我的代码。
此外,问题集的指南和目标可以在 https://cs50.harvard.edu/x/2023/psets/5/speller/ 找到。
(代码部分已省略)
英文:
I'm taking CS50 and currently on problem set 'speller.c'. Sorry if this is a simple fix, but i'm just barely understanding C.
I've written all of the functions, but when I try to run the program I get a segmentation fault. After using debug50, it tells me that the segmentation fault has something to do with the hash table, or the array of a custom data type called node. Debug50 tells me that it happens when i try to run an if statement, which is "if (table[hashnum]->next == NULL)", which is in the 'load' function.
I've looked online as to why the seg fault could be happening, but from what I understand, it happens when I access freed pointers, when i dont have enough space, when i try to access memory i'm not allowed to access, or try to write in a 'read only' part of memory. Also, from what I understand about initializing global arrays, the one made in the beginning of the program should allow me to read and write in it, so I'm not sure what i'm doing wrong.
Any help and explanations are appreciated, thanks
Below is my code.
Also, the guidelines and goal of the problem set are found at https://cs50.harvard.edu/x/2023/psets/5/speller/.
// Implements a dictionary's functionality
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include "dictionary.h"
// Represents a node in a hash table
typedef struct node
{
char word[LENGTH + 1];
struct node *next;
}
node;
// TODO: Choose number of buckets in hash table
const unsigned int N = 26;
// Hash table
node *table[N];
// Global Variables
char tmp[LENGTH + 1];
int wordcount = 0;
bool hashnull = false;
// New function prototype
void freehash(node* node);
// Returns true if word is in dictionary, else false
bool check(const char *word)
{
// Runs word through hash function
int hashnum = hash(word);
node * tmpnode = table[hashnum]->next;
// Compares all words in linked list given by hashnum to 'word' to see if any match/if 'word' is spelled correctly
while (strcasecmp(word, tmpnode->word) != 0)
{
if (tmpnode->next == NULL)
{
return false;
}
tmpnode = tmpnode->next;
}
return true;
}
// Hashes word to a number
unsigned int hash(const char *word)
{
// TODO: Improve this hash function
return toupper(word[0]) - 'A';
}
// Loads dictionary into memory, returning true if successful, else false
bool load(const char *dictionary)
{
// Opens dictionary
FILE *dict = fopen(dictionary, "r");
// Checks if dictionary opened successfully
if (dict == NULL)
{
fclose(dict);
printf("\n\n Could not load dictionary.\n");
return false;
}
// Load dictionary into hash table
node *tmpnode;
int hashnum;
char check;
while(fread(&check, sizeof(char), 1, dict))
{
fseek(dict, -1, SEEK_CUR);
//// Take a single word from dictionary and put into dictword (new node)
// Make new node/element to add to hash table
node *dictword = malloc(sizeof(node));
if (dictword == NULL)
{
fclose(dict);
free(dictword);
return false;
}
tmpnode = dictword;
// Iterates through all of the letters in a word from dictionary and ends when next line (/n) is read
int tmpctr = 0;
while(fread(&dictword->word[tmpctr], sizeof(char), 1, dict))
{
if(dictword->word[tmpctr] == '\n')
{
break;
}
else
{
tmpctr++;
}
}
// Run hash function to see where in hash table new word will go
hashnum = hash(dictword->word);
//// Put new node for current word into hash table
if (table[hashnum]->next == NULL)
{
table[hashnum]->next = tmpnode;
dictword->next = NULL;
}
else
{
tmpnode = table[hashnum]->next;
dictword->next = tmpnode;
table[hashnum]->next = dictword;
}
// Increase wordcount
wordcount++;
}
// Close dictionary stream (after success)
fclose(dict);
// ***** NO NEED TO MALLOC HERE; THATS WHAT THE UNLOAD FUNCTION IS FOR
return true;
}
// Returns number of words in dictionary if loaded, else 0 if not yet loaded
unsigned int size(void)
{
if (wordcount > 0)
{
return wordcount;
}
else
{
return 0;
}
}
// Unloads dictionary from memory, returning true if successful, else false
bool unload(void)
{
node * tmpnode;
for(int i = 0; i < N; i++)
{
tmpnode = table[i]->next;
freehash(tmpnode);
}
return true;
}
void freehash(node* node)
{
// If node currently being 'freed' isn't the last node, call freehash() with next node in line
if (node->next != NULL)
{
freehash(node->next);
}
free(node);
return;
}
答案1
得分: 1
-
load()
: 如果fopen()
失败,你会执行fclose(NULL)
,导致段错误。 -
load()
: 由于node *table[N]
是全局变量,它会被初始化为零。在load()
函数中,你执行table[hashnum]->next
,但会导致段错误,因为table[hashnum]
为NULL。也许你想要这样做:
if (!table[hashnum]) {
table[hashnum] = tmpnode;
} else if (table[hashnum]->next) {
另外,将tmpnode
的作用范围最小化,只在需要它的地方使用它(即在设置->next
的情况下)。这样可以使你的代码更容易阅读。
load()
: 目前你包括了文件中的'\n'
,但你可能不应该包括它,而是应该以NUL终止你的字符串:
size_t i = 0;
for(; i < LENGTH && fread(&dictword->word[i], 1, 1, dict) && dictword->word[i] != '\n'; i++);
dictword->word[i] = 'size_t i = 0;
for(; i < LENGTH && fread(&dictword->word[i], 1, 1, dict) && dictword->word[i] != '\n'; i++);
dictword->word[i] = '\0';
';
load()
: 文件读取逻辑有点奇怪,首先读取一个字节,然后将文件指针回退,然后逐个字母地读取,而不检查单词是否过长。正如@Barmar建议的,可以使用fgets()
获取一行,然后使用strcspn()
将\n
替换为\0
。
英文:
-
load()
: Iffopen()
fails youfclose(NULL)
which segfaults. -
load()
: Asnode *table[N]
is a global variable it is zero initialized. Inload()
you dotable[hashnum]->next
which segfaults astable[hashnum]
is NULL. Maybe you want:
if (!table[hashnum]) {
table[hashnum] = tmpnode;
} else if (table[hashnum]->next) {
As aside, minimize the scope of tmpnode
to just where it's needed (which is the case where ->next is set). This makes your code easier to read.
load()
: You currently include the '\n' of the file but you probably shouldn't and instead want to NUL terminate your string instead:
size_t i = 0;
for(; i < LENGTH && fread(&dictword->word[i], 1, 1, dict) && dictword->word[i] != '\n'; i++);
dictword->word[i] = ' size_t i = 0;
for(; i < LENGTH && fread(&dictword->word[i], 1, 1, dict) && dictword->word[i] != '\n'; i++);
dictword->word[i] = '\0';
';
load()
: The file reading logic is kinda odd, read one byte, then you back up the file pointer then read letter by letter without checking if word is too long. As @Barmar suggest, just get a line withfgets()
, then usestrcspn()
to replace the\n
with a\0
.
答案2
得分: 0
Code also has subtle bugs beyond CS50 expectations.
OP hashed with return toupper(word[0]) - 'A';
-
This should be done as
return toupper(((unsigned char *)word)[0]) - 'A';
aschar
may be signed withword[0] < 0
andtoupper(int)
is defined forunsigned char
values andEOF
. Characters should be examined as if there areunsigned char
even whenchar
is signed. -
strcasecmp()
, although not a standard function, more often converts to lower (thantoupper()
) and then compares. When the case mapping is not 1-to-1, e.g.toupper()
mapsÿ
andy
toY
, buttolower()
mapsY
toy
. The hash withtoupper
will not return 0 withstrcasecmp("ÿ","y")
. Best to use the same caseto
-ness.
英文:
Code also has subtle bugs beyond CS50 expectations.
OP hashed with return toupper(word[0]) - 'A';
-
This should be done as
return toupper(((unsigned char *)word)[0]) - 'A';
aschar
may be signed withword[0] < 0
andtoupper(int)
is defined forunsigned char
values andEOF
. Characters should be examined as if there areunsigned char
even whenchar
is signed. -
strcasecmp()
, although not a standard function, more often converts to lower (thantopper()
) and then compares. When the case mapping is not 1-to-1, e.g.toupper()
mapsÿ
andy
toY
, buttolower()
mapsY
toy
. The hash withtoupper
will not return 0 withstrcasecmp("ÿ","y")
. Best to use the same caseto
-ness.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论