Manipulating argv (assumptions) and language loopholes

Question

When parsing through argv, referring to

int main (int argc, char *argv[]) {}

I understand that argv[argc] == NULL according to the standard, but is it guaranteed that argv[i] != NULL where i < argc?

In other words, if I perform argv[i][0] where i < argc, am I guaranteed to not segfault, because argv[i] != NULL?

I think that programs could break the argv rules. Consider these:

execlp("malicious_program", "ls", NULL); // program name not in argv[0]
execlp("ls", "ls", NULL, "-al", NULL); // NULL prior to end
execlp("ls", "ls"); // no NULL at end

Does the operating system provide safeguards (I think it could simply calculate argc from the number of arguments to exec), and if not, does that mean that these assumptions cannot safely be made?

When writing my own program, can I therefore use an idiom such as:

if (*argv[i] == NULL) break; // infinite loop if no NULL terminator, miss args if NULL middle

This last bit is where my question stemmed from.

Answer 1

Score: 2

execlp("malicious_program", "ls", NULL); // program name not in argv[0]

There's nothing wrong with this. Many programs can be run with different argv[0] values, and they use this as a flag to execute differently.

For example, if a shell is run with - as the first character of argv[0], it executes as a login shell. This is a historical artifact: there was no standard for arguments to shells, yet the login system needed a way of telling the shell to operate in login mode.

On some systems, sh and bash are the same program; it checks argv[0] to determine whether to enable bash extensions.

If telnet is executed with some other argv[0], it's taken to be the destination hostname; this allows you to make symlinks to telnet with the name of a server, and use them as shortcuts.

Programs that don't make any special use of argv[0] (the vast majority) completely ignore it. Putting a "malicious" value there will have no effect.

Why would malicious_program care that you put ls in argv[0]? Did you write that backwards, and intend this:

execlp("ls", "malicious_program", NULL);

execlp("ls", "ls", NULL, "-al", NULL); // NULL prior to end

Any arguments after NULL will be ignored, because the NULL argument is how execlp() determines where the arguments end. Variadic argument lists don't provide any way for the function to determine the actual number of arguments. So no argv[i] with i < argc can be null: the first null pointer is what determines the value of argc.

execlp("ls", "ls"); // no NULL at end

This causes undefined behavior. Without the terminating NULL, execlp() doesn't know how many arguments there are (see above), so it reads past the arguments that were actually passed and tries to put whatever it finds into argv.

As for whether it's safe to access these strings, I think it always should be. When the program loader constructs argv, it builds a brand-new set of strings; it doesn't simply pass along the pointers that were provided to exec*(). In memory, the arguments occupy a single block, with each string stored right after the previous one. E.g. if argv is

argv[0] = "ls"
argv[1] = "ls"
argv[2] = "filename"
argv[3] = NULL

the memory holding all the arguments will look like:

ls\0ls\0filename\0\0

and the elements of argv are pointers into this block.

So these pointers will never be invalid; they always point into this block. The strings themselves may be meaningless, of course: if you don't provide a NULL argument, garbage strings get copied into argv.

huangapple
  • Posted on June 22, 2023 at 12:51:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76528692.html