Is there a preferred method to reorganise C header files in a project?

Question

I have been given a medium-sized but complex C project (about 200,000 lines in total) which contains around 100 .h files and nearly as many .c files.

Many of the .h files correspond to equivalent .c files, but there is one .h file in particular, let's call it project_common.h, that #includes many of the other .h files as well as containing about 2000 lines of mostly struct and enum definitions etc. Many of the structs are heavily nested so that their order very much matters.

The structure of the file is roughly:

    #include guard
    #include <assert.h>
    #include <stdint.h>
    /* etc */
    #include "project_aaa.h"
    #include "project_bbb.h"
    /* Then another 30 or so lines like this. These are in alphabetical
       order and a certain amount of effort has been made that they can be
       included in any order. */
    /* Then about 2000 lines of struct, enums, function definitions etc. */

I have been tasked with moving most or all of the 2000 lines and either creating new .h files for them or pasting them into one of the existing .h files. One rule is that each header must be able to be included independently of all others. In other words, each header must not need other headers to be included before it.

After some effort, I've quickly realised that, even as a senior software engineer of 25+ years' experience, this is not an easy task at all because of the very complex and hierarchical nature of the struct definitions.

My problems in particular are:

  1. It's bad practice to include all those headers in project_common.h, and defeats the point of splitting it all up.
  2. It's really REALLY hard to split all this up in such a way that the resulting headers can be included from a given C file in any combination.

So what I am asking is, are there any tools out there that can help with refactoring all the .h files into a more optimal configuration, and/or is there a recognised method that's better than trial and error?

So far I've tried moving struct definitions around, but progress is very slow and tedious, and although I have nearly halved the size of project_common.h, the new headers I have created only work if they are included in the right order.

Answer 1

Score: 2

> I have been tasked with moving most or all of the 2000 lines and either creating new .h files for them or pasting them into one of the existing .h files.

I don't know of any refactoring tools for this purpose, but as a salient matter of code style, I hold that every header and regular source file, X, should itself #include every other header that directly declares any function or variable that X defines or directly references, and every header defining a macro that X directly uses, and only those. That applies to including system and third-party headers just as much as to your project's internal headers.

You may have already come a long way in that direction in support of the goal of making it possible to #include any header individually. However, it seems clear that you cannot be fully adhering to that principle when you say

> the new headers I have created only work if they are included in the right order.

If each header #includes all the other headers providing declarations that it needs itself, and also provides effective guards against multiple inclusion, then the only way to have #include-order problems is if you have a dependency cycle. If you did not already have a cycle when you started then there is no particular reason why your refactoring should produce one. If your refactoring does produce one then that implies that some or all of the contents of the headers in that cycle should be merged into the same header.
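To make that concrete, here is a minimal sketch of two headers that each pull in what they need and guard against multiple inclusion. The file names project_point.h and project_rect.h, the types struct point and struct rect, and the function rect_width are invented for illustration; none of them come from the question. Both "headers" are shown in one listing so that it compiles as a single file, which is why the project-internal #include appears only as a comment.

```c
/* ---- project_point.h (hypothetical) ---- */
#ifndef PROJECT_POINT_H
#define PROJECT_POINT_H

#include <stdint.h>   /* this header uses int32_t, so it includes <stdint.h> itself */

struct point {
    int32_t x;
    int32_t y;
};

#endif /* PROJECT_POINT_H */

/* ---- project_rect.h (hypothetical) ---- */
#ifndef PROJECT_RECT_H
#define PROJECT_RECT_H

#include <assert.h>   /* assert() is used below */
#include <stddef.h>   /* NULL */
#include <stdint.h>   /* int32_t */
/* In a real tree this line would be: #include "project_point.h"
   struct rect embeds struct point, so it needs the full definition. */

struct rect {
    struct point top_left;
    struct point bottom_right;
};

static inline int32_t rect_width(const struct rect *r)
{
    assert(r != NULL);
    return r->bottom_right.x - r->top_left.x;
}

#endif /* PROJECT_RECT_H */
```

With this layout a .c file can include project_rect.h alone, or both headers in either order. A cheap way to check any header is to compile a .c file that contains nothing but a single #include of that header.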

Also, and this may be obvious, in choosing where to move the existing declarations, I would recommend focusing on semantic relationships rather than on simple code dependency relationships. Things that are usually used together are a likely choice for cohabitation in the same header, but not so much things that just happen to be in some of the same dependency chains.

Now, I suppose it's possible that when you say ...

> One rule is that each header must be able to be included independently of all others.

... you mean not just that one can pick and choose headers to include without concern about order and dependencies, but also that no header is permitted to include any of the others. If so, then that's an artificial and hard-to-sustain provision. It implies, for instance, that wherever you have a structure or union type that embeds (not just points to) an object of one of the project's other internal types, those two types must be declared in the same header. If you happen to be saddled with something like that, then your current task provides a good context for pushing back against it.
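The difference between embedding and pointing can be sketched as follows. The types struct engine, struct car and struct garage and the function car_cylinders are hypothetical and only serve to show which uses need a complete type. A struct that merely holds a pointer can get by with a forward declaration, whereas embedding by value needs the full definition first.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: an incomplete type, enough for declaring pointers. */
struct engine;

/* Holds only a pointer, so the incomplete type is sufficient here. */
struct car {
    struct engine *engine;
    int wheels;
};

/* Embedding needs the complete type, so the definition must come first
   (in a multi-header layout: via an #include of the defining header). */
struct engine {
    int cylinders;
};

struct garage {
    struct engine spare;   /* embedded by value */
    struct car *parked;    /* pointer only */
};

/* Dereferencing the pointer also requires the complete type. */
static int car_cylinders(const struct car *c)
{
    assert(c != NULL && c->engine != NULL);
    return c->engine->cylinders;
}
```

Where a header only needs pointers to another header's types, a forward declaration like this may be enough, and it is also one common way to break a dependency cycle without merging headers.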

Finally, as a practical matter, I would start at the top of the file and work downward from there. This way you will work first with the declarations that have the fewest dependencies. You may even find it useful to think of this and work on it as a series of many small refactorings instead of one huge one.
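If working from the top of the file is not enough and you want an explicit order, the same idea can be written down as a simple topological sort over "needs the full definition of" relationships. This is only a sketch under assumptions: dependency_order and MAX_DECLS are made-up names, and the dependency matrix would have to be filled in by hand or by some other means.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DECLS 16

/* Hypothetical helper. deps[i][j] != 0 means declaration i needs the full
   definition of declaration j. Writes an order in which every declaration
   comes after the ones it needs, and returns the number of entries written.
   A return value smaller than n means the remaining declarations are part
   of (or depend on) a dependency cycle. */
size_t dependency_order(size_t n, unsigned char deps[][MAX_DECLS],
                        size_t order[])
{
    size_t pending[MAX_DECLS];            /* unmet dependencies per declaration */
    unsigned char done[MAX_DECLS] = {0};  /* already placed in order[] */
    size_t count = 0;

    assert(n <= MAX_DECLS);

    for (size_t i = 0; i < n; i++) {
        pending[i] = 0;
        for (size_t j = 0; j < n; j++)
            if (deps[i][j])
                pending[i]++;
    }

    for (;;) {
        size_t next = n;
        /* Pick the first declaration whose dependencies are all placed. */
        for (size_t i = 0; i < n; i++) {
            if (!done[i] && pending[i] == 0) {
                next = i;
                break;
            }
        }
        if (next == n)
            break;                        /* nothing ready: finished or cyclic */

        done[next] = 1;
        order[count++] = next;

        /* Everything that needed 'next' now has one fewer unmet dependency. */
        for (size_t i = 0; i < n; i++)
            if (deps[i][next])
                pending[i]--;
    }
    return count;
}
```

Declarations early in the resulting order are the natural candidates to move first, and a short count points at a cycle whose members probably belong in the same header.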

huangapple
  • Posted on 28 July 2023, 00:31:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76781765.html