“What is actually the Working directory in Git?”


Question

I am spending a lot of time trying to get a clear idea of the 'working directory' in Git.
Is it a specific folder or directory, or is it a version of a directory? Can anyone help me understand this concept?
What if I create a directory 'mydir' locally
and then run: git init?
Thanks.

Answer 1

Score: 4

In Git, the phrase working directory was once a synonym for working tree. It isn't any longer, because the phrase working directory may also be used by your OS (usually with a third word in front, as current working directory). Modern Git tries to use the phrase working tree as much as possible, though this is sometimes shortened to work-tree or worktree, as in git worktree add for instance.

When your OS (or its documentation) uses the phrase current working directory, it refers to the folder or directory<sup>1</sup> you are working in at the time. That may be within your working tree.

In Git, the phrase working tree refers to the OS-maintained directories-and-files that hold your copies of files. These are yours, to deal with as you wish: Git simply fills them in from committed files.
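The difference between the two notions can be seen from the shell. The following is a minimal sketch, assuming a reasonably recent Git and a POSIX shell; the temporary directory and the src/lib subdirectory are made up for illustration. It compares the OS's current working directory (pwd) with what Git reports as the top of the working tree.

```shell
# Throwaway repository; the path is resolved physically so that
# symlinked temp directories (as on macOS) compare cleanly.
repo=$(cd "$(mktemp -d)" && pwd -P)
cd "$repo"
git init -q

# Move into a subdirectory (hypothetical name) of the working tree.
mkdir -p src/lib
cd src/lib

cwd=$(pwd -P)                               # OS concept: where this shell is right now
toplevel=$(git rev-parse --show-toplevel)   # Git concept: root of the working tree
prefix=$(git rev-parse --show-prefix)       # path from the working-tree root to cwd

echo "current working directory: $cwd"
echo "working tree root:         $toplevel"
echo "prefix inside the tree:    $prefix"
```

The current working directory changes with every cd, while the working tree root stays the same for as long as you remain inside that working tree.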

> What if I create a directory 'mydir' locally then I [run]: git init

Let me rephrase this as the following series of shell commands:

$ mkdir mydir
$ cd mydir
$ git init

The mkdir command creates a new, empty directory within your current working directory. The cd command then enters this empty directory, so that what was ./mydir is now your current working directory. The git init command runs with this empty directory as its own current working directory.

Since the directory mydir was empty at the time you ran git init (so it held no existing repository), Git creates a hidden directory / folder named .git within this mydir directory. This hidden directory contains the repository proper. The repository consists of a number of files and directories that implement several databases:

  • One database is a simple key-value store that uses hash IDs to locate internal Git objects. This is the main (and usually largest) of the two primary databases that make up a Git repository.

  • The other database is another simple key-value store that uses names as keys to store hash IDs, which are then looked up in the first database. This is the secondary of the two primary databases. Its implementation in current versions of Git tends to be a bit dodgy: it relies too much on your operating system, and on macOS and Windows it tends to be a bit flawed. There is ongoing work in Git to replace it with a proper database implementation, which should eliminate this problem.

  • Apart from these two main databases, the repository contains many auxiliary files, including Git's index (aka staging area). The most important point here is that all of these entities live within the .git directory.

As there are no commits yet, both main databases are empty. At this point, so is Git's index.
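To check this yourself, you can ask Git what it has stored right after git init. This is a sketch that assumes a default Git installation and uses a temporary parent directory (an assumption, so that nothing of yours is touched).

```shell
# Work inside a throwaway parent directory.
parent=$(mktemp -d)
cd "$parent"
mkdir mydir
cd mydir
git init -q

ls -A                            # the only entry is the hidden .git directory

objects=$(git count-objects)     # loose objects in the object database
refs=$(git for-each-ref)         # names (branches, tags, ...) in the names database
staged=$(git ls-files --stage)   # entries in Git's index

echo "objects: $objects"
echo "refs:    '$refs'"
echo "index:   '$staged'"
```

Both the refs and index lines show an empty string, and the object count is zero, which matches the description of a freshly created repository.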

Your work-tree consists of all files and directories inside your current working directory except the .git directory, which holds Git's files. Since your work-tree is yours, and is maintained by your OS (not by Git), you can now create any files you like here.

At some point, you will want to have Git create a new commit. This will be the very first commit in the repository. To create this commit, you use git add to put the files you would like in this initial commit into Git's index / staging-area. The git add program works by copying your work-tree files into Git's index. So, with your OS's current working directory being the mydir directory, you can now just create some file(s):

$ echo "repository for project X" > README
$ git add README
$ git commit

The echo command here creates a new file named README in your working tree. The git add command takes the working tree file, compresses and Git-ifies it to make it ready to be stored in a new commit, and writes the stored file into Git's index.<sup>2</sup> The final command, git commit, gathers some metadata from you—the person making the commit—and writes out Git's index and this metadata, storing the results in the main database, to create a new commit.
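The effect of git add can be observed before committing. The sketch below assumes a throwaway repository and a placeholder identity (the name and email are invented so that git commit works without any global configuration). It shows that the index entry for README carries a blob hash ID, and that the blob already exists in the object database before the commit is made.

```shell
# Placeholder identity (hypothetical values) for the commit step.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

echo "repository for project X" > README
git add README

# Index entry format: <mode> <blob hash> <stage> <TAB> <file name>
entry=$(git ls-files --stage README)
blob=$(echo "$entry" | awk '{print $2}')

type=$(git cat-file -t "$blob")       # object type stored under that hash
content=$(git cat-file -p "$blob")    # the file content Git stored

echo "index entry: $entry"
echo "object type: $type"
echo "content:     $content"

git commit -q -m "Initial commit"
commit_type=$(git cat-file -t HEAD)
echo "HEAD is a:   $commit_type"
```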

Once you've made this new, initial commit—the very first commit in the repository—it becomes possible for branch names to exist. They cannot exist until this point because each branch name must hold a valid, existing hash ID, and hash IDs for future commits are not predictable.<sup>3</sup> Now that there is one commit, that's the only hash ID that any branch name can hold.<sup>4</sup>
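A sketch of how this looks in practice, again assuming a throwaway repository and a placeholder identity (both invented for the example): before the first commit, HEAD names a branch that does not exist yet, and the list of branch names is empty. After the commit, the branch name exists and holds the new commit's hash ID.

```shell
# Placeholder identity (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

branch=$(git symbolic-ref --short HEAD)   # the name HEAD is attached to (e.g. master or main)
before=$(git for-each-ref refs/heads)     # existing branch names: none yet

# HEAD cannot be resolved to a hash ID while there are no commits.
if git rev-parse --verify -q HEAD >/dev/null; then
    resolvable=yes
else
    resolvable=no
fi
echo "before commit: branches='$before' HEAD resolvable=$resolvable"

echo "repository for project X" > README
git add README
git commit -q -m "Initial commit"

after=$(git for-each-ref --format='%(refname:short)' refs/heads)
echo "after commit:  branches='$after' -> $(git rev-parse HEAD)"
```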

Over time, you will add more and more commits to the repository. (In general, it's pretty rare to ever drop a commit, except for, e.g., the way git rebase replaces commits with new-and-improved ones. Dropping a commit is not impossible, just difficult.) Each new commit therefore adds to the repository.

The repository itself, then, consists of:

  • the databases that hold commits and other objects, and the names that find them;
  • Git's index, used to hold your proposed next commit; and
  • other maintenance items that you and/or Git may find useful.

The commit objects, and in fact all objects in the big database, are strictly read-only. Nothing and no one can ever change them. They're in a form that is directly useful only to Git itself, though.

Cloning the repository consists of copying the two databases, although the names database is only partly copied, and gets changed during the cloning process.
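One way to see what happens to the names during cloning is to compare the reference lists of a repository and its clone. The sketch below assumes a local clone between two throwaway directories (origin-repo and copy are made-up names) and a placeholder identity.

```shell
# Placeholder identity (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

base=$(mktemp -d)
cd "$base"
git init -q origin-repo
cd origin-repo
echo "repository for project X" > README
git add README
git commit -q -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)

cd "$base"
git clone -q origin-repo copy

echo "names in the original:"
git -C origin-repo for-each-ref --format='  %(refname) %(objectname)'
echo "names in the clone:"
git -C copy for-each-ref --format='  %(refname) %(objectname)'

orig_head=$(git -C origin-repo rev-parse HEAD)
copy_head=$(git -C copy rev-parse HEAD)
remote_ref=$(git -C copy rev-parse "refs/remotes/origin/$branch")
```

The commit hash IDs are identical in both repositories, while the clone's names database additionally holds remote-tracking names under refs/remotes/origin/.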

Meanwhile, your working tree is where you have Git extract commits, turning stuff that's only directly useful to Git—and that is read-only—into stuff you can work with and modify. These are your files. This is how you do your work, in your working tree. You can use the results to update Git's index, and then use Git's index to create a new commit, that adds on to the repository without changing anything that already exists in the repository.
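The edit, add, commit cycle described above can be sketched as follows, assuming a throwaway repository and a placeholder identity (both invented for the example). The first commit stays reachable and unchanged after the second one is added.

```shell
# Placeholder identity (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "repository for project X" > README
git add README
git commit -q -m "Initial commit"
first=$(git rev-parse HEAD)

echo "more notes" >> README          # modify your working-tree copy
git add README                       # update Git's index from the working tree
git commit -q -m "Second commit"     # create a new commit from the index
second=$(git rev-parse HEAD)

parent=$(git rev-parse HEAD~1)                   # the new commit points back at the old one
old_content=$(git cat-file -p "$first:README")   # the old snapshot is untouched

echo "first:  $first"
echo "second: $second (parent $parent)"
echo "README in first commit: $old_content"
```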


<sup>1</sup>At the OS level, the terms folder and directory are synonyms. Git itself does not store folders or directories: it just stores files whose names may contain embedded slashes, such as path/to/file.ext. That's all one single file name. Your OS may force you to first make a folder named path, then in that folder, make a folder named to, and only then use the combined path and to folders to make a file named file.ext within that path. The current working directory can be changed to path, so that you would use the name to/file.ext, instead of path/to/file.ext, or even to path/to so that you would use the name file.ext. In all cases, Git will internally work with a stored file named path/to/file.ext. So your current working directory is an OS concept, referring to how you move around within the folders that your OS maintains.
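As a small illustration of this footnote (a sketch in a throwaway repository; the directory name empty-dir is made up), Git records the full name path/to/file.ext whatever the current working directory is, and an empty folder leaves no trace in the index.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir -p path/to empty-dir
echo "data" > path/to/file.ext
git add path/to/file.ext

cd path/to
relative=$(git ls-files)              # name as seen from the current working directory
full=$(git ls-files --full-name)      # name as Git stores it

cd "$repo"
status=$(git status --porcelain)      # empty-dir does not appear: Git tracks files only

echo "relative to cwd: $relative"
echo "stored name:     $full"
echo "status:          $status"
```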

<sup>2</sup>Technically, the index doesn't actually hold the files directly. It holds instead a Git blob object hash ID for the file, which provides the key to the key-value object database so that Git can look up the file's content, plus the name of the file—complete with (forward) slashes—and some additional information. The blob object holds a compressed and de-duplicated copy of the file's content.

This de-duplication, and the fact that it is git add that readies the file for committing, means that git commit will go quite fast, as it need not prepare anything for committing: it just saves, permanently, the blob objects already stored in the index.

<sup>3</sup>The hash ID of a commit is a cryptographic checksum of the commit's complete content. The content includes not only the saved source files (as an internal Git tree object), but also the exact date-and-time-stamp. Since we don't even know what you'll commit in the future, much less exactly when you will commit it, we cannot compute what the future hash ID will be. You may know what you will commit, which gets you closer; but unless you know exactly when you will commit it, you won't know the hash ID either.
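The dependence on the timestamp can be demonstrated with the low-level git commit-tree command. This is a sketch under stated assumptions: the identity and the two timestamps are arbitrary values chosen for the example, given in Git's internal "seconds-since-epoch plus offset" date format.

```shell
# Placeholder identity (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "repository for project X" > README
git add README
tree=$(git write-tree)    # snapshot of the index as a tree object

# Same tree, message, identity, and timestamp: same commit hash ID.
c1=$(GIT_AUTHOR_DATE="1603616400 +0000" GIT_COMMITTER_DATE="1603616400 +0000" \
     git commit-tree -m "Initial commit" "$tree")
c2=$(GIT_AUTHOR_DATE="1603616400 +0000" GIT_COMMITTER_DATE="1603616400 +0000" \
     git commit-tree -m "Initial commit" "$tree")

# One second later: a different commit hash ID.
c3=$(GIT_AUTHOR_DATE="1603616401 +0000" GIT_COMMITTER_DATE="1603616401 +0000" \
     git commit-tree -m "Initial commit" "$tree")

echo "c1: $c1"
echo "c2: $c2"
echo "c3: $c3"
```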

<sup>4</sup>Branch names in particular are constrained: they may only hold a commit hash ID. Tag names can hold the hash ID of any of Git's four internal object types. (Usually, though, a tag name either holds a commit hash ID, or the hash ID of a newly-created annotated tag object, which in turn holds a commit hash ID.) Other types of names may have their own constraints.
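The constraint can be tried out directly. In this sketch (throwaway repository, placeholder identity, and invented tag and branch names), a lightweight tag holds a commit hash ID, an annotated tag holds the hash ID of a tag object, a tag can even hold a blob hash ID, and Git refuses to create a branch name for a blob.

```shell
# Placeholder identity (hypothetical values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "repository for project X" > README
git add README
git commit -q -m "Initial commit"
blob=$(git rev-parse HEAD:README)        # hash ID of the README blob

git tag light                            # lightweight tag: holds the commit hash ID
git tag -a annotated -m "Annotated tag"  # creates a tag object and points at it
git tag blob-tag "$blob"                 # tag names may hold any object type

light_type=$(git cat-file -t "$(git rev-parse light)")
ann_type=$(git cat-file -t "$(git rev-parse annotated)")
blob_type=$(git cat-file -t "$(git rev-parse blob-tag)")

# A branch name must hold a commit hash ID, so this is rejected.
if git branch bad-branch "$blob" 2>/dev/null; then
    branch_ok=yes
else
    branch_ok=no
fi

echo "light: $light_type, annotated: $ann_type, blob-tag: $blob_type"
echo "branch on a blob allowed: $branch_ok"
```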

huangapple
  • This article was published on October 25, 2020 at 09:04:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64519435.html