Is it possible to use JGit's in-memory repository to add one small file to a big Git repository without checking it out locally?


Question


I have a large (several GB) Git repository. I want some application to create small files within that repository and commit the changes. This should happen without checking out those gigabytes to the disk.

I found a JGit code sample in which a remote repository is cloned into an in-memory repository.

Can I use JGit (something like what is shown below) to add a file to a remote repository without checking it out locally (i.e. without transferring gigabytes of data to the machine where that code will run)?

DfsRepositoryDescription repoDesc = new DfsRepositoryDescription();
InMemoryRepository repo = new InMemoryRepository(repoDesc);
Git git = new Git(repo);
git.remoteAdd();
git.commit();

Update 1: The size of the entire directory (tracked files plus .git) is 1.2G. The size of .git alone is 573M.

% du -hs .
1.2G	.
% du -hs .git
573M	.git

Answer 1

Score: 1


Note that a checkout in RAM will download exactly the same data from the remote. Once the clone is complete, it will simply use 1.2G of RAM rather than 1.2G of disk.

You may also want to look at how to create and add a "file" using the JGit API, to see whether it is a convenient way to add the data you want to your repo.
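
As a rough illustration of adding a "file" through the JGit API without a working tree, here is a minimal sketch. It assumes the objects of the branch tip are already present in the repository (for a real remote they would first have to be fetched, which is the download discussed above), and it leaves the fetch and push steps out. The helper name commitFile, the repository description "demo", the branch name, the file paths and the author identity are all made up for the example. InMemoryRepository sits in a JGit "internal" package, so it may change between releases. The sketch also assumes the path does not exist in the tree yet.

```java
import java.nio.charset.StandardCharsets;

import org.eclipse.jgit.dircache.DirCache;
import org.eclipse.jgit.dircache.DirCacheBuilder;
import org.eclipse.jgit.dircache.DirCacheEntry;
import org.eclipse.jgit.internal.storage.dfs.DfsRepositoryDescription;
import org.eclipse.jgit.internal.storage.dfs.InMemoryRepository;
import org.eclipse.jgit.lib.CommitBuilder;
import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.FileMode;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectInserter;
import org.eclipse.jgit.lib.ObjectReader;
import org.eclipse.jgit.lib.PersonIdent;
import org.eclipse.jgit.lib.RefUpdate;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.treewalk.TreeWalk;

class InMemoryCommit {

    // Hypothetical helper: commits one new file on top of `branch`
    // using only the object database (no working tree, no on-disk index).
    static ObjectId commitFile(Repository repo, String branch, String path, byte[] content)
            throws Exception {
        ObjectId parent = repo.resolve(branch); // null if the branch does not exist yet
        try (ObjectInserter inserter = repo.newObjectInserter();
             ObjectReader reader = repo.newObjectReader();
             RevWalk walk = new RevWalk(reader)) {
            // Store the file content as a blob.
            ObjectId blob = inserter.insert(Constants.OBJ_BLOB, content);

            // Build an in-core index: the parent's tree plus the new entry.
            DirCache index = DirCache.newInCore();
            DirCacheBuilder builder = index.builder();
            if (parent != null) {
                builder.addTree(new byte[0], 0, reader, walk.parseCommit(parent).getTree());
            }
            DirCacheEntry entry = new DirCacheEntry(path);
            entry.setFileMode(FileMode.REGULAR_FILE);
            entry.setObjectId(blob);
            builder.add(entry);
            builder.finish();

            // Write the tree and a commit that points at it.
            CommitBuilder commit = new CommitBuilder();
            commit.setTreeId(index.writeTree(inserter));
            if (parent != null) {
                commit.setParentId(parent);
            }
            PersonIdent who = new PersonIdent("Example Bot", "bot@example.invalid");
            commit.setAuthor(who);
            commit.setCommitter(who);
            commit.setMessage("Add " + path);
            ObjectId commitId = inserter.insert(commit);
            inserter.flush();

            // Move the branch to the new commit.
            RefUpdate update = repo.updateRef(branch);
            update.setNewObjectId(commitId);
            update.update();
            return commitId;
        }
    }

    public static void main(String[] args) throws Exception {
        try (InMemoryRepository repo =
                 new InMemoryRepository(new DfsRepositoryDescription("demo"))) {
            String branch = "refs/heads/master";
            // Stand-in for history that would normally come from a fetch.
            commitFile(repo, branch, "README.txt", "existing\n".getBytes(StandardCharsets.UTF_8));
            ObjectId head = commitFile(repo, branch, "notes/small.txt",
                    "hello\n".getBytes(StandardCharsets.UTF_8));
            try (RevWalk walk = new RevWalk(repo)) {
                RevCommit tip = walk.parseCommit(head);
                try (TreeWalk found = TreeWalk.forPath(repo, "notes/small.txt", tip.getTree())) {
                    System.out.println(found != null ? "file committed" : "file missing");
                }
            }
        }
    }
}
```

This only shows that the commit itself needs no checkout. Building the new tree still reads the parent commit's tree objects, so those objects must have been transferred to the machine beforehand.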


If you want to limit the amount of data downloaded from your Git server, you can create a shallow clone.
See this SO answer for an example:

git clone --depth 1 <repo_url> -b <branch_name>

This will download only the Git objects for the latest commit (a compressed version of the 500M in your case) and then run an actual checkout of that commit (the 500M).

You can run this either on disk, or in RAM, depending on your needs.

In your situation, this would still take roughly 600-700M of space, in RAM or on disk, including the checked-out files,
but it would download only about 100M from the server for the compressed version of the head commit (that figure is a guess; check the actual size by running the shallow clone command from your workstation).

huangapple
  • Posted on 2020-10-10 19:21:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64292837.html