How to copy multiple files atomically from src to dest in Java?

Question


As part of one requirement, I need to copy multiple files from one location to another network location.

Let's assume that I have the following files in the /src location:
a.pdf, b.pdf, a.doc, b.doc, a.txt and b.txt

I need to copy a.pdf, a.doc and a.txt into the /dest location atomically, all at once.

Currently I am using the java.nio.file.Files class, with code as follows:

Path srcFile1 = Paths.get("/src/a.pdf");
Path destFile1 = Paths.get("/dest/a.pdf");

Path srcFile2 = Paths.get("/src/a.doc");
Path destFile2 = Paths.get("/dest/a.doc");

Path srcFile3 = Paths.get("/src/a.txt");
Path destFile3 = Paths.get("/dest/a.txt");

Files.copy(srcFile1, destFile1);
Files.copy(srcFile2, destFile2);
Files.copy(srcFile3, destFile3);

But with this approach the files are copied one after another.
As an alternative, to make the whole process atomic,
I am thinking of zipping all the files, moving the archive to /dest, and unzipping it at the destination.

Is this approach correct for making the whole copy process atomic? Has anyone dealt with something similar and solved it?

Answer 1

Score: 2


> Is this approach correct for making the whole copy process atomic? Has anyone dealt with something similar and solved it?

You can copy the files to a new temporary directory and then rename the directory.

Before renaming your temporary directory, you need to delete the destination directory.

If the destination directory already contains other files that you don't want to lose, you can instead move all files from the temporary directory into the destination directory.

This is not completely atomic, however.

Removing /dest first:

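import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReplaceDest {
    public static void main(String[] args) throws IOException {
        // Staging directory; it must be on the same partition as /dest
        // so that the final rename does not have to copy any data.
        String tmpPath = "/tmp/in/same/partition/as/dest";
        File tmp = new File(tmpPath);
        tmp.mkdirs();

        Path srcFile1 = Paths.get("/src/a.pdf");
        Path destFile1 = Paths.get(tmpPath + "/a.pdf");
        Path srcFile2 = Paths.get("/src/a.doc");
        Path destFile2 = Paths.get(tmpPath + "/a.doc");
        Path srcFile3 = Paths.get("/src/a.txt");
        Path destFile3 = Paths.get(tmpPath + "/a.txt");

        // Copy into the staging directory; /dest is untouched so far.
        Files.copy(srcFile1, destFile1);
        Files.copy(srcFile2, destFile2);
        Files.copy(srcFile3, destFile3);

        // Not atomic: /dest does not exist between these two steps.
        File dest = new File("/dest");
        if (dest.exists()) {
            delete(dest);
        }
        if (!tmp.renameTo(dest)) {
            throw new IOException("Failed to rename " + tmp + " to " + dest);
        }
    }

    // Recursively deletes a file or directory.
    static void delete(File f) throws IOException {
        if (f.isDirectory()) {
            for (File c : f.listFiles())
                delete(c);
        }
        if (!f.delete())
            throw new IOException("Failed to delete file: " + f);
    }
}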

Overwriting only the affected files:

// Staging directory; put it on the same partition as /dest so that each
// Files.move below is a rename rather than a copy.
String tmpPath = "/tmp/in/same/partition/as/dest";
File tmp = new File(tmpPath);
tmp.mkdirs();
Path srcFile1 = Paths.get("/src/a.pdf");
Path destFile1 = Paths.get("/dest/a.pdf");
Path tmp1 = Paths.get(tmpPath + "/a.pdf");

Path srcFile2 = Paths.get("/src/a.doc");
Path destFile2 = Paths.get("/dest/a.doc");
Path tmp2 = Paths.get(tmpPath + "/a.doc");

Path srcFile3 = Paths.get("/src/a.txt");
Path destFile3 = Paths.get("/dest/a.txt");
Path tmp3 = Paths.get(tmpPath + "/a.txt");

// Slow part: copy into the staging directory only.
Files.copy(srcFile1, tmp1);
Files.copy(srcFile2, tmp2);
Files.copy(srcFile3, tmp3);

// Start of non-atomic section (it can be run again if necessary)

Files.deleteIfExists(destFile1);
Files.deleteIfExists(destFile2);
Files.deleteIfExists(destFile3);

Files.move(tmp1, destFile1);
Files.move(tmp2, destFile2);
Files.move(tmp3, destFile3);
// End of non-atomic section

Even though the second method contains a non-atomic section, the slow copy itself happens in a temporary directory, so the files in /dest are not touched until the copy has finished.

If the process aborts while moving the files, the move can easily be completed afterwards.

See https://stackoverflow.com/a/4645271/10871900 as a reference for moving files and https://stackoverflow.com/a/779529/10871900 for recursively deleting directories.
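As a rough sketch of the second method, the delete-then-move pair can be collapsed into one rename per file by passing StandardCopyOption.ATOMIC_MOVE to Files.move. The class and method names below (StagedCopy, copyViaStaging, publish) are made up for illustration, and the sketch assumes the staging directory sits on the same filesystem as the destination. Whether an atomic move replaces an existing target is implementation specific according to the Files.move documentation; on typical POSIX systems it does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper, not part of the original answer.
final class StagedCopy {

    /**
     * Copies each named file from srcDir into stagingDir, then moves the
     * staged copies into destDir one rename at a time.
     */
    static void copyViaStaging(Path srcDir, Path stagingDir, Path destDir,
                               List<String> names) throws IOException {
        Files.createDirectories(stagingDir);
        Files.createDirectories(destDir);

        // Slow part: nothing in destDir is touched yet.
        for (String name : names) {
            Files.copy(srcDir.resolve(name), stagingDir.resolve(name),
                       StandardCopyOption.REPLACE_EXISTING);
        }
        publish(stagingDir, destDir, names);
    }

    /** Can be called again after an abort: files already moved are skipped. */
    static void publish(Path stagingDir, Path destDir, List<String> names)
            throws IOException {
        for (String name : names) {
            Path staged = stagingDir.resolve(name);
            if (Files.exists(staged)) {
                // Throws AtomicMoveNotSupportedException if the move cannot
                // be done atomically, e.g. across filesystems.
                Files.move(staged, destDir.resolve(name),
                           StandardCopyOption.ATOMIC_MOVE);
            }
        }
    }
}
```

A call such as StagedCopy.copyViaStaging(Paths.get("/src"), Paths.get("/dest/.staging"), Paths.get("/dest"), Arrays.asList("a.pdf", "a.doc", "a.txt")) would mirror the example above. Each file switches atomically, but the set of three still does not: a reader can observe a new a.pdf next to an old a.doc.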

Answer 2

Score: 2


First, there are several ways to copy a file or a directory. Baeldung gives a good overview of the different possibilities. You can also use FileCopyUtils from Spring. Unfortunately, none of these methods is atomic.

I found an older post and adapted it a little. You can try using the low-level transaction management support: you wrap the method in a transaction and define what should happen on rollback. There is also a nice article from Baeldung on this.

@Autowired
private PlatformTransactionManager transactionManager;

@Transactional(rollbackOn = IOException.class)
public void copy(List<File> files) throws IOException {
    TransactionDefinition transactionDefinition = new DefaultTransactionDefinition();
    TransactionStatus transactionStatus = transactionManager.getTransaction(transactionDefinition);

    // Runs after the transaction ends; used here to undo the file copies.
    TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronization() {

        @Override
        public void afterCompletion(int status) {
            if (status == STATUS_ROLLED_BACK) {
                // try to delete created files
            }
        }
    });

    try {
        // copy files
        transactionManager.commit(transactionStatus);
    } catch (IOException e) {
        // Roll back only when the copy failed, not after a successful commit.
        transactionManager.rollback(transactionStatus);
        throw e;
    }
}

Or you can use a simple try-catch block. If an exception is thrown, you can delete the files created so far.
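The try-catch variant could look roughly like the sketch below. CopyWithCleanup and copyAll are illustrative names, not from the original answer. The sketch only undoes copies that completed, and it gives best-effort cleanup rather than atomicity: other processes can still see the files while the copy is in progress.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating "delete what was created on failure".
final class CopyWithCleanup {

    /** Copies the named files; on failure, deletes the copies made so far. */
    static void copyAll(Path srcDir, Path destDir, List<String> names)
            throws IOException {
        List<Path> created = new ArrayList<>();
        try {
            for (String name : names) {
                Path target = destDir.resolve(name);
                // Without REPLACE_EXISTING this fails if the target already
                // exists, so every path in 'created' was made by this call.
                Files.copy(srcDir.resolve(name), target);
                created.add(target);
            }
        } catch (IOException e) {
            for (Path p : created) {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException cleanupFailure) {
                    // Keep the original error, but do not hide cleanup problems.
                    e.addSuppressed(cleanupFailure);
                }
            }
            throw e;
        }
    }
}
```

Because existing files in the destination are never overwritten, the rollback cannot destroy data that was there before the call.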

Answer 3

Score: 0


Your question does not say what the atomicity is meant to achieve. Even unzipping is never atomic: the VM might crash with an OutOfMemoryError right in between inflating the blocks of the second file. Then one file is complete, the second is not, and the third is missing entirely.

The only thing I can think of is a two-phase commit, like all the suggestions that use a temporary destination which suddenly becomes the real target. This way you can be sure that the second operation either never occurs or creates the final state.

Another approach would be to write a sort of cheap checksum file into the target afterwards. This would make it easy for an external process to listen for the creation of such files and verify their content against the files found.

The latter would be the same as offering a container/ZIP/archive right away instead of piling files into a directory. Most archive formats have or support integrity checks.

(Operating systems and file systems also differ in behaviour if directories or folders disappear while being written. Some accept it and write all data to a recoverable buffer. Others still accept writes but don't change anything. Others fail immediately upon first write since the target block on the device is unknown.)
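The checksum-file idea could be sketched as follows. The Manifest class, its method names, and the "hash, two spaces, file name" line format are assumptions chosen for illustration; the answer does not prescribe a format. The sketch reads each file fully into memory, which is only reasonable for small files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a manifest written last signals "the set is complete".
final class Manifest {

    static String sha256(Path file) throws IOException {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    /** Call after all files are copied: writes one "<hash>  <name>" line per file. */
    static void write(Path dir, List<String> names, Path manifest) throws IOException {
        List<String> lines = new ArrayList<>();
        for (String name : names) {
            lines.add(sha256(dir.resolve(name)) + "  " + name);
        }
        Files.write(manifest, lines, StandardCharsets.UTF_8);
    }

    /** Returns true only if every listed file exists and matches its hash. */
    static boolean verify(Path dir, Path manifest) throws IOException {
        for (String line : Files.readAllLines(manifest, StandardCharsets.UTF_8)) {
            String[] parts = line.split("  ", 2);
            Path file = dir.resolve(parts[1]);
            if (!Files.isRegularFile(file) || !sha256(file).equals(parts[0])) {
                return false;
            }
        }
        return true;
    }
}
```

A consumer would wait for the manifest to appear and use the files only when verify returns true. Writing the manifest under a temporary name and renaming it into place would keep the consumer from reading a half-written manifest.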

Answer 4

Score: 0


FOR ATOMIC WRITE:

Standard filesystems have no concept of a multi-file transaction, so the change has to be reduced to a single action, which can then be atomic.

Therefore, to write several files in an atomic way, you need to create a folder with, let's say, a timestamp in its name, and copy the files into this folder.

Then, you can either rename it to the final destination or create a symbolic link to it.

You can use anything similar to this, like file-based volumes on Linux, etc.

Remember that deleting the existing symbolic link and creating a new one will never be atomic, so you would need to handle the situation in your code and switch to the renamed/linked folder once it's available instead of removing/creating a link. However, under normal circumstances, removing and creating a new link is a really fast operation.
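One way to sketch the timestamped-folder idea in Java is shown below. It uses a variation that the answer does not describe: instead of deleting the old link and creating a new one, it creates the new link under a temporary name and renames it over the old one. This relies on rename replacing an existing link in one step, which holds on POSIX systems but is implementation specific for Files.move with ATOMIC_MOVE. The names VersionedPublish, publish, "current" and "current.tmp" are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper: publish a set of files by switching a symbolic link.
final class VersionedPublish {

    /**
     * Copies the named files into a new timestamped directory under baseDir,
     * then points the symbolic link baseDir/current at that directory.
     * Returns the new version directory.
     */
    static Path publish(Path srcDir, Path baseDir, List<String> names)
            throws IOException {
        Files.createDirectories(baseDir);
        // createTempDirectory appends a unique suffix, so two calls within
        // the same millisecond do not collide.
        Path version = Files.createTempDirectory(
                baseDir, "v-" + System.currentTimeMillis() + "-");
        for (String name : names) {
            Files.copy(srcDir.resolve(name), version.resolve(name));
        }

        Path link = baseDir.resolve("current");
        Path tmpLink = baseDir.resolve("current.tmp");
        Files.deleteIfExists(tmpLink); // leftover from an aborted run
        // Relative target: resolved against the directory containing the link.
        Files.createSymbolicLink(tmpLink, version.getFileName());
        // Renames the link itself, not the directory it points to.
        Files.move(tmpLink, link, StandardCopyOption.ATOMIC_MOVE);
        return version;
    }
}
```

Readers always open files through baseDir/current, so they see either the complete old set or the complete new set. Old version directories are left in place and would need to be cleaned up separately.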

FOR ATOMIC READ:

Well, the problem is not in the code, but at the operating system/filesystem level.

Some time ago, I got into a very similar situation. There was a database engine running and changing several files "at once". I needed to copy the current state, but the second file had already changed before the first one was copied.

There are two different options:

  1. Use a filesystem with support for snapshots. At some moment, you create a snapshot and then copy files from it.
  2. Lock the filesystem (on Linux) using fsfreeze --freeze, and unlock it later with fsfreeze --unfreeze. While the filesystem is frozen, you can read the files as usual, but no process can change them.

Neither of these options worked for me as I couldn't change the filesystem type, and locking the filesystem wasn't possible (it was the root filesystem).

So I created an empty file, mounted it as a loop filesystem, and formatted it. From that moment on, I could fsfreeze just my virtual volume without touching the root filesystem.

My script first called fsfreeze --freeze /my/volume, then performed the copy action, and then called fsfreeze --unfreeze /my/volume. For the duration of the copy action, the files couldn't be changed, and so the copied files were all from exactly the same moment in time. For my purpose, it was like an atomic operation.

Btw, be sure not to fsfreeze your root filesystem :-). I did, and a restart was the only way out.

DATABASE-LIKE APPROACH:

Even databases cannot rely on atomic operations, and so they first write the change to WAL (write-ahead log) and flush it to the storage. Once it's flushed, they can apply the change to the data file.

If there is any problem or crash, the database engine first loads the data file, then checks whether the WAL contains any unapplied transactions and applies them if so.

This is also called journaling, and it's used by some filesystems (ext3, ext4).
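Applied to the file-copy problem, the write-ahead idea could be sketched like this: record the intended moves in a journal file, flush it, and only then move the staged files, so that an interrupted run can be replayed. This is a toy illustration, not how any particular database does it. The class MoveJournal, the tab-separated line format, and the trailing COMMIT marker are assumptions, and paths containing tabs or newlines are not supported.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a tiny intent log for moving staged files into place.
final class MoveJournal {

    /** Records the planned moves, flushes the record to disk, then applies them. */
    static void moveAll(Path stagingDir, Path destDir, List<String> names,
                        Path journal) throws IOException {
        List<String> lines = new ArrayList<>();
        for (String name : names) {
            lines.add(stagingDir.resolve(name) + "\t" + destDir.resolve(name));
        }
        lines.add("COMMIT"); // marks the journal as completely written
        Files.write(journal, lines, StandardCharsets.UTF_8);
        try (FileChannel ch = FileChannel.open(journal, StandardOpenOption.WRITE)) {
            ch.force(true); // journal must be on disk before any file is moved
        }
        recover(journal);
    }

    /** Replays a journal left by an interrupted run; does nothing if none exists. */
    static void recover(Path journal) throws IOException {
        if (!Files.exists(journal)) {
            return;
        }
        List<String> lines = Files.readAllLines(journal, StandardCharsets.UTF_8);
        boolean complete = !lines.isEmpty()
                && "COMMIT".equals(lines.get(lines.size() - 1));
        if (complete) {
            for (String line : lines.subList(0, lines.size() - 1)) {
                String[] parts = line.split("\t", 2);
                Path staged = Paths.get(parts[0]);
                if (Files.exists(staged)) { // skip moves that already happened
                    Files.move(staged, Paths.get(parts[1]),
                               StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
        // An incomplete journal means no move had started, so it is discarded.
        Files.delete(journal);
    }
}
```

Calling MoveJournal.recover at startup, before anything reads the destination, finishes a move that was cut short. Readers that run concurrently with the move can still observe a mixed state, so this gives crash recovery rather than isolation.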

Answer 5

Score: 0


I hope this solution is useful. As I understand it, you need to copy the files from one directory to another directory.
My solution is as follows:
Thank you!

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class CopyFilesDirectoryProgram {

    public static void main(String[] args) throws IOException {
        String sourceDirectoryName = "//mention your source path";
        String targetDirectoryName = "//mention your destination path";
        File sdir = new File(sourceDirectoryName);
        File tdir = new File(targetDirectoryName);
        // call the method for execution
        copy(sdir, tdir);
    }

    // Copies a single file, or recurses into a directory.
    private static void copy(File source, File target) throws IOException {
        if (source.isDirectory()) {
            copyFilesFromDirectory(source, target);
        } else {
            Files.copy(source.toPath(), target.toPath());
        }
    }

    private static void copyFilesFromDirectory(File source, File target) throws IOException {
        // Create the target directory if needed, then always copy the contents.
        if (!target.exists()) {
            target.mkdir();
        }
        for (String item : source.list()) {
            copy(new File(source, item), new File(target, item));
        }
    }
}

huangapple
  • Published on September 11, 2020, 02:07:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63835385.html