How to open a file in Java that does not prevent external "Safe Save"?
Question
We want to open a file in Java and read its contents.
This file may be updated by an external application using Safe Save. That means the file will be externally read and its updated contents will be stored to a new file. Eventually the original file is deleted and the new file is renamed to match the original file's name.
Unfortunately, the external process fails during the rename (the last step of the Safe Save) when our Java application is reading the original file at the same time.
We experimented with different kinds of open modes but could not find a solution that does not fail the external reader.
Is there some way to open a file that does not interfere with external processes accessing the same file? Ideally, whenever an external process moves or deletes the file we would like to get an exception in our Java application. And only there.
Do you have any ideas on how to achieve that?
EDIT:
Just some clarification regarding the use case:
This is an indexer-like scenario. We want to index the contents of a potentially very large filesystem that independent 3rd party processes can concurrently read from or write to as well. We have no control over the 3rd party processes.
Copying the original file seems like a big overhead and we are not sure if that helps with the original problem as it will probably fail the external reader on a Safe Save as well.
Last but not least: this should work on Windows and Linux, but we are experiencing this problem on Windows.
Answer 1
Score: 1
On Windows, whether a file can be renamed or deleted while it is open is controlled by the FILE_SHARE_DELETE sharing mode flag. This flag has to be passed when the file is opened with the low-level CreateFile function.
Unfortunately, the Java API does not give you control over low-level Windows-specific flags. There is an open bug report asking for FILE_SHARE_DELETE to be added by default, but it is unlikely to be done because of backwards compatibility (some applications may depend on the current behavior). A comment in the report suggests a workaround: instead of new FileInputStream(file), use the java.nio API:
InputStream in = Files.newInputStream(file.toPath());
I don't have access to Windows right now to verify that this workaround uses the right sharing mode.
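A way to try this out is a small experiment along the following lines. This is only a sketch, assuming Java 11 or newer; the class name SharedReadDemo, the helper names and the file names are made up for illustration. It opens the file through java.nio, performs the replace step of a safe save while the stream is still open, and then finishes reading. On Linux the reader simply keeps seeing the old contents. Whether the swap also succeeds on Windows is exactly the part that still needs to be verified there.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class SharedReadDemo {

    // Opens `file` through java.nio, runs `whileOpen` while the stream is
    // still open, then reads the remaining bytes from the open stream.
    static String readWhile(Path file, Runnable whileOpen) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            whileOpen.run();
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Simulates the last step of a safe save: `replacement` takes over the
    // name of `target`. (A real editor may delete first and rename afterwards.)
    static void safeSaveSwap(Path replacement, Path target) {
        try {
            Files.move(replacement, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("safe-save-demo");
        Path original = dir.resolve("data.txt");
        Path updated = dir.resolve("data.txt.tmp");
        Files.writeString(original, "old");
        Files.writeString(updated, "new");

        String seen = readWhile(original, () -> safeSaveSwap(updated, original));
        System.out.println("reader saw: " + seen);
        System.out.println("file now contains: " + Files.readString(original));
    }
}
```

If the swap throws on Windows, the workaround does not help for that JDK and Windows version, and the experiment can be repeated with new FileInputStream(file) to compare the two APIs.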
Answer 2
Score: 0
Make a copy of the original file and use that copy within your Java program, while keeping track of changes to the original file.
Here, this might help you out:
The java.nio.file package provides a file change notification API, called the Watch Service API. This API enables you to register a directory (or directories) with the watch service. When registering, you tell the service which types of events you are interested in: file creation, file deletion, or file modification. When the service detects an event of interest, it is forwarded to the registered process. The registered process has a thread (or a pool of threads) dedicated to watching for any events it has registered for. When an event comes in, it is handled as needed. (Source: official docs)
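A minimal sketch of how the Watch Service API can be used might look like this. The class name DirectoryWatchDemo and the helper methods are invented for the example; the WatchService, WatchKey and StandardWatchEventKinds types are the real java.nio.file ones. Note that events are reported per directory, with the affected file name as the event context, so watching a single file means watching its parent directory and filtering.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class DirectoryWatchDemo {

    // Registers `dir` for create, delete and modify events.
    static WatchService register(Path dir) throws IOException {
        WatchService watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_DELETE,
                StandardWatchEventKinds.ENTRY_MODIFY);
        return watcher;
    }

    // Waits up to the given timeout for the next batch of events and returns
    // them as "KIND:fileName" strings; an empty list means the wait timed out.
    static List<String> nextEvents(WatchService watcher, long timeout, TimeUnit unit)
            throws InterruptedException {
        List<String> events = new ArrayList<>();
        WatchKey key = watcher.poll(timeout, unit);
        if (key == null) {
            return events;
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            events.add(event.kind().name() + ":" + event.context());
        }
        key.reset(); // re-arm the key so that later events are delivered too
        return events;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("watch-demo");
        try (WatchService watcher = register(dir)) {
            Files.createFile(dir.resolve("data.txt"));
            System.out.println(nextEvents(watcher, 10, TimeUnit.SECONDS));
        }
    }
}
```

An indexer would typically run the polling loop on a dedicated thread and re-index a file when a delete or create event for its name arrives, which is what a safe save looks like from the outside.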
Answer 3
Score: 0
You cannot achieve this with files alone, at least not without making additional assumptions. If the processes are not synchronized you will get (a) errors, (b) corrupted data, or (c) both. Furthermore, such a system will be unstable, prone to race conditions and dependent on implementation-specific details. This means that even if it looks like it is working, it will not work correctly always and in every case.
Depending on your circumstances you might try a combination of scheduling (e.g. process A runs every even minute, process B every odd minute), exclusive/shared open flags, range locks, copying files, file change notifiers, retrying on failure etc. If you can somehow ensure that your assumptions are never broken, you might end up with something that is "good enough". But all in all, this is bad engineering practice and should be avoided.
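To make the "retrying on failure" idea concrete, here is a rough sketch. The class name RetryingReader and its parameters are hypothetical, and comparing size and modification time before and after the read is only a heuristic chosen for this example. It reads the whole file and retries when the read fails or the file appears to have changed in the meantime. It narrows the race window but does not close it, which is the limitation described above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class RetryingReader {

    // Reads `file` completely. Retries up to `maxAttempts` times if the read
    // throws or if size / last-modified time differ before and after the read.
    static byte[] readStable(Path file, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                BasicFileAttributes before = Files.readAttributes(file, BasicFileAttributes.class);
                byte[] data = Files.readAllBytes(file);
                BasicFileAttributes after = Files.readAttributes(file, BasicFileAttributes.class);
                boolean unchanged = before.size() == after.size()
                        && before.lastModifiedTime().equals(after.lastModifiedTime())
                        && data.length == after.size();
                if (unchanged) {
                    return data;
                }
                last = new IOException("file changed while reading: " + file);
            } catch (IOException e) {
                last = e; // e.g. the file was briefly missing during a safe save
            }
            if (attempt < maxAttempts) {
                Thread.sleep(pauseMillis);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("retry-demo", ".txt");
        Files.writeString(file, "hello");
        System.out.println(new String(readStable(file, 3, 100)));
    }
}
```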
For a proper solution, you need to make both processes aware that they are talking to each other. What you have is really a textbook use case for a database. Besides using a database there are plenty of other ways to synchronize access to data - messaging, streams, locks, shared memory etc. Each way has its own benefits and downsides and without knowing more about your specific situation it is impossible to say which would be better.