Golang file and folder replication / mirroring across multiple servers

Question


Consider this scenario. In a load-balanced environment, I have 3 separate instances of a CMS running on 3 different physical servers. These 3 instances of the application share the same database.

On each server, the CMS has a /media folder where all media subfolders and files reside. My question is how I'd implement/code a file replication service/functionality in Golang, so when a subfolder or file is added/changed/deleted on one of the servers, it'll get copied/replicated/deleted on all other servers?

What packages would I need to look into, or perhaps you have a small code snippet to help me get started? That would be awesome.

Edit:
This question has been marked as "duplicate", but it is not. It is, however, an alternative to setting up a shared network file system. I'm thinking that keeping a copy of the same file on all servers, and keeping those copies synchronized, might be better than sharing them.

Answer 1

Score: 3


You probably shouldn't do this. Use a distributed file system, object storage (such as S3 or GCS), or a syncing program like btsync or syncthing.

If you still want to do this yourself, it will be challenging. You are basically building a distributed database and they are difficult to get right.

At first blush you could check out something like etcd or Raft, but unfortunately etcd doesn't work well with large files.

You could, on upload, also copy the file to every other server using ssh. But then what happens when a server goes down? Or what happens when two people update the same file at the same time?

Maybe you could design it such that every file gets a unique id (perhaps based on the hash of its contents so you can safely dedupe) and those files can never be updated or deleted, only added. That would solve the simultaneous update problem, but you'd still have the downtime problem.

One approach would be for each server to maintain an append-only version log when a file is added:

VERSION | FILE HASH
      1 |   abcd123
      2 |   efgh456
      3 |   ijkl789

With that, you can pull every file from a server, and a single number is enough to tell when files have been added. (For example, if you think Server A is on version 5 and you learn it is now on version 7, you know you need to sync 2 files.)
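The version log can be sketched in a few lines. This is an in-memory illustration only, assuming the version is simply the entry's 1-based position; VersionLog, Append and Since are hypothetical names, and a real log would need to be persisted.

```go
package main

import (
	"fmt"
	"sync"
)

// VersionLog is an in-memory append-only log. The version of an entry is its
// 1-based position, matching the VERSION column in the table above.
type VersionLog struct {
	mu     sync.Mutex
	hashes []string
}

// Append records a newly added file hash and returns the new version.
func (l *VersionLog) Append(hash string) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.hashes = append(l.hashes, hash)
	return len(l.hashes)
}

// Version returns the latest version, or 0 if the log is empty.
func (l *VersionLog) Version() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return len(l.hashes)
}

// Since returns a copy of the hashes added after version v, oldest first.
func (l *VersionLog) Since(v int) []string {
	l.mu.Lock()
	defer l.mu.Unlock()
	if v < 0 {
		v = 0
	}
	if v >= len(l.hashes) {
		return nil
	}
	return append([]string(nil), l.hashes[v:]...)
}

func main() {
	var vl VersionLog
	for _, h := range []string{"abcd123", "efgh456", "ijkl789"} {
		vl.Append(h)
	}
	// A peer that last saw version 1 needs the two newer files.
	fmt.Println(vl.Since(1)) // [efgh456 ijkl789]
}
```

A peer only has to remember the last version it saw for each server; the difference between that number and the server's current version is the number of files left to fetch.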

You could do this with a database table:

ID | LOCAL_SERVER_ID | REMOTE_SERVER_ID | VERSION | FILE HASH

You could poll this table periodically and sync between machines via ssh or http. If a server is down, you can simply retry until it works.

Or, if you don't want a centralized database for this, you could use a library like memberlist. The local metadata for each node could be its version.

Either way, there will be some delay between when a file is uploaded to a single server and when it is available on all of them. Handling that well is hard, which is why you probably shouldn't do this.

huangapple
  • Posted on May 14, 2015, 00:51:42
  • When reposting, please keep the link to this article: https://go.coder-hub.com/30220917.html