File sync between n web servers in cluster
Question
There are n nodes in a web cluster. Files may be uploaded to any node and must then be distributed to every other node. This distribution does not have to happen in a transaction (in fact it must not, since distributed transactions don't scale), and some latency is acceptable, although it must be minimal. Conflicts can be resolved arbitrarily (typically last write wins), provided that the resolution is also distributed to all nodes so that eventually all nodes have the same set of files. Nodes can be added and removed dynamically without having to reconfigure existing nodes. There must be no single point of failure, and no additional boxes (such as RabbitMQ) should be required to solve this.
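To make the last-write-wins requirement concrete, here is a minimal Go sketch of how two nodes could merge their file indexes. It is only an illustration under stated assumptions: `FileVersion`, `Merge`, `newer`, and the node IDs are hypothetical names, modification times are assumed to be Unix nanoseconds taken from reasonably synchronized clocks, and ties are broken by node ID so that every node picks the same winner.

```go
package main

import (
	"fmt"
	"sort"
)

// FileVersion is one node's record of a file. All names in this sketch are
// hypothetical and exist only for illustration.
type FileVersion struct {
	Path    string
	ModTime int64  // Unix nanoseconds; assumes roughly synchronized clocks
	Node    string // ID of the node that wrote this version
	Deleted bool   // tombstone, so that deletions can propagate too
}

// newer reports whether a wins over b under last-write-wins.
// Equal timestamps are broken by node ID so all nodes agree on the winner.
func newer(a, b FileVersion) bool {
	if a.ModTime != b.ModTime {
		return a.ModTime > b.ModTime
	}
	return a.Node > b.Node
}

// Merge folds a remote index into the local one in place and returns the
// sorted paths whose winning version changed locally, i.e. the files this
// node would still have to fetch (or delete).
func Merge(local, remote map[string]FileVersion) []string {
	var changed []string
	for path, rv := range remote {
		lv, ok := local[path]
		if !ok || newer(rv, lv) {
			local[path] = rv
			changed = append(changed, path)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	local := map[string]FileVersion{
		"a.txt": {Path: "a.txt", ModTime: 100, Node: "node1"},
	}
	remote := map[string]FileVersion{
		"a.txt": {Path: "a.txt", ModTime: 200, Node: "node2"},
		"b.txt": {Path: "b.txt", ModTime: 100, Node: "node2"},
	}
	changed := Merge(local, remote)
	fmt.Println(changed, local["a.txt"].Node)
}
```

Because the comparison is deterministic, any two nodes that exchange indexes in either order end up with the same winning version per path, which is the "eventually all nodes have the same set of files" property asked for above.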
I am thinking along the lines of using consul.io for dynamic configuration so that each node can refer to consul to determine what other nodes are available and writing a daemon (Golang) that monitors the relevant folders and communicates with other nodes using ZeroMQ.
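As a rough idea of the folder-monitoring part of such a daemon, here is a minimal Go sketch that uses only the standard library. It makes some simplifying assumptions: it polls by taking directory snapshots instead of using OS file-change notifications, the `Snapshot` and `Diff` names are hypothetical, and the Consul lookup and ZeroMQ transport are left out entirely.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// Snapshot walks root and records each regular file's modification time
// (Unix nanoseconds), keyed by its path relative to root.
func Snapshot(root string) (map[string]int64, error) {
	snap := make(map[string]int64)
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		snap[rel] = info.ModTime().UnixNano()
		return nil
	})
	return snap, err
}

// Diff compares two snapshots and returns the sorted paths that are new or
// modified (changed) and the paths that disappeared (removed).
func Diff(old, cur map[string]int64) (changed, removed []string) {
	for p, m := range cur {
		if om, ok := old[p]; !ok || om != m {
			changed = append(changed, p)
		}
	}
	for p := range old {
		if _, ok := cur[p]; !ok {
			removed = append(removed, p)
		}
	}
	sort.Strings(changed)
	sort.Strings(removed)
	return changed, removed
}

func main() {
	// Demo on a throwaway directory; a real daemon would loop with a ticker.
	dir, err := os.MkdirTemp("", "syncdemo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	before, err := Snapshot(dir)
	if err != nil {
		panic(err)
	}
	if err := os.WriteFile(filepath.Join(dir, "upload.txt"), []byte("hello"), 0o644); err != nil {
		panic(err)
	}
	after, err := Snapshot(dir)
	if err != nil {
		panic(err)
	}
	changed, removed := Diff(before, after)
	fmt.Println(changed, removed)
}
```

The `changed` and `removed` lists are what the daemon would announce to the peers it learned about from Consul. Polling trades some latency for simplicity, so a notification-based watcher would likely be the better fit if latency has to be minimal.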
It feels like I would be re-inventing the wheel, though. This is a common problem, and I expect there are already solutions available that I don't know about. Or perhaps my approach is wrong and there is another way to solve this?
Answer 1
Score: 1
Yes, there has been some activity in distributed synchronization lately:
You could use Syncthing (open source) or BitTorrent Sync (BTSync).
Syncthing is node-based, i.e. you add nodes to a cluster and choose which folders to synchronize.
BTSync is folder-based, i.e. you obtain a "secret" for a folder and can synchronize with everyone in the swarm for that folder.
In my experience, BTSync has better discovery and connectivity, but the whole synchronization process is closed source and nobody really knows what happens. Syncthing is written in Go, but it sometimes has trouble discovering peers.
AFAIK, both Syncthing and BTSync discover peers via LAN broadcast and via a tracker.
EDIT: Or, if you're really cool, use IPFS to host the latest version, use IPNS to "name" it, and mount the IPNS on the servers. You can set the IPFS bootstrap list to some of your own servers, which would even make you independent of external trackers.