How do I migrate data from a source with 2 data directories to a target cluster with just 1?
Question
Current setup:
Single node
Cassandra 3.11.3 + KairosDB 1.2
Two data storage paths
/data/cassandra/data/kairosdb 4T old data
/data1/cassandra/data/kaiosdb 1.1T data currently being written
Target:
Three nodes
Cassandra 3.11.3 + KairosDB 1.2
One data storage path
/data/cassandra/data/kairosdb
In this case, how do I migrate the data held in two data directories on a single node to a three-node cluster in which each node has only one data directory?
I understand how to migrate a single node to a three-node cluster (and have practiced it), but only when the source has a single data directory. I have searched the Internet for a long time for how to migrate 2 data directories to 1, but found no reference material.
Answer 1
Score: 1
Data directories are something that an individual Cassandra node cares about, but the cluster doesn't.
Usually you'd want all nodes to share the same configuration, but for replication it really doesn't matter where the SSTables sit on disk on each node.
So the migration here would be the same as the one you've practiced.
That said, the process I'd choose would be to add the new nodes as a second DC with the right replication, run a repair to get all the data in sync, and then decommission the original node.
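As a rough illustration of that sequence, here is a minimal sketch that only prints the commands in order and runs nothing. The datacenter names dc_old and dc_new are hypothetical, the keyspace name kairosdb is taken from the paths in the question, and the step that drops the old DC from replication before decommissioning is an addition that the answer does not spell out. It also assumes the keyspace can use NetworkTopologyStrategy, which requires a DC-aware snitch on the existing node.

```python
def second_dc_plan(keyspace, old_dc, new_dc, old_rf=1, new_rf=3):
    """Return (where, command) pairs for moving a keyspace to a new datacenter.

    Nothing is executed; the caller decides how to run each step.
    """
    both = (
        f"{{'class': 'NetworkTopologyStrategy', "
        f"'{old_dc}': {old_rf}, '{new_dc}': {new_rf}}}"
    )
    only_new = f"{{'class': 'NetworkTopologyStrategy', '{new_dc}': {new_rf}}}"
    return [
        # Replicate to both datacenters once the new nodes have joined.
        ("cqlsh", f"ALTER KEYSPACE {keyspace} WITH replication = {both};"),
        # Stream the existing data to the new replicas.
        ("each new node", f"nodetool repair --full {keyspace}"),
        # Assumption: stop replicating to the old DC before removing its node.
        ("cqlsh", f"ALTER KEYSPACE {keyspace} WITH replication = {only_new};"),
        ("original node", "nodetool decommission"),
    ]


# Hypothetical datacenter names; replace them with the real ones.
for where, command in second_dc_plan("kairosdb", "dc_old", "dc_new"):
    print(f"[{where}] {command}")
```

Running nodetool rebuild on each new node, naming the old DC as the source, is a commonly used alternative to a full repair for the initial stream.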
Answer 2
Score: 0
When cloning application data to a new cluster, the configuration of the source cluster is mostly irrelevant. What is important is that (a) you know which application tables you need to migrate, and (b) the directories where they are located.
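With two data directories, the SSTables of one table can be spread over both, so a snapshot of that table may leave files under each root. A minimal sketch for locating them, assuming the usual on-disk layout of data root, keyspace, table directory with an ID suffix, snapshots, then the snapshot tag. The data roots, the table name data_points and the tag migrate are illustrative assumptions, not values confirmed by the question.

```python
from pathlib import Path


def snapshot_dirs(data_roots, keyspace, table, tag):
    """Find the snapshot directories of one table under every data root."""
    pattern = f"{keyspace}/{table}-*/snapshots/{tag}"
    found = []
    for root in data_roots:
        # A missing root simply yields no matches.
        found.extend(sorted(p for p in Path(root).glob(pattern) if p.is_dir()))
    return found


# Assumed data roots, derived from the two paths in the question.
roots = ["/data/cassandra/data", "/data1/cassandra/data"]
for directory in snapshot_dirs(roots, "kairosdb", "data_points", "migrate"):
    print(directory)
```

Every directory it prints has to be included in the copy or bulk load, otherwise part of the table is left behind.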
Specifically for your situation, there are two options available to you:
- the file copy method, and
- the bulk-load method.
File copy method
WARNING - This method will only work if the application keyspace(s) have a replication factor of 3, so that each of the three target nodes holds a full copy of the data. Otherwise, you can only use the bulk-load method.
The high-level steps to copy the application data from the source node to the target nodes are:
- Take a snapshot of the first application table.
- Create the table schema on the target cluster.
- Copy the data files (via SCP, FTP, etc.) from the source node to the corresponding table sub-directory on the three target nodes.
- Run the nodetool refresh command on the target nodes.
Repeat the steps above for each application table you need to clone.
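The steps above could be laid out for one table roughly as follows. This is a sketch that only builds the command lines; it does not run them and does not cover creating the schema. The host names, the snapshot tag, the table name data_points, the SRC_ID and DST_ID placeholders in the paths, and the choice of rsync over ssh for the copy are all assumptions. The point specific to this question is that both source snapshot directories are copied into the single table directory on each target node.

```python
def file_copy_plan(keyspace, table, tag, snapshot_dirs, target_hosts, target_table_dir):
    """Build the commands for the file copy method for a single table."""
    # Run on the source node; the snapshot lands under every data directory.
    plan = [f"nodetool snapshot -t {tag} --table {table} {keyspace}"]
    for host in target_hosts:
        for snap in snapshot_dirs:
            # The trailing slash copies the directory contents, not the directory.
            plan.append(f"rsync -a {snap}/ {host}:{target_table_dir}/")
        # Make the node pick up the newly copied SSTables.
        plan.append(f"ssh {host} nodetool refresh {keyspace} {table}")
    return plan


# SRC_ID and DST_ID stand for the table directory suffixes, which differ
# between clusters and have to be looked up on disk.
sources = [
    "/data/cassandra/data/kairosdb/data_points-SRC_ID/snapshots/migrate",
    "/data1/cassandra/data/kairosdb/data_points-SRC_ID/snapshots/migrate",
]
destination = "/data/cassandra/data/kairosdb/data_points-DST_ID"
for command in file_copy_plan(
    "kairosdb", "data_points", "migrate", sources, ["node1", "node2", "node3"], destination
):
    print(command)
```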
For details, see the procedure I documented in How to restore snapshots to another Cassandra cluster.
Bulk-load method
The high-level steps for bulk-loading application data to the target cluster are:
- Take a snapshot of the first application table.
- Bulk-load the data to the target cluster using the sstableloader utility.
- Repeat the steps above until all application tables have been cloned to the target cluster.
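A possible way to prepare the bulk load is sketched below. In Cassandra 3.x, sstableloader takes the keyspace and table names from the last two components of the directory it is given, so the snapshot files are first hard-linked into a staging directory with that shape. The staging locations and host names are hypothetical. Since hard links cannot cross filesystems, this sketch assumes one staging directory per data root and one sstableloader run for each.

```python
import os
from pathlib import Path


def stage_snapshot(snapshot_dir, staging_root, keyspace, table):
    """Hard-link snapshot files into <staging_root>/<keyspace>/<table>."""
    dest = Path(staging_root) / keyspace / table
    dest.mkdir(parents=True, exist_ok=True)
    for source_file in Path(snapshot_dir).iterdir():
        link = dest / source_file.name
        if source_file.is_file() and not link.exists():
            # A hard link avoids duplicating terabytes of SSTables.
            os.link(source_file, link)
    return dest


def loader_command(staged_dir, target_hosts):
    """Build the sstableloader invocation for one staged directory."""
    return ["sstableloader", "-d", ",".join(target_hosts), str(staged_dir)]


# Hypothetical staging directories, one on each source filesystem.
for staged in ("/data/staging/kairosdb/data_points", "/data1/staging/kairosdb/data_points"):
    print(" ".join(loader_command(staged, ["node1", "node2", "node3"])))
```

Unlike the file copy method, sstableloader streams each partition to the nodes that own it, which is why it does not depend on the replication factor matching the node count.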
For details, see the procedure I documented in How to migrate data to a new Cassandra cluster. Cheers!