2023-02-19 21:50:14

How do I migrate data where source has 2 data directories to a target cluster with just 1?

Question

Now:
Single node
Cassandra 3.11.3 + KairosDB 1.2
Two data storage paths
/data/cassandra/data/kairosdb 4T old data
/data1/cassandra/data/kaiosdb 1.1T data currently being written

Target:
Three nodes
Cassandra 3.11.3 + KairosDB 1.2
One data storage path
/data/cassandra/data/kairosdb

In this case, how do I migrate the data from the two data directories on the single node to a three-node cluster in which each node has only one data directory?

I understand how to migrate a single node to a three-node cluster (and have practiced it), but only when there is one data directory. I have searched the Internet for a long time for how to migrate two data directories into one, but found no reference material.

Answer 1

Score: 1

Data directories are something that an individual Cassandra node cares about, but the cluster does not.
Usually you'd want all nodes to share the same configuration, but for replication it really doesn't matter where the SSTables are on disk on each node.

So migrating here would be the same as what you've practiced.

That said, the process I'd choose would be to add the new nodes as a second DC with the right replication, run a repair to get all the data in sync, and then decommission the original node.
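
As a rough illustration of that sequence, the sketch below only builds the statements for review and executes nothing. The keyspace name kairosdb is inferred from the question's paths; the datacenter names dc_old and dc_new, the replication factor of 3 for the new DC, and the extra step that removes the old DC from replication before decommissioning are assumptions, not part of the answer. The DC names would have to match what the snitch reports on the real cluster.

```python
def alter_keyspace_cql(keyspace, dc_replication):
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy."""
    options = ", ".join(
        f"'{dc}': {rf}" for dc, rf in sorted(dc_replication.items())
    )
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


def migration_plan(keyspace, old_dc, new_dc, new_rf):
    """Return the ordered (where to run it, command) steps as strings."""
    return [
        # 1. Replicate to both datacenters once the new nodes have joined.
        ("cqlsh", alter_keyspace_cql(keyspace, {old_dc: 1, new_dc: new_rf})),
        # 2. Repair so the new nodes receive the existing data.
        ("each new node", f"nodetool repair -full {keyspace}"),
        # 3. Assumed step: stop replicating to the old datacenter.
        ("cqlsh", alter_keyspace_cql(keyspace, {new_dc: new_rf})),
        # 4. Retire the original node.
        ("original node", "nodetool decommission"),
    ]


if __name__ == "__main__":
    # dc_old and dc_new are hypothetical datacenter names.
    for where, command in migration_plan("kairosdb", "dc_old", "dc_new", 3):
        print(f"[{where}] {command}")
```

Neither data directory appears anywhere in this plan, which is the point of the answer: repair streams data between nodes, and each node writes it to whatever directories its own configuration lists.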

Answer 2

Score: 0

When cloning application data to a new cluster, the configuration of the source cluster is mostly irrelevant. What is important is that (a) you know which application tables you need to migrate, and (b) the directories where they are located.

Specifically for your situation, there are two options available to you:

the file copy method, and
the bulk-load method.

File copy method

WARNING - This method will only work if the application keyspace(s) have a replication factor of 3. Otherwise, you can only use the bulk-load method.

The high-level steps to copy the application data from the source node to the target nodes are:

1. Take a snapshot of the first application table.
2. Create the table schema on the target cluster.
3. Copy the data files (via SCP, FTP, etc.) from the source node to the corresponding table sub-directory on the three target nodes.
4. Run the nodetool refresh command on the target nodes.

Repeat the steps above for each application table you need to clone.

For details, see the procedure I documented in How to restore snapshots to another Cassandra cluster.
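
For the two-directory case in the question, the copy step amounts to gathering one table's snapshot files from both source directories and placing them in the single table directory on each target node. The sketch below is one possible way to plan that, not the procedure from the linked post. It assumes the usual snapshot layout `<data_dir>/<keyspace>/<table>-<id>/snapshots/<tag>/`, a hypothetical snapshot tag, and that manifest.json and schema.cql are the only non-SSTable files in a snapshot directory. It refuses to continue if two files share a name rather than assuming they never do.

```python
import glob
import os

# Assumed to be snapshot metadata rather than SSTable components.
SKIP = {"manifest.json", "schema.cql"}


def snapshot_files(data_dirs, keyspace, table, tag):
    """List one table's snapshot files across several data directories."""
    found = []
    for data_dir in data_dirs:
        pattern = os.path.join(
            data_dir, keyspace, f"{table}-*", "snapshots", tag, "*"
        )
        for path in sorted(glob.glob(pattern)):
            if os.path.isfile(path) and os.path.basename(path) not in SKIP:
                found.append(path)
    return found


def plan_copy(files, target_table_dir):
    """Map each source file to its path in the single target directory.

    Raises ValueError on a duplicate file name, because one file would
    silently overwrite the other in the merged directory.
    """
    plan = {}
    seen = set()
    for src in files:
        name = os.path.basename(src)
        if name in seen:
            raise ValueError(f"file name collision: {name}")
        seen.add(name)
        plan[src] = os.path.join(target_table_dir, name)
    return plan
```

With the question's layout, `data_dirs` would be `["/data/cassandra/data", "/data1/cassandra/data"]`. The resulting plan can drive whatever transfer tool is in use, followed by `nodetool refresh <keyspace> <table>` on each target node.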

Bulk-load method

The high-level steps for bulk-loading application data to the target cluster are:

1. Take a snapshot of the first application table.
2. Bulk-load the data to the target cluster using the sstableloader utility.
3. Repeat the steps above until all application tables have been cloned to the target cluster.

For details, see the procedure I documented in How to migrate data to a new Cassandra cluster. Cheers!
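
One practical detail when bulk-loading from snapshots: sstableloader expects the directory it is given to end in `<keyspace>/<table>`, so a `snapshots/<tag>` directory generally has to be staged first. The sketch below is an assumption-laden illustration rather than the linked procedure. The staging location, host addresses, and the list of skipped metadata files are hypothetical, and copying is used only for simplicity.

```python
import os
import shutil

# Assumed to be snapshot metadata rather than SSTable components.
SKIP = {"manifest.json", "schema.cql"}


def stage_for_sstableloader(snapshot_dirs, staging_root, keyspace, table):
    """Copy snapshot files into <staging_root>/<keyspace>/<table>/."""
    dest = os.path.join(staging_root, keyspace, table)
    os.makedirs(dest, exist_ok=True)
    for snapshot_dir in snapshot_dirs:
        for name in sorted(os.listdir(snapshot_dir)):
            src = os.path.join(snapshot_dir, name)
            if name in SKIP or not os.path.isfile(src):
                continue
            target = os.path.join(dest, name)
            if os.path.exists(target):
                # Never overwrite: a clash means the inputs need checking.
                raise FileExistsError(f"already staged: {name}")
            shutil.copy2(src, target)
    return dest


def sstableloader_command(hosts, table_dir):
    """Build the argv list; -d takes the initial contact hosts."""
    return ["sstableloader", "-d", ",".join(hosts), table_dir]
```

Because sstableloader streams each row to whichever nodes own it, the two source directories could also be staged and loaded in two separate runs instead of being merged first, which avoids holding a second full copy of the data.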
