How to export data from Cassandra with a limit, or only the data from the last 30 days
Question
I'm working with Apache Cassandra and want to back up the last 30 days of data from an existing Cassandra database and import it into another Cassandra database. I found that the CQLSH COPY command can export data, but it doesn't seem to offer a way to limit what gets exported. Is there a way to back up only the last 30 days of data, or to otherwise limit the backup, in Cassandra?
Any help is really appreciated.
Answer 1
Score: 1
CQLSH is not a backup tool. If you just want to copy data out, you could filter on a date column in your table, or use writetime, with something like Spark to pull the data out.
If you want a real backup solution, use something like Medusa, a backup tool for Cassandra. It lets you set a schedule and take real backups; however, you can't tell it to take a backup of data from 30 days ago. Instead, you take a backup, which contains all of the data; 30 days later you can take another backup and drop the first one if you like, or restore the first backup to your other cluster.
https://github.com/thelastpickle/cassandra-medusa
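The writetime idea from the first paragraph of this answer can be sketched roughly as follows. This is only an illustration in plain Python rather than Spark, and it rests on several assumptions: the keyspace, table, and column names (shop, events, id, payload) are hypothetical, and `session` is assumed to be any object whose `execute(query)` returns rows with attribute access, such as a session from the DataStax Python driver. CQL only allows WRITETIME in the SELECT list, not in WHERE, so the filtering happens on the client, which means the whole table is still scanned. For large tables that is the reason to reach for something like Spark.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def writetime_cutoff_micros(days, now=None):
    """Return the cutoff `days` ago as microseconds since the Unix epoch.

    WRITETIME() values in Cassandra are microseconds since the epoch,
    so the cutoff is expressed in the same unit.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    # Integer division of two timedeltas avoids float rounding.
    return (cutoff - EPOCH) // timedelta(microseconds=1)


def build_query(keyspace, table, columns, writetime_column):
    """Build a SELECT that also returns the write time of one column.

    `writetime_column` must be a regular (non primary key) column;
    WRITETIME cannot be applied to primary key columns.
    """
    cols = ", ".join(columns)
    return (
        f"SELECT {cols}, WRITETIME({writetime_column}) AS wt "
        f"FROM {keyspace}.{table}"
    )


def fetch_recent(session, query, cutoff_micros):
    """Yield only rows whose write time is at or after the cutoff.

    The filter runs client-side because WRITETIME cannot appear in a
    WHERE clause. Rows where the column is null have no write time
    and are skipped.
    """
    for row in session.execute(query):
        if row.wt is not None and row.wt >= cutoff_micros:
            yield row
```

With a real cluster, the rows yielded by `fetch_recent` could then be written to the target cluster or to a CSV file. If the table already has a date or timestamp column, filtering on that column instead may be simpler, though a WHERE clause on it only works efficiently when the column is part of the primary key.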