How to take data from Cassandra with a limit, or only the data from the last 30 days

Question

I'm working with an Apache Cassandra database. I want to back up the last 30 days of data from an existing Cassandra database and import it into another Cassandra database. I found that the cqlsh COPY command can be used to export data, but it doesn't seem to let me set a limit. Is there a way to back up only the last 30 days of data, or to apply some other limit, in Cassandra?

Any help is really appreciated.

Answer 1

Score: 1

cqlsh is not a backup tool. If you just want to copy data out, you could filter on a date column in your table, or use WRITETIME, and pull the data out with something like Spark.
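As a rough illustration of the WRITETIME idea, here is a minimal Python sketch. It uses the DataStax Python driver (`cassandra-driver`) rather than Spark, purely to keep the example short. The table `events` and its columns `id` and `payload` are hypothetical, as is the helper name `export_recent`. CQL does not accept WRITETIME in a WHERE clause, so the sketch reads the table and filters on the client; that is a full scan, which may be too slow for large tables.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def cutoff_micros(now, days=30):
    """Return the cutoff as microseconds since the Unix epoch.

    WRITETIME() reports write timestamps in that unit.
    """
    return (now - timedelta(days=days) - EPOCH) // timedelta(microseconds=1)


def recent_rows(rows, cutoff):
    """Keep rows (dicts with a 'wt' key) written at or after the cutoff."""
    return [r for r in rows if r["wt"] is not None and r["wt"] >= cutoff]


def export_recent(hosts, keyspace, days=30):
    """Hypothetical helper: fetch rows of `events` written in the last `days` days."""
    # Imported lazily so the helpers above work without the driver installed.
    from cassandra.cluster import Cluster
    from cassandra.query import dict_factory

    cluster = Cluster(hosts)
    try:
        session = cluster.connect(keyspace)
        session.row_factory = dict_factory  # rows come back as dicts
        # WRITETIME() only works on non-primary-key columns.
        query = "SELECT id, payload, WRITETIME(payload) AS wt FROM events"
        cutoff = cutoff_micros(datetime.now(timezone.utc), days)
        # The driver pages through the result set while we iterate.
        return recent_rows(session.execute(query), cutoff)
    finally:
        cluster.shutdown()
```

The rows returned by `export_recent` could then be written to the target cluster with ordinary INSERT statements. If the table already has a date or timestamp column, filtering on that column in the query itself is usually the simpler option.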

If you want a real backup solution, use something like Medusa, which is a backup tool for Cassandra. It lets you set a schedule and take real backups. However, you won't be able to tell it to back up only the data from the last 30 days. You take a backup, which contains all of the data; 30 days later you can take another backup and drop the first one if you like, or restore the first backup to your other cluster.

https://github.com/thelastpickle/cassandra-medusa
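The rotation described above (take a full backup, take another one 30 days later, then drop the old one) can be sketched in Python. This is only an illustration under stated assumptions: the function names and the backup names are hypothetical, the `medusa backup --backup-name=...` form is taken from the Medusa README and should be checked against `medusa --help` for your version, and the sketch only builds the command and decides what to drop; it does not run anything.

```python
from datetime import datetime, timedelta, timezone


def backup_command(name):
    """Build the argv for a Medusa backup (verify the flag for your version)."""
    return ["medusa", "backup", f"--backup-name={name}"]


def backups_to_drop(backups, now, keep_days=30):
    """Return names of backups that are safe to drop.

    `backups` maps a backup name to the datetime it finished.
    The newest backup is never dropped, so one full copy always remains.
    """
    if not backups:
        return []
    newest = max(backups, key=backups.get)
    limit = now - timedelta(days=keep_days)
    return sorted(
        name
        for name, finished in backups.items()
        if name != newest and finished <= limit
    )
```

For example, with one backup from July 13 and another from August 12, calling `backups_to_drop` on August 12 returns only the July backup; with a single backup it returns nothing, whatever its age.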

huangapple
  • Posted on July 13, 2023 at 12:24:24
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76675923.html