开始修复Debezium Engine的偏移问题应该从哪里开始?

huangapple go评论63阅读模式
英文:

Where to begin with fixing an offsets issue with Debezium Engine

问题

我正在使用Debezium引擎从MySQL数据库同步数据。由于我正在使用Debezium引擎,我正在使用org.apache.kafka.connect.storage.FileOffsetBackingStore来记录我的当前变更偏移。我认为我的计算机最近断电,导致我的偏移文件损坏。当我现在尝试运行我的Debezium引擎应用程序时,我从Debezium得到以下错误。

  1. ERROR io.debezium.embedded.EmbeddedEngine - 无法配置和启动'org.apache.kafka.connect.storage.FileOffsetBackingStore'偏移后备存储
  2. org.apache.kafka.connect.errors.ConnectException: java.io.StreamCorruptedException: invalid stream header: 00000000
  3. at org.apache.kafka.connect.storage.FileOffsetBackingStore.load(FileOffsetBackingStore.java:86)
  4. at org.apache.kafka.connect.storage.FileOffsetBackingStore.start(FileOffsetBackingStore.java:59)
  5. at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:691)
  6. at io.debezium.embedded.ConvertingEngineBuilder$2.run(ConvertingEngineBuilder.java:192)
  7. at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
  8. at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
  9. at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
  10. at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  11. at java.base/java.lang.Thread.run(Thread.java:833)
  12. Caused by: java.io.StreamCorruptedException: invalid stream header: 00000000
  13. at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:987)
  14. at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:414)
  15. at org.apache.kafka.connect.util.SafeObjectInputStream.<init>(SafeObjectInputStream.java:48)
  16. at org.apache.kafka.connect.storage.FileOffsetBackingStore.load(FileOffsetBackingStore.java:71)
  17. ... 8 common frames omitted

我想以正确的方式解决这个问题,通过修复Debezium偏移文件,但我不知道从哪里开始。我首先需要弄清楚我想要的偏移量,这将是失败的偏移量(或接近它的偏移量)。我可以从损坏的文件中获取偏移量,但如果无法获取,我是否可以使用时间戳来找到一个好的起始偏移量?看起来我可以使用这个工具来将文件更新到我选择的偏移量位置(https://github.com/nathan-smit-1/HashmapEditor),一旦我知道了它。然后,如何按时间顺序获取偏移量列表,以便知道我应该将其更改为哪个偏移量?

英文:

I'm using Debezium engine to sync data from a MySQL database. Since I'm using Debezium Engine I'm using the org.apache.kafka.connect.storage.FileOffsetBackingStore to record my current changes offset. I think my computer had a power outage recently which resulted in the corruption of my offset file. When I try to run my Debezium engine app now, I get this error from Debezium.

  1. ERROR io.debezium.embedded.EmbeddedEngine - Unable to configure and start the &#39;org.apache.kafka.connect.storage.FileOffsetBackingStore&#39; offset backing store
  2. org.apache.kafka.connect.errors.ConnectException: java.io.StreamCorruptedException: invalid stream header: 00000000
  3. at org.apache.kafka.connect.storage.FileOffsetBackingStore.load(FileOffsetBackingStore.java:86)
  4. at org.apache.kafka.connect.storage.FileOffsetBackingStore.start(FileOffsetBackingStore.java:59)
  5. at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:691)
  6. at io.debezium.embedded.ConvertingEngineBuilder$2.run(ConvertingEngineBuilder.java:192)
  7. at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
  8. at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
  9. at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
  10. at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  11. at java.base/java.lang.Thread.run(Thread.java:833)
  12. Caused by: java.io.StreamCorruptedException: invalid stream header: 00000000
  13. at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:987)
  14. at java.base/java.io.ObjectInputStream.&lt;init&gt;(ObjectInputStream.java:414)
  15. at org.apache.kafka.connect.util.SafeObjectInputStream.&lt;init&gt;(SafeObjectInputStream.java:48)
  16. at org.apache.kafka.connect.storage.FileOffsetBackingStore.load(FileOffsetBackingStore.java:71)
  17. ... 8 common frames omitted

I'd like to fix this issue the proper way, by repairing the Debezium offset file but I don't know where to begin. I'm thinking I first need to figure out what offset I want, which would be the offset at which it was failing (or an offset before but near it). I may be able to get the offset from the corrupted file, but if not, can I use a timestamp to find a good offset to start at? It looks like I can use this tool to update the file to point to an offset of my choice (https://github.com/nathan-smit-1/HashmapEditor) once I know it. How, though, do I get a list of offsets in chronological order so I know which one I should change it to?

答案1

得分: 0

  1. 删除损坏的 offsets.dat 文件。
  2. 启动 Debezium 以生成一个新的、可用的 offsets.dat 文件。
  3. 使用过去的 Debezium 日志找到 Debezium 最近处理过的偏移量(越新越好)。
  4. 使用能够二进制序列化步骤 3 中找到的偏移信息的工具编辑 offsets.dat 文件。具体取决于你要更改什么,一个十六进制编辑器可能会起作用,我不确定,我使用了一个 Java 应用程序将数据写入文件。
英文:

After a lot of trial and error and lots of online research I found the best solution to this was the following steps:

  1. Delete the corrupt offsets.dat file
  2. Start Debezium to generate a new, working offsets.dat file for this machine
  3. Use past Debezium logs to find an offset that Debezium processed recently (the more recent the better)
  4. Edit the offsets.dat file using a tool that can binary serialize the offset information you found in step 3. A hex editor might work depending on what you're changing, I don't know, I used a Java app to write the data to the file

huangapple
  • 本文由 发表于 2023年6月13日 05:54:07
  • 转载请务必保留本文链接:https://go.coder-hub.com/76460549.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定