Dataproc serverless writing to Bigtable: org.apache.spark.SparkException: Task failed while writing rows
Question
How do I find out the root cause? (I'm reading from Cassandra and writing to Bigtable)
I've tried:
- looking through Cassandra logs
- eliminating columns in case it was a data issue
- reducing spark.cassandra.input.fetch.size_in_rows from 100 to 10
- spark.speculation both true and false
- etc.
It does load hundreds of thousands of rows before it throws the error. Bigtable has TBs of free space.
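For reference, the write path is roughly shaped like the sketch below (not the actual job; the keyspace, table, column family, project, and instance names are placeholders):

```scala
import com.google.cloud.bigtable.hbase.BigtableConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapred.JobConf
import org.apache.spark.sql.SparkSession

object CassandraToBigtable {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cassandra-to-bigtable")
      // kept small while debugging, as mentioned above
      .config("spark.cassandra.input.fetch.size_in_rows", "10")
      .getOrCreate()

    // Read the source table (keyspace/table names are placeholders).
    val df = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table"))
      .load()

    // Bigtable is addressed through the HBase API (project/instance IDs are placeholders).
    val jobConf = new JobConf(BigtableConfiguration.configure("my-project", "my-instance"))
    jobConf.setOutputFormat(classOf[TableOutputFormat])
    jobConf.set(TableOutputFormat.OUTPUT_TABLE, "my_bigtable_table")

    // Build one Put per row; every cell added here becomes one Bigtable mutation.
    val puts = df.rdd.map { row =>
      val key = Bytes.toBytes(row.getAs[String]("id")) // row key: must not be null or empty
      val put = new Put(key)
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col1"),
        Bytes.toBytes(String.valueOf(row.getAs[Any]("col1"))))
      (new ImmutableBytesWritable(key), put)
    }

    // This is the path the stack trace below goes through (SparkHadoopWriter -> TableOutputFormat).
    puts.saveAsHadoopDataset(jobConf)
    spark.stop()
  }
}
```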
23/03/30 18:13:42 WARN TaskSetManager: Lost task 5.0 in stage 1.0 (TID 6) (10.128.0.46 executor 1): org.apache.spark.SparkException: Task failed while writing rows
at org.apache.spark.internal.io.SparkHadoopWriter$.executeTask(SparkHadoopWriter.scala:163)
at org.apache.spark.internal.io.SparkHadoopWriter$.$anonfun$write$1(SparkHadoopWriter.scala:88)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: IllegalArgumentException: 1 time, servers with issues: bigtable.googleapis.com
at com.google.cloud.bigtable.hbase.BigtableBufferedMutator.getExceptions(BigtableBufferedMutator.java:188)
at com.google.cloud.bigtable.hbase.BigtableBufferedMutator.handleExceptions(BigtableBufferedMutator.java:142)
at com.google.cloud.bigtable.hbase.BigtableBufferedMutator.mutate(BigtableBufferedMutator.java:133)
at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:101)
at org.apache.hadoop.hbase.mapred.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:52)
at org.apache.spark.internal.io.HadoopMapRedWriteConfigUtil.write(SparkHadoopWriter.scala:246)
at org.apache.spark.internal.io.SparkHadoopWriter$.$anonfun$executeTask$1(SparkHadoopWriter.scala:138)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1525)
at org.apache.spark.internal.io.SparkHadoopWriter$.executeTask(SparkHadoopWriter.scala:135)
... 9 more
Answer 1
Score: 1
The error message indicates that it's caused by an IllegalArgumentException.
Given that you were able to write thousands of rows to Bigtable before the error was thrown, it's likely that you hit the 100,000-mutation limit (https://cloud.google.com/bigtable/quotas#limits-operations). Note that this limit is on the number of mutations, not the number of rows.
It's possible that some of the rows have too many columns, and each column is converted into a mutation (https://cloud.google.com/bigtable/docs/writes#write-types).
You can try the following things:
- Check how you're creating row mutations from your Cassandra data.
- Check whether some rows have more than 10,000 columns (assuming you're creating one mutation per column); a sketch of this check follows below.
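A sketch of that second check, assuming the job builds HBase (row key, Put) pairs before writing them out (the object, function, and RDD names here are illustrative):

```scala
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.spark.rdd.RDD

object MutationCheck {
  // Bigtable rejects a single row write carrying more than 100,000 mutations;
  // with the HBase client, each cell added to a Put counts as one mutation.
  val MaxMutationsPerRow = 100000

  // `puts` stands for whatever RDD of (row key, Put) pairs the job writes out.
  def countOversizedRows(puts: RDD[(ImmutableBytesWritable, Put)]): Long =
    puts.filter { case (_, put) => put.size() > MaxMutationsPerRow }.count()
}
```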
Answer 2
Score: 0
It turns out that a few rows from Cassandra were corrupt: they had nulls in their key columns. I discovered this accidentally after dumping the table to CSV files and loading it into another database.
After removing those corrupt rows, everything loaded fine.
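For anyone hitting the same thing, a simple guard in the Spark job could look like the sketch below (the key column names are placeholders for whatever makes up the Bigtable row key):

```scala
import org.apache.spark.sql.DataFrame

// Drop rows whose key columns are null before building Bigtable mutations:
// a null key ends up as an empty/invalid row key, which the HBase client
// rejects, matching the IllegalArgumentException in the stack trace above.
def dropRowsWithNullKeys(df: DataFrame, keyColumns: Seq[String]): DataFrame =
  df.na.drop("any", keyColumns)

// Usage (column names are placeholders):
// val clean = dropRowsWithNullKeys(sourceDf, Seq("id", "bucket"))
```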