How to retrieve an Array instead of ArrayData from a Compute Function with RecordBatches in Apache Arrow
Question
I'm trying to extract an Array out of a Datum after a Compute Operation.
ARROW_ASSIGN_OR_RAISE(rbatch, ipc_reader->Read(i));
std::shared_ptr<arrow::Array> numbers_array_a = rbatch->column(2);
std::shared_ptr<arrow::Array> numbers_array_b = rbatch->column(3);
// Get element-wise sum of both columns A and B in our Table. Note that here we use
// CallFunction(), which takes the name of the function as the first argument.
ARROW_ASSIGN_OR_RAISE(my_datum, arrow::compute::CallFunction(
"add", {numbers_array_a,
numbers_array_b}));
The Datum now holds an Int64 Array:
std::cout << "Datum kind: " << my_datum.ToString()
<< " content type: " << my_datum.type()->ToString() << std::endl;
>> Datum kind: Array content type: int64
When I now try to print the Array
std::cout << my_datum.array()->ToString() << std::endl;
I get this error:
class "arrow::ArrayData" has no member "ToString"
The Array class has a ToString() function, but as far as I know I can't convert ArrayData into an Array.
I tried to convert the ArrayData into an Array, to initialize an Array with the ArrayData, and to access the values from the ArrayData, all without success.
I also tried to initialize a RecordBatch with the ArrayData.
Finally, I looked for other ways to retrieve the Array from the Datum, also without success.
How can I print, or at least access, the Array inside the Datum?
Answer 1
Score: 0
I think you want the arrow::Datum::make_array function:
std::shared_ptr<arrow::Array> result_array = my_datum.make_array();