Convert an XML file with a nested hierarchy in Azure Data Lake to CSV using a C# Azure Function
Question
I have an XML file with the structure shown below that I need to convert to CSV using a C# Azure Function. The XML file is stored in Azure Data Lake.
<root id="1" created_date="01/01/2023" asof_date="01/01/2023">
  <level1>
    <data1>sdfs</data1>
    <data2>true</data2>
    <level2 rec="4">
      <level_record>
        <groupid>1</groupid>
        <groupname>somegroup</groupname>
        <groupdate>01/01/2023</groudate>
        <groupvalue>5</groupvalue>
        <groupkey>ag55</groupkey>
      </level_record>
      <level_record>
        <groupid>2</groupid>
        <groupname>somegroup1</groupname>
        <groupdate>02/01/2023</groudate>
        <groupvalue>6</groupvalue>
        <groupkey>ag56</groupkey>
      </level_record>
    </level2>
  </level1>
</root>
How do I read the file from Azure Data Lake and convert it to a CSV file?
Answer 1
Score: 0
Here is an example of an Azure Function in C# that reads an XML file from Azure Data Lake Storage (through a blob trigger) and converts it to a CSV file:
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

namespace YourNamespace
{
    public static class ConvertXmlToCsvFunction
    {
        [Function("ConvertXmlToCsvFunction")]
        public static void Run(
            [BlobTrigger("your-container/{name}", Connection = "AzureWebJobsStorage")] Stream xmlStream,
            string name,
            FunctionContext context)
        {
            var logger = context.GetLogger("ConvertXmlToCsvFunction");
            logger.LogInformation("Processing file: {Name}", name);
            try
            {
                // Parse the XML directly from the trigger stream
                XDocument xDoc = XDocument.Load(xmlStream);

                // Navigate root -> level1 -> level2; ?. avoids a NullReferenceException on unexpected input
                XElement level2Element = xDoc.Element("root")?.Element("level1")?.Element("level2");
                if (level2Element == null)
                {
                    throw new InvalidDataException($"No root/level1/level2 element found in {name}.");
                }

                // Create the CSV header
                var csv = new StringBuilder();
                csv.Append("groupid,groupname,groupdate,groupvalue,groupkey\n");

                // Iterate over level_record elements and extract data
                foreach (XElement recordElement in level2Element.Elements("level_record"))
                {
                    // Casting to string yields null (an empty field) when an element is missing
                    string groupid = (string)recordElement.Element("groupid");
                    string groupname = (string)recordElement.Element("groupname");
                    string groupdate = (string)recordElement.Element("groupdate");
                    string groupvalue = (string)recordElement.Element("groupvalue");
                    string groupkey = (string)recordElement.Element("groupkey");

                    // Append the CSV row (values are not escaped, so they must not contain commas)
                    csv.Append($"{groupid},{groupname},{groupdate},{groupvalue},{groupkey}\n");
                }

                // Save the CSV content to a file in the function host's local temp directory.
                // GetFileName drops any folder prefix in the blob name so the path is valid.
                string csvFileName = Path.ChangeExtension(Path.GetFileName(name), "csv");
                string csvFilePath = Path.Combine(Path.GetTempPath(), csvFileName);
                File.WriteAllText(csvFilePath, csv.ToString());
                logger.LogInformation("CSV file created: {Path}", csvFilePath);
            }
            catch (Exception ex)
            {
                logger.LogError(ex, "An error occurred while converting {Name}", name);
                throw;
            }
        }
    }
}
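The code above joins values with plain commas, so a value that itself contains a comma or a double quote would corrupt the row. As a rough sketch only (the class name CsvSketch and its methods are hypothetical helpers, not part of any Azure SDK), the conversion could be isolated from the function and made to quote fields in the style of RFC 4180. It assumes the same five columns as the question and elements without XML namespaces.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; depends only on the .NET base class library.
public static class CsvSketch
{
    // Quote a field when it contains a comma, a double quote, or a line break,
    // doubling any embedded double quotes.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Read XML from any stream and write one CSV row per level_record element.
    public static void WriteCsv(Stream xml, TextWriter csv)
    {
        // Column list taken from the question's sample; adjust as needed.
        string[] columns = { "groupid", "groupname", "groupdate", "groupvalue", "groupkey" };
        csv.WriteLine(string.Join(",", columns));

        XDocument doc = XDocument.Load(xml);
        foreach (XElement record in doc.Descendants("level_record"))
        {
            // The (string) cast yields null for a missing element instead of throwing.
            IEnumerable<string> values =
                columns.Select(c => Escape((string)record.Element(c)));
            csv.WriteLine(string.Join(",", values));
        }
    }
}
```

Because WriteCsv takes a Stream and a TextWriter, it can be called with the trigger stream and a StringWriter or StreamWriter, and it can be exercised locally without any storage account.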
Answer 2
Score: 0
Try the following. The XML in the question is not valid because the groupdate elements do not have matching start and end tags.
using System.IO;
using System.Linq;
using System.Xml.Linq;

namespace ConsoleApplication52
{
    class Program
    {
        const string INPUT_FILENAME = @"c:\temp\test.xml";
        const string OUTPUT_FILENAME = @"c:\temp\test.csv";

        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load(INPUT_FILENAME);

            // The using block flushes and closes the writer even if an exception is thrown
            using (StreamWriter writer = new StreamWriter(OUTPUT_FILENAME))
            {
                int rowCount = 0;
                foreach (XElement record in doc.Descendants("level_record"))
                {
                    rowCount++;
                    if (rowCount == 1)
                    {
                        // Write the CSV header row from the first record's element names
                        string[] headers = record.Elements().Select(x => x.Name.LocalName).ToArray();
                        writer.WriteLine(string.Join(",", headers));
                    }

                    // Assumes elements are in the same order in every record
                    string[] data = record.Elements().Select(x => (string)x).ToArray();
                    writer.WriteLine(string.Join(",", data));
                }
            }
        }
    }
}
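The code above assumes every level_record lists its child elements in the same order. If that cannot be guaranteed, one possible variation is to collect the column names first and then look each value up by name. This is only a sketch: RecordTable and Write are hypothetical names, and it assumes the elements carry no XML namespace, as in the question's sample.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of the original answer.
public static class RecordTable
{
    public static void Write(XDocument doc, TextWriter csv)
    {
        List<XElement> records = doc.Descendants("level_record").ToList();

        // Collect column names in first-seen order across all records.
        var columns = new List<string>();
        var seen = new HashSet<string>();
        foreach (XElement element in records.SelectMany(r => r.Elements()))
        {
            if (seen.Add(element.Name.LocalName))
            {
                columns.Add(element.Name.LocalName);
            }
        }

        csv.WriteLine(string.Join(",", columns));
        foreach (XElement record in records)
        {
            // Look each value up by name; an absent element becomes an empty field.
            IEnumerable<string> values =
                columns.Select(c => (string)record.Element(c) ?? "");
            csv.WriteLine(string.Join(",", values));
        }
    }
}
```

The trade-off is that all records are held in memory so the header can be computed before the first row is written, which is fine for small files but may not suit very large ones.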