Azure B2C TrustFrameworkLocalization.xml localization tool

Question

I have TrustFrameworkLocalization.xml file, that contains localized strings, for now only English, but amount of strings is quite big. My colleagues from localization team are not technical people. So, they are not able to modify XML file in the right way. Is there any tool, where they could localize B2C strings and generate TrustFrameworkLocalization.xml file? Something like BabelEdit for JSON?
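
For context, the entries in such a file look roughly like the fragment below (the element structure is the one Azure AD B2C custom policies use; the Id, StringId and text values here are made-up examples, not taken from the actual policy), which is what a translator would otherwise have to edit by hand:

import xml.etree.ElementTree as ET

# Illustrative fragment only; in a real policy these elements sit under
# <BuildingBlocks><Localization> and carry the cpim namespace.
sample = """
<LocalizedResources Id="api.signuporsignin.en">
  <LocalizedStrings>
    <LocalizedString ElementType="UxElement" StringId="button_signin">Sign in</LocalizedString>
    <LocalizedString ElementType="UxElement" StringId="social_intro">Sign in with your social account</LocalizedString>
  </LocalizedStrings>
</LocalizedResources>
"""

root = ET.fromstring(sample)
for s in root.iter("LocalizedString"):
    print(s.get("ElementType"), s.get("StringId"), "->", s.text)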

Answer 1

Score: 1

I asked GPT-4 to write a Python script which, for each LocalizedResources tag in the policy, outputs an XLSX file that has the Id of that tag as its name, with the LocalizedString attributes as columns. It only took 1-2 feedback rounds for it to work. I then took the English example and let people copy the sheet, localize the "Value" column, and give the sheet a name like "fr". Or you can just copy & paste the whole Value column into DeepL and it will localize the entire column.
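
For illustration, each sheet in the exported workbook ends up looking roughly like this; the column names follow the LocalizedString attributes, and the rows below are made-up examples rather than output from a real policy:

import pandas as pd

# Hypothetical rows just to show the layout translators get; only the "Value"
# column has to be edited for a new language.
df = pd.DataFrame([
    {"ElementType": "UxElement", "ElementId": None,
     "StringId": "button_signin", "Value": "Sign in"},
    {"ElementType": "UxElement", "ElementId": None,
     "StringId": "social_intro", "Value": "Sign in with your social account"},
])
print(df)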

I then asked it to write another script which reads in my Excel file, outputs XML that looks the way it is supposed to, and constructs a name for each "LocalizedResources" from api + filename + sheetname. I showed it what the XML looks like and it worked after 2 tries. I just had to tell it to remove elements where e.g. the StringId is empty. Then I copy-pasted the whole thing back into the policy.

This is the import script.

import pandas as pd
import os
import xml.etree.ElementTree as ET
from xml.dom import minidom
import argparse


def prettify(elem):
    """Return a pretty-printed XML string for the Element."""
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="  ")


def process_file(file_path):
    # Load the workbook
    workbook = pd.read_excel(file_path, sheet_name=None)

    base_name = os.path.basename(file_path)
    base_name_without_ext = os.path.splitext(base_name)[0]

    # Create root element for combined output
    combined_resources = ET.Element('LocalizedResources')

    # Process each sheet
    for sheet_name, data in workbook.items():
        # Create root element
        resources = ET.Element('LocalizedResources')
        resources.set(
            'Id', f'api.{base_name_without_ext}.{sheet_name.lower()}')

        # Create LocalizedStrings element
        strings = ET.SubElement(resources, 'LocalizedStrings')

        # Process each row
        for i, row in data.iterrows():
            localized_string = ET.SubElement(strings, 'LocalizedString')

            # Add attributes based on row data, convert NaN to ''
            localized_string.set('ElementType', str(
                row['ElementType']) if pd.notna(row['ElementType']) else '')

            if pd.notna(row['ElementId']):
                localized_string.set('ElementId', str(row['ElementId']))

            localized_string.set('StringId', str(
                row['StringId']) if pd.notna(row['StringId']) else '')

            localized_string.text = str(
                row['Value']) if pd.notna(row['Value']) else ''

        # Add processed resources to combined_resources
        combined_resources.append(resources)

    # Write to a single combined XML file
    xml_str = prettify(combined_resources)
    with open(f"{base_name_without_ext}_combined.xml", "w") as f:
        f.write(xml_str)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Process an XLSX file and generate a combined XML file.')
    parser.add_argument('-f', '--file', required=True,
                        help='Path to the input XLSX file.')
    args = parser.parse_args()
    process_file(args.file)
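
The script above still writes empty attributes when a spreadsheet cell is blank; the "remove elements where the StringId is empty" step mentioned earlier is not in it. A minimal sketch of that filter, assuming the same column layout as the exported sheets (this helper is an addition, not part of the original answer):

import pandas as pd

def drop_incomplete_rows(data: pd.DataFrame) -> pd.DataFrame:
    # Drop spreadsheet rows that have no StringId or no Value so they never
    # become empty <LocalizedString> elements.
    return data.dropna(subset=["StringId", "Value"])

Calling data = drop_incomplete_rows(data) at the top of the sheet loop in process_file would give that behaviour.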

That's the export script:

import xlsxwriter
import argparse
import xml.etree.ElementTree as ET
import pandas as pd

# Define command line arguments
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--file", help="Path to XML file", )
parser.add_argument("-m", "--mode", help="Mode: Import or Export")
args = parser.parse_args()

print(f"Parsing file: {args.file}")
tree = ET.parse(args.file)
root = tree.getroot()
localized_resources = root.findall(
    ".//{http://schemas.microsoft.com/online/cpim/schemas/2013/06}LocalizedResources")

# Create a new Excel file and add a worksheet.
print("Opening Workbook...")
print(f"Found {len(localized_resources)} localized resources")

writer = pd.ExcelWriter('AzureAD_B2C_Translations.xlsx', engine='xlsxwriter')

for resource in localized_resources:
    resource_id = resource.attrib['Id']
    localized_strings = resource.findall(
        ".//{http://schemas.microsoft.com/online/cpim/schemas/2013/06}LocalizedString")
    print(
        f"Resource {resource.attrib['Id']} has {len(localized_strings)} localized strings...")
    output = []
    for string in localized_strings:
        items = string.items()
        items.append(("Value", string.text))  # type: ignore
        output.append(dict(items))
    df = pd.DataFrame(output)

    df.to_excel(writer, sheet_name=resource_id[4:],
                startrow=0, header=True, index=False)

writer.close()
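
One caveat the export script does not handle: Excel limits worksheet names to 31 characters, so a very long LocalizedResources Id will make xlsxwriter raise an error when the sheet is created. A small, hypothetical guard (not in the original answer) would be to truncate the name before passing it to to_excel:

def safe_sheet_name(resource_id: str) -> str:
    # Strip the leading "api." prefix, as the script above does, and truncate to
    # Excel's 31-character worksheet-name limit.
    return resource_id[4:][:31]

Then use sheet_name=safe_sheet_name(resource_id) in the df.to_excel call.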

At first I wanted the export script to do both, but I found it easier to just have GPT write two scripts. Having it write the import script literally took 5 minutes with a well-articulated prompt.
