

I have a TrustFrameworkLocalization.xml file that contains localized strings. For now it is English only, but the number of strings is quite large. My colleagues on the localization team are not technical, so they cannot reliably edit the XML file by hand. Is there a tool in which they could localize the B2C strings and then generate the TrustFrameworkLocalization.xml file? Something like BabelEdit for JSON?


I asked GPT-4 to write a Python script that, for each LocalizedResources tag in the policy, outputs an XLSX file named after the Id of that tag, with the LocalizedString attributes as columns. It took only one or two feedback rounds to get working. I then took the English example and had people copy the sheet, localize the "Value" column, and give the new sheet a name like "fr". Alternatively, you can copy and paste the whole Value column into DeepL and it will translate the entire column.
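To make the export idea concrete, here is a minimal sketch of how one LocalizedString element can become one spreadsheet row. It uses only the standard library; the sample XML and the helper name `strings_to_rows` are made up for illustration and are not taken from my policy.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real policy file carries many more strings.
SAMPLE = """\
<LocalizedResources Id="api.signuporsignin.en">
  <LocalizedStrings>
    <LocalizedString ElementType="UxElement" StringId="heading">Sign in</LocalizedString>
    <LocalizedString ElementType="ClaimType" ElementId="email" StringId="DisplayName">Email Address</LocalizedString>
  </LocalizedStrings>
</LocalizedResources>
"""


def strings_to_rows(resource):
    """Turn each LocalizedString element into one row (a dict of columns)."""
    rows = []
    # "{*}" matches the tag in any namespace or none (Python 3.8+).
    for string in resource.findall(".//{*}LocalizedString"):
        row = dict(string.attrib)          # ElementType, ElementId, StringId
        row["Value"] = string.text or ""   # the text the translators edit
        rows.append(row)
    return rows


resource = ET.fromstring(SAMPLE)
rows = strings_to_rows(resource)
for row in rows:
    print(row)
```

Each dict has the attribute names as keys plus "Value", which is the only column the translators need to touch.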

I then asked it to write a second script that reads my Excel file and outputs an XML file in the expected structure, constructing the Id of each "LocalizedResources" element from api + filename + sheet name. I showed it what the XML should look like and it worked after two tries. I just had to tell it to remove elements where, e.g., the StringId is empty. Then I copied and pasted the whole thing back into the policy.
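The naming and filtering steps can be sketched roughly as follows. This is a simplified illustration with plain dicts standing in for spreadsheet rows, not the actual script; the function name `build_resources` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_resources(file_stem, sheet_name, rows):
    """Build one LocalizedResources element from spreadsheet-like rows."""
    resources = ET.Element("LocalizedResources")
    # Id is assembled as api + filename + sheet name (the language code).
    resources.set("Id", f"api.{file_stem}.{sheet_name.lower()}")
    strings = ET.SubElement(resources, "LocalizedStrings")
    for row in rows:
        string_id = (row.get("StringId") or "").strip()
        if not string_id:
            continue  # drop rows whose StringId is empty
        element = ET.SubElement(strings, "LocalizedString")
        element.set("ElementType", row.get("ElementType") or "")
        if row.get("ElementId"):
            element.set("ElementId", row["ElementId"])
        element.set("StringId", string_id)
        element.text = row.get("Value") or ""
    return resources


# Hypothetical rows, as they might come out of a sheet named "FR".
rows = [
    {"ElementType": "UxElement", "StringId": "heading", "Value": "Connexion"},
    {"ElementType": "UxElement", "StringId": "", "Value": "orphan"},
]
resources = build_resources("signuporsignin", "FR", rows)
xml_text = ET.tostring(resources, encoding="unicode")
print(xml_text)
```

The second row is skipped because it has no StringId, so only one LocalizedString ends up in the output.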

This is the import script (XLSX to XML):

    import argparse
    import os
    import xml.etree.ElementTree as ET
    from xml.dom import minidom

    import pandas as pd


    def prettify(elem):
        """Return a pretty-printed XML string for the Element."""
        rough_string = ET.tostring(elem, 'utf-8')
        reparsed = minidom.parseString(rough_string)
        return reparsed.toprettyxml(indent=" ")


    def process_file(file_path):
        # Load every sheet of the workbook as {sheet_name: DataFrame}
        workbook = pd.read_excel(file_path, sheet_name=None)
        base_name = os.path.basename(file_path)
        base_name_without_ext = os.path.splitext(base_name)[0]

        # Wrapper element that only groups the output for copy and paste
        combined_resources = ET.Element('LocalizedResources')

        # Each sheet (e.g. "en", "fr") becomes one LocalizedResources element
        for sheet_name, data in workbook.items():
            resources = ET.Element('LocalizedResources')
            resources.set(
                'Id', f'api.{base_name_without_ext}.{sheet_name.lower()}')
            strings = ET.SubElement(resources, 'LocalizedStrings')

            # Each row becomes one LocalizedString element
            for _, row in data.iterrows():
                # Skip rows without a StringId (e.g. blank rows)
                if pd.isna(row['StringId']) or not str(row['StringId']).strip():
                    continue
                localized_string = ET.SubElement(strings, 'LocalizedString')
                # Add attributes based on row data, convert NaN to ''
                localized_string.set('ElementType', str(
                    row['ElementType']) if pd.notna(row['ElementType']) else '')
                if pd.notna(row['ElementId']):
                    localized_string.set('ElementId', str(row['ElementId']))
                localized_string.set('StringId', str(row['StringId']))
                localized_string.text = str(
                    row['Value']) if pd.notna(row['Value']) else ''

            combined_resources.append(resources)

        # Write as UTF-8 so translated (non-ASCII) strings survive on any platform
        xml_str = prettify(combined_resources)
        with open(f"{base_name_without_ext}_combined.xml", "w", encoding="utf-8") as f:
            f.write(xml_str)


    if __name__ == '__main__':
        parser = argparse.ArgumentParser(
            description='Process an XLSX file and generate a combined XML file.')
        parser.add_argument('-f', '--file', required=True,
                            help='Path to the input XLSX file.')
        args = parser.parse_args()
        process_file(args.file)

This is the export script (XML to XLSX):

    import argparse
    import xml.etree.ElementTree as ET

    import pandas as pd  # the xlsxwriter package must be installed for the engine below

    # Define command line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument("-f", "--file", required=True, help="Path to XML file")
    parser.add_argument("-m", "--mode", help="Mode: Import or Export (currently unused)")
    args = parser.parse_args()

    print(f"Parsing file: {args.file}")
    tree = ET.parse(args.file)
    root = tree.getroot()

    # "{*}" matches the tag in any namespace or none (requires Python 3.8+)
    localized_resources = root.findall(".//{*}LocalizedResources")
    print(f"Found {len(localized_resources)} localized resources")

    # Create a new Excel file with one worksheet per LocalizedResources element
    print("Opening Workbook...")
    with pd.ExcelWriter('AzureAD_B2C_Translations.xlsx', engine='xlsxwriter') as writer:
        for resource in localized_resources:
            resource_id = resource.attrib['Id']
            localized_strings = resource.findall(".//{*}LocalizedString")
            print(
                f"Resource {resource_id} has {len(localized_strings)} localized strings...")

            # One row per LocalizedString: its attributes plus the text as "Value"
            output = []
            for string in localized_strings:
                row = dict(string.attrib)
                row["Value"] = string.text
                output.append(row)

            # Sheet name is the Id without its "api." prefix
            # (Excel allows at most 31 characters in a sheet name)
            df = pd.DataFrame(output)
            df.to_excel(writer, sheet_name=resource_id[4:],
                        startrow=0, header=True, index=False)

At first I wanted the export script to do both, but I found it easier to have GPT write two scripts. Having it write the import script literally took five minutes with a well-articulated prompt.

