February 16, 2023, 11:15:29

Convert a string that contains a substring to a dictionary

# Question

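I am trying to convert strings in a particular format into Python dictionaries.
The string format looks like this:
```python
st1 = 'key1 key2=value2 key3="key3.1, key3.2=value3.2 , key3.3 = value3.3, key3.4" key4'
```

I want to parse it and convert it into the following dictionary:

```python
dict1 = {
    'key1': None,
    'key2': 'value2',
    'key3': {
        'key3.1': None,
        'key3.2': 'value3.2',
        'key3.3': 'value3.3',
        'key3.4': None,
    },
    'key4': None,
}
```

I tried using Python's `re` package and the string `split` function, but could not get this result. I have thousands of strings in the same format and want to automate the conversion. Could someone help?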


# Answer 1
**Score**: 0

If all of your strings are consistent and have only one level of sub-dictionary, the code below should do the trick, though you may need to tweak it.

```python
import json

st1 = 'key1 key2=item2 key3="key3.1, key3.2=item3.2 , key3.3 = item3.3, key3.4" key4'

# Normalize spacing around "=" and "," so that splitting on spaces is reliable.
st1 = st1.replace(' = ', '=')
st1 = st1.replace(' ,', ',')

new_dict = {}
st1 = st1.strip()
while st1:
    # The next token runs up to the first space (or to the end of the string).
    item = st1.split(' ')[0]

    if '="' in item:
        # Quoted value: everything between the quotes becomes a sub-dictionary.
        key = item.split('=')[0]
        sub_items = st1.split('"')[1]
        new_dict[key] = {}
        for sub_item in sub_items.split(','):
            if '=' in sub_item:
                sub_key, sub_value = sub_item.split('=', 1)
                new_dict[key][sub_key.strip()] = sub_value.strip()
            else:
                new_dict[key][sub_item.strip()] = None
        # Consume the key, the "=", and the quoted part including both quotes.
        consumed = len(key) + 1 + len(sub_items) + 2
    elif '=' in item:
        # Plain key=value pair.
        key, value = item.split('=', 1)
        new_dict[key] = value
        consumed = len(item)
    else:
        # Bare key without a value.
        new_dict[item] = None
        consumed = len(item)

    # Drop only what was just parsed, instead of replacing every occurrence.
    st1 = st1[consumed:].lstrip()

print(json.dumps(new_dict, indent=4))
```
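
Since the question mentions trying `re`, here is a more compact variant of the same idea as a rough sketch. It assumes a single level of nesting and no escaped quotes inside a quoted group; the names `ENTRY`, `parse_group` and `parse_line` are made up for illustration and are not from the answer above.

```python
import re

# One top-level entry: a key, optionally followed by "=" and either a
# double-quoted group or a bare value.
ENTRY = re.compile(r'([^\s=",]+)(?:\s*=\s*(?:"([^"]*)"|([^\s"]+)))?')


def parse_group(text):
    """Parse the comma-separated contents of a quoted group into a dict."""
    result = {}
    for part in text.split(','):
        key, eq, value = part.partition('=')
        if key.strip():
            # A sub-key without "=" maps to None.
            result[key.strip()] = value.strip() if eq else None
    return result


def parse_line(text):
    """Parse one line of space-separated keys and key=value entries."""
    result = {}
    for match in ENTRY.finditer(text):
        key, group, bare = match.groups()
        if group is not None:
            result[key] = parse_group(group)
        else:
            result[key] = bare  # None when the key has no value
    return result


st1 = 'key1 key2=value2 key3="key3.1, key3.2=value3.2 , key3.3 = value3.3, key3.4" key4'
print(parse_line(st1))
# {'key1': None, 'key2': 'value2', 'key3': {'key3.1': None, 'key3.2': 'value3.2', 'key3.3': 'value3.3', 'key3.4': None}, 'key4': None}
```

Because the regular expression consumes each entry exactly once, there is no need to strip parsed text from the input string by hand. Inputs that nest deeper than one level would need a real parser instead.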


# Answer 2
**Score**: 0

Consider using a parsing tool such as lark. A simple example for your case:

```python
from lark import Lark

_grammar = r'''
    ?start: value

    ?value: object
           | NON_SEPARATOR_STRING?
    object : "\"" [pair (_SEPARATOR pair)*] "\""
    pair : NON_SEPARATOR_STRING [_PAIRTOR] value

    NON_SEPARATOR_STRING: /[a-zA-Z0-9\.]+/
    _SEPARATOR: /[,  ]+/
            | ","
    _PAIRTOR: " = "
            | "="
'''
parser = Lark(_grammar)
st1 = 'key1 key2=value2 key3="key3.1, key3.2=value3.2 , key3.3 = value3.3, key3.4" key4'
# Wrap the whole string in quotes so the top level parses as an object.
tree = parser.parse(f'"{st1}"')
print(tree.pretty())
"""
object
  pair
    key1
    value
  pair
    key2
    value2
  pair
    key3
    object
      pair
        key3.1
        value
      pair
        key3.2
        value3.2
      pair
        key3.3
        value3.3
      pair
        key3.4
        value
  pair
    key4
    value
"""
```

Then you can write your own Transformer to transform this tree into your desired data type.
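
As a rough sketch of that last step: a lark `Transformer` subclass would be the usual route, but the version below swaps in a plain recursive walk that only relies on each node having `.data` and `.children`. It assumes the tree has exactly the shape printed above. `Node` is a hypothetical stand-in for lark's tree class so the example runs without lark installed, and `tree_to_dict` is an illustrative name.

```python
class Node:
    """Hypothetical stand-in for a lark tree node: a rule name plus children."""

    def __init__(self, data, children=()):
        self.data = data
        self.children = list(children)


def tree_to_dict(node):
    """Recursively convert an object/pair/value tree into dicts, strings and None."""
    if node is None:
        return None
    if isinstance(node, str):  # lark tokens are str subclasses
        return str(node)
    if node.data == 'value':  # empty value: a key with nothing after it
        return None
    if node.data == 'object':
        result = {}
        for pair in node.children:
            if pair is None:  # skip placeholders for unmatched optionals
                continue
            key, value = pair.children
            result[str(key)] = tree_to_dict(value)
        return result
    raise ValueError(f'unexpected node: {node.data!r}')


# Hand-built copy of part of the tree printed above.
sample = Node('object', [
    Node('pair', ['key1', Node('value')]),
    Node('pair', ['key2', 'value2']),
    Node('pair', ['key3', Node('object', [
        Node('pair', ['key3.1', Node('value')]),
        Node('pair', ['key3.2', 'value3.2']),
    ])]),
])
print(tree_to_dict(sample))
# {'key1': None, 'key2': 'value2', 'key3': {'key3.1': None, 'key3.2': 'value3.2'}}
```

If the parsed tree really has this shape, calling `tree_to_dict(tree)` on the parser's result should produce the nested dictionary from the question, though that is untested here against lark itself.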

