2023年5月7日 22:45:21go评论71阅读模式

英文:

if the name exists in the Excel, check if a folder for that name exists, if true copy the Folder

问题

这是您提供的Python代码的翻译部分：

我已经在这个问题上工作了几天，但没有运气。我有一些文件夹在另一个文件夹中，我需要检查这些文件夹的名称是否出现在Excel表格中。如果是的话，我想将文件复制到另一个文件夹中。这似乎是一个容易的任务，但我对Python还比较陌生，所以我感到迷茫。

以下是Excel文件中名称列表的示例：
[mike nicolai Tesla, Thomas Edison, Edie Morphy, Josef mike Tesla Johanssen]

目录的示例，这是一个总体架构 R:\this folder\Folder\Subfolder\lastName, first Name
R:\this folder\docs\GERMANY\Tesla Johanssen, Josef mike
R:\this folder\docs\GERMANY\Morphy, Edie
R:\this folder\docs\France\nicolai Tesla, Josef mike

我已经创建了以下两个函数，以便根据需要复制或提取名称。由于中间名称，我必须检查函数是否返回的不仅仅是名字的第一个和最后一个部分。为此，我有两个函数 *sort_Names(_name)* 和 *extract_Names_From_Folders(folder_path)*

```python
import os
from glob import glob
import shutil

def extract_Names_From_Folders(folder_path):
    folder_Name=folder_path.split(sep="\\")
    folder_Name=folder_Name[-1::]
    if len(''.join(map(str, folder_Name)).split(' '))>2:
        _lName=''.join(map(str, folder_Name)).split(" ")[0]
        _mName=''.join(map(str, folder_Name)).split(" ")[1]
        _fName=''.join(map(str, folder_Name)).split(" ")[2]
        return _fName,_mName,_lName
    elif len(''.join(map(str, folder_Name)).split(' '))>1:
        _lName=''.join(map(str, folder_Name)).split(" ")[0]
        _fName=''.join(map(str, folder_Name)).split(" ")[1]
        return _fName,_lName
    elif len(''.join(map(str, folder_Name)).split(' '))>0:
        _fName=''.join(map(str, folder_Name)).split(" ")[0]
        return _fName

def sort_Names(_name):
    if len(''.join(map(str, _name)).split(" "))==4:
        excel_fName2=''.join(map(str, _name)).split(" ")[3]
        excel_fName1=''.join(map(str, _name)).split(" ")[2]
        excel_mName= ''.join(map(str, _name)).split(" ")[0]
        excel_lName=''.join(map(str, _name)).split(" ")[1]
        return excel_fName1,excel_fName2,excel_mName,excel_lName
    elif len(''.join(map(str, _name)).split(" "))==3:
        excel_fName=''.join(map(str, _name)).split(" ")[2]
        excel_mName= ''.join(map(str, _name)).split(" ")[0]
        excel_lName=''.join(map(str, _name)).split(" ")[1]
        return excel_fName,excel_mName,excel_lName
    elif len(''.join(map(str, _name)).split(" "))==2:
        excel_fName=''.join(map(str, _name)).split(" ")[1]
        excel_lName=''.join(map(str, _name)).split(" ")[0]
        return excel_fName,excel_lName
    elif len(''.join(map(str, _name)).split(" "))==1:
        excel_fName=''.join(map(str, _name)).split(" ")[0]
        return excel_fName
    else: return [0]

def copy_the_file(folder_name,_name,df):
    index = _name.index(folder_name)
    df.at[_name, "User Request Form Created?\n[Yes/No]"] = "Yes"
    # Copy the subfolder to the Proof folder
    proof_folder_path = r"r:\blabla\Proof"
    shutil.copytree(folder_name, os.path.join(proof_folder_path, _name), dirs_exist_ok=True)

以下是主要部分。此代码当前不起作用，我不知道为什么。它变得如此复杂，如此之快。我相当确定可以以更简单和更Pythonic的方式完成。但我不知道如何做。

import pandas as pd
from glob import glob

# 设置文件路径
excel_file_path = r"k:\blabla\path.xlsx"
folder_path = r"r:\blabla\path\*\*&"
df = pd.read_excel(excel_file_path, header=1)

# 从“Name”列中获取名称列表
names = df["Name"].tolist()
for folder_name in glob(folder_path, recursive=True):
    # 检查函数是否返回3个或2个变量。函数返回一个元组
    mName=''
    tuple_=extract_Names_From_Folders(folder_name)
    if len(tuple_) == 3:
        fName = tuple_[0] if tuple_ else None
        mName= tuple_[1] if len(tuple_) > 1 else None
        lName= tuple_[2] if len(tuple_) > 2 else None
    elif len(tuple_) == 2:
        fName = tuple_[0] if tuple_ else None
        lName= tuple_[1] if len(tuple_) > 1 else None
    elif len(tuple_) == 1:
        fName = tuple_[0] if tuple_ else None

    for _name in names:
        if len(_name)>2:
            tuple_=sort_Names(_name)
            if len(tuple_) == 4:
                excel_fName2=tuple_[1] if tuple_ else None
                excel_fName1=tuple_[0] if tuple_ else None
                excel_mName= tuple_[2] if tuple_ else None
                excel_lName=tuple_[3] if tuple_ else None
                print(excel_fName1+' '+excel_fName2,excel_mName+' '+excel_lName)
                # 检查子文件夹是否与Excel文件中的名称匹配
                if excel_fName1+' '+excel_fName2 == fName and excel_mName+" "+excel_lName == lName:
                    copy_the_file(folder_name,_name,df)
            if len(tuple_) == 3:
                excel_fName=tuple_[0] if tuple_ else None
                excel_mName= tuple_[1] if tuple_ else None
                excel_lName=tuple_[2] if tuple_ else None
                print(excel_f

<details>
<summary>英文:</summary>

I&#39;ve been working on this for days but no luck. I have some folder in another folder and I need to check if the Name of these folders appear on a Excel sheet. If so I want to copy the files to another folder. It seems easy job to be, but I&#39;m kind of new to python so I&#39;m lost. 

An Example of the list of names in excel file:

    [mike nicolai Tesla, Thomas Edison, Edie Morphy, Josef mike Tesla Johanssen]

an example of the directories, this is a over all schema R:\this folder\Folder\Subfolder\lastName, first Name

    R:\this folder\docs\GERMANY\Tesla Johanssen, Josef mike
    R:\this folder\docs\GERMANY\Morphy, Edie   
    R:\this folder\docs\France\nicolai Tesla, Josef mike

   


I have made these Functions, in order to Copy or extract the names accordingly. Due to the middle names, I had to check if the Function returns more than just First and Last Name. I have 2 functions for that purpose *sort_Names(_name)* &amp; *extract_Names_From_Folders(folder_path)*

    import os
    from glob import glob
    import shutil

    def extract_Names_From_Folders(folder_path):
        folder_Name=folder_path.split(sep=&quot;\\&quot;)
        folder_Name=folder_Name[-1::]
        if len(&#39;&#39;.join(map(str, folder_Name)).split(&#39; &#39;))&gt;2:
            _lName=&#39;&#39;.join(map(str, folder_Name)).split(&quot; &quot;)[0]
            _mName=&#39;&#39;.join(map(str, folder_Name)).split(&quot; &quot;)[1]
            _fName=&#39;&#39;.join(map(str, folder_Name)).split(&quot; &quot;)[2]
            return _fName,_mName,_lName
        elif len(&#39;&#39;.join(map(str, folder_Name)).split(&#39; &#39;))&gt;1:
            _lName=&#39;&#39;.join(map(str, folder_Name)).split(&quot; &quot;)[0]
            _fName=&#39;&#39;.join(map(str, folder_Name)).split(&quot; &quot;)[1]            
            return _fName,_lName
        elif len(&#39;&#39;.join(map(str, folder_Name)).split(&#39; &#39;))&gt;0:
            _fName=&#39;&#39;.join(map(str, folder_Name)).split(&quot; &quot;)[0]  
            return _fName
        
    def sort_Names(_name):
        if len(&#39;&#39;.join(map(str, _name)).split(&quot; &quot;))==4:  
            excel_fName2=&#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[3]
            excel_fName1=&#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[2]
            excel_mName= &#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[0]
            excel_lName=&#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[1]
            return excel_fName1,excel_fName2,excel_mName,excel_lName
        elif len(&#39;&#39;.join(map(str, _name)).split(&quot; &quot;))==3:  
            excel_fName=&#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[2]
            excel_mName= &#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[0]
            excel_lName=&#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[1]
            return excel_fName,excel_mName,excel_lName
        elif len(&#39;&#39;.join(map(str, _name)).split(&quot; &quot;))==2:
            excel_fName=&#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[1]        
            excel_lName=&#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[0]
            return excel_fName,excel_lName
        elif len(&#39;&#39;.join(map(str, _name)).split(&quot; &quot;))==1:
            excel_fName=&#39;&#39;.join(map(str, _name)).split(&quot; &quot;)[0]        
            return excel_fName
        else: return [0]
    
    
    def copy_the_file(folder_name,_name,df):    
        index = _name.index(folder_name)        
        df.at[_name, &quot;User Request Form Created?\n[Yes/No]&quot;] = &quot;Yes&quot;
        # Copy the subfolder to the Proof folder
        proof_folder_path = r&quot;r:\blabla\Proof&quot;
        shutil.copytree(folder_name, os.path.join(proof_folder_path, _name), dirs_exist_ok=True)


Here&#39;s the main part. This code doesn&#39;t currently work I have no idea *why*. It got complex so fast. I&#39;m pretty sure it can be done in easier and more pythonic way. But have no clue how. 

    import pandas as pd
    from glob import glob
    
    # Set up file paths
    excel_file_path = r&quot;k:\blabla\path.xlsx&quot;
    folder_path = r&quot;r:\blabla\path\*\*&quot;
    df = pd.read_excel(excel_file_path, header=1)
    
    # Get the list of names from the &quot;Name&quot; column
    names = df[&quot;Name&quot;].tolist()
    for folder_name in glob(folder_path, recursive = True):
        #Check if function returns 3 or 2 variables. A function returns a Tuple
        mName=&#39;&#39;
        tuple_=extract_Names_From_Folders(folder_name)
        if len(tuple_) == 3:
            fName = tuple_[0] if tuple_ else None
            mName= tuple_[1] if len(tuple_) &gt; 1 else None
            lName= tuple_[2] if len(tuple_) &gt; 2 else None
        elif len(tuple_) == 2:
            fName = tuple_[0] if tuple_ else None        
            lName= tuple_[1] if len(tuple_) &gt; 1 else None
        elif len(tuple_) == 1:
            fName = tuple_[0] if tuple_ else None
    
        #convert list to string then split by comma
        # print(&#39;&#39;.join(map(str, folder_Name)).split(&#39;,&#39;)[0])
        
        for _name in names: 
            if len(_name)&gt;2:
                tuple_=sort_Names(_name)
                if len(tuple_) == 4:
                    excel_fName2=tuple_[1] if tuple_ else None
                    excel_fName1=tuple_[0] if tuple_ else None
                    excel_mName= tuple_[2] if tuple_ else None
                    excel_lName=tuple_[3] if tuple_ else None
                    print(excel_fName1+&#39; &#39;+excel_fName2,excel_mName+&#39; &#39;+excel_lName)      
      # Check if the subfolder matches a name in the Excel file      
                    if excel_fName1+&#39; &#39;+excel_fName2 == fName and excel_mName+&quot; &quot;+excel_lName == lName:       
                        copy_the_file(folder_name,_name,df)
                if len(tuple_) == 3:            
                    excel_fName=tuple_[0] if tuple_ else None
                    excel_mName= tuple_[1] if tuple_ else None
                    excel_lName=tuple_[2] if tuple_ else None
                    print(excel_fName+&#39; &#39;+excel_mName,excel_lName)
                    if excel_fName == fName and excel_mName+&quot; &quot;+excel_lName == lName:       
                        copy_the_file(folder_name,_name,df)
                elif len(tuple_) == 2:
                    excel_fName = tuple_[0] if tuple_ else None        
                    excel_lName = tuple_[1] if len(tuple_) &gt; 1 else None
                    print(excel_fName +&#39; &#39;+excel_lName)            
                    if excel_fName==fName and excel_lName == lName:
                        copy_the_file(folder_name,_name,df)
                elif len(tuple_) == 1:
                    fName = tuple_[0] if tuple_ else None


</details>


# 答案1
**得分**: 1

你可以使用 `difflib` 模块来找到相似的名称。通过在 `.get_close_matches` 方法中匹配，你可以尝试找到最接近的值。对于你提供的数据，`cutoff=.5` 是一个合适的值。如果这不起作用，你可以对数据框中的单词进行排序，将它们转换为相同的大小写，以使比较更准确。如果这仍然不起作用，你可以使用第三方库来比较字符串。

<details>
<summary>英文:</summary>

You can use the `difflib` module to find similar names. By matching in the `.get_close_matches` method, you can try to find the closest value. For the data you gave, `cutoff=.5` came up. If it doesn&#39;t work out, you can sort the words in the dataframe lines, convert them to the same case, so that the comparisons are more accurate. If this fails, you can use third-party libraries to compare strings.

import pandas as pd
import glob
import shutil
import difflib

excel_file_path = r'test.xlsx'
folder_path = r'F:\this folder\docs'
proof_folder_path = r'F:\this folder\Proof'

df_excel = pd.read_excel(excel_file_path, names=['name'], dtype='string')

pathes = glob.glob(rf'{folder_path}**')

We create lists, one with a partial path to the folders, the second with their names.

part_path = [part for path in pathes for part in path.split(sep=r"\docs")[-1::]]
folder_name = [name for path in pathes for name in path.split(sep="\")[-1::]]

We collect everything in another dataframe.

dict_folder = {'name': folder_name, 'path': pathes, 'part_path': part_path}
df_folder = pd.DataFrame(dict_folder, dtype='string')

We start a cycle of checking folder names and overwriting the necessary ones.

for name in df_excel['name']:
# You may have to match cutoff=, so for data that is 0.4 selected and unnecessary, 0.6 did not select any.
find_it = difflib.get_close_matches(name, df_folder['name'], n=1, cutoff=.5)

# If the name matches, overwrite the subfolder in a new folder.
if find_it:
df_path = df_folder.loc[df_folder[&#39;name&#39;] == find_it[0]]
shutil.copytree(df_path[&#39;path&#39;].values[0],
f&quot;{proof_folder_path}{df_path[&#39;part_path&#39;].values[0]}&quot;,
dirs_exist_ok=True)


</details>

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

if the name exists in the Excel, check if a folder for that name exists, if true copy the Folder.

问题

We create lists, one with a partial path to the folders, the second with their names.

We collect everything in another dataframe.

We start a cycle of checking folder names and overwriting the necessary ones.

文本框不会移动

如何在discord.py的提交机器人中添加接受/拒绝按钮？

Python多线程在任务处理完成后无法终止。

无法确定 sys.argv 的模块路径错误 – 每个 Python 程序都返回此错误。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论