Function does not return expected dataframe output
Question
I'm new to programming, Python, NLP, and Stack Overflow, so I'm grateful for your patience!
I have written a function that extracts text from several PDF files and builds a pandas dataframe from it, along with other details from the original PDF files.
Outside of a function, these steps work well, but once I 'package' them in a function I can't get the output to work and the resulting dataframe stays empty. I'm clearly missing something, please help!
Here's the function (it accesses the text of the relevant numbered file, identified by Trustcode and year):
    def Accessingtxt_func(Trustcode):
        DFTrust_Text1=pd.DataFrame(columns=['Text','Month','Year','Type', 'Trustcode'])
        for year in range(2021,2023):
            with open(os.path.join(mypath,f'{ts}Trust{Trustcode}-{year}a.txt'), 'w', encoding='utf-8') as fw:
                txt_content = extract_text(f'Trust{Trustcode}-{year}a.pdf')
                fw.write(txt_content)
                txt_content= txt_content.split('\n\n')
                DFTrust_Text1=DFTrust_Text1.append({'Text': txt_content, 'Year': {year}, 'Month':9, 'Type':1, 'Trustcode':Trustcode},ignore_index=True)
                return DFTrust_Text1
            year=year+1
        return DFTrust_Text1
The function compiles fine, and I then run it in a loop like this:
    for Trustcode in range(12,14):
        print(Trustcode)
        Accessingtxt_func(Trustcode)
        DFTrust_Text1.head()
This also runs fine, but the dataframe head is not displayed when the function is called at each step of the loop, and I don't know why.
I then display the dataframe after the loop like so:
    DFTrust_Text1.head()
But I get an empty dataframe shell rather than the expected dataframe with rows for Trustcodes 12 and 13 and years 2021 and 2022:
    Text Month Year Type Trustcode
I've tried positioning the dataframe inside and outside the function, and as a global or local variable, but can't get it to work. Thanks for your help.
Answer 1
Score: 0
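You need to assign the dataframe returned by the function to a variable when calling it:
    newdf = Accessingtxt_func(Trustcode)
    newdf.head()
The DFTrust_Text1 created inside the function is local to it, so the DFTrust_Text1 you inspect outside the function is a different object that never receives any rows.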
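Beyond capturing the return value, the function in the question appears to have two further issues: the return inside the loop body exits after the first year, and `{year}` builds a one-element set rather than storing the integer. The following is a minimal sketch of one way to restructure it, not a definitive fix. It collects plain row dictionaries and leaves out the step that writes the .txt file; `build_rows`, `fake_extract`, and `all_rows` are hypothetical names introduced for illustration, and `fake_extract` merely stands in for the question's `extract_text` so the example runs without any PDF files.

```python
from typing import Any, Callable, Dict, List

COLUMNS = ["Text", "Month", "Year", "Type", "Trustcode"]


def build_rows(trustcode: int, extract: Callable[[str], str]) -> List[Dict[str, Any]]:
    """Return one row per year for a single Trustcode."""
    rows = []
    for year in range(2021, 2023):
        text = extract(f"Trust{trustcode}-{year}a.pdf")
        rows.append({
            "Text": text.split("\n\n"),
            "Month": 9,
            "Year": year,  # plain int; {year} would create a one-element set
            "Type": 1,
            "Trustcode": trustcode,
        })
    # Return only once the loop has finished; a return inside the loop
    # body would exit after the first year.
    return rows


def fake_extract(filename: str) -> str:
    # Hypothetical stand-in for the PDF extraction step (assumption).
    return f"first paragraph of {filename}\n\nsecond paragraph"


all_rows = []
for trustcode in range(12, 14):
    # Keep the return value; a bare call would discard it.
    all_rows.extend(build_rows(trustcode, fake_extract))

print(len(all_rows))
print([(r["Trustcode"], r["Year"]) for r in all_rows])
```

With pandas available, `pd.DataFrame(all_rows, columns=COLUMNS)` turns the collected rows into a single dataframe. Building the dataframe once from a list also avoids `DataFrame.append`, which was removed in pandas 2.0.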