Template literals not being read properly in Python and returning Errno 22, Invalid argument
Question
I am writing a very simple function in Python that the user calls with one parameter, website. The function simply creates an empty Markdown file named after the website, for example www.example.com.md.
For this purpose I am trying to use a template literal (an f-string, f''), but Python returns an invalid argument error when I do. When I hardcode the website as 'www.example.com' it works fine.
So how do you get template strings to work in Python without returning an error?
def fileCreator(website):
    outputFile = open(f'{website}.md', 'w')
    print(outputFile)

fileCreator('https://www.example.com/')
Answer 1
Score: 3
File names on Windows can't contain a forward slash, and the colon in https: is not allowed either. You can use the re library in Python to replace characters such as spaces and slashes so that the file name is valid.
Edit: modified to keep the original link in another variable while also printing the final, modified file name.
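import re

def fileCreator(website):
    original_url = website
    # Replace everything except word characters, whitespace and hyphens with '-'
    website = re.sub(r'[^\w\s-]', '-', website).strip().lower()
    # Collapse runs of hyphens and whitespace into a single '-'
    website = re.sub(r'[-\s]+', '-', website)
    # The with block closes the file once it has been created
    with open(f'{website}.md', 'w') as outputFile:
        print(outputFile)
    return original_url

original_url = fileCreator('https://www.example.com/')
print(original_url)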
Output:
<_io.TextIOWrapper name='https-www-example-com-.md' mode='w' encoding='cp1252'>
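To see which characters make the original file name invalid, a small check like the one below may help. It is only a sketch: the helper name invalid_filename_chars and the constant WINDOWS_RESERVED are made up for illustration, and the set covers the characters Windows reserves in file names, not reserved device names such as CON or NUL.

```python
# Characters that Windows does not allow inside a file name.
WINDOWS_RESERVED = set('<>:"/\\|?*')

def invalid_filename_chars(name):
    """Return the sorted list of characters in name that Windows rejects."""
    return sorted(set(name) & WINDOWS_RESERVED)

# The name the question's f-string produces contains both '/' and ':'.
print(invalid_filename_chars('https://www.example.com/.md'))
# The hardcoded name from the question contains none of them.
print(invalid_filename_chars('www.example.com.md'))
```

The f-string itself is not the problem here: it produces exactly the string it was given, and that string is what open() rejects.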
Answer 2
Score: 3
The problem is simply that the file name contains a forward slash ("/").
So either pass the website name without such characters, like this:
def fileCreator(website):
    outputFile = open(f'{website}.md', 'w')
    print(outputFile)

fileCreator('www.example.com')
Or remove them in the "fileCreator" function, like this:
def fileCreator(website):
    website = website.replace("https://","")
    website = website.replace("/","")
    outputFile = open(f'{website}.md', 'w')
    print(outputFile)

fileCreator('https://www.example.com/')
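The replace calls above only handle https:// and would leave the colon in place for an http:// URL. A slightly more general variant, sketched here under the assumption that any scheme should be dropped and inner slashes turned into hyphens, could use str.partition. The helper name strip_scheme is hypothetical.

```python
def strip_scheme(website):
    """Drop a leading 'scheme://' if present and make the rest file-name friendly."""
    # partition returns (before, separator, after); separator is '' when '://' is absent
    before, sep, after = website.partition('://')
    host_and_path = after if sep else before
    # Remove leading/trailing slashes, then replace the remaining ones with hyphens
    return host_and_path.strip('/').replace('/', '-')

print(strip_scheme('https://www.example.com/'))
print(strip_scheme('http://example.com/a/b'))
print(strip_scheme('www.example.com'))
```

This still does not handle query strings or ports, which can contain other characters that are invalid in file names.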
Answer 3
Score: 0
Use the urllib.parse module from the standard library to encode a URL into a valid file name. It also provides a way to decode it back into a valid URL.
import urllib.parse

url = "https://docs.python.org/3/library/urllib.parse.html"

# as valid path
url2path = urllib.parse.quote_plus(url)
print(url2path)
# https%3A%2F%2Fdocs.python.org%2F3%2Flibrary%2Furllib.parse.html

# as url (unquote_plus is the inverse of quote_plus)
path2url = urllib.parse.unquote_plus(url2path)
print(path2url)
# https://docs.python.org/3/library/urllib.parse.html
A URL is made of different components (scheme, hostname, ...), which are easily accessible as attributes of the urllib.parse.ParseResult named tuple.
r = urllib.parse.urlparse(url) # ParseResult object
print(r)
#ParseResult(scheme='https', netloc='docs.python.org', path='/3/library/urllib.parse.html', params='', query='', fragment='')
print(r.hostname)
#docs.python.org
So if you care only about the "website name", use the hostname attribute of the result of urlparse(). I strongly discourage solutions based on direct string manipulation such as regex. First try what the standard library offers for the task!
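Putting this together with the function from the question, a minimal sketch might look like the following. The function name create_markdown_for and its directory parameter are made up for illustration, and the sketch assumes the URL includes a scheme such as https://, because urlparse only fills in hostname when it does.

```python
import os
import tempfile
from urllib.parse import urlparse

def create_markdown_for(url, directory='.'):
    """Create an empty '<hostname>.md' file in directory and return its path."""
    hostname = urlparse(url).hostname
    if hostname is None:
        # Happens for inputs without a scheme, e.g. 'www.example.com'
        raise ValueError(f'no hostname found in {url!r}')
    path = os.path.join(directory, f'{hostname}.md')
    # Opening in 'w' mode creates the empty file; the with block closes it again
    with open(path, 'w'):
        pass
    return path

# Usage: write into a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    created = create_markdown_for('https://www.example.com/', tmp)
    print(os.path.basename(created))
```

Raising on a missing hostname is one possible design choice; falling back to the raw input would be another, at the cost of reintroducing the invalid-character problem.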