Template Literals Not Being read properly in Python and returning: Errno 22, Invalid Argument

Question

I am writing a very simple Python function that the user calls with one parameter, website. The function simply creates an empty Markdown file named after the website, for example www.example.com.md.

To build the file name I am using a template literal (an f-string, f''), but Python returns "Invalid argument" when I do. When I hardcode the website as 'www.example.com' it works fine.

So how do you get template strings to work in Python without returning an error?

    def fileCreator(website):
        outputFile = open(f'{website}.md', 'w')

        print(outputFile)

    fileCreator('https://www.example.com/')

Answer 1

Score: 3

File names on Windows can't contain a forward slash (or a colon, which the URL also contains). You can use the re module in Python to replace characters such as spaces and slashes so that the result is a valid file name.

Edit: modified to keep the original link in a separate variable while also printing the final, modified file name.

    import re

    def fileCreator(website):
        original_url = website
        # Replace every character that is not a word character, whitespace or hyphen
        website = re.sub(r'[^\w\s-]', '-', website).strip().lower()
        # Collapse runs of hyphens and whitespace into a single hyphen
        website = re.sub(r'[-\s]+', '-', website)
        # The with block closes the file once the function is done with it
        with open(f'{website}.md', 'w') as outputFile:
            print(outputFile)
        return original_url

    original_url = fileCreator('https://www.example.com/')
    print(original_url)

Output:

    <_io.TextIOWrapper name='https-www-example-com-.md' mode='w' encoding='cp1252'>
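
To see why the f-string itself is not at fault, the following minimal sketch may help. It checks that the f-string yields exactly the same str as plain concatenation, then lists which characters in the resulting name Windows rejects. The names WINDOWS_FORBIDDEN and forbidden_chars are hypothetical helpers introduced only for this illustration; the character set follows the Windows file-naming rules.

```python
# Characters that Windows does not allow inside a file name.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')


def forbidden_chars(filename):
    """Return the characters in `filename` that Windows rejects, sorted."""
    return sorted(set(filename) & WINDOWS_FORBIDDEN)


website = 'https://www.example.com/'

# The f-string is evaluated before open() is called, so open() only ever
# sees an ordinary str; both spellings build the same file name.
assert f'{website}.md' == website + '.md'

print(forbidden_chars(f'{website}.md'))       # ['/', ':']
print(forbidden_chars('www.example.com.md'))  # []
```

The hardcoded 'www.example.com' works simply because it contains none of these characters.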

Answer 2

Score: 3

The problem is simply that the file name contains forward slashes ("/").
So either pass the website name without such characters, like this:

    def fileCreator(website):
        outputFile = open(f'{website}.md', 'w')

        print(outputFile)

    fileCreator('www.example.com')

Or remove them inside the fileCreator function, like this:

    def fileCreator(website):
        website = website.replace("https://", "")
        website = website.replace("/", "")
        outputFile = open(f'{website}.md', 'w')

        print(outputFile)

    fileCreator('https://www.example.com/')
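
The replace-based version above only handles URLs that start with https://. As a hedged variation on the same idea, the sketch below drops whatever scheme precedes "://" before removing the slashes. The function name strip_scheme_and_slashes is hypothetical, and this is still plain string manipulation, so it makes no attempt to deal with ports, query strings or other forbidden characters.

```python
def strip_scheme_and_slashes(website):
    """Drop a leading '<scheme>://' (if any) and every forward slash."""
    scheme, separator, rest = website.partition('://')
    if not separator:
        # No scheme present, so keep the whole input
        rest = website
    return rest.replace('/', '')


print(strip_scheme_and_slashes('https://www.example.com/'))  # www.example.com
print(strip_scheme_and_slashes('http://www.example.com'))    # www.example.com
print(strip_scheme_and_slashes('www.example.com'))           # www.example.com
```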

Answer 3

Score: 0

Use the urllib.parse module from the standard library to encode a URL into a string that is valid as a file name. It also provides a way to decode that string back into the original URL.

    import urllib.parse

    url = "https://docs.python.org/3/library/urllib.parse.html"

    # as a valid file name
    url2path = urllib.parse.quote_plus(url)
    print(url2path)
    # https%3A%2F%2Fdocs.python.org%2F3%2Flibrary%2Furllib.parse.html

    # back to a URL (unquote_plus is the inverse of quote_plus)
    path2url = urllib.parse.unquote_plus(url2path)
    print(path2url)
    # https://docs.python.org/3/library/urllib.parse.html
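
Applied to the original question, the encoding round trip might look like the sketch below. The helper names url_to_filename and filename_to_url are hypothetical, and the file is created in a temporary directory purely so the example cleans up after itself.

```python
import tempfile
import urllib.parse
from pathlib import Path


def url_to_filename(url):
    """Percent-encode `url` so it contains no ':' or '/', then add '.md'."""
    return urllib.parse.quote_plus(url) + '.md'


def filename_to_url(filename):
    """Invert url_to_filename: drop the '.md' suffix and decode."""
    return urllib.parse.unquote_plus(filename[:-len('.md')])


url = 'https://www.example.com/'
filename = url_to_filename(url)
print(filename)  # https%3A%2F%2Fwww.example.com%2F.md

# Create the empty file in a throwaway directory, then recover the URL.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / filename
    path.touch()
    print(path.exists())               # True
    print(filename_to_url(path.name))  # https://www.example.com/
```

The encoded name is less readable than the hostname alone, but it keeps the full URL recoverable from the file name.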

A URL is made up of several components (scheme, hostname, ...), which are easily accessible as attributes of the named tuple urllib.parse.ParseResult.

    r = urllib.parse.urlparse(url)  # ParseResult object
    print(r)
    # ParseResult(scheme='https', netloc='docs.python.org', path='/3/library/urllib.parse.html', params='', query='', fragment='')

    print(r.hostname)
    # docs.python.org

So if you care only about the "website name", use the hostname attribute of the object returned by urlparse. I strongly discourage solutions based on direct string manipulation, such as regex. First try what the standard library offers for the task!
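
Putting the hostname approach together with the file creation from the question could look roughly like this. It is a sketch under assumptions: the snake_case name file_creator, the optional directory parameter and the ValueError for inputs without a hostname are all choices made for this illustration, not part of the original code.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def file_creator(website, directory='.'):
    """Create an empty '<hostname>.md' file in `directory` and return its path."""
    hostname = urlparse(website).hostname
    if hostname is None:
        # urlparse only finds a hostname when the URL has a scheme or starts with '//'
        raise ValueError(f'no hostname found in {website!r}')
    path = Path(directory) / f'{hostname}.md'
    path.touch()  # creates the file if it does not exist yet
    return path


with tempfile.TemporaryDirectory() as tmp:
    created = file_creator('https://www.example.com/', tmp)
    print(created.name)      # www.example.com.md
    print(created.exists())  # True
```

Note that a bare 'www.example.com' has no scheme, so urlparse reports no hostname for it and this sketch raises ValueError instead of guessing.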

huangapple
  • Posted on March 31, 2023, 22:15:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75899568.html