Python multiprocessing on a for loop
Question
I have a function with two parameters:

    reqs = [1223, 1456, 1243, 20455]
    url = "pass a url"

    def crawl(i, url):
        print("%s is %s" % (i, url))
I want to run the above function using multiprocessing:
    from multiprocessing import Pool

    if __name__ == '__main__':
        p = Pool(5)
        print(p.map([crawl(i, url) for i in reqs]))
The above code is not working for me. Can anyone please help me with this?
----- ADDING NEW CODE ---------
    from multiprocessing import Pool

    reqs = [1223, 1456, 1243, 20455]
    url = "pass a url"

    def crawl(combined_args):
        print("%s is %s" % (combined_args[0], combined_args[1]))

    def main():
        p = Pool(5)
        print(p.map(crawl, [(i, url) for i in reqs]))

    if __name__ == '__main__':
        main()
When I try to execute the above code, I get the below error:
Answer 1
Score: 2
According to the multiprocessing.Pool.map documentation, this is the function signature:

    map(func, iterable[, chunksize])

You are passing map a single iterable instead of the (func, iterable) pair it expects.
Please refer to the following example of multiprocessing.pool (source):
    import time
    from multiprocessing import Pool

    work = (["A", 5], ["B", 2], ["C", 1], ["D", 3])

    def work_log(work_data):
        print(" Process %s waiting %s seconds" % (work_data[0], work_data[1]))
        time.sleep(int(work_data[1]))
        print(" Process %s Finished." % work_data[0])

    def pool_handler():
        p = Pool(2)
        p.map(work_log, work)

    if __name__ == '__main__':
        pool_handler()
Please note that a single argument is passed to the work_log function, and inside the function it is indexed to get the relevant fields.
Referring to your example:
    from multiprocessing import Pool

    reqs = [1223, 1456, 1243, 20455]
    url = "pass a url"

    def crawl(combined_args):
        print("%s is %s" % (combined_args[0], combined_args[1]))

    def main():
        p = Pool(5)
        print(p.map(crawl, [(i, url) for i in reqs]))

    if __name__ == '__main__':
        main()
This results in:
    1223 is pass a url
    1456 is pass a url
    1243 is pass a url
    20455 is pass a url
    [None, None, None, None]  # This is the output of the map function
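As a side note (an alternative not covered in the original answer): if you would rather keep the two-parameter signature crawl(i, url), Pool.starmap (available since Python 3.3) unpacks each argument tuple for you, so no manual indexing is needed.

    from multiprocessing import Pool

    reqs = [1223, 1456, 1243, 20455]
    url = "pass a url"

    def crawl(i, url):
        print("%s is %s" % (i, url))

    if __name__ == '__main__':
        with Pool(5) as p:
            # starmap unpacks each (i, url) tuple into crawl's two parameters
            p.starmap(crawl, [(i, url) for i in reqs])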
Answer 2
Score: 1
Issue resolved. The crawl function should be in a separate module, as below:
crawler.py

    def crawl(combined_args):
        print("%s is %s" % (combined_args[0], combined_args[1]))
run.py

    from multiprocessing import Pool
    import crawler

    reqs = [1223, 1456, 1243, 20455]  # these two definitions are needed here,
    url = "pass a url"                # otherwise run.py raises NameError

    def main():
        p = Pool(5)
        print(p.map(crawler.crawl, [(i, url) for i in reqs]))

    if __name__ == '__main__':
        main()
Then the output will be:
    1223 is pass a url
    1456 is pass a url
    1243 is pass a url
    20455 is pass a url
    [None, None, None, None]  # This is the output of the map function
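For context (my addition, not from the original answer): this fix usually matters when crawl was defined in an interactive session or notebook. Worker processes started with the spawn method locate the target function by importing its module, and a function that lives only in an interactive __main__ cannot be found that way, hence moving it into an importable module such as crawler.py. In an ordinary .py script run from the command line, a single file also works, as in this sketch (assumed layout, not from the answer):

    # single_file.py -- minimal single-file variant
    from multiprocessing import Pool

    reqs = [1223, 1456, 1243, 20455]
    url = "pass a url"

    def crawl(combined_args):  # defined at module top level, so workers can import it
        print("%s is %s" % (combined_args[0], combined_args[1]))

    if __name__ == '__main__':  # keeps child processes from re-running the pool setup
        with Pool(5) as p:
            print(p.map(crawl, [(i, url) for i in reqs]))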