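ForkJoin is running on 1 thread
Question
I have the following code, which simulates a site scraper: it scrapes pages and subpages and joins the results into a single string containing the contents of the pages.
I used `Runtime.getRuntime().availableProcessors()` as the pool's parallelism, so I assumed it would run on multiple threads. But that does not seem to be the case.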
```java
package Concurrency;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinPoolDemo {

    public static class MyTask extends RecursiveTask<String> {
        private String url;

        public MyTask(String url) {
            this.url = url;
        }

        @Override
        protected String compute() {
            System.out.println(Thread.currentThread().getName() + "/" + url);
            if (url.equals("http://google.com/b1")) {
                return "Content from /b1";
            } else if (url.equals("http://google.com/b2")) {
                return "Content from /b2";
            } else if (url.equals("http://google.com/b")) {
                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/b1"));
                tasks.add(new MyTask("http://google.com/b2"));
                String result = "Content from /b\n";
                for (MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else if (url.equals("http://google.com")) {
                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/a"));
                tasks.add(new MyTask("http://google.com/b"));
                tasks.add(new MyTask("http://google.com/c"));
                tasks.add(new MyTask("http://google.com/d"));
                tasks.add(new MyTask("http://google.com/e"));
                tasks.add(new MyTask("http://google.com/f"));
                tasks.add(new MyTask("http://google.com/g"));
                tasks.add(new MyTask("http://google.com/h"));
                tasks.add(new MyTask("http://google.com/i"));
                tasks.add(new MyTask("http://google.com/j"));
                String result = "Content from /\n";
                for (MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else {
                // Any other page: simulate a slow download.
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                return "Content from " + url;
            }
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
        String result = pool.invoke(new MyTask("http://google.com"));
        System.out.println(result);
    }
}
```
Why is every fork running on the same thread?
ForkJoinPool-1-worker-19/http://google.com
ForkJoinPool-1-worker-19/http://google.com/a
ForkJoinPool-1-worker-19/http://google.com/b
ForkJoinPool-1-worker-19/http://google.com/b1
ForkJoinPool-1-worker-19/http://google.com/b2
ForkJoinPool-1-worker-19/http://google.com/c
ForkJoinPool-1-worker-19/http://google.com/d
ForkJoinPool-1-worker-19/http://google.com/e
ForkJoinPool-1-worker-19/http://google.com/f
ForkJoinPool-1-worker-19/http://google.com/g
ForkJoinPool-1-worker-19/http://google.com/h
ForkJoinPool-1-worker-19/http://google.com/i
ForkJoinPool-1-worker-19/http://google.com/j
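Answer 1
Score: 2
You are blocking on `join()` right after each `fork()`: every loop iteration waits for the task it just spawned to complete before the next task is submitted, so no two tasks are ever in flight at the same time. Instead, spawn all the tasks first and then collect their results: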
for (MyTask task : tasks) {
    task.fork();
}
for (MyTask task : tasks) {
    result += task.join() + "\n";
}
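For reference, here is a minimal, self-contained sketch of the same simulation with the fork-all-then-join pattern applied. It is an illustration rather than the asker's final code: the names `ForkThenJoinDemo`, `PageTask`, and `crawl` are hypothetical, the child URLs are generated in a loop instead of being listed one by one, and the simulated delay is shortened to 200 ms so the demo finishes quickly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical demo class; the names here are not from the original post.
class ForkThenJoinDemo {

    static class PageTask extends RecursiveTask<String> {
        private final String url;

        PageTask(String url) {
            this.url = url;
        }

        @Override
        protected String compute() {
            System.out.println(Thread.currentThread().getName() + "/" + url);

            List<PageTask> children = new ArrayList<>();
            String header;
            if (url.equals("http://google.com")) {
                header = "Content from /\n";
                for (char c = 'a'; c <= 'j'; c++) {
                    children.add(new PageTask(url + "/" + c));
                }
            } else if (url.equals("http://google.com/b")) {
                header = "Content from /b\n";
                children.add(new PageTask(url + "1"));
                children.add(new PageTask(url + "2"));
            } else if (url.equals("http://google.com/b1")) {
                return "Content from /b1";
            } else if (url.equals("http://google.com/b2")) {
                return "Content from /b2";
            } else {
                // Leaf page: simulate a slow download (assumed 200 ms, shorter than the original).
                try {
                    Thread.sleep(200);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return "Content from " + url;
            }

            // Phase 1: fork every child so they can all be scheduled.
            for (PageTask child : children) {
                child.fork();
            }
            // Phase 2: only now wait for the results, in submission order.
            StringBuilder result = new StringBuilder(header);
            for (PageTask child : children) {
                result.append(child.join()).append("\n");
            }
            return result.toString();
        }
    }

    // Runs the simulated crawl on a pool with the given parallelism.
    static String crawl(String rootUrl, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new PageTask(rootUrl));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(crawl("http://google.com", Runtime.getRuntime().availableProcessors()));
    }
}
```

On a multi-core machine the printed worker names should typically differ between tasks, although the exact assignment of tasks to workers is up to the pool and varies from run to run. The order of the joined results stays the same because `join()` is called in the order the tasks were created.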