ForkJoin正在1个线程上运行

huangapple go评论71阅读模式
英文:

ForkJoin is running on 1 thread

问题

以下是翻译好的内容:

我有以下代码这是一种网站爬虫的模拟它会爬取页面/子页面并将结果连接到一个字符串其中包含页面的内容

我使用了 `Runtime.getRuntime().availableProcessors()`,因此我假设它会在多个线程上运行但事实并非如此

```java
package Concurrency;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinPoolDemo {

    public static class MyTask extends RecursiveTask<String>
    {
        private String url;
        public MyTask(String url)
        {
            this.url = url;
        }

        @Override
        protected String compute() {

            System.out.println(Thread.currentThread().getName() + "/" + url);

            if(url.equals("http://google.com/b1")) {
                return "Content from /b1";
            } else if(url.equals("http://google.com/b2")) {
                return "Content from /b2";
            } else if(url.equals("http://google.com/b")) {
                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/b1"));
                tasks.add(new MyTask("http://google.com/b2"));
                String result = "Content from /b\n";

                for(MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else if(url.equals("http://google.com")) {

                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/a"));
                tasks.add(new MyTask("http://google.com/b"));
                tasks.add(new MyTask("http://google.com/c"));
                tasks.add(new MyTask("http://google.com/d"));
                tasks.add(new MyTask("http://google.com/e"));
                tasks.add(new MyTask("http://google.com/f"));
                tasks.add(new MyTask("http://google.com/g"));
                tasks.add(new MyTask("http://google.com/h"));
                tasks.add(new MyTask("http://google.com/i"));
                tasks.add(new MyTask("http://google.com/j"));
                String result = "Content from /\n";

                for (MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                return "Content from " + url;
            }
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
        String result = pool.invoke(new MyTask("http://google.com"));
        System.out.println(result);
    }
}

为什么每个分支都在同一个线程上运行?

ForkJoinPool-1-worker-19/http://google.com
ForkJoinPool-1-worker-19/http://google.com/a
ForkJoinPool-1-worker-19/http://google.com/b
ForkJoinPool-1-worker-19/http://google.com/b1
ForkJoinPool-1-worker-19/http://google.com/b2
ForkJoinPool-1-worker-19/http://google.com/c
ForkJoinPool-1-worker-19/http://google.com/d
ForkJoinPool-1-worker-19/http://google.com/e
ForkJoinPool-1-worker-19/http://google.com/f
ForkJoinPool-1-worker-19/http://google.com/g
ForkJoinPool-1-worker-19/http://google.com/h
ForkJoinPool-1-worker-19/http://google.com/i
ForkJoinPool-1-worker-19/http://google.com/j
英文:

I have the following code, which is a simulation of a sort of site scraper, which scrapes pages/ subpages and joins the result to a string with the contents of the pages.

I have used Runtime.getRuntime().availableProcessors(), so I assumed that it will run on multiple threads. But this does not seem to be the case.

package Concurrency;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinPoolDemo {

    public static class MyTask extends RecursiveTask&lt;String&gt;
    {
        private String url;
        public MyTask(String url)
        {
            this.url = url;
        }

        @Override
        protected String compute() {

            System.out.println(Thread.currentThread().getName() + &quot;/&quot; + url);

            if(url.equals(&quot;http://google.com/b1&quot;)) {
                return &quot;Content from /b1&quot;;
            } else if(url.equals(&quot;http://google.com/b2&quot;)) {
                return &quot;Content from /b2&quot;;
            } else if(url.equals(&quot;http://google.com/b&quot;)) {
                List&lt;MyTask&gt; tasks = new ArrayList&lt;&gt;();
                tasks.add(new MyTask(&quot;http://google.com/b1&quot;));
                tasks.add(new MyTask(&quot;http://google.com/b2&quot;));
                String result = &quot;Content from /b\n&quot;;

                for(MyTask task : tasks) {
                    task.fork();
                    result += task.join() + &quot;\n&quot;;
                }
                return result;
            } else if(url.equals(&quot;http://google.com&quot;)) {

                List&lt;MyTask&gt; tasks = new ArrayList&lt;&gt;();
                tasks.add(new MyTask(&quot;http://google.com/a&quot;));
                tasks.add(new MyTask(&quot;http://google.com/b&quot;));
                tasks.add(new MyTask(&quot;http://google.com/c&quot;));
                tasks.add(new MyTask(&quot;http://google.com/d&quot;));
                tasks.add(new MyTask(&quot;http://google.com/e&quot;));
                tasks.add(new MyTask(&quot;http://google.com/f&quot;));
                tasks.add(new MyTask(&quot;http://google.com/g&quot;));
                tasks.add(new MyTask(&quot;http://google.com/h&quot;));
                tasks.add(new MyTask(&quot;http://google.com/i&quot;));
                tasks.add(new MyTask(&quot;http://google.com/j&quot;));
                String result = &quot;Content from /\n&quot;;

                for (MyTask task : tasks) {
                    task.fork();
                    result += task.join() + &quot;\n&quot;;
                }
                return result;
            } else {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                return &quot;Content from &quot; + url;
            }
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
        String result = pool.invoke(new MyTask(&quot;http://google.com&quot;));
        System.out.println(result);
    }
}

Why is every fork running on the same thread?

ForkJoinPool-1-worker-19/http://google.com
ForkJoinPool-1-worker-19/http://google.com/a
ForkJoinPool-1-worker-19/http://google.com/b
ForkJoinPool-1-worker-19/http://google.com/b1
ForkJoinPool-1-worker-19/http://google.com/b2
ForkJoinPool-1-worker-19/http://google.com/c
ForkJoinPool-1-worker-19/http://google.com/d
ForkJoinPool-1-worker-19/http://google.com/e
ForkJoinPool-1-worker-19/http://google.com/f
ForkJoinPool-1-worker-19/http://google.com/g
ForkJoinPool-1-worker-19/http://google.com/h
ForkJoinPool-1-worker-19/http://google.com/i
ForkJoinPool-1-worker-19/http://google.com/j

答案1

得分: 2

你在每次生成新任务并等待其完成之后再提交另一个任务时都会被阻塞在join操作上。相反,先生成所有的任务,然后收集它们的结果:

for(MyTask task : tasks) {
  task.fork();
}

for(MyTask task : tasks) {
  result += task.join() + "\n";
}
英文:

You're blocking on join each time you spawn a new task waiting for it to complete before submitting another task. Instead, spawn all the tasks first and then collect their results:

for(MyTask task : tasks) {
task.fork();
}
for(MyTask task : tasks) {
result += task.join() + &quot;\n&quot;;
}

huangapple
  • 本文由 发表于 2020年8月25日 05:27:08
  • 转载请务必保留本文链接:https://go.coder-hub.com/63568915.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定