Question

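I have the following code, a simulation of a simple site scraper: it crawls pages and subpages and concatenates their contents into a single string.

I pass `Runtime.getRuntime().availableProcessors()` to the `ForkJoinPool` constructor, so I assumed the tasks would run on multiple threads. That does not seem to be the case.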

```java
package Concurrency;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinPoolDemo {

    public static class MyTask extends RecursiveTask<String>
    {
        private String url;
        public MyTask(String url)
        {
            this.url = url;
        }

        @Override
        protected String compute() {

            System.out.println(Thread.currentThread().getName() + "/" + url);

            if(url.equals("http://google.com/b1")) {
                return "Content from /b1";
            } else if(url.equals("http://google.com/b2")) {
                return "Content from /b2";
            } else if(url.equals("http://google.com/b")) {
                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/b1"));
                tasks.add(new MyTask("http://google.com/b2"));
                String result = "Content from /b\n";

                for(MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else if(url.equals("http://google.com")) {

                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/a"));
                tasks.add(new MyTask("http://google.com/b"));
                tasks.add(new MyTask("http://google.com/c"));
                tasks.add(new MyTask("http://google.com/d"));
                tasks.add(new MyTask("http://google.com/e"));
                tasks.add(new MyTask("http://google.com/f"));
                tasks.add(new MyTask("http://google.com/g"));
                tasks.add(new MyTask("http://google.com/h"));
                tasks.add(new MyTask("http://google.com/i"));
                tasks.add(new MyTask("http://google.com/j"));
                String result = "Content from /\n";

                for (MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                return "Content from " + url;
            }
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
        String result = pool.invoke(new MyTask("http://google.com"));
        System.out.println(result);
    }
}
```

Why is every fork running on the same thread?

```
ForkJoinPool-1-worker-19/http://google.com
ForkJoinPool-1-worker-19/http://google.com/a
ForkJoinPool-1-worker-19/http://google.com/b
ForkJoinPool-1-worker-19/http://google.com/b1
ForkJoinPool-1-worker-19/http://google.com/b2
ForkJoinPool-1-worker-19/http://google.com/c
ForkJoinPool-1-worker-19/http://google.com/d
ForkJoinPool-1-worker-19/http://google.com/e
ForkJoinPool-1-worker-19/http://google.com/f
ForkJoinPool-1-worker-19/http://google.com/g
ForkJoinPool-1-worker-19/http://google.com/h
ForkJoinPool-1-worker-19/http://google.com/i
ForkJoinPool-1-worker-19/http://google.com/j
```

Answer 1

Score: 2

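Each loop iteration forks a task and then immediately blocks on `join()`, so the next task is not submitted until the previous one has completed. Because the forking thread joins right away, it typically ends up running the just-forked task itself before any other worker has a chance to steal it. Instead, fork all the tasks first and then collect their results: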

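```java
// Queue every subtask first so that idle workers can pick them up.
for (MyTask task : tasks) {
    task.fork();
}

// Only then wait for the results, in submission order.
for (MyTask task : tasks) {
    result += task.join() + "\n";
}
```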
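
For reference, here is a minimal self-contained sketch of the fork-all-then-join pattern described above. The names `ForkAllThenJoinDemo`, `PageTask`, and `crawl` are hypothetical stand-ins rather than part of the question's code, and the simulated page tree is simplified to one root with a configurable number of leaf subpages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkAllThenJoinDemo {

    // Hypothetical stand-in for the question's MyTask:
    // a page with `children` simulated subpages (0 means a leaf page).
    static class PageTask extends RecursiveTask<String> {
        private final String url;
        private final int children;

        PageTask(String url, int children) {
            this.url = url;
            this.children = children;
        }

        @Override
        protected String compute() {
            System.out.println(Thread.currentThread().getName() + " -> " + url);

            if (children == 0) {
                try {
                    Thread.sleep(200); // simulate a slow page fetch
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                }
                return "Content from " + url;
            }

            // Build subtasks named <url>/a, <url>/b, ...
            List<PageTask> tasks = new ArrayList<>();
            for (int i = 0; i < children; i++) {
                tasks.add(new PageTask(url + "/" + (char) ('a' + i), 0));
            }

            // Phase 1: queue every subtask before waiting on any of them.
            for (PageTask task : tasks) {
                task.fork();
            }

            // Phase 2: collect the results in submission order.
            StringBuilder result = new StringBuilder("Content from " + url + "\n");
            for (PageTask task : tasks) {
                result.append(task.join()).append("\n");
            }
            return result.toString();
        }
    }

    // Runs the simulated crawl on a dedicated pool and shuts the pool down afterwards.
    static String crawl(String rootUrl, int children, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new PageTask(rootUrl, children));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int parallelism = Runtime.getRuntime().availableProcessors();
        System.out.println(crawl("http://google.com", 4, parallelism));
    }
}
```

On a multi-core machine the printed thread names should usually differ between subpages, although which worker runs which task is not deterministic. The order of the joined content stays fixed because the results are collected in the order the tasks were created.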
