Skip to main content

Java Fork and Join using ForkJoinPool

Java Fork and Join using ForkJoinPool

Jakob Jenkov
Last update: 2015-02-03
The ForkJoinPool was added to Java in Java 7. The ForkJoinPool is similar to the Java ExecutorService but with one difference. The ForkJoinPool makes it easy for tasks to split their work up into smaller tasks which are then submitted to the ForkJoinPool too. Tasks can keep splitting their work into smaller subtasks for as long as it makes to split up the task. It may sound a bit abstract, so in this fork and join tutorial I will explain how the ForkJoinPool works, and how splitting tasks up work.

Fork and Join Explained

Before we look at the ForkJoinPool I want to explain how the fork and join principle works in general.
The fork and join principle consists of two steps which are performed recursively. These two steps are the fork step and the join step.

Fork

A task that uses the fork and join principle can fork (split) itself into smaller subtasks which can be executed concurrently. This is illustrated in the diagram below:
Splitting tasks into smaller tasks is referred to as forking tasks.By splitting itself up into subtasks, each subtask can be executed in parallel by different CPUs, or different threads on the same CPU.
A task only splits itself up into subtasks if the work the task was given is large enough for this to make sense. There is an overhead to splitting up a task into subtasks, so for small amounts of work this overhead may be greater than the speedup achieved by executing subtasks concurrently.
The limit for when it makes sense to fork a task into subtasks is also called a threshold. It is up to each task to decide on a sensible threshold. It depends very much on the kind of work being done.

Join

When a task has split itself up into subtasks, the task waits until the subtasks have finished executing.
Once the subtasks have finished executing, the task may join (merge) all the results into one result. This is illustrated in the diagram below:
Merging the results of subtasks is referred to as joining results.Of course, not all types of tasks may return a result. If the tasks do not return a result then a task just waits for its subtasks to complete. No result merging takes place then.

The ForkJoinPool

The ForkJoinPool is a special thread pool which is designed to work well with fork-and-join task splitting. The ForkJoinPool located in the java.util.concurrent package, so the full class name isjava.util.concurrent.ForkJoinPool.

Creating a ForkJoinPool

You create a ForkJoinPool using its constructor. As a parameter to the ForkJoinPool constructor you pass the indicated level of parallelism you desire. The parallelism level indicates how many threads or CPUs you want to work concurrently on on tasks passed to the ForkJoinPool. Here is a ForkJoinPool creation example:
ForkJoinPool forkJoinPool = new ForkJoinPool(4);
This example creates a ForkJoinPool with a parallelism level of 4.

Submitting Tasks to the ForkJoinPool

You submit tasks to a ForkJoinPool similarly to how you submit tasks to an ExecutorService. You can submit two types of tasks. A task that does not return any result (an "action"), and a task which does return a result (a "task"). These two types of tasks are represented by the RecursiveAction andRecursiveTask classes. How to use both of these tasks and how to submit them will be covered in the following sections.

RecursiveAction

RecursiveAction is a task which does not return any value. It just does some work, e.g. writing data to disk, and then exits.
RecursiveAction may still need to break up its work into smaller chunks which can be executed by independent threads or CPUs.
You implement a RecursiveAction by subclassing it. Here is a RecursiveAction example:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.RecursiveAction;

public class MyRecursiveAction extends RecursiveAction {

    private long workLoad = 0;

    public MyRecursiveAction(long workLoad) {
        this.workLoad = workLoad;
    }

    @Override
    protected void compute() {

        //if work is above threshold, break tasks up into smaller tasks
        if(this.workLoad > 16) {
            System.out.println("Splitting workLoad : " + this.workLoad);

            List<MyRecursiveAction> subtasks =
                new ArrayList<MyRecursiveAction>();

            subtasks.addAll(createSubtasks());

            for(RecursiveAction subtask : subtasks){
                subtask.fork();
            }

        } else {
            System.out.println("Doing workLoad myself: " + this.workLoad);
        }
    }

    private List<MyRecursiveAction> createSubtasks() {
        List<MyRecursiveAction> subtasks =
            new ArrayList<MyRecursiveAction>();

        MyRecursiveAction subtask1 = new MyRecursiveAction(this.workLoad / 2);
        MyRecursiveAction subtask2 = new MyRecursiveAction(this.workLoad / 2);

        subtasks.add(subtask1);
        subtasks.add(subtask2);

        return subtasks;
    }

}
This example is very simplified. The MyRecursiveAction simply takes a fictive workLoad as parameter to its constructor. If the workLoad is above a certain threshold, the work is split into subtasks which are also scheduled for execution (via the .fork() method of the subtasks. If the workLoad is below a certain threshold then the work is carried out by the MyRecursiveAction itself.
You can schedule a MyRecursiveAction for execution like this:
MyRecursiveAction myRecursiveAction = new MyRecursiveAction(24);

forkJoinPool.invoke(myRecursiveAction);

RecursiveTask

RecursiveTask is a task that returns a result. It may split its work up into smaller tasks, and merge the result of these smaller tasks into a collective result. The splitting and merging may take place on several levels. Here is a RecursiveTask example:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.RecursiveTask;
    
    
public class MyRecursiveTask extends RecursiveTask<Long> {

    private long workLoad = 0;

    public MyRecursiveTask(long workLoad) {
        this.workLoad = workLoad;
    }

    protected Long compute() {

        //if work is above threshold, break tasks up into smaller tasks
        if(this.workLoad > 16) {
            System.out.println("Splitting workLoad : " + this.workLoad);

            List<MyRecursiveTask> subtasks =
                new ArrayList<MyRecursiveTask>();
            subtasks.addAll(createSubtasks());

            for(MyRecursiveTask subtask : subtasks){
                subtask.fork();
            }

            long result = 0;
            for(MyRecursiveTask subtask : subtasks) {
                result += subtask.join();
            }
            return result;

        } else {
            System.out.println("Doing workLoad myself: " + this.workLoad);
            return workLoad * 3;
        }
    }

    private List<MyRecursiveTask> createSubtasks() {
        List<MyRecursiveTask> subtasks =
        new ArrayList<MyRecursiveTask>();

        MyRecursiveTask subtask1 = new MyRecursiveTask(this.workLoad / 2);
        MyRecursiveTask subtask2 = new MyRecursiveTask(this.workLoad / 2);

        subtasks.add(subtask1);
        subtasks.add(subtask2);

        return subtasks;
    }
}
This example is similar to the RecursiveAction example except it returns a result. The classMyRecursiveTask extends RecursiveTask<Long> which means that the result returned from the task is a Long.
The MyRecursiveTask example also breaks the work down into subtasks, and schedules these subtasks for execution using their fork() method.
Additionally, this example then receives the result returned by each subtask by calling the join() method of each subtask. The subtask results are merged into a bigger result which is then returned. This kind of joining / mergining of subtask results may occur recursively for several levels of recursion.
You can schedule a RecursiveTask like this:
MyRecursiveTask myRecursiveTask = new MyRecursiveTask(128);

long mergedResult = forkJoinPool.invoke(myRecursiveTask);

System.out.println("mergedResult = " + mergedResult);    
Notice how you get the final result out from the ForkJoinPool.invoke() method call.

ForkJoinPool Critique

It seems not everyone is equally happy with the new ForkJoinPool in Java 7. While searching for experiences with, and opinions about, the ForkJoinPool, I came across the following critique:
A Java Fork-Join Calamity
It is well worth a read before you plan to use the ForkJoinPool in your own projects.

Comments

Popular posts from this blog

Mockito interview Questions

1.       Question 1. What Is Mockito? Answer : Mockito allows creation of mock object for the purpose of Test Driven Development and Behavior Driven development. Unlike creating actual object, Mockito allows creation of fake object (external dependencies) which allows it to give consistent results to a given invocation. 2.       Question 2. Why Do We Need Mockito? What Are The Advantages? Answer : Mockito differentiates itself from the other testing framework by removing the expectation beforehand. So, by doing this, it reduces the coupling. Most of the testing framework works on the "expect-run-verify". Mockito allows it to make it "run-verify" framework. Mockito also provides annotation which allows to reduce the boilerplate code. 3.       Question 3. Can You Explain A Mockito Framework? Answer : In Mockito, you always check a particular class. The dependency in that class is injected using m...

JAVA Expert Interview Questions Answers 2017

Java Basics ::  Interview Questions and Answers Home  »  Interview Questions  »  Technical Interview  »  Java Basics  » Interview Questions 1.     What is the difference between a constructor and a method? A constructor is a member function of a class that is used to create objects of that class. It has the same name as the class itself, has no return type, and is invoked using the new operator. A method is an ordinary member function of a class. It has its own name, a return type (which may be void), and is invoked using the dot operator. 2.     What is the purpose of garbage collection in Java, and when is it used? The purpose of garbage collection is to identify and discard objects that are no longer needed by a program so that their resources can be reclaimed and reused. A Java object is subject to garbage collection when it becomes unreachable to the program in which it is used. 3.  ...

Java Example Program to Convert List to Set

import java.util.ArrayList; import java.util.HashSet; import java.util.Set; public class ListToSet {  /**   * @author Amarjit Kumar   * @category interview questions   *   * Description: Convert List to set in java with example program   *   */  public static void main(String[] args) {   ArrayList<String> arrList= new ArrayList<>();   arrList.add("Java");   arrList.add("Java");   arrList.add("List to String");   arrList.add("Example Program");   Set<String> strSet = new HashSet<String>(arrList);   System.out.println(strSet);   System.out.println(arrList);  } } /*  * Java program to convert list to set. Convert ArrayList of string to HashSet  * in java example program How to convert List to Set in java Set<String> strSet  * = new HashSet<String>(arrList); HashSet having a constructor which will take  * list as an ar...