Friday, 17 August 2018

Java Lambda Streams and Groovy Clouses Comparisons

This Blog post will look at some proverbial operations on List data structure and make some comparison between Java 8/9 and Groovy syntax.

Oracle Java Guides, Oracle Java Learning, Oracle Java Tutorial and Materials

So firstly, the data structure.  It’s just a simple Rugby player who has name and a rating.

Java


class RugbyPlayer {
    private String name;
    private Integer rating;
 
    RugbyPlayer(String name, Integer rating) {
        this.name = name;
        this.rating = rating;
    }

    public String toString() {
        return name + "," + rating;
    }
     
    public String getName() {
        return name;
    }
     
    public Integer getRating() {
        return rating;
    }
}

//...
//...
List<RugbyPlayer> players = Arrays.asList(
    new RugbyPlayer("Tadgh Furlong", 9),
    new RugbyPlayer("Bundee AKi", 7),
    new RugbyPlayer("Rory Best", 8),
    new RugbyPlayer("Jacob StockDale", 8)
);

Groovy

@ToString
class RugbyPlayer {
    String name
    Integer rating
}
//...
//...
List<RugbyPlayer> players = [
    new RugbyPlayer(name: "Tadgh Furlong", rating: 9),
    new RugbyPlayer(name: "Bundee AKi", rating: 7),
    new RugbyPlayer(name: "Rory Best", rating: 8),
    new RugbyPlayer(name: "Jacob StockDale", rating: 8)
]

Find a specific record


Java

// Find Tadgh Furlong
Optional<RugbyPlayer> result = players.stream()
    .filter(player -> player.getName().indexOf("Tadgh")  >= 0)
    .findFirst();      
String outputMessage = result.isPresent() ? result.get().toString() : "not found";
System.out.println(outputMessage);

Groovy

println players.find{it.name.indexOf("Tadgh") >= 0}

Comments
  • The Java lambda has just one parameter in – player.  This doesn’t need to be typed as its type can be inferred.  Note: this lambda only uses one parameter.  If there were two parameters in the parameter list, parenthesis would be needed around the parameter list.
  • In Java, a stream must be created from the List first.  A lambda is then used to before performing a function which returns an Optional
  • The lambda definition doesn’t need a return statement.  It also doesn’t need {} braces or one of those semi-colons to complete a Java statement.  However, you can use {} if you want to and if you want to, you must include the ; and the return statement.  Note: if you lambda is more than one line, you don’t get a choice, you must use {}.   It is recommended best practise to keep Lambda’s short and to just one line.
  • Java 8 supports fluent APIs for pipeline Stream operations.  This is also supported in Groovy Collection operations.
  • In Java a player variable is specified for the Lambda.  The Groovy closure doesn’t need to specify a variable.  It can just use “it” which is the implicit reference to the parameter (similar to _ in Scala).
  • The Java filter API takes a parameters of type Predicate.   A Functional Interface means: can be used as the assignment target for a lambda expression or method reference.  Predicate, is type of Functional interface.  It’s one abstract method is: boolean test(T t).    In this case, in the lamda, the player corresponds to t.  The body definition should evaluate to a true or a false, in our case player.getName().indexOf(“Tadgh”) will always evaluate to true or false.  True corresponds to a match.
  • Java 8 has other types of Functional Interfaces:
    • Function – it takes one argument and returns a result
    • Consumer – it takes one argument and returns no result (represents a side effect)
    • Supplier – it takes not argument and returns a result
    • Predicate – it takes one argument and returns a boolean
    • BiFunction – it takes two arguments and returns a result
    • BinaryOperator – it is similar to a BiFunction, taking two arguments and returning a result. The two arguments and the result are all of the same types
    • UnaryOperator – it is similar to a Function, taking a single argument and returning a result of the same type
  • Java 8 can infer the type for the lambda input parameters. Note if you have to specify the parameter type,  the declaration must be in brackets which adds further verbosity.
  • Groovy can println directly.  No System.out needed, and no need for subsequent braces.
  • Like Java, Groovy doesn’t need the return statement.  However, this isn’t just for closures, in Groovy it extends to every method.    Whatever is evaluated as the last line is automatically returned.
  • Groovy has no concept of a Functional interface.  This can mean if you forget to ensure your last expression is an appropriate boolean expression, you get unexpected results and bugs at runtime.
  • The arrow operator is used in both Groovy and Java to mean effectively the same thing – separating parameter list from body definition. In Groovy it is only needed it you need to declare the parameters (the default it, doesn’t suffice). Note: In Scala, => is used.

Find specific records


Java

// Find all players with a rating over 8
List<RugbyPlayer> ratedPlayers = players.stream()
    .filter(player -> player.getRating() >= 8)
    .collect(Collectors.toList());
ratedPlayers.forEach(System.out::println);

Groovy

println players.findAll{it.rating >= 8}

Comments

◈ In the Java version, the Iterable Object ratedPlayers has its forEach method invoked.   This method takes a FunctionalInterface of type Consumer.  Consumer, methods a function which takes an input parameter but returns nothing, it is void.

◈ System.out::println is a method reference – a new feature in Java 8.   It is syntactic sugar to reduce the verbosity of some lambdas.  This is essentially saying, for every element in ratedPlayers, execute, System.out.println, passing in the the current element as a parameter.

◈ Again less syntax from Groovy.  The function can operate on the collection, there is no need to create a Stream.

◈ We could have just printed the entire list in the Java sample, but heck I wanted to demo forEach and method reference.

Map from object type to another


Java

// Map the Rugby players to just names. 
// Note, the way we convert the list to a stream and then back again to a to a list using the collect API. 
System.out.println("Names only...");
List<String> playerNames = players.stream().map(player -> player.getName()).collect(Collectors.toList());
playerNames.forEach(System.out::println);

Groovy

println players.collect{it.name}

Comments

A stream is needed to be created first before executing the Lambda.  Then the collect() method is invoked on the Stream – this is needed to convert it back to a List. This makes code more verbose.

Perform a Reduction calculation


Java

System.out.println("Max player rating only...");
Optional<Integer> maxRatingOptional = players.stream().map(RugbyPlayer::getRating).reduce(Integer::max);
String maxRating = maxRatingOptional.isPresent() ? maxRatingOptional.get().toString() : "No max";
System.out.println("Max rating=" + maxRating);

Groovy

def here = players.inject(null){ 
    max, it -> 
        it.rating > max?.rating ? it : max
}

Comments

1. The null safe operator is used in the Groovy inject closure – so that the first comparsion will work

Wednesday, 15 August 2018

How to avoid deadlock in Java Threads?

Deadlock, Java Threads, Oracle Java Guides, Oracle Java Certifications, Oracle Java Study Material

How to avoid deadlock in Java? is one of the popular Java interview question and flavor of the season for multi-threading, asked mostly at a senior level with lots of follow up questions. Even though question looks very basic but most of the Java developers get stuck once you start going deep.

Interview questions start with “What is a deadlock?”

The answer is simple when two or more threads are waiting for each other to release the resource they need (lock) and get stuck for infinite time, the situation is called deadlock. It will only happen in case of multitasking or multi-threading.

How do you detect deadlock in Java?

Though this could have many answers, my version is first I would look at the code if I see a nested synchronized block or calling one synchronized method from other, or trying to get a lock on a different object then there is a good chance of deadlock if a developer is not very careful.

Another way is to find it when you actually get dead-locked while running the application, try to take a thread dump, in Linux you can do this by command “kill -3”, this will print status of all threads in application log file and you can see which thread is locked on which object.

You can analyze that thread dump with using tools like fastthread.io which allows you to upload your thread dump and analyze it.

Another way is to use the jConsole/VisualVM, it will show you exactly which threads are getting locked and on which object.

Write a Java program which will result in deadlock?


Once you answer the earlier question, they may ask you to write code which will result in a deadlock in Java?

Here is one of my version:

view sourceprint?

/**
 * Java program to create a deadlock by imposing circular wait.
 *
 * @author WINDOWS 8
 *
 */
public class DeadLockDemo {

    /*
     * This method request two locks, first String and then Integer
     */
    public void method1() {
        synchronized (String.class) {
            System.out.println("Aquired lock on String.class object");

            synchronized (Integer.class) {
                System.out.println("Aquired lock on Integer.class object");
            }
        }
    }

    /*
     * This method also requests same two lock but in exactly
     * Opposite order i.e. first Integer and then String.
     * This creates potential deadlock, if one thread holds String lock
     * and other holds Integer lock and they wait for each other, forever.
     */
    public void method2() {
        synchronized (Integer.class) {
            System.out.println("Aquired lock on Integer.class object");

            synchronized (String.class) {
                System.out.println("Aquired lock on String.class object");
            }
        }
    }
}

If method1() and method2() both will be called by two or many threads , there is a good chance of deadlock because if thread 1 acquires lock on Sting object while executing method1() and thread 2 acquires lock on Integer object while executing method2() both will be waiting for each other to release lock on Integer and String to proceed further which will never happen.

This diagram exactly demonstrates our program, where one thread holds a lock on one object and waiting for other object locks which are held by other thread.

Deadlock, Java Threads, Oracle Java Guides, Oracle Java Certifications, Oracle Java Study Material

You can see that Thread 1 wants the lock on object 2 which is held by Thread 2, and Thread 2 wants a lock on Object 1 which is held by Thread 1. Since no thread is willing to give up, there is a deadlock and the Java program is stuck.

How to avoid deadlock in Java?


Now the interviewer comes to the final part, one of the most important in my view;
How do you fix a deadlock in code? or How to avoid deadlock in Java?

If you have looked above code carefully then you may have figured out that real reason for deadlock is not multiple threads but the way they are requesting a lock, if you provide an ordered access then the problem will be resolved.

Here is my fixed version, which avoids deadlock by a voiding circular wait with no preemption, one of the four conditions which need for deadlock.

public class DeadLockFixed {

    /**
     * Both method are now requesting lock in same order, first Integer and then String.
     * You could have also done reverse e.g. first String and then Integer,
     * both will solve the problem, as long as both method are requesting lock
     * in consistent order.
     */
    public void method1() {
        synchronized (Integer.class) {
            System.out.println("Aquired lock on Integer.class object");

            synchronized (String.class) {
                System.out.println("Aquired lock on String.class object");
            }
        }
    }

    public void method2() {
        synchronized (Integer.class) {
            System.out.println("Aquired lock on Integer.class object");

            synchronized (String.class) {
                System.out.println("Aquired lock on String.class object");
            }
        }
    }
}

Now there would not be any deadlock because both methods are accessing lock on Integer and String class literal in the same order. So, if thread A acquires a lock on Integer object, thread B will not proceed until thread A releases Integer lock, same way thread A will not be blocked even if thread B holds String lock because now thread B will not expect thread A to release Integer lock to proceed further.

Monday, 13 August 2018

Java 9 Reactive Streams

Java 9 Reactive Streams allows us to implement non-blocking asynchronous stream processing. This is a major step towards applying reactive programming model to core java programming.

Java 9 Reactive Streams


Reactive Streams is about asynchronous processing of stream, so there should be a Publisher and a Subscriber. The Publisher publishes the stream of data and the Subscriber consumes the data.

Oracle Java 9, Oracle Java Certification, Oracle Java Tutorial and Material, Oracle Java Study Materials

Sometimes we have to transform the data between Publisher and Subscriber. Processor is the entity sitting between the end publisher and subscriber to transform the data received from publisher so that subscriber can understand it. We can have a chain of processors.

Oracle Java 9, Oracle Java Certification, Oracle Java Tutorial and Material, Oracle Java Study Materials

It’s very clear from the above image that Processor works both as Subscriber and a Publisher.

Java 9 Flow API


Java 9 Flow API implements the Reactive Streams Specification. Flow API is a combination of Iterator and Observer pattern. Iterator works on pull model where application pulls items from the source, whereas Observer works on push model and reacts when item is pushed from source to application.

Java 9 Flow API subscriber can request for N items while subscribing to the publisher. Then the items are pushed from publisher to subscriber until there are no more items left to push or some error occurs.

Oracle Java 9, Oracle Java Certification, Oracle Java Tutorial and Material, Oracle Java Study Materials

Java 9 Flow API Classes and Interfaces


Let’s have a quick look at Flow API classes and interfaces.
  • java.util.concurrent.Flow: This is the main class of Flow API. This class encapsulates all the important interfaces of the Flow API. This is a final class and we can’t extend it.
  • java.util.concurrent.Flow.Publisher: This is a functional interface and every publisher has to implement it’s subscribe method to add the given subscriber to receive messages.
  • java.util.concurrent.Flow.Subscriber: Every subscriber has to implement this interface. The methods in the subscriber are invoked in strict sequential order. There are four methods in this interface:
    • onSubscribe: This is the first method to get invoked when subscriber is subscribed to receive messages by publisher. Usually we invoke subscription.request to start receiving items from processor.
    • onNext: This method gets invoked when an item is received from publisher, this is where we implement our business logic to process the stream and then request for more data from publisher.
    • onError: This method is invoked when an irrecoverable error occurs, we can do cleanup taks in this method, such as closing database connection.
    • onComplete: This is like finally method and gets invoked when no other items are being produced by publisher and publisher is closed. We can use it to send notification of successful processing of stream.
  • java.util.concurrent.Flow.Subscription: This is used to create asynchronous non-blocking link between publisher and subscriber. Subscriber invokes its request method to demand items from publisher. It also has cancel method to cancel the subscription i.e. closing the link between publisher and subscriber.
  • java.util.concurrent.Flow.Processor: This interface extends both Publisher and Subscriber, this is used to transform the message between publisher and subscriber.
  • java.util.concurrent.SubmissionPublisher: A Publisher implementation that asynchronously issues submitted items to current subscribers until it is closed. It uses Executor framework We will use this class in reactive stream examples to add subscriber and then submit items to them.

Java 9 Reactive Stream Example


Let’s start with a simple example where we will implement Flow API Subscriber interface and use SubmissionPublisher to create publisher and send messages.

Stream Data


Let’s say we have an Employee class that will be used to create the stream message to be sent from publisher to subscriber.

package com.oraclejavacentral.reactive.beans;

public class Employee {

private int id;
private String name;
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public Employee(int i, String s) {
this.id = i;
this.name = s;
}
public Employee() {
}
@Override
public String toString() {
return "[id="+id+",name="+name+"]";
}
}

We also have a utility class to create a list of employees for our example.

package com.oraclejavacentral.reactive_streams;

import java.util.ArrayList;
import java.util.List;

import com.oraclejavacentral.reactive.beans.Employee;

public class EmpHelper {

public static List<Employee> getEmps() {

Employee e1 = new Employee(1, "Pankaj");
Employee e2 = new Employee(2, "David");
Employee e3 = new Employee(3, "Lisa");
Employee e4 = new Employee(4, "Ram");
Employee e5 = new Employee(5, "Anupam");
List<Employee> emps = new ArrayList<>();
emps.add(e1);
emps.add(e2);
emps.add(e3);
emps.add(e4);
emps.add(e5);
return emps;
}

}

Subscriber


package com.oraclejavacentral.reactive_streams;

import java.util.concurrent.Flow.Subscriber;
import java.util.concurrent.Flow.Subscription;

import com.oraclejavacentral.reactive.beans.Employee;

public class MySubscriber implements Subscriber<Employee> {

private Subscription subscription;
private int counter = 0;
@Override
public void onSubscribe(Subscription subscription) {
System.out.println("Subscribed");
this.subscription = subscription;
this.subscription.request(1); //requesting data from publisher
System.out.println("onSubscribe requested 1 item");
}

@Override
public void onNext(Employee item) {
System.out.println("Processing Employee "+item);
counter++;
this.subscription.request(1);
}

@Override
public void onError(Throwable e) {
System.out.println("Some error happened");
e.printStackTrace();
}

@Override
public void onComplete() {
System.out.println("All Processing Done");
}

public int getCounter() {
return counter;
}

}

◈ Subscription variable to keep reference so that request can be made in onNext method.
◈ counter variable to keep count of number of items processed, notice that it’s value is increased in onNext method. This will be used in our main method to wait for execution to finish before ending the main thread.
◈ Subscription request is invoked in onSubscribe method to start the processing. Also notice that it’s again called in onNext method after processing the item, demanding next item to process from the publisher.
◈ onError and onComplete doesn’t have much here, but in real world scenario they should be used to perform corrective measures when error occurs or cleanup of resources when processing completes successfully.

Reactive Stream Test Program


We will use SubmissionPublisher as Publisher for our examples, so let’s look at the test program for our reactive stream implementation.

package com.oraclejavacentral.reactive_streams;

import java.util.List;
import java.util.concurrent.SubmissionPublisher;

import com.oraclejavacentral.reactive.beans.Employee;

public class MyReactiveApp {

public static void main(String args[]) throws InterruptedException {

// Create Publisher
SubmissionPublisher<Employee> publisher = new SubmissionPublisher<>();

// Register Subscriber
MySubscriber subs = new MySubscriber();
publisher.subscribe(subs);

List<Employee> emps = EmpHelper.getEmps();

// Publish items
System.out.println("Publishing Items to Subscriber");
emps.stream().forEach(i -> publisher.submit(i));

// logic to wait till processing of all messages are over
while (emps.size() != subs.getCounter()) {
Thread.sleep(10);
}
// close the Publisher
publisher.close();

System.out.println("Exiting the app");

}

}

The most important piece of above code is subscribe and submit methods invocation of publisher. We should always close publisher to avoid any memory leaks.

We will get following output when above program is executed.

Subscribed
Publishing Items to Subscriber
onSubscribe requested 1 item
Processing Employee [id=1,name=Pankaj]
Processing Employee [id=2,name=David]
Processing Employee [id=3,name=Lisa]
Processing Employee [id=4,name=Ram]
Processing Employee [id=5,name=Anupam]
Exiting the app
All Processing Done

Note that if we won’t have logic for main method to wait before all the items are processed, then we will get unwanted results.

Message Transformation Example


Processor is used to transform the message between a publisher and subscriber. Let’s say we have another subscriber which is expecting a different type of message to process. Let’s say this new message type is Freelancer.

package com.oraclejavacentral.reactive.beans;

public class Freelancer extends Employee {

private int fid;

public int getFid() {
return fid;
}

public void setFid(int fid) {
this.fid = fid;
}
public Freelancer(int id, int fid, String name) {
super(id, name);
this.fid = fid;
}
@Override
public String toString() {
return "[id="+super.getId()+",name="+super.getName()+",fid="+fid+"]";
}
}

We have a new subscriber to consume Freelancer stream data.

package com.oraclejavacentral.reactive_streams;

import java.util.concurrent.Flow.Subscriber;
import java.util.concurrent.Flow.Subscription;

import com.oraclejavacentral.reactive.beans.Freelancer;

public class MyFreelancerSubscriber implements Subscriber<Freelancer> {

private Subscription subscription;
private int counter = 0;
@Override
public void onSubscribe(Subscription subscription) {
System.out.println("Subscribed for Freelancer");
this.subscription = subscription;
this.subscription.request(1); //requesting data from publisher
System.out.println("onSubscribe requested 1 item for Freelancer");
}

@Override
public void onNext(Freelancer item) {
System.out.println("Processing Freelancer "+item);
counter++;
this.subscription.request(1);
}

@Override
public void onError(Throwable e) {
System.out.println("Some error happened in MyFreelancerSubscriber");
e.printStackTrace();
}

@Override
public void onComplete() {
System.out.println("All Processing Done for MyFreelancerSubscriber");
}

public int getCounter() {
return counter;
}

}

Processor


The important part is the implementation of Processor interface. Since we want to utilize the SubmissionPublisher, we would extend it and use it wherever applicable.

package com.oraclejavacentral.reactive_streams;

import java.util.concurrent.Flow.Processor;
import java.util.concurrent.Flow.Subscription;
import java.util.concurrent.SubmissionPublisher;
import java.util.function.Function;

import com.oraclejavacentral.reactive.beans.Employee;
import com.oraclejavacentral.reactive.beans.Freelancer;

public class MyProcessor extends SubmissionPublisher<Freelancer> implements Processor<Employee, Freelancer> {

private Subscription subscription;
private Function<Employee,Freelancer> function;
public MyProcessor(Function<Employee,Freelancer> function) {  
    super();  
    this.function = function;  
  }  
@Override
public void onSubscribe(Subscription subscription) {
this.subscription = subscription;
subscription.request(1);
}

@Override
public void onNext(Employee emp) {
submit((Freelancer) function.apply(emp));  
    subscription.request(1);  
}

@Override
public void onError(Throwable e) {
e.printStackTrace();
}

@Override
public void onComplete() {
System.out.println("Done");
}

}

◈ Function will be used to convert Employee object to Freelancer object.
◈ We will convert incoming Employee message to Freelancer message in onNext method and then use SubmissionPublisher submit method to send it to the subscriber.
◈ Since Processor works as both subscriber and publisher, we can create a chain of processors between end publishers and subscribers.

Message Transformation Test


package com.oraclejavacentral.reactive_streams;

import java.util.List;
import java.util.concurrent.SubmissionPublisher;

import com.oraclejavacentral.reactive.beans.Employee;
import com.oraclejavacentral.reactive.beans.Freelancer;

public class MyReactiveAppWithProcessor {

public static void main(String[] args) throws InterruptedException {
// Create End Publisher
SubmissionPublisher<Employee> publisher = new SubmissionPublisher<>();

// Create Processor
MyProcessor transformProcessor = new MyProcessor(s -> {
return new Freelancer(s.getId(), s.getId() + 100, s.getName());
});

//Create End Subscriber
MyFreelancerSubscriber subs = new MyFreelancerSubscriber();

//Create chain of publisher, processor and subscriber
publisher.subscribe(transformProcessor); // publisher to processor
transformProcessor.subscribe(subs); // processor to subscriber

List<Employee> emps = EmpHelper.getEmps();

// Publish items
System.out.println("Publishing Items to Subscriber");
emps.stream().forEach(i -> publisher.submit(i));

// Logic to wait for messages processing to finish
while (emps.size() != subs.getCounter()) {
Thread.sleep(10);
}

// Closing publishers
publisher.close();
transformProcessor.close();

System.out.println("Exiting the app");
}

}

Read the comments in the program to properly understand it, most important change is the creation of producer-processor-subscriber chain. We will get following output when above program is executed.

Subscribed for Freelancer
Publishing Items to Subscriber
onSubscribe requested 1 item for Freelancer
Processing Freelancer [id=1,name=Pankaj,fid=101]
Processing Freelancer [id=2,name=David,fid=102]
Processing Freelancer [id=3,name=Lisa,fid=103]
Processing Freelancer [id=4,name=Ram,fid=104]
Processing Freelancer [id=5,name=Anupam,fid=105]
Exiting the app
All Processing Done for MyFreelancerSubscriber
Done

Cancel Subscription


We can use Subscription cancel method to stop receiving message in subscriber. Note that if we cancel the subscription, then subscriber will not receive onComplete or onError signal.

Here is a sample code where subscriber is consuming only 3 messages and then canceling the subscription.

@Override
public void onNext(Employee item) {
System.out.println("Processing Employee "+item);
counter++;
if(counter==3) {
this.subscription.cancel();
return;
}
this.subscription.request(1);
}

Note that in this case, our logic to halt main thread before all the messages are processed will go into infinite loop. We can add some additional logic for this scenario, may be some global variable to look for if subscriber has stopped processing or canceled subscription.

Back Pressure


When publisher is producing messages in much faster rate than it’s being consumed by subscriber, back pressure gets built. Flow API doesn’t provide any mechanism to signal about back pressure or to deal with it. But we can devise our own strategy to deal with it, such as fine tuning the subscriber or reducing the message producing rate.

Friday, 10 August 2018

Garbage Collection in Java

Garbage collection in java is one of the advance topic. Java GC knowledge helps us in fine tuning our application runtime performance.

Garbage Collection in Java


◈ In Java, the programmers don’t need to take care of destroying the objects that are out of use. The Garbage Collector takes care of it.
◈ Garbage Collector is a Daemon thread that keeps running in the background. Basically, it frees up the heap memory by destroying the unreachable objects.
◈ Unreachable objects are the ones that are no longer referenced by any part of the program.
◈ We can choose the garbage collector for our java program through JVM options, we will look into these in later section of this tutorial.

How Automatic Garbage Collection works?


Automatic Garbage collection is a process of looking at the Heap memory, identifying(also known as “marking”) the unreachable objects, and destroying them with compaction.

Garbage Collection in Java, Oracle Java Certifications, Oracle Java Guides

An issue with this approach is that, as the number of objects increases, the Garbage Collection time keeps on increasing, as it needs to go through the entire list of objects, looking for the unreachable object.

However, the empirical analysis of applications shows that most of the objects are short-lived.

This behavior was used to improve the performance of JVM, and the adopted methodology is commonly called Generational Garbage Collection. In this method, the Heap space is divided into generations like Young Generation, Old or Tenured Generation, and Permanent Generation.

The Young generation heap space is the new where all the new Objects are created. Once it gets filled up, minor garbage collection (also known as, Minor GC) takes place. Which means, all the dead objects from this generation are destroyed This process is quick because as we can see from the graph, most of them would be dead. The surviving objects in young generation are aged and eventually moves to the older generations.

The Old Generation is used to store long surviving objects. Typically, a threshold is set for young generation object and when that age is met, the object gets moved to the old generation. Eventually, the old generation needs to be collected. This event is called a Major GC (major garbage collection). Often it is much slower because it involves all live objects.
Also, there is Full GC, which means cleaning the entire Heap – both Young and older generation spaces.

Lastly, up to Java 7, there was a Permanent Generation (or Perm Gen), which contained metadata required by the JVM to describe the classes and methods used in the application. It was removed in Java 8.

Java Garbage Collectors


The JVM actually provides four different garbage collectors, all of them generational. Each one has their own advantages and disadvantages. The choice of which garbage collector to use lies with us and there can be dramatic differences in the throughput and application pauses.

All these, split the managed heap into different segments, using the age-old assumptions that most objects in the heap are short-lived and should be recycled quickly.

So, the four types of garbage collectors are:

Serial GC


This is the simplest garbage collector, designed for single threaded systems and small heap size. It freezes all applications while working. Can be turned on using -XX:+UseSerialGC JVM option.

Parallel/Throughput GC


This is JVM’s default collector in JDK 8. As the name suggests, it uses multiple threads to scan through the heap space and perform compaction. A drawback of this collector is that it pauses the application threads while performing minor or full GC.
It is best suited if applications that can handle such pauses, and try to optimize CPU overhead caused by the collector.

The CMS collector


The CMS collector (“concurrent-mark-sweep”) algorithm uses multiple threads (“concurrent”) to scan through the heap (“mark”) for unused objects that can be recycled (“sweep”).

This collector goes in Stop-The-World(STW) mode in two cases:

-While initializing the initial marking of roots, ie. objects in the old generation that are reachable from thread entry points or static variables
-When the application has changed the state of the heap while the algorithm was running concurrently and forcing it to go back and do some final touches to make sure it has the right objects marked.

This collector may face promotion failures. If some objects from young generation are to be moved to the old generation, and the collector did not have enough time to make space in the old generation space, a promotion failure will occur.

In order to prevent this, we may provide more of the heap size to the old generation or provide more background threads to the collector.

G1 collector


Last but not the least is the Garbage-First collector, designed for heap sizes greater than 4GB. It divides the heap size into regions spanning from 1MB to 32Mb, based on the heap size.

There is a concurrent global marking phase to determine the liveliness of objects throughout the heap. After the marking phase is complete, G1 knows which regions are mostly empty. It collects unreachable objects from these regions first, which usually yields a large amount of free space. So G1 collects these regions(containing garbage) first, and hence the name Garbage-First. G1 also uses a pause prediction model in order to meet a user-defined pause time target. It selects the number of regions to collect based on the specified pause time target.

The G1 garbage collection cycle includes the phases as shown in the figure:

Garbage Collection in Java, Oracle Java Certifications, Oracle Java Guides

1. Young-only phase: This phase includes only the young generation objects and promotes them to the old generation. The transition between the young-only phase and the space-reclamation phase starts when the old generation is occupied up to a certain threshold, ie. the Initiating Heap Occupancy threshold. At this time, G1 schedules an Initial Mark young-only collection instead of a regular young-only collection.

◈ Initial Marking: This type of collection starts the marking process in addition to a regular young-only collection. Concurrent marking determines all currently live objects in the old generation regions to be kept for the following space-reclamation phase. While marking hasn’t completely finished, regular young-only collections may occur. Marking finishes with two special stop-the-world pauses: Remark and Cleanup.
◈ Remark: This pause finalizes the marking itself, and performs global reference processing and class unloading. Between Remark and Cleanup G1 calculates a summary of the liveness information concurrently, which will be finalized and used in the Cleanup pause to update internal data structures.
◈ Cleanup: This pause also takes the completely empty regions, and determines whether a space-reclamation phase will actually follow. If a space-reclamation phase follows, the young-only phase completes with a single young-only collection.

2. Space-reclamation phase: This phase consists of multiple mixed collections — in addition to young generation regions, also evacuates live objects of old generation regions. The space-reclamation phase ends when G1 determines that evacuating more old generation regions wouldn’t yield enough free space worth the effort.

G1 can be enabled using the –XX:+UseG1GC flag.

This strategy reduced the chances of the heap being depleted before the background threads have finished scanning for unreachable objects. Also, it compacts the heap on-the-go, which the CMS collector can do only in STW mode.

In Java 8 a beautiful optimization is provided with G1 collector, called string deduplication. As we know the character arrays that represent our strings occupies much of our heap space. A new optimization has been made that enables the G1 collector to identify strings which are duplicated more than once across our heap and modify them to point to the same internal char[] array, to avoid multiple copies of the same string residing in the heap unnecessarily. We can use the -XX:+UseStringDeduplication JVM argument to enable this optimization.

G1 is the default garbage collector in JDK 9.

Java 8 PermGen and Metaspace


As mentioned earlier, the Permanent Generation space was removed since Java 8. So now, the JDK 8 HotSpot JVM uses the native memory for the representation of class metadata which is called Metaspace.

Most of the allocations for the class metadata are made out of the native memory. Also, there is a new flag MaxMetaspaceSize, to limit the amount of memory used for class metadata. If we do not specify the value for this, the Metaspace re-sizes at runtime as per the demand of the running application.

Metaspace garbage collection is triggered when the class metadata usage reaches MaxMetaspaceSize limit. Excessive Metaspace garbage collection may be a symptom of classes, classloaders memory leak or inadequate sizing for our application.

Tuesday, 7 August 2018

Java 8 Functional Interfaces

If we look into some other programming languages such as C++, JavaScript; they are called functional programming language because we can write functions and use them when required. Some of these languages support Object Oriented Programming as well as Functional Programming.

Java 8 Functional Interfaces, Oracle Java Guides, Oracle Java Study Materials

Being object oriented is not bad, but it brings a lot of verbosity to the program. For example, let’s say we have to create an instance of Runnable. Usually we do it using anonymous classes like below.

Runnable r = new Runnable(){
@Override
public void run() {
System.out.println("My Runnable");
}};

If you look at the above code, the actual part that is of use is the code inside run() method. Rest all of the code is because of the way java programs are structured.

Java 8 Functional Interfaces and Lambda Expressions help us in writing smaller and cleaner code by removing a lot of boiler-plate code.

Java 8 Functional Interface


An interface with exactly one abstract method is called Functional Interface. @FunctionalInterface annotation is added so that we can mark an interface as functional interface.

It is not mandatory to use it, but it’s best practice to use it with functional interfaces to avoid addition of extra methods accidentally. If the interface is annotated with @FunctionalInterface annotation and we try to have more than one abstract method, it throws compiler error.

The major benefit of java 8 functional interfaces is that we can use lambda expressions to instantiate them and avoid using bulky anonymous class implementation.

Java 8 Collections API has been rewritten and new Stream API is introduced that uses a lot of functional interfaces. Java 8 has defined a lot of functional interfaces in java.util.function package. Some of the useful java 8 functional interfaces are Consumer, Supplier, Function and Predicate.

java.lang.Runnable is a great example of functional interface with single abstract method run().

Below code snippet provides some guidance for functional interfaces:

interface Foo { boolean equals(Object obj); }
// Not functional because equals is already an implicit member (Object class)

interface Comparator<T> {
 boolean equals(Object obj);
 int compare(T o1, T o2);
}
// Functional because Comparator has only one abstract non-Object method

interface Foo {
  int m();
  Object clone();
}
// Not functional because method Object.clone is not public

interface X { int m(Iterable<String> arg); }
interface Y { int m(Iterable<String> arg); }
interface Z extends X, Y {}
// Functional: two methods, but they have the same signature

interface X { Iterable m(Iterable<String> arg); }
interface Y { Iterable<String> m(Iterable arg); }
interface Z extends X, Y {}
// Functional: Y.m is a subsignature & return-type-substitutable

interface X { int m(Iterable<String> arg); }
interface Y { int m(Iterable<Integer> arg); }
interface Z extends X, Y {}
// Not functional: No method has a subsignature of all abstract methods

interface X { int m(Iterable<String> arg, Class c); }
interface Y { int m(Iterable arg, Class<?> c); }
interface Z extends X, Y {}
// Not functional: No method has a subsignature of all abstract methods

interface X { long m(); }
interface Y { int m(); }
interface Z extends X, Y {}
// Compiler error: no method is return type substitutable

interface Foo<T> { void m(T arg); }
interface Bar<T> { void m(T arg); }
interface FooBar<X, Y> extends Foo<X>, Bar<Y> {}
// Compiler error: different signatures, same erasure

Lambda Expression


Lambda Expression are the way through which we can visualize functional programming in the java object oriented world. Objects are the base of java programming language and we can never have a function without an Object, that’s why Java language provide support for using lambda expressions only with functional interfaces.

Since there is only one abstract function in the functional interfaces, there is no confusion in applying the lambda expression to the method. Lambda Expressions syntax is (argument) -> (body). Now let’s see how we can write above anonymous Runnable using lambda expression.

Runnable r1 = () -> System.out.println("My Runnable");

Let’s try to understand what is happening in the lambda expression above.

◈ Runnable is a functional interface, that’s why we can use lambda expression to create it’s instance.
◈ Since run() method takes no argument, our lambda expression also have no argument.
◈ Just like if-else blocks, we can avoid curly braces ({}) since we have a single statement in the method body. For multiple statements, we would have to use curly braces like any other methods.

Why do we need Lambda Expression


1. Reduced Lines of Code

One of the clear benefit of using lambda expression is that the amount of code is reduced, we have already seen that how easily we can create instance of a functional interface using lambda expression rather than using anonymous class.

2. Sequential and Parallel Execution Support

Another benefit of using lambda expression is that we can benefit from the Stream API sequential and parallel operations support.

Java 8 Functional Interfaces, Oracle Java Guides, Oracle Java Study Materials

To explain this, let’s take a simple example where we need to write a method to test if a number passed is prime number or not.

Traditionally we would write it’s code like below. The code is not fully optimized but good for example purpose, so bear with me on this.

//Traditional approach
private static boolean isPrime(int number) {
if(number < 2) return false;
for(int i=2; i<number; i++){
if(number % i == 0) return false;
}
return true;
}

The problem with above code is that it’s sequential in nature, if the number is very huge then it will take significant amount of time. Another problem with code is that there are so many exit points and it’s not readable. Let’s see how we can write the same method using lambda expressions and stream API.

//Declarative approach
private static boolean isPrime(int number) {
return number > 1
&& IntStream.range(2, number).noneMatch(
index -> number % index == 0);
}

IntStream is a sequence of primitive int-valued elements supporting sequential and parallel aggregate operations. This is the int primitive specialization of Stream.

For more readability, we can also write the method like below.

private static boolean isPrime(int number) {
IntPredicate isDivisible = index -> number % index == 0;

return number > 1
&& IntStream.range(2, number).noneMatch(
isDivisible);
}

If you are not familiar with IntStream, it’s range() method returns a sequential ordered IntStream from startInclusive (inclusive) to endExclusive (exclusive) by an incremental step of 1.

noneMatch() method returns whether no elements of this stream match the provided predicate. It may not evaluate the predicate on all elements if not necessary for determining the result.

3. Passing Behaviors into methods

Let’s see how we can use lambda expressions to pass behavior of a method with a simple example. Let’s say we have to write a method to sum the numbers in a list if they match a given criteria. We can use Predicate and write a method like below.

public static int sumWithCondition(List<Integer> numbers, Predicate<Integer> predicate) {
    return numbers.parallelStream()
    .filter(predicate)
    .mapToInt(i -> i)
    .sum();
}

Sample usage:

//sum of all numbers
sumWithCondition(numbers, n -> true)
//sum of all even numbers
sumWithCondition(numbers, i -> i%2==0)
//sum of all numbers greater than 5
sumWithCondition(numbers, i -> i>5)

4. Higher Efficiency with Laziness

One more advantage of using lambda expression is the lazy evaluation, for example let’s say we need to write a method to find out the maximum odd number in the range 3 to 11 and return square of it.

Usually we will write code for this method like this:

private static int findSquareOfMaxOdd(List<Integer> numbers) {
int max = 0;
for (int i : numbers) {
if (i % 2 != 0 && i > 3 && i < 11 && i > max) {
max = i;
}
}
return max * max;
}

Above program will always run in sequential order but we can use Stream API to achieve this and get benefit of Laziness-seeking. Let’s see how we can rewrite this code in functional programming way using Stream API and lambda expressions.

public static int findSquareOfMaxOdd(List<Integer> numbers) {
return numbers.stream()
.filter(NumberTest::isOdd) //Predicate is functional interface and
.filter(NumberTest::isGreaterThan3) // we are using lambdas to initialize it
.filter(NumberTest::isLessThan11) // rather than anonymous inner classes
.max(Comparator.naturalOrder())
.map(i -> i * i)
.get();
}

public static boolean isOdd(int i) {
return i % 2 != 0;
}

public static boolean isGreaterThan3(int i){
return i > 3;
}

public static boolean isLessThan11(int i){
return i < 11;
}

If you are surprised with the double colon (::) operator, it’s introduced in Java 8 and used for method references. Java Compiler takes care of mapping the arguments to the called method. It’s short form of lambda expressions i -> isGreaterThan3(i) or i -> NumberTest.isGreaterThan3(i).

Lambda Expression Examples


Below I am providing some code snippets for lambda expressions with small comments explaining them.

() -> {}                     // No parameters; void result

() -> 42                     // No parameters, expression body
() -> null                   // No parameters, expression body
() -> { return 42; }         // No parameters, block body with return
() -> { System.gc(); }       // No parameters, void block body

// Complex block body with multiple returns
() -> {
  if (true) return 10;
  else {
    int result = 15;
    for (int i = 1; i < 10; i++)
      result *= i;
    return result;
  }
}                         

(int x) -> x+1             // Single declared-type argument
(int x) -> { return x+1; } // same as above
(x) -> x+1                 // Single inferred-type argument, same as below
x -> x+1                   // Parenthesis optional for single inferred-type case

(String s) -> s.length()   // Single declared-type argument
(Thread t) -> { t.start(); } // Single declared-type argument
s -> s.length()              // Single inferred-type argument
t -> { t.start(); }          // Single inferred-type argument

(int x, int y) -> x+y      // Multiple declared-type parameters
(x,y) -> x+y               // Multiple inferred-type parameters
(x, final y) -> x+y        // Illegal: can't modify inferred-type parameters
(x, int y) -> x+y          // Illegal: can't mix inferred and declared types

Method and Constructor References


A method reference is used to refer to a method without invoking it; a constructor reference is similarly used to refer to a constructor without creating a new instance of the named class or array type.

Examples of method and constructor references:

System::getProperty
System.out::println
"abc"::length
ArrayList::new
int[]::new

That’s all for Java 8 Functional Interfaces and Lambda Expression Tutorial. I would strongly suggest to look into using it because this syntax is new to Java and it will take some time to grasp it.

Monday, 6 August 2018

How to use CyclicBarrier in Java - Concurrency

This is the second part of my concurrency tutorial, in the first part, you have learned how to use CountDownLatch and in this part, you will learn how to use CyclicBarrier class in Java. CyclicBarrier is another concurrency utility introduced in Java 5 which is used when a number of threads (also known as parties) want to wait for each other at a common point, also known as the barrier before starting processing again. Its similar to CountDownLatch but instead of calling countDown() each thread calls await() and when last thread calls await() which signals that it has reached the barrier, all thread started processing again, also known as a barrier is broken. You can use CyclicBarrier wherever you want to use CountDownLatch, but the opposite is not possible because you can not reuse the latch once the count reaches to zero. Some of the common usages of CyclicBarrier is in writing a unit test for concurrent program, to simulate concurrency in a test class or calculating final result after an individual task has completed.

Oracle Java Tutorial and Material, Oracle Java Guides, Oracle Java Certifications

In this tutorial, I will show you an example of how four worker thread waits for other before starting again. As I told in the previous article, concurrency is hard to master, sometimes even if you read a couple of articles on one topic, you don't get what you are looking for. If you understand how and where to use CountDownLatch and CyclicBarrier, then only you will feel confident.

CyclicBarrier Example in Java


You just cannot understand concurrency without an example, seeing is believing here. It's difficult to comprehend words like worker thread, parties, waiting for each other at a point, until you see it live in action. In this program, we have four worker threads and one main thread, which is running your main method. We have an object of CyclicBarrier, which is initialized with parties = 4, the argument we passed in CyclicBarrier constructor is nothing but number of party, which is actually number of threads to stop at barrier. The barrier will not be broken until all parties are arrived. A party (thread) is said to be arrived with it call barrier.await() method.

In our example setup, we have given each worker thread a different name, starting from PARTY-1 to PARTY-4 just to have a meaningful output. We have passed the same instance of the cyclic barrier to each thread. If you look at their Runnable implementation, you will find that each party sleep for some seconds and then call await() method on the barrier.

Oracle Java Tutorial and Material, Oracle Java Guides, Oracle Java Certifications

The sleep is introduced so that every thread calls barrier method after some time.  Sleep time is also in increasing order, which means PARTY-4 should be the last one to call await. So as per our theory, every thread (party) should wait after calling await() until the last thread (PARTY-4) calls the await() method, after that every thread should wake up and start processing.

Of-course they need to compete for CPU and they will start running once they got the CPU from thread scheduler, but what is more important is that once the barrier is broken, each thread (party) becomes eligible for scheduling.

import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

/**
 * Java Program to demonstrate how to use CyclicBarrier, Its used when number of threads
 * needs to wait for each other before starting again.
 *
 * @author Javin Paul
 */
public class HelloHP {

    public static void main(String args[]) throws InterruptedException, BrokenBarrierException {
     
        CyclicBarrier barrier = new CyclicBarrier(4);
        Party first = new Party(1000, barrier, "PARTY-1");
        Party second = new Party(2000, barrier, "PARTY-2");
        Party third = new Party(3000, barrier, "PARTY-3");
        Party fourth = new Party(4000, barrier, "PARTY-4");
     
        first.start();
        second.start();
        third.start();
        fourth.start();
     
        System.out.println(Thread.currentThread().getName() + " has finished");

    }

}

class Party extends Thread {
    private int duration;
    private CyclicBarrier barrier;

    public Party(int duration, CyclicBarrier barrier, String name) {
        super(name);
        this.duration = duration;
        this.barrier = barrier;
    }

    @Override
    public void run() {
        try {
            Thread.sleep(duration);
            System.out.println(Thread.currentThread().getName() + " is calling await()");
            barrier.await();
            System.out.println(Thread.currentThread().getName() + " has started running again");
        } catch (InterruptedException | BrokenBarrierException e) {
            e.printStackTrace();
        }
    }
}

Output
main has finished
PARTY-1 is calling await()
PARTY-2 is calling await()
PARTY-3 is calling await()
PARTY-4 is calling await()
PARTY-4 has started running again
PARTY-1 has started running again
PARTY-2 has started running again
PARTY-3 has started running again

If you look at the output is exactly matches with our theory. Each worker thread (PARTY 1 - 3) calls the await() method and then they stop processing until PARTY-4 comes and call await() method, after that every thread gets a wake up call and started execution again, depending upon when they are scheduled by Java thread scheduler.

This is how CyclicBarrier class works. You can still reuse the barrier object and if a thread calls barrier.await() again, it will wait for four worker thread before it gets wake up call. By the way,  If barrier is broken before a thread calls await() then this method will throw BrokenBarrierException.

When to use CyclicBarrier in Java Program


It is a very useful class and have several practical uses. You can use this to perform final task once individual task are completed. You can use it to write some unit tests to check some variants as well. Remember you can reuse the barrier as opposed to latch.  One a side note, this CyclicBarrier example is also a good example of how to catch multiple exception in one catch block in Java, a feature introduced in JDK 1.7. You can see that we have two unrelated exceptions InterruptedException and BrokenBarrierException, but we have caught then in same catch block, because of this feature, this code requires Java 7 to run. If you are not using JDK 7 then just use two catch block instead of one.

Oracle Java Tutorial and Material, Oracle Java Guides, Oracle Java Certifications

That's all about how to use CyclicBarrier in Java. In this CyclicBarrier Example you have not only learned  how to use CyclicBarrier but also when to use it. You should use it when one thread needs to wait for a fixed number of thread before starting an event e.g. Party. You can use CyclicBarrier to write concurrency unit test and implement generic algorithms in Java.