PMD

PMD scans Java source code and looks for potential problems like:

  • Possible bugs – empty try/catch/finally/switch statements
  • Dead code – unused local variables, parameters and private methods
  • Suboptimal code – wasteful String/StringBuffer usage
  • Overcomplicated expressions – unnecessary if statements, for loops that could be while loops
  • Duplicate code – copied/pasted code means copied/pasted bugs
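
To make the categories above concrete, here is a small sketch of deliberately flawed Java code of the kind these rules are aimed at. The class and method names are made up for illustration, and which rule (if any) reports each line depends on the PMD version and the ruleset in use, so the comments describe the problem rather than a specific rule name.

```java
import java.util.Arrays;
import java.util.List;

/** Deliberately flawed code (hypothetical names) illustrating the problem categories. */
class PmdFindingsDemo {

    // Possible bug: the catch block is empty, so bad input is silently ignored.
    static int parseOrZero(String text) {
        int value = 0;
        try {
            value = Integer.parseInt(text);
        } catch (NumberFormatException e) {
        }
        return value;
    }

    // Dead code: 'unused' is assigned but never read.
    // Suboptimal code: String concatenation inside a loop.
    static String join(List<String> parts) {
        int unused = parts.size();
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // A cleaned-up equivalent of join().
    static String joinClean(List<String> parts) {
        StringBuilder result = new StringBuilder();
        for (String part : parts) {
            result.append(part);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b");
        System.out.println(parseOrZero("abc"));
        System.out.println(join(parts).equals(joinClean(parts)));
    }
}
```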

The default rulesets: basic.xml, unusedcode.xml and imports.xml

  • best practice is to create a custom ruleset file, keep it under version control and use it with the IDE integration plugin; this way changes to the standards are easy to propagate around the team
  • use standard PMD rules plus new custom rules

It can be run as

  • stand alone:

./bin/run.sh pmd -d /opt/data/paypipe/connectors/src/main/java/ -f summaryhtml -R rulesets/java/basic.xml -version 1.6 -language java > ../pmdSummary.html

  • mvn plugin, outputting by default to target/pmd.xml and target/site/pmd.html
    mvn pmd:pmd

Including PMD in the Maven verify lifecycle phase:
<project>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-pmd-plugin</artifactId>
                <version>3.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>check</goal>
                            <goal>cpd-check</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

</project>

Integrating with IDE using PMD plugin:
http://pmd.sourceforge.net/pmd-5.1.2/integrations.html

  • a custom ruleset file can be declared to the PMD Plugin so changes are shared

JBoss Drools – A Hands-on Introduction


If you thought that things like Artificial Intelligence or logic programming are all dead and buried in the 80s and have no relevance to our enterprise projects, think again. Drools is a Java framework that implements a form of AI called a rule-based Expert System. It might not win you Jeopardy, but it is an open-source JBoss project that can help you quickly process data according to large sets of business rules, and it allows you to define those rules in a readable, user-friendly way. When looking at a rule, it is pretty clear what it is about, both to a developer with little business domain knowledge and to a non-technical business analyst:

rule "Valid for loan"
 when
   For an active loan type
   There is a person that
    - is at least 18 years old
    - is not too old for the loan
    - earns more than or equal to the minimum income for the loan
    - has compatible citizenship with the loan
 then
   Tell that person they qualify for the loan
end

1. Some theoretical background

Expert Systems are computer systems that make decisions like a human expert would: starting from a representation of knowledge (which forms their so-called knowledge base), they infer conclusions. This is different from conventional programming because it doesn’t work by following a procedure; instead it tries to mimic human reasoning about knowledge. Drools is a Rule Engine that uses the rule-based approach to implement an Expert System. A Rule Engine is any system that uses rules, in any form, that can be applied to data to produce outcomes.
To bring back some of the traumas of the August exams in college, the official documentation adds that Drools is more precisely classified as a Production Rule System, a concept in Formal Grammars.

In rule-based systems, knowledge is represented in the form of if-then rules. For example, the following rule could be part of such a system:

IF Person wants to buy a house
   Person does not have enough money for a house
THEN Person goes to bank for a loan

To actually trigger this rule, we need a Person object, or fact, that matches the conditions of the rule. In other words, we need to provide our rules with a number of facts to work upon.

The process that decides whether each fact satisfies the Rules is called Pattern Matching, and is performed by the so-called Inference Engine. If a fact satisfies more than one rule, the matched rules are said to be in conflict and it becomes the job of a component called the Agenda to decide the order in which those rules will be executed. The Rules are stored in the Production Memory and the facts that the Inference Engine matches against are kept in the Working Memory.
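
The roles of these components can be sketched with a deliberately naive engine. This is only an illustration under simplifying assumptions: every name below (Person, Rule, Activation, fireAllRules, the priority field) is hypothetical, the matching is a plain nested loop, and none of it reflects how Drools is implemented internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

/** Naive match-then-fire loop; an illustration only, not Drools internals. */
class NaiveRuleEngineSketch {

    /** A fact kept in working memory (hypothetical shape). */
    static final class Person {
        final String name;
        final boolean wantsHouse;
        final boolean hasEnoughMoney;

        Person(String name, boolean wantsHouse, boolean hasEnoughMoney) {
            this.name = name;
            this.wantsHouse = wantsHouse;
            this.hasEnoughMoney = hasEnoughMoney;
        }
    }

    /** An if-then rule kept in production memory; the outcome text stands in for the action. */
    static final class Rule {
        final String outcome;
        final int priority;
        final Predicate<Person> condition;

        Rule(String outcome, int priority, Predicate<Person> condition) {
            this.outcome = outcome;
            this.priority = priority;
            this.condition = condition;
        }
    }

    /** A matched (rule, fact) pair waiting on the agenda. */
    static final class Activation {
        final Rule rule;
        final Person fact;

        Activation(Rule rule, Person fact) {
            this.rule = rule;
            this.fact = fact;
        }
    }

    static List<String> fireAllRules(List<Rule> productionMemory, List<Person> workingMemory) {
        // Pattern matching: test every rule against every fact.
        List<Activation> agenda = new ArrayList<>();
        for (Rule rule : productionMemory) {
            for (Person fact : workingMemory) {
                if (rule.condition.test(fact)) {
                    agenda.add(new Activation(rule, fact));
                }
            }
        }
        // Conflict resolution: activations of higher-priority rules fire first.
        agenda.sort(Comparator.comparingInt((Activation a) -> a.rule.priority).reversed());
        List<String> fired = new ArrayList<>();
        for (Activation activation : agenda) {
            fired.add(activation.fact.name + " " + activation.rule.outcome);
        }
        return fired;
    }

    static List<String> demo() {
        List<Rule> rules = Arrays.asList(
            new Rule("goes to bank for a loan", 0, p -> p.wantsHouse && !p.hasEnoughMoney),
            new Rule("buys the house", 10, p -> p.wantsHouse && p.hasEnoughMoney));
        List<Person> facts = Arrays.asList(
            new Person("Ann", true, false),
            new Person("Bob", true, true));
        return fireAllRules(rules, facts);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The nested loop is exactly the rule-by-fact iteration that a real engine tries to avoid; it is kept here only because it makes the production memory, working memory and agenda easy to see.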

There are several algorithms that an Inference Engine can use for pattern matching, like Rete or Leaps. Drools is based on a Java implementation of Rete called ReteOO. It would be too slow to iterate over all the rules and, for each rule, check every fact for a match. Instead, with Rete the rules are compiled into a network of nodes (a kind of tree) where each node (except the root) corresponds to a pattern occurring in the left-hand side (the condition part) of a rule. Facts are propagated through this network, so for a fact to trigger the execution of a rule, it has to make it all the way down to leaf level. The details are out of scope here, but what is important to keep in mind is that this algorithm can be faster than the iterative approach by several orders of magnitude. However, the cost of this performance is memory usage, and for very large expert systems this might turn into a real problem.

2. Usages of a Rule Engine

Rule Engines are a suitable solution for problems that don’t have a satisfactory traditional programming approach. Here are some typical scenarios when this can happen:

  • the problem may not be complex, but you can’t see a non-fragile way of building a solution for it
  • the problem is beyond any obvious algorithmic solution.
  • the logic changes often
  • domain experts (or business analysts) are readily available, but are nontechnical.

Obviously, rules are not a silver bullet; for instance, they are not suited for workflow or process execution. If you are not able to easily write declarative rules for a problem, then maybe rules are not the right tool for the job. Some common use cases where Rule Engines are recommended are: validations, calculations, routing and filtering, monitoring and diagnostics, logistics, etc.

3. The Drools language

To showcase some of the basic features of Drools I created a small Java project called LoanAdvisor that includes a simple expert system that can give feedback to a bank’s clients about their eligibility for each available type of loan. You can download the sources (and a md5) from here: http://www.fileswap.com/folder/eQgRngeY/

We need some basic POJOs, first:

public class Person {
 private String name;
 private Integer age;
 private Country citizenship;
 private Double income;
...
}
public class LoanType {
 private String name;
 private Integer maxAge;
 private Set<Country> compatibleCitizenships;
 private Double minIncome;
 private boolean active;
...
}

And a service to process the feedback for each person:

public interface IFeedbackService {
 void addFeedback(Person person, String message);
 String getFeedback();
}
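
The implementation of this service is not shown here, so the following is only a guess at what a minimal in-memory version could look like. InMemoryFeedbackService, the reduced Person class and the report format are assumptions made for the sketch; only the two interface methods come from the listing above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory implementation of the feedback interface. */
class FeedbackServiceSketch {

    /** Reduced stand-in for the Person POJO (only the name is kept). */
    static final class Person {
        private final String name;

        Person(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    interface IFeedbackService {
        void addFeedback(Person person, String message);
        String getFeedback();
    }

    static final class InMemoryFeedbackService implements IFeedbackService {
        // LinkedHashMap keeps persons in the order they first received feedback.
        private final Map<Person, List<String>> messages = new LinkedHashMap<>();

        @Override
        public void addFeedback(Person person, String message) {
            messages.computeIfAbsent(person, p -> new ArrayList<>()).add(message);
        }

        @Override
        public String getFeedback() {
            StringBuilder report = new StringBuilder();
            for (Map.Entry<Person, List<String>> entry : messages.entrySet()) {
                report.append(entry.getKey().getName()).append(":\n");
                for (String message : entry.getValue()) {
                    report.append(" ").append(message).append("\n");
                }
            }
            return report.toString();
        }
    }

    static String demo() {
        IFeedbackService service = new InMemoryFeedbackService();
        service.addFeedback(new Person("TOM"), "You have to grow up first!");
        return service.getFeedback();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```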

Next comes the fun part, writing the rules. Drools allows you to write rules in more than one way. First step is to see the “native” rule language at work.

3.1 Drools native rule format – DRL

A rule file is simply a text file, typically with a .drl extension, which is short for Drools Rule Language. In a DRL file you can have multiple rules, queries and functions, as well as some resource declarations like imports, globals and attributes that are assigned and used by your rules. However, you are also able to spread your rules across multiple rule files (in that case, the extension .rule is suggested).

Let’s dig into a native language rules file src/main/rules/LoanAdvisor.drl:

 1. package loan.advisor.drl
 2.
 3. import loan.advisor.dto.Person
 4. import loan.advisor.dto.LoanType
 5. import loan.advisor.services.FeedbackService
 6.
 7. global FeedbackService feedbackService

First is the package definition. The package name itself is the namespace, and is not related to files or folders in any way. Next come the imports, the Java types we reference. Then comes a so called global object that is an object that won’t be part of our working memory but that is needed by our rules; these could be anything from services to loggers.

 9. rule "Underage Persons"
 10.  salience 100 //filter out the kids first
 11.  when
 12.    $person: Person(age < 18)
 13.  then
 14.    feedbackService.addFeedback($person, "You have to grow up first!");
 15.    retract($person);
 16. end
 17.
 18. rule "Above maximum age limit"
 19.  when
 20.    LoanType(active==true, $loanName: name, $maxAge: maxAge)
 21.    $person: Person(age > $maxAge)
 22. then
 23.    feedbackService.addFeedback($person,
 24.    String.format("You do NOT qualify for '%s': you are older than %d.", $loanName, $maxAge));
 25. end

The first rule, called “Underage Persons”, is supposed to identify the persons that are below the minimum legal age for asking for a bank loan and remove them from the working memory, since they won’t be eligible for any kind of credit. So it makes sense that this should always be the very first rule to be executed. This is done by specifying a salience for the rule (line 10): the higher the salience, the higher the priority of the rule. 100 is an arbitrary value; since all the other rules have the default salience of 0, any positive integer would have had the same effect.

The two most interesting parts of a rule are the when and the then part. The when part, which is called the left-hand side (LHS) of the rule, contains the conditions that need to be fulfilled in order for the then part, called the right-hand side (RHS) of the rule, to execute. If the LHS doesn’t match anything in the working memory, the RHS is not executed. When writing a rule, the mindset is a little similar to writing SQL statements.

Line 12 contains a so called pattern that matches any instance of Person from the Working Memory that has the age property less than 18 and then it binds it to a variable called $person. The prefixed dollar symbol ($) is just a convention but it can be useful in complex rules where it helps to easily differentiate between variables and fields. Also note the way the age property is being accessed. Drools follows the standard Java bean specifications by using standard JDK Introspector class to map the properties. So all you need are public accessors that use the standard naming conventions. While something like Person(getAge() < 18) is also legal, by working directly with the fields we allow Drools to create field indexes that enhance the performance. Another thing to note is that Drools supports auto boxing and unboxing.
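
To see what the JDK Introspector makes of a bean, independently of Drools, a minimal sketch could look like the following. It only demonstrates the standard java.beans API; the nested Person class is a reduced stand-in for the article's POJO, and nothing here shows how Drools itself consumes the result.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

/** Lists the bean properties that the JDK Introspector derives from public accessors. */
class BeanPropertiesDemo {

    /** Reduced stand-in for the Person POJO. */
    public static class Person {
        private String name;
        private Integer age;

        public String getName() {
            return name;
        }

        public Integer getAge() {
            return age;
        }

        public void setAge(Integer age) {
            this.age = age;
        }
    }

    static List<String> propertyNames(Class<?> beanType) {
        try {
            // Object.class is the stop class, so the inherited "class" property is left out.
            BeanInfo info = Introspector.getBeanInfo(beanType, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor property : info.getPropertyDescriptors()) {
                names.add(property.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The property names come from getName()/getAge(), not from the private fields.
        System.out.println(propertyNames(Person.class));
    }
}
```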

Line 14 adds a feedback message for the matched person using our service. After that, line 15 removes the object from the working memory so it won’t be processed by the other rules.

The second rule, “Above maximum age limit”, compares the maximum age permitted by each credit type with each person’s age and tells the persons that are too old the reason they don’t qualify for the loan. Here, at line 20, we are not binding a variable to a LoanType object but instead we are binding to the bean’s properties: $loanName: name, $maxAge: maxAge

We can have more than one constraint in a single pattern. Let’s see the last rule that gives positive feedback to the persons that satisfy all requirements for a type of loan:

45. rule "Valid for loan"
46.  when
47.    $loan: LoanType(active==true)
48.    $person: Person(age >= 18, age <= $loan.maxAge,
49.       income >= $loan.minIncome,
50.       $loan.compatibleCitizenships contains citizenship)
51.  then
52.    feedbackService.addFeedback($person,
53.    String.format("You qualify for '%s'! One of our agents will contact you shortly.", $loan.getName()));
54. end

Here, at lines 48 to 50, the 4 constraints are separated by commas, which act as the logical AND in Drools. As you can see on line 50, DRL offers some convenience operators for working with Collections and Maps, like contains, memberOf, in, etc.

3.2 Domain Specific Language – DSL

DSLs can serve as a layer of separation between rule authoring (and rule authors) and the
technical intricacies resulting from the modelling of domain object and the rule engine’s native
language and methods. If your rules need to be read and validated by domain experts (such as
business analysts, for instance) who are not programmers, you should consider using a DSL; it
hides implementation details and focuses on the rule logic proper. Using DSL has no impact on performance because it works by changing the parsing at compile time.

Here is what the first two DRL rules we looked at look like when using a DSL. This is from src/main/rules/LoanAdvisor.dslr:

 1. package loan.advisor.dslr
 2.
 3. import loan.advisor.dto.Person
 4. import loan.advisor.dto.LoanType
 5. import loan.advisor.services.FeedbackService
 6.
 7. global FeedbackService feedbackService
 8.
 9. expander dictionary.dsl
 10.
 11. rule "Underage Persons"
 12.  salience 100
 13.  when
 14.    There is a person younger than 18
 15.  then
 16.    Use feedbackService to tell that person "you have to grow up first!"
 17.    Exclude that person
 18. end
 19.
 20. rule "Above maximum age limit"
 21.  when
 22.    For an active loan type
 23.    A person is too old
 24.  then
 25.    Use feedbackService to give this negative feedback "you exceed the maximum age limit"
 26. end

The magic is done by creating another text file dictionary.dsl to describe our domain specific language and by declaring it as an expander for the rule file (line 9).

A DSL file contains a set of expressions and their DRL equivalents. For instance:

[when]There is a person younger than {minAge}=$person:Person(age < {minAge})
[then]Exclude that person=retract($person);
[then]Use {feedbackService} to tell that person {message}=feedbackService.addFeedback($person, {message});

The part enclosed in square brackets specifies whether the statement can be used in the LHS ([when]) or in the RHS ([then]). Next comes the statement, which can contain placeholders enclosed in curly braces. The value of a placeholder such as {minAge} is taken from the rules file. What is to the left of the equals sign is translated at compile time into the DRL expression to its right.
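
The substitution idea can be sketched in a few lines of Java. This is only an approximation for illustration: the expand method, its regex-based matching and the null return value are assumptions of the sketch, and the real Drools expander is considerably more capable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy version of DSL expansion: one mapping applied to one line of a rule. */
class DslExpansionSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    /** Returns the expanded DRL text, or null when the mapping does not apply to the line. */
    static String expand(String dslExpression, String drlTemplate, String ruleLine) {
        // Turn the DSL expression into a regex: literal text is quoted,
        // and each {placeholder} becomes a capturing group.
        List<String> names = new ArrayList<>();
        StringBuilder regex = new StringBuilder();
        Matcher placeholders = PLACEHOLDER.matcher(dslExpression);
        int last = 0;
        while (placeholders.find()) {
            regex.append(Pattern.quote(dslExpression.substring(last, placeholders.start())));
            regex.append("(.+)");
            names.add(placeholders.group(1));
            last = placeholders.end();
        }
        regex.append(Pattern.quote(dslExpression.substring(last)));

        Matcher line = Pattern.compile(regex.toString()).matcher(ruleLine.trim());
        if (!line.matches()) {
            return null;
        }
        // Copy each captured value into the DRL side of the mapping.
        String result = drlTemplate;
        for (int i = 0; i < names.size(); i++) {
            result = result.replace("{" + names.get(i) + "}", line.group(i + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(expand(
            "There is a person younger than {minAge}",
            "$person:Person(age < {minAge})",
            "    There is a person younger than 18"));
    }
}
```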

Here is the rule that selects the persons valid for a loan type:

 44. rule "Valid for loan"
 45.  when
 46.    For an active loan type
 47.    There is a person that
 48.     - is at least 18 years old
 49.     - is not too old for the loan
 50.     - earns more than or equal to the minimum income for the loan
 51.     - has compatible citizenship with the loan
 52.  then
 53.    Use feedbackService to give positive feedback
 54. end

And here is the LHS in the DSL:

 6.  [when]There is a person that=$person:Person()
 7.  [when]- is at least {minAge} years old=age >= {minAge}
 8.  [when]- is not too old for the loan=age <= $loan.maxAge
 9.  [when]- earns more than or equal to the minimum income for the loan=income >= $loan.minIncome
 10. [when]- has compatible citizenship with the loan=$loan.compatibleCitizenships contains citizenship

The - after the [when] signals to Drools that the condition refers only to the fields of the previously selected object and is added as a constraint to its pattern. It is a way to make the code more readable.

4. Running the rules

4.1. Adding the Eclipse plugin

Drools provides a plugin for Eclipse that will let you handle dependencies like the required jars of a project and will give you a special perspective with special editors, automatic rule validation and even debugging options. Here are the steps to get a functional Drools environment in your Eclipse:
a. install GEF plugin: Help->Software updates…->Available Software->Add Site… from the help menu. Location is: http://download.eclipse.org/tools/gef/updates/releases/
b. download
http://downloads.jboss.org/drools/release/5.5.0.Final/droolsjbpm-tools-distribution-5.5.0.Final.zip
Unzip it and then follow the steps in ReadMeDroolsJbpmTools.txt
c. from the Preferences window, select the newly added option Drools – Installed Drools Runtimes; click “Add…” and then “Create a new Drools 5 Runtime …” from the newly opened window

If anything goes wrong you can always consult the documentation on http://www.jboss.org/drools .
You will now have a Drools perspective and by creating a new Drools project you will also get the sources for a Hello World example.

4.2 Running the example

This part discusses the LoanAdvisor project you can download from here: http://www.fileswap.com/folder/eQgRngeY/

So far, we described how it all should work, but we didn’t see the rule engine in action. The following code will change this:

 1. // load up the knowledge base
 2.  KnowledgeBase kbase = readDslKnowledgeBase();
 3.
 4.  StatefulKnowledgeSession ksession = kbase
 5.         .newStatefulKnowledgeSession();
 6.  KnowledgeRuntimeLogger logger = KnowledgeRuntimeLoggerFactory
 7.      .newFileLogger(ksession, "loanAdvisor");
 8.
 9.  ksession.setGlobal("feedbackService", new FeedbackService());
 10. populateSession(ksession);
 11.
 12. // go !
 13. ksession.fireAllRules();

It all starts from a KnowledgeBase object that contains our rules. From this we need a session, which can be a stateless one if we don’t need to use the inference engine, or a stateful one for longer-lived, iterative processing. Notice line 9, where we link the global variable to a Java object.

The knowledge base instantiation is pretty straightforward: it comes down to using a KnowledgeBuilder to load and parse the rule files into so-called knowledge packages:

private static KnowledgeBase readDslKnowledgeBase() throws Exception {
  KnowledgeBuilder kbuilder = KnowledgeBuilderFactory
                                 .newKnowledgeBuilder();
  // load the DSL dictionary first, then the rule file that is expanded with it
  kbuilder.add(ResourceFactory.newClassPathResource("dictionary.dsl"),
                                 ResourceType.DSL);
  kbuilder.add(ResourceFactory.newClassPathResource("LoanAdvisor.dslr"),
                                 ResourceType.DSLR);
  // stop early if any rule failed to compile
  KnowledgeBuilderErrors errors = kbuilder.getErrors();
  if (errors.size() > 0) {
     for (KnowledgeBuilderError error : errors) {
       System.err.println(error);
     }
     throw new IllegalArgumentException("Could not parse knowledge.");
  }
  // the compiled packages are what the knowledge base holds
  KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase();
  kbase.addKnowledgePackages(kbuilder.getKnowledgePackages());
  return kbase;
}

The populateSession method populates the session object with the objects that we want in our working memory. For this test we’ll insert 6 persons and 4 loan types:

Name                    Maximum age   Compatible citizenships   Minimum income (€)   Is loan active?
The Standard Credit     65            UK, French                90,000               true
The Platinum Credit     80            UK, French                190,000              true
The Student Loan        23            UK, French                25,000               true
The Poor Student Loan   20            UK, French                12,000               false

Here is a basic output of the FeedbackService after running all the rules:

TOM HOPPER, aged 17, citizen of UK, yearly income of 0.00 €:
 You have to grow up first!

MIHAI IONESCU, aged 33, citizen of Romania, yearly income of 80,000.00 €:
 You do NOT qualify for 'The Student Loan': this loan is available only in [France, UK] .
 You do NOT qualify for 'The Student Loan': you are older than 23.
 You do NOT qualify for 'The Platinum Credit': this loan is available only in [France, UK] .
 You do NOT qualify for 'The Platinum Credit': you earn less than 190,000.00 €.
 You do NOT qualify for 'The Standard Credit': this loan is available only in [France, UK] .
 You do NOT qualify for 'The Standard Credit': you earn less than 90,000.00 €.

JANE MILLER, aged 35, citizen of UK, yearly income of 95,000.00 €:
 You do NOT qualify for 'The Student Loan': you are older than 23.
 You do NOT qualify for 'The Platinum Credit': you earn less than 190,000.00 €.
 You qualify for 'The Standard Credit'! One of our agents will contact you shortly.

JHON ADAMS, aged 21, citizen of UK, yearly income of 24,000.00 €:
 You do NOT qualify for 'The Student Loan': you earn less than 25,000.00 €.
 You do NOT qualify for 'The Platinum Credit': you earn less than 190,000.00 €.
 You do NOT qualify for 'The Standard Credit': you earn less than 90,000.00 €.

MIKE MORRISON, aged 71, citizen of UK, yearly income of 190,000.00 €:
 You do NOT qualify for 'The Student Loan': you are older than 23.
 You qualify for 'The Platinum Credit'! One of our agents will contact you shortly.
 You do NOT qualify for 'The Standard Credit': you are older than 65.

JEAN ALESSI, aged 25, citizen of France, yearly income of 86,000.00 €:
 You do NOT qualify for 'The Student Loan': you are older than 23.
 You do NOT qualify for 'The Platinum Credit': you earn less than 190,000.00 €.
 You do NOT qualify for 'The Standard Credit': you earn less than 90,000.00 €.

Java Threads – the important stuff

There are two basic strategies for using Thread objects to create a concurrent application.

  • To directly control thread creation and management, simply instantiate Thread each time the application needs to initiate an asynchronous task.
  • To abstract thread management from the rest of your application, pass the application’s tasks to an executor.

Starting a thread:

new MyThread().start()   (MyThread extends Thread)

new Thread(runnable).start()   (runnable implements Runnable)

Declaring a thread:

  • extend Thread, override run() method
  • implement Runnable, implement run() method

Issues

Due to shared memory space, threads can interleave

– interference: 2 threads concurrently modifying the same object produce an unpredictable outcome

– memory inconsistency: 2 threads concurrently using the same object, one modifying and one reading its state produce an unpredictable outcome

synchronized

– a monitor mechanism using a lock object

– problems caused: deadlock, starvation
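
A minimal sketch of how synchronized prevents interference: two threads increment a shared counter, and because both accessor methods are synchronized on the same object, no update is lost. The class and method names are made up for the example.

```java
/** Two threads incrementing one counter; synchronized keeps the updates from interleaving. */
class SynchronizedCounterDemo {

    private int count;

    // count++ is a read-modify-write; synchronized makes it mutually exclusive.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    static int runDemo() {
        SynchronizedCounterDemo counter = new SynchronizedCounterDemo();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                counter.increment();
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 20000
    }
}
```

Without the synchronized keyword on increment(), the final value could be anything up to 20000, because the two threads may overwrite each other's updates.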

The monitor mechanism

– a monitor is a mechanism that implements thread safety and it consists of

  • lock object (mutex) that ensures mutually exclusive access for a thread to a block of code
  • condition variables – container of threads that acquired the same lock object and that are either waiting for a notification that a condition is fulfilled (Object.wait / Condition.await) or can notify other threads that that condition is fulfilled (Object.notify / Condition.signal)
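
The lock plus condition pair can be illustrated with a one-slot mailbox built on Object.wait and Object.notifyAll. This is a sketch with hypothetical names; in real code a BlockingQueue would normally be used instead.

```java
/** One-slot mailbox: the intrinsic lock guards the slot, wait/notifyAll signal its state. */
class OneSlotMailbox {

    private String message; // null means the slot is empty

    synchronized void put(String newMessage) throws InterruptedException {
        while (message != null) {
            wait(); // releases the lock until another thread signals a change
        }
        message = newMessage;
        notifyAll();
    }

    synchronized String take() throws InterruptedException {
        while (message == null) {
            wait();
        }
        String result = message;
        message = null;
        notifyAll();
        return result;
    }

    static String demo() {
        OneSlotMailbox mailbox = new OneSlotMailbox();
        Thread producer = new Thread(() -> {
            try {
                mailbox.put("hello");
                mailbox.put("world"); // waits until the first message has been taken
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            String result = mailbox.take() + " " + mailbox.take();
            producer.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints hello world
    }
}
```

The conditions are checked in a while loop rather than an if, so a thread that wakes up re-checks the state before proceeding.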

Atomic action

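= happens all at once: it cannot be interleaved with other actions, and its side effects are not visible until it completes

– reads and writes are atomic for reference variables and for all primitive variables except long and double

– reads and writes are atomic for all volatile variables, including long and double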

volatile

– changes to a volatile variable are always visible to other threads. What’s more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up to the change

– usually cheaper than synchronized, but it only guarantees visibility: compound actions such as count++ still need a lock or an atomic class
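
A typical use of volatile is a stop flag, sketched below with made-up names. The worker thread polls the flag, and because the flag is volatile, the write made by the main thread is guaranteed to become visible to it.

```java
/** A volatile stop flag: the worker is guaranteed to see the write made by another thread. */
class VolatileFlagDemo {

    private static volatile boolean running = true;

    static boolean demo() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy work; the loop re-reads the volatile flag on every pass
            }
        });
        worker.start();
        try {
            Thread.sleep(50);
            running = false;   // visible to the worker because the field is volatile
            worker.join(2000); // wait up to two seconds for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + demo());
    }
}
```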

java.util.concurrent

interface Lock

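– works much like the synchronized keyword, but locking and unlocking are explicit method calls

– plus: can try to lock without blocking (tryLock), can give up waiting for the lock (tryLock with a timeout, lockInterruptibly), and implementations such as ReentrantLock expose information about the state of the lock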
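
A short sketch of the non-blocking attempt, using ReentrantLock as the Lock implementation. Note that isLocked() belongs to ReentrantLock rather than to the Lock interface; the class name and the returned summary string are made up for the example.

```java
import java.util.concurrent.locks.ReentrantLock;

/** tryLock() lets a thread give up immediately instead of blocking on a held lock. */
class TryLockDemo {

    static String demo() {
        ReentrantLock lock = new ReentrantLock();
        boolean[] otherAcquired = new boolean[1];
        lock.lock();
        try {
            Thread other = new Thread(() -> {
                // Returns false right away because the main thread holds the lock.
                if (lock.tryLock()) {
                    try {
                        otherAcquired[0] = true;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            other.start();
            other.join();
            // isLocked() is a ReentrantLock method, not part of the Lock interface.
            return "locked=" + lock.isLocked() + ", otherAcquired=" + otherAcquired[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints locked=true, otherAcquired=false
    }
}
```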

interface Executor

– runs submitted tasks and hides thread creation and management: executorObj.execute(runnableObj);
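
A minimal sketch of handing tasks to an executor instead of creating threads by hand. The pool size, task count and timeout are arbitrary values chosen for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Five tasks handed to a pool of two threads through Executor.execute. */
class ExecutorDemo {

    static int demo() {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 5; i++) {
            // The executor decides which pool thread runs each task.
            executor.execute(completed::incrementAndGet);
        }
        executor.shutdown(); // accept no new tasks, let the queued ones finish
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5
    }
}
```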

concurrent collections:

–  provide atomic operations on collections

CopyOnWriteArraySet, CopyOnWriteArrayList

– write creates a clone of the collection so Iterators see snapshots

ConcurrentSkipListSet

– concurrent sorted set

ConcurrentMap

– putIfAbsent, remove, replace

BlockingQueue

– extends Queue with operations that wait for the queue to become non-empty when retrieving an element, and for space to become available when storing one
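
The following sketch exercises a few of the operations listed above on a ConcurrentHashMap and an ArrayBlockingQueue. It runs in a single thread so that the results are easy to follow; the keys, values and queue capacity are arbitrary.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Atomic check-then-act on a ConcurrentMap and non-waiting calls on a BlockingQueue. */
class ConcurrentCollectionsDemo {

    static String mapDemo() {
        ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();
        hits.putIfAbsent("home", 1);  // inserted: the key was absent
        hits.putIfAbsent("home", 99); // ignored: the key is already present
        hits.replace("home", 1, 2);   // replaced: the current value was 1
        hits.replace("home", 1, 3);   // ignored: the current value is no longer 1
        return hits.toString();
    }

    static String queueDemo() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        boolean first = queue.offer("first");   // true: there was room
        boolean second = queue.offer("second"); // false: full, and offer does not wait
        String head = queue.poll();             // "first"
        String empty = queue.poll();            // null: empty, and poll does not wait
        return first + " " + second + " " + head + " " + empty;
    }

    public static void main(String[] args) {
        System.out.println(mapDemo());   // prints {home=2}
        System.out.println(queueDemo()); // prints true false first null
    }
}
```

The waiting counterparts of offer and poll are put and take, which block until space or an element becomes available.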

http://tutorials.jenkov.com/java-concurrency/deadlock-prevention.html

Git – how many unit tests were added in a java repo over a period of time

First step, I want to list all the diffs between two dates for a given file

git log --since="2012-10-11" --until="2012-10-12" -p README.md

Then, I want to count how many lines begin with the string “+This”.

grep -c --regexp="^+This" minp

outputs 4.

Let’s combine it with the first command:

git log --since="2012-10-11" --until="2012-10-12" -p README.md | grep -c --regexp="^+This"

Next I want to apply these commands to all files with a given name in
a tree of folders. I first want to make sure the script is working
without getting into much trouble counting strings in several files.
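So I’ll pass the README.md in the current folder, using ls. Since git log does not read file names from standard input, xargs turns them into arguments:

ls *md | xargs git log --since="2012-10-11" --until="2012-10-12" -p -- | grep -c --regexp="^+This"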

The output is 1 as expected.

To list all the README.md files recursively I’ll use the find command,
so to count all the times a line starting with “This” was added in a
period of time across all *md files I’ll do this:

find . -name "*md" | xargs git log --since="2012-10-11" --until="2012-10-12" -p -- | grep -c --regexp="^+This"
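
For readers who would rather do the counting step in Java than with grep, here is a rough sketch of the same idea. The method name, the sample diff text and the exclusion of the "+++" file header are all assumptions of the example; it only counts lines in a diff that is already available as a string and does not call git.

```java
/** Counts added lines in unified-diff text, similar in spirit to the grep -c step. */
class AddedLineCounter {

    /** Counts lines that start with "+" followed by marker, skipping the "+++" file header. */
    static int countAddedLines(String diff, String marker) {
        int count = 0;
        for (String line : diff.split("\n")) {
            if (line.startsWith("+" + marker) && !line.startsWith("+++")) {
                count++;
            }
        }
        return count;
    }

    static int demo() {
        // Made-up diff text standing in for the output of git log -p.
        String diff = String.join("\n",
            "--- a/README.md",
            "+++ b/README.md",
            "@@ -1,2 +1,3 @@",
            " Intro",
            "+This line was added",
            "-This line was removed",
            "+Another added line");
        return countAddedLines(diff, "This");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1
    }
}
```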