23/04/2015

How to build predicates dynamically using expression trees

Home

I'm working at the application which finds so called execution patterns in logs recorded by IntelliTrace historical debugger. An execution pattern is a sequence of methods calls that is widely used in the application and it is a kind of automatically generated documentation. The part of the algorithm is filtering of found patterns based on criteria like the length of a pattern or the number of different methods in a pattern.

At the beginning I used only 2 criteria so it was easy to handle all possible combinations of them i.e. use the first criterion, use the second criterion, use both and used none. Then I added 3rd criterion and I thought that for 3 criteria I still don't need a generic mechanism. However, shortly it turned out that I want to handle 5 criteria what gives 32 of possible combinations. This time I did it once and for all.

I decided to use expression trees to dynamically build an expression that verifies any combination of criteria. The code is quite simple. Firstly we need an enum for all criteria.
[Flags]
public enum Crieria : byte
{
    None = 0,
    CriterionOne = 1,
    CriterionTwo = 2,
    All = CriterionOne | CriterionTwo
}
We also need a class that will represent patterns.
public class Pattern
{
    public int FieldOne { get; set; }
    public int FieldTwo { get; set; }
}
Now we can write a code that will dynamically build needed expressions. I assumed that every criterion has a corresponding static method that knows how to check if a current pattern fulfils it or not. The final expression produced by CreateExpression method will be of the following form pattern => predicate1(pattern) && predicate2(pattern) && predicate3(pattern)....
public static class FilterBuilder
{
    public static Func<Pattern, bool> CreateExpression(Crieria filteringMode)
    {
        var param = Expression.Parameter(typeof(Pattern));

        var subExpressions = new List<MethodCallExpression>();

        if ((filteringMode & Crieria.CriterionOne) != 0)
            subExpressions.Add(Expression.Call(typeof(FilterBuilder), nameof(CriterionOnePredicate), null, param));

        if ((filteringMode & Crieria.CriterionTwo) != 0)
            subExpressions.Add(Expression.Call(typeof(FilterBuilder), nameof(CriterionTwoPredicate), null, param));

        //Other criteria...

        if (subExpressions.Count == 0)
            return p => true;

        Expression finalExpression = subExpressions[0];
        for (var i = 1; i < subExpressions.Count; ++i)
            finalExpression = Expression.And(finalExpression, subExpressions[i]);

        return Expression.Lambda<Func<Pattern, bool>>(finalExpression, param).Compile();
    }

    public static bool CriterionOnePredicate(Pattern p)
    {
        return p.FieldOne > 0;
    }

    public static bool CriterionTwoPredicate(Pattern p)
    {
        return p.FieldTwo < 0;
    }
}
The code can be made even more generic but I'll leave it as an exercise. When I finished this code I started to worry about performance. It is critical for me because my application needs to process large amount of patterns efficiently. I made the following simple test in which dynamically generated and static functions are executed 1 million times.
var iterations = 1000000;

var predicate = FilterBuilder.CreateExpression(Crieria.All);
MeasureIt<Pattern>((p) => predicate(p), new Pattern(), iterations);

predicate = FilterBuilder.CreateExpression(Crieria.CriterionOne);
MeasureIt<Pattern>((p) => predicate(p), new Pattern(), iterations);

MeasureIt<Pattern>((p) =>
{
   FilterBuilder.CriterionOnePredicate(p);
   FilterBuilder.CriterionTwoPredicate(p);
}, new Pattern(), iterations );

MeasureIt<Pattern>((p) => FilterBuilder.CriterionOnePredicate(p), new Pattern(), iterations);
In order to measure time of calculations I used MeasureIt method from my earlier post and I received the following results:
Total time: 54
Total time: 27
Total time: 18
Total time: 12
Dynamically generated predicates are 2-3 times slower than static ones. However, we are still talking here about dozens of milliseconds in order to make 1 million calls. For me it is acceptable.

0 comments:

Post a Comment