White Horse Software

November 2023

I've been using Entity Framework professionally for around seven years now. There are other ORMs available, but EF is the most popular - not least because it is backed by Microsoft's own documentation. Overall it is a useful and flexible way to interact with the database from the familiarity of .NET code. Database tables are exposed as .NET classes, and columns (and their types) are exposed as .NET properties.

Like any other tool that abstracts underlying functionality, the benefits come with tradeoffs. How bothersome these are depends a great deal on project demands and on personal (and team) preferences. Personally I haven't found many gripes, but a few have persisted across projects and even across companies. At the top of the gripe list is the way EF handles multiple entities in one result. This functionality falls under an EF concept called "filtered includes". What follows is a list of the reasons filtered includes have made it to the top of the gripe list.

Filtered includes do not come close to having the power of typical SQL queries

Filtered includes don't handle complex relationships well. Raw SQL queries do a much better job. Of course some flavors of SQL do this better than others: some may have windowing functions while others don't, etc. But one of the main jobs of an ORM like EF is to avoid writing these SQL statements. If a complex relationship needs to be part of the query, filtered includes - and thus EF - won't handle it.

Filtered includes are not unit-testable

Even though filtered includes look like LINQ, they don't give the same results that typical LINQ operations would. This means the filtered includes give unexpected results in unit testing. Stated more seriously, filtered includes aren't unit testable.

One way around this is by growing the unit test so that it ties into an actual test database. This leads to a ton of boilerplate and less maintainability. Remember, setup code within a unit test can itself be a source of bugs. Personally I like to keep unit tests small and tight, to stay focused on a single principle and limit the places where a bug could creep in. But EF filtered includes do not allow this.

For an example of what this looks like, consider a scenario where a Computer has a collection of Parts. But the relationship between Computer and Part is slightly complicated: only a Part with a PartId greater than 2 can become a part in a Computer:
```
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// computer entity with implied PK-FK relationship to part
public class Computer
{
    public List<Part> Parts { get; set; }
}

public class Part
{
    public int PartId { get; set; }
}

// ...
// filtered include, where only parts greater than 2 are allowed
public static IQueryable<Computer> IncludesSpecificPart(IQueryable<Computer> computers)
{
    return computers.Include(computer =>
        computer.Parts.Where(part => part.PartId > 2));
}
```
The filtered include is in a method that easily allows unit testing without needing any database access. In fact, this method allows easy setup of a number of scenarios that could be tested, since the scenario can be modified by simply changing the input:
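```
static bool TestExpectedPartsExcluded(
    Func<IQueryable<Computer>, IQueryable<Computer>> uut)
{
    // one computer whose only part (PartId = 1) should be filtered out
    var inputs = new List<Computer>
    {
        new()
        {
            Parts = new()
            {
                new(){PartId = 1}
            }
        }
    };

    var outputs = uut(inputs.AsQueryable());

    // a filtered include narrows the loaded Parts, not the computers,
    // so inspect the parts: true means at least one part survived
    return outputs.SelectMany(computer => computer.Parts).Any();
}
```
Unfortunately, this doesn't work. Whereas a normal LINQ statement (which filtered includes emulate) would correctly filter things out, here the result is always true as long as the input contains any part at all, whether or not those parts meet the filter criteria. The reason is that an in-memory IQueryable has no EF query provider behind it, so Include has nothing to translate and hands the source back unchanged; the Where inside it never runs.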

Now, what is especially tricky about this is that this method does work when connected to a database context. But for unit testing, it gives false positives (or, depending on the test, false negatives).
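
For contrast, the same filter written as an ordinary LINQ projection behaves the same way in a unit test as it reads, because nothing in it depends on EF. This is only a sketch: `ComputerQueries` and `WithSpecificParts` are hypothetical names, the entity classes are repeated (with an initialized list) just so the block stands alone, and how a particular EF provider handles this projection is not something the sketch verifies.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Part
{
    public int PartId { get; set; }
}

public class Computer
{
    public List<Part> Parts { get; set; } = new();
}

public static class ComputerQueries
{
    // Hypothetical helper: plain LINQ only, no Include involved.
    // Each computer is projected to a copy whose Parts list
    // holds only the parts with PartId greater than 2.
    public static IQueryable<Computer> WithSpecificParts(IQueryable<Computer> computers)
    {
        return computers.Select(computer => new Computer
        {
            Parts = computer.Parts.Where(part => part.PartId > 2).ToList()
        });
    }
}
```

Fed a computer whose only part has PartId = 1, this returns that computer with an empty Parts list, which is the behavior the filtered include only shows when a database context is involved.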

There are other options besides Filtered Includes - but they are all terrible

Here is a list of options, should you need a complex relationship but decide not to use filtered includes:
- If operations are needed on a child relationship, turn the query on its head so the child is actually the top-most parent
  Unfortunately this just means the problem is reversed, and the parent-now-child can't be operated on properly.
- Perform separate queries on each entity that needs to be operated on
  If filtered includes aren't an option (because tracking is needed), and operations need to be performed on entities at more than one relationship, this may be the only option.
  
  Although this does simplify things up front, by causing operations and queries to be focused on one entity (which EF really encourages), it means things will be quite painful later when trying to combine things. The combination must happen either via .Join and .Select (which requires hand-writing something to match existing entities) or in memory, using typical C# code and objects.
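
To give a feel for the "combine in memory" half of that second option, here is a minimal sketch. Everything in it is an assumption for illustration: the `ComputerRow`, `PartRow` and `ComputerWithParts` records, the `InMemoryCombiner` class, and in particular the `ComputerId` foreign key on the part, which the earlier entity classes only imply. The two input sequences stand in for the results of two separate queries.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for the results of two separate queries.
public record ComputerRow(int ComputerId);
public record PartRow(int PartId, int ComputerId);

// Hand-written type that stitches the two results back together.
public record ComputerWithParts(ComputerRow Computer, IReadOnlyList<PartRow> Parts);

public static class InMemoryCombiner
{
    public static List<ComputerWithParts> Combine(
        IEnumerable<ComputerRow> computers,
        IEnumerable<PartRow> parts)
    {
        // Group the parts by their owning computer once, up front.
        var partsByComputer = parts.ToLookup(part => part.ComputerId);

        // A lookup yields an empty sequence for a missing key,
        // so computers without parts simply get an empty list.
        return computers
            .Select(computer => new ComputerWithParts(
                computer,
                partsByComputer[computer.ComputerId].ToList()))
            .ToList();
    }
}
```

Even in this small form, the pain point shows: the combining type and the matching logic are written by hand, which is exactly the work an ORM is supposed to take over.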

Filtered includes are only guaranteed to work if used with .AsNoTracking()

The following caution box is at the bottom of the documentation for filtered includes:
In case of tracking queries, results of Filtered Include may be unexpected due to navigation fixup. All relevant entities that have been queried for previously and have been stored in the Change Tracker will be present in the results of Filtered Include query, even if they don't meet the requirements of the filter. Consider using NoTracking queries or re-create the DbContext when using Filtered Include in those situations.
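
Following that advice, a no-tracking version of the earlier filtered include might look like the sketch below. It assumes the Microsoft.EntityFrameworkCore package is referenced; `NoTrackingQueries` and `IncludesSpecificPartNoTracking` are hypothetical names, and the `ComputerId` key is an assumption added so the entity has an obvious primary key.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Part
{
    public int PartId { get; set; }
}

public class Computer
{
    public int ComputerId { get; set; } // assumed key, not in the original classes
    public List<Part> Parts { get; set; } = new();
}

public static class NoTrackingQueries
{
    // Hypothetical helper: opts out of change tracking before applying
    // the filtered include, as the documentation's caution suggests.
    public static IQueryable<Computer> IncludesSpecificPartNoTracking(
        IQueryable<Computer> computers)
    {
        return computers
            .AsNoTracking()
            .Include(computer => computer.Parts.Where(part => part.PartId > 2));
    }
}
```

The cost is the one already mentioned: entities returned this way are not tracked, so this is not an option when tracking is needed.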

Runtime errors indicate if a filtered include isn't set up properly

This one would be less of a gripe if it weren't for the fact that filtered includes are part of an ORM, which should wrap database schemas and values. In other words, the database context should provide an indication things are good before runtime, not fail during runtime.

Of course, someone may point out that setup, especially database setup, is a requirement for any code that interacts with outside data, ORM or not. That's true, but my point is that filtered includes fail at runtime on things that should be caught at compile time. For example, a big gotcha in this area is making sure the filter on each navigation is unique.

Consider this:

```
parents
    .Include(parent => parent.Children.Where(child => child.IsWanted))
    .ThenInclude(child => child.Grades)
```

...then later this is done (perhaps in an attempt to reuse code, or to split queries into something more unit-testable):

```
parents
    .Include(parent => parent.Children.Where(child => child.IsWanted))
    .ThenInclude(child => child.Toys)
```

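...this is fragile. The Children navigation now carries a filter in two places: once on the way to grades and once on the way to toys. According to the EF documentation, each included navigation allows only one unique set of filter operations, so the combined query is accepted only while both filters are exactly the same. As soon as one copy drifts (say, an extra condition is added to one Where but not the other), things will still compile and the program will run, but a runtime error will be thrown when the query is used.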
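
The EF documentation describes the rule as follows: when the same collection navigation is included more than once, the filter may be specified on only one of the Includes, or it must be exactly the same on each. A sketch of the first form, with the filter written in one place only, is below. It assumes the Microsoft.EntityFrameworkCore package is referenced; the entity classes, their key properties and the `ParentQueries` helper are hypothetical, filled in from the navigation names used above.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Grade { public int GradeId { get; set; } }
public class Toy { public int ToyId { get; set; } }

public class Child
{
    public int ChildId { get; set; }
    public bool IsWanted { get; set; }
    public List<Grade> Grades { get; set; } = new();
    public List<Toy> Toys { get; set; } = new();
}

public class Parent
{
    public int ParentId { get; set; }
    public List<Child> Children { get; set; } = new();
}

public static class ParentQueries
{
    // Hypothetical helper: the Children filter appears exactly once;
    // the second Include of Children carries no filter of its own.
    public static IQueryable<Parent> WantedChildrenWithGradesAndToys(
        IQueryable<Parent> parents)
    {
        return parents
            .Include(parent => parent.Children.Where(child => child.IsWanted))
                .ThenInclude(child => child.Grades)
            .Include(parent => parent.Children)
                .ThenInclude(child => child.Toys);
    }
}
```

This keeps the filter in a single spot, but it also means the whole chain has to live in one method, which works against the goal of splitting queries into small reusable pieces.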

Conclusion: Entity Framework is wonderful for dealing with database entities singly, or with very simple scenarios like those given in the beginner's documentation. But it has a long way to go to gracefully handle complicated relationships between entities. The designated way of doing this, using filtered includes, has a lot of gotchas. Those who are used to querying their favorite relational database directly will feel especially constrained.
