.NET Include – Does it use Left Joins?
As expected in any good relational database you will have tables where data is related through foreign keys. Sometimes getting that data through LINQ and .NET isn’t always easy as you must use joins or includes to retrieve related data.
In a recent project, we had a C# object where we wanted to include data from a related table. We used this object in other projects to pass around via JSON but in this method, we wanted to have all the data ready when queried from the database.
Trying to use the .NET .Include() method to pull in this related data was causing us issues. Here is an example of the class setup we had.
internal class Product
{
public int ProductID { get; set; }
public string ProductName { get; set; }
public Category Category { get; set; }
}
internal class Category
{
public int CategoryId { get; set; }
public string Name { get; set; }
}
Nothing special, basically a Product class that had some categories. But the odd thing is that whenever we were querying the Product data that didn’t have a category with a .Include() it would always return null.
See this example of the issue
List<Category> categories = new List<Category>();
Category category1 = new Category()
{
CategoryId = 1,
Name = "Category 1"
};
categories.Add(category1);
Category category2 = new Category()
{
CategoryId = 2,
Name = "Category 2"
};
categories.Add(category2);
List<Product> productList = new List<Product>();
Product product = new Product();
product.ProductID = 123;
product.ProductName = "Product 123";
productList.Add(product);
myDbContext myDbContext = new myDbContext(new DbContextOptionsBuilder<DbContext>().UseInMemoryDatabase("mockDB").Options);
myDbContext.Products.AddRange(productList);
myDbContext.SaveChanges();
var myfirstproduct = myDbContext.Products.Where(x => x.ProductID == 123).Include(y => y.Category).FirstOrDefault();
Console.WriteLine(myfirstproduct.ProductName);
If I ran this code as is I would get the dreaded “Object reference not set to an instance of an object” because myfirstproduct was null.
But looking at everything I read online using the .Include() on an IQueryable should do a LEFT JOIN when interpreting the SQL.
By making one small change to the code I could get the value to come back:
List<Category> categories = new List<Category>();
Category category1 = new Category()
{
CategoryId = 1,
Name = "Category 1"
};
categories.Add(category1);
Category category2 = new Category()
{
CategoryId = 2,
Name = "Category 2"
};
categories.Add(category2);
List<Product> productList = new List<Product>();
Product product = new Product();
product.ProductID = 123;
product.ProductName = "Product 123";
//*********************************************
product.Category = category1; //<-gave this product a category
//*********************************************
productList.Add(product);
myDbContext myDbContext = new myDbContext(new DbContextOptionsBuilder<DbContext>().UseInMemoryDatabase("mockDB").Options);
myDbContext.Products.AddRange(productList);
myDbContext.SaveChanges();
var myfirstproduct = myDbContext.Products.Where(x => x.ProductID == 123).Include(y => y.Category).FirstOrDefault();
Console.WriteLine(myfirstproduct.ProductName);
By giving my product a category all is well with retrieving the product name of myfirstproduct.
This led me to scour why the .Include() worked this way.
It seems in order for this to work correctly. The foreign relationship, or the object being referenced in the .Include() method, needs to be nullable.
If the object is not nullable it forces the join to be an INNER JOIN as .NET thinks the object must exist since it is not nullable. This makes sense to me. So by making this small change to my class. Everything returns whether a category exists on the product or not.
internal class Product
{
public int ProductID { get; set; }
public string ProductName { get; set; }
public Category? Category { get; set; }
}
internal class Category
{
public int CategoryId { get; set; }
public string Name { get; set; }
}