Monday, November 26, 2012

LINQ Outer Joins

I found out today how to do outer joins in LINQ and thought I’d save it as an Aide Memoire post.

Given a typical Foreign-Key relationship:

class Preference
{
    public int Id { get; set; }
    public string Description { get; set; }
}

class Person
{
    public int Id { get; set; }
    public string Forename { get; set; }
    public string Surname { get; set; }
    public int? PreferenceId { get; set; }
}

 

And this sample data:

    var peopleList = new List<Person>();
    peopleList.Add(new Person {Id = 77, Forename = "Fred", Surname = "Flintstone", PreferenceId = 2 });
    peopleList.Add(new Person {Id = 154, Forename = "Barney", Surname = "Rubble", PreferenceId = 1 });
    peopleList.Add(new Person {Id = 308, Forename = "Wilma", Surname = "Flintstone" });
    peopleList.Add(new Person {Id = 462, Forename = "Betty", Surname = "Rubble", PreferenceId = 1 });
    peopleList.Add(new Person {Id = 616, Forename = "Bam Bam", Surname = "Rubble", PreferenceId = 4 });
    peopleList.Add(new Person {Id = 770, Forename = "Pebbles", Surname = "Flintstone" });

    var preferenceList = new List<Preference>();
    preferenceList.Add(new Preference {Id = 1, Description = "Coffee"});
    preferenceList.Add(new Preference {Id = 2, Description = "Tea"});
    preferenceList.Add(new Preference {Id = 3, Description = "Hot Chocolate"});
    preferenceList.Add(new Preference {Id = 4, Description = "Fruit Juice"});

 

The obvious LINQ query:

    var peopleWithPreferences = from p in peopleList
                            join pr in preferenceList
                            on p.PreferenceId equals pr.Id
                            select new  {p.Forename, p.Surname, pr.Description};

 

Produces results that contain only the inner joined pairs of people and their beverage preference:

Forename  Surname     Description
Fred      Flintstone  Tea
Barney    Rubble      Coffee
Betty     Rubble      Coffee
Bam Bam   Rubble      Fruit Juice
 

In order to retrieve the full list of people with their beverage preference if they have one, the LINQ query becomes:

    var peopleAnyPreferences = from p in peopleList
                            join pr in preferenceList
                            on p.PreferenceId equals pr.Id into joinedPreferences
                            from j in joinedPreferences.DefaultIfEmpty()
                            select new  {p.Forename, p.Surname, pref = j != null ? j.Description : string.Empty};

 

Forename  Surname     Description
Fred      Flintstone  Tea
Barney    Rubble      Coffee
Wilma     Flintstone
Betty     Rubble      Coffee
Bam Bam   Rubble      Fruit Juice
Pebbles   Flintstone

 

The differences being the intermediate results joinedPreferences and its use with the DefaultIfEmpty extension, with a null check for the nullable column.


UPDATE: From a technique shown in this post by Jim Wooley:

A way of avoiding the need for intermediate results, using old-fashioned T-SQL style joins via a LINQ Where extension:

var betterPeopleAnyPreferences = from p in peopleList 
                    from pr in preferenceList
                    .Where(x => p.PreferenceId == x.Id)
                    .DefaultIfEmpty()
                    select new  {p.Forename, p.Surname, pref = pr != null ? pr.Description : string.Empty};