Entity Framework, MVC, Repositories, Code First Migrations: Putting it together (part 1)

I love Microsoft’s Entity Framework and the MVC framework. Put together, they make my work as a C# developer a lot easier because they take care of a lot of the heavy lifting. As a bonus, they help make my applications more structured, more easily testable and more fun to work on.

But it took me a while to get where I wanted to be. As with all technologies, it takes time to get comfortable with them and to learn how to apply them well. This is partly because most tutorials focus on the details. They explain how to use a certain technology, pattern or framework. But it’s often hard to find a practical ‘use case’. How do you put it all together in an actual web application?

In this post, I would like to share what I’ve learned by showing you how I apply these technologies, patterns and ideas in a simple web application. The source code is available from BitBucket, so feel free to check it out and play with the code yourself. A published version of the application is available here. In upcoming posts, I will discuss other aspects of the source code: unit testing, dependency injection, Bootstrap, AppHarbor (continuous deployment), Git, etc.

The best way to read this article is to check out the source code with a Git client (like SourceTree) and run it in Visual Studio (2012/2013). You’ll see what the application does and how it is set up. Instead of describing in detail what I’ve done, I will focus on the bigger picture and talk about some highlights. Oh, and I’m going to assume that you are familiar with Entity Framework and MVC.

Running the website

First, check out the source code with a Git client (like SourceTree) to your local drive. Once downloaded, open the solution file with Visual Studio 2012/2013 and run it. If you have LocalDb installed as part of Visual Studio, a database will automatically be created and seeded with data.

If you don't have LocalDb, or it doesn't work, simply change the connection string in the web.config of the Website project to point to another SQL Server. Do make sure that the user account you use has CREATE DATABASE permissions:

<connectionStrings>
    <add name="database" connectionString="Server=localhost\SQLSERVER; Integrated Security=true; Database=Entity Framework with MVC;" providerName="System.Data.SqlClient" />
</connectionStrings>

Project structure; the basics

The solution consists of two main projects: Repositories and Website. There are also two projects with unit tests: Repositories.Tests and Website.Tests. I like to separate concerns, so I put my database logic in its own project. The Repositories project is a class library with a DataContext, my domain entities, migrations and a number of repositories that I use to manipulate the database (more about this below).

In the solution, I use NuGet to import packages for Entity Framework 6, Bootstrap, Ninject (for MVC) and Microsoft ASP.NET Web Optimization (for bundles). Of course, I always allow NuGet to automatically download these packages when they are missing. With this feature enabled, I don't have to worry about downloading the packages and I don't have to check the massive 'Packages' folder into BitBucket, saving a lot of valuable disk space. 

Entity Framework: Code First

I’m a big fan of Entity Framework's Code First approach. It allows me to write my domain entities as regular C# classes (so-called Plain Old CLR Objects, or POCOs) and define relationships between them:

public class Application
{
    [Key]
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public int Id { get; set; }
    public virtual Team Team { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public DateTime? Deleted { get; set; }

    [ForeignKey("Team")]
    public int TeamId { get; set; }
}

public class Team
{
    [Key]
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime? Deleted { get; set; }

    // navigation properties
    public virtual ICollection<Application> Applications { get; set; }
}

Although this may seem like more manual work than using the Entity Framework designer, it does give me more control over what’s happening. It also avoids tying my domain model directly to Entity Framework, which makes it easier to unit test my code and keep it flexible. With a bit of extra code, the above two classes provide Entity Framework with sufficient information to create my database and tables from scratch. In fact, this happens every time you start the application if the database does not yet exist.
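Once the database exists, the entities can be queried like ordinary C# objects. A quick sketch (using the DataContext class shown later in this post):

```csharp
using (var context = new DataContext())
{
    // Query the Applications table; navigation properties like Team
    // are lazy-loaded because they are declared virtual.
    var activeApplications = context.Applications
        .Where(a => a.Deleted == null)
        .ToList();

    foreach (var application in activeApplications)
    {
        Console.WriteLine("{0} (owned by {1})", application.Name, application.Team.Name);
    }
}
```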

And that really is the best thing about Entity Framework: it takes away the tedious work of creating tables, mapping them to C# classes and debugging the mappings. This requires a few things, which I will describe below. Note that the migrations scaffolding (including the Configuration class shown further below) is generated by Entity Framework when you run this command in the Package Manager Console in Visual Studio (View > Other Windows):

Enable-Migrations -StartUpProjectName "WebSite" -ProjectName "Repositories"

Anyway, the first thing we need is a DataContext class:

public class DataContext : DbContext, IDataContext
{
    public DataContext()
        : base("database")
    {
    }

    public new IDbSet<T> Set<T>() where T : class
    {
        return base.Set<T>();
    }

    public IDbSet<Application> Applications { get; set; }
    public IDbSet<Team> Teams { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        base.OnModelCreating(modelBuilder);
    }

    public void ExecuteCommand(string command, params object[] parameters)
    {
        base.Database.ExecuteSqlCommand(command, parameters);
    }
}
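
The custom IDataContext interface that the class implements is not shown above. A minimal sketch of what it might contain, inferred from how the context and the repositories use it (the actual interface in the source code may have more members):

```csharp
// Abstraction over the Entity Framework context, so that tests can
// substitute a fake or mock without touching a real database.
public interface IDataContext : IDisposable
{
    IDbSet<T> Set<T>() where T : class;
    int SaveChanges();
    void ExecuteCommand(string command, params object[] parameters);
}
```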

This class is pretty much the same for every project, except for the domain-specific properties Applications and Teams. It implements a custom IDataContext interface and inherits from Entity Framework's DbContext base class. The interface is there to allow me to inject mocked or faked data contexts for unit and regression testing purposes (more about that in another post). In the constructor, I pass the name of the connection string to use from the website's web.config file to the base class. Another important file is the Configuration class, which also lives in the Repositories project:

public sealed class Configuration : DbMigrationsConfiguration<DataContext>
{
    public Configuration()
    {
        AutomaticMigrationsEnabled = false;
        AutomaticMigrationDataLossAllowed = true;
    }

    protected override void Seed(DataContext context)
    {
        context.Teams.AddOrUpdate(p => p.Name,
            new Team() { Name = "Team A" },
            new Team() { Name = "Team B" });
        context.SaveChanges();

        context.Applications.AddOrUpdate(p => p.Name,
            new Application() { Name = "Application A",
                                TeamId = context.Teams.First(x => x.Name == "Team A").Id, 
                                Description = "This is application A" },
            new Application() { Name = "Application B", 
                                TeamId = context.Teams.First(x => x.Name == "Team B").Id, 
                                Description = "This is application B" },
            new Application() { Name = "Application C", 
                                TeamId = context.Teams.First(x => x.Name == "Team A").Id, 
                                Description = "This is application C" });
        context.SaveChanges();
    }
}

The Configuration class does two things: it configures Entity Framework Migrations and it specifies what data has to be seeded into the database. As you can see, I have disabled automatic migrations. Automatic migrations would allow EF to automatically update the database schema with changes that I make to my domain entities (Application, Team). So, if I were to add a new property called StartDate to Application, the corresponding column would automatically be added to the Application table the next time my application runs. Although this may be useful locally, I wouldn't recommend it for the following reasons:

  • Less control: You have less control over (data) migrations, as they happen automatically whenever the schema changes. Even when the changes are not stable yet;
  • Data loss: As a result, you may lose valuable data (locally or even on a production environment);

The alternative to automatic migrations is to make Entity Framework generate a migration for you whenever you change your domain entities and feel confident that the change is stable (i.e. what you wanted, bug-free, etc.). You can do this with the following command in the Package Manager Console window in Visual Studio:

Add-Migration [description] -StartUpProjectName "WebSite" -ProjectName "Repositories"

Entity Framework creates the migration by comparing the database schema with the domain entities that appear on the DataContext class (Team and Application). After creating the migration, I only have to call this command to create or update the database at the location specified by the connection string in the website's web.config:

Update-Database -StartUpProjectName "WebSite" -ProjectName "Repositories" -verbose

If this is your first migration (i.e. the database does not yet exist, or is empty), you create the initial migration with the following command:

Add-Migration InitialCreate -StartUpProjectName "WebSite" -ProjectName "Repositories"

The resulting migrations are stored as regular C# or VB code in the Migrations folder of the Repositories project. The generated ‘initial create’ migration looks like this:

public partial class InitialCreate : DbMigration
{
    public override void Up()
    {
        CreateTable(
            "dbo.Applications",
            c => new
                {
                    Id = c.Int(nullable: false, identity: true),
                    Name = c.String(),
                    Deleted = c.DateTime(),
                    TeamId = c.Int(nullable: false),
                })
            .PrimaryKey(t => t.Id)
            .ForeignKey("dbo.Teams", t => t.TeamId, cascadeDelete: true)
            .Index(t => t.TeamId);

        CreateTable(
            "dbo.Teams",
            c => new
                {
                    Id = c.Int(nullable: false, identity: true),
                    Name = c.String(),
                    Deleted = c.DateTime(),
                })
            .PrimaryKey(t => t.Id);

    }

    public override void Down()
    {
        DropForeignKey("dbo.Applications", "TeamId", "dbo.Teams");
        DropIndex("dbo.Applications", new[] { "TeamId" });
        DropTable("dbo.Teams");
        DropTable("dbo.Applications");
    }
}

You can see that there is an Up() and a Down() method. This allows Entity Framework to migrate your database up or down to a specific version, based on the chain of scripts in the Migrations folder. You can also migrate to a specific version by calling the aforementioned Update-Database command with the additional argument -TargetMigration [Name].

This approach to migrating and versioning your database is a good idea because it keeps a history of your database changes in your source code and - by extension - in your source code repository. Whenever you check out an older version of the source code, Entity Framework will automatically create the database as it was at that point in time (based on the migrations). Even better, by carefully sticking to migrations like this, you can update production environments programmatically. You can also extend the Up() and Down() methods with code that migrates your data or executes SQL commands.
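For example, a hypothetical migration that adds a StartDate column to Applications and backfills existing rows with a raw SQL command could look like this (the column and migration names are assumptions for illustration):

```csharp
public partial class AddApplicationStartDate : DbMigration
{
    public override void Up()
    {
        // Schema change: add a nullable StartDate column.
        AddColumn("dbo.Applications", "StartDate", c => c.DateTime(nullable: true));

        // Data migration: backfill existing rows as part of the same migration.
        Sql("UPDATE dbo.Applications SET StartDate = GETDATE() WHERE StartDate IS NULL");
    }

    public override void Down()
    {
        DropColumn("dbo.Applications", "StartDate");
    }
}
```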

Entity Framework: Automatic Database Initialization

Entity Framework’s ability to automatically update your database is very useful, especially when you are working locally. To make my life easier, I always allow Entity Framework to automatically update my database through a DatabaseInitializer that I specify in my web.config for maximum flexibility:

<entityFramework>
    <defaultConnectionFactory type="System.Data.Entity.Infrastructure.LocalDbConnectionFactory, EntityFramework">
        <parameters>
        <parameter value="v11.0" />
        </parameters>
    </defaultConnectionFactory>

    <!-- Migrate database to latest version (if you remove this, database migrations will not be applied) -->
    <contexts>
        <context type="ChristiaanVerwijs.MvcSiteWithEntityFramework.Repositories.DataContext, repositories">
        <databaseInitializer type="System.Data.Entity.MigrateDatabaseToLatestVersion`2[[ChristiaanVerwijs.MvcSiteWithEntityFramework.Repositories.DataContext, repositories], [ChristiaanVerwijs.MvcSiteWithEntityFramework.Repositories.Configuration, repositories]], EntityFramework" />
        </context>
    </contexts>
</entityFramework>

This registers the MigrateDatabaseToLatestVersion initializer that is part of Entity Framework; it is called every time the application is (re)started. It checks if there are any pending migrations (by looking in the __MigrationHistory table) and executes them one by one. The initializer uses the Configuration class and the associated seeding code.
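For reference, the same initializer can also be registered in code instead of through the web.config, for example in Global.asax's Application_Start, although that makes it harder to switch off per environment:

```csharp
// Code-based alternative to the web.config registration shown above.
Database.SetInitializer(
    new MigrateDatabaseToLatestVersion<DataContext, Configuration>());
```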

By using the web.config to set the initializer, I can turn this behavior off for the production environment. I do this by creating a web.config transform for production (e.g. Web.Release.config):

<?xml version="1.0"?>
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <!-- Remove automatic migration for EF so that the database is not updated or seeded automatically -->
  <entityFramework>
    <contexts xdt:Transform="Remove">
    </contexts>
  </entityFramework>
</configuration>

Entity Framework: Repositories

A repository is a design pattern intended to decouple database code from your business logic. This helps you achieve several goals. First, the decoupling makes your code easier to read and maintain, because database code is not spread throughout your application (separation of concerns). Second, the code becomes significantly easier to unit test. You can easily mock repositories while testing your business logic instead of having to set up databases, tables and seeding them with data. And third, your business logic does not depend strongly on a specific database platform.

There are blog posts that explain how to implement the Repository pattern in great detail (like this one), so I will not go into detail here. Instead, I will show you my implementation of a generic repository class and how I use it:

public class GenericRepository<T> : IGenericRepository<T> where T : class
{
    private IDataContext dataContext;

    public GenericRepository(IDataContext dataContext)
    {
        this.dataContext = dataContext;
    }

    public virtual IEnumerable<T> GetAll()
    {
        return this.DataSource();
    }

    public virtual T GetById(int id)
    {
        return this.dataContext.Set<T>().Find(id);
    }

    public virtual void InsertAndSubmit(T entity)
    {
        this.dataContext.Set<T>().Add(entity);
        this.SaveChanges();
    }

    public virtual void UpdateAndSubmit(T entity)
    {
        this.SaveChanges();
    }

    public virtual void DeleteAndSubmit(T entity)
    {
        this.dataContext.Set<T>().Remove(entity);
        this.SaveChanges();
    }

    public virtual void SoftDeleteAndSubmit(T entity)
    {
        if (typeof(T).GetProperty("Deleted") != null)
        {
            entity.GetType().GetProperty("Deleted").SetValue(entity, DateTime.Now, null);
            this.UpdateAndSubmit(entity);
        }
        else
        {
            throw new InvalidOperationException("This entity type does not support soft deletion. Please add a DateTime? property called Deleted and try again.");
        }

    }

    public void ExecuteCommand(string sql, params object[] parameters)
    {
        this.dataContext.ExecuteCommand(sql, parameters);
    }

    #region Private Helpers
    /// <summary>
    /// Returns an expression to use in expression trees, like Where statements.
    /// For example: query.Where(this.GetExpression("Deleted", null));
    /// </summary>
    /// <param name="propertyName">The name of the property to compare.</param>
    /// <param name="value">The value the property should equal.</param>
    private Expression<Func<T, bool>> GetExpression(string propertyName, object value)
    {
        var param = Expression.Parameter(typeof(T));
        var actualValueExpression = Expression.Property(param, propertyName);

        var lambda = Expression.Lambda<Func<T, bool>>(
            Expression.Equal(actualValueExpression,
                // Type the constant after the property, so that comparing a
                // nullable property (like DateTime? Deleted) to null works.
                Expression.Constant(value, actualValueExpression.Type)),
            param);

        return lambda;
    }

    protected IQueryable<T> DataSource()
    {
        var query = dataContext.Set<T>().AsQueryable<T>();
        var property = typeof(T).GetProperty("Deleted");

        if (property != null)
        {
            query = query.Where(GetExpression("Deleted", null));
        }

        return query;
    }

    protected virtual void SaveChanges()
    {
        this.dataContext.SaveChanges();
    }
    #endregion
}

I will go into a bit more detail below, but I would like to emphasize that this generic base class is used as the foundation for other, concrete repositories. I usually create domain-specific repositories for every entity type in my data model. Take this simple example of the repository that manages my Application entities:

public class ApplicationRepository : GenericRepository<Application>, IApplicationRepository
{
    public ApplicationRepository(IDataContext dataContext)
        : base(dataContext)
    {
    }
}

By inheriting from GenericRepository with a specific type, my ApplicationRepository inherits all strongly-typed methods from the base class.

One question you may ask is: why not use GenericRepository directly in the business logic? Why create all these additional entity-specific repositories? This is a good question, and it is mostly a matter of personal preference. In my code, I like to set up my entity-specific repositories in such a way that custom data queries become methods on the relevant repository (e.g. “GetByName(string name)” on the ApplicationRepository) whenever I need them. I find this more readable and easier to test than passing LINQ statements into a generic repository class, which is usually the alternative (e.g. “Where(p => p.Name == name)”). It also makes it easier to apply more fine-grained tuning (like caching and lazy loading) to specific data queries. The downside of this approach is that you need a bit more boilerplate code.
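As a hypothetical example (GetByName is not part of the sample code as-is), such a custom query method could look like this; note that DataSource() already filters out soft-deleted entities:

```csharp
public class ApplicationRepository : GenericRepository<Application>, IApplicationRepository
{
    public ApplicationRepository(IDataContext dataContext)
        : base(dataContext)
    {
    }

    // Domain-specific query method; easier to read and test than passing
    // LINQ expressions into the generic repository from the business logic.
    public Application GetByName(string name)
    {
        return this.DataSource().FirstOrDefault(p => p.Name == name);
    }
}
```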

But let’s get back to the GenericRepository class and the concrete ApplicationRepository. You’ll notice a few things when looking at the code:

  • First, the DataContext is injected through the constructor. This makes it easy to swap out the datacontext with a fake or mocked context for testing purposes. More about this in an upcoming post;
  • Second, the repository implements a ‘soft deletion’ pattern that only marks an entity as deleted but does not actually delete it. The SoftDeleteAndSubmit method only sets the Deleted property (a nullable DateTime) to the current date and time. All data retrieval methods, except for GetById(id), filter out soft-deleted entities automatically. This logic is implemented within the GenericRepository class and works for every entity with a ‘DateTime? Deleted’ property. If such a property does not exist, the SoftDeleteAndSubmit method will throw an InvalidOperationException, but the other methods will still function; 
  • Third, the InsertAndSubmit, UpdateAndSubmit and DeleteAndSubmit methods all submit their changes right away (as clearly stated by their names). I often find that I don’t need transactions or units of work in basic applications. If I do, it is easy to add additional ‘Delete’, ‘Update’ or ‘Insert’ methods that don’t save their changes right away;
  • Fourth, the UpdateAndSubmit method doesn’t really do anything, except call SaveChanges(). Granted, this is a bit hacky, but I think it makes my code more explicit and readable. Plus, the method can be extended in the future with custom code (e.g. for updating a cache);
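
To illustrate how this comes together, here is a sketch of how an MVC controller might consume such a repository (the controller and action names are assumptions; Ninject performs the constructor injection):

```csharp
public class ApplicationController : Controller
{
    private readonly IApplicationRepository applicationRepository;

    // The repository is constructor-injected, so the controller never
    // touches Entity Framework directly and is easy to unit test.
    public ApplicationController(IApplicationRepository applicationRepository)
    {
        this.applicationRepository = applicationRepository;
    }

    public ActionResult Index()
    {
        // GetAll() already excludes soft-deleted applications.
        return View(this.applicationRepository.GetAll());
    }
}
```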

Concluding thoughts

There you have it. A relatively straightforward sample application. Granted, the application doesn't really do anything useful, but it does show you how a number of technologies, patterns and ideas can work together. The best way to get the hang of it is to have a go at the source code and play around with it for a bit. You'll certainly notice a lot of other things that I haven't written about in this post. There's some unit testing going on, I'm using Bootstrap, the application is running on AppHarbor and there's some faking and dependency injection going on. I will write about this in upcoming posts.

Oh, and I'm certain there are many improvements that can be made to this code. Maybe you'll find some bugs or spot possible improvements. Feel free to share them below!

Update: you can read the second part in this series here.

Christiaan Verwijs

Scrum Master, Trainer, Developer & founder of Agilistic