Hanalei, Hawaii 2010-03-18

SubSonic: Migrate Me

Just last night I ran my first "Green Light" unit tests on what I think is going to be a pretty groovy new feature: Migrations. Usually I don't like to talk much about functionality until it's 90%-ish, but I figured that it might be good to let everyone know the what/how/why now so I can take all your lovely comments and see if I can pop them in before it gets released.

Just last night I ran my first "Green Light" unit tests on what I think is going to be a pretty groovy new feature: Migrations. Usually I don't like to talk much about functionality until it's 90%-ish, but I figured that it might be good to let everyone know the what/how/why now so I can take all your lovely comments and see if I can pop them in before it gets released.

For those who don't know, Migrations allow you to build out your DB schema using code. It's a Rails Thing, and if you're not familiar with them you can watch one of my webcasts where I use them (right around the 3 minute mark), or just read up on them here.

The concept is that you version the building of your DB into code files, called "Migrations". Using code, you can build and change your DB as needed, and keep a reference of this in your app's codebase. Since you're using code to build out the schema, you can do all kinds of neat things like add data, ping services, send emails, etc whenever you change around the schema. This puts the DB work into your app and out of the designer, which in some cases can be neat.

Code Sample
Currently, all migrations are built off of an abstract base class called, as you might guess, SubSonic.Migration. Each migration must inherit from this base class, and must also implement two methods: Up (versions your DB upward) and Down(versions it down). Here's an example:

    public class Migration001:Migration {

        public override void Up() {
            TableSchema.TableCollection tables = new TableSchema.TableCollection();

            TableSchema.Table t = new TableSchema.Table("Test1");
            t.AddColumn("Name", System.Data.DbType.String);
            t.AddColumn("Description", System.Data.DbType.String,5000);
            t.AddColumn("DateEntered", System.Data.DbType.DateTime,0, false, "getdate()");
            AddSubSonicStateColumns(t);
            
            tables.Add(t);

            t = new TableSchema.Table("Test2");
            t.AddColumn("Name", System.Data.DbType.String);
            t.AddColumn("Friend", System.Data.DbType.String);
            t.AddColumn("FriendInt", System.Data.DbType.Int32, 0,false,"50");
            AddSubSonicStateColumns(t);
            tables.Add(t);

            Create(tables);

        }

        public override void Down() {
            Drop(new string[] { "Test1", "Test2" });
        }
    }
  • Notice that the name of the Migration is MigrationNUMBER - this isn't important from SubSonic's point of view, but in general naming like this will help you. I'll get more into our "convention" on this later in the post.
  • Also notice we're using TableSchema.Table, a good old friend, to specify the schema for each table. We're also using a TableSchema.TableCollection to hold each spec. This syntax might change as I want to make this code as razor-stupid-simple as I can, and would appreciate any thoughts on a more "fluent" and "discoverable" method signature.
  • We have 5 overrides for AddColumn which allow you to quickly add a column spec to your schema as needed. There really shouldn't be a SQL schema you can't create here (but I'm sure I'll eat these words).
  • Notice as well that I didn't create a Primary Key field - this goes to the core of what SubSonic tries to do for you (remove repetition). You and I both know that your table needs a PK, and most of the time that column follows the pattern TableNameID. So if you don't specify a PK - we add it for you (if you haven't defined one already). If you want to use your own - you can! You can use whatever name and data type you want by using AddPrimaryKeyColumn().
  • We also added some sugar the mix with "AddSubSonicStateColumns()" - this does what you'd think it would, adding CreatedBy, CreatedOn, ModifiedBy, ModifiedOn to your table.
  • Finally - notice that the second column added is of type "String" and I've set the length to 5000. This means something to SubSonic since the max length on nvarchar (our default string type) is 4000. Using the value 5000 will auto-set the data type to nvarchar MAX (or ntext using Sql 2000). If you want to be explicit about this, you can by using the "AddLongText()" method.

More Code
You can also add/remove columns, and alter their settings (without dropping and losing data). This next bit of code shows not only how to alter data, but also why you'd use migrations in the first place - it's a road map of how your DB has changed over time:

    public class Migration002:Migration {
        public override void Up() {
            //alter the description field - change to 1200
            TableSchema.TableColumn col = DataService.GetSchema("Test1", "Northwind").GetColumn("Description");
            col.MaxLength = 1200;

            //use AlterColumn to run an Alter statement - we don't want to save the schema
            //since that would overwrite what we're trying to do
            AlterColumn(col);

            //add a new column to Products
            DataService.GetSchema("Products", "Northwind").AddColumn("MyNewColumn",System.Data.DbType.DateTime);
            
            //add another one and set the default, as well as non-nullable
            DataService.GetSchema("Products", "Northwind").AddColumn("MaxInventory", System.Data.DbType.Int32,0,false,"100");

        }

        public override void Down() {
            
            //the Down() method reverses what Up() did
            //reset Description to 9000
            TableSchema.TableColumn col = DataService.GetSchema("Test1", "Northwind").GetColumn("Description");
            col.MaxLength = 9000;
            AlterColumn(col);

            //remove MyNewColumn and MaxInventory
            RemoveColumn("Northwind", "Products", "MyNewColumn");
            RemoveColumn("Northwind", "Products", "MaxInventory");


        }
    }

How It All Works
There's a lot going on under the hood here, as you can imagine. To keep track of it all, we create a new table in your DB (we have to) called "SubSonic_Schema" and it has one field - "version". We set this to know which version the DB is, and it allows SubSonic to know which direction to take the Migration (up or down).

The tool that actually runs the migration is our command line tool, SubCommander. This tool not only runs migrations, it will also create the migration code file (with naming and everything) that's needed to run the migration itself (you can do this by hand if you like):

sonic.exe migration /create

To run the migration:

sonic.exe migrate

To run to a specific version:

sonic.exe migration /version 1

When you run a migration (as opposed to creating one), SubCommander will look for a folder called "Migrations" in the same directory that it's executing in. If you're running it in VS (like you do with the generators), this would be the root of your project.

All the files get pulled in, and the name of the migration file is analyzed - this part is CRUCIAL. We need to use some method, non-code based, to determine which version the migration represents. For now I've come up with:

VERSION_Descriptor.cs (or vb)

so a sample migration code file would be "001_InitialSchema.cs". The numeric part is for us, the descriptor part is for you. I'm still very open on this, but this is how Rails does it and if someone else already thought this stuff out...

Each code file is pulled in and SubSonic takes a look at the version (the first 3 characters) and decides which to use. Once that's determined, then the source is pulled from the code file, handed off to our compiler (the same bits that our generator uses), and (using Reflection) the Up() or Down() method gets invoked in the appropriate migration file. If your current version is 2 (for instance) and you have migration files 001_x through 009_x, Up() will be called on files 003_x through 009_x.

Once your DB is version 9, you can version back down to 3 by using the command

sonic.exe migrate /version 3

This will run the Down() method in code files 009_x through 004_x.

The cool thing about this process is that you can call your class and namespace whatever you like; it's the naming of the file that's important. Would love some comments on this since naming is the biggest issue we have and people have some whacky naming conventions. I'd like to be open on this - but in general I think I'll default to some stronger adherence to convention.

Finally - each provider is responsible for implementing the methods needed to create/alter/remove columns as well as create/alter/drop tables. So far I've extended only the SQL Server provider (and by extension the EntLib ones) - I'm hoping to get to the MySQL and others before we release this.

Everyday Use
To rewind a bit - the goal here is to give you a way to version your DB as you work, and give you a way to go backwards if needed. Also, this is a great way for teams to review how a DB was schemed and built, with comments and so forth that can be reviewed during a code-review cycle.

When you need to change your DB, you just "migrate" it to the next version - this is the thinking here. Rather than just a quick tweak in the designer, you track what you do in code.

Many people have asked "why not just let Subversion track your changes as you make them to one code file" and the answer is that this method doesn't allow you to go backwards, unless you Revert, which isn't the best option.

Left To Do
There's a lot left to do - specifically I want to make sure this is all transactional. I have some ideas for this - but the thing I really need to find out (which I don't know) is how well the providers we use (MySQL, SqlLite, etc) support DB Management transactions - in other words data transactions are pretty straightforward; I'm not so sure about schema changes.

Also, I want to be able to "Reverse Engineer" an existing DB into a migration file. This isn't very hard to do, but doing it right might take some time to figure out and I'd love your comments on this.

When? Did You Say When? What Was That?
I have no timeline just yet, though if you want to play around with the bits they will be committed tonight/tomorrow (I hope). I want to include this feature as part of our next major release, which will also include our REST Handler and our new query tool - so it will be a few months until we can get this all tightened up.

I'm doing what I can to crack this stuff out and Eric's getting married in October (oops... not sure if I'm supposed to tell people that) so he's sort of out of the loop for a bit. I know he'd love any presents you want to send him :).

I'm going to lean on the commit team to see if they can free up some time (Phil? Jon? Scott?) - is it OK to call out your dev team like this? Who cares... it's Open Source and we make the rules as we go don't we?

And let me state this clearly:

I'm always working the bug list, so YES I WILL fix the outstanding issues :):):):)

Lemme hear it! What do y'all want to see aside from bug fixes and an earlier release?