In general, there are two kinds of updates you’ll mainly perform on Windows Azure. One is changing your application’s logic (the so-called business logic), e.g. the way you handle or read queues, how you process data, or even protocol updates. The other is schema changes. I’m not referring to SQL Azure schema changes, which are a different scenario with a different approach, but to Table storage schema changes, and more precisely to changes on specific entity types because, as you already know, Table storage is schema-less.

The same logic as in in-place upgrades applies here too: introduce a hybrid version that handles both the old and the new version of your entity (with its newly introduced properties), and then proceed to your “final” version, which handles only the new version of your entities (and properties). It’s a very easy technique. I’ll explain how to add new properties and, of course, how to remove them, although that’s a less likely scenario.

During my presentation at Microsoft DevDays “Make Web not War”, I created an example using a weather service and an entity called WeatherEntry, so let’s use it. My class looks like this:

    using System;
    using System.Data.Services.Common;
    using Microsoft.WindowsAzure.StorageClient;

    [DataServiceKey("PartitionKey","RowKey")]
    public class WeatherEntry : TableServiceEntity
    {
        public WeatherEntry()
        {
            PartitionKey = "athgr";
            // Inverted tick count, zero-padded to 19 digits, so newer entries sort first
            RowKey = string.Format("{0:D19}_{1}", DateTime.MaxValue.Ticks - DateTime.Now.Ticks, Guid.NewGuid());
        }
        public DateTime TimeOfCapture { get; set; }
        public string Temperature { get; set; }
    }

There is nothing special about this class. It has two custom properties, TimeOfCapture and Temperature, and I’m going to make a small change and add a “SchemaVersion” string property, which is needed to achieve the functionality I want. When I want to create a new entry, all I do now is call a helper method named AddEntry, which instantiates a WeatherEntry, sets the values and persists my changes.

    public void AddEntry(string temperature, DateTime timeofc)
    {
        // New entities are stamped with the current schema version
        this.AddObject("WeatherData", new WeatherEntry { TimeOfCapture = timeofc, Temperature = temperature, SchemaVersion = "1.0" });
        this.SaveChanges();
    }

I’m using TableServiceContext from the newly released StorageClient. Methods like UpdateObject, DeleteObject and AddObject exist on my data service context, which is also where the AddEntry helper method lives. At the moment my Table schema looks like this:

[Image: schema-before-change]

It’s pretty obvious there is no special handling when saving my entities, but this is about to change in my hybrid version.

The hybrid

I made some changes to my entity class and added a new property. It holds the area where the temperature sample was taken, in my case Spata, where Athens International Airport is.

My class looks like this now:

    [DataServiceKey("PartitionKey","RowKey")]
    public class WeatherEntry : TableServiceEntity
    {
        public WeatherEntry()
        {
            PartitionKey = "athgr";
            // Inverted tick count, zero-padded to 19 digits, so newer entries sort first
            RowKey = string.Format("{0:D19}_{1}", DateTime.MaxValue.Ticks - DateTime.Now.Ticks, Guid.NewGuid());
        }
        public DateTime TimeOfCapture { get; set; }
        public string Temperature { get; set; }
        public string SampleArea { get; set; }
        public string SchemaVersion { get; set; }
    }

So, this hybrid client somehow has to handle both version 1 and version 2 entities, because my schema is already on version 2. How do you do that? The main idea is that you retrieve an entity from Table storage and check whether SampleArea and SchemaVersion have a value. If they don’t, set a default value and save the entity. In my case the schema version number has to be 1.5, as this is the default schema number for this hybrid solution. One key point in this procedure: before you upgrade your client to the hybrid, you roll out an update that enables the “IgnoreMissingProperties” flag on your TableServiceContext. When IgnoreMissingProperties is true and a version 1 client tries to access entities that are already on version 2 and carry the new properties, it WON’T raise an exception; it will just ignore them.

    var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    var context = new WeatherServiceContext(account.TableEndpoint.ToString(), account.Credentials);

    /* Ignore missing properties on my entities */
    context.IgnoreMissingProperties = true;

Remember, you have to roll out that update BEFORE you upgrade to this hybrid.

Whenever I update an entity in Table storage, I check its schema version, and if it’s still “1.0” I set it to “1.5” and put a default value in SampleArea:

    public void UpdateEntry(WeatherEntry wEntry)
    {
        /* Treat a missing SchemaVersion like "1.0" so the
         * check itself cannot throw a null reference */
        if (string.IsNullOrEmpty(wEntry.SchemaVersion) || wEntry.SchemaVersion.Equals("1.0"))
        {
            /* If schema version is 1.0, update it to 1.5
             * and set a default value on SampleArea */
            wEntry.SchemaVersion = "1.5";
            wEntry.SampleArea = "Spata";
        }
        /* Put some try catch here to
         * catch concurrency exceptions */
        this.UpdateObject(wEntry);
        this.SaveChanges();
    }

My schema now looks like this. Notice that both versions of my entities co-exist and are handled just fine by my application.

[Image: schema-after-change]

Upgrading to version 2.0

Upgrading to version 2.0 is now easy. All you have to do is change the default schema number for newly created entities to 2.0 and, of course, update your “UpdateEntry” helper method to check whether the version is 1.5 and, if so, change it to 2.0.

    this.AddObject("WeatherData", new WeatherEntry { TimeOfCapture = timeofc, Temperature = temperature, SchemaVersion = "2.0" });

and

    public void UpdateEntry(WeatherEntry wEntry)
    {
        if (wEntry.SchemaVersion.Equals("1.5"))
        {
            /* If schema is version 1.5 it already has a default
             * value, all we have to do is update schema version so
             * our system won't ignore the default value */
            wEntry.SchemaVersion = "2.0";
        }
        /* Put some try catch here to
         * catch concurrency exceptions */
        this.UpdateObject(wEntry);
        this.SaveChanges();
    }

Whenever you retrieve an entity from Table storage, you have to check whether it’s on version 2.0. If it is, you can safely use its SampleArea value, which is no longer the default. That’s because the schema version only changes when you actually call “UpdateEntry”, which means you had the chance to change SampleArea to a non-default value. But if it’s on version 1.5, you have to either ignore the value or update it to a new, correct one.
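To make the read-side check concrete, here is a minimal sketch. It assumes a simplified WeatherEntry (a plain class instead of the TableServiceEntity subclass used above, so it runs without the storage library) and a hypothetical helper named GetTrustedSampleArea; neither is part of my actual service.

```csharp
using System;

// Simplified stand-in for the article's WeatherEntry; the real class
// derives from TableServiceEntity and also has keys and TimeOfCapture.
public class WeatherEntry
{
    public string Temperature { get; set; }
    public string SampleArea { get; set; }
    public string SchemaVersion { get; set; }
}

public static class WeatherEntryReader
{
    // Hypothetical helper: returns SampleArea only when the entity is on
    // schema version 2.0, i.e. it went through the version 2.0 UpdateEntry.
    // For any other version it returns null so callers can ignore the value.
    public static string GetTrustedSampleArea(WeatherEntry entry)
    {
        if (entry == null) throw new ArgumentNullException("entry");
        return entry.SchemaVersion == "2.0" ? entry.SampleArea : null;
    }
}
```

A caller would treat a null result as “no reliable sample area yet” and either skip the value or ask for a correct one before saving the entity again.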

If you do want to use the default value anyway, you can create a temporary worker role which will scan the whole table and update all of your schema version numbers to 2.0.
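The core loop of such a worker could look roughly like the sketch below. It is written against a plain sequence of entities and a save callback so it stays self-contained; in a real worker role the sequence would presumably come from a query over the “WeatherData” table and the callback would wrap UpdateObject and SaveChanges. The names SchemaMigrator and PromoteAllToV2, and the simplified WeatherEntry, are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the article's WeatherEntry.
public class WeatherEntry
{
    public string SampleArea { get; set; }
    public string SchemaVersion { get; set; }
}

public static class SchemaMigrator
{
    // Hypothetical one-off migration: every entity that is not yet on 2.0
    // gets the default SampleArea (if it has none) and is stamped "2.0".
    // 'save' is whatever persists a single entity; returns the update count.
    public static int PromoteAllToV2(IEnumerable<WeatherEntry> entries,
                                     string defaultArea,
                                     Action<WeatherEntry> save)
    {
        int updated = 0;
        foreach (var entry in entries)
        {
            if (entry.SchemaVersion == "2.0") continue; // already migrated

            if (string.IsNullOrEmpty(entry.SampleArea))
            {
                entry.SampleArea = defaultArea;
            }
            entry.SchemaVersion = "2.0";
            save(entry);
            updated++;
        }
        return updated;
    }
}
```

Keep in mind that after this scan every entity claims to be on version 2.0, so the application can no longer tell a default SampleArea from a deliberately chosen one. That is the trade-off you accept when you decide to use the default value anyway.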

How about when you remove properties

That’s a really easy modification. If you remove a property, you can pass the SaveChangesOptions value ReplaceOnUpdate to SaveChanges(), which will overwrite your stored entity with the new schema. Don’t forget to update your schema version number to something unique, and put some checks into your application to avoid failures when it tries to read properties that no longer exist under the newer schema version.

    this.SaveChanges(SaveChangesOptions.ReplaceOnUpdate);
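As for the checks on the reading side, a minimal sketch could look like this. It assumes, purely for illustration, that a later schema version “3.0” drops SampleArea and that a client still compiled with the property simply sees it empty on such entities; the version number, the SampleAreaGuard class and the simplified WeatherEntry are all hypothetical.

```csharp
using System;

// Simplified stand-in for the article's WeatherEntry, as seen by a client
// that still has the SampleArea property compiled in.
public class WeatherEntry
{
    public string Temperature { get; set; }
    public string SampleArea { get; set; }
    public string SchemaVersion { get; set; }
}

public static class SampleAreaGuard
{
    // Hypothetical version in which SampleArea was removed.
    public const string RemovedInVersion = "3.0";

    // Returns a display value without assuming the property is present.
    public static string Describe(WeatherEntry entry)
    {
        if (entry == null) throw new ArgumentNullException("entry");

        bool removedBySchema = entry.SchemaVersion == RemovedInVersion;
        if (removedBySchema || string.IsNullOrEmpty(entry.SampleArea))
        {
            return "unknown area";
        }
        return entry.SampleArea;
    }
}
```

Checking both the version number and the value itself keeps the client safe even if an entity is stamped with the new version but was written by an older client.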


That’s all for today!

P.K