Tuesday 22 December 2009

Undropping a SQL Server Table - Part 2

Introduction

In my last post, I explained how I ended up in a situation where I needed to recover data from a database where all the tables had been dropped, no backup existed, and third-party tools could not be used. I also explained how I created a stored procedure, pDBCC_PAGE_Lazarus, which could dump out the raw contents of a page in the now-empty database, and how I used a certificate to ensure that users could make use of the stored procedure even if they were not members of the sysadmin fixed server role.

This post will cover the next stage of the recovery process: gathering the information needed to establish which pages belonged to which dropped table.

Given the issues I'd had with third-party tools (namely not being allowed to install them) I decided early on that I wanted to make the recovery process as re-usable as possible. This meant that the recovery would need to take place within SQL Server, rather than within some C# application connected to the database. In fact my original goal was to use Transact-SQL scripts only – no stored procedures or user-defined functions. This was based upon the somewhat suspect view that anyone, even those without the rights to create objects within a database, should be able to recover the data. That goal was abandoned early on – as soon as I'd created pDBCC_PAGE_Lazarus, in fact. But the goal to stay within SQL Server persisted.

Sunday 13 December 2009

Undropping a SQL Server Table - Part 1

Introduction

Have you ever accidentally dropped a SQL Server table? I have. In fact I didn't just drop a SQL Server table, I dropped all of them. In my defence, I was simply running a script written by another development team in order to deploy their objects into an existing database. What I hadn't realised was that their script expected to be deployed into a clean database – one without any existing objects – and so the first thing it did was cursor through sys.objects dropping everything it found. Somewhat embarrassing, but not the end of the world. I immediately contacted our DBA and requested a point-in-time restore. That's where things really started to go wrong. It turns out that when the creation of this particular database was requested, that request didn't make any mention of a backup strategy. So there wasn't one. Hmmm.

Now, whilst there was nothing business critical on this database, there was quite a lot of data which we very much wanted to recover. (By 'quite a lot' I mean 10GB or so.) If you've ever been faced with this situation, you'll know that there are several recovery tools on the market which can help. I looked at RedGate's SQL Log Rescue but that only supports SQL Server 2000 (I was on SQL Server 2005). I looked at Lumigent's Log Explorer (part of Audit DB) but was led to believe by several posts on-line that it only supported SQL Server 2000 – Lumigent themselves are somewhat vague on the subject. ApexSQL Recover, on the other hand, looked just the ticket. It supported SQL Server 2005, had good reviews, and provided a try-before-you-buy version which would recover 1 in 10 rows.

The organisation that I was working within had a very strict policy about installing third-party applications which hadn't been through a centralised testing, approval, and packaging process. The policy was simple – you don't do it. Undeterred, I managed to obtain quasi-permission to install the trial version on my desktop with a view to going through the formal approval process if the results looked promising. They didn't. Once installed, ApexSQL Recover contacted the database server and attempted to install some extended stored procedures. Not only had I not gained permission to install anything on the database server, I knew that extended stored procedures were specifically forbidden.

After taking my head out of my hands, believing the data to be lost forever, I started thinking about how the recovery tools worked. I figured, if they can do it – why can't I? So, this post is the first of a series on how to undrop SQL Server tables the hard way.

Caveat: This post, and the series of which it forms a part, applies to SQL Server 2005. Your mileage with other versions may vary.

Monday 7 December 2009

Adding a Strong Name to a Third-Party Assembly

Introduction

Microsoft recommend that all assemblies be furnished with a strong name. The Code Analyser within Visual Studio will generate error CA2210, "Assemblies should have valid strong names", if the assembly does not have a strong name. So, like a good citizen, you strongly name all your assemblies and all is well.

One of the constraints placed upon assemblies which have a strong name is that all assemblies which they reference must also have a strong name. So what happens if you need to use a third-party assembly which doesn't have a strong name? You could contact the supplier and ask them to provide a version with a strong name. But what if they can't? Or won't? You'll need to add a strong name yourself.

When compiling an assembly, you'd normally add a strong name by referencing a file containing a key pair via the AssemblyKeyFile attribute, or via the Signing tab of Project Properties. But this is a third-party assembly – you don't have the source code. So how do you add the strong name retrospectively?
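The standard approach – and a sketch of where this post is heading, rather than its exact steps – is to round-trip the assembly through the .NET Framework SDK tools: disassemble it to IL with ILDASM, then reassemble it with ILASM, supplying a key pair as you do so.

rem create a key pair (skip this if you already have one)
sn -k MyKey.snk

rem disassemble the third-party assembly to IL
ildasm ThirdParty.dll /out=ThirdParty.il

rem reassemble the IL, signing it with the key pair
ilasm ThirdParty.il /dll /key:MyKey.snk /output:ThirdParty.dll

The file names above are placeholders, of course. Note also that any assemblies already referencing ThirdParty.dll will need their references updated, since the newly-signed assembly has a new public key token.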

Tuesday 1 December 2009

System.Data Serialisation Bug - Scenario 2

Introduction

In my last post I explained that System.Data contains a bug which can cause it to throw an exception during serialisation or deserialisation of a DataSet which contains a column that uses System.Object as its .NET type.

The bug manifests itself in two known scenarios:

  • Serialising a DataSet containing a column of type System.Object using a System.Xml.XmlTextWriter instance created via the static Create method; this throws a System.ArgumentException.
  • Deserialising a DataSet containing a column of type System.Object after having successfully passed it across the wire via Windows Communication Foundation (WCF) using netTcp binding; this throws a System.Data.DataException.

This post covers the second of the two scenarios – deserialising a DataSet which has been successfully transmitted via WCF's netTcp binding.

System.Data.DataException

A couple of years ago I was asked by a client to port an existing ASMX-based Web Service to one using Windows Communication Foundation. The brief was to change as little of the client and service as possible, and to just replace the communications interface. In addition to passing simple and complex types across the wire, the application would often pass both typed and untyped DataSets (clearly a bad idea, but that's another story). One of the typed DataSets was based upon a SQL Server 2005 table containing a column of type SQL_VARIANT. Attempting to pass an instance of this DataSet from server to client produced the following exception on the client (i.e. during deserialisation):

System.Data.DataException, System.Data, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
Undefined data type: 'xs:int'.
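For context, the shape of the service involved can be sketched as follows. (The contract below is hypothetical – the client's actual service isn't reproduced here – but returning a DataSet like this one from a service exposed via a netTcp endpoint is enough to reproduce the exception.)

[System.ServiceModel.ServiceContract]
public interface IDataService
{
  // returns a DataSet across the wire
  [System.ServiceModel.OperationContract]
  System.Data.DataSet GetData();
}

public class DataService : IDataService
{
  public System.Data.DataSet GetData()
  {
    // a single System.Object column stands in for the SQL_VARIANT column
    System.Data.DataSet dataSet = new System.Data.DataSet();
    System.Data.DataTable dataTable = dataSet.Tables.Add();
    dataTable.Columns.Add("Data", typeof(object));
    dataTable.LoadDataRow(new object[] { 1 }, true);
    return dataSet;
  }
}

The service serialises the DataSet quite happily; it's the client which throws during deserialisation.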

The XML fragment below represents an instance of the SQL_VARIANT column within the serialized typed DataSet. Note the use of the xsi:type="xs:int" attribute to define the data type used by this column for this particular row: although the column can contain data of any type, each row must state explicitly which data type it is using.

<Data xsi:type="xs:int" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">1</Data>

Note: Much of the investigation detailed herein was carried out with the help of .NET Reflector to study what some of the classes in the .NET Framework were actually doing.

Monday 23 November 2009

System.Data Serialisation Bug - Scenario 1

Introduction

System.Data contains a bug which can cause it to throw an exception during serialisation or deserialisation of a DataSet which contains a column that uses System.Object as its .NET type.

The bug manifests itself in two known scenarios:

  • Serialising a DataSet containing a column of type System.Object using a System.Xml.XmlTextWriter instance created via the static Create method; this throws a System.ArgumentException.
  • Deserialising a DataSet containing a column of type System.Object after having successfully passed it across the wire via Windows Communication Foundation (WCF) using netTcp binding; this throws a System.Data.DataException.

I reported the bug to Microsoft back in 2007 whilst using .NET 2.0 and 3.0. At the time I was told "it's definitely in our system for a Post Orcas release". The bug still exists in .NET Framework 3.5 SP1, although I haven't checked any of the .NET 4.0 betas yet.

This post covers the first scenario you might come across – simply serialising the DataSet with an XmlTextWriter. I'll do a follow-up next week covering the second scenario.

Reproducing the Bug

To demonstrate this behaviour we're going to need a DataSet with a System.Object column, and we're going to need to populate it with some data.

static System.Data.DataSet CreateAndPopulateDataSet()
{
  // create a DataSet containing two columns: a string (Key) and an object (Value)
  System.Data.DataSet dataSet = new System.Data.DataSet();
  System.Data.DataTable dataTable = dataSet.Tables.Add();
  dataTable.Columns.Add("Key", typeof(string));
  dataTable.Columns.Add("Value", typeof(object));

  // add two rows, one containing an integer Value, and one containing a string Value
  dataTable.LoadDataRow(new object[] { "foo", 42 }, true);
  dataTable.LoadDataRow(new object[] { "bar", "Hello World" }, true);

  // return the DataSet
  return dataSet;
}
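With that in place, serialising the DataSet using a writer obtained from the static Create method is all it takes to trigger the exception. (A sketch – I'm assuming XmlWriteMode.WriteSchema here, and writing to the console for simplicity.)

static void SerialiseDataSet()
{
  System.Data.DataSet dataSet = CreateAndPopulateDataSet();

  // a writer created via the static Create method, rather than one
  // constructed directly via 'new XmlTextWriter(...)'
  using (System.Xml.XmlWriter xmlWriter = System.Xml.XmlWriter.Create(System.Console.Out))
  {
    // throws System.ArgumentException when it reaches the System.Object column
    dataSet.WriteXml(xmlWriter, System.Data.XmlWriteMode.WriteSchema);
  }
}

Construct the XmlTextWriter directly – 'new System.Xml.XmlTextWriter(System.Console.Out)' – and the same call succeeds; it's the writer returned by Create which trips WriteXml up.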

Wednesday 18 November 2009

Cloning Messages in BizTalk Server when CanSeek is False

Introduction

When writing custom pipeline components for BizTalk Server 2006 there are times when you want to access a copy of a message, rather than the actual message being processed. Typically this is because the original data stream containing the message is not seekable, meaning that by reading the stream you prevent the next component within the pipeline from doing so.

The BizTalk Server 2006 documentation reports that the Data property of the Microsoft.BizTalk.Message.Interop.IBaseMessagePart interface provides such a facility. According to the documentation:

"In custom pipeline components, Data clones an inbound data stream, whereas GetOriginalDataStream returns the original inbound stream."

The testing I've done, however, shows that this is only the case if the original data stream is itself seekable. It seems likely that this is because a stream must be read in order to be cloned, and if it is not seekable it could never be returned to its original position following the clone. So BizTalk Server cannot honour the request to clone the stream if the stream is not seekable.

This article describes a utility method, GetSafeDataStream, which retrieves a stream which can be safely read without adversely affecting the original data stream.
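To give a flavour of the idea, here's a minimal sketch of such a method. (This is an outline under simplifying assumptions, not the article's actual implementation – in particular it buffers into a MemoryStream, which is only sensible for modestly-sized messages.)

public static System.IO.Stream GetSafeDataStream(Microsoft.BizTalk.Message.Interop.IBaseMessagePart messagePart)
{
  System.IO.Stream originalStream = messagePart.GetOriginalDataStream();

  // copy the inbound stream into a seekable stream
  System.IO.MemoryStream copy = new System.IO.MemoryStream();
  byte[] buffer = new byte[4096];
  int bytesRead;
  while ((bytesRead = originalStream.Read(buffer, 0, buffer.Length)) > 0)
  {
    copy.Write(buffer, 0, bytesRead);
  }
  copy.Position = 0;

  if (originalStream.CanSeek)
  {
    // the original stream can simply be rewound for the next component
    originalStream.Position = 0;
  }
  else
  {
    // the original stream cannot be rewound, so hand the message part a
    // fresh seekable copy for the rest of the pipeline to read
    messagePart.Data = new System.IO.MemoryStream(copy.ToArray());
  }

  return copy;
}

The stream returned to the caller is independent of the one handed back to the pipeline, so reading it cannot disturb downstream components.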

Caveat: The information in this article applies to BizTalk Server 2006. Your mileage with other versions may vary.

Monday 16 November 2009

Welcome

Conventional wisdom suggests that the first post of any new blog (such as this one) should be in the form of an introduction. I'm sure you don't really care if I'm married (I am) or if I own a cat (I don't) so I'll dispense with an introduction of myself and plough on with an introduction of the blog itself. (If you really want to know something about me, you can view my profile on LinkedIn.)

There are several reasons why I've never blogged before. The primary reason is that I don't really have as much free time as many bloggers seem to have, so I can't really see myself keeping up the steady flow of posts which might be expected. Also, as my knowledge base is quite varied I've always found it difficult to target the blog at any specific topic - I might have something worth saying on ASP.NET one minute and BizTalk Server the next. There are, however, several occasions where I've discovered some non-obvious piece of information and have felt the desire to share it with the world - but have not had a medium by which to do that. Another such occasion arose recently so I decided to take the plunge and set up a blog.

By way of a theme, I realised that many of the topics which I'd like to discuss are about the internal mechanics of some piece of software or other: the sort of thing that isn't fully documented (if it's documented at all) and which can only be discovered by trial and error or (preferably) by firing up Lutz Roeder's .NET Reflector. I've therefore decided to name the blog Under the Covers.

I've got topics lined up on the unusual behaviour of the Data property of BizTalk's IBaseMessagePart class, a bug in System.Data.DataSet.WriteXml, and how to 'undrop' a table in SQL Server 2005.