Wednesday, 27 January 2010

Undropping a SQL Server Table - Part 6

Introduction

We've pretty much wrapped up this series on how to recover data from tables which have been accidentally dropped, and for which no backup exists.

So far, we’ve created a stored procedure with which to extract a hexadecimal dump of a specified page (Part 1), identified which pages contained data from which tables (Part 2), parsed each page into individual records (Part 3), parsed those records (which were still in raw hexadecimal form) into individual columns to create a standard SQL Server rowset (Part 4), and generated the record-parsing code dynamically for a table with an arbitrary structure (Part 5).

This final part will discuss how to cope when you don't know the structure of the table you're trying to recover. The meta-data (column names, types, etc.) will have been lost when the table was dropped, and there isn't a way (that I know of) to recover it, so we'll have to guess it. This post is about that guess-work.

This Might Not Work

All of the previous posts in this series described the steps I'd taken to recover some dropped tables: a process I knew worked, because I'd used it to recover 10GB or so of data. This post is different. I plan to guess the structure of a table and write about it as I go. This means that when I make mistakes (which I will) you'll get to read about them. In fact, it might be that I can't recover this data at all because I can't work out the table structure. We'll have to wait and see.

Test Scenario

During Part 2 of this series I created a table, TestTable, populated it with some data and then dropped it. In the same post I described how we could use DBCC IND to establish which pages belong to tables which we knew hadn't been dropped, to reduce the number of pages we'd have to analyse as having potentially belonged to TestTable.

When I did this, I expected every data page on that database to have been identified as belonging to a known table, or to have belonged to TestTable. After all, I'd only dropped one table for the test. But I noticed that five data pages didn't belong to an existing table and didn't belong to TestTable. It is the data on these pages which I'll be attempting to recover. As I write this, I have no idea what these data pages contain and I certainly don't know the associated table structures. Hopefully some of the techniques I use to identify the structure will be of use to you if you're attempting to establish the structure of some data you've mislaid.

Beginning the Investigation

My starting point in investigating these five rogue data pages is to establish whether they all belonged to the same table or not. So I issue the following query:

USE DataRecovery
GO

SELECT m_pageId, [m_objId (AllocUnitId.idObj)], [Metadata: AllocUnitId], m_prevPage, m_nextPage, ColumnCount, SumOfFixedWidths
FROM   LazarusPage
WHERE  TableName IS NULL
GO

This produces the following rowset:

m_pageId | m_objId (AllocUnitId.idObj) | Metadata: AllocUnitId | m_prevPage | m_nextPage | ColumnCount | SumOfFixedWidths
(1:59) | 6 | 393216 | (1:53) | (1:61) | 5 | 10
(1:60) | 3 | 196608 | (1:45) | (1:74) | 32 | 42
(1:61) | 6 | 393216 | (1:59) | (1:62) | 5 | 10
(1:62) | 6 | 393216 | (1:61) | (0:0) | 5 | 10
(347:22675801) | 22413653 | 93451161169100800 | (335:21889357) | (339:22151505) | 3797 | 2528

The first finding is obvious: the final page is clearly nonsense as it claims to have an m_pageId of (347:22675801). The database upon which this table resides doesn't have 347 files. We identified this page as being potentially of interest by virtue of it having an m_typeId of 1 (i.e. it's marked as a data page). But any random 8K block of data has a 1-in-256 chance of having a 0x01 in byte 1. So that's hardly proof that it's a real data page. To reduce the risk of false positives, we used some code earlier on in Part 3 to only set InUseDataPage if Slot0Offset contained a plausible value. We defined a plausible value as being between 96 and 8192 as the header occupies bytes 0 through 95 and as a page is only 8192 bytes long. Although this reduces the likelihood that we'll see a page and believe it to be a valid data page when it isn't, it doesn't eliminate the possibility - as this page proves.
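That plausibility test boils down to a couple of byte checks, which is easy to sketch outside SQL Server. Here's a minimal Python version (the function name is mine; it assumes, as per Part 3, that slot 0's offset is held little-endian in the final two bytes of the page):

```python
def looks_like_data_page(page: bytes) -> bool:
    """Plausibility test for an 8K block that claims to be a data page."""
    if len(page) != 8192:
        return False
    if page[1] != 0x01:                      # m_type: 1 marks a data page
        return False
    # Slot 0's offset is held in the final two bytes of the page
    # (little-endian); a plausible value points past the 96-byte header
    # but stays within the 8192-byte page.
    slot0_offset = int.from_bytes(page[8190:8192], "little")
    return 96 <= slot0_offset < 8192
```

A random 8K block passes the m_type test 1 time in 256, and both tests rather less often, but as the rogue page shows, false positives remain possible.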

The second finding is that pages (1:59), (1:61) and (1:62) belong to the same table. We can tell this because they share the same [m_objId (AllocUnitId.idObj)] and [Metadata: AllocUnitId], and because m_prevPage and m_nextPage link the pages together. Also, they share the same ColumnCount and SumOfFixedWidths which we used in an earlier post as a pseudo-signature.

You'll notice that (1:59) reports that its m_prevPage is (1:53). As (1:53) isn't in the list of data pages which are orphaned (i.e. for which we don't know the owning table), it's very tempting at this point to cheat and go and see which table page (1:53) belongs to. But I won't. Incidentally, my guess is that whatever table these three pages belonged to encountered a delete operation at some point, meaning that these pages were de-allocated from the table, although the table still exists and has at least one page, (1:53), allocated to it.

I'm going to ignore page (1:60) and come back to that later. It has more columns and so will almost certainly be more tricky to parse. Let's start with the low-hanging fruit.

So, let's focus in on the table which owns (well, owned) pages (1:59), (1:61) and (1:62).

SELECT ColumnCount, SumOfFixedWidths, VariableWidthColumnCount, SampleSlotText
FROM   LazarusPage
WHERE  m_pageId IN ('(1:59)', '(1:61)', '(1:62)')
ColumnCount | SumOfFixedWidths | VariableWidthColumnCount | SampleSlotText
5 | 10 | 1 | 0......................c.o.n.v.e.r.t...s.m.a.l.l.i.n.t...0.....
5 | 10 | 1 | 0..................O...c.o.n.v.e.r.t...b.i.t.......s.t.a.t.u.s.........1.......
5 | 10 | 1 | .......................c.o.n.v.e.r.t...s.m.a.l.l.i.n.t...i.s.n.u.l.l...c.o.n.v.e.r.t...b.i.n.a.r.y...2.....r.e.v.e.r.s.e...s.u.b.s.t.r.i.n.g.....r.e.f.k.e.y.s.....3.1...2.........0.......

VariableWidthColumnCount is 1, which is nice, and we can see that variable-width data in SampleSlotText. It contains words like CONVERT and ISNULL, so it looks like T-SQL to me. The way SampleSlotText was built was to output any alphanumeric character as-is, and to output a '.' in place of any other character (including spaces and punctuation). As we have a non-printable character between every alphanumeric character, we can clearly see that this is Unicode data. So the single variable-width column which this table contains is an NVARCHAR column containing some T-SQL.
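Incidentally, the SampleSlotText transformation is easy to mimic outside SQL Server. A Python sketch (the function name is mine), applied to the raw sample record we'll dump in a moment:

```python
def sample_slot_text(data: bytes) -> str:
    """ASCII-alphanumeric bytes come through as-is; everything else is '.'."""
    return "".join(chr(b) if b < 128 and chr(b).isalnum() else "." for b in data)

# The sample record from page (1:59), as dumped later in the post
record = bytes.fromhex(
    "30000E000100000019000100000005000001003F00280063006F006E00760065"
    "0072007400280073006D0061006C006C0069006E0074002C00300029002900")
print(sample_slot_text(record))
```

Each character of real text costs two bytes in UTF-16, which is why every letter in the output arrives with a '.' beside it.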

We know from ColumnCount that the table contains a total of 5 columns. It's likely that this means it contains 4 fixed-length columns (as VariableWidthColumnCount was 1) but we have to remember that trailing NULL variable-width columns aren't counted in VariableWidthColumnCount so this can be misleading. It's possible that this table contains a single fixed-width column, plus the NVARCHAR T-SQL column, plus 3 other variable-width columns which happen to always be NULL in the single record we've extracted per page. But that's not where the smart money is. The most likely set-up of this table is 4 fixed-width columns and a single variable-width column. So that's what I'm going to assume until I find some evidence to the contrary.

We can see from SumOfFixedWidths that we have 10 bytes spread amongst the (presumed) 4 fixed-width columns. So we know straight away that there can't be any 8-byte columns, as the remaining 2 bytes couldn't be split amongst the remaining three columns. So we know the table doesn't contain any MONEY, DATETIME or FLOAT columns. Let's take a peek at the hexadecimal dump of the record to see what we can learn next. I'll focus in on the sample record from page (1:59).
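First, though, a quick sanity check on that 8-byte claim. Brute-forcing every split of 10 bytes across 4 columns over the common fixed widths (1, 2, 4 and 8 bytes; ignoring CHAR(n), BINARY(n) and the 5/9/13/17-byte DECIMALs for the moment) confirms that no valid split contains an 8:

```python
from itertools import product

widths = (1, 2, 4, 8)   # TINYINT, SMALLINT, 4-byte types, 8-byte types
splits = [c for c in product(widths, repeat=4) if sum(c) == 10]
print(splits)
# No split summing to 10 can include an 8: the other three columns
# would then have to share just 2 bytes between them.
assert all(8 not in c for c in splits)
```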

SELECT SampleSlotData
FROM   LazarusPage
WHERE  m_pageId = '(1:59)'
SampleSlotData
0x30000E000100000019000100000005000001003F00280063006F006E007600650072007400280073006D0061006C006C0069006E0074002C00300029002900

As the record contains offsets to other parts of the record, it's very useful to have a byte number alongside the actual data, so I'll re-present it in a fixed-width font and with a zero-based byte number along the top:

0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 6 6 6
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
0x30000E000100000019000100000005000001003F00280063006F006E007600650072007400280073006D0061006C006C0069006E0074002C00300029002900

I realise that the data will be cropped on the right, but hopefully that represents the NVARCHAR column so we can ignore it. We'll find out for sure in a moment. Let's work through the data byte-by-byte.

Byte 0 is TagA. This contains various bit flags which we'll ignore for the moment (normally they're not of interest).

Byte 1 is TagB. This should be 0 on SQL Server 2005 (and 2000 and 2008, I believe).

Bytes 2 and 3 are Fsize, the offset within the record at which the column count can be found. This contains the value 0x000E (remember the bytes are stored flipped, i.e. little-endian), which specifies that NCol, the column count, can be found at offset 14. The two-byte NCol at that offset contains the value 0x0005.

Every byte between Fsize and NCol belongs to one of the fixed-width columns, which are the bytes we're particularly interested in.

Let's split this section into individual bytes to make it a little easier to read. We can also dispense with the byte offsets, as they don't really add any value here.

01 00 00 00 19 00 01 00 00 00
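For anyone following along outside SQL Server, the whole header walk-through above can be reproduced in a few lines of Python (a sketch which assumes only the layout just described: TagA, TagB, a little-endian Fsize, and NCol at that offset):

```python
# The sample record from page (1:59), copied from the dump above
record = bytes.fromhex(
    "30000E000100000019000100000005000001003F00280063006F006E00760065"
    "0072007400280073006D0061006C006C0069006E0074002C00300029002900")

tag_a, tag_b = record[0], record[1]                       # 0x30, 0x00
fsize = int.from_bytes(record[2:4], "little")             # 14: where NCol lives
ncol = int.from_bytes(record[fsize:fsize + 2], "little")  # 5 columns
fixed = record[4:fsize]                                   # the 10 fixed-width bytes
print(tag_a, tag_b, fsize, ncol, fixed.hex())
```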

In my experience, 0x01 followed by one or more 0x00s tends to suggest either the positive sign byte of a DECIMAL or the least significant byte of an integer column. So my guess (and that's all it is at this point) is that the first column in this table contains an 0x01 TINYINT, an 0x0001 SMALLINT, an 0x00000001 INT, or the start of a DECIMAL (with the leading 0x01 representing a positive sign byte).

But all that's a little too vague for my liking. I think our chances will be improved by looking at some other records within the same table. We used some code in Part 3 to split pages into records, so I'll use that again here. That code required that each page being split had a known table name, so let's give them one:

UPDATE LazarusPage
SET    TableName = 'Unknown01'
WHERE  m_pageId IN ('(1:59)', '(1:61)', '(1:62)')

At this point I can run the code from Part 3 to populate LazarusRecord, and then extract just the 10-bytes belonging to the fixed-width columns as follows:

SELECT SUBSTRING(Data, 5, 10)
FROM   LazarusRecord
WHERE  TableName = 'Unknown01'

This produced 19 rows. This is fewer rows than I was expecting: the fixed-length portions of the records are very short, and we've got three 8K pages across which to store them. So I guess that NVARCHAR column containing T-SQL must get pretty wide sometimes. We'll need to make sure we define it as an NVARCHAR(MAX) to ensure we don't get any truncation. Anyway, back to the 19 rows themselves:

01000000190001000000
020000001C0001000000
020000001D0001000000
03000000160001000000
03000000170001000000
03000000180001000000
03000000190001000000
030000001A0001000000
030000001B0001000000
030000001C0001000000
030000001D0001000000
030000001E0001000000
030000001F0001000000
03000000200001000000
06000000060001000000
06000000070001000000
06000000080001000000
06000000090001000000
060000000A0001000000

Okay, doing that has shed a little light on the situation. Not as much as I'd have liked, but a little. I said earlier that the first byte may have been the sign byte of a DECIMAL column. It's clear now that it isn't: a DECIMAL's sign byte can only contain the values 0x00 and 0x01. So this must represent the least significant byte of a TINYINT, SMALLINT or INT.

If you glance down the middle of the table above you'll notice what seems to be an incremental number. There are some duplicates (0x19, 0x1C and 0x1D) but where those duplicates exist, the first byte of the record is different. So the 'middle' column may well represent a sub-type, where the column on the far left represents the type. We can also reasonably assume that the 0x19 byte is the least significant byte of some integer field (a REAL just isn't going to present 'clean' numbers like this).
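That reading (byte 0 as a type, byte 4 as a sub-type) can be checked mechanically. A sketch over a handful of the rows above, pairing the two bytes up:

```python
# A subset of the 19 fixed-width dumps, chosen to include the duplicates
rows = [
    "01000000190001000000", "020000001C0001000000", "020000001D0001000000",
    "03000000190001000000", "030000001C0001000000", "030000001D0001000000",
    "06000000060001000000",
]
pairs = [(bytes.fromhex(r)[0], bytes.fromhex(r)[4]) for r in rows]
print(pairs)
# The duplicated sub-type bytes (0x19, 0x1C, 0x1D) each sit under a
# different type byte, so the (type, sub-type) pairs are all distinct.
assert len(set(pairs)) == len(pairs)
```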

It's still possible that the 0x19 byte represents the least significant byte of a DECIMAL, in which case the preceding byte is a negative sign byte. The 3 bytes to the left of that must therefore be split between a SMALLINT and a TINYINT. (The only 3-byte field would be a CHAR(3), and these bytes certainly don't look like sensible ASCII values to me.) If all this were true then the DECIMAL must occupy a total of 5 bytes (DECIMALs occupy 5, 9, 13 or 17 bytes). Which leaves the last two bytes for a final column. I don't believe this is the actual structure, but I'm going to show that it's a possible structure.

Let's populate LazarusColumn along these lines and then parse the table. Note that it doesn't really matter how we decide the first three bytes are split, the second column will always contain 0x00 or 0x0000 and the first column will either contain no high-order byte or a 0x00 in its high-order byte.

INSERT LazarusColumn VALUES ('Unknown01', 'ColA', 48  /* TINYINT  */,  1, 0, 0, 1)
INSERT LazarusColumn VALUES ('Unknown01', 'ColB', 52  /* SMALLINT */,  2, 0, 0, 2)
INSERT LazarusColumn VALUES ('Unknown01', 'ColC', 106 /* DECIMAL  */,  5, 9, 0, 3)
INSERT LazarusColumn VALUES ('Unknown01', 'ColD', 52  /* SMALLINT */,  2, 0, 0, 4)
INSERT LazarusColumn VALUES ('Unknown01', 'ColE', 231 /* NVARCHAR */, -1, 0, 0, 5)

Then I run the code in Part 5, specifying a @TableName of 'Unknown01', to obtain the following rowset:

ColA | ColB | ColC | ColD | ColE
1 | 0 | -65561 | 0 | (convert(smallint,0))
2 | 0 | -65564 | 0 | (8000)
2 | 0 | -65565 | 0 | (case when ([rowcnt] > 2147483647) then 2147483647 else (convert(int,[rowcnt])) end)
3 | 0 | -65558 | 0 | (convert(tinyint,(([bitpos] & 7) + case when ([typestat] & 1 = 0) then 8 else 0 end + case when (([typestat] & 2 <> 0 or (type_name([xtype]) = 'image' or type_name([xtype]) = 'text') and [colstat] & 0x1000 <> 0)) then 16 else 0 end + case when (((type_nam
3 | 0 | -65559 | 0 | (convert(tinyint,xtypetotds([xtype],(1 - ([typestat] & 1)))))
3 | 0 | -65560 | 0 | (convert(smallint,columnproperty([id],[name],'oldusertype')))
3 | 0 | -65561 | 0 | (convert(varchar(255),[autoval]))
3 | 0 | -65562 | 0 | (convert(smallint,case when ((type_name([xtype]) = 'ntext' or (type_name([xtype]) = 'image' or type_name([xtype]) = 'text'))) then null when (type_name([xtype]) = 'uniqueidentifier') then [xprec] else (odbcprec([xtype],[length],[xprec])) end))
3 | 0 | -65563 | 0 | (odbcscale([xtype],[xscale]))
3 | 0 | -65564 | 0 | (convert(int,sign(([colstat] & 4))))
3 | 0 | -65565 | 0 | (convert(int,sign(([colstat] & 4))))
3 | 0 | -65566 | 0 | (convert(int,(1 - ([typestat] & 1))))
3 | 0 | -65567 | 0 | (convert(sysname,collationpropertyfromid([collationid],'name')))
3 | 0 | -65568 | 0 | (convert(binary(5),collationpropertyfromid([collationid],'tdscollation')))
6 | 0 | -65542 | 0 | (convert(smallint,(2 + 4 * ([status] & 1))))
6 | 0 | -65543 | 0 | (convert(smallint,0))
6 | 0 | -65544 | 0 | (convert(bit,([status] & 1)))
6 | 0 | -65545 | 0 | (convert(bit,([status] & 2)))
6 | 0 | -65546 | 0 | (convert(nvarchar(4000),case when ([status] & 2 = 2) then (uncompress([ctext])) else [ctext] end))

Now, as I said, I don't believe this represents the correct structure of this table. But it's important to note that it might.

One point worth noting is that we've had no NULLs returned. We didn't check the NULL bitmap beforehand, so we weren't really sure what to expect. Also note that I have no idea whether the NVARCHAR column is the first column, the last column, or somewhere in the middle. The record layout can tell us the order of the fixed-length columns, and of the variable-length columns, but not one relative to the other. As our table clearly had no NULLs within it, the column sequence doesn't really matter. But if it did contain NULLs we'd have to get the column sequence right, or the bits in the NULL bitmap would be associated with the wrong columns. From what I've seen, when the bit in the NULL bitmap is set for a fixed-length column the column itself contains all 0x00 bytes; for a variable-length column it simply has zero length. If the NULL bitmap says your column is NULL but it actually contains data, you've got your column order wrong.
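To illustrate the NULL bitmap mechanics, here's a sketch (my own helper, not code from earlier parts) which reads the bitmap from the record layout we've been using: one bit per column, low-order bit first, padded to whole bytes, starting immediately after NCol:

```python
def null_bitmap(record: bytes) -> list:
    """Return one bool per column: True where the column's NULL bit is set."""
    fsize = int.from_bytes(record[2:4], "little")
    ncol = int.from_bytes(record[fsize:fsize + 2], "little")
    nbytes = (ncol + 7) // 8                    # bitmap is padded to whole bytes
    bits = int.from_bytes(record[fsize + 2:fsize + 2 + nbytes], "little")
    return [bool(bits >> i & 1) for i in range(ncol)]

# The sample record from (1:59): all five bits are clear, so no NULLs
record = bytes.fromhex(
    "30000E000100000019000100000005000001003F00280063006F006E00760065"
    "0072007400280073006D0061006C006C0069006E0074002C00300029002900")
print(null_bitmap(record))
```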

Given that I think the 'middle' 0x19 byte looks like a sub-type of some kind, it's much more likely to be a positive number. The number could be 0x19 (TINYINT), 0x0019 (SMALLINT) or 0x00010019 (INT). Clearly it makes no difference to the data whether it's a TINYINT or a SMALLINT, as the high-order bytes would always be zero anyway. It being an INT looks unlikely to me: if it is a sub-type, relative to the type field on the far left, then I'd expect to see relatively small values for the sub-type once the type changed, but if the sub-type is an INT then all its values are greater than 65,536 (0x00010000). So, my money's on it being a SMALLINT. If the sub-type is a SMALLINT then the type probably is too (people tend to try and be consistent with their data types). So this suggests a fixed-length structure of SMALLINT/SMALLINT/SMALLINT/INT.

Let's define this as the structure and see what happens.

DELETE LazarusColumn WHERE TableName = 'Unknown01'
INSERT LazarusColumn VALUES ('Unknown01', 'ColA', 52  /* SMALLINT */,  2, 0, 0, 1)
INSERT LazarusColumn VALUES ('Unknown01', 'ColB', 52  /* SMALLINT */,  2, 0, 0, 2)
INSERT LazarusColumn VALUES ('Unknown01', 'ColC', 52  /* SMALLINT */,  2, 0, 0, 3)
INSERT LazarusColumn VALUES ('Unknown01', 'ColD', 56  /* INT      */,  4, 0, 0, 4)
INSERT LazarusColumn VALUES ('Unknown01', 'ColE', 231 /* NVARCHAR */, -1, 0, 0, 5)
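Outside SQL Server, this guess can be checked against the raw bytes with Python's struct module (the layout is assumed, little-endian throughout):

```python
import struct

# The 10 fixed-width bytes from (1:59)'s sample record
fixed = bytes.fromhex("01000000190001000000")

# Guessed structure: SMALLINT, SMALLINT, SMALLINT, INT
cola, colb, colc, cold = struct.unpack("<hhhi", fixed)
print(cola, colb, colc, cold)  # → 1 0 25 1
```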

A quick re-run of the code in Part 5 gives us the following:

ColA | ColB | ColC | ColD | ColE
1 | 0 | 25 | 1 | (convert(smallint,0))
2 | 0 | 28 | 1 | (8000)
2 | 0 | 29 | 1 | (case when ([rowcnt] > 2147483647) then 2147483647 else (convert(int,[rowcnt])) end)
3 | 0 | 22 | 1 | (convert(tinyint,(([bitpos] & 7) + case when ([typestat] & 1 = 0) then 8 else 0 end + case when (([typestat] & 2 <> 0 or (type_name([xtype]) = 'image' or type_name([xtype]) = 'text') and [colstat] & 0x1000 <> 0)) then 16 else 0 end + case when (((type_nam
3 | 0 | 23 | 1 | (convert(tinyint,xtypetotds([xtype],(1 - ([typestat] & 1)))))
3 | 0 | 24 | 1 | (convert(smallint,columnproperty([id],[name],'oldusertype')))
3 | 0 | 25 | 1 | (convert(varchar(255),[autoval]))
3 | 0 | 26 | 1 | (convert(smallint,case when ((type_name([xtype]) = 'ntext' or (type_name([xtype]) = 'image' or type_name([xtype]) = 'text'))) then null when (type_name([xtype]) = 'uniqueidentifier') then [xprec] else (odbcprec([xtype],[length],[xprec])) end))
3 | 0 | 27 | 1 | (odbcscale([xtype],[xscale]))
3 | 0 | 28 | 1 | (convert(int,sign(([colstat] & 4))))
3 | 0 | 29 | 1 | (convert(int,sign(([colstat] & 4))))
3 | 0 | 30 | 1 | (convert(int,(1 - ([typestat] & 1))))
3 | 0 | 31 | 1 | (convert(sysname,collationpropertyfromid([collationid],'name')))
3 | 0 | 32 | 1 | (convert(binary(5),collationpropertyfromid([collationid],'tdscollation')))
6 | 0 | 6 | 1 | (convert(smallint,(2 + 4 * ([status] & 1))))
6 | 0 | 7 | 1 | (convert(smallint,0))
6 | 0 | 8 | 1 | (convert(bit,([status] & 1)))
6 | 0 | 9 | 1 | (convert(bit,([status] & 2)))
6 | 0 | 10 | 1 | (convert(nvarchar(4000),case when ([status] & 2 = 2) then (uncompress([ctext])) else [ctext] end))

This feels more plausible to me.

But Which is Right?

At this point my normal approach would be to show the data to users of the database to see whether they recognised the table based upon its content. Or I'd take a look at the applications which accessed the database to see if I could find one which stored T-SQL in a column. Of course, I'm pretty sure this table belongs to SQL Server itself, although I can't quite work out what it might be used for. Perhaps we should try to find out.

Earlier on we guessed that pages (1:59), (1:61) and (1:62) belonged to the same table as (1:53). But we were wrong. We have to remember that pages (1:59), (1:61) and (1:62) are orphaned (they are not currently allocated to any table) and that's why we're interested in them. If they really belonged to the same table as (1:53) then they'd have the same value for [m_objId (AllocUnitId.idObj)], and probably [Metadata: AllocUnitId], when we run DBCC PAGE against them. But they don't. So how do we establish the actual table name and table structure for these pages?

Well, when we populated LazarusInUsePage using DBCC IND back in Part 2, perhaps we should have connected using a Dedicated Administrator Connection (DAC) to ensure all system tables were visible to us. [Later...] Nope, I've just tried that and it didn't make any difference.

The Lazarus database, like all other SQL Server databases, was built from the model database, so it'd be interesting to see whether or not this data exists there too.

DBCC PAGE ('Model', 1, 59, 1) WITH TABLERESULTS

The output from this statement (with a lot of details removed) is:

ParentObject | Object | Field | VALUE
...
PAGE HEADER: | Page @0x06F58000 | m_pageId | (1:59)
...
PAGE HEADER: | Page @0x06F58000 | m_prevPage | (1:53)
PAGE HEADER: | Page @0x06F58000 | m_nextPage | (1:61)
...
DATA: | Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Record Type | PRIMARY_RECORD
DATA: | Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Record Attributes | NULL_BITMAP VARIABLE_COLUMNS
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 00000000: 30000e00 02000000 1d000100 00000500 000100bd †0...................
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 00000014: 00280063 00610073 00650020 00770068 0065006e †.(.c.a.s.e. .w.h.e.n
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 00000028: 00200028 005b0072 006f0077 0063006e 0074005d †. .(.[.r.o.w.c.n.t.]
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 0000003C: 0020003e 00200032 00310034 00370034 00380033 †. .>. .2.1.4.7.4.8.3
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 00000050: 00360034 00370029 00200074 00680065 006e0020 †.6.4.7.). .t.h.e.n.
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 00000064: 00320031 00340037 00340038 00330036 00340037 †.2.1.4.7.4.8.3.6.4.7
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 00000078: 00200065 006c0073 00650020 00280063 006f006e †. .e.l.s.e. .(.c.o.n
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 0000008C: 00760065 00720074 00280069 006e0074 002c005b †.v.e.r.t.(.i.n.t.,.[
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 000000A0: 0072006f 00770063 006e0074 005d0029 00290020 †.r.o.w.c.n.t.].).).
Slot 2, Offset 0xc4, Length 189, DumpStyle BYTE | Memory Dump @0x6468C0C4 | 000000B4: 0065006e 00640029 00††††††††††††††††††††††††††.e.n.d.).
...

That certainly looks like the same data to me. I'm pretty sure I haven't changed model on my SQL Server instance, so it's likely that these pages have become orphaned as the result of some service pack or other update having been applied. If that's the case then the original, shipped, version of model.mdf should contain these as unorphaned rows - i.e. with the meta-data of their owning table still intact.

The Microsoft Windows SDK contains a tool called Orca which we can use to examine the various MSIs which form the SQL Server 2005 installer to establish which contains model.mdf. It turns out that it's in SqlRun_SQL.msi. We can then use msidb (from the same SDK) to extract the .CAB file from the MSI:

> msidb -d SqlRun_SQL.msi -x Sql.cab

We can then extract model.mdf and modellog.ldf from the .CAB:

> expand sql.cab -F:model*.* .

Doing this leaves a GUID on the end of the file names, so we need to quickly remove those:

> ren model.mdf.* model.mdf
> ren modellog.ldf.* modellog.ldf

We can then attach this original model database. (Yes I know sp_attach_db has been deprecated, but I prefer it and I'll continue to use it until it's gone.)

EXEC sp_attach_db 'OriginalModel', 'C:\Temp\Sql_RunSQL\model.mdf', 'C:\Temp\Sql_RunSQL\modellog.ldf'

Having taken a look at the original model database, I've confirmed that it too contains these orphaned pages. So perhaps we'll never know to which table they once belonged, and therefore what their actual structure is/was. My guess is that the original model database, created deep within Microsoft, once did have a 5-column table like the one we're parsing, but that the table was dropped before that database finally shipped as part of SQL Server 2005.

Although it's disappointing that I haven't been able to compare my guessed table structure with the actual one, I don't suppose it really matters. We've still worked through some techniques which helped us make educated guesses about its structure.

By the way, I'm not going to attempt to parse page (1:60) - this post is way too long already.

In Conclusion

I hope you've found this series useful and will be able to use parts of it if you ever find yourself needing to recover some dropped tables which can't be restored from a backup. The one thing you will need (in addition to this series) is patience - you will not recover your data quickly.
