Tuesday, 22 December 2009

Undropping a SQL Server Table - Part 2


In my last post, I explained how I ended up in a situation where I needed to recover data from a database where all the tables had been dropped, no backup existed, and third-party tools could not be used. I also explained how I created a stored procedure, pDBCC_PAGE_Lazarus, which could dump out the raw contents of a page on the now empty database and how I used a certificate to ensure that users could make use of the stored procedure even if they were not members of the sysadmin fixed server role.

This post will cover the next stage of the recovery process: obtaining the information to use when establishing which pages belonged to which dropped table.

Given the issues I'd had with third-party tools (namely not being allowed to install them) I decided early on that I wanted to make the recovery process as re-usable as possible. This meant that the recovery would need to take place within SQL Server, rather than within some C# application connected to the database. In fact my original goal was that I wanted to use Transact-SQL scripts only – no stored procedures or user-defined functions. This was based upon the somewhat suspect view that I wanted anyone, even those without the rights to create objects within a database, to recover the data. That goal was abandoned early on – as soon as I'd created pDBCC_PAGE_Lazarus, in fact. But the goal to stay within SQL Server persisted.


Throughout this and the remaining posts of this series I'll attempt to consistently refer to the subdivision of a page within which data is stored as a slot, to the raw hexadecimal data within a slot as a record (or slot data), and to the parsed data as a row. I'll probably manage to deviate from this somewhere along the way, but that's how I'm intending to use those terms. Feel free to point out any mistakes.

Testing the Recovery Process

To allow us to see what each step in the recovery process buys us, I'm going to create a table on my Lazarus database, add some rows to it, and then drop the table. During this post and subsequent ones I'll attempt to raise this data from the dead. The data in the table is pretty meaningless, but it uses a variety of data types, some fixed- and some variable-length, and it contains a few NULLs for good measure. So hopefully it's reasonably representative.

USE Lazarus
GO

CREATE TABLE TestTable
(
  MyInt       INT,
  MySmallInt  SMALLINT,
  MyTinyInt   TINYINT,
  MyChar      CHAR(6),
  MyMoney     MONEY,
  MyDateTime  DATETIME,
  MyFloat     FLOAT,
  MyDecimal   DECIMAL(9,2),
  MyVarChar   VARCHAR(100),
  MyNVarChar  NVARCHAR(4)
)

DECLARE @Counter INT
SET @Counter = 1

WHILE @Counter <= 255
BEGIN
  INSERT TestTable
  SELECT NULLIF(@Counter, 1) * 1000000                 AS MyInt,
         NULLIF(@Counter, 2) * 100                     AS MySmallInt,
         NULLIF(@Counter, 3)                           AS MyTinyInt,
         'XYZ' + CONVERT(CHAR(3), @Counter)            AS MyChar,
         @Counter + (@Counter / 10000.0)               AS MyMoney,
         DATEADD(hh, @Counter, '20090101')             AS MyDateTime,
         @Counter + (@Counter / 1000000.0)             AS MyFloat,
         @Counter + (@Counter / 100.0)                 AS MyDecimal,
         REPLICATE('A', @Counter % 10)                 AS MyVarChar,
         N'X' + CONVERT(NVARCHAR, NULLIF(@Counter, 3)) AS MyNVarChar

  SET @Counter = @Counter + 1
END
GO

DROP TABLE TestTable


Don't Waste Time on Allocated Pages

I suspect most people reading this post won't have dropped every table in their database – just one or two. DBCC PAGE isn't fast, so we want to avoid using it against pages which don't belong to dropped tables. To give you some guidance, executing a DBCC PAGE against every page in a tiny 280-page (2 MB) database on my laptop takes about 10 seconds; doing so against a 2,000,000-page (15 GB) slice of the 'real' Lazarus took about 20 hours. So that's roughly 1675 pages (13 MB) per minute in both cases. Some of that time is probably the security context switch we introduced in the last post, but not much, from the testing I've done. So, we don't want to call DBCC PAGE against pages whose ownership is not in question.

The following script will identify those pages which are allocated to existing tables – tables we haven't dropped. It relies upon the undocumented DBCC IND command.

USE DataRecovery
GO

CREATE PROCEDURE pDBCC_IND_Lazarus
  @TableName SYSNAME
AS
BEGIN
  DBCC IND ('Lazarus', @TableName, -1)
END
GO

CREATE TABLE LazarusInUsePage
(
  PageFID         INT,
  PagePID         INT,
  IAMFID          INT,
  IAMPID          INT,
  ObjectID        INT,
  IndexID         INT,
  PartitionNumber INT,
  PartitionID     BIGINT,
  iam_chain_type  VARCHAR(100),
  PageType        INT,
  IndexLevel      INT,
  NextPageFID     INT,
  NextPagePID     INT,
  PrevPageFID     INT,
  PrevPagePID     INT
)


DECLARE @TableName SYSNAME

DECLARE TableCursor CURSOR FOR
SELECT '[' + schemas.name + '].[' + objects.name + ']'
FROM   Lazarus.sys.objects
JOIN   Lazarus.sys.schemas ON objects.schema_id = schemas.schema_id
WHERE  [type] IN ('S', 'U')

OPEN TableCursor

FETCH NEXT FROM TableCursor INTO @TableName
WHILE @@FETCH_STATUS = 0
BEGIN
  INSERT LazarusInUsePage
  EXECUTE pDBCC_IND_Lazarus @TableName
  FETCH NEXT FROM TableCursor INTO @TableName
END

CLOSE TableCursor
DEALLOCATE TableCursor

We'll use the information in LazarusInUsePage later to reduce the number of pages we have to issue a DBCC PAGE command for.

Whose Page is it Anyway?

The first thing I needed to do in my own recovery process was establish which pages contained data I needed to recover. It was a 170GB database but I knew that only about 10GB needed to be recovered – many of the tables contained data I simply didn't need.

Given the output of a call to DBCC PAGE, there wasn't anything obvious which I could use to identify which table a page used to belong to. If the meta-data had still existed, I could have made use of [Metadata: ObjectId] or perhaps [m_objId (AllocUnitId.idObj)], but with no meta-data available for a dropped table (i.e. no matching entry in sys.tables or the other system tables or catalog views) these weren't really of much help. I needed to actually look at the data in the page and see what table it looked like it used to belong to. I know that sounds mad, and sounds difficult to automate, but the structure of a data record actually makes it viable.

You'll find the structure of a data record fully documented in several places (e.g. on-line at Inside the Storage Engine: Anatomy of a Record or in Kalen Delaney's Inside Microsoft® SQL Server™ 2005: The Storage Engine). To save you flipping between different sources of information, I'll summarise the structure here.

Offset                                              Size                   Mnemonic
0                                                   1                      TagA
1                                                   1                      TagB
2                                                   2                      Fsize
4                                                   Fsize - 4              Fdata
Fsize                                               2                      Ncol
Fsize + 2                                           ((NCol - 1) / 8) + 1   Nullbits
Fsize + 2 + size of Nullbits                        2                      VarCount
Fsize + 2 + size of Nullbits + 2                    VarCount * 2           VarOffset
Fsize + 2 + size of Nullbits + 2 + (VarCount * 2)   (varies)               VarData

Note that all two-byte fields have their bytes reversed, so the hexadecimal value 0xABCD is actually stored with the byte 0xCD before 0xAB. Okay, let's quickly walk through that structure.
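Just to make that byte reversal concrete, here's a throwaway sketch (mine, not part of the recovery script) using the same SUBSTRING-and-swap trick the parsing code later in this post relies on:

```sql
-- Read a little-endian two-byte field: swap the bytes, then convert.
DECLARE @TwoBytes VARBINARY(2)
SET @TwoBytes = 0xCDAB  -- as stored on the page (low-order byte first)

SELECT CONVERT(INT, SUBSTRING(@TwoBytes, 2, 1) + SUBSTRING(@TwoBytes, 1, 1)) AS LogicalValue
-- LogicalValue = 43981, i.e. 0xABCD
```

Note that the + here is binary concatenation, not addition – SUBSTRING against a VARBINARY returns a VARBINARY.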

  • The first byte contains various bit flags which define certain characteristics about the record. Some of these are important to us. [TagA]
  • The next byte is unused in SQL Server 2005 (and, I believe, in SQL Server 2000 and SQL Server 2008 too). [TagB]
  • The next two bytes contain the offset within the record at which the column count can be found. Don't be confused by the mnemonic Fsize: it's not really the size of anything, it's an offset. [Fsize]
  • The data belonging to fixed width fields (CHAR, DECIMAL, INT, FLOAT, DATETIME, etc.) comes next. Obviously the size of this section varies from table to table. Space is reserved here for all fixed-width columns – even those which are NULL. [Fdata]
  • The next two bytes contains a count of the number of columns in the record. If it seems odd to you to have this after the data for those columns, then you're not alone. [Ncol]
  • We then have 1 bit per column to define whether that column contains a NULL or not. A bit exists for each column in the table, regardless of whether it has been declared as NULLable or not. This field is padded to a byte boundary and so will be one byte long for tables containing between 1 and 8 columns, two bytes long for tables containing between 9 and 16 columns, etc. [Nullbits]
  • Following the NULL bitmap is a two-byte count of the number of columns for which variable-width data is stored. There are a couple of quirks to this: (a) trailing NULL variable-width columns have no entry in the record and so are not counted here, and (b) SQL Server stores the back-pointer to a forwarding record stub here, as if it were a variable width column. I'll cover these quirks in more detail as we move through the recovery process. [VarCount]
  • Next we get an array of offsets at which each variable width column starts. The first entry in this array defines the offset at which the second variable-width column starts, the second entry defines the start of the third variable width column, etc. There's no point in it storing the start of the first variable-width column as that always starts immediately after the offset array. This behaviour also means that the last entry in this offset array actually points to the first byte of another record (or to free space within the page). [VarOffset]
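As the Nullbits sizing trips a few people up, here's a quick sketch (mine) of that calculation as T-SQL:

```sql
-- NULL bitmap width: one bit per column, padded to a whole byte.
DECLARE @Ncol INT
SET @Ncol = 10

SELECT ((@Ncol - 1) / 8) + 1 AS NullBitmapBytes
-- 2 for a 10-column table: columns 1-8 in the first byte, 9-10 in the second
```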

So, even without knowing what table a data page belongs to, given a sample record from that page we can establish:

  • how many columns it has [Ncol], and
  • what the total width of all the fixed-width columns is [Fsize – 4].

As all records on a page always belong to the same table, we only need to perform this check for the first record per page. For the tables I was attempting to undrop, the combination of the column count and the total width of fixed-width columns was almost unique. I therefore decided to use this to establish which table the data originally belonged to. Of course, this relies upon knowing the table structure of all the tables you've dropped, via scripted CREATE TABLE statements. If you don't even have those you're going to find the recovery process even harder than I did. Not impossible, but a lot harder.
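By way of example (my arithmetic, not something the script produces), the signature we should expect for the TestTable created earlier is 10 columns and 42 bytes of fixed-width data:

```sql
-- TestTable's expected 'signature': the column count, plus the sum of the
-- fixed-width column sizes (the VARCHAR/NVARCHAR columns don't contribute).
SELECT 10                             AS ExpectedColumnCount,
       4 + 2 + 1 + 6 + 8 + 8 + 8 + 5  AS ExpectedSumOfFixedWidths
-- INT (4) + SMALLINT (2) + TINYINT (1) + CHAR(6) (6) + MONEY (8)
-- + DATETIME (8) + FLOAT (8) + DECIMAL(9,2) (5) = 42
```

Those two numbers should show up again when we query LazarusPage at the end of this post.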

Page Parsing

So, we need some code which will perform a DBCC PAGE on every page in the database, locate the first record on that page and parse some simple data from that record which we can use to identify the original table. Whilst we're at it, we may as well capture all the information returned by DBCC PAGE, in case it turns out useful. And we'll capture the first record in full, in both hexadecimal and plain text format too – as that may also help us identify the owning table.

Before I move on to the code, I should just mention the structure of a page (as opposed to a record). A page is an 8K structure which belongs to a given table – data from multiple tables is never mixed on the same page. The first 96 bytes of the page contain header information. The last 4 bytes of the page contain the offset at which the first record can be found, the preceding 4 bytes contain the offset at which the second record can be found etc. This is explained in much more detail at Inside the Storage Engine: Anatomy of a Page.
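To save you counting bytes: the slot array grows backwards from the end of the page, so (sketching, with zero-based offsets) slot n's two-byte entry starts at offset 8192 - 2 × (n + 1):

```sql
-- Where to find slot n's entry in the slot array of an 8192-byte page
-- (zero-based offsets; slot 0's entry occupies the final two bytes).
DECLARE @SlotId INT
SET @SlotId = 1

SELECT 8192 - (2 * (@SlotId + 1)) AS SlotEntryOffset
-- 8190 for slot 0, 8188 for slot 1, and so on
```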

We're going to need somewhere to store the results of all this parsing. As I mentioned in Part 1 of this series, it's important that you don't store any information in the database containing the dropped tables, as you may well overwrite data you're attempting to recover. So we'll be doing all our data recovery in an empty database called DataRecovery.

USE DataRecovery
GO

CREATE TABLE LazarusPage
(
  -- Primary key
  FileId                          INT        NOT NULL,
  PageId                          INT        NOT NULL,
  -- Page headers returned by DBCC PAGE
  [m_pageId]                      VARCHAR(255)   NULL,
  [m_headerVersion]               VARCHAR(255)   NULL,
  [m_type]                        VARCHAR(255)   NULL,
  [m_typeFlagBits]                VARCHAR(255)   NULL,
  [m_level]                       VARCHAR(255)   NULL,
  [m_flagBits]                    VARCHAR(255)   NULL,
  [m_objId (AllocUnitId.idObj)]   VARCHAR(255)   NULL,
  [m_indexId (AllocUnitId.idInd)] VARCHAR(255)   NULL,
  [Metadata: AllocUnitId]         VARCHAR(255)   NULL,
  [Metadata: PartitionId]         VARCHAR(255)   NULL,
  [Metadata: IndexId]             VARCHAR(255)   NULL,
  [Metadata: ObjectId]            VARCHAR(255)   NULL,
  [m_prevPage]                    VARCHAR(255)   NULL,
  [m_nextPage]                    VARCHAR(255)   NULL,
  [pminlen]                       VARCHAR(255)   NULL,
  [m_slotCnt]                     VARCHAR(255)   NULL,
  [m_freeCnt]                     VARCHAR(255)   NULL,
  [m_freeData]                    VARCHAR(255)   NULL,
  [m_reservedCnt]                 VARCHAR(255)   NULL,
  [m_lsn]                         VARCHAR(255)   NULL,
  [m_xactReserved]                VARCHAR(255)   NULL,
  [m_xdesId]                      VARCHAR(255)   NULL,
  [m_ghostRecCnt]                 VARCHAR(255)   NULL,
  [m_tornBits]                    VARCHAR(255)   NULL,
  GAM                             VARCHAR(255)   NULL,
  SGAM                            VARCHAR(255)   NULL,
  PFS                             VARCHAR(255)   NULL,
  DIFF                            VARCHAR(255)   NULL,
  ML                              VARCHAR(255)   NULL,
  -- Data parsed from Memory Dump returned by DBCC PAGE
  SlotCount                       INT            NULL,
  Slot0Offset                     INT            NULL,
  ColumnCount                     INT            NULL,
  SumOfFixedWidths                INT            NULL,
  VariableWidthColumnCount        INT            NULL,
  SampleSlotData                  VARBINARY(MAX) NULL,
  SampleSlotText                  VARCHAR(MAX)   NULL,
  -- Data infilled later
  InUseDataPage                   BIT            NULL,
  TableName                       SYSNAME        NULL,
  -- Constraints
  CONSTRAINT PK_LazarusPage PRIMARY KEY (FileId, PageId)
)

The idea is to populate this table for every page in every file in the database and to then use ColumnCount and SumOfFixedWidths to establish which table each page belongs to. Where these columns don't provide enough information, we can try using VariableWidthColumnCount and/or SampleSlotText. Before we can start visiting every page, we need to know how many pages, across how many files, we're going to have to visit. Note that this needs to run on the database which we're attempting to recover data for.

USE Lazarus

SELECT [file_id]  AS FileId,
       0          AS PageIdFrom,
       [size] - 1 AS PageIdTo
FROM   sys.database_files
WHERE  type_desc = 'ROWS'

Given the values for FileId, PageIdFrom and PageIdTo, we can then run the following script to call pDBCC_PAGE_Lazarus for each page, parse the first record, and then store the PAGE HEADER results and parsed first record in our new LazarusPage table. I'll present the T-SQL first, and then walk through it step by step. Note that I wrote this as a T-SQL script which I always planned to wrap up into a stored procedure. But I didn't then, and I haven't now.

USE DataRecovery
GO

-- STEP 1
DECLARE @FileId     INT
DECLARE @PageIdFrom INT
DECLARE @PageIdTo   INT

SET @FileId     = 1    -- <<< change this
SET @PageIdFrom = 0    -- <<< change this
SET @PageIdTo   = 279  -- <<< change this

DECLARE @PageId INT
SET @PageId = @PageIdFrom

-- STEP 2
DECLARE @tblDBCCPAGE TABLE
(
  [ParentObject] VARCHAR(255),
  [Object]       VARCHAR(255),
  [Field]        VARCHAR(255),
  [VALUE]        VARCHAR(255)
)

WHILE @PageId <= @PageIdTo
BEGIN

  -- provide some feedback as to how far we've got
  --PRINT @PageId

  -- STEP 3
  -- skip this page if we already know it belongs to a non-dropped table
  IF EXISTS (SELECT 1 FROM LazarusInUsePage WHERE PageFID = @FileId AND PagePID = @PageId)
  BEGIN
    SET @PageId = @PageId + 1
    CONTINUE
  END

  -- STEP 4
  -- populate @tblDBCCPAGE with the results of a call to DBCC PAGE
  DELETE @tblDBCCPAGE

  INSERT @tblDBCCPAGE
  EXECUTE pDBCC_PAGE_Lazarus @FileId, @PageId

  -- bail out if we got a non-zero error code
  IF @@ERROR <> 0
    BREAK

  -- STEP 5
  -- establish whether DBCC PAGE is using 8-character or 16-character addresses
  -- in its memory dump (I assume this is a 32-bit/64-bit thing)
  DECLARE @AddressSize INT
  SELECT @AddressSize = MAX(LEN([Object]) - LEN('Memory Dump @0x'))
  FROM   @tblDBCCPAGE
  WHERE  [Object] LIKE 'Memory Dump @0x%'

  -- STEP 6
  -- before we've even parsed the DBCC PAGE output properly, we want
  -- to confirm that this is a data page - there's no point wasting
  -- our time on it otherwise; note that m_type (i.e. @PageType)
  -- is at offset 1 of a data page (counting from 0)
  DECLARE @PageType CHAR(2)
  SELECT  @PageType = SUBSTRING([VALUE], @AddressSize + 4 + 3, 2)
  FROM    @tblDBCCPAGE
  WHERE   [ParentObject] = 'DATA:'
  AND     [Object] = 'Memory Dump @0x' + SUBSTRING([VALUE], 1, @AddressSize)

  IF @PageType = '01'
  BEGIN

    -- STEP 7
    -- extract the memory dump (as a VARCHAR string)
    DECLARE @PageVarChar VARCHAR(MAX)
    SELECT   @PageVarChar = '0x'
    SELECT   @PageVarChar = @PageVarChar +
             SUBSTRING([VALUE], @AddressSize + 4 +  1, 8) +
             SUBSTRING([VALUE], @AddressSize + 4 + 10, 8) +
             SUBSTRING([VALUE], @AddressSize + 4 + 19, 8) +
             SUBSTRING([VALUE], @AddressSize + 4 + 28, 8) +
             SUBSTRING([VALUE], @AddressSize + 4 + 37, 8)
    FROM     @tblDBCCPAGE
    WHERE    [ParentObject] = 'DATA:'
    AND      [Object] LIKE 'Memory Dump%'
    ORDER BY [VALUE] -- logically SUBSTRING([VALUE], 1, @AddressSize)

    -- STEP 8
    -- convert the VARCHAR page data to VARBINARY
    DECLARE @DynamicSQL    NVARCHAR(MAX)
    DECLARE @PageVarBinary VARBINARY(MAX)
    SELECT  @DynamicSQL = N'SELECT @PageVarBinary = ' + SUBSTRING(@PageVarChar, 1, (8192 * 2) + 2) -- (inc. '0x')
    EXECUTE sp_executesql @DynamicSQL, N'@PageVarBinary VARBINARY(MAX) OUTPUT', @PageVarBinary OUTPUT

    -- STEP 9
    -- establish the slot count (from offsets 22 and 23; zero-based)
    DECLARE @SlotCount INT
    SELECT  @SlotCount = CONVERT(INT, SUBSTRING(@PageVarBinary, 24, 1) + SUBSTRING(@PageVarBinary, 23, 1))

    -- STEP 10
    -- declare those variables which we'll only populate if we have a non-zero @SlotCount
    DECLARE @Slot0Offset         INT
    DECLARE @ColumnCount         INT
    DECLARE @SumOfFixedWidths    INT
    DECLARE @VariableColumnCount INT
    DECLARE @SlotData            VARBINARY(MAX)
    DECLARE @SlotText            VARCHAR(MAX)

    SET @Slot0Offset = NULL
    SET @ColumnCount = NULL
    SET @SumOfFixedWidths = NULL
    SET @VariableColumnCount = NULL
    SET @SlotData = NULL
    SET @SlotText = NULL

    IF @SlotCount > 0
    BEGIN

      DECLARE @SlotId INT
      SET @SlotId = 0

      -- STEP 11
      -- search for the first data record which isn't a forwarding stub
      WHILE @SlotId < @SlotCount
      BEGIN

        -- STEP 12
        -- establish where the record in this slot begins within the page, and then
        -- extract the record which is within this slot (we don't know how wide it is
        -- yet, so assume 8060 - the maximum record size)
        SELECT  @Slot0Offset = CONVERT(INT, SUBSTRING(@PageVarBinary, 8192 - (@SlotId * 2), 1) + SUBSTRING(@PageVarBinary, 8191 - (@SlotId * 2), 1))
        SELECT  @SlotData = SUBSTRING(@PageVarBinary, @Slot0Offset + 1, 8060) -- +1 because 0-based offset

        -- STEP 13
        -- establish what type of record it is (from bits 1 to 3, discarding bit 0)
        DECLARE @RecordType TINYINT
        SELECT @RecordType = (CONVERT(TINYINT, SUBSTRING(@SlotData, 1, 1)) & 0x0E) / 2

        -- STEP 14
        -- if this is a primary or forwarded record, we'll take it
        IF @RecordType = 0 OR @RecordType = 1
          BREAK

        SET @SlotId = @SlotId + 1
      END

      -- STEP 15
      -- establish how many columns there are (bytes 3 and 4 of the record)
      DECLARE @ColumnCountOffset INT
      SELECT  @ColumnCountOffset = CONVERT(INT, SUBSTRING (@SlotData, 4, 1) + SUBSTRING(@SlotData, 3, 1))
      SELECT  @ColumnCount = CONVERT(INT, SUBSTRING(@SlotData, @ColumnCountOffset + 2, 1) + SUBSTRING(@SlotData, @ColumnCountOffset + 1, 1))

      -- STEP 16
      -- establish the sum of all the fixed widths
      SELECT  @SumOfFixedWidths = @ColumnCountOffset - 4 -- StatusBitsA (1) + StatusBitsB (1) + ColumnCountOfset (2)

      -- STEP 17
      -- establish how many bytes are occupied by the NULL bitmap
      DECLARE @NullBitmapSize INT
      SELECT  @NullBitmapSize = ((@ColumnCount - 1) / 8) + 1

      -- STEP 18
      -- establish how many variable-width columns there are;
      -- not all rows have variable width columns, so check
      DECLARE @VariableColumnCountOffset INT
      IF CONVERT(TINYINT, SUBSTRING(@SlotData, 1, 1)) & 0x20 = 0x20
      BEGIN
        SELECT @VariableColumnCountOffset = @ColumnCountOffset + 2 + @NullBitmapSize -- (+2 is for ColumnCount)
        SELECT @VariableColumnCount = CONVERT(INT, SUBSTRING(@SlotData, @VariableColumnCountOffset + 2, 1) + SUBSTRING(@SlotData, @VariableColumnCountOffset + 1, 1))
      END
      ELSE
      BEGIN
        SELECT @VariableColumnCountOffset = NULL
        SELECT @VariableColumnCount = 0
      END

      -- STEP 19
      -- if this is a forwarded record, ignore the last variable-width column –
      -- it's actually a pointer back to the forwarding stub
      IF @RecordType = 1
        SET @VariableColumnCount = @VariableColumnCount - 1

      -- STEP 20
      -- establish how wide the record actually is
      DECLARE @SlotSize INT
      IF @VariableColumnCount = 0
        SELECT @SlotSize = 1 + 1 + 2 + @SumOfFixedWidths + @NullBitmapSize + 2
      ELSE
        SELECT @SlotSize = CONVERT(INT, SUBSTRING(@SlotData, @VariableColumnCountOffset + 2 + (2 * @VariableColumnCount), 1) + SUBSTRING(@SlotData, @VariableColumnCountOffset + 1 + (2 * @VariableColumnCount), 1))

      -- STEP 21
      -- now the we know the actual size of the record, reduce the size of @SlotData
      SELECT @SlotData = SUBSTRING(@SlotData, 1, @SlotSize)

      -- STEP 22
      -- extract any alpha-numeric characters in @SlotData, as they may help
      -- in identifying the owning table
      SELECT @SlotText = ''
      DECLARE @Position INT
      SELECT @Position = 1
      WHILE @Position <= @SlotSize
      BEGIN
        DECLARE @CurrentChar CHAR(1)
        SELECT @CurrentChar = SUBSTRING(@SlotData, @Position, 1)
        SELECT @SlotText = @SlotText + CASE WHEN @CurrentChar BETWEEN 'a' AND 'z' OR @CurrentChar BETWEEN 'A' AND 'Z' OR @CurrentChar BETWEEN '0' AND '9' THEN @CurrentChar ELSE '.' END
        SET @Position = @Position + 1
      END

    END -- IF @SlotCount > 0

    -- STEP 23
    -- write details of this page into the LazarusPage table
    INSERT LazarusPage
    SELECT @FileId,
           @PageId,
           MAX(CASE WHEN [Field] = 'm_pageId'                      THEN [VALUE] ELSE '' END) AS m_pageId,
           MAX(CASE WHEN [Field] = 'm_headerVersion'               THEN [VALUE] ELSE '' END) AS m_headerVersion,
           MAX(CASE WHEN [Field] = 'm_type'                        THEN [VALUE] ELSE '' END) AS m_type,
           MAX(CASE WHEN [Field] = 'm_typeFlagBits'                THEN [VALUE] ELSE '' END) AS m_typeFlagBits,
           MAX(CASE WHEN [Field] = 'm_level'                       THEN [VALUE] ELSE '' END) AS m_level,
           MAX(CASE WHEN [Field] = 'm_flagBits'                    THEN [VALUE] ELSE '' END) AS m_flagBits,
           MAX(CASE WHEN [Field] = 'm_objId (AllocUnitId.idObj)'   THEN [VALUE] ELSE '' END) AS [m_objId (AllocUnitId.idObj)],
           MAX(CASE WHEN [Field] = 'm_indexId (AllocUnitId.idInd)' THEN [VALUE] ELSE '' END) AS [m_indexId (AllocUnitId.idInd)],
           MAX(CASE WHEN [Field] = 'Metadata: AllocUnitId'         THEN [VALUE] ELSE '' END) AS [Metadata: AllocUnitId],
           MAX(CASE WHEN [Field] = 'Metadata: PartitionId'         THEN [VALUE] ELSE '' END) AS [Metadata: PartitionId],
           MAX(CASE WHEN [Field] = 'Metadata: IndexId'             THEN [VALUE] ELSE '' END) AS [Metadata: IndexId],
           MAX(CASE WHEN [Field] = 'Metadata: ObjectId'            THEN [VALUE] ELSE '' END) AS [Metadata: ObjectId],
           MAX(CASE WHEN [Field] = 'm_prevPage'                    THEN [VALUE] ELSE '' END) AS m_prevPage,
           MAX(CASE WHEN [Field] = 'm_nextPage'                    THEN [VALUE] ELSE '' END) AS m_nextPage,
           MAX(CASE WHEN [Field] = 'pminlen'                       THEN [VALUE] ELSE '' END) AS pminlen,
           MAX(CASE WHEN [Field] = 'm_slotCnt'                     THEN [VALUE] ELSE '' END) AS m_slotCnt,
           MAX(CASE WHEN [Field] = 'm_freeCnt'                     THEN [VALUE] ELSE '' END) AS m_freeCnt,
           MAX(CASE WHEN [Field] = 'm_freeData'                    THEN [VALUE] ELSE '' END) AS m_freeData,
           MAX(CASE WHEN [Field] = 'm_reservedCnt'                 THEN [VALUE] ELSE '' END) AS m_reservedCnt,
           MAX(CASE WHEN [Field] = 'm_lsn'                         THEN [VALUE] ELSE '' END) AS m_lsn,
           MAX(CASE WHEN [Field] = 'm_xactReserved'                THEN [VALUE] ELSE '' END) AS m_xactReserved,
           MAX(CASE WHEN [Field] = 'm_xdesId'                      THEN [VALUE] ELSE '' END) AS m_xdesId,
           MAX(CASE WHEN [Field] = 'm_ghostRecCnt'                 THEN [VALUE] ELSE '' END) AS m_ghostRecCnt,
           MAX(CASE WHEN [Field] = 'm_tornBits'                    THEN [VALUE] ELSE '' END) AS m_tornBits,
           MAX(CASE WHEN [Field] LIKE 'GAM %'                      THEN [VALUE] ELSE '' END) AS GAM,
           MAX(CASE WHEN [Field] LIKE 'SGAM %'                     THEN [VALUE] ELSE '' END) AS SGAM,
           MAX(CASE WHEN [Field] LIKE 'PFS %'                      THEN [VALUE] ELSE '' END) AS PFS,
           MAX(CASE WHEN [Field] LIKE 'DIFF %'                     THEN [VALUE] ELSE '' END) AS DIFF,
           MAX(CASE WHEN [Field] LIKE 'ML %'                       THEN [VALUE] ELSE '' END) AS ML,
           @SlotCount,           -- SlotCount
           @Slot0Offset,         -- Slot0Offset
           @ColumnCount,         -- ColumnCount
           @SumOfFixedWidths,    -- SumOfFixedWidths
           @VariableColumnCount, -- VariableWidthColumnCount
           @SlotData,            -- SampleSlotData
           @SlotText,            -- SampleSlotText
           NULL,                 -- InUseDataPage
           NULL                  -- TableName
    FROM   @tblDBCCPAGE
    WHERE  [ParentObject] = 'PAGE HEADER:'

  END -- IF @PageType = '01'

  -- STEP 24
  SET @PageId = @PageId + 1

END -- WHILE @PageId <= @PageIdTo

There's quite a lot going on there, so let's run through it step by step:

  • Step 1: We need to tell the script which file to perform DBCC PAGE operations against, and for which range of pages. If you have a small enough database you can just specify the values you got from the sys.database_files query earlier, but if your database is large it might make more sense to run it a section of a file at a time. Remember, you should expect to get through roughly 1675 pages (13 MB) per minute. You'll get much better performance than this if you discard a lot of pages from the recovery process due to their presence in LazarusInUsePage.
  • Step 2: We need to manipulate the output from DBCC PAGE a little, and the easiest way to do this is to dump the results into a table variable, so we declare @tblDBCCPAGE here for just that purpose.
  • Step 3: If this file/page exists in our LazarusInUsePage table, then we already know to which table it belongs, so we don't need to bother with any of the logic within this script – we're not going to need to recover the data.
  • Step 4: This is where we call DBCC PAGE for the current @FileId and @PageId and replace the contents of @tblDBCCPAGE with the results. If we've specified an invalid @FileId or @PageId, the call to DBCC PAGE will fail and we'll bail out of the loop.
  • Step 5: When I first wrote this code I was using a 32-bit version of SQL Server. When I ran it on a 64-bit version I noticed that the output of DBCC PAGE varied slightly, in that it used 16-character addresses rather than the 8-character ones I'd seen before. This code attempts to establish which type of address is being used. It checks by looking in the [Object] column, but later we'll assume that the same length addresses are being used in the [VALUE] column too.
  • Step 6: This logic is really a performance optimisation. Before we start merrily parsing the page, we want to quickly ensure that it's a data page, rather than an index page or a page belonging to the GAM, SGAM, IAM, or one of those other structures. As we haven't parsed the DBCC PAGE output at this point we have to go about it in a slightly wacky way. The WHERE clause is looking for the first row in the Memory Dump section of the DBCC PAGE output. Once we have that row, we're looking at the two characters which appear at positions 3 and 4 of the data itself. So in the output which begins "640EC000: 01010400...", which represents the bytes 0x01, 0x01, 0x04 and 0x00, we're looking for the second "01". This is the page type (what the DBCC PAGE output refers to as m_type) – data pages have a page type of 1.
  • Step 7: We now need to parse the Memory Dump section of the DBCC PAGE output into one long string. You'll notice that the output is prefixed with an address, then each row provides 40 characters (20 bytes) of data, and that after every 8 characters (4 bytes) there's a space. We obviously need to ensure we extract the rows in the correct order, hence the ORDER BY clause. Note that using a SUBSTRING in the ORDER BY actually prevents @PageVarChar from being concatenated to for each row, so we don't do that.
  • Step 8: This uses some dynamic SQL to convert the character string stored in @PageVarChar into a binary variable @PageVarBinary. Effectively, it's converting the six-character string "0x1234" into the bytes 0x12 and 0x34. If there's an easy way to do this without using dynamic SQL, please let me know.
  • Step 9: We're about to start looping through the records on the page looking for one which we can use to establish how many columns etc. are present. So we need to know how many records are on the page in total. The space on a page occupied by a record is referred to as a 'slot', hence why I've used that term here. The slot count (m_slotCnt) is a two-byte field at offsets 22 and 23. As with all multi-byte fields, the byte order is reversed.
  • Step 10: This just declares some variables we'll be populating in a moment. As we'll be re-using them as we loop through the pages, it just seems safer to force them to NULL here.
  • Step 11: When you update a record in SQL Server in such a way that it no longer fits on the same data page, then if the table is a heap (i.e. has no clustered index) SQL Server will place the data on a new page but leave a small 'forwarding' record in its place so any non-clustered indexes (which store the physical file/page/slot location of the row) can still find it. If the first record on a page happens to be a forwarding record we can either follow the pointer, or simply move onto the next record on the page. The simpler option seemed to be the latter so that's what I did. Hence this code starts a loop looking for a non-forwarding record.
  • Step 12: You'll recall that the offset into the page at which the first record starts is stored in the last two bytes of the page. So we read this offset and then the record itself (into @SlotData). As we don't know how large the record is at this time, we assume the worst and allocate 8060 bytes.
  • Step 13: Data pages can hold various types of records: primary records, forwarded records, forwarding records, blob fragment records etc. Bits 1 to 3 of offset 0 within the record form a 3-bit record type, which we extract here.
  • Step 14: As I mentioned earlier, we're not interested in forwarding records here (they simply point to a 'real' record elsewhere) and blob fragments aren't really much use to us either (particularly as in my 'Lazarus' database I knew we didn't have any). The only records we want to process are primary records (type 0) and forwarded records (type 1). If the current record is neither of those, then we loop around and check the next slot (which will have an offset stored two bytes prior to the previous record's offset).
  • Step 15: This is where we start to parse the record itself, based upon the details I provided at the start of this post. The first thing we do is establish where the column count is stored (from offsets 2 and 3) and then read that column count. As always, multi-byte fields have their bytes reversed.
  • Step 16: As we know that the bytes prior to the column count within the record are TagA (1 byte), TagB (1 byte), Fsize (2 bytes) and Fdata, we can establish the total width of all the fixed width fields (i.e. the size of Fdata) by subtracting 4 from the offset at which the column count is stored.
  • Step 17: The NULL bitmap varies in size according to the number of columns in the record (not the number of NULLable columns – the total number of columns). Each column occupies 1 bit within the NULL bitmap which is always a multiple of 8 bits wide. This step simply calculates how many bytes wide it is, given the number of columns.
  • Step 18: Bit 5 of TagA specifies whether or not an array of variable-width column offsets exists within the record. The most obvious scenario where no such array exists is if there are no variable-width columns in the table, but it will also happen if all variable-width columns are NULL. In this step we're just testing that bit, and then reading how many entries are in the array.
  • Step 19: As I mentioned earlier, when SQL Server has to move a record because it no longer fits on the original page following an update statement, it leaves a forwarding record to point to the new record. In the forwarded record it stores a back-pointer to the forwarding record so that can be updated if the record moves again. This back-pointer is stored in the variable-width column array, so we simply subtract one from the number of variable-width columns present if we're looking at a forwarded record.
  • Step 20: Now we can finally work out how wide the record is. If there weren't any variable width columns we simply add up the widths of those fixed-width parts of the record. If there were variable width columns, then we know that the last offset in the array of variable width offsets points one byte past the end of the last variable-width column; we can therefore use this to establish the record width.
  • Step 21: Although we originally created @SlotData (which holds the record) as a 8060-byte field, we can now shrink it as we know the actual size.
  • Step 22: I found that sometimes actually eye-balling the data helped confirm that I'd associated the page with the correct table. So we just rip through the bytes within @SlotData and create a text version of the same, designed to be human readable.
  • Step 23: Although this query looks a little odd, it's just a way of pivoting the data through 90 degrees and creating one column for each of the PAGE HEADER attributes returned from the DBCC PAGE query. We store those results in the LazarusPage table, along with those characteristics we've parsed from the first record in the page.
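Incidentally, on the open question in Step 8: from SQL Server 2008 onwards (though not on SQL Server 2005) the dynamic SQL can be avoided, because CONVERT accepts a style parameter for binary/character conversions. A sketch, not part of the script above:

```sql
-- Style 1 tells CONVERT to treat the string as hex with a leading '0x'
-- (SQL Server 2008 and later only).
DECLARE @PageVarChar   VARCHAR(MAX)
DECLARE @PageVarBinary VARBINARY(MAX)

SET @PageVarChar   = '0x01010400'
SET @PageVarBinary = CONVERT(VARBINARY(MAX), @PageVarChar, 1)

SELECT @PageVarBinary  -- 0x01010400
```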

Wasn't that easy? The net result is that we now have a table, LazarusPage, populated with various items of meta-data about each data page, which we can use to establish which table each page originally belonged to. What about the TestTable we created and dropped earlier? Was that picked up by this script? Well, by rights we should be looking for this data based upon our knowledge of the table's meta-data. But I won't be covering that until the next post. So for the time being we can cheat and use our knowledge that the MyChar column always contains the letters "XYZ":

SELECT FileId, PageId, SlotCount, ColumnCount, SumOfFixedWidths, SampleSlotText
FROM   LazarusPage
WHERE  SampleSlotText LIKE '%XYZ%'

You'll note that the sum of SlotCount across the 3 pages is 255 which tallies with the number of rows we inserted. So far so good.

In my next post, we'll use this information to populate the InUseDataPage and TableName columns of LazarusPage and will parse each page into individual records ready to parse the records themselves.
