Monday, 4 January 2010

Undropping a SQL Server Table - Part 3

Introduction

This series of posts describes the steps I went through to recover about 10GB worth of data in tables which had been accidentally dropped. In Part 1 I explained that no backup existed and that the use of third-party tools was not an option. I went on to provide a stored procedure which wrapper a call to DBCC PAGE. In Part 2 I used the stored procedure to extract a hexadecimal dump of relevant pages within the database, and to extract varies items of meta-data about both the page and the first record on the page. This post will use that information to identify to which table each page belonged and will then split the pages for a given table into individual records.

Terminology

Throughout this and the remaining posts of this series I'll attempt to consistently refer to the subdivision of a page within which data is stored as a slot, to the raw hexadecimal data within a slot as a record (or slot data), and to the parsed data as a row. I'll probably manage to deviate from this somewhere along the way, but that's how I'm intending to use those terms. Feel free to point out any mistakes.

Rogue Data Pages

I'm not sure in what way the pages which make up a SQL Server database are initialised. Logic would suggest that there's no point in initialising them at all as the GAM, SGAM, IAM etc. will simply not reference them. But we're not using those structures to identify whether a page is actually in use or not – we're parsing all pages. So is it possible for us to stumble across some random uninitialized page which looks like a data page simply because offset 1 of the page is 0x01? It seems that the answer is yes. I certainly came across some pages which had 0x01 in offset 1 but which were clearly not valid data pages by virtue of the fact that the slot 0 offset (i.e. the last 4 bytes of the page) pointed outside the page, or pointed to a location within the page header. Now, it could certainly be argued that I should really have discarded those in the code I presented in Part 2 of this series. But I didn't, I did it afterwards. So that's why I'm providing this snippet now:

USE DataRecovery
GO

-- first 96 bytes are the header, so < 96 can't be valid;
-- pages are 8K long, so offset must start before page end
UPDATE LazarusPage
SET    InUseDataPage = 1
WHERE  Slot0Offset  >= 96
AND    Slot0Offset  <= 8192

Table Meta-Data

In order to recognise which pages belong to which tables, we're going to need to know something about the tables we're attempting to recover. Specifically, for each table we're going to need to know how many columns it contained and what the total width of all the fixed-width columns was. I would have thought that most people will have this information – there'll be a CREATE TABLE script or a Visio diagram knocking about somewhere. If not, then you're going to have to deduce the table structure by manually inspecting the data we wrote into the Slot0Data and Slot0Text columns of our LazarusPage table – i.e. the hexadecimal dump of the first record. This won't be easy, but I'll give you some guidance on how you might go about it in a later post.

Strictly speaking, to identify each table's data pages we don't need to know anything about the individual columns within the table. If the total width of all the fixed-width columns is 32 we don't care if that's 32 1-byte fields, 1 32-byte field or anything in between. But in the next post we're going to need column-level meta-data, so it makes sense to derive the table-level meta data from column-level meta-data. Most of you will be familiar with the syscolumns system table or the sys.columns catalog view – this is where SQL Server stores it's column meta-data. The key pieces of information contained therein are the name, type and length of each column. If the column is a DECIMAL we'll also want it's precision and scale. We'll also need to know which table each column belongs to and what order the columns appear in. So let's create a cut-down version of syscolumns/sys.columns which we can populate with the meta-data about the columns in the tables we're attempting to recover.

USE DataRecovery
GO

CREATE TABLE dbo.LazarusColumn
(
  -- columns
  [TableName] SYSNAME  NOT NULL,
  [name]      SYSNAME  NOT NULL,
  xtype       TINYINT  NOT NULL,
  [length]    SMALLINT NOT NULL,
  xprec       TINYINT      NULL,
  xscale      TINYINT      NULL,
  colid       SMALLINT NOT NULL,
  -- constraints
  CONSTRAINT PK_LazarusColumn PRIMARY KEY (TableName, [name])
)
GO

The column names are modelled on those within the syscolumns system table. If you actually have a CREATE TABLE script, you can run that script and then use syscolumns to populate LazarusColumn. For example:

USE DataRecovery
GO

CREATE TABLE TestTable
(
  MyInt       INT,
  MySmallInt  SMALLINT,
  MyTinyInt   TINYINT,
  MyChar      CHAR(6),
  MyMoney     MONEY,
  MyDateTime  DATETIME,
  MyFloat     FLOAT,
  MyDecimal   DECIMAL (9,2),
  MyVarChar   VARCHAR(100) NULL,
  MyNVarChar  NVARCHAR(4)  NULL,
)
GO

INSERT LazarusColumn
SELECT 'TestTable', [name], xtype, [length], xprec, syscolumns.xscale, colid
FROM   syscolumns
WHERE  id = OBJECT_ID('[dbo].[TestTable]')

Or, if you prefer sys.columns over syscolumns:

INSERT LazarusColumn
SELECT 'TestTable', [name], system_type_id, max_length, [precision], scale, column_id
FROM   sys.columns
WHERE  object_id = OBJECT_ID('[dbo].[TestTable]')

Although we'll use the column-level meta-data in the next post, all we're really using it for now is as a means to get to the table-level meta-data:

USE DataRecovery
GO

CREATE TABLE dbo.LazarusTable
(
  -- columns
  TableName         SYSNAME,
  ColumnCount       INT,
  SumOfFixedWidths  INT,
  -- constraints
  CONSTRAINT PK_LazarusTable PRIMARY KEY (TableName)
)
GO

INSERT   LazarusTable
SELECT   TableName,
         COUNT(*),
         SUM(CASE WHEN xtype IN (167,231) THEN 0 ELSE [length] END)
FROM     LazarusColumn
GROUP BY TableName
GO

Columns with xtype 167 or 231 are VARCHAR and NVARCHAR columns respectively. If you have any other variable length columns (VARBINARY, perhaps) in your database, you'll want to add those to the list too.

Now that we know how many columns each table has, together with the sums of the fixed widths, we can attempt to identify each page based upon the sample record we stored in LazarusPage:

UPDATE LazarusPage
SET    TableName = LazarusTable.TableName
FROM   LazarusPage
JOIN   LazarusTable ON LazarusPage.ColumnCount      = LazarusTable.ColumnCount
                   AND LazarusPage.SumOfFixedWidths = LazarusTable.SumOfFixedWidths
WHERE  LazarusPage.InUseDataPage = 1
AND    LazarusPage.TableName IS NULL

Now, it's quite possible that two or more of your tables share the same ColumnCount and SumOfFixedWidths. If you're in this situation you'll have to wait until a later post in this series where I explain how to hand-pick your way through the Slot0Data to establish what the original table structure was.

Splitting Pages into Records

From now on we're going to be dealing with one table at a time: each table will have a different structure so life just becomes too complicated if we attempt to handle all those different structures at the same time.

For a given table, we want to parse the pages we've tagged as belonging to that table into individual records. We'll store the split records in a new table.

USE DataRecovery
GO

CREATE TABLE LazarusRecord
(
  -- columns
  TableName SYSNAME,
  FileId    INT,
  PageId    INT,
  SlotId    INT,
  Data      VARBINARY(MAX),  -- might be > 8000 bytes, hence the use of MAX
  -- constraints
  CONSTRAINT PK_LazarusRecord PRIMARY KEY (FileId, PageId, SlotId)
)
GO

The vast majority of logic in the code which splits the page into individual records will be familiar to you, as it's been lifted directly from the code we used to parse the page in my last post. In hindsight I could have combined the two but I ran them separately during the real recovery process, and so am presenting them separately here. Anyway, there wasn't enough space on the database I was recovering into to have LazarusPage and LazarusRecord fully populated at the same time.

There's little point in me stepping through the following code line-by-line as it's so similar to the previous code. The following script is set-up to parse the records for our TestTable table. Just change the value of the @TableName variable to use it for other tables.

USE DataRecovery
GO

DECLARE @TableName SYSNAME
SET @TableName = 'TestTable'
SET NOCOUNT ON

DECLARE @tblDBCCPAGE TABLE
(
  [ParentObject] VARCHAR(255),
  [Object]       VARCHAR(255),
  [Field]        VARCHAR(255),
  [VALUE]        VARCHAR(255)
)
  
DECLARE @FileId INT
DECLARE @PageId INT

DECLARE curPage CURSOR LOCAL FAST_FORWARD READ_ONLY FOR SELECT FileId, PageId FROM LazarusPage WHERE TableName = @TableName
OPEN curPage
FETCH NEXT FROM curPage INTO @FileId, @PageId
WHILE @@FETCH_STATUS = 0
BEGIN
  DELETE FROM @tblDBCCPAGE

  -- populate @tblDBCCPAGE with the results of a call to DBCC PAGE
  INSERT @tblDBCCPAGE
  EXECUTE pDBCC_PAGE_Lazarus @FileId, @PageId

  -- establish whether DBCC PAGE is using 8-byte or 16-byte addresses
  -- in it's memory dump (I assume this is a 32-bit/64-bit thing)
  DECLARE @AddressSize TINYINT
  SELECT @AddressSize = MAX(LEN([Object]) - LEN('Memory Dump @0x'))
  FROM   @tblDBCCPAGE
  WHERE  [Object] LIKE 'Memory Dump @0x%'

  -- extract the memory dump (as a VARCHAR string)
  DECLARE  @PageVarChar VARCHAR(MAX)
  SELECT   @PageVarChar = '0x'
  SELECT   @PageVarChar = @PageVarChar +
           SUBSTRING([VALUE], @AddressSize + 4 +  1, 8) +
           SUBSTRING([VALUE], @AddressSize + 4 + 10, 8) +
           SUBSTRING([VALUE], @AddressSize + 4 + 19, 8) +
           SUBSTRING([VALUE], @AddressSize + 4 + 28, 8) +
           SUBSTRING([VALUE], @AddressSize + 4 + 37, 8)
  FROM     @tblDBCCPAGE
  WHERE    [ParentObject] = 'DATA:'
  AND      [Object] LIKE 'Memory Dump%'
  ORDER BY [VALUE] -- logically SUBSTRING([VALUE], 1, @AddressSize)

  -- convert the VARCHAR page data to VARBINARY
  DECLARE @PageVarBinary VARBINARY(MAX)
  DECLARE @DynamicSQL NVARCHAR(MAX)
  SELECT  @DynamicSQL = N'SELECT @PageVarBinary = ' + SUBSTRING(@PageVarChar, 1, (8192 * 2) + 2) -- (inc. '0x')
  EXECUTE sp_executesql @DynamicSQL, N'@PageVarBinary VARBINARY(MAX) OUTPUT', @PageVarBinary OUTPUT 

  -- establish the slot count (from offsets 22 and 23; zero-based)
  DECLARE @SlotCount INT
  SELECT  @SlotCount = CONVERT(INT, SUBSTRING(@PageVarBinary, 24, 1) + SUBSTRING(@PageVarBinary, 23, 1))

  IF @SlotCount > 0
  BEGIN

    DECLARE @SlotId INT
    DECLARE @Slot0Offset INT

    SET @SlotId = 0
    WHILE @SlotId < @SlotCount
    BEGIN
      -- establish where this slot's row data begins within the page
      DECLARE @SlotOffset INT
      SELECT  @SlotOffset = CONVERT(INT, SUBSTRING(@PageVarBinary, 8192 - (@SlotId * 2), 1) + SUBSTRING(@PageVarBinary, 8191 - (@SlotId * 2), 1))

      -- extract the row data (we don't know how wide it is yet, so assume 8060 - the maximum row size)
      DECLARE @SlotData VARBINARY(MAX)
      SELECT  @SlotData = SUBSTRING(@PageVarBinary, @SlotOffset + 1, 8060) -- +1 because 0-based offset

      -- establish what type of record it is (from bits 1 to 3, ignoring bit 0)
      DECLARE @RecordType TINYINT
      SELECT @RecordType = (CONVERT(TINYINT, SUBSTRING(@SlotData, 1, 1)) & 0x0E) / 2

      -- if primary or forwarded
      IF @RecordType = 0 OR @RecordType = 1
      BEGIN

        -- establish how many columns there are (bytes 3 and 4 of the data in the slot)
        DECLARE @ColumnCountOffset INT
        SELECT  @ColumnCountOffset = CONVERT(INT, SUBSTRING(@SlotData, 4, 1) + SUBSTRING(@SlotData, 3, 1))
        DECLARE @ColumnCount INT
        SELECT  @ColumnCount = CONVERT(INT, SUBSTRING(@SlotData, @ColumnCountOffset + 2, 1) + SUBSTRING(@SlotData, @ColumnCountOffset + 1, 1))

        -- establish the sum of all the fixed widths
        DECLARE @SumOfFixedWidths INT
        SELECT  @SumOfFixedWidths = @ColumnCountOffset - 4 -- StatusBitsA (1) + StatusBitsB (1) + ColumnCountOfset (2)

        -- establish how many bytes are occupied by the NULL bitmap
        DECLARE @NullBitmapSize INT
        SELECT  @NullBitmapSize = ((@ColumnCount - 1) / 8) + 1

        -- establish how many variable-width columns there are
        DECLARE @VariableColumnCountOffset INT
        DECLARE @VariableColumnCount INT

        IF CONVERT(TINYINT, SUBSTRING(@SlotData, 1, 1)) & 0x20 = 0x20
        BEGIN
          SELECT @VariableColumnCountOffset = @ColumnCountOffset + 2 + @NullBitmapSize -- (+2 is for ColumnCount)
          SELECT @VariableColumnCount = CONVERT(INT, SUBSTRING(@SlotData, @VariableColumnCountOffset + 2, 1) + SUBSTRING(@SlotData, @VariableColumnCountOffset + 1, 1))
        END
        ELSE
        BEGIN
          SELECT @VariableColumnCountOffset = NULL
          SELECT @VariableColumnCount = 0
        END

        -- if this is a forwarded record, ignore the last variable-width columns - it's a pointer back to the forwarding stub
        IF @RecordType = 1
          SET @VariableColumnCount = @VariableColumnCount - 1

        -- establish how wide the row actually is
        DECLARE @SlotSize INT
        IF @VariableColumnCount = 0
        BEGIN
          -- StatusBitsA (1) + StatusBitsB (1) + ColumnCountOffset (2) + @SumOfFixedWidths + @NullBitmapSize + VariableColumnCount (2)
          SELECT @SlotSize = 1 + 1 + 2 + @SumOfFixedWidths + @NullBitmapSize + 2
        END
        ELSE
        BEGIN
          -- immediately after the VariableColumnCount is an array of 2-bytes offsets which point to the byte immediately
          -- following that variable-length column; so the 0th offset points to the first byte of the 1st column, the 1st
          -- offset points to the first byte of the 2nd column, and the last offset points to the first byte of the NEXT
          -- row; this is therefore the length if this row
          SELECT @SlotSize = CONVERT(INT, SUBSTRING(@SlotData, @VariableColumnCountOffset + 2 + (2 * @VariableColumnCount), 1) + SUBSTRING(@SlotData, @VariableColumnCountOffset + 1 + (2 * @VariableColumnCount), 1))
        END

        -- now the we know the actual side of the row, reduce the size of @SlotData
        SELECT @SlotData = SUBSTRING(@SlotData, 1, @SlotSize)

        -- add the row data to the LazarusRecord table
        INSERT LazarusRecord VALUES (@TableName, @FileId, @PageId, @SlotId, @SlotData)
      END

      -- move onto the next slot
      SELECT @SlotId = @SlotId + 1
    END

  END

  FETCH NEXT FROM curPage INTO @FileId, @PageId
END

CLOSE curPage
DEALLOCATE curPage

SET NOCOUNT OFF
GO

So, what effect has running that script had? If it's worked correctly, our LazarusRecord table should now be populated with one row for each of the records originally in our TestTable table. There were originally 255 rows so let's check.

SELECT Data FROM LazarusRecord
Data
0x30002E000000000064000158595A3120201127000000000000C07A1000849B00000B7A6F0C0100F03F01650000000A0001FC020039003D004158003100
0x30002E0080841E0064000258595A322020224E00000000000080F52000849B00000B7A6F0C0100004001CA0000000A0002FC02003A003E00414158003200
0x30002E00C0C62D002C010258595A332020337500000000000040703100849B00001137A79201000840012F0100000A0004FE01003900414141
... etc. ...
0x30002E00C0FD320F9C63FF58595A323535EFE92600000000004031F7008E9B00002315C61602E06F40019B6400000A0000FC02003D00450041414141415800320035003500

Yup – 255 rows. So, we've managed to extract the hexadecimal data for all of the original records from our dropped table. In the next post we'll parse that record data into actual CHARs, DECIMALs etc.

See Also

No comments:

Post a Comment