Thursday, November 5, 2009

SQL 2008 Transactional Replication and initializing from a backup file

Now there's a fun process to go through, especially if, like me, you don't know jack about T-SQL. Let's face it, I currently know more words in Mandarin than I know commands in T-SQL, which isn't a lot. Having already done several tests with the automatic method of setting up Transactional Replication (where it does all the initial sync work for you and you just sit back and watch), I had assumed that initializing from a backup would be a breeze. Famous last words.

There are some articles out there on the web, but I found that most of them either assumed you knew more or just left out minor details. If, like me, you're trying to set up replication of a huge database over a bandwidth-limited connection, or if you have some other reason that the initial setup has to be done from a backup file, then here's the walkthrough. By the way, I've only done one-way Transactional Replication, as in my situation the subscriber is just a failover site and will not need to send changes back to the original server.

Steps (order is very important)

  1. Set up the Distributor – database and share – one time setup
  2. Set up Publisher on source server - Don’t use either checkbox for snapshot.
  3. Enable the flag under the Publication properties to allow “initialize from backup”. (Right-click on the publication, properties)
  4. Disable the distribution cleanup agents. (Under SQL agent jobs)
  5. Make a Full backup of the database. Keep a local copy as you'll need it later.
  6. Copy database to other site. (Over the network, courier pigeon, magic, whatever)
  7. Restore database with the same name
  8. Create pull subscription on the destination server. (see scripts provided below)
  9. Check status – Replication Monitor -> drill down to the publisher (add it if needed), then “View Details” on the subscription on the right to get a status report (3rd tab in the window that pops up).
  10. After it’s done synching up, turn off the “initialize from backup” flag or else the cache it keeps will never shrink, and re-enable the distribution cleanup agents.
  11. TEST IT. Check the tables after synchronization and then check again after new transactions have been sent. (Wait a few minutes after each replication interval to give it time to catch up).
The reason that order is so important is that after you configure the publisher to enable "Initialize from backup" and stop the cleanup jobs, the distributor starts keeping a full record of all transactions that have occurred since then. (Yes, the distribution DB could grow a lot depending on how long this takes.) The backup records a special value called an LSN (log sequence number). This value tells the server to only send transactions that occurred after the backup was made.
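Steps 3 to 5 can also be scripted instead of clicked through. What follows is a minimal sketch rather than a tested procedure. It reuses the placeholder names from the subscription scripts in this post (Your_DB_Name, Your_DB_Name_PUB, the e:\BACKUP path). It also assumes the distribution database kept its default name, distribution, so that the cleanup job is called 'Distribution clean up: distribution'. Check the real job name under SQL Server Agent first.

```sql
-- Run at the Publisher. All names below are placeholders.
USE [Your_DB_Name];
GO

-- Step 3: allow subscriptions to be initialized from a backup.
EXEC sp_changepublication
    @publication = N'Your_DB_Name_PUB',
    @property    = N'allow_initialize_from_backup',
    @value       = N'true';
GO

-- Step 4: disable the distribution cleanup job so no commands are purged.
-- The job lives on the Distributor (the same server as the Publisher in this walkthrough).
-- ASSUMPTION: default job name; yours may differ.
EXEC msdb.dbo.sp_update_job
    @job_name = N'Distribution clean up: distribution',
    @enabled  = 0;
GO

-- Step 5: full backup, taken only after steps 3 and 4.
-- INIT overwrites the file so that it holds a single backup set.
BACKUP DATABASE [Your_DB_Name]
    TO DISK = N'e:\BACKUP\Your_DB_Name090209.bak'
    WITH INIT, CHECKSUM;
GO
```

Keeping all three statements in one script makes it harder to take the backup before the flag is set, which is the mistake TIP 1 describes.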
Reference:
http://www.replicationanswers.com/NoSyncOn2005.asp

TIP 1: If you try to use a backup that was created before the publication was set up, it will fail.
TIP 2: If you get the Msg 21397 error mentioned in the link above, then you probably forgot to stop the Distribution cleanup agents and the server has thrown out some of the transactions that have occurred since the LSN (backup).
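Before restoring, it can help to confirm that the file you copied is the backup taken after the publication was set up. A rough sketch, assuming the file landed at the same path on the Subscriber (a hypothetical location): RESTORE HEADERONLY reports the backup's finish time and LSN range, and the RESTORE DATABASE that follows is step 7.

```sql
-- Run at the Subscriber. Path and database name are placeholders.

-- Inspect the backup header. BackupFinishDate should be later than the
-- time the publication was created; FirstLSN and LastLSN show the LSN
-- range the backup covers.
RESTORE HEADERONLY
    FROM DISK = N'e:\BACKUP\Your_DB_Name090209.bak';
GO

-- Step 7: restore under the same database name.
-- Add WITH MOVE clauses if the data and log file paths differ on this server.
RESTORE DATABASE [Your_DB_Name]
    FROM DISK = N'e:\BACKUP\Your_DB_Name090209.bak'
    WITH RECOVERY;
GO
```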

Step 10 is necessary because the distributor will keep waiting around for another subscriber, and in the meantime the MSrepl_commands table in your distribution database will continue to grow.
http://ilkirk.com/archive/2009/01/a-replication-fun-fact/
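Step 10 can be scripted the same way. This is a sketch under the same assumptions as the rest of this post: placeholder names, a distribution database called distribution, and the default cleanup job name. The last statement is optional and just reports how large the command table (MSrepl_commands) has become.

```sql
-- Run at the Publisher once the Subscriber has caught up.
USE [Your_DB_Name];
GO

-- Step 10: turn the "initialize from backup" flag back off.
EXEC sp_changepublication
    @publication = N'Your_DB_Name_PUB',
    @property    = N'allow_initialize_from_backup',
    @value       = N'false';
GO

-- Step 10, continued: re-enable the distribution cleanup job.
-- ASSUMPTION: default job name; check SQL Server Agent for the real one.
EXEC msdb.dbo.sp_update_job
    @job_name = N'Distribution clean up: distribution',
    @enabled  = 1;
GO

-- Optional: see how much space the stored commands are using.
USE [distribution];
GO
EXEC sp_spaceused N'dbo.MSrepl_commands';
GO
```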

Now we'll move on to the actual subscription scripts that you will run around step 8. (I'll assume that you set up the Publication through the GUI keeping in mind not to create a snapshot). If you don't know how to create the publication, see the help file or http://msdn.microsoft.com/en-us/library/ms151160.aspx

On the Publishing server we're going to use sp_addsubscription to define the initial subscription, then we'll run sp_addpullsubscription and sp_addpullsubscription_agent on the Subscriber machine. I created the subscriber scripts by using the GUI and choosing the export to script option at the end instead of executing the change. Then I modified the subscriber scripts and that's how I recommend that you set them up. My generalized scripts below should just be used as a guide.

Script 1: (Yes, the exec statement is really long. Run on the Publisher.)

-----BEGIN: Script to be run at Publisher 'Publishing_SQLServerName'------------
------- backupdevicename has to be located on the Publisher machine -------

use [Your_DB_Name]
exec sp_addsubscription
    @publication = N'Your_DB_Name_PUB',
    @subscriber = N'Subscribing_SQLServerName',
    @destination_db = N'Your_DB_Name',
    @sync_type = N'initialize with backup',
    @backupdevicetype = 'disk',
    @backupdevicename = 'e:\BACKUP\Your_DB_Name090209.bak',
    @subscription_type = N'pull',
    @update_mode = N'read only'
GO
-------END: Script to be run at Publisher 'Publishing_SQLServerName'-------------

Replace Your_DB_Name, Your_DB_Name_PUB, Subscribing_SQLServerName, and the location of the backup with appropriate values.

Script 2: (Run at the Subscriber SQL server.)

-----BEGIN: Script to be run at Subscriber 'Subscribing_SQLServerName'-----------------
use [Your_DB_Name]
exec sp_addpullsubscription
    @publisher = N'Publishing_SQLServerName',
    @publication = N'Your_DB_Name_PUB',
    @publisher_db = N'Your_DB_Name',
    @independent_agent = N'True',
    @subscription_type = N'pull',
    @description = N'',
    @update_mode = N'read only',
    @immediate_sync = 0
GO
-----END: Script to be run at Subscriber 'Subscribing_SQLServerName'-----------------

Same drill as before: swap the placeholders for your actual names.

Script 3: (Still on the Subscriber.) Now we're going to set up the agent which will handle the data pulls for us. (Once again, a really long exec command.)

-----BEGIN: Script to be run at Subscriber 'Subscribing_SQLServerName'-----------------
exec sp_addpullsubscription_agent
    @publisher = N'Publishing_SQLServerName',
    @publisher_db = N'Your_DB_Name',
    @publication = N'Your_DB_Name_PUB',
    @distributor = N'Publishing_SQLServerName',
    @distributor_security_mode = 0,
    @distributor_login = N'dist_login_acct',
    @distributor_password = N'dist_password',
    @enabled_for_syncmgr = N'False',
    @frequency_type = 64,
    @frequency_interval = 0,
    @frequency_relative_interval = 0,
    @frequency_recurrence_factor = 0,
    @frequency_subday = 0,
    @frequency_subday_interval = 0,
    @active_start_time_of_day = 0,
    @active_end_time_of_day = 235959,
    @active_start_date = 20090902,
    @active_end_date = 99991231,
    @alt_snapshot_folder = N'',
    @working_directory = N'',
    @use_ftp = N'False',
    @job_login = N'domain\username',
    @job_password = N'user_password',
    @publication_type = 0
GO
-----END: Script to be run at Subscriber 'Subscribing_SQLServerName'-----------------

If you went through the GUI you will have seen where it prompted you for user accounts. The Windows account (@job_login) is needed to access the Distribution share on the publisher that houses the snapshots (if we were using them) and updates. Fortunately both machines were on the same domain, so that was easy for me. For the distributor connection (@distributor_login) I created a SQL login on both servers with the same username/password and granted that user rights on the publishing server. This account is used by the agents on the subscriber to check the distribution database on the publisher.
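For the SQL login, something along these lines is one way to do it on the publishing server. Treat it as a sketch with assumptions: the login name and password are the placeholders from Script 3, and "granted that user rights" is taken here to mean mapping the login into the publication database and adding it to the publication access list (PAL) with sp_grant_publication_access. Your environment may need more than this, for example a matching user in the distribution database.

```sql
-- Run at the Publisher (also the Distributor in this walkthrough).
-- Login name and password are the placeholders used in Script 3;
-- pick a password that satisfies your server's password policy.
USE [master];
GO
CREATE LOGIN [dist_login_acct] WITH PASSWORD = N'dist_password';
GO

-- Map the login to a user in the publication database.
USE [Your_DB_Name];
GO
CREATE USER [dist_login_acct] FOR LOGIN [dist_login_acct];
GO

-- Add the login to the publication access list (PAL).
EXEC sp_grant_publication_access
    @publication = N'Your_DB_Name_PUB',
    @login       = N'dist_login_acct';
GO
```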

Provided you didn't get any errors when you ran those scripts, you'll want to start monitoring the replication now. Right click on the Subscription and View Synchronization Status.


Now, in SQL Server Management Studio, right-click on Replication and launch the Replication Monitor. Drill down to the publisher (add it if need be) and then drill down to the publication. Right-click on the subscription in the right panel and choose View Details. The window that pops up is really useful for seeing how replication is going.

What's great about that window is that you can watch the number of pending transactions that are waiting to be synched. Initially this will be a very big number until it catches up. At this point the only things left are steps 10 and 11. Turn off the flag on the publisher for initializing from backup, re-enable the distribution cleanup agents, and actually go into the database and the replica to see if data is being transferred properly (taking into account replication intervals, etc).
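If you would rather query the pending count than watch the window, the distribution database has a stored procedure for it. A minimal sketch, using the same placeholder names as the scripts above and assuming the distribution database is called distribution. The table in the second query, dbo.Some_Table, is hypothetical; substitute one of your published tables.

```sql
-- Run at the Distributor, in the distribution database.
USE [distribution];
GO

-- Reports the number of commands still waiting to be delivered
-- and an estimate of how long they will take to apply.
EXEC sp_replmonitorsubscriptionpendingcmds
    @publisher         = N'Publishing_SQLServerName',
    @publisher_db      = N'Your_DB_Name',
    @publication       = N'Your_DB_Name_PUB',
    @subscriber        = N'Subscribing_SQLServerName',
    @subscriber_db     = N'Your_DB_Name',
    @subscription_type = 1;  -- 0 = push, 1 = pull
GO

-- Step 11 spot check: run this on both servers and compare the results.
-- dbo.Some_Table is a hypothetical published table.
SELECT COUNT(*) AS row_count
FROM [Your_DB_Name].dbo.Some_Table;
GO
```

Row counts only tell you that rows arrived, not that their contents match, so it is still worth eyeballing a few recent rows on both sides.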

TIP 3: If you get this error: The distribution agent failed to create temporary files in C:\Program Files\Microsoft SQL Server\100\COM directory. System returned errorcode 5.
Then you need to grant the user that the Distribution Agent is running as Write access to that directory. http://support.microsoft.com/default.aspx/kb/956032

TIP 4: You may start getting errors in your DB related to "Length of LOB data". This occurs because by default replication only supports LOB values up to 65536 bytes. Go into SQL Server Management Studio, right-click on the server and choose Properties. Set the Max Text Replication Size to something higher. Or do what I did and use the max value of 2147483647.
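The same setting can be changed with sp_configure, where it goes by the name 'max text repl size'. A short sketch, assuming you run it on the server that reports the error (typically the Publisher) and that you have permission to change server configuration:

```sql
-- Make sure the option is visible to sp_configure.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO

-- Raise the limit on replicated LOB data to the maximum value used in TIP 4.
EXEC sp_configure 'max text repl size', 2147483647;
RECONFIGURE;
GO
```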

TIP 5: I found a good reference book that just focuses on SQL replication called "Pro SQL Server 2008 Replication" (ISBN13: 9781430218074). It explains in detail the mechanisms behind replication and covers all types of replication for SQL and how to choose the one that's right for you.

10 comments:

kaochiuan said...

Thanks a lot.
It's very useful.

Ben said...

Great guide. Doing a backup/restore vs snapshots (as recommended by MS) made this a lot more friendly for our scenarios (reporting server away from production).

Anonymous said...

Couple of questions

01. Is it possible to initialize the replication on the subscriber without actually restoring the backup?

02. The initialization from backup assumes the entire database to be replicated. If we want to replicate only few individual tables, what additional steps we need to perform?

Thanks,
NSR

Gnawgnu said...

Sure, just follow the wizards on both ends and skip most of my steps above and it works out of the box. Your DB will slog down to near unresponsive for a while and it'll generate a stack of replication files on the snapshot folder structure which will transport over to the subscriber server. This method is only recommended for smaller databases or situations where you have a fast connection. In our case it would have been 14 hours of sync time instead of 1 hr for the backup copy method.
As for the second question, when you set up the publication you get to choose what tables and other types of data that you want replicated. You can tell it to just publish a single table if you want. Hope that helps.

Quickdraw6906 said...

In step 8, is a pull subscription required? I'm in an environment that requires a push subscription. Thanks.

Gnawgnu said...

Probably. Everything I've read on it implies that it'll work either way but I haven't tested my procedure with it.

David said...

How do I add a new article to an existing replication initializing from a backup file ?

Gnawgnu said...

I haven't done that yet but as I understand it you can do it with TSQL. This link is probably it: http://msdn.microsoft.com/en-us/library/ms146887.aspx

Unknown said...

I'm also looking to add an article after initiating replication from backup, and it's not snapshotting, any help would be appreciated.

siva said...

Its very useful.

For add an article after initiating the replication from backup use this link

Link : http://blogs.msdn.com/b/repltalk/archive/2011/03/15/how-to-add-an-article-to-an-existing-transactional-subscription-initialized-through-backup.aspx?Redirected=true