Thursday, December 2, 2010

Disaster Recovery Is More Than Just Technology Part 2: The Alphabet Soup

In my previous blog post, I talked about high availability and disaster recovery (HADR) and how it is more than just the underlying technology that keeps the entire strategy intact. In this blog post, I’ll describe a few acronyms – sometimes called buzzwords – that are commonly used in HADR projects and implementations (I know I use them a lot when addressing questions about HADR.) These acronyms fall under the second P in my PPT for HADR – PROCESS. Every HADR project or implementation should be able to define these terms well before purchasing the hardware, software and technologies it intends to use. Let’s get going.



Recovery Point Objective (RPO). Simply put, RPO answers the question, “How much data can we afford to lose?” Every HADR project should determine the acceptable amount of data loss, which is usually measured in units of time. For example, suppose a highly critical application runs 24x7 and the stakeholders have defined the RPO to be one (1) hour. If the database that stores the data for the application crashes at 5:45AM and you are running regular transaction log backups (for SQL Server) or archived redo log backups (for Oracle) every hour starting at 12:30AM (I’m pretty sure the Oracle guys would jump on me for using this as an example), the 15-minute data loss would be acceptable because your RPO is one (1) hour. The 15-minute figure comes from the hourly backups starting at 12:30AM: the last backup ran at 5:30AM – 15 minutes before the database crashed. Now, the hourly backup schedule should not have been decided based on guesswork, although I see a lot of DBAs simply use it as a standard. If yours was, it’s time to define your RPO, determine the amount of acceptable data loss and review your backup strategy.
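To make the arithmetic concrete, here is a minimal sketch of my own (not part of any backup tool) that checks whether your data-loss exposure stays within the RPO; the timestamps are the assumed values from the example above:

# Hypothetical values from the example above
$rpoMinutes    = 60                                  # agreed RPO: one hour
$lastLogBackup = Get-Date "2010-12-02 05:30:00"      # last successful log backup
$failureTime   = Get-Date "2010-12-02 05:45:00"      # time the database crashed

# Exposure is the gap between the failure and the last restorable backup
$exposureMinutes = ($failureTime - $lastLogBackup).TotalMinutes

if ($exposureMinutes -le $rpoMinutes) {
    Write-Host "Within RPO: potential data loss is $exposureMinutes minute(s)"
}
else {
    Write-Host "RPO violated: potential data loss is $exposureMinutes minute(s)"
}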



Recovery Time Objective (RTO). RTO answers the question, “When is my application coming back online after a disruption?” Like RPO, RTO is measured in units of time. Going back to the earlier example, if the stakeholders have defined the RTO to be two (2) hours, the database, the application and whatever else is necessary to use the application should be back online by 7:45AM.

 
Service Level Agreement (SLA). As Wikipedia defines it, an SLA is the part of a service contract where the level of service is formally defined. In my experience, this is commonly agreed upon between a customer and a service provider. You might be thinking that if you have an internal IT management team, chances are you won’t have to deal with SLAs. Bear in mind, however, that for an application to be online it relies on hardware, which needs to be covered by a vendor warranty with associated service levels should the hardware need to be serviced or replaced during a disaster, and on a medium over which it can be accessed – either the Internet or your local network – which also needs to be covered by vendor or in-house service level agreements. This is a very important item in your HADR project because anything external to your team will definitely affect your RPO and RTO. For example, if your highly critical application has been restored within two hours and the data loss was less than an hour, you may have met your RPO and RTO; but if the Internet connection that allows your users to access the application is still not restored after two hours, forget about achieving them. Technically, from the point of view of the users, your application is still not accessible. Which is why, when you’re dealing with vendors or service providers, you need to make sure that the agreed-upon SLAs meet your RPO and RTO.



I’ve only scratched the surface of the different components that make up the PROCESS part of an HADR project. What I find surprising is that whenever I ask customers requesting an HADR solution for their RPO/RTO/SLA values, they immediately respond with “I want zero downtime and zero data loss for my application.” They simply think that their application deserves an RPO and RTO value of zero (0). What they don’t realize quite yet is that as RPO and RTO approach zero (to borrow jargon from calculus: as the limits approach zero), the cost increases exponentially. And when we start talking about costs, customers start re-evaluating their HADR strategies the way they are supposed to. This is where I really like the discussion to go, because they will look at each application and the corresponding database differently and categorize them accordingly – from not-so-critical to highly critical. And they start crunching numbers to determine how critical an application really is and whether it merits a near-zero RPO and RTO. Take, for instance, an e-commerce site that generates an average of 50 transactions per minute (a relatively low volume these days) at US$10 per transaction. That is equivalent to US$500 per minute, so losing an hour’s worth of business due to downtime or data loss would mean US$30,000. An HADR solution should be justifiable enough to protect US$30,000 worth of transactions per hour. Your strategy should also consider whether the transactions only come in between 5:00AM and 9:00PM, as you wouldn’t want to invest a lot in a solution that doesn’t protect anything outside those hours.
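As a quick back-of-the-envelope check, here is a small sketch of my own for putting a dollar value on downtime; the figures are the assumed values from the e-commerce example:

# Hypothetical figures from the e-commerce example
$transactionsPerMinute = 50
$revenuePerTransaction = 10      # US$
$outageMinutes         = 60      # one hour of downtime or data loss

# Cost of the outage = transactions/minute x revenue/transaction x minutes lost
$outageCost = $transactionsPerMinute * $revenuePerTransaction * $outageMinutes
Write-Host "Estimated cost of the outage (USD): $outageCost"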

 
So, when you plan your next HADR project, think about these concepts and define your RPO/RTO/SLA. It will definitely keep your perspective right and help you plan accordingly. In my next blog post, I will talk about high availability implementations, more examples and how they should address your RPO/RTO/SLA. Stay posted. Plus, if you’re in the Washington DC area this weekend, catch me presenting on SQL Server Disaster Recovery Techniques at SQL Saturday #61.

Friday, November 26, 2010

Disaster Recovery is More Than Just Technology Part 1

While I was at the PASS Summit 2010, I spent a fair amount of time at the Ask-the-Experts table on high availability, disaster recovery and virtualization. Conference attendees with different high availability and disaster recovery requirements came to these tables and asked questions.


I spent a fair amount of time doing high availability and disaster recovery (HADR) in my previous life as a data center engineer focusing on the Microsoft platform. My previous organization sold high availability and disaster recovery solutions to customers like crazy, highlighting the fact that these solutions are more than just the technology aspect. Every time I talk about HADR in my presentations, I focus on the three main ingredients of a successful implementation - people, process and technology (PPT). Note that technology is at the end of the list because the people and process components should come first.


What I found out at the PASS Summit gave me insight into how people approach HADR (and I thought I only saw this on the newsgroups and forums when answering questions there.) Most SQL Server DBAs (and maybe even a lot of IT professionals) want a technical answer to their HADR problem. They want to know whether failover clustering, database mirroring, replication or log shipping is the best solution for their requirement. What's funny is that when I ask them what their RPO/RTO/SLAs are, they scratch their heads and ask what those acronyms mean. And when I start explaining these acronyms to them, they still just want to hear what the best solution is for their requirement.


As I prepare my presentation on Disaster Recovery Techniques for SQL Saturday #61 in Washington DC, I'll be writing a series of articles about disaster recovery, what RPO/RTO/SLAs are and how they fit into the whole disaster recovery strategy. Before I dive into the "technology" part of the PPT ingredients for a successful HADR implementation, I will talk about the people and the process parts first. Why? Because these two will drive the technology part of the whole strategy. And if you're in the Washington DC area, feel free to drop by the SQL Saturday event.

Monday, November 8, 2010

PASS Summit MVP Birds of a Feather Lunch 2010 - T- 2 days

This will be the last blog post before the PASS Summit MVP Birds of a Feather Lunch 2010. I know most, if not all, of the attendees are already in Seattle, WA enjoying the dinner meetups and the 9 C/48 F weather. As I've mentioned in my previous blog post, if you're attending the Birds of a Feather Lunch on Tuesday and decide to drop by my table, make sure you bring the printout of the graphic that you've managed to download. Within the graphic is an instruction that you need to follow. I have prepared five (5) special gifts that I will give away to the first five attendees who hand me their version of the graphic printout. So, make sure you're at the lunch table early to get that special gift. I'm pretty sure we'll have a great time.

Wednesday, November 3, 2010

PASS Summit MVP Birds of a Feather Lunch 2010 - Follow the White Rabbit

Remember Neo? The guy who disturbed the order of the matrix? He had dreams of being given instructions to follow the white rabbit to figure out the answer to the question, "What is the Matrix?" Later in the film, he finds a woman with a tattoo of a white rabbit on her shoulder. And, then, he does just as he was instructed.

In my previous blog post, I talked about doing an MVP Birds of a Feather Lunch session at the upcoming PASS Community Summit in Seattle, WA and how you, if you're attending the conference, should watch out for blog posts that contain secret messages so you can get a special gift when you drop by my session. In this particular blog post, you will find a link to a graphic that you can download and print out. I'll tell you what you need to do with the graphic in my next blog post. And while finding the secret message isn't about disturbing the order of The Matrix, it will certainly help you distinguish yourself from the rest.

Monday, November 1, 2010

PASS Summit MVP Birds of a Feather Lunch 2010

After a few months of inactivity, I'm back to blogging - this time with a purpose. By now, there have been a lot of blog posts about the upcoming PASS Community Summit 2010 in Seattle, WA on the 9th-11th of November 2010. This, in particular, was announced earlier, but I needed to confirm my attendance before I could blog about it. So, now that my plane tickets and hotel reservations are set, I guess I can say that I am indeed going to PASS this year. And while I am not doing a regular session this time around, I will be at the MVP Birds of a Feather Lunch on the 10th. Keeping up with my commitment to do professional development sessions at PASS (although I will be at the Ask-the-Experts booth on High Availability and Disaster Recovery as well), my topic for the lunch session will be about why you need to distinguish yourself from the rest.

If you're attending the PASS Summit this year, drop by my area and I'll have something for you. For the next couple of days, I will be posting some secret messages on Twitter that give instructions on what you need to bring with you when you drop by. And I'm definitely sure that you'll enjoy having lunch with me because it's all about YOU.

Below is the list of topics for the Birds of a Feather Lunch at the PASS Summit 2010. I got it off Mike Walsh's post on SQLServerPedia, so be sure to check that site as well in case there are any changes, although I'm pretty sure there won't be any at this point.

Allan Hirt SQL Server Manageability Tips for the DBA
Ami Levin Performance Management Tools
Andy Leonard SSIS Frameworks (or design patterns or getting started)
Andy Warren Statistics
Arnie Rowland Community – TechNet Wiki
Brad Schulz Interpreting Query Plans
Bruce Loehle-Conger SSRS Lessons Learned
Chris Webb Performance Tuning SSAS
Christian Bolton Hyper-V with SQL Server
Christopher Shaw Professional Development for the Data Professional
Darren Gosbell SSAS Triple A – Administration, Automation and APIs
Davide Mauri SQL Server & NoSQL Alternatives
Dean Vitner Write-Ahead Logging & The Transaction Log
Denny Cherry SQL Service Broker
Eduardo Castro HA In Virtualized Environments
Edwin Sarmiento Why you need to distinguish yourself from the rest
Erland Sommarskog Practical T-SQL Programming
Geoff Hiten Advanced SQL Clustering
Gianluca Hotz Oracle to SQL Server DBA
Glenn Berry Database Mirroring
Grant Fritchey T-SQL Tuning & Optimization
Greg Galloway What’s the Coolest Business Problem You’ve Solved with a Cube?
Jason Strate Querying The Procedure Cache
Jeff Moden Black Arts T-SQL (Like the Tally Table)
Jessica Moss Fixing Bad Data Using SSIS
Jesus Gil Geo-Spatial Data on SSRS (en espanol)
Jonathan Kehayias VMWare With SQL Server
Justin Langford SQL Server Consolidation
Kevin Boles Performance Tuning
Kevin Kline Help for the involuntary DBA
Louis Davidson Database Design
Michael Coles SQL Encryption (pending ATE schedule question)
Michael Steineke High Performance DBs on Shared Storage
Patrick LeBlanc CDC
Paul Turley BI Tool & Visualization Choices
Peter Ward SQL Server Migration and Upgrade Planning
Plamen Ratchev Implementing Hierarchies in SQL
Rafael Salas PowerPivot: Where/How Are You Using It?
Rob Farley Tables don’t exist, so join me at my index
Scott Klein Data Development in SQL Azure
Sean McCown PowerShell For the SQL Server DBA
Ted Krueger Cheapest DR/HA Options on SQL Server
Tim Chapman SQL Security Best Practices
Tim Mitchell Care and Feeding of your SSIS Infrastructure
Tim Ford DMVs
Todd McDermid Data Warehouse ETL with SSIS
Tomislav Piasevoli The Past, Present and Future of Microsoft BI

Sunday, August 1, 2010

Communicating Shrinking Databases to the Accidental DBAs

I'm a self-proclaimed communicator (although most people say that I was born a teacher), which is why I always think of ways to simplify concepts, even when they seem complex.

I was on IM with a former student, asking him how he was doing. During the course of the conversation, I asked if they have a full-time DBA. Unfortunately, they don't, and they rely on their ERP system's tuning tool to do the performance tuning for their databases (of course, I recommended hiring my services to do it for them). One question that popped up during the conversation was about shrinking the database. Most of the time, I'm tempted to refer people to SQL Server MVP Tibor Karaszi's blog post about why you shouldn't be shrinking your databases, which I have done plenty of times when answering questions in the technical forums. But this time, I approached it differently because I was this guy's former trainer and he definitely knows that I will try to keep things as simple as possible.

I started with the concepts of internal and external fragmentation, explaining what happens to the database files when you shrink them and then, as auto-growth kicks in because of additional data being loaded, grow back to a large file size. Imagine a room full of boxes, where each box contains a bunch of items - shoes, books, other personal items, etc. A box has a fixed volume and can only hold a certain volume of items, while occupying a certain amount of space in the room. If you need to add more items to a box that is already full of books, what would you do? You would have to get a bigger box to replace the smaller one before you can hold the additional books. So, the process of adding more books to a full box requires extra effort before you can do the only task you actually set out to do - storing books in a box. I bet it takes more time and effort to store books in a full box than in one with room to spare.

Think of this box as your database. While we cannot auto-grow boxes, thanks to the underlying principles of nature, we can do so with databases and any files stored in the file system. The process of storing books in a full box is similar to adding more data to a full database: the database engine has to grow the database file before it can add the data. But isn't the goal just to add records? This means we are making the database engine do unnecessary extra work. And you wonder why database professionals complain about transactions getting blocked when the database is already full. This is why the best practice is to allocate more than enough space to your databases, so that we minimize those auto-growth events during regular business hours. This is also where Tibor's post explains internal fragmentation and how shrinking your databases fragments your indexes, further hurting your database performance.

Now, what about external fragmentation? Let's get back to the box analogy. If you were to replace the small box with a bigger one, wouldn't you have to move the other boxes around just to accommodate the new, larger box? Imagine that the smaller box is in the third column, fourth stack of the fifth row, and that you have organized the boxes in the room in such a way that you have maximized the space (by the way, I'm a bit like that when arranging my personal stuff). You would then have to take the old, smaller box out of its original location, find a place where you can put the new, larger box and, finally, store the additional books. If the new, larger box doesn't fit in the same place as your old, smaller box, you end up with unused space in the room and, therefore, are no longer maximizing the room's space. This, from a file system and disk perspective, is called fragmentation and, from the database's perspective, external fragmentation. And because the boxes (or files, for disk subsystems) are no longer arranged in a contiguous, close-together fashion, you (or the disk) have to work harder just to get to the different boxes stored inside the room.
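If you want to see how much headroom your database files actually have before deciding how to pre-size them, here is a minimal sketch of my own (not from the original discussion) using SMO from Windows PowerShell; the instance name is a placeholder you would replace with your own:

# Load SMO (assumes the SQL Server management assemblies are installed)
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO") | Out-Null

$server = New-Object Microsoft.SqlServer.Management.Smo.Server "YourSQLServerInstanceHere"

foreach ($db in $server.Databases)
{
    foreach ($fileGroup in $db.FileGroups)
    {
        foreach ($file in $fileGroup.Files)
        {
            # Size and UsedSpace are reported in KB; Growth/GrowthType show the auto-growth settings
            $freeMB = [Math]::Round(($file.Size - $file.UsedSpace) / 1024, 2)
            Write-Host $db.Name $file.Name "Free(MB):" $freeMB "Growth:" $file.Growth $file.GrowthType
        }
    }
}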

So, the next time somebody asks why database shrinking is definitely not a good idea, you can tell them this story.

Continuing the SharePoint Foundation 2010 Installation

I've been out of the blogosphere for quite a while due to travel and work, but part 2 of the series on installing SharePoint Foundation 2010 has already been published. Check out the last part of the article at MSSharePointTips.com.

Wednesday, May 5, 2010

Setup a SharePoint Foundation 2010 test environment

SharePoint 2010 has already been released. So, whether you want to try out SharePoint Foundation or the full-blown SharePoint Server 2010, you can now download the RTM versions - be it via MSDN, TechNet or the trial version.

Check out this article on how to install SharePoint Foundation 2010. I highlighted the purpose of each component and why we need to install them. Installation is pretty straightforward for a single-server deployment, so understanding what's under the hood can prove to be insightful.

Saturday, May 1, 2010

Your poor man's SQL Server log shipping-PowerShell version

In 2008, I was privileged to be part of a project to write a chapter for a book that was for a good cause. SQL Server MVP Deep Dives became an avenue for a lot of the SQL Server MVPs to share their expertise for the benefit of not just the technical community but also of the beneficiary.

I wrote a chapter for the book based on this blog post from some three years ago, and one of the recommendations I made was to convert the VBScript scripts to Windows PowerShell. So, here it is.

I've converted the VBScript that checks the folders containing my latest LOG backups and restores them in NORECOVERY mode to Windows PowerShell. What is fascinating is the fact that if the process I've outlined in the blog post is followed through, adding another database to the log shipping configuration is as easy as creating regular LOG backups via T-SQL scripts or database maintenance plans - there's no need to add an entry for every database that will be configured. How cool is that?

So, here's the Windows PowerShell version of the custom restore LOG backup command

$logBackupFolder = "your LOG backup location here or a parameter variable"

##Variable for time duration - the amount of time you need to generate and restore LOG backups
$1HourOld = [DateTime]::Now.AddHours(-1)

##Retrieve the per-database folders in the specified directory
foreach ($databaseName in (Get-ChildItem $logBackupFolder | Where-Object { $_.PsIsContainer }))
{
    ##Retrieve the LOG backups created within the last hour, oldest first, so they are restored in sequence
    foreach ($logName in (Get-ChildItem $logBackupFolder\$databaseName | Where-Object { ($_.CreationTime -ge $1HourOld) -and ($_.Extension -eq ".trn") } | Sort-Object CreationTime))
    {
        $logBackupPath = [System.String]::Concat("'$logBackupFolder\", $databaseName, "\", $logName.Name, "'")
        "RESTORE LOG $databaseName FROM DISK=$logBackupPath WITH NORECOVERY" | Out-File D:\$databaseName.sql

        Invoke-Expression "osql.exe -SYourSQLServerInstanceHere -E -iD:\$databaseName.sql"
    }
}


You might be wondering why I am using osql.exe instead of sqlcmd.exe. Well, osql.exe is still there even if you have SQL Server 2008, which means this script will work even for SQL Server 2000 instances as long as you install Windows PowerShell on the machine running SQL Server.

And all of that in a mere 17-line script. I could write it in even fewer lines, but I'm a big fan of script readability. Now, that's a good enough reason to dig deeper into what Windows PowerShell has to offer.

Thursday, April 29, 2010

Retrieve Default SQL Server Backup Folder using PowerShell

As I've been translating a lot of my T-SQL scripts to Windows PowerShell with SMO, here's another article I wrote on how to check for the default SQL Server backup folder.

Thursday, March 18, 2010

Understanding ASP.NET session state management from a DBA's perspective

While I no longer do .NET development, I find it hard not to use my previous background as a reference when talking about databases. One specific discussion that came up today was how ASP.NET manages session state.

In order to configure an ASP.NET application for session state management, you need to configure the <sessionState> section in the web.config file of the web application (see the example after the list below). There are five modes available for use:
  • Off - session state is turned off for the web application
  • InProc - sessions are kept in memory on the web server. This is the default behavior. You can configure the cookieless attribute if you want the session ID value in the URL.
  • StateServer - session state is stored in a separate process called the ASP.NET state service
  • SQLServer - well, you guessed it, session state is stored in a SQL Server database
  • Custom - you can write your own custom mechanism or use any other readily available session storage mechanism
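For the SQLServer mode, the web.config entry would look something like the snippet below - a minimal sketch of my own; the server name and timeout are placeholders you would replace with your own values:

<system.web>
  <!-- Store session state in SQL Server; the ASPState database must already exist -->
  <sessionState mode="SQLServer"
                sqlConnectionString="Data Source=YourSQLServerInstance;Integrated Security=SSPI"
                timeout="20" />
</system.web>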

What we as DBAs are concerned about is the use of the SQLServer option as the session state storage mechanism for ASP.NET. As a DBA or a developer, you can create the session state database by using the aspnet_regsql.exe utility that comes with the .NET Framework. Running this utility with no parameters will, by default, create the membership provider database used for ASP.NET security outside of Windows authentication. But that's not what we want. We want to create the database that stores the ASP.NET session state. To do that, we need to run the utility from the command line, passing the correct parameters.

aspnet_regsql.exe -S YourSQLServerInstance -E -ssadd

This will create a database named ASPState that stores the stored procedures and functions needed for session state management. But there's a catch: by default, the stored procedures and functions are created in the ASPState database while the tables that store the actual session data are created in the tempdb database. Now, as SQL Server DBAs, we all know that anything stored in the tempdb database will eventually be flushed out when the SQL Server service restarts. If you have active sessions on your ASP.NET web application, you will lose them any time your SQL Server service is restarted. What you need to do is explicitly pass another parameter to the aspnet_regsql.exe utility to store everything in the ASPState database.

aspnet_regsql.exe -S YourSQLServerInstance -E -ssadd -sstype p

The -sstype parameter explicitly tells the utility to create the tables in the ASPState database instead of the tempdb database (the p value stands for persisted). Because the session data will now survive service restarts, expired sessions are no longer flushed out automatically, so you need to make use of the ASPState_Job_DeleteExpiredSessions job that is specifically created to do this. The default behavior of this job is to run every minute, which is probably not a good idea, especially as it logs all of the job execution information in the msdb database.

So, the next time you need to work with ASP.NET applications that store session state information in a SQL Server database, check this out with your web developers.

Wednesday, March 17, 2010

Slipstream Office SharePoint Server 2007 Service Packs for Windows Server 2008 R2

This article took quite a while in the making. I've been seeing a lot of IT Pros deploy SharePoint in their infrastructure particularly on a Windows Server 2008 platform. However, SharePoint with Service Pack 1 is not supported on a Windows Server 2008 R2 platform.

This article outlines the steps for slipstreaming service packs into SharePoint for use with Windows Server 2008 R2. I'll work on a series of articles on how to deploy an NLB solution for SharePoint, as well as how to implement Kerberos delegation for a SharePoint farm.

Tuesday, February 23, 2010

Connecting to SQL Server via Windows PowerShell with SQL Server authentication

Most of the articles I've written about SQL Server with Windows PowerShell have used Windows authentication. And while it is highly recommended to use Windows authentication to connect to SQL Server, the reality is that the IT infrastructures we work with don't always run exclusively on Microsoft Windows.

Here's an article I wrote on how to use Windows PowerShell to connect to SQL Server via mixed mode authentication.
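As a quick illustration of the idea, here is a minimal sketch of my own (not taken from the article) using SMO; the instance name, login and password are placeholders, and in practice you would prompt for the password rather than hard-code it:

# Load SMO (assumes the SQL Server management assemblies are installed)
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO") | Out-Null

$server = New-Object Microsoft.SqlServer.Management.Smo.Server "YourSQLServerInstanceHere"

# Switch from Windows authentication to SQL Server (mixed mode) authentication
$server.ConnectionContext.LoginSecure = $false
$server.ConnectionContext.Login = "YourSQLLoginHere"
$server.ConnectionContext.Password = "YourPasswordHere"   # better: prompt with Read-Host -AsSecureString

# Verify the connection by reading a server property
Write-Host "Connected to" $server.Name "version" $server.Information.VersionString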

Wednesday, February 17, 2010

Query Hyper-V Virtual machines using Windows PowerShell

Being the lazy administrator that I am, I try to minimize the number of mouse clicks I need to make to retrieve information about something on the Windows platform. As I have been using Microsoft Hyper-V on a bunch of my test machines, I always check whether a VM is up and running before I power down my host machine (imagine the amount of electricity consumed just by keeping your machine up and running even without using it). This is especially the case with my Windows XP VMs: I noticed that the profiles get corrupted if I shut down the host machine without properly shutting down the VM. So, I always make sure that the VMs are not running before powering down the host machine.

I wrote a PowerShell command to query the current state of the VMs running on Hyper-V


Get-WMIObject -class "MSVM_ComputerSystem" -namespace "root\virtualization" -computername "."

This will actually display a bunch of information about the VMs running on Hyper-V, but what we're really concerned with is the name of the VM and its current running state. These two properties correspond to the ElementName and EnabledState attributes of the MSVM_ComputerSystem class. All we need to do with the command above is pipe the results to the Select-Object cmdlet, specifying only these two properties, as follows

Get-WMIObject -class "MSVM_ComputerSystem" -namespace "root\virtualization" -computername "." | Select-Object ElementName, EnabledState

While the EnabledState property will give you a bunch of numbers, I'm only concerned with the value 2, which means that the VM is running. But, then, you might not remember what the value 2 means, so we might as well write a script that translates the EnabledState property into a friendly description. I've used the gwmi alias to call the Get-WMIObject cmdlet.

$VMs = gwmi -class "MSVM_ComputerSystem" -namespace "root\virtualization" -computername "."

foreach ($VM in $VMs)
{
    ##Translate the EnabledState value into a friendly description
    switch ($VM.EnabledState)
    {
        2     { $state = "Running" }
        3     { $state = "Stopped" }
        32768 { $state = "Paused" }
        32769 { $state = "Suspended" }
        32770 { $state = "Starting" }
        32771 { $state = "Taking Snapshot" }
        32773 { $state = "Saving" }
        32774 { $state = "Stopping" }
    }
    Write-Host $VM.ElementName "," $state
}

On a side note, make sure you are running PowerShell as Administrator when working with this script, as otherwise you will only see the VMs that your currently logged-in profile has permission to access. Running as Administrator will show you all of the VMs configured on your Hyper-V server.

Friday, January 15, 2010

"Cannot Generate SSPI Context" errors

I get to deal with this type of error on a regular basis and, most of the time, end up recommending updating the SQL Server service principal name (SPN) using the setspn utility, or simply rebooting the server. I was reading through the SQL Server CSS blog today and found another reason for this error message - changed SQL Server service account passwords. While I do not recommend changing service account passwords during production hours, there may be cases where the account's password needs to be changed, as well as the corresponding credentials on the services, while waiting for approval of a downtime window to restart the service. The lessons learned from this blog post are that (1) you should never change service account passwords during production hours and (2) you should always restart the SQL Server service immediately after changing the service account password for Kerberos to function properly. No wonder a reboot usually fixes this issue, as it also restarts the SQL Server service. I will have to reproduce this to validate it.
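For reference, checking and registering the SPN usually looks something like the commands below - a sketch with placeholder names (domain, host and service account) that you would replace with your own, assuming the instance listens on the default port 1433:

# List the SPNs currently registered for the SQL Server service account
setspn -L CONTOSO\sqlservice

# Register the SPN for the SQL Server instance
setspn -A MSSQLSvc/sqlserver01.contoso.com:1433 CONTOSO\sqlservice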