Thursday, March 8, 2012

Disaster Recovery

Hi guys,
I have a very mission-critical SQL Server 2000 database with an active/passive
failover cluster and log shipping at a 2-minute interval.
What additional disaster recovery techniques should I implement to reach
99.99999% availability? I cannot afford any data loss.
Please advise.
Regards,
Biju

Hi,
Most companies will not get better than 99.9%, maybe 99.99%; 99.999% is
possible, but only with a very big budget.
For production, you need two sets of everything, with one set at a distant
location. Everything must be vendor-certified to work together, so Unisys,
IBM, HP and Dell are really your only choices. Datacenter editions are
usually the only ones certifiable for something like this.
Then have exactly the same setup for your test environment, and the same
again for a development environment. Power, air conditioning, access
control: at least three of each, at each site.
Document everything, from setup to operations, down to the finest detail.
Have runbooks, and regularly test that they are still valid.
Have your architecture reviewed and certified by Microsoft and the hardware
vendor.
Have your application reviewed and certified by Microsoft.
Have the most restrictive change control in place, so restrictive that you
can do almost nothing.
http://www.ftponline.com/wss/2004_11/magazine/features/nruest/default.aspx
http://www.microsoft.com/sql/techinfo/administration/2000/availability.asp
Super high availability is not a piece of equipment that you buy; it is a
whole process that everyone has to work towards.
--
Mike Epprecht, Microsoft SQL Server MVP
Zurich, Switzerland
MVP Program: http://www.microsoft.com/mvp
Blog: http://www.msmvps.com/epprecht/

No such thing as guaranteed zero data loss. Even with transaction log backups
every 2 minutes there is still a 2+ minute window (the backup interval plus
the time to transmit the backup file to the secondary and restore it) during
which, if you lost your primary database, you would lose some data.
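To make that window concrete: a 2-minute log shipping schedule boils down to
a SQL Agent job running a transaction log backup roughly like the sketch
below, with matching copy and restore jobs on the secondary. This is only an
illustration; the database name CriticalDB and the share \\secondary\logship
are made-up placeholders, not details from the original post.
-- Hypothetical backup step behind a 2-minute log shipping schedule
-- (CriticalDB and \\secondary\logship are placeholder names).
DECLARE @file nvarchar(260);
SET @file = N'\\secondary\logship\CriticalDB_'
          + CONVERT(nvarchar(8), GETDATE(), 112) + N'_'
          + REPLACE(CONVERT(nvarchar(8), GETDATE(), 108), ':', '') + N'.trn';
BACKUP LOG CriticalDB TO DISK = @file WITH INIT;
-- On the secondary, a matching job copies the file and restores it with
-- STANDBY (or NORECOVERY) so later log backups can still be applied, e.g.:
-- RESTORE LOG CriticalDB FROM DISK = @file
--     WITH STANDBY = N'D:\standby\CriticalDB_undo.dat';
Whatever the exact job looks like, anything written after the last log backup
that reached and was restored on the secondary is what you stand to lose.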
Basically, the cost of the DR solution rises steeply as the acceptable data
loss window shrinks. With SAN-to-SAN mirroring technology, which is very
expensive in terms of disk space requirements, bandwidth and dollars, you
could probably get the data loss window down to a second or two. When we
evaluated EMC's SAN-to-SAN snap/clone/mirroring on their Clariion SAN range
about 2 years ago, the technology wasn't quite ready. I expect it has
probably improved a bit since then, but I'm still skeptical about SAN black
magic (especially since database consistency can be a bit touchy when you
don't go through the approved RDBMS server processes).
The main problem we had with the technology a couple of years ago was that it
really needed synchronous communications between the two SANs. Since we were
looking at SANs in two geographically separated data centres, the latency
between the SANs over a typical IP network (on UTP, for instance) would mean
every write on our primary SAN was delayed 10-30 ms while the data committed
on the secondary SAN too (that's way too much for every SAN write). The only
solution was fibre between the two data centres, and over about 7 km that was
far too expensive.
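As a rough, back-of-the-envelope illustration of why 10-30 ms per write is
too much (my own assumed numbers, not measurements from that evaluation): if
log flushes have to wait for the remote SAN one at a time, the round trip
alone caps them at about 1000 / latency per second.
-- Back-of-the-envelope only: assumes a 10-30 ms synchronous round trip and
-- strictly serialized log flushes.
SELECT 1000 / 10 AS FlushesPerSecondAt10ms,  -- 100 per second
       1000 / 30 AS FlushesPerSecondAt30ms;  --  33 per second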
DR is all a balance between acceptable data loss, time to recovery and the
amount of money you're willing to spend on the solution. It sounds like
you're already protecting your data pretty well. Five 9s is theoretically
about 5 minutes of downtime per year. Half a dozen cluster failovers over the
course of the year to apply Windows hotfixes to your cluster nodes will
pretty much blow your five-9s margin. It's a good target to aim for, but one
only the richest companies can attain.
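For concreteness, the downtime budgets work out roughly as below (my own
arithmetic, not part of the original reply); note that the 99.99999% asked
about above is seven 9s, which leaves only a few seconds of downtime per year.
-- Allowed downtime = (1 - availability) x minutes (or seconds) in a year.
SELECT (1 - 0.99999)   * 365.25 * 24 * 60      AS FiveNinesMinutesPerYear,  -- ~5.26 minutes
       (1 - 0.9999999) * 365.25 * 24 * 60 * 60 AS SevenNinesSecondsPerYear; -- ~3.16 seconds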
--
mike hodgson | database administrator | mallesons stephen jaques
T +61 (2) 9296 3668 | F +61 (2) 9296 3885 | M +61 (408) 675 907
E mailto:mike.hodgson@.mallesons.nospam.com | W http://www.mallesons.com

Is it a good idea to replace hard disks in RAID 10 to avoid any possible data
corruption?

RAID 1+0 is a good fit for SQL Server data disks, but it will not protect
against all data corruption issues. It will minimize the chance of data loss
due to disk hardware problems, but it cannot protect against a controller
failure or an operating system problem. As Mike said, availability is a
process, not a piece of equipment.
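Since RAID cannot catch everything, a cheap complementary habit (my
suggestion, not something Geoff spells out here) is to detect corruption
early with scheduled consistency checks and to confirm that backups are at
least readable; a minimal sketch with a placeholder database name:
-- Scheduled consistency check; CriticalDB is a placeholder name.
DBCC CHECKDB ('CriticalDB') WITH NO_INFOMSGS;
-- And periodically confirm backup files are readable (better still, restore
-- them on a test server), e.g.:
-- RESTORE VERIFYONLY FROM DISK = N'\\secondary\logship\CriticalDB_full.bak';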
Geoff N. Hiten
Microsoft SQL Server MVP
"bijupg@.hotmail.com" <bijupghotmailcom@.discussions.microsoft.com> wrote in
message news:5F2A5F0C-D565-43D6-AB19-64467990E0D9@.microsoft.com...
> is it agood idea to replace harddisks in raid 10 to avoid any possible
> data
> corruption?
>
> "Mike Epprecht (SQL MVP)" wrote:
>> Hi
>> Most compnaies will not get better than 99.9, maybe, 99.99, but 99.999 is
>> possible with a big big budget.
>> For Production, you need 2 sets of everthing with one set at a distant
>> location. Everthing must be vendor certified to work together, so Unisys,
>> IBM, HP and Dell and really your only choices. Data Center editions are
>> usually only certifiable for something like this.
>> Then have exacly the same setup for your Test enviroment, and the same
>> again
>> for a development environment. Power, aircon, access control, at least 3
>> of
>> each, at each site.
>> Document everthing, from setup to operations to the finest detail. Have
>> run
>> books, and regularly test the validity of them.
>> Have your architecture reviewed and certified by Microsoft and the
>> Hardware
>> Vendor.
>> Have your application reviewed and certified by Microsoft
>> Have the most restrictive change control in place that, so that you can
>> nearly do nothing.
>> http://www.ftponline.com/wss/2004_11/magazine/features/nruest/default.aspx
>> http://www.microsoft.com/sql/techinfo/administration/2000/availability.asp
>> Super high availability is not a piece of equipment that you buy, it a
>> whole
>> process that everyone has to work towards.
>> --
>> Mike Epprecht, Microsoft SQL Server MVP
>> Zurich, Switzerland
>> MVP Program: http://www.microsoft.com/mvp
>> Blog: http://www.msmvps.com/epprecht/
>>
>> "bijupg@.hotmail.com" wrote:
>> > Hi GUYS,
>> > I am having a very mission critical sql server 2000 database with
>> > active
>> > passive failover cluster an d logshipping with interval of 2 minutes.
>> > what additional disaster recovery techniques i should implement to have
>> > 99.99999 availability and
>> > i can not afford any data loss.
>> > Pls advice.
>> > Rgds
>> > Biju
