Sunday, March 25, 2012
Disk Failure on Raid 5
Can a disk failure on Raid 5 cause torn pages? It actually did cause
torn pages on several databases for us, so I am wondering if we have
some kind of configuration problem.
Thanks
Torn page detection basically checks a 2-bit pattern that SQL Server saves for
each 512-byte sector of a page when the page is written to disk. If you have a
hardware issue as the page is written, there is always the potential for part
of the data to be written and part not. I think any time you have a disk
failure you run the risk of a torn page.
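To make the idea concrete, here is a minimal Python sketch of sector-level torn page detection. It is a toy model and not SQL Server's on-disk format: only the 8 KB page, the 512-byte sector and the 2-bit per-sector pattern come from how the feature is described. The function names `stamp_page`, `partial_write` and `is_torn`, and the choice to overwrite the low two bits of each sector's last byte, are assumptions made for illustration.

```python
PAGE_SIZE = 8192
SECTOR_SIZE = 512
SECTORS = PAGE_SIZE // SECTOR_SIZE  # 16 sectors per page


def stamp_page(page: bytes, pattern: int) -> bytes:
    """Return a copy of the page with a 2-bit pattern in the last byte of each sector.

    Toy simplification: the two data bits that get overwritten are simply lost here.
    """
    assert len(page) == PAGE_SIZE and 0 <= pattern <= 3
    out = bytearray(page)
    for s in range(SECTORS):
        last = (s + 1) * SECTOR_SIZE - 1
        out[last] = (out[last] & 0b11111100) | pattern
    return bytes(out)


def partial_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a write where only the first sectors_written sectors reach the disk."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


def is_torn(page: bytes) -> bool:
    """A page counts as torn if its sectors do not all carry the same pattern."""
    patterns = {page[(s + 1) * SECTOR_SIZE - 1] & 0b11 for s in range(SECTORS)}
    return len(patterns) > 1


old_page = stamp_page(bytes(PAGE_SIZE), pattern=1)
new_page = stamp_page(b"\xff" * PAGE_SIZE, pattern=2)

complete = partial_write(old_page, new_page, SECTORS)  # every sector written
interrupted = partial_write(old_page, new_page, 5)     # write stopped after 5 sectors

print(is_torn(complete))     # False
print(is_torn(interrupted))  # True
```

The point of the sketch is that detection only notices that the sectors of a page come from different writes. It cannot repair the page, which is why a restore from backup is the usual fix.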
--
Andrew J. Kelly SQL MVP
"Dave" <daveg.01@.gmail.com> wrote in message
news:1136847162.322252.52170@.o13g2000cwo.googlegroups.com...
> Can a disk failure on Raid 5 cause torn pages? It actually did cause
> torn pages on several databases for us, so I am wondering if we have
> some kind of configuration problem.
> Thanks
>
|||Is there anything we could do to prevent this or is it just a fact of
life?
I guess I don't understand why the data page would not have been
written correctly to the redundant drive.
|||I'm no storage expert, but I would ask the storage vendor whether the RAID
system fulfils write ordering and the other requirements described in
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx
in the case of a drive failure.
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"Dave" <daveg.01@.gmail.com> wrote in message
news:1136912372.389395.67820@.g47g2000cwa.googlegroups.com...
> Is there anything we could do to prevent this or is it just a fact of
> life?
> I guess I don't understand why the data page would not have been
> written correctly to the redundant drive.
>
|||Well, RAID 5 does not actually have a redundant drive. All the drives are
written to with a small piece of the data. For each stripe, one of the drives
holds the parity while the others each get a piece of the actual data. So if
any one piece is missing, the controller can rebuild it from the remaining
pieces and the parity. But that does not mean you cannot get corruption on a
write, especially during a hardware failure. I won't claim to know how the
drive controllers work internally and how they each do their stuff. So I am not
sure what you can do other than ensuring you have good name-brand equipment,
all in proper working order. Especially the UPS.
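One well-known way a parity array can end up with bad data is an interrupted stripe update, often called the RAID 5 "write hole". The Python sketch below is a toy model under stated assumptions: a single stripe on a 4-drive array, 4-byte blocks, and a hypothetical helper `xor_blocks`. Nothing in the thread says this is what happened in Dave's case, and controllers with a battery-backed cache exist largely to reduce this risk.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 4-drive array: three data blocks plus their parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Update d0. The new data reaches its drive, but power is lost
# before the matching parity update is written.
d0 = b"ZZZZ"
stale_parity = parity  # parity still describes the old d0

# Later the drive holding d1 fails and is rebuilt from the survivors.
rebuilt_d1 = xor_blocks(d0, d2, stale_parity)

print(rebuilt_d1 == b"BBBB")  # False
```

Note that d1 was never rewritten, yet it comes back wrong after the rebuild, because the parity no longer matches the data it is supposed to protect.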
--
Andrew J. Kelly SQL MVP
"Dave" <daveg.01@.gmail.com> wrote in message
news:1136912372.389395.67820@.g47g2000cwa.googlegroups.com...
> Is there anything we could do to prevent this or is it just a fact of
> life?
> I guess I don't understand why the data page would not have been
> written correctly to the redundant drive.
>
|||Guys, I guess I am just not getting it. I thought that RAID 5 was
redundant, meaning it should not be affected by a disk failure. The
data should be stored on 2 drives, right? So how can you get torn page
errors when a disk fails?
Can someone please step me through a scenario of how data can be
corrupted with a RAID 5 disk failure?
|||No, that is not how RAID 5 works. If you have 4 drives in a RAID 5 array, each
stripe of data is essentially split into 3 pieces. One piece goes onto each of
3 drives, and parity is calculated and placed on the fourth drive. This is
repeated for every stripe you write, but the parity moves around so it is not
always on the same drive. Under normal conditions the parity is not used when
you read the data; the whole block is created by piecing the three pieces back
together. In the event of a single disk failure, the controller can read the
two remaining good pieces and use the parity to recreate the third and get the
data back. RAID 5 does not store the data twice. But even if it did, that would
still not prevent torn pages. As I mentioned, a torn page occurs when for some
reason (usually hardware related) only part of a page gets written properly and
the rest is written incorrectly or not at all. This can happen when the driver
thinks it wrote the page properly but the hardware didn't. A RAID 5 array does
not claim to stop this from occurring. That is why backups are so important.
You cannot protect your data 100% with a RAID array.
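The striping and rebuild described above can be sketched in a few lines of Python. This is an illustration under assumptions, not how any particular controller works: the rotation formula in `write_stripe`, the two-byte pieces and the function names are all made up for the example, and real controllers use their own layouts and stripe sizes.

```python
from functools import reduce

DRIVES = 4


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(stripe_no, pieces):
    """Lay out 3 data pieces plus parity across 4 drives; parity rotates per stripe."""
    assert len(pieces) == DRIVES - 1
    parity_drive = (DRIVES - 1 - stripe_no) % DRIVES  # assumed rotation scheme
    row = list(pieces)
    row.insert(parity_drive, xor_blocks(pieces))
    return row, parity_drive


def read_stripe(row, parity_drive, failed_drive=None):
    """Return the data pieces, rebuilding one block if a single drive has failed."""
    row = list(row)
    if failed_drive is not None:
        survivors = [blk for i, blk in enumerate(row) if i != failed_drive]
        row[failed_drive] = xor_blocks(survivors)  # never reads the failed drive
    return [blk for i, blk in enumerate(row) if i != parity_drive]


data = [[b"ab", b"cd", b"ef"], [b"gh", b"ij", b"kl"]]
for n, pieces in enumerate(data):
    row, p = write_stripe(n, pieces)
    # Any single drive can fail and the same data still comes back.
    for failed in range(DRIVES):
        assert read_stripe(row, p, failed) == pieces
    print(f"stripe {n}: parity on drive {p}")
```

The rebuild works because XOR-ing all blocks of a stripe except one always yields the missing block, whether that block held data or parity. It only works for one missing drive at a time, and only if the parity on disk matches the data.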
--
Andrew J. Kelly SQL MVP
"Dave" <daveg.01@.gmail.com> wrote in message
news:1137002846.251710.290370@.g44g2000cwa.googlegroups.com...
> Guys, I guess I am just not getting it. I thought that RAID 5 was
> redundant, meaning it should not be affected by a disk failure. The
> data should be stored on 2 drives right? So how can you get torn page
> errors when a disk fails?
> Can someone please step me through a scenario of how data can be
> corrupted with a RAID 5 disk failure?
>
|||Thanks, that helps a little.
I am still having a hard time grasping parity.
I will use RAID 3 for simplicity.
Disk 1: 00000000
Disk 2: 11111111
Disk 3: ??
On RAID 3, if Disk 3 stores parity data, what would it store? I
don't understand how one drive could store enough data to rebuild
Disk 1 or Disk 2.
|||This should explain parity:
http://www.pcguide.com/ref/hdd/perf/raid/concepts/genParity-c.html
This shows how RAID 3 (and others) use parity:
http://www.storagereview.com/guide2000/ref/hdd/perf/raid/levels/singleLevel3.html
This tells you why you may want to consider something other than RAID 5:
http://www.baarf.com/
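For the two bytes in the question, a short worked example may help. With XOR parity, Disk 3 would hold the bitwise XOR of Disk 1 and Disk 2. The Python below is only a sketch of that arithmetic; the variable names are made up for the example.

```python
disk1 = 0b00000000
disk2 = 0b11111111

# Parity is the bitwise XOR of the data disks.
disk3 = disk1 ^ disk2
print(f"Disk 3: {disk3:08b}")  # Disk 3: 11111111

# If Disk 2 is lost, XOR the survivors to get it back.
rebuilt_disk2 = disk1 ^ disk3
print(f"Rebuilt Disk 2: {rebuilt_disk2:08b}")  # Rebuilt Disk 2: 11111111

# The same works if Disk 1 is the one that is lost.
rebuilt_disk1 = disk2 ^ disk3
print(f"Rebuilt Disk 1: {rebuilt_disk1:08b}")  # Rebuilt Disk 1: 00000000
```

The parity disk does not hold enough to rebuild both data disks at once. It only has to fill in one missing disk, using whatever the surviving disks still hold.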
Andrew J. Kelly SQL MVP
"Dave" <daveg.01@.gmail.com> wrote in message
news:1137006115.234760.166690@.g44g2000cwa.googlegroups.com...
> Thanks, that helps a little.
> I am still having a hard time grasping parity.
> I will use Raid 3 for simplicity.
> Disk 1: 00000000
> Disk 2: 11111111
> Disk 3: ??
> On Raid 3, if Disk 3 stores parity data, what would it store? I
> don't understand how one drive could store enough data to rebuild
> Disk 1 or Disk 2.
>
|||How would RAID 10 affect this scenario?
|||I don't think it would matter what RAID level it was. While RAID 10 does not
use parity, it still writes to disk, and any write has the potential to
produce a torn page.
--
Andrew J. Kelly SQL MVP
"JLA" <info@.jlaenterprises-dot-com.no-spam.invalid> wrote in message
news:43c5acee$0$17777$c3e8da3@.news.astraweb.com...
> How would RAID 10 affect this scenario?
>
|||Thanks! That was some good reading!
I think I am going to push for RAID 10. :-)
disk failure on cluster array
We experienced a disk failure on the cluster array. It's an
active/passive SQL cluster. I was told by Dell that we must power down
the passive node before replacing the failed drive because otherwise both
servers will attempt to rebuild the array. Is this true? Can you
simply swap out the bad drive, or will I need to power down the passive
node?
Dear Icemon,
It seems a bit much to me, and I would guess that modern disk arrays (both
direct SCSI attach and any SAN implementation) would rebuild
the RAID on the array with the host connected and online.
However, my advice is to follow your vendor's instructions.
I know this is probably not the answer you were hoping for, but it all
depends on your hardware vendor to provide support for replacing a hard
drive.
good luck and
rgds,
Edwin.
"icemon" <johnsitu@.gmail.com> wrote in message
news:1165899856.732197.284880@.79g2000cws.googlegroups.com...
> We experienced a disk failure on the cluster array. It's an
> active/passive SQL cluster. I was told by Dell that we must power down
> the passive node before replacing the failed drive because otherwise both
> servers will attempt to rebuild the array. Is this true? Can you
> simply swap out the bad drive, or will I need to power down the passive
> node?
>
|||"icemon" <johnsitu@.gmail.com> wrote in message
news:1165899856.732197.284880@.79g2000cws.googlegroups.com...
> We experienced a disk failure on the cluster array. It's an
> active/passive SQL cluster. I was told by Dell that we must power down
> the passive node before replacing the failed drive because otherwise both
> servers will attempt to rebuild the array. Is this true? Can you
> simply swap out the bad drive, or will I need to power down the passive
> node?
It really depends on the array and the controllers. Do you have PERCs in
each node, and are the drives all SCSI attached? If so, then Dell seems to be
leading you in the right direction.
Russ Kaufmann
MVP - Windows Server - Clustering
ClusterHelp.com, a Microsoft Certified Gold Partner
Web http://www.clusterhelp.com
Blog http://msmvps.com/clusterhelp
|||Let me guess? One of the Powervault 200/220 series arrays with PERC cards
in each host? Dell is correct. Unfortunately, SCSI arrays with no internal
controllers have a lot of limitations. You just found one the hard way.
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP
"icemon" <johnsitu@.gmail.com> wrote in message
news:1165899856.732197.284880@.79g2000cws.googlegroups.com...
> We experienced a disk failure on the cluster array. It's an
> active/passive SQL cluster. I was told by Dell that we must power down
> the passive node before replacing the failed drive because otherwise both
> servers will attempt to rebuild the array. Is this true? Can you
> simply swap out the bad drive, or will I need to power down the passive
> node?
>
|||Since I didn't get any immediate responses to this post, I decided to
call Dell once more. I spoke to another engineer, who advised me
differently. Shouldn't they be more in sync with their knowledge base?
Anyway, this guy told me that powering down is not necessary since it's an
active/passive cluster. He said that since the passive node is not
running, it will not compete. I swapped out the drive and it has been
rebuilt. It is running a PERC 3 on a PowerVault 220 array.
Geoff N. Hiten wrote:
> Let me guess? One of the Powervault 200/220 series arrays with PERC cards
> in each host? Dell is correct. Unfortunately, SCSI arrays with no internal
> controllers have a lot of limitations. You just found one the hard way.
> --
> Geoff N. Hiten
> Senior Database Administrator
> Microsoft SQL Server MVP
>
>
> "icemon" <johnsitu@.gmail.com> wrote in message
> news:1165899856.732197.284880@.79g2000cws.googlegro ups.com...
|||I am not completely surprised either way. Level 1 tech support from Dell
isn't the most accurate in the universe, even on the Gold Support Queue. And
if you are running a cluster without at least Gold support, then you are
crazier than I am. The escalation techs are pretty good and the TAMs can
make things happen so I shouldn't be too harsh on Dell. I do wish they
would update their SCSI storage systems a bit, though.
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP
"icemon" <johnsitu@.gmail.com> wrote in message
news:1165978456.551049.281650@.f1g2000cwa.googlegro ups.com...
> Since I didn't get any immiedate responses on this post, I decided to
> call Dell once more. I spoke to another engineer who advised me
> differently. Shouldn't they be more in sync with their knowledge base?
> Anyway, this guy told me that it is not necessary since it's an
> Active/Passive Node. He said that Since the Passive Node is not
> running, it will not compete. I swap out the array and the drive is
> rebuilt. It is running a PERC 3 on a powervault 220 array.
>
> Geoff N. Hiten wrote:
>
Dear Icemon,
It seems a bit much to me, and I would guess that modern disk arrays (both
direct SCSI attached and any SAN implementation) would rebuild the RAID set
on the array itself, with the hosts connected and online.
However, my advice is to follow your vendor's instructions.
I know this is probably not the answer you were hoping for, but it all
depends on your hardware vendor to provide support for replacing a hard
drive.
good luck and
rgds,
Edwin.
"icemon" <johnsitu@.gmail.com> wrote in message
news:1165899856.732197.284880@.79g2000cws.googlegro ups.com...
> We experienced a disk failure on the cluster array. It's a
> active/passive SQL cluster. I was told by dell that we must powerdown
> the passive before replacing the failed drive because otherwise both
> servers will attempt to rebuild the array. Is this true? Can you
> simply swap out the bad drive or will I need to power down the passave
> node?
>
Disk failure and DBCC CHECKDB REPAIR_ALLOW_DATA_LOSS problem: "The repair level on th
Hello,
A hard disk started developing bad sectors, and it was noticed only when it
was too late, i.e. when the backup contained corrupted data (actually
truncated, as BCP failed midway through) as well as the master
database. The database works, but when certain data is accessed, a SQL
error occurs. No summarising SQL queries work, as they have to go
through the corrupted data as well as the valid data.
I just ran:
DBCC CHECKDB('MYDB','REPAIR_ALLOW_DATA_LOSS')
It is NOT repairing anything. It's saying:
"The repair level on the DBCC statement caused this repair to be
bypassed."
That contradicts my command. Can anyone tell me what I'm doing wrong?
Below is portion of the output.
THANK YOU VERY MUCH FOR ANY HELP. BECAUSE EVEN THE BACKUP IS CORRUPTED,
I AM DESPERATE.
Ideally, if at all possible, I would like to fix
this ASAP, so waiting for a week for PSS is the last resort.
DBCC results for 'Transactions'.
Msg 8928, Level 16, State 1, Server SQLSVR, Line 1
Object ID 205243786, index ID 0: Page (1:259285) could not be processed. See other errors for details.
The repair level on the DBCC statement caused this repair to be bypassed.
Msg 8941, Level 16, State 102, Server SQLSVR, Line 1
Table error: Object ID 205243786, index ID 0, page (1:259285). Test (sorted [i].offset >= PAGEHEADSIZE) failed. Slot 9, offset 0x1 is invalid.
The repair level on the DBCC statement caused this repair to be bypassed.
Table error: Object ID 205243786, index ID 0, page (1:259285). Test (sorted[i].offset >= max) failed. Slot 0, offset 0x9 overlaps with the prior row.
The repair level on the DBCC statement caused this repair to be bypassed.
Msg 8928, Level 16, State 1, Server SQLSVR, Line 1
Object ID 205243786, index ID 0: Page (1:259286) could not be processed. See other errors for details.
The repair level on the DBCC statement caused this repair to be bypassed.
Msg 8928, Level 16, State 1, Server SQLSVR, Line 1
Object ID 205243786, index ID 0: Page (1:259287) could not be processed. See other errors for details.
The repair level on the DBCC statement caused this repair to be bypassed.
Msg 8928, Level 16, State 1, Server SQLSVR, Line 1
Object ID 205243786, index ID 0: Page (1:259288) could not be processed. See other errors for details.
...
Hi
see this link:
http://support.microsoft.com/default...b;en-us;826436
I hope this will be useful for you.
Sunday, March 11, 2012
disaster recovery question
sql2k sp2
I'm not in the heat of a disaster, but I was doing some
testing regarding this topic. Say I experience total
hardware failure and need a new box, or need to
reinstall the OS and SQL. So I'm at the point where I've
reinstalled SQL and now I'm going to restore my backups.
What I'm doing currently is a full backup at night and TLog
backups every 5 minutes. I restore the master DB first,
then my first user DB. Now it's time to restore the TLogs
for the user DB. Since SQL has been reinstalled, SQL no
longer knows where my backups are. Because of this, I
can't use the defaults in EM for restoring. I need to
switch to "From Device" instead of "Database" next
to Restore:, then click Select device / Add / the
ellipsis by "file name" / select the file / and hit OK a
couple of times, just for the first TLog restore. I do TLog
backups for 14 hours a day, which comes to 168 backups
in total. It would take me hours to specify all of these one at
a time. Is there a way to tell a newly installed SQL box
where all the backups are, so that I don't have to follow
this long path for each individual TLog restore?
TIA, Chris
|||Restore msdb. It keeps track of your backup history. EM uses it to
populate the GUI, allowing you to click and go.
HTH
--
Tom
---
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com/sql
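For cases where a usable msdb backup is not at hand, the same chain can be scripted instead of clicked through in EM. The following is only a minimal sketch in Python that prints the RESTORE statements to run; the function name, the database name and the file names are hypothetical, and it assumes the log backup file names sort into the order in which the backups were taken (for example because they embed a timestamp).

```python
def build_restore_script(db_name, full_backup, log_backups):
    """Return T-SQL text that restores one full backup, then each log backup.

    Assumption (hypothetical naming scheme): the log backup file names sort
    lexically into the order in which the backups were taken.
    """
    def quote(path):
        # Double any single quotes so the path is a valid T-SQL string literal.
        return "N'" + path.replace("'", "''") + "'"

    # Escape a closing bracket so the name is a valid bracketed identifier.
    db = "[" + db_name.replace("]", "]]") + "]"

    lines = [f"RESTORE DATABASE {db} FROM DISK = {quote(full_backup)} WITH NORECOVERY"]
    for path in sorted(log_backups):
        # NORECOVERY keeps the database in a restoring state for the next log.
        lines.append(f"RESTORE LOG {db} FROM DISK = {quote(path)} WITH NORECOVERY")
    # Bring the database online only after the last log has been applied.
    lines.append(f"RESTORE DATABASE {db} WITH RECOVERY")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(build_restore_script(
        "MyUserDb",
        r"D:\backup\MyUserDb_full.bak",
        [r"D:\backup\MyUserDb_log_1205.trn",
         r"D:\backup\MyUserDb_log_1200.trn"],
    ))
```

The printed statements can be pasted into Query Analyzer and reviewed before running; nothing here talks to the server itself.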
"chris" <anonymous@.discussions.microsoft.com> wrote in message
news:01f501c3da0c$22667ab0$a401280a@.phx.gbl...
|||(not quite on point to your question...)
If your database gets much update action, instead of just doing transaction
log backups every 5 minutes and letting them accumulate all day long,
consider doing periodic differential backups, maybe hourly:
Full
log
log
log...
differential
log
log
log...
differential
log
log
log...
Restore sequence becomes:
most recent full
most recent differential
any t-log backups since most recent differential.
I'm pretty sure this would restore a lot faster than restoring a full and a
day's worth of logs.
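To make the restore sequence above concrete, here is a rough sketch in Python of how the set of files to restore could be picked from a list of backups. The tuple layout, the function name and the file names are all hypothetical, and it simplifies by comparing finish times; SQL Server itself tracks the chain by log sequence numbers.

```python
from datetime import datetime


def restore_chain(backups):
    """Return backup paths in restore order.

    backups: iterable of (kind, finished_at, path), kind in {"full", "diff", "log"}.
    Simplification: the chain is chosen by finish time rather than by LSN.
    """
    backups = list(backups)
    fulls = [b for b in backups if b[0] == "full"]
    if not fulls:
        raise ValueError("no full backup available")
    full = max(fulls, key=lambda b: b[1])          # most recent full

    diffs = [b for b in backups if b[0] == "diff" and b[1] > full[1]]
    diff = max(diffs, key=lambda b: b[1]) if diffs else None  # most recent differential

    start = diff[1] if diff else full[1]
    logs = sorted((b for b in backups if b[0] == "log" and b[1] > start),
                  key=lambda b: b[1])              # logs since the differential

    chain = [full] + ([diff] if diff else []) + logs
    return [b[2] for b in chain]


if __name__ == "__main__":
    def at(hour, minute=0):
        # Hypothetical timestamps on a single day.
        return datetime(2004, 1, 13, hour, minute)

    print(restore_chain([
        ("full", at(0), "full.bak"),
        ("log", at(8, 5), "log_0805.trn"),
        ("diff", at(9), "diff_0900.bak"),
        ("log", at(9, 5), "log_0905.trn"),
    ]))
```

With hourly differentials and a log backup every 5 minutes, there are at most about 12 log backups to apply after the most recent differential, rather than up to 168 for a full day.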
"chris" <anonymous@.discussions.microsoft.com> wrote in message
news:01f501c3da0c$22667ab0$a401280a@.phx.gbl...
> sql2k sp2
> Im not in the heat of a disaster but was doing some
> testing regarding this topic. Say I experience total
> hardward failure. I require a new box or the need to
> reinstall the OS and SQL. So Im @. the point that Ive
> reinstalled SQL and now Im going to restore my backups.
> What Im doing currently is a full backup @. night and TLog
> backups every 5 minutes. I restore the Master DB first.
> Then my first user db. Now its time to restore the TLogs
> for the user db. Since SQL has been reinstalled, SQL no
> longer knows where my backups exist. Because of this, I
> cant use the defaults in EM for restoring. I need to
> switch to "From Device" instead of using "Database" next
> to Restore: I then need to click Select device/add/
> elipses by "file name"/ select the file/ and hit OK a
> couple of times for the first TLog restore. Now I do TLog
> backup for 14 hours a day. That totals to 168 backups
> total. It would take me hours to specify all these one at
> a time. Is there a way to tell a newly installed SQL box
> where all the backups are? So that I dont have to follow
> this long path for each indiviual TLog restore?
> TIA, Chris
|||Thanks Tom.
>--Original Message--
>Restore msdb. It keeps track of your backup history. EM uses it to
>populate the GUI, allowing you to click and go.
>HTH
>--
>Tom
>------
>Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
>SQL Server MVP
>Columnist, SQL Server Professional
>Toronto, ON Canada
>www.pinnaclepublishing.com/sql
>
>"chris" <anonymous@.discussions.microsoft.com> wrote in message
>news:01f501c3da0c$22667ab0$a401280a@.phx.gbl...
>sql2k sp2
>Im not in the heat of a disaster but was doing some
>testing regarding this topic. Say I experience total
>hardward failure. I require a new box or the need to
>reinstall the OS and SQL. So Im @. the point that Ive
>reinstalled SQL and now Im going to restore my backups.
>What Im doing currently is a full backup @. night and TLog
>backups every 5 minutes. I restore the Master DB first.
>Then my first user db. Now its time to restore the TLogs
>for the user db. Since SQL has been reinstalled, SQL no
>longer knows where my backups exist. Because of this, I
>cant use the defaults in EM for restoring. I need to
>switch to "From Device" instead of using "Database" next
>to Restore: I then need to click Select device/add/
>elipses by "file name"/ select the file/ and hit OK a
>couple of times for the first TLog restore. Now I do TLog
>backup for 14 hours a day. That totals to 168 backups
>total. It would take me hours to specify all these one at
>a time. Is there a way to tell a newly installed SQL box
>where all the backups are? So that I dont have to follow
>this long path for each indiviual TLog restore?
>TIA, Chris
>
disaster recovery question
sql2k sp2
Im not in the heat of a disaster but was doing some
testing regarding this topic. Say I experience total
hardward failure. I require a new box or the need to
reinstall the OS and SQL. So Im @. the point that Ive
reinstalled SQL and now Im going to restore my backups.
What Im doing currently is a full backup @. night and TLog
backups every 5 minutes. I restore the Master DB first.
Then my first user db. Now its time to restore the TLogs
for the user db. Since SQL has been reinstalled, SQL no
longer knows where my backups exist. Because of this, I
cant use the defaults in EM for restoring. I need to
switch to "From Device" instead of using "Database" next
to Restore: I then need to click Select device/add/
elipses by "file name"/ select the file/ and hit OK a
couple of times for the first TLog restore. Now I do TLog
backup for 14 hours a day. That totals to 168 backups
total. It would take me hours to specify all these one at
a time. Is there a way to tell a newly installed SQL box
where all the backups are? So that I dont have to follow
this long path for each indiviual TLog restore?
TIA, ChrisRestore msdb. It keeps track of your backup history. EM uses it to
populate the GUI, allowing you to click and go.
HTH
Tom
---
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com/sql
"chris" <anonymous@.discussions.microsoft.com> wrote in message
news:01f501c3da0c$22667ab0$a401280a@.phx.gbl...
sql2k sp2
Im not in the heat of a disaster but was doing some
testing regarding this topic. Say I experience total
hardward failure. I require a new box or the need to
reinstall the OS and SQL. So Im @. the point that Ive
reinstalled SQL and now Im going to restore my backups.
What Im doing currently is a full backup @. night and TLog
backups every 5 minutes. I restore the Master DB first.
Then my first user db. Now its time to restore the TLogs
for the user db. Since SQL has been reinstalled, SQL no
longer knows where my backups exist. Because of this, I
cant use the defaults in EM for restoring. I need to
switch to "From Device" instead of using "Database" next
to Restore: I then need to click Select device/add/
elipses by "file name"/ select the file/ and hit OK a
couple of times for the first TLog restore. Now I do TLog
backup for 14 hours a day. That totals to 168 backups
total. It would take me hours to specify all these one at
a time. Is there a way to tell a newly installed SQL box
where all the backups are? So that I dont have to follow
this long path for each indiviual TLog restore?
TIA, Chris|||(not quite on point to your question...)
If your database gets much update action, instead of just doing transaction
log backups every 5 minutes and letting them accumulate all day long,
consider doing periodic differential backups, maybe hourly:.
Full
log
log
log...
differential
log
log
log...
differential
log
log
log...
Restore sequence becomes:
most recent full
most recent differential
any t-log backups since most recent differential.
I'm pretty sure this would restore a lot faster than restoring a full and a
day's worth of logs.
"chris" <anonymous@.discussions.microsoft.com> wrote in message
news:01f501c3da0c$22667ab0$a401280a@.phx.gbl...
uses it to
--
message
Im not in the heat of a disaster but was doing some
testing regarding this topic. Say I experience total
hardward failure. I require a new box or the need to
reinstall the OS and SQL. So Im @. the point that Ive
reinstalled SQL and now Im going to restore my backups.
What Im doing currently is a full backup @. night and TLog
backups every 5 minutes. I restore the Master DB first.
Then my first user db. Now its time to restore the TLogs
for the user db. Since SQL has been reinstalled, SQL no
longer knows where my backups exist. Because of this, I
cant use the defaults in EM for restoring. I need to
switch to "From Device" instead of using "Database" next
to Restore: I then need to click Select device/add/
elipses by "file name"/ select the file/ and hit OK a
couple of times for the first TLog restore. Now I do TLog
backup for 14 hours a day. That totals to 168 backups
total. It would take me hours to specify all these one at
a time. Is there a way to tell a newly installed SQL box
where all the backups are? So that I dont have to follow
this long path for each indiviual TLog restore?
TIA, ChrisRestore msdb. It keeps track of your backup history. EM uses it to
populate the GUI, allowing you to click and go.
HTH
Tom
---
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com/sql
|||Thanks Tom.
Wednesday, March 7, 2012
Disaster recovery from MS SQL clustering
Hi all,
I have an MS SQL active/passive cluster with 2 nodes running Windows 2000 Advanced
Server. One of the nodes had a hardware failure. After I restored the Win2K
server from NTBackup, I can't rejoin the cluster. I can see that both
servers are up in Cluster Administrator, but it fails when a failover
occurs.
Please advise me what to do.
Justin
Here is a webcast that might help:
http://support.microsoft.com/default...b;en-us;822250
Or maybe this Q article:
http://support.microsoft.com/default...b;en-us;822400
Cheers,
Rod
MVP - Windows Server - Clustering
http://www.nw-america.com - Clustering
|||The information for such scenarios is well documented in "Help and Support Center" for Windows.
Here is a cut and paste from Help:
Scenario 6: Single Cluster Node Corruption or Failure
Symptom: The node cannot join the cluster.
If the Event Log indicates that the cluster database on the local node is merely corrupted, you can perform a System State restore on that node to replace the local cluster database. For information, see "To restore the cluster database on a local node". Alternatively, you can copy the latest checkpoint file (CHKxxx.TMP) from the quorum disk to the %systemroot%\Cluster\ directory, rename it as file CLUSDB, and restart the Cluster service on that node.
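The copy-and-rename alternative above could be scripted roughly as follows. This is a hedged Python sketch, not a supported tool. Both directories are passed in by the caller because the Help text does not give the exact folder on the quorum disk. "Latest" is assumed to mean the most recently modified CHKxxx.TMP file. Restarting the Cluster service is left as a manual step. The function name is invented for illustration.

```python
import shutil
from pathlib import Path


def copy_latest_checkpoint(quorum_dir, cluster_dir):
    """Copy the newest CHKxxx.TMP from quorum_dir to cluster_dir/CLUSDB."""
    checkpoints = [
        p for p in Path(quorum_dir).iterdir()
        if p.is_file()
        and p.name.upper().startswith("CHK")
        and p.name.upper().endswith(".TMP")
    ]
    if not checkpoints:
        raise FileNotFoundError(f"no CHKxxx.TMP file found in {quorum_dir}")
    # Assumption: "latest" means the most recently modified checkpoint file.
    latest = max(checkpoints, key=lambda p: p.stat().st_mtime)
    target = Path(cluster_dir) / "CLUSDB"
    shutil.copy2(latest, target)   # copy and rename in one step
    return target
```

Keeping a copy of the existing CLUSDB before overwriting it would be a sensible precaution.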
If a single node fails in the cluster due to system disk or other hardware failure, follow these steps to rebuild the node and rejoin the cluster:
1. After verifying that all cluster resource groups have been successfully moved to other nodes, repair or replace the failed hardware. For information, see "To move a group to another node" and "To manage cluster hardware".
2. Perform an Automated System Recovery restore on the failed node to rebuild the node. For information, see "To restore a damaged cluster node using Automated System Recovery".
3. If you have other files or application data for that node backed up on tape or other backup medium, you can restore that now. For information, see "To restore files from a file or a tape" and Scenario 8 below.
4. For each cluster group and resource, verify that the newly recovered node appears as a possible owner in Cluster Administrator, then move a resource group to the newly recovered node and verify that the move is successful. For information, see "To test whether group resources can fail over".
Note: If you do not have an Automated System Recovery backup of the node, you can evict that node and add a new node to the cluster. For more information, see "To evict a node from the cluster" and "To add additional nodes to the cluster".
For more scenarios and options, please review the topic "Backing up and restoring server clusters" in Windows Help,
or go online to the following link:
http://www.microsoft.com/resources/d...r/proddocs/en-us/SAG_MSCSusing_9.asp
SQL Server-specific information is in SQL Server Books Online. I am pasting the relevant topic here for you.
How to recover from failover cluster failure in Scenario 1
http://msdn.microsoft.com/library/de...ering_2uax.asp (online link)
In this scenario, failure is caused by hardware failure in Node 1 of a two-node cluster. This hardware failure could be caused, for example, by the failure of a small computer system interface (SCSI) card or the operating system.
After Node 1 fails, the Microsoft SQL Server 2000 failover cluster fails over to Node 2. Then:
1. Run SQL Server Setup and remove Node 1. For more information, see "How to remove a failover clustered instance".
2. Evict Node 1 from Microsoft Cluster Service (MSCS). To evict a node from MSCS, from Node 2, right-click on the node to remove, and then click Evict Node.
3. Install new hardware to replace the failed hardware in Node 1.
4. Install the operating system. For more information about which operating system to install and specific instructions on how to do this, see "Before Installing Failover Clustering".
5. Install MSCS and join the existing cluster. For more information, see "Before Installing Failover Clustering".
6. Run the Setup program on Node 2 and add Node 1 back to the failover cluster. For more information, see "How to add nodes to an existing virtual server (Setup)".
Hope that helps.
Best Regards,
Uttam Parui
Microsoft Corporation
This posting is provided "AS IS" with no warranties, and confers no rights.