Thursday, March 29, 2012
Disk Subsystem Problem during snapshot
In our current environment we apply a snapshot every night from our production database (the production database is hosted in another environment). This means approximately 15GB is copied to the subscriber every night. It takes a long time (3+ hours), but that is not a problem in our environment.
Exactly the same snapshot has to be applied in our new SQL environment. There we are having serious problems with our disk subsystem when applying the snapshot. The new environment uses a shared HP MSA1000 disk system. When the snapshot is applied, the load on the MSA1000 is so high that it can no longer handle all the requests. This results in a crash of SQL Server (SQL can't find tempdb anymore), and the other systems that use the MSA1000 (two more servers, so three in total) also have problems with their disks (event ID 51).
We've contacted HP and Microsoft, and the conclusion will probably be that the load generated by applying the snapshot is too high for the MSA1000. Microsoft's explanation of how this can happen, given that the exact same process runs on the old environment, is quite reasonable (although it's still hard to believe that old hardware copes better under high load than new, more advanced hardware).
I'd like to know your opinions about applying a snapshot to a subscriber every night, which involves copying approximately 15GB of data. Should this be no problem, or is it uncommon to do this because of hardware limitations? Is it normal for the MSA1000 to have problems of this kind, or should it be able to handle the load (should it result in disks that can no longer be reached, or should it just perform very slowly)?
Your opinion is very much appreciated!
Kind regards,
Jan Martijn Schuur
"Martijn" wrote:
...
> I'd like to know your opinions about applying a snapshot
> every night to a subscriber, which involves copying approximately 15GB of
> data. Should this be no problem, or is it uncommon to do this
> because of hardware limitations?
15 GB is a large amount of data for a nightly snapshot. Why don't you use incremental transactional replication instead?
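As a rough sketch of that suggestion (database, publication, and table names here are hypothetical), a transactional publication is created at the publisher with the replication stored procedures; after the one-time initial snapshot, only incremental changes flow to the subscriber instead of a full 15GB copy:

```sql
-- Hypothetical names throughout. Run at the publisher after the database
-- has been enabled for publishing (sp_replicationdboption).
USE ProdDB;
GO
-- Create a transactional publication; 'continuous' means the Log Reader
-- Agent delivers changes as they are committed, rather than a nightly
-- full snapshot.
EXEC sp_addpublication
    @publication = N'ProdDB_Tran',
    @repl_freq   = N'continuous',
    @status      = N'active';
-- Add a table (article) to the publication.
EXEC sp_addarticle
    @publication   = N'ProdDB_Tran',
    @article       = N'Orders',
    @source_object = N'Orders';
```

A snapshot is still needed once to initialize the subscriber, but the nightly 15GB transfer goes away.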
I'm not familiar with the other problems you describe, sorry.
Monday, March 19, 2012
discarding rows - best practice?
I have a need to filter out certain rows from my data stream. I cannot apply the filter against the source data using my DataReader component, due to some constraints in the source system. Therefore, I must filter the data out after it enters my data stream (trust me on this part).
I have created a data flow that uses the Conditional Split transformation to do this. I created one condition that matches the rows I want to discard, connected the Default output to my target table, and simply left the "discard" output disconnected. This appears to do what I want.
My question is: is it OK to leave outputs disconnected in this fashion? It isn't really apparent when viewing the package that the conditional split is discarding rows. Is there a better way to handle this situation? For now I've just added an annotation to the package that describes what is happening.
Thanks for any help
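For reference, the condition on a Conditional Split that routes rows to a discard output is just an SSIS expression; a hypothetical example (the column names are invented) might look like:

```
ISNULL(CustomerID) || Status == "D"
```

Rows matching the condition go to the named output (left disconnected in the setup described above); everything else falls through to the Default output, which feeds the target table.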
You can use a Row Count transform to make it more 'evident'. That offers the extra benefit of capturing the number of rows you discarded, which comes in handy for auditing purposes.
There is a 3rd-party adapter here that you may want to look at as well (I have not used it):
http://www.sqlis.com/56.aspx
Yes, you have implemented that perfectly. The only thing I would add is to hook your "filtered" rows up to a Row Count transformation. This will do two things. One, it will let you see that records are going down that flow when debugging. Two, it will let you capture the number of rows that went through there in a variable, so that you can log it later if you wish.

As the other guys have said, you have done this in exactly the right way.
I disagree with Phil slightly though (sorry, Phil). I wouldn't bother connecting the dangling output to a Row Count component; that simply makes the data flow do some unnecessary work. If you DO want to count the number of filtered rows then sure, use a Row Count component, although you can also determine the number of filtered-out rows without introducing an additional component, by subtracting the number of output rows from the number of input rows. These two values are available by logging the OnPipelineRowsSent event.
-Jamie
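As a sketch of that suggestion: if the package logs OnPipelineRowsSent with the SQL Server log provider (which, in SSIS 2005, writes to dbo.sysdtslog90 by default), the per-path row counts can be read back afterwards. Note that the rows-sent figure is embedded in the free-text message and has to be parsed out:

```sql
-- Assumes the default SQL Server log provider table in SSIS 2005;
-- the rows-sent count is part of the free-text [message] column.
SELECT [source], [message], [starttime]
FROM dbo.sysdtslog90
WHERE [event] = 'OnPipelineRowsSent'
ORDER BY [starttime];
```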
Jamie Thomson wrote:
If you DO want to count the number of filtered rows then sure, use a Row Count component, although you can also determine the number of filtered-out rows without introducing an additional component, by subtracting the number of output rows from the number of input rows. These two values are available by logging the OnPipelineRowsSent event.
-Jamie

Good idea! How easy is it then to capture that and use it for auditing from a control flow task?
Phil Brammer wrote:
Good idea! How easy is it then to capture that and use it for auditing from a control flow task?

You can get hold of it in the event handler. You'd have to parse it out of the message, but that's no biggie.
Not as easy as rowcount though. Options...always options!
-Jamie
It's amazing how deep you can get in SSIS... Many dark corners yet unexplored!