I have a need to filter out certain rows from my data stream. I cannot apply the filter against the source data using my DataReader component, due to some constraints in the source system. Therefore, I must filter the data out after it enters my datastream (trust me on this part).
I have created a data flow that uses the Conditional Split transformation to do this. I created one condition that matches the rows I want to discard. I then connected the Default output stream to my target table. I have simply left the "discard" output disconnected. This appear to do what I want.
My question is: is it OK to leave outputs disconnected in this fashion? It isn't really apparent when viewing the package that the conditional split is discarding rows. Is there a better way to handle this situation? For now I've just added an annotation to the package that describes what is happening.
Thanks for any help
You can use a rowcount transform to make it more 'evident'. That offers the extra benefit of getting the number of rows you discarded; which comes handy for auditing proposes.
There is a 3rd party adapter here that you may want to look as well(i have not used it):
http://www.sqlis.com/56.aspx
|||Yes, you have implemented that perfectly. The only thing I would add is to hook your "filtered" rows up to a Row Count transformation. This will do two things. One, it will let you see that records are going down that flow when debugging. Two it will let you capture the number of rows that went through there in a variable so that you can log it later if you wish.|||As the other guys have said, you have done this in exactly the right way.
I disagree with Phil slightly though (sorry Phil ). I wouldn't bother connecting the dangling output to a rowcount component. This will simply make the data-flow do some unnecassary work. If you DO want to count the number of filtered rows then sure, use a rowcount component - although you can still determine the number of filtered out rows with introducing an additional component by substracting the number of output rows from the number of input rows. These two values are available by logging the OnPipelineRowsSent event.
-Jamie
|||
Jamie Thomson wrote:
If you DO want to count the number of filtered rows then sure, use a rowcount component - although you can still determine the number of filtered out rows with introducing an additional component by substracting the number of output rows from the number of input rows. These two values are available by logging the OnPipelineRowsSent event.
-Jamie
Good idea! How easy is it then to capture that and use it in auditing from a control flow task?|||
Phil Brammer wrote:
Jamie Thomson wrote: If you DO want to count the number of filtered rows then sure, use a rowcount component - although you can still determine the number of filtered out rows with introducing an additional component by substracting the number of output rows from the number of input rows. These two values are available by logging the OnPipelineRowsSent event.
-Jamie
Good idea! How easy is it then to capture that and use it in auditing from a control flow task?
You can get hold of it in the eventhandler. You'd have to parse it out of the message but that's no biggie.
Not as easy as rowcount though. Options...always options!
-Jamie
|||It's amazing how deep you can get in SSIS... Many dark corners yet unexplored!
没有评论:
发表评论