That’s not a recursion. This is a recursion.

boxing kangaroo
I may only be an expert of sorts in computer information systems, as opposed to a computer scientist, and one specializing in algorithms at that, but I’d like to think I know the difference between iterating on a row set that diminishes with each iteration and a set-constrained, self-calling operation. I am also pretty sure that iterating on a row set that diminishes with each iteration is the parent form of a for-each style iteration. I do know that SQL Server 2012 introduces a deployment model that makes the business of building recursive SSIS packages wholly more tenable.

The project deployment model allows us to reference other packages in the same project during development. This is a welcome change, as it allows us to take a more modular approach to package development. One of the ways this manifests is in the Execute Package Task. Instead of creating a connection manager of either a SQL Server or File System type, you can use an internal reference. In SQL Server 2012 an Execute Package Task can call a package that is either internally referenced, meaning it lives in the same project as the parent package, or externally referenced. An externally referenced package is essentially the only reference type available in SQL Server 2008 R2. It requires a connection manager, which will need to change if we are developing anywhere other than our production environment (and if you are developing in production, you really shouldn’t be).

Beyond letting us call packages stored in the project we are working in, the internally referenced package call allows us to do something that is simply amazing: we can call the very package we are building. That means a child package can modify the parent package’s variable values, because they are the same variables. In the example I have created a package that reads a list of packages and executes them asynchronously. I included annotations to help keep this relatively simple package conceptually simple.

Package

The package begins with an Execute SQL Task. I use an expression (which we can now see right on the design surface, new to SSDT) to set the Disable property of this task. The expression is simply a test of the initialization parameter. If we are making the initial call to this package, the parameter defaults to 1, which tells the Populate Args SQL Task whether or not to run. For fun I pass the actual SQL it runs in through another parameter. Actually, it is quite a bit more than just fun, but we will have to discuss environments, parameters, and programmatic execution in package stores in later posts.
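
A minimal sketch of what an Args-populating statement could look like, assuming a hypothetical dbo.PackageQueue table that holds the names of the child packages to run, might be:

-- hypothetical queue of child packages to execute
SELECT PackageName
FROM dbo.PackageQueue
WHERE Processed = 0
ORDER BY QueuePosition;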

The next task is a script task in which I manipulate the array (I need to stay mum on the specifics of this for various legal reasons, about which I err on the side of caution). This task accomplishes two primary things. The first and more obvious is popping the first argument off of the Args stack. The second is checking that there is at least one row left in the set (there are a lot of ways to skin this cat). If there isn’t, we change our initialization value to -1. Otherwise we set it to continue.
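
The script task itself is .NET; purely for illustration, the same pop-and-check logic expressed in T-SQL against a hypothetical @Args table variable might read something like this (the 0 used to mean continue is an assumption of mine):

DECLARE @Args TABLE (QueuePosition INT, PackageName NVARCHAR(260));
-- imagine this has been filled by the Populate Args task
DECLARE @CurrentArg NVARCHAR(260), @Initialization INT;

-- pop the first argument off the stack
SELECT TOP (1) @CurrentArg = PackageName FROM @Args ORDER BY QueuePosition;
DELETE FROM @Args WHERE PackageName = @CurrentArg;

-- if nothing is left, use the value that disables the sequence container
SET @Initialization = CASE WHEN EXISTS (SELECT 1 FROM @Args) THEN 0 ELSE -1 END;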

Next we use a sequence container to ensure that the previous steps have executed before we execute the remaining two tasks in parallel. If we had an array length of zero for our Args set, the -1 initialization parameter would disable this container and prevent the execution of its children. Once it knows it is disabled, the package completes its execution.

The next task is an Execute Package Task. This simply executes the package we are in. Our actions read and set the same variables. Since we have enforced precedence outside of the container, already set our next Args, and already disabled our Args population SQL Task, this task will inevitably lap our work task, allowing us to start all of our work without waiting for any of it to complete. If any of our work items require precedence constraints between them, we can engineer some sort of messaging service for them, or build parent packages that enforce the precedence between those packages and call them instead of their children.

Our final task is the one that performs the actual work. In this case I chose an Execute Process Task that calls DTEXEC. The Args member retrieved in our script task becomes one of the parameters of the DTEXEC call, which lets us call packages that may have a different deployment model, or live in a different project, than our recursion. We could just as easily have used this approach to spawn workers that operate on separate folders or servers.

Because some strings are code and some code contains strings that are code.

Not to dicker on about grammar, and in particular grammatical flourishes, in a blog post ostensibly about syntactic inversions, but wouldn’t it be great if we allowed the title to be Because some strings are code and some codes contain strings that are code? I certainly do not intend to post about semantics proper, not to be confused with semantic web ontologies, which is likewise a different topic and in many respects a different expert credential than the one I have been alluding to this past year.

Blogs I Follow, Found, Read, or Otherwise Link To

Here I am making a blog post about adding blogs to the Blogs I Follow feature here on WordPress. With the New Year only a week or so away, it dawns on me that I still don’t have the blog presence I resolved to have in at least three of the last five New Years. I have started and abandoned at least three blogs so far. I have tried to blog on databases in general, software development, systems administration, gardening, politics, and various combinations thereof. I read as much of Blogging for Dummies as I could without buying a coffee refill at Barnes and Noble. I think I was reading at a venti level at that time. I am pretty sure it told me that one of the keys to getting good traffic to your site is following blogs and linking to articles. It might also help to review a book.

EXEC sp_helptext ‘sp_addlinkedsrvlogin’

In the course of developing the expertise I have made it a point to assert in several past posts, I have encountered a number of philosophies regarding the comparative value of various areas of expertise that overlap competitively. In that time I have made it a point to refrain from passing judgement prematurely, favoring one methodology over another. It has been my experience that, often enough, opposing schools of thought on most any subject focus primarily on the strengths of their own position while neglecting its weaknesses, which as often as not are the strengths of the other. In short, thesis and antithesis give rise to synthesis.

Having said that brings us to the specific example to which this particular generalization most immediately applies. Recently I was asked by a novice SSIS developer whether there is a way, using only SSIS tasks, to display a result set that provides the final aggregates inline with the values being aggregated. This is something I have, over time, developed a reflex to answer with a SQL solution. For brevity’s sake I am glad the question was not one of a running total.

First let me show you the solution I provided in SSIS:

SSIS Left Joining Aggregates

In the ETL solution we join together the original source and the aggregate run against it. In SSIS this requires an additional sorting step, a step that SQL performs implicitly in the database engine. Generally speaking, sorting is one of the most expensive operations, if not the most expensive, that can be performed on a set of data, whether it is implied or explicitly commanded.

2012-7-12 Left join for an inline total
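
The screenshot shows the statement I used. A minimal sketch of that sort of left join, assuming a hypothetical dbo.OrderDetail table totaled by ProductID, might look something like this:

SELECT d.ProductID,
       d.OrderQty,
       agg.TotalQty                     -- the final aggregate repeated inline on every row
FROM dbo.OrderDetail AS d
LEFT JOIN (SELECT ProductID, SUM(OrderQty) AS TotalQty
           FROM dbo.OrderDetail
           GROUP BY ProductID) AS agg
       ON agg.ProductID = d.ProductID
ORDER BY d.ProductID;                   -- the explicit sort of the output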

In either case we are performing the sorting of the sets to be joined, and in the SQL statement we are also explicitly sorting the output. Of course, the results speak for themselves.

2012-7-12 Left join output
2012-7-12 SSIS Output

In the left image we see the output of the SQL statement, and in the right the output of the SSIS task. There are differences in the column names, owing primarily to a lack of rigor on my part regarding naming. We can see from the outputs that the data values are identical for both approaches.

So why should we choose one approach over the other? What you may not have noticed earlier in this post is that the SSIS output shown is the consequence of my having added a data viewer to the data flow task’s merge join output. That is to say, what you are seeing as the “output” of the SSIS solution is the output of the extract and transformation steps of the extract, transform, load data flow, not the output of the ETL flow as a whole, which would be a confirmation of the data being loaded while in debug mode, or an entry in the sysjobhistory table should it be run as a SQL Agent job with logging enabled.

I have to apologize for the trickery, though I think it appropriate for demonstrating the differences between the two technologies. A proper comparison would have had a SQL solution that looked like this:

2012-7-12 Left join for an inline total ETL
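
A rough sketch of a statement that actually moves the data, reusing the hypothetical tables from above plus an assumed dbo.OrderDetailWithTotals destination, might be:

INSERT INTO dbo.OrderDetailWithTotals (ProductID, OrderQty, TotalQty)
SELECT d.ProductID,
       d.OrderQty,
       agg.TotalQty
FROM dbo.OrderDetail AS d
LEFT JOIN (SELECT ProductID, SUM(OrderQty) AS TotalQty
           FROM dbo.OrderDetail
           GROUP BY ProductID) AS agg
       ON agg.ProductID = d.ProductID;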

You’ll note that if our only purpose were to select the data, using SSIS would be an inappropriate choice. However, since we are moving the data, it becomes a question of establishing benchmarks for comparing performance. Having said that, depending on the compatibility level of the database and the surface area configuration, we may not be able to connect between instances using OPENDATASOURCE at all. That is just a cursory way of saying that there exist numerous administrative issues related to connecting data sources, and to securing them, that often make the business of ETL more complicated than questions of SQL statements and sort algorithms.
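
For what it is worth, ad hoc distributed queries of that sort are off by default. A sketch of enabling them and pulling from a second instance, assuming a hypothetical SERVER2 instance and SourceDb database, might look like this:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;

SELECT d.ProductID, d.OrderQty
FROM OPENDATASOURCE('SQLNCLI',
     'Data Source=SERVER2;Integrated Security=SSPI').SourceDb.dbo.OrderDetail AS d;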

 

RAISERROR 2147483647 ‘Hello World!’

In my last post I made it a point to demonstrate that a ratio would better show my expertise than an absolute number. In that post I also alluded to such a simple concept as “book” being possibly vague in a larger sense than what is realized by the concretization of the term in the form of an ISBN. Of course, those in the know know that an ISBN only uniquely identifies an edition of a title, a title being that part of the cover we are judging our books by, and the cover in turn being the thing between which the pages of writing are presumed bound (or at least sorted). While my logic may at this point seem a bit meandering, please bear with me.

An ISBN uniquely identifies a title, but not in the sense that it uniquely identifies an instance of a title that may be on a particular cover. So it might stand to reason that more than one book may have the same title, book again in this case being less a synonym for title and more an instance of it. Now, from the point of view of expertise this does little for our summarization, but it does broach the topic of said expertise, and it begs the question as to why we might have more than one copy of Itzik Ben-Gan’s Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions (ISBN 978-0-7356-5836-3), which we don’t, but we might, and we might want to know which, if any, of the other books contributing to this library of expertise we have more than one copy of, as we might want to sell them or trade them or what have you.

The phrase (we’ll call these queries going forward) that might tell us how many titles we have more than one of might look something like:

2012-7-12 Count books
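
A sketch of that sort of query, assuming a hypothetical dbo.Books table with one row per physical copy, might be:

SELECT ISBN, Title, COUNT(*) AS Copies
FROM dbo.Books
GROUP BY ISBN, Title
HAVING COUNT(*) > 1;    -- titles we own more than one copy of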

RAISERROR 13000 ‘Hello World!’

In my last post I claimed that my expertise could be summarized by a number. That number was the number of books in my library having a title containing either the word “SQL” or the word “Data”, in whole or as part of another word. This does not, however, address the question of to what extent the meandering mentioned has impacted the relative breadth of the library of expertise which I am claiming an absolute number summarizes. So to that effect I propose another phrase be typed into a certain program. This phrase will include the former as its numerator, but will have a denominator such as to give us a ratio of those books I claim contribute to my expertise to books in general (specifically, books narrowly defined as ISBNs, as opposed to leaves of paper with writing on them bound together and presumably following some grammatical style apropos to the topic to which the title should presumably allude). But, alas, I digress. The phrase that I think might aid in the rationalization of the aforementioned expertise I am ascribing to myself might look something like this:

2012-7-12 Count book titles relatively
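
Again assuming a hypothetical dbo.Books table, a sketch of that ratio phrase might be:

SELECT COUNT(CASE WHEN Title LIKE '%SQL%' OR Title LIKE '%Data%' THEN 1 END) * 1.0
       / COUNT(*) AS ExpertiseRatio    -- SQL and Data titles over all titles
FROM dbo.Books;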

PRINT ‘Hello World!’

 

In my last post I alluded to being an expert with an opinion. That allusion was suggestive that what I might be an expert in was my opinion. Of course that is not the only thing I bill myself as an expert in, else who would really care about my opinion, expert or not. My expertise has come from years of diligent study interspersed with work and punctuated by an occasionally unhealthy obsession. Along with this study have come degrees and even some honors, much the way this work has been accompanied by titles and even some pay. Over the years my expertise has meandered between a number of things related to that which has come to be my primary expertise and that which I intend, with this blog, not to be bothered to opine on.

In that time I have amassed a small but respectable library of titles. Should we judge these books by their covers, or at least their titles, we might get some sort of indication as to what precisely it is that I am claiming expertise in. Of course, this library being small does not make the reading and judging of their covers any less laborious. Fortunately, in the course of amassing such a library and developing this alleged expertise, I have discovered that by typing certain phrases into certain programs I can get a number that summarizes what I have been alluding to all this while.

That phrase might go something like this:

2012-7-12 Count book titles
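
For the curious, a sketch of such a phrase, assuming a hypothetical dbo.Books table of titles, might be:

SELECT COUNT(*) AS ExpertiseAsANumber
FROM dbo.Books
WHERE Title LIKE '%SQL%'
   OR Title LIKE '%Data%';    -- in whole or as a part of another word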

SELECT ‘Hello World!’ AS [C1]

Hubo throws out the first pitch
I have seen the WordPress logo around the web for years now and have never bothered to keep a blog here. I suppose this is largely due to the fact that I have not bothered to keep a blog anywhere. I have started blogs here and there but never bothered to keep any one of them, save perhaps in the sense that they are, at least in my non-expert opinion, mine in terms of being my intellectual property. This blog is mine in that sense, and with any luck not a bother. More to the point, this blog is mine in the way that my expert opinion is going to be blogged here. I say my expert opinion because I think I am entitled to call it that, being an expert in the area of my opinions.