Dynamic SQL: Yea or Nay?
I’ve always been on the side of stored procedures in the classic debate over the merits of dynamic SQL. In reality, I can only think of one good scenario where dynamic SQL at the application layer should be used: programmatic batch inserts.
I won’t go into the performance debate, since there are tons of articles that already cover this area, but rather, I’d like to discuss the usability and development and architectural aspect of it.
In almost all other cases, it seems like the best choice is to have the application not generate dynamic SQL and use a stored procedure…always. There are certainly times when dynamic SQL is necessary, for example, when generating selects against a dynamic table structure, but in those cases, the variable portions of the query can be parameterized into the stored procedure and the procedure should generate the dynamic SQL.
Some would argue that if the underlying data models change, the application layer will usually be forced to change are ignoring other aspects of model changes that don’t necessitate application model changes. These include performance tuning, filtering by table JOINs and reuse of the data logic in nested stored procedures or functions.
When working with compiled code like .NET, the core issue is that fixing query errors involves a recompile and redeploy, which in most cases, is much more difficult than just fixing a completely disconnected (but not completely decoupled since there is a quasi-interface (the return result type and structure)) stored procedure.
For example, if a dataset today contains data from table A and tomorrow it needs to include data from table A and B (let’s say they both contain the same elements, but one is used for archives), it would be easy to update the procedure to UNION the results from the two datasets without affecting the application layer.
This isn’t the only scenario, for example, let’s say the requirement changes and now the data needs to be filtered by another table. It would be easy to add a new INNER JOIN to the query without affecting the application layer. Not only that, it also allows for the recombination of fields (for example a user name field today only needs to show first and last name, but tomorrow, it may need to show the middle initial as well – this change can be done at the database level and not affect the application or UI layers). It can also make it easy to change the underlying table structure so long as the return data isn’t expected to change: it provides a layer of decoupling between the application layer and the raw data storage.
In addition, having a stored procedure allows for easier testing of the data layer without the added overhead of having to execute the application runtime and walk through the debugger line by line just to figure out if the return data is correct; it is much more efficient to simply execute the query and simulate the use case to find if the data that is returned is correct. It becomes much easier and much less painful to simulate data access tests since they can be run, observed, and analyzed nearly instantly.
In larger organizations with dedicated DBAs, stored procedures have the added benefit of allowing SQL experts to add performance tuning to eek out extra performance without requiring the application to be rewritten or recompiled. Again, we see this decoupling of the application layer from the data layer. Of course you could always have templated SQL stored in XML files or something that would get rid of that recompile, but it is still likely to necessitate more redeployment if the application in question is distributed. This key point is not to be taken lightly since — as an example — an error in string formatting may require the replacement of binaries and services across dozens of servers. Not only that, testing in such a scenario still requires interaction with the application layer, adding to the possible failure points, time required, and general development pain.
My own conclusion is that using dynamic SQL (including LINQ) creates too tight of a coupling between the application layer binaries and the underlying data store; it’s great for RAD and testing, but in any application of significance (especially in highly distributed environments), dynamic SQL at the application layer seems like it’s a maintenance and testing disaster waiting to happen.