<CharlieDigital/> Programming, Politics, and uhh…pineapples

4 Apr 2014

FluentNHibernate and SQL Date Generation

Posted by Charles Chen

So you'd like your SQL entry to have a system generated date/time, eh?

Here is a sample table:

CREATE TABLE [dbo].[AuditLog] (
	Id int IDENTITY(1,1) PRIMARY KEY,
	EventUtc datetime2(7) DEFAULT(SYSUTCDATETIME()) NOT NULL,
	EventOffsetUtc datetimeoffset(7) DEFAULT(SYSDATETIMEOFFSET()) NOT NULL,
	EntityContextUid uniqueidentifier,
	EntityContextName nvarchar(256),
	EntityContextType varchar(128),
	UserLogin nvarchar(128),
	EventName varchar(128),
	AppContext varchar(64),
	EventData nvarchar(max)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
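
Note that with those DEFAULT constraints, the server fills in the timestamps for you; at the SQL level, an insert can simply omit the date/time columns (a quick sketch; the column values here are only illustrative):

--// The date/time columns are omitted; SYSUTCDATETIME() and
--// SYSDATETIMEOFFSET() populate them via the column defaults.
INSERT INTO [dbo].[AuditLog] (EntityContextName, UserLogin, EventName)
VALUES (N'Sample entity', N'DOMAIN\cchen', 'ItemCreated')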

To spare you hours dealing with this error:

System.Data.SqlTypes.SqlTypeException:
   SqlDateTime overflow. Must be between
   1/1/1753 12:00:00 AM and 12/31/9999
   11:59:59 PM.

What you need to do is to use the following mapping for your date/time columns:

Map(a => a.EventUtc).Column("EventUtc")
	.CustomSqlType("datetime2(7)")
	.Not.Nullable()
	.Default("SYSUTCDATETIME()")
	.Generated.Insert();
Map(a => a.EventOffsetUtc).Column("EventOffsetUtc")
	.CustomSqlType("datetimeoffset(7)")
	.Not.Nullable()
	.Default("SYSDATETIMEOFFSET()")
	.Generated.Insert();

Special thanks to this Stack Overflow thread.

26 Jul 2013

SQL Query for Multi-Values In An Encoded String

Posted by Charles Chen

Consider a table with a textual column that encodes multiple values like so:

1|US;2|CA;3|MX

How can we query for all rows that match any one of several values (an OR across the encoded entries)?

For a single value, it's easy: just search for the string (in this case, an ISO 2 code), e.g. with a LIKE; a quick sketch follows.  But what if we need to search for the occurrence of one of n strings?
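
For reference, the single-value case is just a string search against the encoded column (a minimal sketch, assuming the @contacts demo table declared below):

SELECT
    id,
    contact
FROM @contacts
WHERE countries LIKE '%1|US%'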

The following query achieves the multi-value search using the SQL Server XML data type, the nodes() function, and CROSS APPLY:

-- Create data source; this is just for demo purposes
DECLARE @contacts TABLE
(
    id int,
    contact nvarchar(100),
    countries nvarchar(100) 
)

-- Insert sample test data.
INSERT INTO @contacts VALUES (1, 'Charles', '1|US;2|CA;3|MX') -- US, CA, MX
INSERT INTO @contacts VALUES (2, 'Steven', '1|US;3|MX;2|CA') -- US, MX, CA
INSERT INTO @contacts VALUES (3, 'Arturo', '3|MX') -- MX
INSERT INTO @contacts VALUES (4, 'Silvia', '4|FR') -- FR
INSERT INTO @contacts VALUES (5, 'John', '2|CA;1|US') -- CA, US
INSERT INTO @contacts VALUES (6, 'Johan', '5|DE') -- DE

-- Query for all contacts in US OR MX OR CA (Charles, Steven, Arturo, John)
SELECT
    DISTINCT T1.id,
    T1.contact
FROM (
    SELECT 
        id,
        contact,        
        CAST('<a><c>' + REPLACE(countries, ';','</c><c>') + '</c></a>' AS XML) AS countriesXml
    FROM @contacts  
    ) AS T1
CROSS APPLY T1.countriesXml.nodes('/a/c') T2(c)
WHERE CAST(T2.c.query('string(.)') AS varchar(max)) IN ('1|US', '3|MX', '2|CA')

This should yield Charles, Steven, Arturo, and John.  The query first converts the delimited values into XML using simple string replacement.  Next, the XML is "shredded" using nodes(): for each base row, the shredding generates one row per node (for example, for Charles, we get one row for US, one for CA, and one for MX).

Here is the result of the inner sub-select:

1	Charles	<a><c>1|US</c><c>2|CA</c><c>3|MX</c></a>
2	Steven	<a><c>1|US</c><c>3|MX</c><c>2|CA</c></a>
3	Arturo	<a><c>3|MX</c></a>
4	Silvia	<a><c>4|FR</c></a>
5	John	<a><c>2|CA</c><c>1|US</c></a>
6	Johan	<a><c>5|DE</c></a>

And here is the resultset after shredding:

1	Charles	1|US
1	Charles	2|CA
1	Charles	3|MX
2	Steven	1|US
2	Steven	3|MX
2	Steven	2|CA
3	Arturo	3|MX
4	Silvia	4|FR
5	John	2|CA
5	John	1|US
6	Johan	5|DE

You can see the intermediate resultset using this query:

SELECT
	T1.id,
	T1.contact,
	T2.c.query('string(.)')
FROM (
	SELECT 
		id,
		contact,		
		CAST('<a><c>' + REPLACE(countries, ';','</c><c>') + '</c></a>' AS XML) AS countriesXml
	FROM @contacts 	
	) AS T1
CROSS APPLY T1.countriesXml.nodes('/a/c') T2(c)

Finally, the DISTINCT clause collapses the resultset once again.

From my own testing, better performance can be achieved by creating a table variable with the target values and using a JOIN instead of an IN: for 100k records, the IN version takes about 13.430s while the JOIN against a table variable takes about 2.293s, roughly one fifth of the time.  A sketch of the JOIN approach follows.
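
For reference, here is roughly what the JOIN variant looks like (a sketch only; the @targets table variable name and its contents are illustrative):

-- Table variable holding the target values
DECLARE @targets TABLE (code nvarchar(100))

INSERT INTO @targets VALUES ('1|US')
INSERT INTO @targets VALUES ('3|MX')
INSERT INTO @targets VALUES ('2|CA')

-- Same shredding as before, but joined to @targets instead of using IN
SELECT
    DISTINCT T1.id,
    T1.contact
FROM (
    SELECT
        id,
        contact,
        CAST('<a><c>' + REPLACE(countries, ';','</c><c>') + '</c></a>' AS XML) AS countriesXml
    FROM @contacts
    ) AS T1
CROSS APPLY T1.countriesXml.nodes('/a/c') T2(c)
INNER JOIN @targets T3
    ON CAST(T2.c.query('string(.)') AS varchar(max)) = T3.code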

22 Jun 2011

FluentNHibernate vs. Code First in EF 4.1

Posted by Charles Chen

First, having been in the SharePoint space exclusively for so long now, I have to admit: it's been a while since I've had to deal with non-SharePoint data access.  I don't know if I miss it or not :-D I've become really comfortable with my little SharePoint ORM approach thingy that generates models and such from content types (before you say SPMetal, this was developed for 2007 and I think still works better than SPMetal).

In the past, I've mostly avoided Entity Framework (EF), preferring LINQ-to-SQL because of EF1's shortcomings: poorer performance, more obtuse generated queries, an overly complex API, the lack of POCO support, and my general disdain for the entity designer.  I've also spent some time with NHibernate and FluentNHibernate and found the combination more pleasant to work with as an ORM solution.

Only recently have I discovered the "code first" approach released with EF 4.1, which makes it much more appealing, in the same way that FNH made NH more appealing by doing away with hbm.xml mapping files.  So I decided to take it for a spin and see how it measures up to NH+FNH.

If you're interested in a much more in-depth (and partisan) debate on the merits of one or the other, there's plenty to go around.  I won't get into that :-) I'm just concerned with the basics for now, and I anticipate this being a series of blog posts as I test the merits -- and demerits -- of each.

For this first post, the basics are:

  • Create a simple console application that manages contacts
  • The application should auto-generate and auto-provision the schema
  • Basic queries should just work and not generate "dumb" SQL (e.g. counts should use COUNT in SQL and basic paging should happen in SQL).

First up: Entity Framework.

So let's jump right into the code:

using System;
using System.Data.Entity;
using System.Linq;

namespace EFTest
{
    public class Program
    {
        private static void Main(string[] args)
        {
            Database.SetInitializer<ContactContext>(
                new DropCreateDatabaseAlways<ContactContext>());

            using (ContactContext context = new ContactContext())
            {
                // Basic add
                Contact chuck = new Contact {FirstName = "Charles", LastName = "Chen"};
                Contact sandra = new Contact {FirstName = "Sandra", LastName = "Chen"};
                Contact charlotte = new Contact {FirstName = "Charlotte", LastName = "Chen"};
                Contact judy = new Contact {FirstName = "Judy", LastName = "Chen"};

                context.Contacts.Add(chuck);
                context.Contacts.Add(sandra);
                context.Contacts.Add(charlotte);
                context.Contacts.Add(judy);

                context.SaveChanges();

                // Basic paged read
                var query = from c in context.Contacts
                               select c;

                var results = query.OrderBy(c => c.FirstName).Skip(2).Take(2);

                foreach(Contact c in results)
                {
                    Console.Out.WriteLine(c);
                }

                // Basic count
                int total = context.Contacts.Count();

                Console.Out.WriteLine(total);
            }
        }
    }

    public class Contact
    {
        public int ContactId { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public override string ToString()
        {
            return string.Format("{0} {1}", FirstName, LastName);
        }
    }

    public class ContactContext : DbContext
    {
        public DbSet<Contact> Contacts { get; set; }
    }
}

I like that it's fairly compact and straightforward.  The development experience was a bit challenging, though.  First, EF code first doesn't like it when you point it at an existing database; it insists that you let it provision the database itself.  Okay, fine (though there are workarounds).  It's often thrown around in these debates that one of the benefits of EF is that it's "out-of-the-box," but in reality, at least with the code first bits, it's anything but: you have to download and install EF 4.1 first, just like you would with NH+FNH (though certainly, that may change in the future).

The walkthrough on the ADO.NET team blog is also broken: to get DbContext, you need to add a reference to EntityFramework.dll, not System.Data.Entity as posted in the blog.  Overall, I found the EF code to be more compact and easier to work with.  The one downside is that you have to keep updating the class inheriting from DbContext as new entities are added.

Second up: NH+FNH:

Now let's take a look at code that does the same thing in NH:

using System;
using System.Collections.Generic;
using FluentNHibernate.Automapping;
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using NHibernate;
using NHibernate.Cfg;
using NHibernate.Tool.hbm2ddl;

namespace FNHTest
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            ISessionFactory sessionFactory = CreateSessionFactory();

            using (ISession session = sessionFactory.OpenSession())
            using (ITransaction transaction = session.BeginTransaction())
            {
                // Basic add
                Contact chuck = new Contact {FirstName = "Charles", LastName = "Chen"};
                Contact sandra = new Contact {FirstName = "Sandra", LastName = "Chen"};
                Contact judy = new Contact {FirstName = "Judy", LastName = "Chen"};
                Contact charlotte = new Contact {FirstName = "Charlotte", LastName = "Chen"};

                session.SaveOrUpdate(chuck);
                session.SaveOrUpdate(sandra);
                session.SaveOrUpdate(judy);
                session.SaveOrUpdate(charlotte);

                transaction.Commit();

                // Basic paged read
                IList<Contact> results = session.QueryOver<Contact>()
                    .OrderBy(c => c.FirstName).Asc.Skip(2).Take(2).List();

                foreach (Contact c in results)
                {
                    Console.Out.WriteLine(c);
                }

                // Check the count
                int total = session.QueryOver<Contact>().RowCount();

                Console.Out.WriteLine(total);
            }
        }

        private static ISessionFactory CreateSessionFactory()
        {
            return Fluently.Configure()
                .Database(
                    MsSqlConfiguration.MsSql2008.ConnectionString("Data Source=CHARLES-E6410;Initial Catalog=FNHTest;Integrated Security=SSPI;Application Name='FNHTest'")
                )
                .Mappings(m => m.AutoMappings.Add(AutoMap.AssemblyOf<Program>()))
                .ExposeConfiguration(BuildSchema)
                .BuildSessionFactory();
        }

        private static void BuildSchema(Configuration config)
        {
            SchemaExport schema = new SchemaExport(config);

            schema.Drop(false, true); // Drops the tables only.
            schema.Create(false, true);
        }
    }

    public class Contact
    {
        public virtual int Id { get; set; }
        public virtual string FirstName { get; set; }
        public virtual string LastName { get; set; }

        public override string ToString()
        {
            return string.Format("{0} {1}", FirstName, LastName);
        }
    }
}

It's slightly more verbose, but not terribly so.  One note: unlike the ContactContext in EF, you don't need to continually update a "registry" with new entity types, since the automapping picks them up from the assembly.

For this basic scenario, it's hard to say I prefer one over the other, but I'd have to give the edge to EF so far simply for the intangibles (read: Microsoft supported - in other words, it'll be easier from a convincing-your-team-to-not-roll-your-own standpoint).

Comparing the SQL

Of course, the next step is to compare the SQL generated from each of these.

Let's take a look at each query:

/*EF*/
SELECT TOP (2)
[Extent1].[ContactId] AS [ContactId],
[Extent1].[FirstName] AS [FirstName],
[Extent1].[LastName] AS [LastName]
FROM (
    SELECT
        [Extent1].[ContactId] AS [ContactId],
        [Extent1].[FirstName] AS [FirstName],
        [Extent1].[LastName] AS [LastName],
        row_number() OVER (ORDER BY [Extent1].[FirstName] ASC) AS [row_number]
    FROM [dbo].[Contacts] AS [Extent1]
)  AS [Extent1]
WHERE [Extent1].[row_number] > 2
ORDER BY [Extent1].[FirstName] ASC
/*NH*/
exec sp_executesql
N'SELECT TOP (@p0)
    Id0_0_,
    FirstName0_0_,
    LastName0_0_
FROM
(
    SELECT
        this_.Id as Id0_0_,
        this_.FirstName as FirstName0_0_,
        this_.LastName as LastName0_0_,
        ROW_NUMBER() OVER(ORDER BY this_.FirstName) as __hibernate_sort_row
    FROM
        [Contact] this_
) as query
WHERE
    query.__hibernate_sort_row > @p1 ORDER BY query.__hibernate_sort_row',N'@p0 int,@p1 int',@p0=2,@p1=2

You can see that for the first query (the paged read), EF emits a plain SQL statement whereas NH issues a parameterized statement via sp_executesql.  For this small dataset, there is no discernible difference in performance, but I would gather that for larger workloads, NH's parameterized form should fare better since parameterized queries get better plan reuse.

For the second query, the count, we again see a small difference in how the two go about generating SQL:

/*EF*/
SELECT
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
	COUNT(1) AS [A1]
	FROM [dbo].[Contacts] AS [Extent1]
)  AS [GroupBy1]
/*FNH*/
SELECT count(*) as y0_ FROM [Contact] this_

As far as I can tell, for this basic scenario where there is no additional filtering on columns, there should be no practical performance difference between the two; EF generates an extra nested SELECT, but the real-world impact of that is negligible.

Next up: I want to run some more tests with more complex queries and figure out how each of these frameworks handles schema changes.  For these basic scenarios it seems like a wash and a matter of personal preference.

Filed under: .Net, Dev, SQL Server
5 Jun 2009

T-SQL MERGE: My New Friend

Posted by Charles Chen

As I was catching up on SQL 2008 this morning, I spent some time working with the T-SQL MERGE command, new in SQL 2008. Most of the blogs and examples online are a bit terse, so I decided to write a simple sample.

Check it out:

--// Create our SOURCE table
DECLARE @t1 TABLE (
    id      int,
    name    nvarchar(16),
    email   nvarchar(32)
)
 

--// Create our TARGET table
DECLARE @t2 TABLE (
    id      int,
    name    nvarchar(16),
    email   nvarchar(32)
)

--// Insert values into SOURCE table
INSERT INTO @t1 VALUES (1, 'Charles Chen', 'charles.chen@domain.com')
INSERT INTO @t1 VALUES (2, 'Sandra Chen', 'sandra.chen@domain.com')
INSERT INTO @t1 VALUES (3, 'Brad Wright', 'brad.wright@domain.com')

--// Insert values into TARGET table
INSERT INTO @t2 VALUES (4, 'Brady Sines', 'brady.sines@domain.com')
INSERT INTO @t2 VALUES (1, '', '')

--// Select values from the TARGET table (just to verify)
SELECT * FROM @t2

--// Merge records from source to target
MERGE
    @t2 AS [target]
USING
    @t1 AS [source]
ON
    ([target].id = [source].id)
WHEN NOT MATCHED BY TARGET
    THEN
        INSERT(id, name, email)
        VALUES([source].id, [source].name, [source].email)
WHEN MATCHED
    THEN
        UPDATE SET
            [target].name = [source].name,
            [target].email = [source].email
--// Get the journal of actions
OUTPUT $action, inserted.*, deleted.*;

--// Grab the resultant table
SELECT * FROM @t2

And here are the results: the OUTPUT journal reports an UPDATE for id 1 (the empty name and email are filled in from the source) and INSERTs for ids 2 and 3, while the final SELECT shows the target table with all four contacts.

Awesome! Admittedly, I'm still trying to come up with some legitimate uses for this in my day-to-day application programming :-D

Filed under: SQL Server
24 Apr 2008

FOR XML Needs More Love

Posted by Charles Chen

I'm constantly amazed by the number of developers who have never worked with FOR XML EXPLICIT or the newer FOR XML PATH.  If I were designing data access, they would be my go-to tools for building queries that return complex data structures (nested DataReaders?  yuck!).


In the past, supporting paging with FOR XML EXPLICIT queries took tons of lines to accomplish (although there is something about all that explicitness that makes it surprisingly legible).  Now, with the fancy-pants ROW_NUMBER function along with CTEs, a hundred-line query can be written in maybe 15-20 lines.


Here's a simple example that you can copy+paste and run:


/* 
Demonstrates usage of ROW_NUMBER and FOR XML PATH to create
pageable XML results queries.

In this case, the key is to page only on the Route objects.
*/

--// Mock Route table
DECLARE @Route TABLE (
    Id int,
    Title varchar(100)
)

--// Mock Step table
DECLARE @Step TABLE (
    Id int,
    RouteId int,
    Title varchar(100),
    Sequence int
)

--// Insert mock data
INSERT INTO @Route VALUES (1, 'Route 1')
INSERT INTO @Route VALUES (2, 'Route 2')
INSERT INTO @Route VALUES (3, 'Route 3')
INSERT INTO @Route VALUES (4, 'Route 4')
INSERT INTO @Route VALUES (5, 'Route 5')

--// Route 1 Steps
INSERT INTO @Step VALUES (1, 1, 'Step 1.1', 1)
INSERT INTO @Step VALUES (2, 1, 'Step 1.2', 2)
INSERT INTO @Step VALUES (3, 1, 'Step 1.3', 3)
INSERT INTO @Step VALUES (4, 1, 'Step 1.4', 4)
INSERT INTO @Step VALUES (5, 1, 'Step 1.5', 5)

--// Route 2 Steps
INSERT INTO @Step VALUES (6, 2, 'Step 2.1', 1)
INSERT INTO @Step VALUES (7, 2, 'Step 2.2', 2)
INSERT INTO @Step VALUES (8, 2, 'Step 2.3', 3)

--// Route 3 Steps
INSERT INTO @Step VALUES (9, 3, 'Step 3.1', 1)

--// Route 4 Steps
INSERT INTO @Step VALUES (10, 4, 'Step 4.1', 1)

--// Route 5 Steps
INSERT INTO @Step VALUES (11, 5, 'Step 5.1', 1)
INSERT INTO @Step VALUES (12, 5, 'Step 5.2', 2)
INSERT INTO @Step VALUES (13, 5, 'Step 5.3', 3)
INSERT INTO @Step VALUES (14, 5, 'Step 5.4', 4)

/*
Define the page size.
-- Add sorting and ordering later
*/

DECLARE @PageSize int
DECLARE @CurrentPage int

SET @CurrentPage = 0
SET @PageSize = 3

/*
Calculate starting and ending row.
*/
DECLARE @StartIndex int
DECLARE @EndIndex int

SET @StartIndex = @CurrentPage * @PageSize
SET @EndIndex = @StartIndex + @PageSize

; --// The preceding statement must be terminated with a semicolon before a CTE
/*
Perform core XML select
*/

WITH Routes AS (
    SELECT
        *,
        ROW_NUMBER() OVER (ORDER BY Id) AS 'RowNumber'
    FROM
        @Route
)
SELECT
    Routes.Id AS "@Id",
    Routes.Title AS "@Title",
    (
        SELECT
            Step.Id AS '@Id',
            Step.Title AS '@Title'
        FROM
            @Step AS Step
        WHERE
            Step.RouteId = Routes.Id
        ORDER BY
            Step.Sequence ASC
        FOR XML PATH('Step'), TYPE
    ) AS 'Steps'
FROM
    Routes
WHERE
    RowNumber > @StartIndex AND RowNumber <= @EndIndex --// BETWEEN results in improper paging
FOR XML PATH('Route'), ROOT('Routes')


What's great about this is that if your object model is properly designed, it's just a matter of deserializing the XML (using precompiled serialization binaries, of course) to rehydrate your data model.


In this case, the output XML looks like this:


<Routes>
  <Route Id="1" Title="Route 1">
    <Steps>
      <Step Id="1" Title="Step 1.1" />
      <Step Id="2" Title="Step 1.2" />
      <Step Id="3" Title="Step 1.3" />
      <Step Id="4" Title="Step 1.4" />
      <Step Id="5" Title="Step 1.5" />
    </Steps>
  </Route>
  <Route Id="2" Title="Route 2">
    <Steps>
      <Step Id="6" Title="Step 2.1" />
      <Step Id="7" Title="Step 2.2" />
      <Step Id="8" Title="Step 2.3" />
    </Steps>
  </Route>
  <Route Id="3" Title="Route 3">
    <Steps>
      <Step Id="9" Title="Step 3.1" />
    </Steps>
  </Route>
</Routes>

Next, you'll need some simple code to deserialize the XML to make it useful:


using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.IO;
using System.Xml.Serialization;

namespace XmlDeserialization {
    internal class Program {
        private static void Main(string[] args) {
            string xml = ...; // XML string here

            RouteList routes = new RouteList(xml);

            Console.Out.WriteLine(routes.Count);

            foreach (Route route in routes) {
                Console.Out.WriteLine(" + Route: {0}", route.Id);

                foreach (Step step in route.Steps) {
                    Console.Out.WriteLine(" + Step: {0}", step.Id);
                }
            }
        }
    }

    public abstract class XmlDeserializingList<Titem> : List<Titem> {
        protected XmlDeserializingList() { }

        protected XmlDeserializingList(string xml) {
            StringReader reader = new StringReader(xml);

            XmlSerializer serializer = new XmlSerializer(GetType());
            XmlDeserializingList<Titem> items = (XmlDeserializingList<Titem>) serializer.Deserialize(reader);

            AddRange(items);
        }
    }

    [XmlRoot("Routes")]
    public class RouteList : XmlDeserializingList<Route> {
        public RouteList() { }

        public RouteList(string xml) : base(xml) { }
    }

    [Serializable]
    public class Route {
        private int id;
        private string title;
        private Collection<Step> steps;

        [XmlAttribute]
        public int Id {
            get { return id; }
            set { id = value; }
        }

        [XmlAttribute]
        public string Title {
            get { return title; }
            set { title = value; }
        }

        [XmlArray("Steps"), XmlArrayItem("Step", typeof(Step))]
        public Collection<Step> Steps {
            get { return steps; }
            set { steps = value; }
        }
    }

    [Serializable]
    public class Step {
        private int id;
        private string title;

        [XmlAttribute]
        public int Id {
            get { return id; }
            set { id = value; }
        }

        [XmlAttribute]
        public string Title {
            get { return title; }
            set { title = value; }
        }
    }
}

It might even be useful to add some abstract methods (and properties to support them) to the list class, such as a GetNextPage() that loads and deserializes the next page of results.

Filed under: SQL Server
27 Sep 2007

Dynamic SQL: Yea or Nay?

Posted by Charles Chen

I've always been on the side of stored procedures in the classic debate over the merits of dynamic SQL.  In reality, I can only think of one good scenario where dynamic SQL at the application layer should be used: programmatic batch inserts.

I won't go into the performance debate, since there are tons of articles that already cover that ground; rather, I'd like to discuss the usability, development, and architectural aspects of it.

In almost all other cases, the best choice seems to be to have the application avoid generating dynamic SQL and always use a stored procedure.  There are certainly times when dynamic SQL is necessary, for example when generating selects against a dynamic table structure, but in those cases the variable portions of the query can be passed as parameters and the stored procedure itself can generate the dynamic SQL.

Those who argue that the application layer will usually be forced to change anyway when the underlying data model changes are ignoring the kinds of model changes that don't necessitate application changes.  These include performance tuning, filtering via table JOINs, and reuse of the data logic in nested stored procedures or functions.

When working with compiled code like .NET, the core issue is that fixing query errors requires a recompile and redeploy, which in most cases is much more difficult than fixing a stored procedure that is completely disconnected from the application (though not completely decoupled, since the returned result type and structure form a quasi-interface).

For example, if a dataset today contains data from table A and tomorrow it needs to include data from table A and B (let's say they both contain the same elements, but one is used for archives), it would be easy to update the procedure to UNION the results from the two datasets without affecting the application layer.

This isn't the only scenario.  For example, say the requirement changes and the data now needs to be filtered by another table: it would be easy to add a new INNER JOIN to the query without affecting the application layer.  It also allows for the recombination of fields (a user name field may only need to show first and last name today, but tomorrow it may need to show the middle initial as well; that change can be made at the database level without touching the application or UI layers).  And it makes it easier to change the underlying table structure so long as the returned data isn't expected to change: the procedure provides a layer of decoupling between the application layer and the raw data storage.  A sketch of the archive scenario follows below.
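
To make that concrete, here is a rough sketch of the archive scenario (the procedure and table names here are hypothetical); the application keeps calling the same procedure and reading the same columns, and only the procedure body changes:

ALTER PROCEDURE dbo.GetAuditEntriesByUser
    @UserLogin nvarchar(128)
AS
BEGIN
    -- Current data
    SELECT Id, EventName, EventUtc, UserLogin
    FROM dbo.AuditLog
    WHERE UserLogin = @UserLogin

    UNION ALL

    -- Archive table added later; the application layer is untouched
    SELECT Id, EventName, EventUtc, UserLogin
    FROM dbo.AuditLogArchive
    WHERE UserLogin = @UserLogin
END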

In addition, a stored procedure allows for easier testing of the data layer without the overhead of running the application and walking through the debugger line by line just to figure out whether the returned data is correct; it is much more efficient to simply execute the query and simulate the use case directly.  Data access tests become much easier and much less painful since they can be run, observed, and analyzed nearly instantly.

In larger organizations with dedicated DBAs, stored procedures have the added benefit of letting SQL experts tune queries to eke out extra performance without requiring the application to be rewritten or recompiled.  Again, we see this decoupling of the application layer from the data layer.  Of course, you could always keep templated SQL in XML files or something to avoid the recompile, but that is still likely to require redeployment if the application in question is distributed.  This point is not to be taken lightly: an error in string formatting, for example, may require replacing binaries and services across dozens of servers.  Not only that, testing in such a scenario still requires interaction with the application layer, adding to the possible failure points, the time required, and the general development pain.

My own conclusion is that dynamic SQL at the application layer (including LINQ) creates too tight a coupling between the application binaries and the underlying data store.  It's great for RAD and testing, but in any application of significance (especially in highly distributed environments), it seems like a maintenance and testing disaster waiting to happen.

Filed under: Dev, Rants, SQL Server
21 Aug 2007

Working With SQL Server Compact Edition 2005

Posted by Charles Chen

One interesting issue that I just solved involved how to specify the location of the database file for a SQL Server Compact Edition 2005 connection string in a .Net add-in for Microsoft Office.

You see, when the add-in starts, the current directory is set to the user's documents directory, which of course makes it impossible to use a relative path for the data source in the connection string.

It works fine if the directory is hard coded, which is what I did initially for testing purposes, but when I switched over to XP64, this broke because on XP64 the program is installed to "Program Files (x86)".

The solution lies buried in Microsoft's SQL CE documentation: there's a note that you should use a special token with the connection string like so:

<connectionStrings>
	<add name="ClientDatabase" 
		connectionString="Data Source=|DataDirectory|\data-file.sdf"
		providerName="Microsoft.SqlServerCe.Client" />
</connectionStrings>

The token needs to be included exactly as entered: "|DataDirectory|". So how does this token get replaced? In the static constructor of my Connect class (autogenerated by Visual Studio), I added the following code:

/// <summary>
/// Static constructor for the <see cref="Connect"/> class; sets the
/// DataDirectory used to resolve the SQL Server CE connection string.
/// </summary>
static Connect() {
	string path = Assembly.GetExecutingAssembly().Location;
	path = path.Substring(0, path.LastIndexOf('\\'));
 

	// Set the DataDirectory for the SQL Server CE connection string.
	AppDomain domain = AppDomain.CurrentDomain;
	domain.SetData("DataDirectory", path);
}

Filed under: .Net, Dev, SQL Server
17 Apr 2007

Updating Reporting Services 2000 Connection Info

Posted by Charles Chen

If you're running Reporting Services 2000 and you accidentally change the password or remove the original account used by Reporting Services, you will have to update this information.


With Reporting Services 2005, there's a nice GUI utility that you can use to do this, but for Reporting Services 2000, you'll need the rsconfig.exe command line utility which can be found, in a typical installation, at: <drive>:\Program Files\Microsoft SQL Server\80\Tools\Binn


To update the account or password information, the following command will probably do:


rsconfig -c -s servername -d ReportServer -a sql -u username -p password -t


Useful to know...odd that with all of the RS development I've done in the past, I've never had to use this utility once until today.

Filed under: SQL Server
8 Jan 2007

ContentTypeIds In WSS3

Posted by Charles Chen

In WSS3, if you execute the following SQL:

    SELECT
        ContentTypeId
    FROM 
        ContentTypes

You will notice that the content types are represented in the output as hex. If you take a look at the table definition, you'll see that the actual data type of the column is VARBINARY(512).

Doing a lookup like so:

    SELECT
        ContentTypeId
    FROM 
        ContentTypes
    WHERE 
        ContentTypeId = '0x101'

Will not work since you cannot perform a comparison between VARBINARY and a character data type directly.

Doing the following will also not work:

    SELECT
        ContentTypeId
    FROM 
        ContentTypes
    WHERE 
        CAST(ContentTypeId AS VARCHAR(512)) = '0x101'

This doesn't work because the underlying type of the binary data isn't character data. It's integer data. You can confirm this by running the following query:

    SELECT
        ContentTypeId,
        CAST(ContentTypeId AS VARCHAR(512))
    FROM 
        ContentTypes

You'll see that it's just a bunch of gibberish. Try the same query with INT and you'll see that the data makes much more sense. What you'll notice is that content types that inherit from a base content type will have numerical values that increment by 1.
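
Spelled out, that is the same query with a different cast target:

    SELECT
        ContentTypeId,
        CAST(ContentTypeId AS INT)
    FROM 
        ContentTypes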

This information is useful, but not nearly as useful as what you can glean from the hex string representation of the ContentTypeId. You see, in the hex string representation, the base ID is a prefix of the ID of any inheriting content type. For example, if I have a content type with an ID (as a hex string) of 0x0101345346345312234346, then any child content types will have 0x0101345346345312234346 as a prefix (e.g. 0x010134534634531223434601, 0x010134534634531223434602).

So how do we get this data in SQL Server for comparison purposes? We need to use an "undocumented" SQL function: fn_varbintohexstr().

This allows you to do nifty queries to find a given content type and all child content types (or any query where you have to retrieve information about a hierarchy of content types) like so:

    SELECT
        *
    FROM 
        ContentTypes
    WHERE 
        master.dbo.fn_varbintohexstr(ContentTypeId) LIKE 
            '0x0101345346345312234346%'

You can find out more information on this function here.

18 Nov 2005

Fun in Philly

Posted by Charles Chen

Wow, I had a lot more fun in Philly than I thought I would have.


Back up a bit.  I had originally registered for the Microsoft launch event in Philly as a participant a few weeks back.  Last week, Rich, my new manager, got four of us together, the only four who had really played around with SQL Server 2005, and asked if we could come up with a demo since, apparently, we had a presence at the show.


Long story short, I ended up taking up part of the responsibility of building the demo and also working the booth at the show.  Somehow, we ended up building a database mirroring demo which featured an ASP.Net 2.0 and Atlas frontend.


I also ended up with the responsibility of creating some posters and datasheets that we could hand out along with our demo.  Keep in mind that all of this came up last Thursday.  Having technical and graphical skills is both a curse and a blessing I guess.


Well, in any case, the show was a huge success.  We had people lining up to look at the demo and to talk to Igor and myself.  For an app that was put together in 10-12 hours, it was surprisingly stable.  We only had one major issue and that was a networking issue (dropped IP addresses).  It was so good that we had people coming back to the booth after seeing the datasheet from other people.  We had people coming back with their friends because they were so impressed.  It was quite amazing.  We ran out of the datasheets for the demo in the first 90 minutes of the show (maybe the first 60).  Fortunately, we also had it in poster form (which we didn't give away).  It drew quite a bit of attention and there were a lot of people who just stopped in their tracks to check it out.  The funny story regarding that is I almost didn't have the poster printed.  I was originally only going to print 8.5x11 copies, but Frank, another consultant, suggested that I go with the poster.




[Poster image: poster_web.gif]


Believe it or not, I put it together in about 3-4 hours Tuesday night after I got back from a client site.  Everything on this sheet was created from scratch (except for the computer graphic, which came from Visio).  I swear I was half asleep when I put it all together :-D


I actually heard a lot of comments on the poster, including one from a professor who said she really loved the layout and colors.  Even my wife was impressed that I put this together when I did and in the amount of time that I did.


So all in all, a fun but tiring day.  Man, I just love building cool stuff and I love connecting with other developers.  Hopefully, I'll have more opportunities like this in the future.