<CharlieDigital/> Programming, Politics, and uhh…pineapples

30 Sep 2014

Thoughts on Burnout

Posted by Charles Chen

I was reading an NPR piece on worker burnout and the different tactics companies use to deal with it, and came across a very nice, concise definition:

Christina Maslach is a professor at the University of California, Berkeley, whose four decades of research on the subject helped popularize the term "burnout." Maslach says it's a loose term that encompasses a combination of work overload, lack of autonomy and reward and social and moral discord at work.

This sentence concisely summarizes the key drivers of burnout: the factors at play are not as simple as "too much work".

The article also brings up an interesting observation (well, it's just the next few paragraphs):

Most burnout stems from interpersonal strife, but most employers see the solution as time off, she says.

If companies really want to know what's causing burnout in their workplace, Maslach says, they shouldn't just mandate more time off. They should assess the core problem, then design solutions to mitigate those issues.

"When it's time off, I mean, that might be time away from work," Maslach says. "Maybe you're addressing issues of exhaustion, but it's not really addressing what may be the problems at work."

Ultimately, a company, a project, a product -- it is the effort of many individual humans who must come together to fulfill a common goal.  And when humans are involved, conflict is sure to arise.  Obviously, you can still get things done when not all of your parts are in harmony, but isn't it much more enjoyable when they are?

I hardly consider myself an expert, but in my own experience, I've found that it's a good idea to reinforce the relationships between the people on the team through team activities.  A common one is eating together or occasionally taking the whole office out to lunch or dinner.  It is especially important for management to be involved because it shows that employees are valued as people and not just as fungible parts of a machine.

Andrew Fitzgerald comments in that NPR article:

One day I got called into the boss's office. I was thinking to myself, "Shoot! What does this guy have on me now?" They called me in just to tell me that they thought I was doing a good job and that they appreciate my work ethic. I didn't make a lot of money. The work was kind of tedious and repetitive but I could not tell you how good that made me feel. A little positive feedback from the higher ups goes a long way.

At IC, the development team is in a unique position because all of us work remotely and travel to Irvine.  So we end up spending quite a bit of time together eating meals, going to the shooting range, and kayaking on the weekends, and I'm planning on taking the team to an indoor climbing facility as well (I try to keep things fresh).  I also try to make sure that everyone is taken care of; there is nothing I won't do, whether it's picking up lunch, driving a co-worker to the train station, or picking up fruit for everyone to share.  Not just because I manage them, but because I like and respect these guys as people first and foremost.

Even at a basic level, we sit together in the office and chit-chat from time to time about random things and watch random videos after we've been hacking away for 8 or 9 hours. When we are on site, not one member of the development team leaves before the others.  Not because anyone is forced to stay, and not because we have some unspoken code about it or would shame anyone who left, but because I think we all feel that we are in this together and that we truly have a common goal to achieve as a team.

And that is an important point, in my opinion, because too often, leaders fail by not aligning all of the cogs of the machinery toward a common goal.  Most of the time, that alignment simply requires clear and open communication about expectations, company goals, and the priorities of the company or the team.

Like a train with two engines pulling in opposite directions, a failure by team leads to align the members of a team to a goal, or a failure by management to communicate expectations and priorities, seems to lead to inaction, indecision, and conflict as team members pull against each other.  Ultimately, this just helps feed worker burnout.

4 Apr 2014

FluentNHibernate and SQL Date Generation

Posted by Charles Chen

So you'd like your SQL entry to have a system generated date/time, eh?

Here is a sample table:

CREATE TABLE [dbo].[AuditLog] (
	Id int IDENTITY(1,1) PRIMARY KEY,
	EventUtc datetime2(7) DEFAULT(SYSUTCDATETIME()) NOT NULL,
	EventOffsetUtc datetimeoffset(7) DEFAULT(SYSDATETIMEOFFSET()) NOT NULL,
	EntityContextUid uniqueidentifier,
	EntityContextName nvarchar(256),
	EntityContextType varchar(128),
	UserLogin nvarchar(128),
	EventName varchar(128),
	AppContext varchar(64),
	EventData nvarchar(max)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

To spare you hours dealing with this error:

System.Data.SqlTypes.SqlTypeException:
   SqlDateTime overflow. Must be between
   1/1/1753 12:00:00 AM and 12/31/9999
   11:59:59 PM.

What you need to do is to use the following mapping for your date/time columns:

Map(a => a.EventUtc).Column("EventUtc")
	.CustomSqlType("datetime2(7)")
	.Not.Nullable()
	.Default("SYSUTCDATETIME()")
	.Generated.Insert();
Map(a => a.EventOffsetUtc).Column("EventOffsetUtc")
	.CustomSqlType("datetimeoffset(7)")
	.Not.Nullable()
	.Default("SYSDATETIMEOFFSET()")
	.Generated.Insert();

Special thanks to this Stack Overflow thread.
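
For context, here is a rough sketch of how those mappings might sit in a complete entity and mapping class. Only the two date mappings above come from the actual code; the class name and the other properties are placeholders. The key piece is Generated.Insert(), which tells NHibernate the value is generated by the database on insert, so it never tries to write an uninitialized .NET DateTime (year 0001), which is what triggers the SqlDateTime overflow.

using System;
using FluentNHibernate.Mapping;

// Hypothetical entity; the property names match the mappings above.
public class AuditLog
{
    public virtual int Id { get; set; }
    public virtual DateTime EventUtc { get; set; }
    public virtual DateTimeOffset EventOffsetUtc { get; set; }
    public virtual string EventName { get; set; }
    public virtual string EventData { get; set; }
}

public class AuditLogMap : ClassMap<AuditLog>
{
    public AuditLogMap()
    {
        Table("AuditLog");
        Id(a => a.Id).GeneratedBy.Identity();

        // Generated.Insert() excludes the column from the INSERT and reads the
        // database-generated value back afterwards, so NHibernate never sends
        // DateTime.MinValue to SQL Server.
        Map(a => a.EventUtc).Column("EventUtc")
            .CustomSqlType("datetime2(7)")
            .Not.Nullable()
            .Default("SYSUTCDATETIME()")
            .Generated.Insert();
        Map(a => a.EventOffsetUtc).Column("EventOffsetUtc")
            .CustomSqlType("datetimeoffset(7)")
            .Not.Nullable()
            .Default("SYSDATETIMEOFFSET()")
            .Generated.Insert();

        Map(a => a.EventName).Column("EventName");
        Map(a => a.EventData).Column("EventData");
    }
}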

2 Apr 2014

What Alan Watts Can Teach Us About Leadership

Posted by Charles Chen

I was listening to a talk by Alan Watts and found one bit of advice that really connected to what I've learned about leading others.

The principle is that any time you -- as it were -- voluntarily let up control.

In other words, cease to cling to yourself; you have an excess of power because you are wasting energy all the time in self defense.

Trying to manage things, trying to force things to conform to your will.

The moment you stop doing that, that wasted energy is available.

Therefore you are in that sense -- having that energy available --  you are one with the divine principle; you have the energy.

When you are trying, however, to act as if you were god, that is to say you don't trust anybody and you are the dictator and you have to keep everybody in line you lose the divine energy because what you are simply doing is defending yourself.

One mistake that I've been guilty of is trying to force things to conform to my will on various projects (I still do it to varying degrees!).  It is usually with the best of intentions -- for a cleaner framework, a better product, a more efficient process -- but at the same time, a lot of energy is wasted in doing so.

What is the alternative, then?

I think Watts is right that a level of trust has to exist that the team around you can help you achieve your project goals.  Instead of expending the energy in controlling the members of the team, spend the energy in building that trust through training, mentorship, guidance, and giving up not just control, but responsibility.

Sometimes that trust will be unwarranted, but sometimes, that trust will pay itself back many-fold.

26 Dec 2013

Setting Batch Variables Dynamically

Posted by Charles Chen

One of the challenges of managing builds and deployments across multiple developer machines is containing and managing the variations in developer environments.

For example, it is often useful to parameterize the server name or the connection string so that local settings don't make it into source control.

How I've often tackled this is to add a batch file that every developer executes when getting the source for the first time.  This batch file asks for various settings and then saves the results to a text file which is then read back when executing other batch files.

Here is an example of such a batch file:

@ECHO OFF

ECHO ================================================================
ECHO Creates the developer specific local configuration for builds
ECHO that will allow developers to use local settings for specific
ECHO aspects of the deployment.
ECHO ================================================================

:SPURL
SET SPURL=
SET /P SPURL=Enter SharePoint site URL (ex. http://mymachine:2345/): %=%
IF "%SPURL%"=="" GOTO SPURL

ECHO ----------------------------------------------------------------

:SQLCONN
SET SQLCONN=
ECHO Enter SQL connection string to the IC membership database 
ECHO (ex. Server=VM1;Database=IPDB;User ID=membershipUser;Password=P@ssw0rd;Application Name=HeCoreServices):
SET /P SQLCONN=
IF "%SQLCONN%"=="" GOTO SQLCONN

ECHO ----------------------------------------------------------------

ECHO SPURL=%SPURL% > build-configuration.txt
ECHO SQLCONN=%SQLCONN% >> build-configuration.txt

ECHO Completed; created file build-configuration.txt

PAUSE

This batch file will prompt the developer for two settings: the URL of a site and the connection string that the developer is using locally (which can vary by the database name, login, etc.).  The contents get written to a file called build-configuration.txt that looks like this:

SPURL=http://mymachine:2345
SQLCONN=Server=VM1;Database=IPDB;User ID=membershipUser;Password=P@ssw0rd;Application Name=HeCoreServices

This file is excluded from source control and developers can, of course, manually edit this file as well to create local settings.

Now when I'm ready to use these settings in another batch file, I can read them back in like so:

@ECHO OFF

SETLOCAL ENABLEDELAYEDEXPANSION

FOR /F "tokens=*" %%n IN (build-configuration.txt) DO (	
	ECHO %%n

	SET %%n
)

ECHO %SPURL%
ECHO %SQLCONN%

PAUSE

There are other ways to do this as well, but the downside to most approaches is that you have to know how many parameters you have or use less meaningful names.  This approach will let you set variables to your heart's content and read them in dynamically at execution.

5 Dec 2013

A Simple Way to Improve CAML Query Performance

Posted by Charles Chen

There are many ways to improve the performance of your CAML queries, but I've recently found that in some cases, it's as easy as switching the order of your filter operations.

In this case, I was searching across a list of 1,000,000 items for a set of 41.

The list consists of tasks with, among other fields, a Status and Assigned To field.

Both of these fields were indexed, but the following query was still running in the 10 second range:

<Where>
    <And>
        <And>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Completed</Value>
            </Neq>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Terminated</Value>
            </Neq>
        </And>                                             
        <Eq>
            <FieldRef Name='AssignedTo' LookupId='TRUE' />
            <Value Type='Int'>159</Value>
        </Eq>
    </And>
</Where>

One small tweak and the same query ran in 1.5s:

<Where>
    <And>
        <Eq>
            <FieldRef Name='AssignedTo' LookupId='TRUE' />
            <Value Type='Int'>159</Value>
        </Eq>	
        <And>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Completed</Value>
            </Neq>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Terminated</Value>
            </Neq>
        </And>                                             
    </And>
</Where>

All that was done was to shift the order of the query conditions.

The first query reads as "All tasks that are not Completed and not Terminated and Assigned To user 159".

The second query reads as "All tasks that are Assigned To user 159 that are not Completed and not Terminated".

I didn't trace the generated SQL, but it's not hard to imagine that the SQL now performs an initial filter on the data set against the user ID and returns a much smaller data set for subsequent operations (filter by Status).

So the lesson learned is that for large lists, you should follow Microsoft's guidance, but also ensure that your queries are written to take advantage of the indexes and reduce the data set as early as possible (preferably by filtering on an indexed field first).

26 Nov 2013

Preventing the Garbage Collector From Ruining Your Day

Posted by Charles Chen

If you're working with ZeroMQ, you may run into an exception with the message "Context was terminated".

It turns out that this is due to the garbage collector cleaning up (or attempting to clean up?) the ZmqContext.

I found this out via a handy Stack Overflow thread, but what about cases where you can't use a using statement?

For example, in a Windows Service, I create the context on the OnStart method and destroy the context on the OnStop method.

In this case, an alternative is to use the GC.KeepAlive(Object obj) method, which prevents the garbage collector from collecting the object until after the call to this method.  The name seems counterintuitive: it is really a signal telling the garbage collector that the object must be treated as live up to this point and may be collected at any point after this call.
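
Here is a rough sketch of what that looks like in a Windows Service. The service class and worker details are hypothetical, and ZmqContext.Create() assumes the clrzmq bindings; adjust for your version of the library.

using System;
using System.ServiceProcess;
using ZeroMQ; // ZmqContext from the clrzmq bindings

public class MessagingService : ServiceBase
{
    private ZmqContext _context;

    protected override void OnStart(string[] args)
    {
        _context = ZmqContext.Create();

        // ... create sockets from _context and start worker threads ...
    }

    protected override void OnStop()
    {
        // ... stop workers and dispose any open sockets ...

        _context.Dispose();

        // The context is guaranteed to be treated as reachable up to this
        // call; it only becomes eligible for collection after this point.
        GC.KeepAlive(_context);
    }
}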

16 Oct 2013

Watch Out For SPListItemCollection.Count and Judicious Use of RowLimit

Posted by Charles Chen

This seemingly innocuous call can be quite dangerous when used incorrectly.

The reason is that this property invocation actually executes the query.

This is OK if you plan on iterating the results because the results are cached, but costly if you don't plan on iterating the results.  The following code sample can be used to test this effect for yourself:

static void Main(string[] args)
{
    using(SPSite site = new SPSite("http://internal.dev.com/sites/oncology"))
    using (SPWeb web = site.OpenWeb())
    {
        SPList list = web.Lists.TryGetList("General Tasks");

        SPQuery query = new SPQuery();
        query.RowLimit = 1;
        query.Query = @"
<Where>
<Contains>
<FieldRef Name='Title'/>
<Value Type='Text'>500KB_1x100_Type_I_R1</Value>
</Contains>
</Where>";
        query.QueryThrottleMode = SPQueryThrottleOption.Override;

        SPListItemCollection items = list.GetItems(query);

        Stopwatch timer = new Stopwatch();

        timer.Start();

        Console.Out.WriteLine("{0} items match the criteria.", items.Count);

        var timeForCount = timer.ElapsedMilliseconds;

        Console.Out.WriteLine("{0} milliseconds elapsed for count.", timer.ElapsedMilliseconds);

        foreach (var i in items)
        {
            Console.Out.WriteLine("{0} milliseconds elapsed for start of iteration.", timer.ElapsedMilliseconds - timeForCount);

            break;
        }
    }
}

(And of course, you can check the implementation of Count in Reflector or dotPeek)

You will see that the start of iteration will be very fast once you've invoked Count once.

Now here is where it gets interesting:

  1. The total time it takes to execute the query is longer when invoking Count versus just iterating (~3200ms vs. ~3000ms, about 5-10% in my tests).
  2. When I set the RowLimit to 1, I can reduce the time by roughly 40-50% (~1600ms vs ~3200ms for a resultset of 230 out of a list of 150,000 items).

Try it yourself by commenting and uncommenting the RowLimit line and commenting and uncommenting the line that invokes Count.

What does this mean for you?  Well, if you don't need the count, then don't use it; it's slower than just iterating the results.  If you plan on iterating the results anyway, don't invoke Count.  And if you do need the count, you are better off keeping a counter yourself during the iteration.

And in a use case where you don't plan on iterating the result set (for example, checking to see if there is at least one occurrence of a type of object), be sure to set the RowLimit in your query!
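
To make that last point concrete, here is a minimal sketch of an existence check using the same list and filter as the sample above; it sets RowLimit to 1 and iterates instead of touching Count:

SPQuery existsQuery = new SPQuery();
existsQuery.RowLimit = 1; // we only care whether at least one item matches
existsQuery.Query = @"
<Where>
<Contains>
<FieldRef Name='Title'/>
<Value Type='Text'>500KB_1x100_Type_I_R1</Value>
</Contains>
</Where>";

bool hasMatch = false;

// Iterate instead of calling Count; with RowLimit = 1, at most one row comes back.
foreach (SPListItem item in list.GetItems(existsQuery))
{
    hasMatch = true;
    break;
}

Console.Out.WriteLine("At least one match: {0}", hasMatch);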

26 Jul 2013

SQL Query for Multi-Values In An Encoded String

Posted by Charles Chen

Consider a table with a text column that encodes multiple values like so:

1|US;2|CA;3|MX

How can we query for all rows using OR criteria?

For a single value, it's quite easy by searching for the string (in this case, an ISO 2 code).  But what if we need to search for the occurrence of one of n strings?

The following query achieves this using the Microsoft SQL XML data type, the nodes() function, and CROSS APPLY:

-- Create data source; this is just for demo purposes
DECLARE @contacts TABLE
(
    id int,
    contact nvarchar(100),
    countries nvarchar(100) 
)

-- Insert sample test data.
INSERT INTO @contacts VALUES (1, 'Charles', '1|US;2|CA;3|MX') -- US, CA, MX
INSERT INTO @contacts VALUES (2, 'Steven', '1|US;3|MX;2|CA') -- US, MX, CA
INSERT INTO @contacts VALUES (3, 'Arturo', '3|MX') -- MX
INSERT INTO @contacts VALUES (4, 'Silvia', '4|FR') -- FR
INSERT INTO @contacts VALUES (5, 'John', '2|CA;1|US') -- CA, US
INSERT INTO @contacts VALUES (5, 'Johan', '5|DE') -- DE

-- Query for all contacts in US OR MX OR CA (Charles, Steven, Arturo, John)
SELECT
    DISTINCT T1.id,
    T1.contact
FROM (
    SELECT 
        id,
        contact,        
        CAST('<a><c>' + REPLACE(countries, ';','</c><c>') + '</c></a>' AS XML) AS countriesXml
    FROM @contacts  
    ) AS T1
CROSS APPLY T1.countriesXml.nodes('/a/c') T2(c)
WHERE CAST(T2.c.query('string(.)') AS varchar(max)) IN ('1|US', '3|MX', '2|CA')

This should yield Charles, Steven, Arturo, and John.  The query first converts the delimited values into XML using simple string replacement.  Next, the XML is "shredded" using nodes(): for each base row, the shredding generates one row per node (for example, for Charles, one row each for US, CA, and MX).

Here is the result of the inner sub-select:

1	Charles	<a><c>1|US</c><c>2|CA</c><c>3|MX</c></a>
2	Steven	<a><c>1|US</c><c>3|MX</c><c>2|CA</c></a>
3	Arturo	<a><c>3|MX</c></a>
4	Silvia	<a><c>4|FR</c></a>
5	John	<a><c>2|CA</c><c>1|US</c></a>
5	Johan	<a><c>5|DE</c></a>

And here is the resultset after shredding:

1	Charles	1|US
1	Charles	2|CA
1	Charles	3|MX
2	Steven	1|US
2	Steven	3|MX
2	Steven	2|CA
3	Arturo	3|MX
4	Silvia	4|FR
5	John	2|CA
5	John	1|US
5	Johan	5|DE

You can see the intermediate resultset using this query:

SELECT
	T1.id,
	T1.contact,
	T2.c.query('string(.)')
FROM (
	SELECT 
		id,
		contact,		
		CAST('<a><c>' + REPLACE(countries, ';','</c><c>') + '</c></a>' AS XML) AS countriesXml
	FROM @contacts 	
	) AS T1
CROSS APPLY T1.countriesXml.nodes('/a/c') T2(c)

Finally, the DISTINCT clause collapses the resultset once again.

From my own testing, better performance can be achieved by creating a table variable with the target values and using a JOIN instead of an IN (about 1/5 of the time). For 100k records, using IN takes about 13.430s. Using a JOIN to a table variable takes about 2.293s.
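
For reference, here is a sketch of that variant against the same sample data (the @targets table variable name is just for illustration):

-- Put the target values in a table variable and JOIN instead of using IN.
DECLARE @targets TABLE
(
    code nvarchar(100)
)

INSERT INTO @targets VALUES ('1|US')
INSERT INTO @targets VALUES ('3|MX')
INSERT INTO @targets VALUES ('2|CA')

SELECT
    DISTINCT T1.id,
    T1.contact
FROM (
    SELECT
        id,
        contact,
        CAST('<a><c>' + REPLACE(countries, ';','</c><c>') + '</c></a>' AS XML) AS countriesXml
    FROM @contacts
    ) AS T1
CROSS APPLY T1.countriesXml.nodes('/a/c') T2(c)
INNER JOIN @targets T3
    ON CAST(T2.c.query('string(.)') AS nvarchar(100)) = T3.code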

11 Jul 2013

Use ContentIterator to Read Items from Large SharePoint Lists

Posted by Charles Chen

New to me: ContentIterator for reading items from large lists.
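
For my own future reference, here is a minimal sketch of how it is typically used. It assumes the ContentIterator class from Microsoft.Office.Server.Utilities in the SharePoint Server object model and a hypothetical list and query; check the exact overloads available in your version.

using System;
using Microsoft.Office.Server.Utilities;
using Microsoft.SharePoint;

static void ProcessLargeList(SPWeb web)
{
    SPList list = web.Lists.TryGetList("General Tasks"); // hypothetical list

    SPQuery query = new SPQuery();
    query.Query = "<Where><Eq><FieldRef Name='Status'/><Value Type='Choice'>In Progress</Value></Eq></Where>";

    ContentIterator iterator = new ContentIterator();

    iterator.ProcessListItems(
        list,
        query,
        (SPListItem item) =>
        {
            // Called once per item; the iterator fetches items in batches so
            // the whole list is never loaded into memory at once.
            Console.Out.WriteLine(item.Title);
        },
        (SPListItem item, Exception e) =>
        {
            // Returning true tells the iterator to keep going after an error.
            return true;
        });
}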

3 Jun 2013

Understanding Billing for Amazon EBS Provisioned IOPS

Posted by Charles Chen

I've been experimenting with Amazon EC2 and SharePoint to better understand how an enterprise architecture would be built on top of it as a platform and, as a part of this exercise, better understand the cost structure.

One key number that escaped me during my initial review of the pricing structure is the one highlighted below:

[Screenshot: the EC2 pricing table with "$0.10 per provisioned IOPS-month" highlighted]

Now before we get into the details of what to watch for here, I should preface this by saying that for use cases that demand high-performance disk I/O, provisioned IOPS is probably well worth it and, in fact, I definitely foresee us using it in select scenarios (e.g., a high-I/O database server instance).

In the screen cap below, you can see that I've got three provisioned IOPS volumes configured with 2000, 2000, and 1000 IOPS respectively (note that maximum IOPS on a volume is a multiple of the size of the volume):

[Screenshot: the EBS volume list showing the three provisioned IOPS volumes]

The question is what does "$0.10 per provisioned IOPS-month" actually mean?

Well, here's the skinny from the Amazon web site:

For example, if you provision a volume with 1000 IOPS, and keep this volume for 15 days in a 30 day month, then in the Virginia Region, you would be charged $50 for the IOPS that you provision ($0.10 per provisioned IOPS-Month * 1000 IOPS Provisioned * 15 days/30).

So in my case, for a 30 day month, it would work out to be (2000 + 2000 + 1000) * 0.10 * 1 = $500/mo. for the provisioned IOPS volumes.

This is something to keep an eye on: if you only use EC2 instances with provisioned IOPS volumes sporadically, you may want to take snapshots or create an AMI and discard the volumes when not in use.  Comparatively speaking, snapshot data is much friendlier on the wallet when your system isn't under active use.
