<CharlieDigital/> Programming, Politics, and uhh…pineapples

27 Jan 2014

More Thoughts on Planning Office Spaces for Development

Posted by Charles Chen

A proposed rule of thumb for planning office spaces for development:

  1. Buy cheap desks.
  2. Buy expensive chairs.

Our office has it all wrong.  We have desks that are solid as a rock but probably $600+ and seats that are $99 Office Max specials (just a guess).  I suspect that this is the case with most offices where the planners spend a fortune on desks, dividers, cabinets, and so on (just go look up the prices of cubicle partitions from Hon or Steelcase) and go cheap on the seating.

The problem is that the majority of an office worker's time is spent sitting.  Splurge on the chair instead.  Don't go too cheap on the desks; Ikea Galants are perfectly stable, durable, lightweight, easy to hack, easy to refactor, cheap, and assembling them is a great team building exercise :)

There is no exception to this rule.  If you want sit-to-stand type desks, just get something like this:

Kangaroo Elite Sit-to-Stand (yeah, it's $600, but a sit-to-stand desk easily costs $1000+)

As an added bonus: if you have top candidates coming into the office for interviews, they are far more likely to be impressed by fancy chairs than fancy desks.

17 Jan 2014

Thoughts on Spatial Conditions for Optimal Collaborative Efficiency

Posted by Charles Chen

(Or in other words, how to arrange your office space)

As I have been working in the office more, I've started to think more about how to organize the office space to enhance the collaborative nature of our work.

A few weeks back, we were re-arranging our desks and I noticed how heavy they were and how un-conducive they were to refactoring.  They are L-shaped desks with the two sections held together by two meager screws and a metal plate.  Yet they were so heavy and unwieldy as to make them extremely difficult to move.

The thought occurred to me then why Ikea Galants are almost universally the office desk of choice for startups: they're cheap, they're sturdy, they're easy to move, they're open, and they're extremely hackable.

Google image search results for tech startup office.

But beyond just the hardware, I've been thinking about how to improve our team's ability to collaborate as we are in a period that requires highly coordinated design and development to hit our very tight timeline.  One of the first things we did was cluster the core team members into one area of the office so that it's easy to turn around and talk to anyone and so that all members of the team can hear and jump in on any design discussion.

This has downsides, of course, as it can get noisy and it can be difficult for other members who need to concentrate.  However, my thought is that we should reuse the office rooms as private "thinking" or call areas with standing-only desks (no seats) so that individuals who need temporary peace can move into an office but not get too comfortable.

My other thought has been that it is extremely important for the leaders, founders, CEO, president -- whoever -- to sit with the team.  The reason is that these decision makers need to know what's going on -- especially so in a startup -- and it's difficult to do so if their time is spent behind closed doors away from the action.  Of course, there are times when important phone calls must be made or private discussions must be had, but again, the concept is to reserve the closed-door rooms specifically for this purpose and only with standing desks.

An even bigger intangible is that having the leadership on the floor sets the tone for the team.  In fact, I sit with my back facing my developers so that each of them can see what I'm working on and what I'm not doing.  I'm not on Facebook, I'm not randomly browsing sites, I'm not sitting idle.  I think this has a motivational effect and helps the team realize that we're all in this together to make the magic happen and that I'm not asking any more of them than I am of myself.

One final important lesson I've learned is that walls are stupid:

[Photo: IMG_20140117_085727882]

15 Jan 2014

Thoughts on Goal Oriented Leadership

Posted by Charles Chen

I've been thinking heavily on the topic of leadership for several days now and trying to understand what works and what doesn't.  I am currently in the midst of a massive and daring undertaking and putting tremendous demands on my team to deliver.

My challenge has been to keep that pressure from them and help streamline this process so that they can focus on delivering and hitting our milestones.  To achieve this, I've tried to step up my leadership and intensity.

I started with a daily stand-up meeting where we gathered around and discussed our plan for the day, but after just a few days, I thought to myself that it was too limiting if I set the goals for the team and dictated what they did.  Who's to say that I have the best idea on how to hit our goals?  Instead, I only set our team goal and let each member of the team set their own goals for the day.

The effectiveness of this approach is multifaceted:

The first is that it allows communication of the overall team goal.  All members of the team need to know the goal and the mission to succeed; they have to see the bigger picture so that they know that there is a condition for victory and success.

The second is that it allows them to be responsible to their own expectations.  Because the goals are set by themselves, if the goals are not achieved, it cannot be blamed on someone else setting unrealistic goals; each member is thus motivated by an internal force and not an external force.

The third is that you will get better ideas simply by having more brains work on the problem of identifying the goals that need to be achieved for success.

A goal oriented leader doesn't tell people what to do, but rather lets them set their own goals and then moves all obstacles out of the path of achieving those goals.  The idea is to transform yourself from a manager of tasks to a coordinator of goals: communicate a team goal, let the members set their own goals, and align all of those goals to the mission.

(Also, it helps to keep the team well fed)

26 Dec 2013

Setting Batch Variables Dynamically

Posted by Charles Chen

One of the challenges of managing builds and deployments of source on multiple developer machines is that it can be complicated to contain and manage different variations in developer environments.

For example, it is often useful to have each developer supply the server name or the connection string locally so that these machine-specific settings don't make it into source control.

How I've often tackled this is to add a batch file that every developer executes when getting the source for the first time.  This batch file asks for various settings and then saves the results to a text file which is then read back when executing other batch files.

Here is an example of such a batch file:

@ECHO OFF

ECHO ================================================================
ECHO Creates the developer specific local configuration for builds
ECHO that will allow developers to use local settings for specific
ECHO aspects of the deployment.
ECHO ================================================================

:SPURL
SET SPURL=
SET /P SPURL=Enter SharePoint site URL (ex. http://mymachine:2345/): %=%
IF "%SPURL%"=="" GOTO SPURL

ECHO ----------------------------------------------------------------

:SQLCONN
SET SQLCONN=
ECHO Enter SQL connection string to the IC membership database 
ECHO (ex. Server=VM1;Database=IPDB;User ID=membershipUser;Password=P@ssw0rd;Application Name=HeCoreServices):
SET /P SQLCONN=
IF "%SQLCONN%"=="" GOTO SQLCONN

ECHO ----------------------------------------------------------------

REM Redirect first so that no trailing space ends up in the saved values.
> build-configuration.txt ECHO SPURL=%SPURL%
>> build-configuration.txt ECHO SQLCONN=%SQLCONN%

ECHO Completed; created file build-configuration.txt

PAUSE

This batch file will prompt the developer for two settings: the URL of a site and the connection string that the developer is using locally (which can vary by the database name, login, etc.).  The contents get written to a file called build-configuration.txt that looks like this:

SPURL=http://mymachine:2345
SQLCONN=Server=VM1;Database=IPDB;User ID=membershipUser;Password=P@ssw0rd;Application Name=HeCoreServices

This file is excluded from source control and developers can, of course, manually edit this file as well to create local settings.

Now when I'm ready to use these settings in another batch file, I can invoke it like so:

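@ECHO OFF

REM Plain SETLOCAL keeps the variables scoped to this script.  Delayed
REM expansion is not needed here and would strip any "!" from the values.
SETLOCAL

REM Each line of the file is NAME=VALUE; "tokens=*" keeps the whole line.
FOR /F "tokens=*" %%n IN (build-configuration.txt) DO (
	ECHO %%n

	REM Quoting the assignment keeps stray trailing whitespace out of the value.
	SET "%%n"
)

ECHO %SPURL%
ECHO %SQLCONN%

PAUSE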

There are other ways to do this as well, but the downside to most approaches is that you have to know how many parameters you have or use less meaningful names.  This approach will let you set variables to your heart's content and read them in dynamically at execution.

Filed under: Awesome, Dev, Self Note

5 Dec 2013

A Simple Way to Improve CAML Query Performance

Posted by Charles Chen

There are many ways to improve the performance of your CAML queries, but I've recently found that in some cases, it's as easy as switching the order of your filter operations.

In this case, I was searching across a list of 1,000,000 items for a set of 41.

The list consists of tasks with, among other fields, a Status and Assigned To field.

Both of these fields were indexed, but the following query was still running in the 10 second range:

<Where>
    <And>
        <And>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Completed</Value>
            </Neq>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Terminated</Value>
            </Neq>
        </And>                                             
        <Eq>
            <FieldRef Name='AssignedTo' LookupId='TRUE' />
            <Value Type='Int'>159</Value>
        </Eq>
    </And>
</Where>

One small tweak and the same query ran in 1.5s:

<Where>
    <And>
        <Eq>
            <FieldRef Name='AssignedTo' LookupId='TRUE' />
            <Value Type='Int'>159</Value>
        </Eq>	
        <And>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Completed</Value>
            </Neq>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Terminated</Value>
            </Neq>
        </And>                                             
    </And>
</Where>

All that was done was to shift the order of the query conditions.

The first query reads as "All tasks that are not Completed and not Terminated and Assigned To user 159".

The second query reads as "All tasks that are Assigned To user 159 that are not Completed and not Terminated".

I didn't trace the generated SQL, but it's not hard to imagine that the SQL now performs an initial filter on the data set against the user ID and returns a much smaller data set for subsequent operations (filter by Status).

So the lesson learned is that for large lists, you need to follow Microsoft's guidance on large lists, but also ensure that your queries are written to take advantage of the indexes and reduce the data set as early as possible (preferably against an indexed field).

26 Nov 2013

Preventing the Garbage Collector From Ruining Your Day

Posted by Charles Chen

If you're working with ZeroMQ, you may run into an exception with the message "Context was terminated".

It turns out that this is due to the garbage collector cleaning up (or attempting to clean up?) the ZmqContext.

Found this out via this handy thread on Stack Overflow, but what about cases where you can't use a using statement?

For example, in a Windows Service, I create the context on the OnStart method and destroy the context on the OnStop method.

In this case, an alternative is to call GC.KeepAlive(Object obj), which keeps the garbage collector from collecting the object until that call has been reached.  It seems counterintuitive, but the call is really a marker: the object stays alive up to that point, and the garbage collector is free to collect it at any point afterwards.
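Here is a minimal sketch of that behavior.  It does not use ZeroMQ at all: FakeContext is a made-up stand-in for something like ZmqContext, and the WeakReference is only there so that we can observe whether the object survived a forced collection.

```csharp
using System;

public static class KeepAliveSketch
{
    // Hypothetical stand-in for a long-lived resource such as ZmqContext.
    private sealed class FakeContext { }

    // Returns true when the context survived a forced collection.
    public static bool SurvivesCollection()
    {
        FakeContext context = new FakeContext();
        WeakReference tracker = new WeakReference(context);

        // Simulate doing work while nothing else touches the context.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        bool alive = tracker.IsAlive;

        // The context is guaranteed to be reachable up to this call, so the
        // collection above could not reclaim it; after this line it can.
        GC.KeepAlive(context);
        return alive;
    }

    public static void Main()
    {
        Console.WriteLine("Context alive during work: {0}", SurvivesCollection());
    }
}
```

The placement is the whole trick: the call goes at the end of the span over which the object must stay alive (in the service case, that would be wherever the context is finally torn down).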

Filed under: .Net, Self Note, ZeroMQ

12 Nov 2013

An Architecture for High-Throughput Concurrent Web Request Processing

Posted by Charles Chen

I've been working with ZeroMQ lately and I think I've fallen in love.

It's rare that a technology or framework just jumps out at you, but here is one that will get your head spinning on the different ways that it can make your architecture more scalable, more powerful, and all the while offering a frictionless way of achieving this.

I've been building distributed, multi-threaded applications since college, and ZeroMQ has changed everything for me.

It initially started with a need to build a distributed event processing engine.  I had wanted to try implementing it in WCF using peer-to-peer and/or MSMQ endpoints, but given the complexity of managing that stack along with its configuration and setup, it seemed worthwhile to at least look into a few alternatives.

RabbitMQ and ZeroMQ were the clear front-runners for me.  I really liked the richness of documentation and examples with RabbitMQ and if you look at some statistics, it has a much greater rate of mentions on Stack Overflow so we can assume that it has a higher rate of adoption.  But at the core of it, I think that there really is no comparison between these two except for the fact that they both have "MQ" in their names.

It's true that one could build RabbitMQ-like functionality on top of ZeroMQ, but to a degree, I think that would be defeating the purpose.  The beauty of ZeroMQ is that it's so lightweight and so fast that it's really hard to believe; there's just one reference to add to your project.  No central server to configure.  No single point of failure.  No configuration files.  No need to think about failovers and clustering.  Nothing.  Just plug and go.  But there is a cost to this: a huge tradeoff in some of the higher level features that -- if you want them -- you have to build yourself.

If you understand your use cases and you understand the limitations of ZeroMQ and where it's best used, you can find some amazing ways to leverage it to make your applications more scalable.

One such use case I've been thinking about is using it to build a highly scalable web-request processing engine which would allow scaling by adding lots of cheap, heterogeneous nodes.  You see, with ASP.NET, unless you explicitly build a concurrency-oriented application, your web server processing is single-threaded per request and you can only ever generate output HTML at the sum of the costs of generating each sub part of your view.  To get around this, we could consider a processing engine that would be able to parse controls and send the processing off -- in parallel -- to multiple processors and then reassemble the output HTML before feeding it back to the client.  In this scenario, the cost of rendering the page is the overhead of the request plus the cost of the most expensive part of the view generation.
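To make the cost argument concrete, here is a rough in-process sketch of the fan-out and reassemble idea.  It is not ZeroMQ code: plain tasks stand in for the remote processors, and the part names and delays are made up purely for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ParallelViewSketch
{
    // Hypothetical stand-in for rendering one control on a worker node;
    // the delay simulates the cost of generating that part of the view.
    public static async Task<string> RenderPartAsync(string name, int costMs)
    {
        await Task.Delay(costMs);
        return "<div id='" + name + "'>" + name + "</div>";
    }

    // Fan out: start every part at once.  Fan in: reassemble in the original order.
    public static async Task<string> RenderPageAsync()
    {
        Task<string>[] parts =
        {
            RenderPartAsync("header", 100),
            RenderPartAsync("grid", 400),
            RenderPartAsync("sidebar", 200)
        };

        string[] html = await Task.WhenAll(parts);
        return string.Concat(html);
    }

    public static void Main()
    {
        Stopwatch timer = Stopwatch.StartNew();
        string page = RenderPageAsync().GetAwaiter().GetResult();
        timer.Stop();

        Console.WriteLine(page);

        // The elapsed time should track the slowest part (400 ms here)
        // rather than the sum of all parts (700 ms).
        Console.WriteLine("{0} ms elapsed.", timer.ElapsedMilliseconds);
    }
}
```

In the architecture I have in mind, each RenderPartAsync call would instead be a message pushed out to a worker node, with the results collected and stitched back together before responding to the client.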

The following diagram conceptualizes this in ZeroMQ:

[Diagram: zmq-processing]

Still a work in progress...

Even if an ASP.NET application is architected and programmed for concurrency from the get-go, you are limited by the constraints of the hardware (# of concurrent threads).  Of course, you can add more servers and put a load balancer in front of them, but this can be an expensive proposition.  Perhaps a better architecture would be to design a system that allows adding cheap, heterogeneous server instances that do nothing but process parts of a view.

In such an architecture, it would be possible to scale the system at any level by simply adding more nodes.  They could be entirely heterogeneous; no need for IIS, in fact, the servers don't even have to be Windows servers.  The tradeoff is that you have to manage the session information yourself and push the relevant information down through the pipeline or at least make it accessible via a high speed interface (maybe like a Redis or Memcached?).

But the net gain is that it would allow for concurrent processing of a single web request and build an infrastructure for handling web requests that is easily scaled with cheap, simple nodes.

Filed under: .Net, Awesome, ZeroMQ

26 Oct 2013

More Thoughts on Object Oriented Code

Posted by Charles Chen

I've talked about writing object-oriented code and domain-driven design before.

In talking with another dev this week, I think I have my simplest summary of object-oriented code yet: when you are writing well-written object-oriented code, you'll know it by the questions being asked by your code.

So what does this mean?

A good example is the following:

FormData formData = GetFormData();

// Object-oriented? Not really:
bool isValidViaUtil = FormDataUtil.IsValid(formData);

// Object-oriented:
bool isValid = formData.IsValid();

It's very subtle, but it's very easy to observe because you simply need to ask yourself if you are asking the questions to the right objects.  We don't ask FormDataUtil if the formData is valid, we ask the formData directly.

Primarily, what this will reflect is the principle of encapsulation.

When you are asking the right questions to the right objects, you'll find that the code is easier to read, easier to maintain, less fragile, and more natural to reuse.

If the code is well written, as first time users, we don't have to know that there is a class explicitly designed to validate the form data; we can find it easily on the class itself.
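As a small sketch of what that looks like on the class itself: the Email property and the validation rule here are invented for illustration (the real FormData would carry whatever fields and rules the form needs).

```csharp
using System;

public class FormData
{
    // Hypothetical field; a real form would have its own set of properties.
    public string Email { get; set; }

    // The rule lives with the data it checks, so callers ask the object directly
    // and discover the method on the class itself.
    public bool IsValid()
    {
        return !string.IsNullOrWhiteSpace(Email) && Email.Contains("@");
    }
}

public static class Program
{
    public static void Main()
    {
        FormData formData = new FormData { Email = "dev@example.com" };

        Console.WriteLine(formData.IsValid());
    }
}
```

If the validation rules change, the change happens in one place and every caller that asks formData the question picks it up.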

I don't think it gets any simpler than that and yet it is, to me, the very essence of what it means to write object-oriented code.

16 Oct 2013

SharePoint’s Image Problem

Posted by Charles Chen

SharePoint has an image problem with many folks.

I've discussed it before, but I think I have a new perspective on the issue.

You see, I recently interviewed a candidate who held both the Microsoft Certified IT Professional: SharePoint Administrator 2010 and the Microsoft Certified Professional Developer: SharePoint Developer 2010 certifications.  His resume looked impressive and promising, with many SharePoint projects under his belt.

He failed miserably on my moderate-advanced web developer assessment, skipping nearly every single item in the assessment -- things that I think any mid-level web/.NET developer should know.

You see, the problem is not that SharePoint is inherently a weak platform or inherently un-scalable or inherently poorly performing, but that they've made the platform so approachable, so well documented, and built such a strong community of developers and developer evangelists around it, that anyone can get into it.

You don't need to know how garbage collection works or understand the difference between a Func and an Action (or even know what they are!) or what IL is.

You don't need to know how to write a generic class or how to refactor code to reduce complexity.

You don't need to know many of the higher level programming concepts of .NET or sound object oriented programming practices to build solutions on SharePoint because most of the time, you can just google it and find an example (Of course, most examples -- even the Microsoft MSDN ones -- are not meant for production code; they are merely designed to convey the concept and usage of a particular member in the library but not necessarily how to use it in an architecturally sound manner).

So therein lies the root of SharePoint's image problem: it's so easy to build basic solutions that a developer who is fundamentally weak on the .NET platform and fundamental web technologies like JavaScript and CSS can be productive in many companies and even get Microsoft certified!  And then these developers are let loose on projects that ultimately under-deliver in one way or another, be that performance or usability or maintainability.

SharePoint is like the mystery basket on Chopped.  There is nothing inherently good or bad about those mystery ingredients but for the skill of the individual chefs who combine them through the application of experience, creativity, and technique to create a dish.  With different chefs, you will get drastically different results each round and the winners are usually the folks that have a fundamental understanding of the art and science of flavor and cooking as well as a deep understanding of the ingredients.

It is the same with SharePoint; it is nothing but a basket of ingredients (capabilities or features) to start from (a very generic one at that).  At the end, whether your dish is successful or not is a function of whether you have good chefs that understand the ingredients and understand the art and science of melding the ingredients into one harmonious amalgamation.  Likewise, it is important that when staffing for SharePoint projects, you focus not only on SharePoint, but also on the fundamentals of .NET, computer science, and good programming in general.

You can also think of it like a Porsche.  Give it to a typical housewife to drive around a track and you'll get very different results versus having Danica Patrick drive it.  Should it reflect poorly on the Porsche that the housewife couldn't push it anywhere near the limits?  Or is it a function of the driver?  Danica Patrick in a Corolla could probably lap my wife in a Porsche around a track.

The bottom line is that SharePoint, in my view, is just a platform and a good starting point for many enterprise applications.  When some SharePoint projects fail or fall short of expectations, the blame is often assigned to the platform (ingredients) and not to the folks doing the cooking (or driving).  SharePoint is indeed a tricky basket of ingredients and it still takes a skilled team of architects, developers, testers, and business analysts to put it together in a way that is palatable.

16 Oct 2013

Watch Out For SPListItemCollection.Count and Judicious Use of RowLimit

Posted by Charles Chen

SPListItemCollection.Count is a seemingly innocuous call that can be quite dangerous when used incorrectly.

The reason is that this property invocation actually executes the query.

This is OK if you plan on iterating the results because the results are cached, but costly if you don't plan on iterating the results.  The following code sample can be used to test this effect for yourself:

using System;
using System.Diagnostics;
using Microsoft.SharePoint;

class Program
{
    static void Main(string[] args)
    {
        using (SPSite site = new SPSite("http://internal.dev.com/sites/oncology"))
        using (SPWeb web = site.OpenWeb())
        {
            SPList list = web.Lists.TryGetList("General Tasks");

            SPQuery query = new SPQuery();
            query.RowLimit = 1;
            query.Query = @"
<Where>
<Contains>
<FieldRef Name='Title'/>
<Value Type='Text'>500KB_1x100_Type_I_R1</Value>
</Contains>
</Where>";
            query.QueryThrottleMode = SPQueryThrottleOption.Override;

            SPListItemCollection items = list.GetItems(query);

            Stopwatch timer = new Stopwatch();

            timer.Start();

            // Reading Count is what actually executes the query.
            Console.Out.WriteLine("{0} items match the criteria.", items.Count);

            var timeForCount = timer.ElapsedMilliseconds;

            Console.Out.WriteLine("{0} milliseconds elapsed for count.", timeForCount);

            // The results are cached by now, so iteration starts almost immediately.
            foreach (var i in items)
            {
                Console.Out.WriteLine("{0} milliseconds elapsed for start of iteration.", timer.ElapsedMilliseconds - timeForCount);

                break;
            }
        }
    }
}

(And of course, you can check the implementation of Count in Reflector or dotPeek)

You will see that the start of iteration will be very fast once you've invoked Count once.

Now here is where it gets interesting:

  1. The total time it takes to execute the query is longer for invoking Count versus just iterating (~3200ms vs ~3000ms, about 5-10% in my tests).
  2. When I set the RowLimit to 1, I can reduce the time by roughly 40-50% (~1600ms vs ~3200ms for a resultset of 230 out of a list of 150,000 items).

Try it yourself by commenting and uncommenting the RowLimit line and commenting and uncommenting the line that invokes Count.

What does this mean for you?  If you don't need the count, don't invoke Count; it's slower than just iterating the results.  If you need the count and plan on iterating the results anyway, you are better off keeping a counter yourself in the iteration.

And in a use case where you don't plan on iterating the result set (for example, checking to see if there is at least one occurrence of a type of object), be sure to set the RowLimit in your query!
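Both pieces of advice can be sketched as a pair of helpers.  The class and method names are my own invention for illustration, and the code assumes the SharePoint server object model (Microsoft.SharePoint.dll), so it will only run on a SharePoint server.

```csharp
using System;
using Microsoft.SharePoint;

public static class ListQuerySketch
{
    // Existence check: cap the result set at one row instead of fetching every match.
    public static bool AnyMatch(SPList list, string camlWhere)
    {
        SPQuery query = new SPQuery();
        query.Query = camlWhere;
        query.RowLimit = 1;

        foreach (SPListItem item in list.GetItems(query))
        {
            return true;
        }

        return false;
    }

    // Count while iterating rather than reading Count before the loop.
    public static int CountWhileIterating(SPList list, string camlWhere, Action<SPListItem> handle)
    {
        SPQuery query = new SPQuery();
        query.Query = camlWhere;

        int count = 0;

        foreach (SPListItem item in list.GetItems(query))
        {
            handle(item);
            count++;
        }

        return count;
    }
}
```

The camlWhere argument is the same kind of <Where> fragment used in the test program above.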