Summer of Making – Water Balloon Launcher

The first project of the Summer of Making was a water balloon launcher.

Pretty simple and the kids loved being able to help out in the building of it.

We followed the instructions verbatim from the Make magazine link. If it’s not clear – it wasn’t to me – you want to purchase 2″ inner diameter PVC pipe.

Total cost was $24, with nearly everything sourced from Lowe’s home improvement store. I had some left over PVC glue from a previous project and the local tire place gave us the valve stem for free.

Here are the parts laid out.

Measuring out the main launch tube for cutting.

  

Measuring out the secondary tube and end handles that make up the air reservoir.

All the cut parts laid out for a sanity check before gluing.


We placed the valve stem on the right side tube that makes up the handle.  Drill the hole small and work up to the size you need. Wow, it was a tight fit!

The finished product.


It works very well. Our first test shot went nearly 60 ft.  It takes some playing around with the balloon size, amount of water “batting” used and the air pressure.

So far our best results have been with a water balloon that fits just barely in the barrel, about 2-3 inches of water batting and ~25 lbs. of air pressure.

UPDATE (4/17/2015): 6 inches of water batting and 40 lbs of air pressure sent a water balloon over 100 ft away.

The project was a lot of fun and I highly recommend it if you have some younger children looking to build something fun for the summertime.

The Maker movement – Inspiring

I’ve been reading a lot about the Maker movement going on and I have to admit, I’m feeling a bit inspired.

Makezine – Make magazine, etc.

Maker Faire

Adafruit – tutorials and parts for electronics projects

I’m not sure if it’s the inner kid in me wanting to play with some of these amazing new gadgets like 3D printers and robotics kits (which I’ve always been enthralled with), or the desire to control real-world things with software. Either way, I’ve been busy researching all kinds of projects I might be able to tackle with my two boys.

I’ve got some parts and kits on order, from Arduinos to random sensor modules to robotics parts. I backed a kickstarter for a 4-in-1 drawing robot called the mDrawBot which should be arriving some time in May.

mDrawBot Kickstarter

I plan on making this the summer of “making” with my boys. 

It’s my answer to the “What’s there to do?” question I hear every summer when the kids are on summer break from school.

Stay tuned for our 1st project, a water balloon cannon!

Enhanced Monotouch.Dialog FloatElement

For those .NET developers who aren’t yet familiar with the Monotouch framework and are looking to write iOS apps using their existing C#/.NET skill set, you need to check it out. It’s a great platform, especially if you don’t want to deal with the hassle of learning Objective-C.

Monotouch.Dialog is a framework of classes designed to ease the pain of writing the rather repetitive bits of code needed to get tables and other UI paradigms working in iOS. It’s a relatively new addition to the Monotouch API, so there are still some rough edges.

The basic UI range slider (FloatElement) provided by the framework seems to be missing some obvious items, such as the ability to get an event or indication that the value has changed.

I created my own “extended” version with a few more features that I needed for a project I was working on.

Features
  • Ability to ’Lock/Unlock’ the range slider using provided UI images.
  • Inline caption that can be tied directly to the value of the slider
  • Callback functionality when the value of the element is changing
Looking ahead – Future Enhancements

The lock/unlock feature was the precursor for an additional feature I plan on building – a FloatElementGroup. The idea here is that multiple sliders are present and are used to split a given ‘whole’ value by various percentages (think audio equalizers). The sliders would auto-adjust as needed to compensate for any changes to a single slider and the user could optionally lock certain sliders on fixed values.
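To make the idea a bit more concrete, here’s a rough sketch of the redistribution logic. Nothing below exists yet; the method, the list of sliders and the assumption that all values sum to 100 are purely illustrative.

// Hypothetical redistribution logic for a future FloatElementGroup.
// Assumes each FloatElementEx holds a percentage and all sliders should sum to 100.
void Redistribute(List<FloatElementEx> sliders, FloatElementEx changed)
{
    // Only unlocked sliders other than the one just moved can absorb the difference.
    var adjustable = sliders.Where(s => s != changed && !s.IsLocked).ToList();
    if (adjustable.Count == 0)
        return;

    int lockedTotal = sliders.Where(s => s != changed && s.IsLocked).Sum(s => s.Value);
    int remaining = Math.Max(0, 100 - changed.Value - lockedTotal);

    // Spread the remainder evenly; the last slider absorbs any rounding slack.
    int share = remaining / adjustable.Count;
    for (int i = 0; i < adjustable.Count; i++)
    {
        int value = (i == adjustable.Count - 1)
            ? remaining - share * (adjustable.Count - 1)
            : share;
        adjustable[i].SetValue(value);
    }
}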

 

Here’s the code. Hopefully this will come in handy to someone else looking to use a FloatElement.

Source Code
public class FloatElementEx : Element
{
    static NSString skey = new NSString("FloatElementEx");
    const float LockImageWidth = 32.0f;
    const float LockImageHeight = 32.0f;

    /// <summary>
    /// Set a string to reserve a certain amount of space for the 
    /// caption used in the FloatElement. Useful when there is no
    /// initial caption to show - allows space to be reserved for 
    /// when it will be set.
    /// </summary>
    public string ReserveCaptionPlaceholderString { get; set; }
    /// <summary>
    /// Returns the locked status
    /// </summary>
    public bool IsLocked { get { return _valueLocked; } }
    public bool ShowCaption { get; set; }
    /// <summary>
    /// Ties the displayed caption to the value of the slider
    /// </summary>
    public bool UseCaptionForValueDisplay { get; set; }
    public bool Continuous { get; set; }
    public int MinValue { get; set; }
    public int MaxValue { get; set; }
    public int Value { get; private set; }
    public UIImage LockImage { get; set; }
    public UIImage UnlockImage { get; set; }

    private UIButton _lockImageView;
    private UISlider _slider;
    private Action<int> _valueChangedCallback;
    private bool _valueLocked;
    private bool _lockable = false;


    public FloatElementEx(int value, Action<int> valueChanged = null, bool continuous = true, bool lockable = false)
        : base(null)
    {
        MinValue = 0;
        MaxValue = 100;
        Value = value;
        Continuous = continuous;
        _lockable = lockable;
        _valueChangedCallback = valueChanged;
    }

    protected override NSString CellKey { get { return skey; } }

    public override UITableViewCell GetCell(UITableView tv)
    {
        var cell = tv.DequeueReusableCell(CellKey);
        if (cell == null) {
            cell = new UITableViewCell(UITableViewCellStyle.Default, CellKey);
            cell.SelectionStyle = UITableViewCellSelectionStyle.None;
        }
        else
            RemoveTag(cell, 1);

        SizeF captionSize = new SizeF(0, 0);
        if (ShowCaption && (Caption != null || ReserveCaptionPlaceholderString != null || UseCaptionForValueDisplay)) {
            if (Caption == null) {
                if (UseCaptionForValueDisplay)
                    captionSize = cell.TextLabel.StringSize(MaxValue.ToString(), 
                        UIFont.FromName(cell.TextLabel.Font.Name, UIFont.LabelFontSize));
                else if (!string.IsNullOrEmpty(ReserveCaptionPlaceholderString))
                    captionSize = cell.TextLabel.StringSize(ReserveCaptionPlaceholderString, 
                        UIFont.FromName(cell.TextLabel.Font.Name, UIFont.LabelFontSize));
            }
            else {
                captionSize = cell.TextLabel.StringSize(Caption, UIFont.FromName(cell.TextLabel.Font.Name, UIFont.LabelFontSize));
            }

            captionSize.Width += 10; // Spacing

            if (Caption != null)
                cell.TextLabel.Text = Caption;
        }

        var lockImageWidth = _lockable ? LockImageWidth : 0;

        if (_slider == null) {
            _slider = new UISlider(new RectangleF(10f + captionSize.Width, 12f, 280f - captionSize.Width - lockImageWidth, 7f)) {
                BackgroundColor = UIColor.Clear,
                MinValue = this.MinValue,
                MaxValue = this.MaxValue,
                Continuous = this.Continuous,
                Value = this.Value,
                Tag = 1
            };
            _slider.ValueChanged += delegate {
                Value = (int)_slider.Value;
                if (UseCaptionForValueDisplay) {
                    Caption = Value.ToString();
                    // force repaint/redraw
                    if (GetContainerTableView() != null) {
                        var root = GetImmediateRootElement();
                        root.Reload(this, UITableViewRowAnimation.None);
                    }
                }
                if (_valueChangedCallback != null)
                    _valueChangedCallback(Value);
            };
        }
        else {
            _slider.Value = Value;
        }

        if (_lockable) {
            if (_lockImageView == null) {
                _lockImageView = new UIButton(new RectangleF(_slider.Frame.X + _slider.Frame.Width, 2f, lockImageWidth, LockImageHeight));
                // Attach the handler only once; GetCell can run many times for the same element
                // and re-subscribing here would toggle the lock multiple times per tap.
                _lockImageView.TouchUpInside += (object sender, EventArgs e) => {
                    _valueLocked = !_valueLocked;
                    _lockImageView.SetBackgroundImage(_valueLocked ? LockImage : UnlockImage, UIControlState.Normal);
                    // Disable the slider while locked and re-enable it when unlocked.
                    _slider.Enabled = !_valueLocked;
                };
            }

            _lockImageView.SetBackgroundImage(_valueLocked ? LockImage : UnlockImage, UIControlState.Normal);
            cell.ContentView.AddSubview(_lockImageView);
        }
        cell.ContentView.AddSubview(_slider);
        return cell;
    }

    public override string Summary()
    {
        return Value.ToString();
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            if (_slider != null)
            {
                _slider.Dispose();
                _slider = null;
            }
        }
        base.Dispose(disposing);
    }

    public void SetValue(int f)
    {
        if (IsLocked)
            return;

        Value = f;
        // The slider only exists after GetCell has been called at least once.
        if (_slider != null)
            _slider.SetValue(f, false);
    }

    public void SetCaption(string caption)
    {
        Caption = caption;
        // force repaint/redraw
        if (GetContainerTableView() != null) {
            var root = GetImmediateRootElement();
            root.Reload(this, UITableViewRowAnimation.None);
        }
    }
}

Usage
// Basic slider with callback, initial value 50 (default range 0..100), not lockable, Caption displayed as slider value
var elem = new FloatElementEx(50, lockable: false, valueChanged: (val) => DoSomething())
{
    ShowCaption = true,
    UseCaptionForValueDisplay = true
};

// Basic slider with callback, lockable (tap the image to toggle), Caption displayed as slider value
var elem2 = new FloatElementEx(50, lockable: true, valueChanged: (val) => DoSomething())
{
    ShowCaption = true,
    UseCaptionForValueDisplay = true,
    LockImage = UIImage.FromBundle("images/lock.png"),
    UnlockImage = UIImage.FromBundle("images/unlock.png")
};

// Basic slider with callback, not lockable, Caption set explicitly from the callback, space reserved.
// DoSomeCalc() is a placeholder that returns the string to display.
FloatElementEx elem3 = null;
elem3 = new FloatElementEx(50, lockable: false, valueChanged: (val) => elem3.SetCaption(DoSomeCalc(val)))
{
    ShowCaption = true,
    UseCaptionForValueDisplay = false,
    ReserveCaptionPlaceholderString = "XXX" // space is reserved based on the size of this string
};

// Sets the value of the slider and kicks off any needed repaints internally (i.e. if the caption is tied to the value, etc.)
elem.SetValue(12);

// Sets the slider's caption. Use SetCaption() and not the Caption property; the former refreshes the element, the latter doesn't.
elem.SetCaption("Hi!");

C# Version of DocRank

Recently I’ve been exploring Machine Learning concepts (clustering, link analysis, etc.) and discovered a great resource – Algorithms of the Intelligent Web.

The book gives a good introduction and treatment of applied Machine Learning, especially for someone new to the topic. The book is particularly good in my opinion since it talks about machine learning concepts with an emphasis on how they can be applied to various areas – improving search, providing recommendations, fraud detection, etc. The only drawback to this book as a .NET developer is that the code samples in the book and the source code are all in Java.

Not a big deal, since Java and C# are very close as far as syntax and API structure go. I performed a quick and dirty translation of the code from Java to C# if anyone is interested. I haven’t yet made a pass through the code to clean it up or make it more efficient – perhaps in the near future I’ll release an updated version of the source. The C# source for the converted Java code can be downloaded here.
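For context, DocRank is essentially PageRank applied to a graph of documents instead of web pages. The converted source follows the book’s Java structure, but the core idea boils down to a power iteration along the lines of the sketch below (my own illustration, not code from the book):

// Minimal power-iteration sketch of a PageRank-style ranking, the idea behind DocRank.
// links[i] holds the indices of the documents that document i links (or is similar) to.
double[] Rank(List<int>[] links, double damping = 0.85, int iterations = 50)
{
    int n = links.Length;
    double[] rank = Enumerable.Repeat(1.0 / n, n).ToArray();

    for (int iter = 0; iter < iterations; iter++)
    {
        // Every node gets the "teleport" share, then rank flows along outgoing links.
        double[] next = Enumerable.Repeat((1.0 - damping) / n, n).ToArray();
        for (int i = 0; i < n; i++)
        {
            if (links[i].Count == 0)
                continue;                                  // simplified handling of dangling nodes
            double share = damping * rank[i] / links[i].Count;
            foreach (int target in links[i])
                next[target] += share;
        }
        rank = next;
    }
    return rank;
}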

Caching in a Service Oriented Architecture (SOA) – Part 2

This is the second part of an article on designing a cache system for a service oriented architecture. The first part dealt with design considerations and potential approaches. This part will look at an implementation of a cache system.

 

System Overview

Before designing any system, it’s a good idea to fully understand what needs the system should satisfy. For the specific project I was working on, some of the requirements that drove the design decisions were:

  • Low latency for returning data

Access to the data, no matter where it was located, must be fast. The system would not allow for large amounts of latency in retrieving data. Every item type put in the cache must perform well under representative usage patterns.

  • Remove load on the database

The environment we were working in was a typical SQL-based RDBMS. We had an existing codebase and little ability to leverage newer technologies such as NoSQL – many providers of which support features such as sharding – to help in scaling out the data layer.

  • Consuming layers should not need to know about the implementation

We decided that users of the system – in our case, business logic developers – should not know or care whether something resides in the cache. In fact, the developers of this layer shouldn’t even know we have a cache. We were striving to put in place a set of design patterns that would completely abstract away where data was located.

  • Must work in a multi-tenant application layer (i.e., shared web servers) and a web farm

Initially our approach was to leverage the .NET runtime cache on the application server, but we quickly abandoned this once the requirement for web farms and multi-tenant application servers was added. A local cache on the server meant that a user’s request could add something to the cache on server ‘A’, and a subsequent request from that user, served by server ‘B’, would generate a cache miss.

Two possible solutions remained – move to a distributed off-server cache, or synchronize changes to local caches across the set of servers serving requests. Of these two approaches we chose the less complex, which was to move to a distributed cache server.

Cache Selection

With a web-server-resident cache out of the running due to the need to support web farms, we chose the Windows Server AppFabric distributed cache for our specific implementation.

We chose it since it had some nice features we would like to use in the future – such as locking and versioning of cache items. Also, it being a Microsoft product helped simplify our setup, deployment and supportability concerns. There are a lot of good alternatives in this space, especially Memcached, which we had considered and may look at leveraging in the future.

Another driver of our decision was the availability of a few features such as notification based expiration for local caches, and that the cache could be configured to use portions of the web servers available memory, so as not to need a dedicated cache server for smaller deployments.

 

Implementation Details

Factory pattern and Interfaces over the Cache Layer

One of the early choices made in the design of the system was to be as defensive as possible. For that reason, we chose to implement a factory pattern around cache creation. We felt that the factory pattern, paired with some interface-based programming, would provide a nice layer of abstraction over the cache details for our development team.

This will allow us to swap out providers without needing to change all of the touch points in the business logic code, since we aren’t coupled directly to the cache implementation.

Some thought needs to be given to designing the proper cache interface for your system. You probably do not want to directly mirror the operations available on any given cache implementation. Remember, the goal is to abstract away the details, so think about how you will actually be using the cache and what additional requirements you have around it. In our case we wanted to store some ‘metadata’ with each cache item (explained below), so we made sure to build that into our abstraction layer interfaces.
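As a rough illustration of what that abstraction might look like (the names below are invented for this post, not our actual interfaces):

// Hypothetical abstraction: business logic codes against ICacheProvider only.
public interface ICacheProvider
{
    object Get(string key);
    void Put(string key, object value, CacheItemMetadata metadata);
    void Remove(string key);
}

// The 'metadata' stored alongside each cached value (see the later section).
public class CacheItemMetadata
{
    public DateTime CreatedUtc { get; set; }
    public string[] RelatedKeys { get; set; }     // related domain objects / aggregate lookup keys
}

// The factory hides which concrete provider (AppFabric, Memcached, no cache at all) is in use.
public static class CacheFactory
{
    public static ICacheProvider Create()
    {
        // In a real system the provider type would come from configuration;
        // a null-object cache keeps callers working when no cache is configured.
        return new NullCacheProvider();
    }
}

public class NullCacheProvider : ICacheProvider
{
    public object Get(string key) { return null; }
    public void Put(string key, object value, CacheItemMetadata metadata) { }
    public void Remove(string key) { }
}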

Interacting with the cache

Figuring out how the business logic code you write will interact with the caching layer is likely the most important piece of the design. The decisions here will drive much of the overall design of the system and will either allow for some flexibility or conversely limit your ability to perform certain operations.

There are a few techniques you can choose, and I encourage you to explore them before settling on a choice.

We chose the Cache Aside pattern. This is where the code that is about to request data checks for the availability of the data in the cache. If the cache has the data, the data is returned. If the cache does not have the data, the data is loaded from the data store, added to the cache and then returned.
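In code, the pattern boils down to something like the following sketch, built on the hypothetical ICacheProvider above; the Customer type and repository call are also illustrative:

// Cache-aside: check the cache first, fall back to the data store on a miss,
// then populate the cache so the next caller gets a hit.
public Customer GetCustomer(int id)
{
    string key = "customer:" + id;

    var cached = _cache.Get(key) as Customer;
    if (cached != null)
        return cached;

    var customer = _repository.LoadCustomer(id);    // hypothetical data-store call
    if (customer != null)
        _cache.Put(key, customer, new CacheItemMetadata { CreatedUtc = DateTime.UtcNow });

    return customer;
}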

Our system had a Domain Model based design in place, which allowed us to fit the cache-aside pattern nicely into the foundational mechanisms from which each domain object was built. This let us keep the details of looking items up in the cache and updating them compartmentalized in a few key classes that most developers never directly interacted with.

For handling staleness of data, we were in luck due to the single path we had to all data, which was through our domain objects. Because of this we were able to tap into object updates and determine in the base implementation whether that object had data (direct or related) in the cache and invalidate/remove it if needed. This was a big win for us, allowing us to keep the complexity of dealing with the cache contained to a few classes and out of the minds of most of the business logic developers.

Store Cache Metadata

For each item in the cache we set up a custom system of storing a set of metadata alongside the cached value. This metadata contains key system information such as related domain objects, creation times, and other keys to aid in cross-item or aggregate lookups in the cache.

Regions per customer

Our system can be multi-tenant, so we needed a way to distinguish the data of one customer from another. This can be accomplished a few ways in AppFabric:

  • Part of the cache key
  • Separate region per tenant
  • Separate cache per tenant

We decided against identifying the tenant in the cache key, as this would not allow us to administer a ‘tenant’ without affecting others. We chose a separate region per tenant over a separate cache per tenant because of the overhead of maintaining each cache, which we viewed as much heavier weight than a region. Regions can also be administered programmatically, which is not possible at the cache level.

The only downside with storing a tenant per region is that currently AppFabric will not distribute a region across multiple servers, as it does with a separate cache. Right now this serves our needs, but we may re-evaluate this going forward.

Administrative cache

We envision a need for our operations staff to be able to administer the cache and troubleshoot issues. To that end we built a mechanism into the cache access layer that gave us visibility into what regions were active and created at any time. Ironically, Microsoft did not include a programmatic way to retrieve a list of named regions from the cache.

We came up with a simple approach of creating an administrative cache area, separate from the main cache. Each time a region is added or removed we maintain a key in the administrative region with that region’s information. This allows us to provide a real-time view of what regions are active and then query the regions for the contained cache items if needed.
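A sketch of that bookkeeping, again with invented names:

// Track region lifetimes in a separate administrative area, since AppFabric
// offers no programmatic way to enumerate named regions.
private void OnRegionCreated(string regionName)
{
    var info = new RegionInfo { Name = regionName, CreatedUtc = DateTime.UtcNow };
    _adminCache.Put("region:" + regionName, info,
        new CacheItemMetadata { CreatedUtc = info.CreatedUtc });     // _adminCache is the hypothetical provider above
}

private void OnRegionRemoved(string regionName)
{
    _adminCache.Remove("region:" + regionName);
}

public class RegionInfo
{
    public string Name { get; set; }
    public DateTime CreatedUtc { get; set; }
}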

Lessons Learned

Lack of programmatic administrative capabilities stinks

I’m not sure what the reason behind some of the gaps in the AppFabric API is, but I must say that the lack of a full-featured administrative API available from code stinks. Most of the administrative functions are only available via PowerShell. I can understand that this is likely targeted at the maintainers of the system, but quite honestly, how is someone supposed to create an application that administers the cache without resorting to PowerShell?

Local Cache with Notification based expiration – Not good enough

One of the features we were hoping to leverage with AppFabric was the local cache option. This allows a local cache to be constructed where the client is accessing the cache (i.e., the web server in our case). If the main cache gets updated, the local cache gets invalidated. This seemed like a great way to boost performance while still maintaining our support for web-farm operation.

Unfortunately the design of this feature doesn’t work quite the way we expected. The local cache option requires polling the main cache to find out if its items are invalid. This would allow too much latency in the system (or too much network traffic if the polling interval were decreased), so we had to nix it. Too bad – if only the implementers had used some type of event-driven system so the latency was lower, that would have been great.

Your assumptions will be wrong – Test, Test, Test.

There is absolutely no way to know ahead of time whether a particular strategy will work without empirical testing.

Put some data in the database that represents your usage and test the scenarios – direct to database and data from cache. Make sure that your data access patterns return timings that are favorable to using the cache, otherwise don’t use it.

This was especially true in the system I worked on, where we were storing a complete set of an entity type in the cache, but when clients required the data they only ever needed a subset. We had assumed that this would likely be a poor candidate for caching in its entirety and were considering caching the subset variations. The tests we ran refuted this assumption and showed that de-serializing the entire list of entities and then pruning them in memory with LINQ was fairly performant, and spared us the burden of having to manage subset pieces of the entity type in the cache.

 

References

Domain Models

http://msdn.microsoft.com/en-us/magazine/ee236415.aspx  (Employing the Domain Model Pattern)

http://martinfowler.com/eaaCatalog/domainModel.html  (P of EAA: Domain Model)

Windows Server AppFabric

http://msdn.microsoft.com/en-us/windowsserver/ee695849

Caching Patterns

http://www.alachisoft.com/resources/articles/domain-objects-caching-pattern.html  (Distributed Caching and Domain Objects Caching Pattern for .NET)

http://www.ibm.com/developerworks/webservices/library/ws-soa-cachemed/  (Cache mediation pattern specification: an overview)

http://ljs.academicdirect.org/A08/61_76.htm  (Caching Patterns and Implementation)

Caching in a Service Oriented Architecture (SOA) – Part 1

This will be a two part post on some of my thoughts on designing a caching system for a service oriented architecture, and some of the results from a series of prototypes done to flesh out the design.

Part 1 – Overview, use and potential approaches

Part 2 – Prototype designs, results and lessons learned

When designing a service oriented architecture (SOA) that is expected to see high volumes of traffic one of the potential architectural components you may be looking at is a caching system. In a high traffic system a cache can be essential in increasing performance and enabling the scalability of the overall system.

Why use a Caching system?

Caching systems inarguably add another layer of complexity, as well as another potential point of failure, to a system’s architecture, so their use should be carefully weighed against the benefits you anticipate.

There are usually two main reasons for employing a caching system:

  1. Offload Database work
  2. Drive down response times

Offloading database work is essentially a way of enabling a pseudo-scaling of the database tier, especially in cases where the database platform doesn’t inherently allow a scaling out. By performing more of the work of retrieving data without involving the database, we are effectively scaling out the capacity of the tier.

The requirement to drive down response times, or keep response times stable as the system grows, is another common reason to employ a caching system. Retrieving data from a cache held in memory is orders of magnitude faster than retrieving it from the database tier in most circumstances, especially if the database system itself does not employ internal caching to keep the needed data in memory. If the database system determines it needs to read the data from disk, the cache fetch will seem like a Maserati compared to a Model T.

Where should you use caching?

Caching should likely be considered at all levels of the system – client tier, web tier and services tier. At each tier of your architecture the needs of the application and the type of data in use will dictate what gets cached and the strategy employed.

The goal of caching within a SOA system that is expected to scale is to keep the cached information at the layer that makes the most sense from a use and manageability standpoint.

Client Tier

At the client tier, data should be cached to avoid round trips back to the server when possible. This is likely one of the most expensive calls that can be made in a system, as the network traversed is likely a good distance from the web application or services tier. The best approach to performance here is to not incur the overhead of the call at all if possible.

Thick clients have long used local caching strategies to hold onto data as long as possible. After a database call a thick client would keep the set of data retrieved in memory between user actions and screen changes.

Browser based clients have had a more difficult time caching data due to the stateless nature of the web. Some approaches here have been to store data in the page itself. This can lead to page size bloat and slower response times in a typical scenario such as ASP.NET where the “stored data” is round-tripped with the page. With the rising popularity of AJAX style programming and partial page refreshing, the browser is becoming a more intelligent presentation layer compared to the typical post-back or complete page refresh model.

Web Application Tier

The web application tier has a role to play in the caching strategy as well. Since in a proper N-tier system the web application tier is responsible for serving resources (pages, images, etc.), its caching strategy should primarily target these object types. Deciding how long a page can be served from cache versus being regenerated should be one of the primary focuses of caching at this tier.

While tempting to cache data at the web application tier, this should be avoided as there are several problems that could arise from this in a scaled and load balanced environment, such as data only being available to certain web servers, or the distributed maintenance of a cache from multiple web servers.

Services Tier

Caching at the services tier should target “data” since this is the single point of access to data within a SOA based system. As such it makes sense to control the population, refreshing and invalidation of a data cache from this tier. The services tier lends itself particularly well to the caching of data, as its primary purpose is to act as the facade that serves all requests to retrieve or update data.

Where it retrieves this data from is of no concern to the caller other than from the standpoint that the data is correct and accurate. Since employing a cache at this tier is transparent to the caller, offloads work from the database and is more manageable from the standpoint of trapping changes that require updating the cache, it makes the most sense to cache data at this tier.

What types of data should you cache?

The data you decide to cache should ultimately provide an increase in performance without dramatically increasing the complexity of the system. Certain types or classes of data make more sense to cache than others, listed here in order of priority:

  • Data that changes infrequently
  • Expensive queries
  • Data that is accessed frequently

Some thought should also be given to the dependencies between cached object types. I cover this more in the considerations section, but a high number of dependencies between objects may be a factor in determining whether you cache these object types.

Data that changes infrequently

Data items that are fairly static in nature make ideal candidates for caching. The benefit here is that there is a low overhead to managing this type of data in a cache as updates to the data are infrequent requiring less of a need to clear items from the cache and/or refresh them.

An example of data that changes infrequently could be policy data that drives certain actions within the application.

Expensive Queries

Queries that are expensive in either time or resource usage to run are another ideal candidate for caching. Caching this type of data will provide the aforementioned “scaling” increase at the data tier since the underlying database is freed from running the majority of these queries, allowing it to run other queries which in effect provides the same benefit as scaling the database system.

Examples of an expensive query might be a query that aggregates several pieces of data together or performs some level of trending, along the lines of something you may see in a dashboard style view.

Data that is accessed frequently

Data that is accessed frequently also makes a nice candidate for caching since this type of data – even if cheap to execute and return – provides a constant load on the underlying system. Being able to effectively take this constant load off of the database and move it to the cache can yield significant performance improvements.

So in general there are a couple of factors that drive the cost/benefit analysis as to what should be cached: cost of data and frequency of change:

Cost of Data    Frequency of Change    Benefit of Caching
------------    -------------------    ------------------
High            High                   Low*
High            Low                    High
Low             High                   Best not to cache
Low             Low                    Moderate

* While the cost of the data is very high to execute and retrieve, the benefits of caching are reduced by the frequency of change since the frequent changes will lead to high cache turnover, frequent refreshing/re-querying of data and an overall higher level of data management for this item in the cache.

Strategies

Two approaches we designed and tested in a prototype were the “Write Through” and “Data Event Driven” approaches. Both approaches have advantages and disadvantages associated with them and should be carefully considered in the context of how the system is used and the way in which data is interacted with.

Write Through

The write-through cache is a cache implementation where the cache is updated during the operation that is updating a piece of data, followed by a subsequent update of the data in the underlying data store. In essence this is a cache-first, data-store-second type of model.

This type of model is most effective in a system where there is a well-defined set of interfaces for interacting with the data, and all areas of the system use them (i.e., a single source for all data interactions). A single source of updates provides a more manageable point from which to maintain the data in the cache, whereby a single component or code path is responsible for the update or refresh of the cache for a given operation.

Another consideration in utilizing this type of design is how concurrent updates occur in the system. There exists the possibility of two distinct operations updating the same piece of data, both of which are attempting to first change the data in the cache and then in the data store. Most typical data stores provide a mechanism for handling concurrent updates, usually through locking. This may or may not be the case in your cache provider.

Since the first update occurring in the system will be to the cache (not the data store), it is essential that there be some mechanism to effectively handle concurrent attempts to update data. As mentioned, some cache implementations provide the ability to lock a cache item while it is being updated, effectively reproducing the same behavior as the data store.
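Here’s a minimal sketch of a write-through update. The cache wrapper, repository and per-key lock below are illustrative stand-ins, not real APIs:

// Write-through: the cache is updated first, then the data store.
// _cache, _repository and GetKeyLock are illustrative stand-ins.
public void UpdateCustomer(Customer customer)
{
    string key = "customer:" + customer.Id;

    // Stand-in for a cache-provided lock/versioning mechanism; a plain CLR lock
    // only protects a single process and would not be enough in a web farm.
    lock (GetKeyLock(key))
    {
        _cache.Put(key, customer);          // cache first...
        _repository.SaveCustomer(customer); // ...data store second
    }
}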

Advantages
  • Easy to implement given a single point of interaction to data within the system
Disadvantages
  • May need to implement concurrency handling for updates to the cache
  • Calls that directly modify the database or do not use the “single point of interaction for data” can lead to stale cache data items

Data Event Driven

The data-driven model is a cache implementation where the signal for a change to the cache comes from a change to the underlying data in the data store. Basically the data “signals” that it has changed, and the cache is updated based on this.

There are two ways to handle a data driven design – the push or pull method.

In the pull method, there would exist a way to actively monitor the underlying data to detect changes and then react to those changes by updating the cache. An example of this would be using something like the SqlDependency feature in Microsoft ADO.NET. Typically you set up a query to watch some set of data, and when the results of that query change you are signaled and can react to the change. In my opinion, and through testing, this does not scale well to larger systems, since the number of items that need to be set up and then polled – utilizing system resources – can grow large. For simpler systems, or those that do not have a requirement to cache a large number of different data types, this may be appropriate.
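For reference, a pull-method registration using SqlDependency might look roughly like this; the query, the wrapper method and the cache call are illustrative:

// Sketch of the pull method using ADO.NET SqlDependency (requires Service Broker
// to be enabled and a query that follows the query-notification rules:
// explicit column list, two-part table names, etc.).
void RegisterCustomerDependency(string connectionString)
{
    SqlDependency.Start(connectionString);

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("SELECT CustomerId, Name FROM dbo.Customer", connection))
    {
        var dependency = new SqlDependency(command);
        dependency.OnChange += (sender, e) =>
        {
            // Notifications fire only once, so re-register here, then
            // invalidate or refresh the related cache entries.
            _cache.Remove("customers:all");    // _cache is an illustrative cache wrapper
        };

        connection.Open();
        using (command.ExecuteReader()) { }    // the query must execute to register the subscription
    }
}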

In the push method, the data store itself would have a mechanism to watch a set of data for changes and signal that the data has changed to an interested party. Our prototype used a combination of triggers and SQL CLR code to detect when changes to a set of data we were interested in occurred and then raise a signal to the cache implementation to refresh the associated data. The benefit of this method was the low overhead associated with tracking the changes to the data versus polling. One of the disadvantages of this approach is the maintenance of the triggers in the system. As the need to track more and more data items grows, the number and complexity of the triggers needed to detect and deal with changes also grows.

A common disadvantage to both flavors of the Data Event Driven design is that the rollup from data in a normalized database to something that is typically stored in a cache – such as an aggregated type of data object – was exceptionally difficult. Translating what data should be watched for a given object such as a Customer, which might span three normalized tables in the data store was a chore. In our prototype it meant a minimum of three triggers – one for each table – to watch a portion of the data for a Customer, and then the ability to translate a change detected by any of those three triggers into a ‘Customer’ object that was held by the cache.

The obvious advantage and appeal of this type of system is that any and all changes to the data can be caught and propagated to the cache layer for updates as needed. There would be no stale data in the cache if someone wrote directly to the database or skirted a centralized update control point.

Advantages
  • All changes to the data are accounted for
  • Reactive – low to no resource usage for monitoring changes
Disadvantages
  • Hard to implement.
  • Translation from physical to logical can be tricky

Considerations

Dependencies between cache objects

One of the more challenging aspects of a cache system implementation is how you define and manage dependencies between data items within the cache. This should be considered in the overall design of how data is stored, with special attention given to how granular or coarse the items you are putting into the cache are. You should shy away from an implementation where you are storing very discrete data items that cannot stand on their own as having business value. Data items that are only useful when aggregated together may not make the best candidates for being cached. After all, joining items together to return data with business value is the purview of the database and not necessarily that of the cache.

Cache Refresh Strategy

There are two approaches to consider when a piece of data held in the cache becomes invalid or stale – actively refresh the data by fetching and replacing it, or remove the data from the cache and let the next request fetch and cache it.

In considering these approaches, one factor that might drive the decision of one over another is the overall expected volume of traffic. If the expected system volume is very high, the “remove the cache item and let the next requestor fetch the data” strategy could result in severe spikes in utilization at the data tier, as ‘N’ requestors all looking for the same piece of data that was removed from the cache go and request it from the database. This can possibly be mitigated if the cache supports locking on a cache item key that isn’t in the cache while the data is being fetched, as sketched below.
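The sketch below shows what that lock-on-miss mitigation might look like. The lock shown is a simple in-process lock; a distributed cache would need its own locking primitive, and the cache and repository calls are illustrative stand-ins:

// Guarded cache-aside: only one caller reloads an expired/removed item from the
// database while the rest wait, avoiding a spike of identical queries.
private static readonly object _missLock = new object();

public Customer GetCustomerGuarded(int id)
{
    string key = "customer:" + id;

    var cached = _cache.Get(key) as Customer;      // _cache and _repository are illustrative
    if (cached != null)
        return cached;

    lock (_missLock)
    {
        // Double-check after acquiring the lock: another request may have
        // already repopulated the cache while we were waiting.
        cached = _cache.Get(key) as Customer;
        if (cached != null)
            return cached;

        var customer = _repository.LoadCustomer(id);
        _cache.Put(key, customer);
        return customer;
    }
}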

Single Server Cache vs. Distributed Cache

Some thought should be given to whether a single cache server and the resources available to it would be sufficient for your implementation. If high availability of the cache is a requirement, you may need to consider a distributed cache implementation, which provides for storing multiple copies of your data within the cache for failover and high availability purposes.

I’d be interested in hearing any feedback on the ideas in this article, or learnings you may have from implementing similar large-scale caching systems in support of a SOA-based system.

Error returning a DataTable from a WCF service call

I’m not debating the merits of whether it’s appropriate or sensible to return a DataTable as a response from a web service, but if you receive an error like I did, make sure to check the following:

1. The DataTable needs to be named.

    Before (causes an error):

public DataTable ExecuteDataTable()
{
    return new DataTable();
}

    After (no error):

public DataTable ExecuteDataTable()
{
    return new DataTable("Test");
}

 

2. Ensure that the packet sizes configured for WCF are large enough to accommodate a serialized DataTable with the data content you have in it. Serializing a DataTable to XML results in some fairly large XML documents and can easily surpass the packet limits in the default WCF configuration.
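As a rough example of the second point, the relevant quotas can be raised in code (or through the equivalent binding configuration). The service contract name, address and sizes below are placeholders:

// Raise the WCF message and reader quotas so a large serialized DataTable can round-trip.
// The 10 MB values are arbitrary; size them to your actual payloads.
var binding = new BasicHttpBinding
{
    MaxReceivedMessageSize = 10 * 1024 * 1024,
    MaxBufferSize = 10 * 1024 * 1024
};
binding.ReaderQuotas.MaxArrayLength = 10 * 1024 * 1024;
binding.ReaderQuotas.MaxStringContentLength = 10 * 1024 * 1024;

// Hypothetical service contract and address.
var factory = new ChannelFactory<IDataService>(binding,
    new EndpointAddress("http://localhost/DataService.svc"));
var client = factory.CreateChannel();
DataTable table = client.ExecuteDataTable();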

Nuggets from Interviewing – Tips for Developers

Having participated in numerous interviews for various companies filling positions, I’ve seen some really great candidates come through, and some not so great. There have also been those candidates that appeared to be great, but made some classic mistakes.

With a good sampling of interviews under my belt I thought it would be helpful to list a few observations in the form of tips – mostly not to do – for interviewing. I hesitated to write this as a lot of what I’m listing feels like it should be obvious to most people, but from what I’ve seen, that’s not always the case.

Observations

  1. Know what’s on your resume

A big red flag goes up when a candidate cannot speak about technologies or projects they’ve listed on their resume. If you list it, make sure you can talk about it. Better yet, list your role on the project. What’s that? You don’t want to list that you had only the barest participation or a limited role on the project? That’s probably a good indication it doesn’t belong on your resume.

Interviewers aren’t impressed or looking for buzzwords for every technology out there. We want to see projects you had a significant role or involvement in and we want to know in-depth what you did. If you didn’t have a hand in something other than having been on a team that performed some work in that area, or you didn’t actually do any of the work – don’t list it!

I’ve had interviews where I’ve paraphrased a question based on a statement on a candidate’s resume, asking them if they’ve done this before, and the answer was “no”. Seriously? If you list it, make sure you have done it.

2. Read the job description and ensure you meet the basic qualifications

Not knowing exactly what technologies are being asked for in a position – both required and preferred/desired – leads us to believe you are mass emailing out resumes and applying to anything you can find. Normally a Human Resources or Recruiting department will screen candidates out if they don’t meet the job qualifications, but they shouldn’t have to – don’t apply for positions you don’t meet required qualifications for.

It’s disheartening when we hear someone say, “I didn’t know you were looking for that skill”, or “I’m not as strong in X” when the job posting clearly lists it in the context of “Proficient in X”. If you don’t know what “proficient” or “demonstrated ability” means, look it up.

Make sure you meet all of the basic requirements of the position before applying.

3. Don’t pretend to have experience in an area

This ties in with #1 above – if you list it or say you know something, make sure you know it.

We’ve had candidates list newer technologies like WPF or WCF on their resume, which were requirements for some of the positions we’ve had, and when asked to tell us what they’ve done with either technology, the response is along the lines of “I’ve downloaded some of the samples and played around with them”. This is fine if the position only asked for familiarity, but isn’t normally going to suffice if it’s listed as “Demonstrable experience” or “Proficient in”.

It’s perfectly acceptable to talk about the fact that in your current position you haven’t yet used a technology and that you are immersing yourself in it on your own. This is great and shows that you can learn on your own and attempt to keep abreast of new technologies, but if the technology is a requirement of the position sample work will more than likely not be enough.

More egregious is the candidate who says they’ve spent some number of years working with a technology or had a substantial project they’ve worked on using a technology and then can’t answer basic questions about it.

Think about it this way – my job is to determine what your skill level is in any of the technologies that we might be using for our position. Given that objective, any attempt to pretend to know a technology will be exposed during the interview. It’s designed that way.

Just don’t do it. It’s embarrassing for you and us.

4. Know your skill level in different technologies

This is one of those self-reflection type of items. In preparation for the interviewing process you should have taken some time to assess your skill level in different areas. We all have stronger and weaker areas. Know what yours are – it helps.

In the interview process I will usually ask a candidate to rate themselves in a few of the skills we are focusing on in the interview. Scale of 1 to 10, 10 being an expert.

The answer to this question gives me some insight into (a) whether how you rated yourself is consistent with what I see on your resume, (b) whether it is aligned with the requirements of the position, and (c) what level of questions I should ask you.

(A) Rating yourself inconsistently with what’s on your resume leads me to ask you more
about your past experiences.

If you rated yourself lower than your presented experience we may need to explore past project work. What was the role on the project? Did you use the skill on that project? To what level of depth?

If the rating is much higher than on the resume I want to know why. Was the role you had more substantial than you presented? Some resumes have so many projects listed that all the explanations are brief. This may indicate that your resume isn’t presenting you as well as it could for the position.

(B) If the position requirement lists “Proficient in writing SQL queries” and you rate yourself a 2 in SQL, we have a problem. In either case – rating yourself higher or lower than required for the position – means we need to explore further.

(C) Knowing where you think you’re at experience-wise with a technology gives me a good sense of where to start asking questions. Someone rating themselves a 5 or below will get more basic questions to start with, whereas someone rating themselves higher will start with more difficult questions. We will still circle back to some basics to ensure a good foundation but expect to be asked more detailed questions as to why/how things work.

We aren’t necessarily looking for syntax in these questions as opposed to knowledge of concepts and how things work, tradeoffs, etc. As a developer myself there are a ton of times I can’t remember the syntax for something and need to look it up or use Intellisense to help.

Reflect on your skills and “where you’re at”.

5. Don’t be afraid to say you don’t know

One common mistake during interviews is the failure to say “I don’t know”. It is perfectly valid to tell an interviewer that you aren’t familiar with a concept or technology. Constant change is a fact in our industry. Given the breadth of technologies we employ and the depth of each, there will likely be corners of a technology you’ve used that you are unfamiliar with, or even a full technology stack for that matter.

If the case occurs that the interview leads down one of these less traveled paths, it’s always preferable to an interviewer to hear “I don’t know”, rather than have a candidate try to guess, make things up or tell the interviewer what you think they want to hear. The former leaves the impression of honesty, while the latter leads to a perception of lack of knowledge or worse – of deception.

Most interviewers who attempt to be fair and find good candidates won’t immediately count an “I don’t know” against a candidate unless it’s in a core area that is going to be required of the position. Even then I try to weigh whether the item not understood is a conceptual issue, or something that, as a practitioner, you would normally use reference material (i.e., Google, books, etc.) to accomplish.

When you don’t know, be honest. Don’t make things up, guess or try to give the interviewer what they want to hear.

6. Don’t talk badly about past people or positions

An obviously bad idea is to trash talk people you’ve worked with or companies that have employed you. This immediately leads an interviewer to believe that you are the type of person to tear people down rather than help lift the entire group.

I honestly believe most people know they shouldn’t talk badly about past employers and coworkers in an interview, but for some reason there is a subset of people who, once they get in front of you, can’t help but tear people down. I’m not sure if it’s due to nervousness, the question asked (e.g., tell me about a challenge…), or if they just really had a terrible experience with an employer.

But think about it from this perspective. You are interviewing for a position with a new company.  If for some reason that company makes you an offer and further down the road things don’t work out, the interviewer is likely getting a good sense of how you are going to describe them or their company on your next interview. I can guarantee that the thought going through the interviewer’s mind is – PASS.

Keep it positive. Talk about challenges, not how much you can’t stand a company or coworker.

7. Don’t make assumptions

This is a very broad statement, but it’s applicable because as soon as you start making assumptions, there exists the possibility of error. Making assumptions can take many forms. Perhaps you think that for a certain line of questioning the interviewer is looking for a certain answer. This might apply more often to a behavioral-style interview question, where you assume that the interviewer is trying to probe for a certain set of traits or behaviors from you. These types of assumptions can quickly lead you astray if you make the wrong one.

If in doubt it’s always better to ask for clarification. The interviewer won’t mind paraphrasing or explaining something a bit more clearly, and as a bonus you will appear more engaged than someone who makes an assumption.

Another assumption I’ve seen made is that of a candidate believing they are a perfect fit for the position and therefore not doing any “selling” of themselves. A candidate who believes that, based on the experience listed on their resume or positions previously held, the interviewer will “obviously see” that they are a perfect fit is assuming a few things.

The first thing that they assume is that the interviewer is valuing something on their resume or in their background the same way that they do. Maybe – maybe not.

Another assumption being made is that, because of this “obviousness”, there is no need to sell themselves or their skills to the interviewer. Remember that the candidate’s main objective is to show the interviewer that they are qualified for the position and how they can add value to the organization. I can’t imagine failing worse than being qualified, assuming the interviewer sees this, and therefore not closing the deal by explaining how you see yourself fitting into the position. Not to mention it gives a perception of overconfidence on the candidate’s part.

Assume nothing.

8. Be prepared to demonstrate your knowledge

There are several interviewing techniques that are employed, but when interviewing I usually insist on having the candidate demonstrate their ability in one way or another.

This can mean reading and discussing some code. Or it can be a whiteboard design problem that you are asked to work through. It can even be writing some code to demonstrate your ability to design something or solve a specific problem.

Now, unless you’ve really done your homework on a company and their hiring practices it may be difficult to determine what to be prepared for. The point is that if you are interviewing for a development position you should be able to demonstrate that you can develop something.

Many times the reason behind the exercise is not to determine if you read or write some code correctly or designed a system to a given requirement. Rather it’s usually tied to how you get to the result, and not the correctness of the result. It’s more helpful for me to see how you think – do you ask questions, do you make assumptions, what tradeoffs are being considered, etc – than whether you know how to solve a single specific problem.

Rather than tell me, show me.

9. Do share your non-work activities that are directly related to the position

Many candidates tend to try and avoid talking about non-work activities. In most cases I would agree. As an interviewer I don’t particularly care that you enjoy playing sports or sailing.

BUT, if the activity is related to the position you are applying for or the technologies being used, by all means let the interviewer know. In fact, it’s one of those “key indicators” that can show some additional level of motivation or passion from the candidate. For a developer this might range from the Exchange server you run at home to learn with, to the game you wrote, to the books/clubs/organizations/user groups you read and/or are a part of.

So if it relates to the job or technology, bring it up. If it doesn’t, don’t.

Good luck and I hope these tips help.

Related: Nuggets from Interviewing – Interviewer Techniques

Binding DataGrid columns to DataContext items

The WPF DataGrid is a fantastic component. It’s flexible and can serve in a lot of different scenarios. Recently I was using the DataGrid in a project and had the need to dynamically hide certain columns via a users selection in a configuration section.

The idea is the user would check a box to hide a corresponding column represented in the DataGrid and the column would not appear.

[Screenshots: the configuration checkbox and the DataGrid with the corresponding column hidden]

I implemented the XAML as below, binding the column’s Visibility to the checkbox’s IsChecked value with the appropriate converter. I was surprised to find that this didn’t work at all. Toggling the checkbox did nothing.

[Screenshot: XAML binding the DataGridTextColumn’s Visibility to the CheckBox’s IsChecked value]

Looking in the IDE’s Output window I noticed the following binding error message:

System.Windows.Data Error: 2 : Cannot find governing FrameworkElement or FrameworkContentElement for target element. BindingExpression:Path=IsChecked; DataItem=null; target element is ‘DataGridTextColumn’ (HashCode=19699911); target property is ‘Visibility’ (type ‘Visibility’)

After some digging around, I found that the DataGrid columns aren’t actually in the Visual Tree of the window as you can see from the screenshot below.

[Screenshot: the Visual Tree, showing that the DataGrid columns do not appear in it]

This is the source of our error, as the DataGridTextColumn we are trying to bind isn’t participating in the Visual Tree. To work around this we can handle the DataGrid’s DataContextChanged event and “forward” the DataContext to the individual columns.

This would allow the individual columns to be able to be bound to items within the DataContext. Below is a quick and dirty implementation of how to perform this forwarding.

A more elegant way to do this would be to handle DataContext changes on all DataGrids using a property metadata override.

Here’s the code behind to handle the DataContextChanged event and forward the context to the columns. Notice that I’ve also implemented a property with change notification to be used by the checkbox to store its “IsChecked” value so that it can be retrieved via the DataContext.

[Screenshot: code-behind handling DataContextChanged and forwarding the context to the columns, plus the change-notifying property backing the checkbox]
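One possible way to do the forwarding in code-behind, not necessarily identical to what the screenshot shows, is to re-point the column bindings at the new DataContext; the view-model property name and converter are illustrative:

// Forward the DataGrid's DataContext to its columns, which are not part of the
// visual tree and therefore never inherit it on their own.
private void DataGrid_DataContextChanged(object sender, DependencyPropertyChangedEventArgs e)
{
    var grid = (DataGrid)sender;
    foreach (DataGridColumn column in grid.Columns)
    {
        // Rebind the column's Visibility with an explicit Source, since there is
        // no inherited DataContext for the binding engine to resolve the path against.
        // Swap in whatever converter matches your checked-means-hidden semantics.
        BindingOperations.SetBinding(column, DataGridColumn.VisibilityProperty,
            new Binding("IsFirstColumnVisible")            // hypothetical view-model property
            {
                Source = e.NewValue,
                Converter = new BooleanToVisibilityConverter()
            });
    }
}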

And here’s the modified XAML showing the Checkbox now bound to the new property and the DataGrid’s first column bound to the same value using the DataContext.

[Screenshot: the modified XAML, with the CheckBox bound to the new property and the first DataGrid column bound to the same value via the DataContext]

Because the DataGridTextColumn is not part of the visual tree I don’t believe there is any way to perform direct element bindings.

If someone figures this out, drop me a note.

Nuggets from Interviewing – Interviewer Techniques

Interviewing candidates for an open job position is hard. Period.

No matter what process or methodology you use, in the end it comes down to trying to summarize a person’s skills, personality, work ethic and team fit within a relatively small allotted time. Not an easy task.

As an interviewee I’ve participated in many different interview styles, ranging from a brief single discussion with the hiring manager to a grueling eight-hour interview with standardized testing, five one-on-one interviews and a case study.

What I’ve found is that, along with most things in life, a style somewhere in between both extremes works best. Keeping in mind the damage hiring the “wrong” person can cause – lower productivity, morale problems, inter-team social issues, etc. – we can immediately see that any extra effort needed upfront to set up a process that works will yield greater results in finding the “right people” and pay us back in the long run.

The Interview Process

There are many ingredients you can choose from when thinking about what to include in your interview process. The list below contains some of the techniques I use in interviews I participate in as well as some that others might be familiar with.

  • One on one interview

An individual discussion between the candidate and a key team member. To be most effective this should be someone at least technical enough to distinguish between someone who knows the technology or area being discussed and someone who is good at talking around an area.

My opinion is that the interview must be with at least one individual that would be considered a “peer” or have a close working relationship with whoever is hired into the position.

Pros:

  – Lower stress than a group interview

  – Can get multiple reads on the same areas from a candidate

Cons:

  – Time can be a factor if there are several individuals on the team you want the candidate 
    to speak with

  – Need to reconcile each interviewer’s perception of the candidate, which is more
    difficult when those perceptions conflict.

  • Group interview

The group interview is one of those techniques that works well when done correctly and is a complete disaster when it isn’t. The idea is to have several key team members in the interview at once, all taking turns interacting with the candidate and asking questions.

Pros:

  – Allows multiple team members to evaluate the same answers given by the candidate

  – Can leverage the experience and specialties of many individuals

  – Allows those not currently interacting with the candidate to focus more on the responses
    from the candidate

  – Less chance of difficult reconciliations since all interviewers hear the same response to
    questions

Cons:

  – Higher stress environment for the candidate

  – Can feel like a “firing squad” or inquisition if not done right

  • Standardized testing

Standardized testing usually takes the form of a non-technical assessment of the candidate, focusing on behavioral, reasoning and logic questions.

Pros:

  – Easy to compare results across candidates

  – Useful if your team believes there is a direct correlation between the success in the job
    position and the generalized knowledge being tested

Cons:

  – Some people aren’t good at taking tests, especially under pressure, which may result
    in the loss of qualified candidates

  • Written or Verbal technical assessments

Written or verbal quizzes on the applicable technology areas your team uses. Can cover a range of topics and skills levels from basic knowledge to advanced concepts.

This should not devolve into a syntax assessment, as most programmers rely heavily on documentation, IntelliSense and reference material for syntax. That said, the more familiar a candidate is with the syntax, the more likely they are to have used the technology recently and frequently.

Pros:

  – Easy to determine whether the candidate has used the technology and has a firm
    understanding of the underpinnings

Cons:

  – None

  • Reviewing of code

The intent of having the candidate read code is to determine a few things:

      – Do they understand the programming language being used?

      – Can they understand someone else’s code? A good portion of your product is
        likely written already. If the candidate can’t read and understand this existing code,
        that may be a problem. Most companies can ill afford the employee who needs to
        rewrite / refactor every area they work in before they can properly understand it.

      – Can they intelligently talk about a piece of code and its characteristics –
        usage patterns, good points, things to be improved, etc.?

Pros:

  – Good read on the candidate’s ability to understand code – something not obtained
    through verbal/written technical assessments.

Cons:

  – Stressful for the candidate as they will feel pressured and “on-stage”

  • Spot the bug / Troubleshooting Exercises

This technique usually involves having the candidate review a piece of code or application that has a bug in it. This exercise can be tailored to be very light or more complex.

A light version of this would be to have the candidate review a set of short code blocks with issues or bugs in them. These could range from mistakes that highlight a misunderstanding of a core concept to more esoteric problems. They should not include syntax errors that a compiler would easily catch, since most developers rely on the compiler for that.

For a more complex exercise, the candidate could review the company’s actual product with bugs deliberately introduced into the code. Since the code needs to compile and run for the candidate to work with it, these bugs would not be syntax related.

Pros:

  – You get the candidate doing exactly what you are hiring them for – working with code.

  – Provides good insight into a candidate’s ability to troubleshoot, identify and correct
    code. What a concept!

Cons:

  – Candidates might feel stressed having to work with code in the interview. This can
    be alleviated by leaving the room, or by interacting with the candidate as they work –
    answering questions and guiding them (although not directly to the answer!).

  – Depending on the bug introduced and the subtleties / complexities of the scenario, the
    candidate might not be able to identify it. Make sure the bug is relatively apparent, or
    better yet include multiple bugs ranging from easy to more complex.

  • Design Problems

The technique referred to here is programming design problems, not esoteric puzzles. It usually involves describing a hypothetical system or problem for the candidate to design, giving them just barely enough details to start.

The exercise usually involves the candidate coding the problem on a provided computer.

The premise behind this technique is that you want to be able to gauge how well a candidate can take a set of instructions and break them down into a well designed program. The main takeaway from this technique is the thought process and decisions that the candidate makes along the way and not the final outcome.

In fact we usually don’t have the candidate finish whatever problem was chosen. Observe things like:

     – Did the candidate ask clarifying questions on the requirements/premise of the problem

     – How did the candidate approach the problem – dive right in coding, draw it out, ask
       questions

     – Did the candidate consider the appropriate tradeoffs in design – speed, performance,
       usability

Remember that the focus is not on the end result but on how the candidate approaches the problem and works through it.

Pros:

  – Provides insight into how a candidate thinks about problems and design considerations

  – Can be used as a gauge of experience, familiarity with tools, language/project choices, etc

Cons:

  – Risk of picking an area that is unfamiliar to the candidate. Make sure the problem is a
    fairly well-known concept so it doesn’t require an explanation beyond the intended requirements.

  – More stressful than direct interview discussions, since the candidate will feel like they are
    “on-stage”. Be sure to explain that the intent of the exercise is to see their approach to the
    problem.

  • “Try before you buy”

Pairing the candidate up with a team member for a period of time on a current project and letting them code together. This tends to be most applicable when the team follows an Agile methodology.

Pros:

  – Useful to see the candidate in action. Coding style, thought processes, etc.

  – Able to get a rough gauge on team fit, personality.

Cons:

  – Time. This requires a long time commitment from the candidate

  – May not work well with a candidate who is not familiar with the Agile methodology. Most
    non-Agile shops don’t have programmers pairing together.

  – May be difficult for the candidate to participate and demonstrate anything useful due to a
    lack of knowledge of the feature, code, etc.

 

In the end, what you choose to include in your process must be based on what works for your team and business culture, and on the amount of time you can invest in each interview.

Some of the techniques I find most useful in the interviews I participate in are the “hands-on” ones. If I am hiring a developer, then I want someone who can read and write code. If I’m in the interview, you will need to convince me you can code.

Let me hear from you if there are techniques that work better than others for your team, or if you use any I haven’t listed here.

Related: Nuggets from Interviewing – Tips for Developers and just about everyone