V3 Documentation

Distributed Processing

Introduction

Distributed processing is a feature introduced in NCrunch Version 2. It allows you to spread the load of executing tests across one or more computers (referred to as a 'Grid'), giving the NCrunch engine additional capacity, improving response times, and opening the door to new cross-platform and cross-configuration testing opportunities.

It's possible to configure NCrunch to completely offload all build and test work from the computer running Visual Studio (the 'Client'), freeing up valuable resources for other development tasks and reducing UI interference.

Computers that make up the grid (referred to as 'Nodes') can be in any location or domain, provided they are accessible from the client via TCP connection. After a connection has been established, NCrunch will automatically replicate and synchronise your source code across the nodes. The nodes are then integrated into the processing queue where they are cleanly utilised by the client to execute work on its behalf.

It's possible to share grid nodes between multiple clients, allowing a development team to pool grid resources for greater flexibility.

Source code is stored locally on each node between connections (up to a configurable size limit), making it very easy to jump off and back on the grid without needing to re-transfer large amounts of data. Data exchanged across the grid is also compressed in an effort to reduce bandwidth consumption as much as possible. This makes it possible to scale the grid over the internet using cloud or virtualisation services, where nodes can be brought online for extra capacity when needed.

Security Considerations

It should be clear that security is a central concern when setting up distributed processing with NCrunch. As source code is often an extremely valuable and carefully protected asset, distributed processing should only be performed across machines that are completely trusted.

Never open a connection to a grid node server that cannot be trusted with your entire source code tree.

Never allow a connection from a grid client that cannot be trusted with the safe operation of the grid node or any of the source code that may be stored on this node on behalf of other clients. This is an important consideration for grid nodes that are shared between development teams, as organisations sometimes have policies restricting source code access between teams.

Never expose a grid node server to open connections from the internet without strict firewall/network controls. Although data exchanged across the grid is encrypted and all servers are password protected, grid node servers are not designed to stand against the wide range of attack vectors possible over the public internet. Ensure connections can only be made from clients that are trusted.

When making use of distributed processing in an environment with strict security policies, it's recommended that you keep the grid within a secure private network or secure VPN.

Setting Up The Grid

Once you have NCrunch up and running in Visual Studio and you've identified one or more servers you'd like to use, it's time to set up your nodes.

Each grid node server needs to have installed any SDKs or frameworks required to build your source tree and execute the code within it. This means you must have the appropriate version of the .NET framework installed, along with any frameworks used by your source code that involve reference files being stored underneath the 'Program Files' or 'Windows' directories. Depending upon your solution, you may or may not need to have Visual Studio installed on each grid node. Typically, a grid node setup should be similar to any team build server. Most teams will already have a server they use for continuous integration (e.g. Cruise Control or Team City). Assuming this server has enough capacity, it could easily be used as a grid node.

There is no requirement to copy your solution to the grid node - NCrunch will take responsibility for this. However, if you are working with a large and/or complex solution, you may wish to copy this to the grid node and try building/testing with it to ensure it can run correctly on the node. This is a useful troubleshooting method if you find the node doesn't return expected results when NCrunch is up and running.

When you have a grid node all set and ready with required SDKs installed, you'll need to install the NCrunch Grid Node Server. You can find the MSI for this on the download page. The installer will load the service application onto the node and start up a wizard allowing you to configure it. It's important that you complete the entire wizard. If you don't, you'll need to configure the grid node manually using the node configuration tool, then start the service manually. If you don't want to run the grid node as a Windows service, you can instead launch the NCrunch.GridNode.Console.exe application to run it directly on the desktop.

As the wizard finishes, it should automatically start the node service and configure it to run automatically on system boot. You're now ready to try connecting to the node so you can put it to work.

Open up Visual Studio on your development machine, then go to the 'NCrunch' tool menu and choose 'Distributed Processing'. You should see the distributed processing tool window appear. This is a useful window for both configuring and monitoring the status of your grid.

Click the 'Add Server' button to register your grid node with the client. You'll need to enter either the IP address or host name of the server, along with the password you entered while configuring the node. You can leave the port as its default unless you've specifically changed this on the grid node.

Once the node has been added to the client's configuration, you should see it shown in the tool window. The status will probably show in red as 'Not connected'. This is fine because NCrunch won't try to connect to the node until you've enabled the engine. When you do this, you should see the node's status change to reflect an active connection. If for some reason the client isn't able to connect to the node, make sure that it's possible for Visual Studio to open a TCP connection to the grid node on the specified port and that there are no firewalls preventing this from happening. You may wish to try opening a telnet connection to the grid node on the same port to make sure your network is allowing you to connect.
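If telnet isn't available on your machine, the same reachability check can be sketched in a few lines of Python. This is only an illustrative sketch and not part of NCrunch: the function name can_reach_node is hypothetical, and the host name and port in the example comment are placeholders for whatever address and port you configured on your grid node.

```python
import socket


def can_reach_node(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example call (placeholder address and port, substitute your own):
#   can_reach_node("gridnode.example.local", 12345)
```

A result of False from the client machine suggests a firewall, routing, or port mismatch between the client and the node. A result of True only shows that the port is reachable, not that the password or node configuration is correct.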

When a node is exposed to a solution for the first time, the client will need to upload all files in the solution that would normally be needed to build and run tests. Assuming that the solution currently works with NCrunch on the client machine, NCrunch should already know about all the files that are needed and it will upload them automatically. Once the synchronisation between client and node is complete, the node should automatically start building projects and running tests.

Day-To-Day Operation

When the grid is up and running, NCrunch will begin farming out test execution tasks to any connected grid nodes. You should be able to see this happening in the processing queue window.

It's also possible to open up the 'Server Tasks' tab for individual nodes in the Distributed Processing Window if you want more information about what these nodes are doing. This tab will also show you tasks being executed by the node on behalf of other clients, so it should be easy to see if someone else is heavily utilising the grid. You can also refer to this tab if you want information about the progress of solution synchronisation with the grid node.

Build Tasks

You'll notice that build tasks are duplicated between the grid nodes and the local machine. This is because NCrunch does not copy build output files over the grid - it only transfers the source code, which each machine then builds for itself. If you make a change to your solution, the change must be built on each individual node before that node can execute any tests. There are several reasons for this:

  • Building on the nodes makes it possible to offload all build/test work from the client onto the grid.
  • Identifying and copying every artifact output by the build process is difficult to do reliably, and when artifacts are missed, the resulting failures can be unpredictable and hard to diagnose.
  • Because of possible differences in platform and configuration, there is no guarantee that artifacts built on one computer will be the same as those built on another (such situations are very unusual, but differences can exist).
  • Somewhere, a computer connected to the grid still needs to build the projects. Any improvement in response time offered by sharing build artifacts from this computer would be largely negated by the fact that other machines would need to wait for this build to complete. While they wait, they are unable to run tests against the latest version of the code (and therefore they might as well be building!).

"Grid Only Mode"

"Grid Only Mode" is where all build and test tasks are executed on grid nodes, and the client does nothing but manage the work. To put NCrunch in Grid Only Mode, either set the 'max number of processing threads' client configuration setting to '0' (in the Performance section of NCrunch's global settings), or turn off the (local) node in the Distributed Processing Window.

In Grid Only Mode, NCrunch will only build and run tests on the client machine when you are trying to debug your tests. Remote debugging of tests on NCrunch grid nodes is unfortunately not possible yet, so debugging must still be done locally.

Controlling Work Distribution

It's possible to strictly control which tests a grid node (or client) is allowed to execute. This can be done in two ways:

  • By marking tests with the RequiresCapability attribute and defining capabilities with the 'capabilities of this computer' configuration setting. Machines without a configured capability will not execute tests that require it.
  • By adjusting the 'tests to execute on this machine' configuration setting to specifically exclude certain tests from execution using a custom expression. For example, you could create a special category that describes tests that only run on a particular node.
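The capability rule in the first option can be pictured with a small model. The Python sketch below is not NCrunch code and does not reflect its internals. It only illustrates the stated rule that a machine lacking a required capability will not run the test. The Machine class, the eligible_machines function, and the machine and capability names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Machine:
    name: str
    # Capabilities configured for this machine (empty by default).
    capabilities: frozenset = field(default_factory=frozenset)


def eligible_machines(required, machines):
    """Return names of machines whose capabilities cover every required one."""
    needed = frozenset(required)
    return [m.name for m in machines if needed <= m.capabilities]


# Hypothetical grid: one client and two nodes.
machines = [
    Machine("client"),
    Machine("node-a", frozenset({"SqlServer"})),
    Machine("node-b", frozenset({"SqlServer", "Gpu"})),
]

print(eligible_machines([], machines))                    # ['client', 'node-a', 'node-b']
print(eligible_machines(["SqlServer"], machines))         # ['node-a', 'node-b']
print(eligible_machines(["Gpu", "SqlServer"], machines))  # ['node-b']
```

In this model, a test with no required capabilities can run anywhere, while a test that requires a capability is confined to the machines that declare it.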

Controlling work distribution across an NCrunch grid can open the door to some fantastic new opportunities - such as concurrent testing of code that cannot run on a normal development machine and requires special hardware or configuration. Spreading tests across nodes in the grid for multi-platform testing can also be done using DistributeByCapabilitiesAttribute.

It is also possible to configure individual projects with required capabilities so that NCrunch does not attempt to build them on certain machines.

Grid Performance

A large grid is great for increasing the overall throughput of the NCrunch engine, but it's still important to make sure your tests execute quickly to deliver feedback as fast as possible. Even if infinite capacity is available, NCrunch can't execute a test pipeline any faster than the slowest test contained in it.
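To see why extra capacity stops helping, consider a rough lower bound on the time needed to run a set of tests. The sketch below is a simplified model, not NCrunch's scheduler: it assumes each test occupies one task slot, that a single test cannot be split across machines, and that there is no network or build overhead. The function name and the durations are hypothetical.

```python
def pipeline_lower_bound(durations, capacity):
    """Lower bound (in seconds) on wall-clock time for running all tests."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    if not durations:
        return 0.0
    # No schedule can beat the longest single test, nor the total work
    # spread perfectly evenly across all available task slots.
    return max(max(durations), sum(durations) / capacity)


tests = [90.0, 5.0, 5.0, 5.0, 5.0]  # seconds; hypothetical figures
for capacity in (1, 2, 8, 100):
    print(capacity, pipeline_lower_bound(tests, capacity))
# 1 110.0
# 2 90.0
# 8 90.0
# 100 90.0
```

Beyond two task slots, the 90-second test dominates no matter how large the grid becomes, so shortening that test is the only way to get feedback sooner.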

When sharing grid nodes between multiple clients, there is overhead added in the form of additional communication needed between the nodes and clients to coordinate the work. The size of this overhead depends on the number of clients making use of each grid node and on the speed of your network. You may want to experiment with different grid configurations (e.g. two nodes max for each client vs all clients connected to all nodes) to find the configuration that gives the best performance.

If the grid gives you a good amount of task capacity (e.g. 15 or higher), you may wish to consider adjusting your 'fast lane threads' configuration option to reserve more processing power for fast executing tests. This should enable the extra capacity to be used for improving response time, rather than getting tied up running slow (and perhaps less relevant) tests.