Parallel Foreach async in C#

5 min read

Foreach itself is very useful and efficient for most operations. Sometimes, though, special situations arise: there is high latency in getting the data to iterate over, or the processing inside the foreach depends on an operation with very high latency or a long processing time. This is the case, for example, when getting paged data from a database to iterate over. The goal is to start getting data from the database a chunk at a time, since getting one record at a time introduces its own overhead. As the data becomes available, we start processing it, while in the background we get more data and feed it into the processor. The processing itself would also run in parallel, picking up the next item as soon as a worker is free.

ForEachAsync

My favorite way to do this is with an extension method Stephen Toub wrote many years ago. It accepts a data source, breaks it into partitions, lets you specify the degree of parallelism, and accepts a lambda to execute for each item:

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace Extensions
{
    public static class Extensions
    {
        public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
        {
            return Task.WhenAll(
                from partition in Partitioner.Create(source).GetPartitions(dop)
                select Task.Run(async delegate
                {
                    using (partition)
                        while (partition.MoveNext())
                            await body(partition.Current);
                }));
        }
    }
}
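As a rough usage sketch, the extension can be fed from a lazy iterator so that items are processed while later pages are still being fetched. ReadPaged, the page sizes, and the delays below are hypothetical stand-ins for a real paged database query and real per-item work; the extension method is repeated only so the sample compiles on its own.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Extensions
{
    // Same extension method as above, repeated so the sketch is self-contained.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate
            {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }
}

public static class ForEachAsyncDemo
{
    // Hypothetical paged source: yields records lazily, one page at a time.
    public static IEnumerable<int> ReadPaged(int pages, int pageSize)
    {
        for (var page = 0; page < pages; page++)
        {
            Thread.Sleep(50); // stands in for a slow database round trip
            for (var i = 0; i < pageSize; i++)
                yield return page * pageSize + i;
        }
    }

    public static async Task Main()
    {
        var processed = new ConcurrentBag<int>();

        // Up to 4 partitions pull items from the shared iterator and process them concurrently.
        await ReadPaged(3, 10).ForEachAsync(4, async item =>
        {
            await Task.Delay(10); // stands in for slow per-item work
            processed.Add(item);
        });

        Console.WriteLine(processed.Count); // prints 30: every item is handled exactly once
    }
}
```

The body runs concurrently on several partitions, so anything it touches (here a ConcurrentBag) needs to be thread-safe.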

But let’s see what we can do to optimize more…

Representing State Machine Capabilities using Neo4j / graph databases

10 min read

Building the graph

The first challenge we have is representing the capabilities of the system in a logical way. We can get a list of all the possible general classes of states, and represent inheritance for them, which will come in handy later.

var derivedTypes = ReflectionHelpers.FindAllDerivedTypes<ProcessableState>();
foreach (var type in derivedTypes)
    graphClient.Cypher
        .Create("(statetype:StateType {newState})")
        .WithParam("newState", new { type.Name })
        .ExecuteWithoutResults();

This is done with a simple helper class:

public class ReflectionHelpers
{
    public static List<Type> FindAllDerivedTypes<T>()
    {
        return FindAllDerivedTypes<T>(Assembly.GetAssembly(typeof(T)));

Debugging on localhost with HSTS

2 min read

What is the function of HSTS

HSTS stands for HTTP Strict Transport Security. It tells your browser that your web content should always be served over HTTPS. See Security Headers for more info.

Adding a signed localhost certificate to the Trusted Root Certification Authorities store

Newer versions of Chrome require that the server’s certificate contain a “subjectAltName” (SAN) extension; such a certificate is commonly called a SAN certificate. If you are using an older signed certificate that only references a commonName, Chrome may still reject it even if your certificate is otherwise valid.
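For illustration, here is one way a localhost certificate carrying a subjectAltName could be produced from C#. This is only a sketch: it creates a self-signed certificate rather than a CA-signed one, it assumes a runtime that ships CertificateRequest (.NET Core 2.0+ or .NET Framework 4.7.2+), and the class name, key size, and validity period are arbitrary choices. It does not install anything into the Trusted Root store.

```csharp
using System;
using System.Net;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

public static class LocalhostCertificate
{
    // Builds a self-signed certificate for localhost that includes a SAN extension.
    public static X509Certificate2 Create()
    {
        using (var rsa = RSA.Create(2048))
        {
            var request = new CertificateRequest(
                "CN=localhost", rsa, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);

            // The subjectAltName entries are what newer Chrome versions check.
            var san = new SubjectAlternativeNameBuilder();
            san.AddDnsName("localhost");
            san.AddIpAddress(IPAddress.Loopback);
            request.CertificateExtensions.Add(san.Build());

            return request.CreateSelfSigned(
                DateTimeOffset.UtcNow.AddDays(-1), DateTimeOffset.UtcNow.AddYears(1));
        }
    }

    public static void Main()
    {
        using (var cert = Create())
        {
            // 2.5.29.17 is the OID of the subjectAltName extension.
            var san = cert.Extensions["2.5.29.17"];
            Console.WriteLine(san != null ? "SAN present" : "SAN missing");
        }
    }
}
```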

Security Headers

3 min read

X-XSS-Protection

This header is used to configure the built-in reflected XSS protection found in Internet Explorer, Chrome and Safari (WebKit). Valid settings for the header are 0, which disables the protection; 1, which enables the protection; and 1; mode=block, which tells the browser to block the response if it detects an attack rather than sanitising the script.

app.UseXXssProtection(options => options.EnabledWithBlockMode());

RavenDB Load Balancing

< 1 min read

Somewhat counter-intuitively, this behavior is set at the client level, not the server. When servers are set up for replication, they create system documents describing the other servers involved in replication. When the client accesses the primary server, it downloads and caches the replication information, so the client can “fail over” properly.
To set this up, you need to assign the FailoverBehavior convention in your DocumentStore.

RavenDB Lucene Query

< 1 min read

How to build a Lucene query using extension methods, and not have the request go out until ToList() is called.

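// Member is an assumed name for the document type returned by the index.
private List<Member> GetMembers(string nickname)
{
  var query = DocumentSession.Advanced.LuceneQuery<Member>(AllMembersIndex.Name);
  // Search for nickname only when one was supplied
  if (!string.IsNullOrWhiteSpace(nickname))
    query = query.Search("Nickname", nickname);
  // Execute query (the request is only sent here)
  return query.ToList();
}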

RavenDB Create static index

2 min read

/// <summary>
/// Gets the document store.
/// </summary>
/// <returns>The document store.</returns>
/// <remarks>
/// Do this only once per AppDomain load. It's very expensive.
/// </remarks>
private static IDocumentStore GetDocumentStore()
{
    // Create the DocumentStore (expensive operation).
    IDocumentStore documentStore = new DocumentStore
    {
        ConnectionStringName = "RavenDB",
        Credentials = System.Net.CredentialCache.DefaultNetworkCredentials // For "trusted connections": see comments at http://ravendb.net/docs/client-api/connecting-to-a-ravendb-datastore
    };
    // Read from and write to all servers.
    documentStore.Conventions.FailoverBehavior = FailoverBehavior.ReadFromAllServers
        | FailoverBehavior.AllowReadsFromSecondariesAndWritesToSecondaries;
    // Initialize the store (must be done before creating indexes)
    documentStore = documentStore.Initialize();
    // Create static indexes
    CreateRavenStaticIndexes(documentStore);
    // Return document store
    return documentStore;
}

Background tasks in MVC and IIS

2 min read

As you might’ve noticed, threads that are kept running after a request returns, to handle post-operational tasks (such as performing analytics on a file that was uploaded), don’t always complete in a web project.
There are several issues with spawning threads in the context of an ASP.NET project. Phil Haack’s post explains the issues in more detail. The following classes solve the problem of IIS killing threads before they complete.

Value types, reference types and practical uses

2 min read

C# has two different kinds of types: value types and reference types. While in C and C++ primitive types can contain values or references and certain complex types (arrays, objects) can only be used via reference, in C# the line between the two kinds is very clear. Numeric types (int, decimal, double, etc.), bool and structs access their values directly. Class, object, interface, delegate, string and dynamic are only accessed and used via reference. Because of all the awkwardness with referencing and dereferencing in C and C++, C# uses the ‘ref’ keyword only for those cases where you want to modify a value type outside of your current scope.

If you’ve worked with C# a bit, you’re probably used to always working with copies in any situation outside of simple assignment (=). If you loop over a collection, inside the loop you’re working with copies. If you pass something to a function, that something is a copy. However, the difference between value types and reference types means that a copy isn’t always a copy: copying a reference still leaves you pointing at the same object. Knowing when this is the case can help you spot troublesome bugs and quickly solve problems that might otherwise have taken you much longer.
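A small sketch makes the distinction concrete. The PointValue and PointRef types and the Bump methods are made up for this illustration: passing the struct hands the method its own copy, passing the class instance hands it a copy of the reference to the same object, and ‘ref’ lets the method change the caller’s struct.

```csharp
using System;
using System.Collections.Generic;

public struct PointValue { public int X; } // value type: copied on pass
public class PointRef { public int X; }    // reference type: the reference is copied

public static class ValueVsReferenceDemo
{
    public static void Bump(PointValue p) { p.X++; }          // changes only the local copy
    public static void Bump(PointRef p) { p.X++; }            // changes the shared object
    public static void BumpByRef(ref PointValue p) { p.X++; } // changes the caller's struct

    public static void Main()
    {
        var v = new PointValue { X = 1 };
        var r = new PointRef { X = 1 };

        Bump(v);
        Bump(r);
        Console.WriteLine(v.X + " " + r.X); // prints "1 2"

        BumpByRef(ref v);
        Console.WriteLine(v.X); // prints "2"

        // Inside the loop, item is a copy of the reference, so the list's object changes.
        var list = new List<PointRef> { new PointRef { X = 10 } };
        foreach (var item in list)
            item.X++;
        Console.WriteLine(list[0].X); // prints "11"
    }
}
```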