Observability using Elasticsearch and Neo4j

10 min read Elasticsearch continues to add features at an astonishing rate, and people find really creative ways to use them and push the platform even further. What Neo4j can do is just way too cool to pass on. So we’ll look at how to ingest data with Elasticsearch and analyze it with Neo4j. Combining the two helps us achieve some really powerful solutions.

I was originally intrigued by Elasticsearch for log aggregation and its ability to instantly aggregate and search over millions of records. We could ship logs from all sorts of data sources: application logs, web server logs (Nginx, IIS), and more. Then we could filter through those logs in Kibana’s Discover, choose the columns we wanted to see for particular use-cases, and create saved searches. This immediately made it useful to us, the engineering team. We then use query-based filtering to restrict which documents people can access, and with field-level security, we control which fields they even see inside each document. All of a sudden we have the ability to give our level 1 support real-time visibility into customer issues, without overloading them. On top of this, we add Windows event logs and Syslogs and create some alerts.
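
As a sketch of how those restrictions can be expressed, a role created through Elasticsearch’s security API might look something like the following; the role name, index pattern, query, and field list here are all hypothetical:

PUT _security/role/support_level1
{
  "indices": [
    {
      "names": ["app-logs-*"],
      "privileges": ["read"],
      "query": { "term": { "support.visible": true } },
      "field_security": {
        "grant": ["@timestamp", "log.level", "message", "customer.id"]
      }
    }
  ]
}

The query keeps non-matching documents out of reach entirely, while field_security trims what’s visible inside the documents that do match.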

How to Upload Large Files

4 min read Uploading files these days is pretty straightforward in just about any web framework and programming language. However, when files get big or many files are uploaded at the same time, memory usage becomes a concern and bottlenecks start to appear. Aside from this, frameworks impose constraints to protect the application from things like denial of service through resource exhaustion. I ran into several of these limitations over the years and came up with a few solutions. The examples below use the MVC framework in .NET Core…
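
As a minimal sketch of the streaming idea (not the complete solution from the post), an ASP.NET Core controller action can copy the request body straight to disk so the file never has to fit in memory; the route and attributes here are illustrative:

[HttpPost("upload")]
[DisableRequestSizeLimit] // lift the default request body size cap for this action
public async Task<IActionResult> Upload()
{
    // Stream the raw request body to a temp file in chunks,
    // instead of model-binding the whole upload into memory.
    var path = Path.GetTempFileName();
    using (var file = System.IO.File.Create(path))
    {
        await Request.Body.CopyToAsync(file);
    }
    return Ok(path);
}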

Concepts for a Modern Planet-Scale Application

12 min read Gossip Protocols Gossip protocols, also known as infection or epidemic protocols, are a category of peer-to-peer communication protocols in which nodes pass along information about the network that they know about and accept, but aren’t the authoritative source of. Some distributed systems use peer-to-peer gossip to ensure that information is efficiently disseminated to all members of a group. A really good use-case for this is network discovery and state maintenance in a very large network, where direct chatter between all nodes would waste a lot of bandwidth, and especially useful…
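
As a toy illustration of the idea (not any particular system’s implementation), each node can periodically run a round like this, pulling and pushing state with a few random peers so updates spread epidemically instead of every node talking to every other node:

using System;
using System.Collections.Generic;
using System.Linq;

class GossipNode
{
    private readonly Random _random = new Random();

    // This node's view of the cluster: node id -> latest heartbeat version seen.
    public Dictionary<string, long> State { get; } = new Dictionary<string, long>();

    // One gossip round: exchange state with a handful of random peers.
    public void GossipRound(IReadOnlyList<GossipNode> peers, int fanout = 3)
    {
        foreach (var peer in peers.OrderBy(_ => _random.Next()).Take(fanout))
        {
            Merge(peer.State); // pull the peer's view into ours
            peer.Merge(State); // push our merged view back
        }
    }

    private void Merge(Dictionary<string, long> other)
    {
        foreach (var entry in other)
        {
            // Keep whichever side has seen the newer heartbeat for each node.
            if (!State.TryGetValue(entry.Key, out var seen) || entry.Value > seen)
                State[entry.Key] = entry.Value;
        }
    }
}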

Work with a REST API using PowerShell

2 min read A well-designed REST API can be consumed and interacted with in many ways. PowerShell is one of the really useful ones because it’s very dynamic. We’ll also assume that the API is protected with JWT bearer tokens issued by an OpenID Connect server. Our example API, in this case, is a simple REST API to query and manage users.

Set required headers:

We’re adding the bearer token to the script manually in this case, but this step could be automated as well, although it’s a lot more involved to set up initially. See the following guide for one way of doing this: https://docs.microsoft.com/en-us/information-protection/develop/concept-authentication-acquire-token-ps

$headers = @{}
$headers["Accept"] = "application/json"
$headers["Authorization"] = "Bearer 3a5e90b25ac028ec968def29d0055d418265e9810968eb4a0c531a45fee3b00f"

Signing Git Commits Using YubiKey on Windows

8 min read There are several things we need to do in order to achieve end-to-end security in our release pipeline. In this post, I’ll explain how to set up git commit signing, storing the private key on a YubiKey and using it as a smart card. Signing our commits is especially important in public projects like those on GitHub, to keep people from impersonating us. For private projects, and later in the build pipeline, we can validate that all commits are signed by trusted parties and add gates to protect against unauthorized code making it into our products.
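
Once the key is on the YubiKey and visible to GPG, pointing git at it takes just a few settings; the key ID and gpg.exe path below are examples and will differ on your machine:

# Tell git which key to sign with, and to sign every commit by default
git config --global user.signingkey ABCDEF1234567890
git config --global commit.gpgsign true

# On Windows, point git at the Gpg4win gpg.exe
git config --global gpg.program "C:/Program Files (x86)/GnuPG/bin/gpg.exe"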

Cloudflare Worker Conditional Reverse Proxy

< 1 min read A Cloudflare Worker that loads content from a subdomain (or other alternate location) and rewrites references to that location so the content appears to be served from the main site.

addEventListener('fetch', event => {
  var url = new URL(event.request.url);
  // Match /blog exactly or anything under /blog/ (avoid matching e.g. /blogroll)
  if (url.pathname === '/blog' || url.pathname.startsWith('/blog/')) {
    event.respondWith(handleBlog(event, url));
  } else {
    event.respondWith(fetch(event.request));
  }
})

async function handleBlog(event, url) {
  // Load subdomain content / reverse proxy mysite.com/blog to blog.mysite.com subdomain
  var originUrl = url.toString().replace('https://mysite.com/blog', 'https://blog.mysite.com');
  
  // Load content
  let response = await fetch(originUrl);

  // Make sure we only modify text, not images
  let type = response.headers.get("Content-Type") || "";
  if (!type.startsWith("text/")) {
    return response;
  }

  // Read response body
  let text = await response.text();

  // Modify it
  // Escape the dots so the regex matches the literal hostname
  let modified = text.replace(/blog\.mysite\.com/g, "mysite.com/blog");

  // Return modified response
  return new Response(modified, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers
  });
}

Authentication Ideas

6 min read Security, particularly authentication, authorization, and auditing, is my favorite part of software development. It’s not just what keeps us safe; the reason I like it so much is that it’s by far the broadest part of software development. It requires us to understand the full breadth of the field, from hardware security components like TPM (Trusted Platform Module) chips to IETF standards-based protocols that not only make things safer but open the door to creating simpler, better, and more integrated systems. Historically that may not have always been the case, and security was at odds with other concerns like performance and usability. Those problems have long since been addressed, once we realized that a system’s behavior emerges from the interaction of many systems and started focusing on the end problem we’re trying to solve, instead of trying to fit the problem into an isolated individual system.

This new way of thinking gave rise to new fields such as Systems Engineering, where the focus shifts to discovering the real problems that need to be solved and identifying the most probable and highest-impact failures that can occur. The security domain, and organizations like (ISC)², OWASP, and NIST, have recognized and pushed the application of this understanding very well over the years, and standards have changed for the better.

One concrete example of this, I think, is NIST’s update to NIST 800-171 that removed periodic password change requirements and dropped password complexity rules in favor of screening new passwords against a list of commonly used or compromised passwords.

Parallel Foreach async in C#

5 min read Foreach itself is very useful and efficient for most operations. Sometimes, though, special situations arise: fetching the data to iterate over has high latency, or processing each item inside the foreach depends on an operation with very high latency or long processing time. This is the case, for example, with getting paged data from a database to iterate over. The goal is to start getting data from the database a chunk at a time, since getting one record at a time introduces its own overhead. As data becomes available, we start processing it, while in the background we fetch more data and feed it to the processor. The processing part is itself parallel as well, picking up the next item as soon as a worker is free.

ForEachAsync

My favorite way to do this is with an extension method Stephen Toub wrote many years ago. It accepts a data source, breaks it into partitions so you can specify the degree of parallelism, and takes a lambda to execute for each item:

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace Extensions
{
    public static class Extensions
    {
        public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
        {
            return Task.WhenAll(
                from partition in Partitioner.Create(source).GetPartitions(dop)
                select Task.Run(async delegate
                {
                    using (partition)
                        while (partition.MoveNext())
                            await body(partition.Current);
                }));
        }
    }
}
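
Usage then looks something like this from inside an async method; the URLs and degree of parallelism are arbitrary examples:

var urls = new[] { "https://example.com/a", "https://example.com/b", "https://example.com/c" };
var client = new HttpClient();

// Process up to 4 items concurrently; each body itself awaits asynchronously.
await urls.ForEachAsync(dop: 4, async url =>
{
    var content = await client.GetStringAsync(url);
    Console.WriteLine($"{url}: {content.Length} bytes");
});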

But let’s see what we can do to optimize more…

Change Desktop Wallpaper with a USB Rubber Ducky

5 min read Rubber Ducky is the “technical” name of a USB device that looks like a USB thumb drive but presents itself to the computer as a USB keyboard, and, once plugged in, starts typing pre-recorded keystrokes at superhuman speed. It can bypass many IT safeguards since the computer sees it as a keyboard, and it can easily fool people. Someone might think they’re plugging in a harmless USB thumb drive, or even just a harmless iPhone charging cable. These devices can be disguised as just about anything these days, and they can be remarkably small.
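
For the wallpaper trick itself, the payload the “keyboard” types can be as small as a couple of PowerShell lines; the image path is a placeholder, and note the refresh call is famously unreliable across Windows versions (logging off and back on always applies the change):

# Point the current user's wallpaper at a new image...
Set-ItemProperty -Path "HKCU:\Control Panel\Desktop" -Name Wallpaper -Value "C:\Users\Public\Pictures\duck.jpg"

# ...then ask Windows to re-read the per-user settings
rundll32.exe user32.dll,UpdatePerUserSystemParameters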

Monitoring DNS Lookups against your DNS Server using Elasticsearch

7 min read DNS is one of the most fundamental components of today’s internet and one of its simplest technologies, but it’s also one of the biggest risks, and it can be one of the best indicators that something is at risk. Many cyber attacks either have their root in DNS or simply use DNS as part of the attack’s structure without even thinking about it. Understanding your organization’s DNS traffic is a very solid step in protecting the overall organization. Monitoring DNS lookups with Elasticsearch and analyzing them with machine learning can significantly reduce the risk around several types of attacks. Here are some of the DNS-based threats and how to detect them.
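
One common way to get the lookups into Elasticsearch (not necessarily the exact setup from the post) is Packetbeat sniffing port 53 on, or in front of, the DNS server; the interface and output host below are placeholders:

packetbeat.interfaces.device: any

packetbeat.protocols:
  - type: dns
    ports: [53]
    include_authorities: true
    include_additionals: true

output.elasticsearch:
  hosts: ["localhost:9200"]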