
Uploading files is pretty straightforward in just about any web framework and programming language. However, when files get big or many files are uploaded at the same time, memory usage becomes a concern and you start running into bottlenecks. On top of that, frameworks impose constraints to protect the application from things like denial of service through resource exhaustion. I ran into several of these limitations over the years and came up with a few solutions.

The examples below use the MVC framework in .NET Core / C#, but the concepts apply to other languages as well.

Configure framework limits

By default, .NET Core allows you to upload files up to 28 MB. Depending on how you’re hosting the application, the limits are configurable in a few places.

IIS Express configuration

IIS Express uses the values in web.config for just about everything, so we can set requestLimits' maxAllowedContentLength to 200 MB. This can be raised to just under 4 GB, but I found conflicting information about whether it can be disabled entirely. If you need to upload files larger than 4 GB, chunking is recommended. Note that this limit applies to the entire app.

<system.webServer>
  <security>
    <requestFiltering>
      <requestLimits maxAllowedContentLength="209715200" />
    </requestFiltering>
  </security>
</system.webServer>

Next, we have limits from the MVC framework itself, which can be configured on a per-endpoint basis using either the RequestFormLimits or DisableRequestSizeLimit attribute:

[HttpPut("upload")]
[DisableRequestSizeLimit] // or [RequestFormLimits(MultipartBodyLengthLimit = 629145600)]
public async Task Upload([FromQuery] Guid? messageId)
{

Kestrel Configuration

If you’re using Kestrel, you can use the RequestSizeLimit attribute on each endpoint:

[HttpPut("upload")]
[RequestSizeLimit(209715200)]
public async Task Upload([FromQuery] Guid? messageId)
{

You can also configure these settings application-wide, although caution is advised, since these limits exist for a reason: to protect your application.

.UseKestrel(options =>
{
    options.Limits.MaxRequestBodySize = 209715200;
});
public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc();
    services.Configure<FormOptions>(x =>
    {
        x.MultipartBodyLengthLimit = 209715200;
    });
}

More info on limits:

http://www.binaryintellect.net/articles/612cf2d1-5b3d-40eb-a5ff-924005955a62.aspx

https://stackoverflow.com/questions/38698350/increase-upload-file-size-in-asp-net-core

https://khalidabuhakmeh.com/increase-file-upload-limit-for-aspdotnet

Disable automatic binding

By default, MVC will try to buffer the full request and bind its contents to parameters. We’ll need to disable this behavior for our endpoint so we can interact with the stream as it comes in. We can achieve this with an attribute that simply removes all value provider factories from the request:

using System;
using Microsoft.AspNetCore.Mvc.Filters;

namespace Web.Infrastructure
{
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
    public sealed class DisableFormValueModelBindingAttribute : Attribute, IResourceFilter
    {
        public void OnResourceExecuting(ResourceExecutingContext context)
        {
            // Clearing the value provider factories stops MVC from reading
            // (and buffering) the request body for model binding.
            context.ValueProviderFactories.Clear();
        }

        public void OnResourceExecuted(ResourceExecutedContext context)
        {
        }
    }
}

Then we can just decorate our upload method with [DisableFormValueModelBinding].
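For example, the attribute sits alongside the size-limit attributes on the endpoint (the route and parameter here are illustrative):

```csharp
[HttpPut("upload")]
[DisableFormValueModelBinding]
[DisableRequestSizeLimit]
public async Task Upload([FromQuery] Guid? messageId)
{
    // The request body is now untouched by model binding,
    // so we can read the multipart stream ourselves.
}
```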

Uploading a single large file

Uploading one file is now pretty straightforward: we just treat the request body as the file stream. This is rarely enough in practice, though, since we usually need additional parameters passed in. For that reason, we usually prefer to encode the data as form-data using multipart encoding.
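As a sketch of the single-file case (endpoint name and lack of validation are for illustration only), the raw body can be streamed straight to a temp file:

```csharp
[HttpPut("upload-raw")]
[DisableRequestSizeLimit]
public async Task<IActionResult> UploadRaw()
{
    var tempFilename = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.tmp");
    using (var file = new FileStream(tempFilename, FileMode.CreateNew))
    {
        // Request.Body is the entire upload; no multipart parsing needed.
        await Request.Body.CopyToAsync(file);
    }
    return Ok();
}
```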

Uploading multiple files at the same time using multipart content

To be able to find our cutoffs for each parameter/file, we first need to extract the boundary, which is found in the Content-Type header. Here’s an example of a raw multipart HTTP POST:

POST /upload HTTP/1.1
Host: localhost:8000
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:29.0) Gecko/20100101 Firefox/29.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: __atuvc=34%7C7; permanent=0; _gitlab_session=226ad8a0be43681acf38c2fab9497240; __profilin=p%3Dt; request_method=GET
Connection: keep-alive
Content-Type: multipart/form-data; boundary=---------------------------9051914041544843365972754266
Content-Length: 554

-----------------------------9051914041544843365972754266
Content-Disposition: form-data; name="text"

text default
-----------------------------9051914041544843365972754266
Content-Disposition: form-data; name="file1"; filename="a.txt"
Content-Type: text/plain

Content of a.txt.
-----------------------------9051914041544843365972754266
Content-Disposition: form-data; name="file2"; filename="a.html"
Content-Type: text/html

<!DOCTYPE html><title>Content of a.html.</title>
-----------------------------9051914041544843365972754266--

We can use the following helper class to extract the boundary from the Content-Type header:

using System;
using System.Linq;

namespace Web.Infrastructure
{
    public static class RequestHelpers
    {
        public static string GetBoundary(string contentType)
        {
            var elements = contentType.Split(' ');
            var element = elements.First(e => e.StartsWith("boundary="));
            var boundary = element.Substring("boundary=".Length);

            // The boundary value may be quoted; strip the surrounding quotes.
            if (boundary.Length >= 2 && boundary[0] == '"' && boundary[^1] == '"')
                boundary = boundary[1..^1];

            return boundary;
        }

        public static bool IsMultipartContentType(string contentType) =>
            !string.IsNullOrEmpty(contentType) &&
            contentType.IndexOf("multipart/", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}

Now we can use the RequestHelpers.GetBoundary helper to find our boundary, create a MultipartReader, and iterate through each section. In the following example, we cast the section as a file section for easy access to things like the filename.

var boundary = RequestHelpers.GetBoundary(_httpContextAccessor.HttpContext.Request.ContentType);
var reader = new MultipartReader(boundary, _httpContextAccessor.HttpContext.Request.Body);
var section = await reader.ReadNextSectionAsync();
var fileSection = section.AsFileSection();
var originalFilename = fileSection.FileName;
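To handle every part rather than just the first, the same reader can be looped until ReadNextSectionAsync returns null. This sketch assumes a controller context where Request is available, and SaveFileAsync is a hypothetical helper standing in for your own file handling:

```csharp
var boundary = RequestHelpers.GetBoundary(Request.ContentType);
var reader = new MultipartReader(boundary, Request.Body);

MultipartSection section;
while ((section = await reader.ReadNextSectionAsync()) != null)
{
    var fileSection = section.AsFileSection();
    if (fileSection != null)
    {
        // This part is a file; fileSection.FileStream streams its bytes.
        await SaveFileAsync(fileSection); // hypothetical helper, not shown
    }
    // Non-file parts (plain form fields) can be read from section.Body instead.
}
```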

Next, we can create a temp file to receive the upload. If you’re planning to persist the file on disk, this would be your final file. In this case, I use the temp file as a buffer, then upload the file to Azure Storage and finally delete the temp file.

To save on memory, we read the stream in chunks and write to the file as the data comes in. This is all done asynchronously so we’re not blocking a thread while waiting on either stream.

var tempFilename = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.tmp");
using (var stream = new FileStream(tempFilename, FileMode.CreateNew))
{
    const int chunkSize = 1024;
    var buffer = new byte[chunkSize];
    int bytesRead;
    do
    {
        bytesRead = await fileSection.FileStream.ReadAsync(buffer, 0, buffer.Length);
        await stream.WriteAsync(buffer, 0, bytesRead);
    } while (bytesRead > 0);
}
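If per-chunk control (for progress reporting, say) isn’t needed, the same copy can be expressed more simply with CopyToAsync, which manages its own buffer:

```csharp
var tempFilename = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.tmp");
using (var stream = new FileStream(tempFilename, FileMode.CreateNew))
{
    // Streams the multipart section to disk without loading it into memory.
    await fileSection.FileStream.CopyToAsync(stream);
}
```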

Upload Files to Azure Storage

Next, we can take the file we just received and saved to disk and upload it to Azure Storage. We have several things we can customize, and as with anything real-world, we need to account for failures and resources. In this case, we configure a maximum parallel operation count of 1 with a 1 MB chunk size and tell it to retry up to 5 times with a 2-second delay between retries. This ensures our file still makes it to Azure Storage even if a network glitch happens. This retry logic is also one of the main reasons we don’t read from the input stream and write directly to the Azure stream: the file on disk lets us retry without having to re-receive the file from the user.

var storageAccount = CloudStorageAccount.Parse(_configuration["MicrosoftAzureStorage:AzureStorageConnectionString"]);
var backOffPeriod = TimeSpan.FromSeconds(2);
var blobClient = storageAccount.CreateCloudBlobClient();
blobClient.DefaultRequestOptions = new BlobRequestOptions()
{
    SingleBlobUploadThresholdInBytes = 1024 * 1024, // 1MB, the minimum
    ParallelOperationThreadCount = 1,
    RetryPolicy = new ExponentialRetry(backOffPeriod, maxAttempts: 5),
};
var videoContainer = blobClient.GetContainerReference("videos");
videoContainer.CreateIfNotExists();
var blockBlob = videoContainer.GetBlockBlobReference(filename);
blockBlob.Properties.ContentType = section.ContentType ?? "application/octet-stream";
storedUrl = blockBlob.Uri.AbsoluteUri;
await blockBlob.UploadFromFileAsync(tempFilename);
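Since the temp file is only a buffer here, wrapping the upload in try/finally ensures it gets cleaned up even if the upload throws (sketch, reusing the variables above):

```csharp
try
{
    await blockBlob.UploadFromFileAsync(tempFilename);
}
finally
{
    // The temp file has served its purpose as a retry buffer; remove it.
    if (System.IO.File.Exists(tempFilename))
        System.IO.File.Delete(tempFilename);
}
```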

Uploading large files using C#

Luckily, System.Net.Http comes with a handy solution for this: we can use StreamContent to stream files and wrap them in a MultipartFormDataContent. We can optionally specify the boundary ourselves as well, so we don’t have to generate one if we want to preset it.

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

namespace Misc
{
    class Program
    {
        static string baseUrl = "http://localhost:39720";

        static async Task Main(string[] args)
        {
            var _multipartHttpClient = new HttpClient { BaseAddress = new Uri(baseUrl), Timeout = new TimeSpan(0, 20, 0) };
            var multipartContent = new MultipartFormDataContent("NKdKd9Yk");
            multipartContent.Headers.ContentType.MediaType = "multipart/form-data";

            var videoContent = new StreamContent(new FileStream(@"C:\repo\test-files\test.mp4", FileMode.Open), 100000);
            multipartContent.Add(videoContent, "video", "file.mp4");

            var thumbnailContent = new StreamContent(new FileStream(@"C:\repo\test-files\test.png", FileMode.Open), 100000);
            multipartContent.Add(thumbnailContent, "thumbnail", "file.png");

            _multipartHttpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("multipart/form-data"));
            _multipartHttpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", await GetToken()); // GetToken() is an auth helper not shown here

            var response = await _multipartHttpClient.PutAsync($"{baseUrl}/api/files/upload", multipartContent);
            response.EnsureSuccessStatusCode();
        }
    }
}

Report progress

To get some idea of progress, we can fire an event between chunks being uploaded. The buffer size should be chosen based on the expected upload speed and desired app responsiveness.

// Source: https://forums.xamarin.com/discussion/180009/how-to-work-with-custom-progressable-stream-content
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

namespace Misc
{
    internal class ProgressableStreamContent : HttpContent
    {
        /// <summary>
        /// Let's keep a buffer of 20 KB.
        /// </summary>
        private const int defaultBufferSize = 5 * 4096;

        private readonly HttpContent content;
        private readonly int bufferSize;
        private readonly Action<long, long> progress;

        public ProgressableStreamContent(HttpContent content, Action<long, long> progress)
            : this(content, defaultBufferSize, progress) { }

        public ProgressableStreamContent(HttpContent content, int bufferSize, Action<long, long> progress)
        {
            if (bufferSize <= 0)
                throw new ArgumentOutOfRangeException(nameof(bufferSize));

            this.content = content ?? throw new ArgumentNullException(nameof(content));
            this.bufferSize = bufferSize;
            this.progress = progress;

            foreach (var h in content.Headers)
                Headers.Add(h.Key, h.Value);
        }

        protected override async Task SerializeToStreamAsync(Stream stream, TransportContext context)
        {
            var buffer = new byte[bufferSize];
            TryComputeLength(out var size);
            var uploaded = 0L;

            using (var input = await content.ReadAsStreamAsync())
            {
                int length;
                while ((length = await input.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    await stream.WriteAsync(buffer, 0, length);
                    uploaded += length;
                    // Report progress after each chunk is written.
                    progress?.Invoke(uploaded, size);
                }
            }
            await stream.FlushAsync();
        }

        protected override bool TryComputeLength(out long length)
        {
            length = content.Headers.ContentLength.GetValueOrDefault();
            return true;
        }

        protected override void Dispose(bool disposing)
        {
            if (disposing)
                content.Dispose();
            base.Dispose(disposing);
        }
    }
}

Next, all we need to do is wrap our MultipartFormDataContent in a ProgressableStreamContent, passing a callback to be fired every time the stream progresses.

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

namespace Misc
{
    class Program
    {
        static string baseUrl = "http://localhost:39720";

        static async Task Main(string[] args)
        {
            var _multipartHttpClient = new HttpClient { BaseAddress = new Uri(baseUrl), Timeout = new TimeSpan(0, 20, 0) };
            var multipartContent = new MultipartFormDataContent("NKdKd9Yk");
            multipartContent.Headers.ContentType.MediaType = "multipart/form-data";

            var videoContent = new StreamContent(new FileStream(@"C:\repo\test-files\test.mp4", FileMode.Open), 100000);
            multipartContent.Add(videoContent, "video", "file.mp4");

            var thumbnailContent = new StreamContent(new FileStream(@"C:\repo\test-files\test.png", FileMode.Open), 100000);
            multipartContent.Add(thumbnailContent, "thumbnail", "file.png");

            var progressContent = new ProgressableStreamContent(multipartContent, 100000, (done, total) =>
            {
                Console.WriteLine($"Done {done}/{total}");
            });

            _multipartHttpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("multipart/form-data"));
            _multipartHttpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", await GetToken()); // GetToken() is an auth helper not shown here

            var response = await _multipartHttpClient.PutAsync($"{baseUrl}/api/files/upload", progressContent);
            response.EnsureSuccessStatusCode();
        }
    }
}

More info:

https://forums.xamarin.com/discussion/180009/how-to-work-with-custom-progressable-stream-content

Next Steps

We still have room for optimization; depending on what we’re optimizing for, several things can be added:

  • Configure a factory to start uploading chunks of the temp file as they’re written to disk (as they come in). This avoids waiting to receive the full file before starting the upload, while still allowing us to retry if anything happens.
  • Split the file into multiple chunks and upload them over several parallel sockets. This gets complicated, but it works great in a load-balanced scenario, and other network optimizations such as jumbo frames can kick in and make the upload faster.
  • Gzip-compress each chunk before sending it to the server. This adds CPU load to compress and decompress the chunks, but depending on what kind of content you’re working with, it could significantly cut down on upload time.
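As a rough sketch of that last idea (plain System.IO.Compression; the helper name is illustrative), each chunk read from the source stream could be gzip-compressed before it goes on the wire:

```csharp
using System.IO;
using System.IO.Compression;

static byte[] CompressChunk(byte[] chunk, int length)
{
    using (var output = new MemoryStream())
    {
        using (var gzip = new GZipStream(output, CompressionLevel.Fastest))
        {
            // Compress only the bytes actually read into this chunk.
            gzip.Write(chunk, 0, length);
        }
        return output.ToArray();
    }
}
```

The server side would then need to decompress each chunk with a matching GZipStream in decompression mode before writing it to disk.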
