Uploading files is pretty straightforward these days in just about any web framework and programming language. However, when files get big or many files are uploaded at the same time, memory usage becomes a concern and you start running into bottlenecks. On top of that, frameworks impose constraints to protect the application from things like denial of service through resource exhaustion. I ran into several of these limitations over the years and came up with a few solutions.
The examples below use the MVC framework in .NET Core / C#, but the concepts apply to other languages as well.
Configure framework limits
By default, .NET Core allows you to upload files up to roughly 28 MB (30,000,000 bytes). Depending on how you're hosting the application, the limits are configurable in a few places.
IIS Express configuration
IIS Express uses the values in web.config for just about everything, so we can set requestLimits maxAllowedContentLength to 200 MB. This can be set as high as 4 GB, but I found conflicting information about whether it can be disabled entirely. If you need to upload files larger than 4 GB, chunking is recommended. Note that this limit applies to the entire app.
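A minimal web.config sketch of this setting might look like the following (the value is in bytes; 209715200 bytes is 200 MB):

```xml
<configuration>
  <system.webServer>
    <security>
      <requestFiltering>
        <!-- 209715200 bytes = 200 MB; maxAllowedContentLength is a uint, capping out near 4 GB -->
        <requestLimits maxAllowedContentLength="209715200" />
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>
```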
Next, we have limits from the MVC framework, which can be configured on a per-endpoint basis. This can be done using either the RequestSizeLimit attribute or the RequestFormLimits attribute. If you're using Kestrel, the RequestSizeLimit attribute on each endpoint overrides the server's default body size limit.
You can also configure these settings application-wide, although caution is advised, since they’re there for a reason (to protect your application).
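As a sketch of both approaches (the 200 MB figure and endpoint names are illustrative), the per-endpoint attribute and the application-wide Kestrel option look like this:

```csharp
// Per-endpoint: raises the cap to ~200 MB for this action only.
[HttpPost("upload")]
[RequestSizeLimit(209_715_200)]
public async Task<IActionResult> Upload() { /* ... */ }

// Application-wide (use with caution), e.g. in Program.cs:
builder.WebHost.ConfigureKestrel(options =>
{
    // Setting this to null disables the limit entirely for the whole app.
    options.Limits.MaxRequestBodySize = 209_715_200;
});
```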
Disable automatic binding
By default, MVC will try to buffer the full request and bind its contents to action parameters. We'll need to disable this behavior for our endpoint so we can interact with the stream as it comes in. We can achieve this with a resource filter attribute that removes all form value providers from the request:
Then we can just decorate our upload method with that attribute.
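A sketch of such a filter, following the pattern from the ASP.NET Core documentation (the attribute name is a convention, not a built-in type):

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.Filters;
using Microsoft.AspNetCore.Mvc.ModelBinding;

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class DisableFormValueModelBindingAttribute : Attribute, IResourceFilter
{
    public void OnResourceExecuting(ResourceExecutingContext context)
    {
        // Remove the value providers that would otherwise buffer and parse the form.
        var factories = context.ValueProviderFactories;
        factories.RemoveType<FormValueProviderFactory>();
        factories.RemoveType<FormFileValueProviderFactory>();
        factories.RemoveType<JQueryFormValueProviderFactory>();
    }

    public void OnResourceExecuted(ResourceExecutedContext context) { }
}

// Usage: decorate the upload endpoint so its body is never model-bound.
// [HttpPost("upload")]
// [DisableFormValueModelBinding]
// public async Task<IActionResult> Upload() { /* read Request.Body manually */ }
```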
Uploading a single large file
Uploading one file is now pretty straightforward: we just treat the request body as the file stream. But this alone is rarely enough, since we usually need additional parameters passed in. For that reason, we usually prefer to encode the data as form-data using multipart encoding.
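A minimal sketch of the single-file case, assuming the attribute from the previous section (the route and temp-file destination are illustrative):

```csharp
[HttpPost("upload-single")]
[DisableFormValueModelBinding]
[RequestSizeLimit(209_715_200)]
public async Task<IActionResult> UploadSingle()
{
    // The request body IS the file content; stream it straight to disk.
    var path = Path.GetTempFileName();
    await using (var file = System.IO.File.Create(path))
    {
        await Request.Body.CopyToAsync(file);
    }
    return Ok(new { path });
}
```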
Uploading Multiple Files at the same time using multi-part content
To understand the cutoffs between each parameter/file, we first need to extract the boundary, which is found in the Content-Type header. Here's an example of a raw multipart HTTP POST:
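An illustrative request (the boundary token and field names are made up; real clients generate their own):

```
POST /api/upload HTTP/1.1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryABC123

------WebKitFormBoundaryABC123
Content-Disposition: form-data; name="description"

My vacation photos
------WebKitFormBoundaryABC123
Content-Disposition: form-data; name="file"; filename="photo.jpg"
Content-Type: image/jpeg

(binary file data)
------WebKitFormBoundaryABC123--
```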
We can use the following helper class to extract the boundary from the Content-Type header:
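A sketch of that helper, modeled on the sample in the ASP.NET Core docs (the class name RequestHelpers is a convention used here, not a framework type):

```csharp
using Microsoft.Net.Http.Headers;

public static class RequestHelpers
{
    // Extracts the boundary token from a Content-Type header such as
    // "multipart/form-data; boundary=----WebKitFormBoundaryABC123".
    public static string GetBoundary(MediaTypeHeaderValue contentType, int lengthLimit)
    {
        var boundary = HeaderUtilities.RemoveQuotes(contentType.Boundary).Value;

        if (string.IsNullOrWhiteSpace(boundary))
            throw new InvalidDataException("Missing content-type boundary.");

        if (boundary.Length > lengthLimit)
            throw new InvalidDataException($"Multipart boundary length limit {lengthLimit} exceeded.");

        return boundary;
    }
}
```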
Now we can use the RequestHelpers.GetBoundary helper method to find our boundary, create a MultipartReader, and iterate through each section. In the following example, we optionally cast each multipart section as a file section for simpler access to things like the filename.
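A sketch of that loop inside the action (the 70-character boundary length limit mirrors the docs sample and is an assumption here):

```csharp
using Microsoft.AspNetCore.WebUtilities;
using Microsoft.Net.Http.Headers;

var boundary = RequestHelpers.GetBoundary(
    MediaTypeHeaderValue.Parse(Request.ContentType),
    lengthLimit: 70);
var reader = new MultipartReader(boundary, Request.Body);

MultipartSection section;
while ((section = await reader.ReadNextSectionAsync()) != null)
{
    // AsFileSection() returns null for plain form-field sections.
    var fileSection = section.AsFileSection();
    if (fileSection != null)
    {
        // fileSection.FileName and fileSection.FileStream are now available.
    }
}
```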
Next, we can create a temp file to receive the upload. If you're planning to persist the file on disk, this would be your final file. In this case, I use the temp file as a buffer, then upload the file to Azure Storage and finally delete the temp file.
To save on memory, we read the stream in chunks and write each chunk to the file as it comes in. This is all done asynchronously to make sure we're not blocking a thread while waiting on either stream.
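A sketch of the chunked copy, assuming fileSection came from the reader loop above (the 80 KB buffer size is an arbitrary choice, not a requirement):

```csharp
var tempPath = Path.GetTempFileName();
await using (var target = System.IO.File.Create(tempPath))
{
    var buffer = new byte[81920]; // 80 KB chunks; tune for your workload
    int bytesRead;
    while ((bytesRead = await fileSection.FileStream.ReadAsync(buffer)) > 0)
    {
        // Each chunk is flushed to disk before the next is read,
        // so memory usage stays flat regardless of file size.
        await target.WriteAsync(buffer.AsMemory(0, bytesRead));
    }
}

// ... upload tempPath to blob storage, then clean up:
System.IO.File.Delete(tempPath);
```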
Upload Files to Azure Storage
Next, we can take the file we just received and saved to disk and upload it to Azure Storage. We have several things we can customize, and as with anything real-world, we need to account for failures and resource limits. In this case, we configure a maximum parallel operation count of 1 with a 1 MB chunk size and tell it to retry up to 5 times with a delay of 2 seconds between each retry. This ensures that our file still makes it to Azure Storage even if a network glitch happens. This retry logic is also one of the main reasons we don't read from the input stream and write directly to the Azure stream: the file on disk allows us to retry without having to re-receive the file from the user.
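A sketch of those settings using the Azure.Storage.Blobs SDK (connectionString, the "uploads" container, and fileName are placeholders; older posts on this topic used the legacy WindowsAzure.Storage API, which exposes the same knobs under different names):

```csharp
using Azure.Storage;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

var clientOptions = new BlobClientOptions();
clientOptions.Retry.MaxRetries = 5;                  // retry up to 5 times
clientOptions.Retry.Delay = TimeSpan.FromSeconds(2); // 2 s between attempts

var blobClient = new BlobClient(connectionString, "uploads", fileName, clientOptions);

var transferOptions = new StorageTransferOptions
{
    MaximumConcurrency = 1,                // one parallel operation
    MaximumTransferSize = 1 * 1024 * 1024, // 1 MB chunks
};

await using var stream = System.IO.File.OpenRead(tempPath);
await blobClient.UploadAsync(stream, new BlobUploadOptions
{
    TransferOptions = transferOptions,
});
```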
Uploading large files using C#
System.Net.Http comes with a handy solution for this: we can use StreamContent to stream files and wrap them in a MultipartFormDataContent. We can optionally specify the boundary as well, in which case we no longer have to parse for it, since we preset it.
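A client-side sketch (the URL, field names, and custom boundary are illustrative):

```csharp
using System.Net.Http;
using System.Net.Http.Headers;

using var client = new HttpClient();

// The boundary argument is optional; omit it to let the framework generate one.
using var form = new MultipartFormDataContent("----MyCustomBoundary");

await using var fileStream = File.OpenRead(filePath);
// StreamContent reads the file lazily in bufferSize chunks instead of
// loading it all into memory up front.
var fileContent = new StreamContent(fileStream, bufferSize: 81920);
fileContent.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

form.Add(new StringContent("My vacation photos"), "description");
form.Add(fileContent, "file", Path.GetFileName(filePath));

var response = await client.PostAsync("https://example.com/api/upload", form);
response.EnsureSuccessStatusCode();
```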
To get some idea of progress, we can fire an event between chunks being uploaded. The buffer size should be chosen based on the expected upload speed and desired app responsiveness.
Next, all we need to do is wrap our MultipartFormDataContent in the ProgressableStreamContent, passing in an event handler to be fired every time the stream progresses.
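ProgressableStreamContent is not a framework type; it's a custom HttpContent. A sketch of one possible implementation (the callback shape is an assumption):

```csharp
using System.Net;
using System.Net.Http;

// Wraps another HttpContent and reports bytes sent as it serializes.
public class ProgressableStreamContent : HttpContent
{
    private const int BufferSize = 81920;
    private readonly HttpContent _inner;
    private readonly Action<long, long?> _progress; // (bytesSent, totalBytes)

    public ProgressableStreamContent(HttpContent inner, Action<long, long?> progress)
    {
        _inner = inner;
        _progress = progress;
        foreach (var header in inner.Headers)
            Headers.TryAddWithoutValidation(header.Key, header.Value);
    }

    protected override async Task SerializeToStreamAsync(Stream stream, TransportContext context)
    {
        var total = Headers.ContentLength;
        long sent = 0;
        var buffer = new byte[BufferSize];
        await using var source = await _inner.ReadAsStreamAsync();
        int read;
        while ((read = await source.ReadAsync(buffer)) > 0)
        {
            await stream.WriteAsync(buffer.AsMemory(0, read));
            sent += read;
            _progress(sent, total); // fire between chunks
        }
    }

    protected override bool TryComputeLength(out long length)
    {
        var len = _inner.Headers.ContentLength;
        length = len ?? -1;
        return len.HasValue;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}

// Usage sketch:
// var content = new ProgressableStreamContent(form,
//     (sent, total) => Console.WriteLine($"{sent}/{total} bytes"));
// await client.PostAsync(url, content);
```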
We still have room for optimization and depending on what we’re trying to optimize for, several things can be added:
- Configure a factory to start uploading chunks of the temp file as they’re written to disk (as they come in). This saves time in having to wait to receive the full file before we start uploading it, while still allowing us to retry the file if anything happens.
- Split the file into multiple chunks and upload them over several parallel sockets. This gets complicated but works great in a load-balanced scenario, and theoretically other network optimizations, such as jumbo frames, kick in and make the upload faster.
- Gzip-compress each chunk before sending it to the server. This adds CPU load to decompress the chunks/files, but depending on the kind of content you're working with, it could significantly cut down on upload time.