Jordan Hook

Software Developer

malware  dotnet core  C#  docker  containers  isolation  

If you haven't already done so, please read Part 1 to learn about how we developed the scraper used in this tutorial.

When working with dangerous files such as malware, it's important to take proper safety precautions. Accidentally opening or executing a malicious file on your system could have dire consequences, even on a virtual machine. To protect yourself whilst using the scraper, we are going to make a few changes to the code of our scraper and to how we use it. With all of this said, please note it is still strongly recommended that you use a full virtual machine, at the very least, when working with malware, as there is still a chance you could get infected!

Part 1: Code Changes

The first code change we are going to make is how the files are named after they are downloaded. Our original code was preserving the original file names and types based on the URL where each sample was stored. This is dangerous depending on how you plan on accessing the files later, as you could accidentally execute one of the files by clicking on it.

string targetPath = this.outputDirectory + "/" + sampleParts[sampleParts.Length - 1];

To fix this issue, at the very least on Windows-based operating systems, we are going to rename the files. There is also another reason for renaming them: our current setup could download duplicate files, or even overwrite different samples that happen to share a file name. To ensure each download receives a unique name, we are going to hash each file and use the hash as the name of the file.

If you want to understand more about hashing and cryptography, I recommend visiting this as they have some good articles about it. For the purposes of this tutorial we will be implementing SHA-256. In order to do this we will need to complete the following code changes:

  • Implement a function to calculate the SHA-256 hash of an entire file
  • Update the existing code to download files using unique temporary names (we are going to take a random number approach)
  • Calculate the hash of each file
  • Rename each file with its own unique hash

To get started, let's first write the code to generate a unique hash for each file. Luckily for us, most languages have built-in implementations of common hashing algorithms, so our code should look something like this:

using System.IO;
using System.Text;
using System.Security.Cryptography;

private string getFileHash(string filePath) {
    using (SHA256 sha = SHA256.Create()) {

        // Read file and compute hash as bytes
        byte[] shaHash = sha.ComputeHash(File.ReadAllBytes(filePath));

        // Convert hash bytes to a lowercase hex string
        StringBuilder sb = new StringBuilder();
        for(int i = 0; i < shaHash.Length; i++) {
            sb.Append(shaHash[i].ToString("x2"));
        }

        return sb.ToString();
    }
}
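As a quick sanity check, the function above should be deterministic: the same file contents always produce the same 64-character hex digest. Here is a standalone sketch (the HashDemo class and temp-file path are made up for illustration, not part of the scraper) hashing a file containing the text hello, whose SHA-256 is a fixed, publicly known value:

```csharp
using System;
using System.IO;
using System.Text;
using System.Security.Cryptography;

class HashDemo {
    // Same logic as getFileHash above, made static for a standalone demo
    static string GetFileHash(string filePath) {
        using (SHA256 sha = SHA256.Create()) {
            byte[] shaHash = sha.ComputeHash(File.ReadAllBytes(filePath));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < shaHash.Length; i++) {
                sb.Append(shaHash[i].ToString("x2"));
            }
            return sb.ToString();
        }
    }

    static void Main() {
        string path = Path.Combine(Path.GetTempPath(), "hash-demo.txt");
        File.WriteAllText(path, "hello");

        // SHA-256 of "hello" is a well-known constant value
        Console.WriteLine(GetFileHash(path));
        // 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
    }
}
```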

Next we need to ensure our samples are downloaded with unique names so that we don't accidentally overwrite them. To accomplish this, we will use random numbers to help generate the temporary file names. Now, before you ask or mention it below, I do know there is built-in functionality for generating temporary file names. However, we are trying to develop an application that isolates itself to only specified sections of the file system, so we know where our samples are at all times, even if something goes wrong and the application crashes.

The code to implement this will be added to a few different places in the scrapeSamples() function. Because of this, I am only going to show some of the new code below. For the full implementation, please visit the GitHub project linked below.

Random r = new Random();
string targetPath = this.outputDirectory + "/sample-" + r.Next(1, 9);

// Keep adding to the filename until a unique filename is found
while(File.Exists(targetPath)) {
    targetPath += r.Next(1, 9);
}

// ... the sample is downloaded to targetPath here ...

string shaHash = getFileHash(targetPath);

// Check if we have a duplicate sample
if(File.Exists(this.outputDirectory + "/" + shaHash)) {
    // Duplicate sample - discard the temporary file
    File.Delete(targetPath);
} else {
    // Rename file to its hash
    File.Move(targetPath, outputDirectory + "/" + shaHash);
}

Now all we need to do is test out our code. I am going to run the application in my samples directory using 4 threads.

dotnet run /samples/ 4

And to see if the new files have been renamed to hashes, we can list the files in our output directory:
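If you want to cross-check the rename logic outside the scraper, here is a small sketch using coreutils (sha256sum is a standard tool, not part of the scraper; the directory and sample name are made up) that reproduces the same temporary-name-then-hash rename:

```shell
# Recreate the scraper's rename step by hand (assumes coreutils)
dir=$(mktemp -d)
printf 'fake sample' > "$dir/sample-4"

# Compute the SHA-256 and rename the file to it, as the scraper does
hash=$(sha256sum "$dir/sample-4" | cut -d' ' -f1)
mv "$dir/sample-4" "$dir/$hash"

# The directory now contains a single 64-character hex name
ls "$dir"
```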

Part 2: Isolated Environments

The whole reason for rewriting this application in dotnet core was the ability to run it in a Linux-based environment. For this part of the tutorial we are going to run the application within a docker container to provide further isolation from the host operating system. With this being said, please note that Docker containers do not provide full virtualization like a virtual machine via VMware or VirtualBox would. However, they are a step that ensures we can run the scraper from any operating system capable of running docker containers. In other words, you could still run your virtual machine when analyzing malware, and execute the scraper, or even samples, within containers to help protect your analysis machine.

In order to run our project in a docker container, we are going to pre-build the application and create a Dockerfile.

# dotnet build
Microsoft (R) Build Engine version 15.9.20+g88f5fadfbe for .NET Core
Copyright (C) Microsoft Corporation. All rights reserved.

  Restore completed in 53.8 ms for /Users/jhook/Desktop/dotnet/scraper/scraper.csproj.
  scraper -> /Users/jhook/Desktop/dotnet/scraper/bin/Debug/netcoreapp2.2/scraper.dll

Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:00:01.34
# touch Dockerfile

Please note the path of the scraper.dll file as it is important later. 

Now that we have our application built, we need to configure our docker image to create a container utilizing dotnet core and our application. To do this we will use the official Microsoft image for docker and the path to our application's binaries noted above.

# Official Microsoft dotnet core runtime image
FROM mcr.microsoft.com/dotnet/core/runtime:2.2
WORKDIR /app

COPY bin/Debug/netcoreapp2.2/publish/ app/

ENTRYPOINT ["dotnet", "app/scraper.dll"]

Our Dockerfile does the following:

  • Creates a container using the Microsoft dotnet core runtime
  • Creates a directory called app in the root of the file system
  • Copies our application binaries to the app directory 
  • Executes our application by calling the dotnet runtime

Now we need to build our docker image:

docker build -t scraper .
Sending build context to Docker daemon  296.4kB
Step 1/4 : FROM
 ---> 136d49fe5bd7
Step 2/4 : WORKDIR /app
 ---> Running in d7b5487adc8a
Removing intermediate container d7b5487adc8a
 ---> 787fd38d8b4d
Step 3/4 : COPY bin/Debug/netcoreapp2.2/publish/ app/
 ---> 96619b278ea7
Step 4/4 : ENTRYPOINT ["dotnet", "app/scraper.dll"]
 ---> Running in 567430bd3a45
Removing intermediate container 567430bd3a45
 ---> d667d4c9f89e
Successfully built d667d4c9f89e
Successfully tagged scraper:latest

Running the command above in the same directory as our Dockerfile will create a new image for us called scraper.

Now we can run our image within a container by executing the command below:

docker run --name=scraper -v /Users/jhook/Desktop/dotnet/scraper/samples:/samples scraper /samples 4

After a few moments you should begin to see new samples appearing in your samples folder, with SHA-256 hashes as names. If you want to run this in the background, you can add the -d parameter to the run command to run the process in detached mode.
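For example, assuming the same paths as the command above (shown as a sketch; this requires Docker on the host), detached mode looks like this:

```shell
# Run the scraper container in the background (-d = detached mode)
docker run -d --name=scraper -v /Users/jhook/Desktop/dotnet/scraper/samples:/samples scraper /samples 4

# Follow the detached container's output at any time
docker logs -f scraper
```

docker logs works here because the container writes to stdout, which Docker captures even when the container is detached from your terminal.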

Part 3: Conclusion

Running our sample scraper within a docker container achieves a major goal: it allows us to scrape for samples on any docker-ready device. Our next steps could be automating this process to run on a schedule, or even adding additional layers of processing. For example, in another tutorial I may add another step to our pipeline to automatically perform basic analysis on our samples. If you have any suggestions or ideas for additions to this project, please comment below!

Full Source (GitHub)