A Behind-the-Scenes Look at Creating LOLDrivers

Get an insider’s view of how LOLDrivers came together: the planning, community contributions, and technical work that turned it into a vital resource for tracking vulnerable Windows drivers.

May 15, 2025 · 5 min read
Retro green-screen computer displaying driver listings, symbolizing the analysis process behind creating LOLDrivers

On October 31, 2023, our colleague Takahiro Haruyama of the Carbon Black Threat Analysis Unit (TAU) published research on vulnerable Windows kernel drivers. TAU identified 34 distinct drivers, comprising 237 unique file hashes, that accept firmware access. Six of these drivers also allow kernel memory access, and all of them give full control of the device to users without administrative privileges. In practice, an attacker with basic access could modify or erase firmware, or escalate privileges.

First, we highly recommend reading the original blog for all the details: https://blogs.vmware.com/security/2023/10/hunting-vulnerable-kernel-drivers.html

This blog peels back the curtain on the process the maintainer team uses to import and enrich a large collection of drivers for LOLDrivers. It is an inside look at the nuts and bolts of operational cybersecurity: how a seemingly daunting task can be streamlined with automation and the right tooling.

Gather The Hash

Streamlining Driver Contributions with Automation

For modest contributions involving fewer than five new drivers, our LOLDrivers Streamlit App should suffice. However, when the task is to add more than 200 hashes, automation becomes a necessity.

Previously, we’ve relied on converting CSVs into YAML to handle large batches — a method that served us well during the initial stages of LOLDrivers. This time, we are scaling up the process. Our starting point is extracting hashes from the blog, where each link points to a VirusTotal (VT) page or a VT Collection. From there, we can download the samples and collect the MD5 hashes.

We structure our data into a CSV with two columns: md5 and tag. The tag corresponds either to the driver's name (like stdcdrv64.sys) or to the Tags value in the YAML.
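
As a rough illustration of how such a CSV could be assembled from downloaded samples, here is a minimal sketch. It assumes the samples sit in one subfolder per tag (for example samples/stdcdrv64.sys/), which is a layout we made up for this example; the helper names md5_of, build_rows, and write_csv are hypothetical as well.

```python
import csv
import hashlib
from pathlib import Path


def md5_of(path: Path, block_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_rows(samples_root: Path) -> list:
    """Build one {md5, tag} row per sample.

    Assumption: each subfolder of samples_root is named after the tag
    that its samples should carry.
    """
    rows = []
    for tag_dir in sorted(p for p in samples_root.iterdir() if p.is_dir()):
        for sample in sorted(p for p in tag_dir.iterdir() if p.is_file()):
            rows.append({"md5": md5_of(sample), "tag": tag_dir.name})
    return rows


def write_csv(rows: list, csv_path: Path) -> None:
    """Write the rows under the md5 and tag headers."""
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["md5", "tag"])
        writer.writeheader()
        writer.writerows(rows)
```

Calling write_csv(build_rows(Path("samples")), Path("list.csv")) would then produce a file with the two columns described above.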

Before generating the YAML, we set some prerequisites. To establish a baseline, we manually created the initial YAML files using the LOLDrivers platform, which then served as templates for automation.

Our automation effort led to the development of a basic Python script that performs the following functions:

  1. Loads the CSV and generates a new YAML file for each unique tag.
  2. Aggregates multiple hashes under a single YAML if they share the same tag.
  3. Populates the MD5 field under KnownVulnerableSamples in each YAML file.

Here is the script that automated the YAML generation process:

import csv
import uuid
from collections import defaultdict

import yaml  # third-party: PyYAML

csv_file_path = 'list.csv'
tag_to_md5s = defaultdict(list)

with open(csv_file_path, mode='r', newline='', encoding='utf-8') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        tag_to_md5s[row['tag']].append(row['md5'])

for tag, md5s in tag_to_md5s.items():
    id_guid = str(uuid.uuid4())

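    # Drop a trailing ".sys" first so a tag like stdcdrv64.sys becomes stdcdrv64, not stdcdrv64sys.
    base_name = tag[:-4] if tag.lower().endswith('.sys') else tag
    service_name = ''.join(char for char in base_name if char.isalnum() or char in (' ', '-', '_')).rstrip()
    service_filename = f"{service_name}.sys"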

    yaml_template = {
        'Id': id_guid,
        'Author': 'Takahiro Haruyama',
        'Created': '2023-11-02',
        'MitreID': 'T1068',
        'Category': 'vulnerable driver',
        'Verified': 'TRUE',
        'Commands': {
            'Command': f'sc.exe create {service_name} binPath= C:\\windows\\temp\\{service_filename} type=kernel && sc.exe start {service_name}',
            'Description': 'The Carbon Black Threat Analysis Unit (TAU) discovered 34 unique vulnerable drivers (237 file hashes) accepting firmware access. Six allow kernel memory access. All give full control of the devices to non-admin users. By exploiting the vulnerable drivers, an attacker without the system privilege may erase/alter firmware, and/or elevate privileges. As of the time of writing in October 2023, the filenames of the vulnerable drivers have not been made public until now.',
            'Usecase': 'Elevate privileges',
            'Privileges': 'kernel',
            'OperatingSystem': 'Windows 10',
        },
        'Resources': [
            'https://blogs.vmware.com/security/2023/10/hunting-vulnerable-kernel-drivers.html'
        ],
        'Acknowledgement': {
            'Person': '',
            'Handle': '',
        },
        'Detection': [],
        'KnownVulnerableSamples': [
            {
                'Filename': '',
                'MD5': md5,
                'SHA1': '',
                'SHA256': '',
                'Signature': '',
                'Date': '',
                'Publisher': '',
                'Company': '',
                'Description': '',
                'Product': '',
                'ProductVersion': '',
                'FileVersion': '',
                'MachineType': '',
                'OriginalFilename': ''
            }
            for md5 in md5s
        ],
        'Tags': [tag]
    }

    output_filename = f"{id_guid}.yaml"
    with open(output_filename, 'w', encoding='utf-8') as yamlfile:
        yaml.dump(yaml_template, yamlfile, default_flow_style=False)

print("YAML files have been generated based on tags.")

The script generated all the YAML files, grouping multiple MD5 hashes into a single YAML file wherever they shared a tag.

Sidebar: Standardizing File Names

During the process, we had to rename all samples in the project's drivers directory to a standardized naming convention: <md5>.bin. We did this with another Python script that calculates the MD5 hash of each file and renames the file accordingly.

import hashlib
import os

directory_path = '../drivers'

def calculate_md5(filename, block_size=4096):
    """ Calculate the MD5 hash of a file. """
    md5 = hashlib.md5()
    with open(filename, 'rb') as f:
        for block in iter(lambda: f.read(block_size), b''):
            md5.update(block)
    return md5.hexdigest()

for filename in os.listdir(directory_path):
    file_path = os.path.join(directory_path, filename)
    # Only regular files are samples; leave subdirectories alone.
    if os.path.isdir(file_path):
        continue
    file_md5 = calculate_md5(file_path)
    new_file_name = file_md5 + '.bin'
    # Skip files that already follow the <md5>.bin convention.
    if filename == new_file_name:
        continue
    new_file_path = os.path.join(directory_path, new_file_name)
    # Rename the file
    os.rename(file_path, new_file_path)
    print(f"Renamed {filename} to {new_file_name}")

print("All files have been renamed to their MD5 hash with a .bin extension.")

From Hash to Metadata Enrichment

With the YAML files ready, we used metadata-extractor.py, an exceptional script originally contributed by Nasreddine Bencherchali. It enriches the metadata shown in the YAML files and in the user interface. The script reads each MD5 from the YAML, finds the matching .bin file, loads it with lief, and extracts everything. Well, mostly (I'm sure we're missing some things).
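
To make the lookup step concrete, here is a minimal sketch of matching hashes to <md5>.bin files. It is not taken from metadata-extractor.py and does not parse anything with lief; the function names find_sample and split_by_availability are hypothetical. It only shows how one might check which hashes have a sample on disk before attempting enrichment.

```python
from pathlib import Path
from typing import Optional


def find_sample(md5: str, drivers_dir: Path) -> Optional[Path]:
    """Return the <md5>.bin path for a hash, or None when no sample is on disk."""
    candidate = drivers_dir / f"{md5.lower()}.bin"
    return candidate if candidate.is_file() else None


def split_by_availability(md5s, drivers_dir: Path):
    """Partition hashes into those with a matching sample and those without."""
    found, missing = {}, []
    for md5 in md5s:
        sample = find_sample(md5, drivers_dir)
        if sample is None:
            missing.append(md5)
        else:
            found[md5] = sample
    return found, missing
```

A check like this is a cheap way to list the hashes for which no sample is available locally.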

Execution is as simple as running the following command:

python3 bin/metadata-extractor.py -y yaml -d drivers

stdout:

    [*] NOTICE - The File yaml/dbb58de1-a1e5-4c7f-8fe0-4033502b1c63.yaml Was Enriched
    [*] NOTICE - The File yaml/3aa6e630-59be-4a15-a30c-aaed4c1edaf0.yaml Was Enriched
    [*] NOTICE - The File yaml/ebdde780-e142-44e7-a998-504c516f4695.yaml Was Enriched
    [*] NOTICE - The File yaml/b759adfa-b353-4ca3-9dfb-8fadf7a437eb.yaml Was Enriched
    [*] NOTICE - The File yaml/0c0198a3-5c63-4a9b-abe9-88a810602329.yaml Was Enriched
    [*] NOTICE - The File yaml/1aeb1205-8b02-42b6-a563-b953ea337c19.yaml Was Enriched
    [*] NOTICE - The File yaml/babe348d-f160-41ec-9db9-2413b989c1f0.yaml Was Enriched
    [*] NOTICE - The File yaml/5f70bde4-9f81-44a8-9d3e-c6c7cf65bfae.yaml Was Enriched
    [*] NOTICE - The File yaml/a08ee79f-801d-4b98-996f-55f6a72ac5f7.yaml Was Enriched
    [*] NOTICE - No enrichment was performed on yaml/d05a0a6c-c037-4647-99ac-c41593190223.yaml
    [*] NOTICE - The File yaml/e368efc7-cf69-47ae-8204-f69dac000b22.yaml Was Enriched
    [*] NOTICE - No enrichment was performed on yaml/7196366e-04f0-4aaf-9184-ed0a0d21a75f.yaml
    [*] NOTICE - The File yaml/6356d7d9-3b82-4731-9d5f-cc9bc37558fc.yaml Was Enriched
......

The MD5 entries in the YAML are now fully enriched.

Now with enriched metadata, our next step is to incorporate YARA rules and update all detections accordingly.

YARA Generator and Enrichment

Having extracted the metadata and enriched the YAML files, we next add YARA rules. The yara-generator.py script creates YARA rules from the drivers and YAML files and writes them to the detection directory. Our good friend Florian Roth contributed this, and it works great!

Make your way to bin/yara-generator and run a simple command to generate the YARA rules:

python3 yara-generator.py -d ../../drivers -y ../../yaml -o ../../detections/yara

This reads the drivers and YAML files and writes to the YARA detection directory. When it runs, it looks like this:

[INFO ] [+] Processing 1 driver folders
['../../drivers']
[INFO ] [+] Processing 1 YAML folders
[INFO ] [+] Processing 1821 sample files
....

And then it goes brrrrrrrrrr, ripping through the binaries and writing rules to the detections/yara directory.
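
For readers who have not written YARA before, here is a minimal sketch of rendering a rule from a tag and its hashes. It is a stand-in technique, not a reproduction of yara-generator.py: it emits a plain exact-hash rule using YARA's hash module. The function name exact_hash_rule and the vuln_driver_ prefix are assumptions for illustration.

```python
import re


def exact_hash_rule(tag: str, md5s: list) -> str:
    """Render a YARA rule that matches files whose MD5 is one of md5s."""
    if not md5s:
        raise ValueError("at least one MD5 is required")
    # YARA identifiers allow only letters, digits, and underscores.
    rule_name = "vuln_driver_" + re.sub(r"[^A-Za-z0-9_]", "_", tag)
    # The hash module returns lowercase hex, so normalize the input.
    conditions = " or\n        ".join(
        f'hash.md5(0, filesize) == "{md5.lower()}"' for md5 in md5s
    )
    return (
        'import "hash"\n\n'
        f"rule {rule_name}\n"
        "{\n"
        "    condition:\n"
        f"        {conditions}\n"
        "}\n"
    )
```

An exact-hash rule only catches known samples, so treat it as the simplest possible baseline rather than a substitute for the generated rules.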

Following YARA generation, we employ enrich_with_yara to embed the YARA rule URLs within the YAML files.
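
Conceptually, that step appends a rule URL to each entry's Detection list. Here is a minimal sketch on an already-loaded YAML entry (a plain dict). The type and value keys, the yara_signature label, and the function name add_yara_detection are assumptions for illustration, not a description of what enrich_with_yara actually writes.

```python
def add_yara_detection(entry: dict, rule_url: str) -> bool:
    """Append a YARA rule URL to entry['Detection'] unless it is already listed.

    Returns True when the entry was changed.
    """
    # An empty YAML list may load as None, so normalize it to a list first.
    detections = entry.get("Detection") or []
    entry["Detection"] = detections
    if any(item.get("value") == rule_url for item in detections):
        return False
    # Hypothetical record shape: a type label plus the rule URL.
    detections.append({"type": "yara_signature", "value": rule_url})
    return True
```

Making the update idempotent means the enrichment can be re-run after every YARA generation without piling up duplicate links.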

Generate Site

To ensure everything is in order, we run the site generation script, which we typically use as a validation step before merging a PR.

python3 bin/site.py

site_gen.py wrote 430 drivers markdown to: loldrivers.io/content/drivers/
site_gen.py wrote drivers CSV to: loldrivers.io/content/api/drivers.csv
site_gen.py wrote drivers JSON to: loldrivers.io/content/api/drivers.json
site_gen.py wrote drivers table to: loldrivers.io/content/drivers_table.csv
site_gen.py wrote drivers publishers to: loldrivers.io/content/drivers_top_n_publishers.csv
site_gen.py wrote drivers products to: loldrivers.io/content/drivers_top_n_products.csv
finished successfully!

This generates the CSV and JSON for the API, all the markdown files for the site, and everything else related to the LOLDrivers.io site.
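
To give a feel for the API export, here is a minimal sketch that writes drivers.json and drivers.csv from a list of loaded YAML entries. It is not the site script itself: the function name export_api_files and the three CSV columns are assumptions chosen to keep the example short.

```python
import csv
import json
from pathlib import Path


def export_api_files(entries: list, out_dir: Path) -> None:
    """Write drivers.json and drivers.csv for a list of driver entries."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # JSON keeps the nested structure of each entry as-is.
    (out_dir / "drivers.json").write_text(
        json.dumps(entries, indent=2), encoding="utf-8"
    )

    # CSV is flat, so emit one row per known sample (hypothetical columns).
    with (out_dir / "drivers.csv").open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["Id", "Tag", "MD5"])
        writer.writeheader()
        for entry in entries:
            tag = ", ".join(entry.get("Tags") or [])
            for sample in entry.get("KnownVulnerableSamples") or []:
                writer.writerow(
                    {"Id": entry.get("Id", ""), "Tag": tag, "MD5": sample.get("MD5", "")}
                )
```

Because a build like this has to read every entry, a malformed YAML file tends to surface here, which is why a site build is handy as a pre-merge check.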

Final Result

In the end, we have a PR ready to be merged.

Now merged, we have the new pages.

Thank You

We hope this blog has shed light on the automation techniques the LOLDrivers.io maintainers use to handle a large set of data efficiently. What began as meticulous manual work is now a set of scripts working together to compile hashes, generate YAMLs, rename binaries, and enrich metadata.

We also want to thank the entire Carbon Black Threat Analysis Unit (TAU) team and Takahiro Haruyama.

This transition has accelerated our workflow and underscored our dedication to improving LOLDrivers.io, and we intend to keep refining our capabilities. We warmly invite the community to take part: to contribute, to collaborate, and to collectively improve the safety and integrity of drivers everywhere.

Written by

Michael Haag

Threat Researcher

In the intricate chessboard of cybersecurity, my role oscillates between a master tactician and a relentless hunter. As an expert in detection engineering and threat hunting, I don't just respond to the digital threats, I anticipate them, ensuring that the digital realm remains sovereign.

© 2025 MagicSword. All rights reserved.