
Windows Server 2025 as an AI Host: Docker, GPU Passthrough, and Operational Baselines


As CTO of my own company, I recently attended a conference hyping GPU-powered AI, and I came away annoyed by how far behind Windows is in most of these conversations. Our stack is all Windows Server, but most AI tools still assume Linux. When I priced out a full move to Linux (new skills, new policies, new everything), it honestly didn’t make any sense. I wanted AI now, but I didn’t want to torch a Windows stack that already runs our business.

What finally gave me hope was a combo that actually works: Windows Server 2025 + Docker + the NVIDIA Container Toolkit. Put together the right way, it lets me run GPU-accelerated containers on our Windows infrastructure in a setup that feels realistic for production. It’s not a neat and clean three-step checklist because, frankly, this stuff is messy. But I’ll walk through what mattered for me: the drivers that count, the Docker settings that tend to cause issues, and the runtime choices that decide whether the GPU shows up at all.

By the end, you’ll have Windows Server 2025 running Docker containers with full NVIDIA GPU support, ready for inference, model serving, or whatever comes next.

What is Windows Server 2025 + Docker + NVIDIA Toolkit?

This infrastructure stack combines Microsoft’s latest Windows Server 2025 with Docker containerization and NVIDIA’s GPU acceleration toolkit to create a platform for running AI workloads on Windows infrastructure. It’s essentially the enterprise Windows answer to the predominantly Linux-based AI deployment ecosystem.

Key Components:

  • Windows Server 2025: Latest Windows Server with enhanced container support and security features
  • Docker Engine: Container runtime that handles application packaging and deployment
  • NVIDIA Container Toolkit: GPU passthrough layer that exposes NVIDIA hardware to containers
  • GPU Passthrough: Direct GPU access from containers for compute-intensive AI tasks
  • Enterprise Integration: Full Active Directory, Group Policy, and Windows management tooling compatibility
  • Hybrid Architecture: Bridge between traditional Windows infrastructure and modern containerized AI workloads

Prerequisites

Before diving into this setup, make sure you have:

  • [ ] Windows Server 2025 (Standard or Datacenter edition)
  • [ ] NVIDIA GPU with compute capability 3.5 or higher (RTX 20/30/40 series, Tesla, Quadro)
  • [ ] Latest NVIDIA GPU drivers (R470 branch or newer)
  • [ ] Minimum 16GB RAM (32GB recommended for AI workloads)
  • [ ] Administrative privileges on the Windows Server
  • [ ] Internet connectivity for downloading components
  • [ ] At least 100GB free disk space for containers and models
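
Two of these items (free disk space and driver presence) are easy to script before you start. Here’s a minimal sketch in Python; the 100 GB threshold mirrors the checklist above, and treating an `nvidia-smi` binary on PATH as a proxy for “driver installed” is my own shortcut, not an official check. Run it with `check_prerequisites("C:\\")` on the server.

```python
import shutil

def check_prerequisites(path, min_disk_gb=100):
    """Check the disk-space and driver items from the checklist above."""
    results = {}
    # Free space on the drive that will hold container images and models
    free_gb = shutil.disk_usage(path).free / 1024**3
    results["free_gb"] = round(free_gb, 1)
    results["disk_ok"] = free_gb >= min_disk_gb
    # nvidia-smi on PATH is a cheap proxy for "NVIDIA driver installed";
    # the real verification happens in Step 2 with the nvidia-smi output
    results["driver_on_path"] = shutil.which("nvidia-smi") is not None
    return results
```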

Step-by-Step Installation Guide

Step 1: Verify System Requirements and GPU Detection

First, confirm your system meets the requirements and Windows can see your NVIDIA GPU.

Open Server Manager and navigate to Local Server to verify your Windows Server 2025 installation.

[Screenshot: Server Manager showing Windows Server 2025 system information]

Open Device Manager by right-clicking the Start button and selecting Device Manager. Expand Display adapters to confirm your NVIDIA GPU is detected and has no warning indicators.

[Screenshot: Device Manager showing NVIDIA GPU under Display adapters without warning icons]

Expected result: You should see your NVIDIA GPU listed without any yellow warning triangles or error indicators.

Step 2: Install Latest NVIDIA GPU Drivers

Download the latest NVIDIA drivers from the NVIDIA Driver Download page. Select Windows Server 2022 as the operating system (Server 2025 drivers use the 2022 branch).

Run the downloaded installer and select Custom (Advanced) installation type. Check Perform a clean installation to ensure no conflicting drivers remain.

[Screenshot: NVIDIA driver installer showing Custom installation options with clean installation checked]

After installation, open Command Prompt as Administrator and verify the driver installation:

nvidia-smi

Expected result: You should see output displaying your GPU information, driver version, and CUDA version.

[Screenshot: Command Prompt showing nvidia-smi output with GPU details and driver version]
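
Beyond eyeballing the default table, nvidia-smi can emit machine-readable CSV via its real `--query-gpu`/`--format=csv` options, which becomes useful later for automated health checks. A small parser sketch; the sample string is a stand-in for live output, not output from my hardware:

```python
import csv
import io

# Example query:
#   nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
def parse_gpu_csv(text):
    """Parse nvidia-smi CSV output into a list of dicts, one per GPU."""
    gpus = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        name, driver, mem = (field.strip() for field in row[:3])
        gpus.append({"name": name, "driver": driver, "memory": mem})
    return gpus

# Illustrative sample only -- substitute your captured nvidia-smi output
sample = "NVIDIA RTX A4000, 537.13, 16376 MiB\n"
```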

Step 3: Enable Windows Container Features

This is where the foundation work starts getting tricky. Windows doesn’t enable container support by default, and you need both Containers and Hyper-V working properly for Docker to behave.

Open PowerShell as Administrator and enable the Containers feature:

Enable-WindowsOptionalFeature -Online -FeatureName Containers -All -NoRestart

Enable Hyper-V (required for Docker Desktop):

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All -NoRestart

Restart the server to apply these changes:

Restart-Computer -Force

After reboot, verify the features are enabled:

Get-WindowsOptionalFeature -Online -FeatureName Containers
Get-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V

Expected result: Both features should show State : Enabled.
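
If you want to script this verification (say, across several hosts), the `State : Enabled` line is easy to pull out of the cmdlet’s captured text output. A hedged sketch; it assumes you have redirected `Get-WindowsOptionalFeature` output to text yourself:

```python
def feature_enabled(cmdlet_output):
    """Return True if a 'State : Enabled' line appears in the captured output."""
    for line in cmdlet_output.splitlines():
        # Lines look like 'State            : Enabled'
        if line.strip().lower().startswith("state"):
            _, _, value = line.partition(":")
            return value.strip().lower() == "enabled"
    return False

# Illustrative sample mimicking the cmdlet's output shape
sample = "FeatureName : Containers\nState       : Enabled\n"
```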

Step 4: Install Docker Desktop for Windows

Download Docker Desktop for Windows from the official Docker website. Choose the version that supports Windows containers.

Run the installer and ensure Use Windows containers instead of Linux containers is checked during installation.

[Screenshot: Docker Desktop installer configuration screen with Windows containers option selected]

After installation, Docker Desktop should start automatically. If not, launch it from the Start menu.

Open Settings in Docker Desktop (gear icon in the top-right) and navigate to General. Verify that Use Windows containers is selected.

[Screenshot: Docker Desktop Settings showing General tab with Windows containers selected]

Test Docker installation by opening Command Prompt and running:

docker version

Expected result: You should see both client and server version information, confirming Docker is running.

Step 5: Install NVIDIA Container Toolkit

Alright, this is where things start feeling like you’re assembling something with missing instructions. The NVIDIA Container Toolkit needs to be configured specifically for Windows containers, and it’s not as straightforward as the Linux equivalent.

First, download the NVIDIA Container Toolkit for Windows from the NVIDIA Container Toolkit releases page. Look for the Windows installer (.msi file).

Run the installer with administrative privileges. The installer will configure the necessary runtime components automatically.

[Screenshot: NVIDIA Container Toolkit installer welcome screen]

After installation, restart the Docker service to recognize the new runtime:

Restart-Service docker

Verify the NVIDIA runtime is available:

docker info

Look for nvidia in the Runtimes section of the output.

Expected result: The Docker info output should include nvidia as an available runtime.
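
This check can also be scripted, which I find useful in provisioning pipelines. A minimal sketch: the parsing function is pure and shown against a sample line, while the wrapper that shells out assumes `docker` is on PATH and the daemon is running.

```python
import subprocess

def has_nvidia_runtime(info_text):
    """Scan `docker info` plain-text output for an nvidia entry on the Runtimes line."""
    for line in info_text.splitlines():
        if line.strip().lower().startswith("runtimes:"):
            return "nvidia" in line.lower()
    return False

def check_docker_runtime():
    # Host-only: assumes Docker is installed and the service is up
    out = subprocess.run(["docker", "info"], capture_output=True, text=True).stdout
    return has_nvidia_runtime(out)

# Illustrative sample of the line this looks for
sample = " Runtimes: nvidia runc\n"
```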

Step 6: Configure Docker Daemon for GPU Support

Create or edit the Docker daemon configuration file. Navigate to C:\ProgramData\Docker\config\ and create or edit daemon.json:

{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "experimental": false
}

[Screenshot: Notepad showing daemon.json configuration file with NVIDIA runtime settings]

Restart Docker Desktop or the Docker service:

Restart-Service docker

Wait for Docker to fully restart (check Docker Desktop shows “Engine running”).
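
Hand-editing JSON on several hosts invites typos, so I script the daemon.json write. This sketch emits exactly the configuration shown above; the `C:\ProgramData\Docker\config` path in the usage comment is the one from this step, and you can point it anywhere for a dry run.

```python
import json
from pathlib import Path

def write_daemon_json(config_dir):
    """Write the NVIDIA-runtime daemon.json used in this step and return its path."""
    config = {
        "default-runtime": "nvidia",
        "runtimes": {
            "nvidia": {
                "path": "nvidia-container-runtime",
                "runtimeArgs": [],
            }
        },
        "experimental": False,
    }
    path = Path(config_dir) / "daemon.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2))
    return path

# On the server: write_daemon_json(r"C:\ProgramData\Docker\config")
```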

Step 7: Test GPU Access in Containers

This is the make-or-break moment. Testing GPU access reveals just how different Windows containers are from their Linux counterparts, and the syntax here is going to feel foreign if you’re used to Linux Docker commands.

Test GPU access with a simple NVIDIA container:

docker run --rm --gpus all mcr.microsoft.com/windows/servercore:ltsc2022 cmd /c "nvidia-smi"

This command might fail initially because Windows containers handle GPU access differently than Linux containers. Instead, try:

docker run --rm --isolation=process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 mcr.microsoft.com/windows/servercore:ltsc2022 cmd /c "nvidia-smi"

The device class GUID 5B45201D-F2F2-4F3B-85BB-30FF1F953599 represents GPU devices in Windows containers.

Expected result: You should see nvidia-smi output from within the container, showing your GPU is accessible.

[Screenshot: Command Prompt showing successful nvidia-smi output from within a Windows container]

Yeah, that GUID syntax is as awkward as it looks. But when it works, you’ll know you’ve cleared the biggest hurdle.
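
Because that GUID is easy to mistype, I keep a small helper that assembles the docker run arguments. A sketch: the GUID and flags are the ones used throughout this guide, while the image and command are whatever you pass in; hand the result to `subprocess.run(...)` on the host.

```python
# GPU device class GUID for Windows containers, as used throughout this guide
GPU_DEVICE_CLASS = "class/5B45201D-F2F2-4F3B-85BB-30FF1F953599"

def gpu_run_args(image, command=None, extra_args=None):
    """Build the argument list for a GPU-enabled Windows container run."""
    args = [
        "docker", "run", "--rm",
        "--isolation=process",         # process isolation is required for GPU access
        "--device", GPU_DEVICE_CLASS,  # device class GUID instead of --gpus
    ]
    args += extra_args or []           # e.g. ["--memory=8g"]
    args.append(image)
    args += command or []
    return args

# Example: the same test command as above, built programmatically
cmd = gpu_run_args("mcr.microsoft.com/windows/servercore:ltsc2022",
                   ["cmd", "/c", "nvidia-smi"])
```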

Step 8: Set Up AI Workload Test Environment

Time to see if all this complexity actually pays off with a real AI workload test.

Create a test directory for your AI workloads:

mkdir C:\AIWorkloads
cd C:\AIWorkloads

Create a simple Dockerfile to test GPU-accelerated Python workloads:

# Use a pre-built Python Windows container as the base image
FROM python:3.11-windowsservercore-ltsc2022

# Install basic GPU testing packages (CUDA 11.8 wheels)
RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Copy test script
COPY test_gpu.py .

# Run test
CMD ["python", "test_gpu.py"]

Create a simple GPU test script (test_gpu.py):

import torch

def test_gpu():
    if torch.cuda.is_available():
        device = torch.cuda.get_device_name(0)
        print(f"GPU detected: {device}")
        print(f"CUDA version: {torch.version.cuda}")
        
        # Simple tensor operation on GPU
        x = torch.rand(1000, 1000).cuda()
        y = torch.rand(1000, 1000).cuda()
        z = torch.matmul(x, y)
        
        print("GPU computation successful!")
        return True
    else:
        print("No GPU detected")
        return False

if __name__ == "__main__":
    test_gpu()

Build and run the test container:

docker build -t ai-gpu-test .
docker run --rm --isolation=process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 ai-gpu-test

Expected result: The container should detect your GPU and successfully perform a GPU computation.

Configuration Options

Docker Daemon Configuration

The daemon.json file supports additional GPU-related settings:

{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "experimental": false,
  "log-level": "info",
  "storage-driver": "windowsfilter",
  "registry-mirrors": [],
  "insecure-registries": []
}

Environment Variables for GPU Control

Control GPU visibility and capabilities in containers:

# Make specific GPUs visible
docker run --rm --isolation=process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 -e NVIDIA_VISIBLE_DEVICES=0 your-image

# Specify required capabilities
docker run --rm --isolation=process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 -e NVIDIA_DRIVER_CAPABILITIES=compute,utility your-image

Resource Limits and Allocation

Set memory and CPU limits for AI workloads:

docker run --rm --isolation=process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 --memory=8g --cpus="4.0" your-ai-image

Common Configuration Patterns

Production AI Inference Server

FROM python:3.11-windowsservercore-ltsc2022

# Install production dependencies
RUN pip install fastapi uvicorn torch torchvision transformers

# Copy application
COPY app/ /app/
WORKDIR /app

# Expose port
EXPOSE 8000

# Run server
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Batch Processing Container

FROM python:3.11-windowsservercore-ltsc2022

# Install batch processing tools
RUN pip install pandas numpy torch scikit-learn

# Copy scripts
COPY scripts/ /scripts/
WORKDIR /scripts

# Default to batch processing
CMD ["python", "batch_processor.py"]

Tips and Troubleshooting

Common Issues

Problem: “docker: Error response from daemon: container created but not started”

This usually happens when GPU device access is incorrectly configured. Windows containers require the specific device class GUID rather than the --gpus flag used in Linux.

Solution:

# Instead of --gpus all, use:
docker run --rm --isolation=process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 your-image

Problem: “nvidia-smi: command not found” inside container

The NVIDIA drivers aren’t properly passed through to the container. This often indicates the Container Toolkit isn’t correctly configured.

Solution:

  1. Verify the Container Toolkit installation
  2. Check that daemon.json has the correct runtime configuration
  3. Restart Docker service after any configuration changes

Problem: “CUDA out of memory” errors in containers

Container resource limits may be conflicting with GPU memory allocation.

Solution:

# Increase container memory limits
docker run --rm --isolation=process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 --memory=16g your-image

Problem: Container isolation issues with GPU access

Windows container isolation modes can interfere with GPU passthrough.

Solution: Always use --isolation=process for GPU-enabled containers:

docker run --rm --isolation=process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 your-image

Pro Tips

  • Monitor GPU utilization: Use nvidia-smi -l 1 to continuously monitor GPU usage during container operations
  • Container image optimization: Use multi-stage builds to reduce final image size for AI workloads
  • Resource monitoring: Set up Windows Performance Toolkit to monitor container resource usage
  • Backup configurations: Always backup your daemon.json before making changes
  • Version compatibility: Keep NVIDIA drivers, Container Toolkit, and Docker versions aligned for best compatibility
  • Development workflow: Use Docker Compose for multi-container AI applications with shared GPU access
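
To act on the “monitor GPU utilization” tip programmatically rather than watching `nvidia-smi -l 1`, the CSV query mode helps again. A sketch of a one-shot sampler; the query flags are real nvidia-smi options, the parsing is pure, and the host call is only defined, not run, since it needs a live GPU.

```python
import subprocess

def parse_utilization(csv_line):
    """Parse one line of
    `nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader,nounits`
    into (gpu_percent, memory_used_mib)."""
    util, mem = (field.strip() for field in csv_line.split(","))
    return int(util), int(mem)

def sample_gpu():
    # Host-only: assumes the NVIDIA driver (and nvidia-smi) is installed
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used",
         "--format=csv,noheader,nounits"],
        text=True)
    return parse_utilization(out.splitlines()[0])

# On the server, poll sample_gpu() on a timer and alert past a threshold
```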

The reality is that troubleshooting this setup requires patience. When something breaks, it’s usually at the intersection of three different technology stacks, and error messages can be cryptic. But once you get past the initial configuration hurdles, the platform is surprisingly stable.

Conclusion

When the first GPU computation finally ran inside the container, I was quite happy with myself! It’s not the clean, elegant Linux experience, what with all those device class GUIDs and isolation settings, but the payoff is absolutely worth it for a Windows-first shop like mine.

This approach lets me keep what already works (Active Directory, Group Policy, our monitoring stack, and the team’s Windows instincts) while still running serious AI workloads with full GPU acceleration. The passthrough quirks are annoying, sure, but they’re not deal-breakers. They’re just the cost of bridging two worlds, and once everything clicks, the setup becomes a solid foundation for inference, model serving, and batch processing.

What comes next:

  • Build your first production inference pipeline on this foundation
  • Set up proper monitoring for GPU utilization and container health
  • Explore orchestration options with Docker Swarm or Kubernetes for Windows

The hard part is behind you. Now you can focus on the AI workloads themselves instead of fighting the infrastructure.