Set up Ollama in the AppHost

This article is the reference for the Aspire Ollama hosting integration from the Aspire Community Toolkit. It enumerates the AppHost APIs — with examples for both AppHost.cs and apphost.mts — that you use to model an Ollama server and its model resources in your AppHost project.

If you’re new to the Ollama integration, start with the Get started with Ollama integrations guide. For how consuming apps read the connection information this page exposes, see Connect to Ollama.

To start building an Aspire app that uses Ollama, install the 📦 CommunityToolkit.Aspire.Hosting.Ollama NuGet package:

Terminal
aspire add ollama --source CommunityToolkit

Learn more about aspire add in the command reference.

Or, choose a manual installation approach:

C# — AppHost.cs
#:package CommunityToolkit.Aspire.Hosting.Ollama@*
XML — AppHost.csproj
<PackageReference Include="CommunityToolkit.Aspire.Hosting.Ollama" Version="*" />

Once you’ve installed the hosting integration in your AppHost project, you can add an Ollama server resource and then add model resources as shown in the following examples:

C# — AppHost.cs
var builder = DistributedApplication.CreateBuilder(args);
var ollama = builder.AddOllama("ollama");
var llama3 = ollama.AddModel("llama3");
var exampleProject = builder.AddProject<Projects.ExampleProject>("apiservice")
    .WithReference(llama3);
// After adding all resources, run the app...
  1. When Aspire adds a container image to the AppHost, as shown in the preceding example with the docker.io/ollama/ollama image, it creates a new Ollama instance on your local machine.

  2. The resource name passed to AddOllama is used as the connection string name when the server resource is referenced by a dependency. When you don't name a model sub-resource explicitly, its name is generated from the server resource name and the sanitized model name (for example, adding "llama3" to the "ollama" server produces the resource name "ollama-llama3").

  3. The WithReference call configures a connection in the consuming project named after the referenced model resource.
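
The generated-name rule can be illustrated with a small TypeScript sketch. The helper `modelResourceName` is hypothetical and not part of the integration. The sanitization it applies (dropping any `:tag` suffix and replacing `.` with `-`) is an assumption, because this page only documents the case where "llama3" becomes "ollama-llama3".

```typescript
// Hypothetical helper: NOT part of the integration's API.
// Assumed rule: drop any ":tag" suffix, replace "." with "-",
// then prefix the result with the Ollama server resource name.
function modelResourceName(serverName: string, modelName: string): string {
  const colon = modelName.indexOf(":");
  const withoutTag = colon === -1 ? modelName : modelName.slice(0, colon);
  const sanitized = withoutTag.replace(/\./g, "-");
  return `${serverName}-${sanitized}`;
}

// Matches the documented example.
console.log(modelResourceName("ollama", "llama3")); // "ollama-llama3"
// Follows only from the assumed sanitization rule.
console.log(modelResourceName("ollama", "phi3.5")); // "ollama-phi3-5"
```

If the generated name matters to a consuming app, passing an explicit resource name to AddModel avoids relying on the sanitization details.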

Models are sub-resources of an Ollama server resource. You add them with the AddModel (or addModel) method. Each model is downloaded when the Ollama container first starts.

C# — AppHost.cs
var builder = DistributedApplication.CreateBuilder(args);
var ollama = builder.AddOllama("ollama");
// Add by model name — resource name is generated automatically
var phi35 = ollama.AddModel("phi3.5");
// Add with an explicit resource name
var llama3 = ollama.AddModel("ollama-llama3", "llama3");
var exampleProject = builder.AddProject<Projects.ExampleProject>("apiservice")
    .WithReference(phi35)
    .WithReference(llama3);
// After adding all resources, run the app...

When the Ollama container first starts, it downloads the configured models. Download progress is shown in the State column for the resource on the Aspire dashboard.
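
As a rough illustration of where a progress figure can come from: Ollama's REST API streams JSON status objects from its `/api/pull` endpoint, and some of them carry `total` and `completed` byte counts. The sketch below turns one such line into a percentage. `PullStatus` and `pullPercent` are hypothetical names, and how the integration itself formats the State column isn't described on this page.

```typescript
// Assumed shape of one streamed status object from Ollama's /api/pull endpoint.
interface PullStatus {
  status: string;
  total?: number;     // total bytes for the layer being pulled
  completed?: number; // bytes downloaded so far
}

// Returns the download percentage (0 to 100) for one status line, or
// undefined when the line carries no byte counts (e.g. "pulling manifest").
function pullPercent(jsonLine: string): number | undefined {
  const update = JSON.parse(jsonLine) as PullStatus;
  if (update.total === undefined || update.total <= 0) {
    return undefined;
  }
  const completed = update.completed ?? 0;
  return Math.min(100, Math.floor((completed / update.total) * 100));
}

console.log(pullPercent('{"status":"pulling manifest"}'));
console.log(pullPercent('{"status":"pulling layer","total":200,"completed":50}'));
```

The first call has no byte counts and yields `undefined`; the second yields `25`.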

Add a data volume to the Ollama resource to persist downloaded models across container restarts:

C# — AppHost.cs
var builder = DistributedApplication.CreateBuilder(args);
var ollama = builder.AddOllama("ollama")
    .WithDataVolume();
var llama3 = ollama.AddModel("llama3");
var exampleProject = builder.AddProject<Projects.ExampleProject>("apiservice")
    .WithReference(llama3);
// After adding all resources, run the app...

The data volume is mounted at /root/.ollama in the Ollama container. When a name parameter isn’t provided, the volume name is generated automatically. For more information on data volumes and details on why they’re preferred over bind mounts, see Docker docs: Volumes.

By default, the Ollama container runs on CPU. To enable GPU acceleration, use the WithGPUSupport (or withGPUSupport) extension method:

NVIDIA (used when no vendor is specified):

C# — AppHost.cs
var builder = DistributedApplication.CreateBuilder(args);
var ollama = builder.AddOllama("ollama")
    .WithGPUSupport();
ollama.AddModel("llama3");
// After adding all resources, run the app...

AMD:

C# — AppHost.cs
var builder = DistributedApplication.CreateBuilder(args);
var ollama = builder.AddOllama("ollama")
    .WithGPUSupport(OllamaGpuVendor.AMD);
ollama.AddModel("llama3");
// After adding all resources, run the app...

For more information, see GPU support in Docker Desktop and GPU support in Podman.

To use a model in GGUF format from Hugging Face, use the AddHuggingFaceModel (or addHuggingFaceModel) extension method:

C# — AppHost.cs
var builder = DistributedApplication.CreateBuilder(args);
var ollama = builder.AddOllama("ollama");
var llama = ollama.AddHuggingFaceModel(
    "llama",
    "bartowski/Llama-3.2-1B-Instruct-GGUF:IQ4_XS");
var exampleProject = builder.AddProject<Projects.ExampleProject>("apiservice")
    .WithReference(llama);
// After adding all resources, run the app...

Only models in GGUF format are supported. The hf.co/ prefix is added automatically if the model name doesn't already start with hf.co/ or huggingface.co/.
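
The prefix rule above can be sketched in a few lines of TypeScript. `toHuggingFaceModelName` is a hypothetical helper written only to illustrate the rule; it is not part of the integration, and the case-sensitive comparison is an assumption.

```typescript
// Prefixes that mark a model name as already pointing at Hugging Face.
const HF_PREFIXES = ["hf.co/", "huggingface.co/"];

// Hypothetical helper: returns the name unchanged when it already carries
// one of the prefixes, otherwise prepends "hf.co/".
// Assumption: the prefix comparison is case-sensitive.
function toHuggingFaceModelName(modelName: string): string {
  const alreadyPrefixed = HF_PREFIXES.some((prefix) => modelName.startsWith(prefix));
  return alreadyPrefixed ? modelName : `hf.co/${modelName}`;
}

console.log(toHuggingFaceModelName("bartowski/Llama-3.2-1B-Instruct-GGUF:IQ4_XS"));
console.log(toHuggingFaceModelName("hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF:IQ4_XS"));
```

Both calls produce the same fully qualified name, so either form can be passed to AddHuggingFaceModel.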

To use a fixed port for the Ollama container, pass the port parameter to AddOllama:

C# — AppHost.cs
var builder = DistributedApplication.CreateBuilder(args);
var ollama = builder.AddOllama("ollama", port: 11434);
ollama.AddModel("llama3");
// After adding all resources, run the app...

The Ollama integration also provides support for running Open WebUI alongside the Ollama container:

C# — AppHost.cs
var builder = DistributedApplication.CreateBuilder(args);
var ollama = builder.AddOllama("ollama")
    .WithOpenWebUI();
ollama.AddModel("llama3");
// After adding all resources, run the app...

For the full reference of Ollama connection properties — and how consuming apps in C#, TypeScript, Python, and Go read them — see Connect to Ollama.

The Ollama hosting integration automatically adds health checks for both the Ollama server and each model resource:

  • Server health check. Verifies that the Ollama server is running and that a connection can be established to it.
  • Model health check. Verifies that a model has been downloaded and is available. The model resource is marked as unhealthy until the download completes.
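
To make the two checks concrete, here is a minimal TypeScript sketch of equivalent probes against Ollama's REST API. It is not the integration's implementation: `checkHealth`, `isModelAvailable`, `withDefaultTag`, and the `HttpGet` type are hypothetical names, and the sketch assumes that the server root URL responds successfully when Ollama is running and that `GET /api/tags` lists downloaded models with names such as "llama3:latest".

```typescript
// Minimal subset of the JSON assumed to be returned by GET /api/tags.
interface TagsResponse {
  models: { name: string }[];
}

// Minimal fetch-like signature so the sketch can run without a network.
type HttpGet = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Assumption: untagged model names are listed with the ":latest" tag.
function withDefaultTag(model: string): string {
  return model.includes(":") ? model : `${model}:latest`;
}

// Model check: true when the requested model appears in the tags list.
function isModelAvailable(tags: TagsResponse, model: string): boolean {
  const wanted = withDefaultTag(model);
  return tags.models.some((m) => m.name === wanted);
}

// Runs the server check first, then the model check.
async function checkHealth(
  baseUrl: string,
  model: string,
  httpGet: HttpGet,
): Promise<{ server: boolean; model: boolean }> {
  const root = await httpGet(baseUrl);
  if (!root.ok) {
    return { server: false, model: false };
  }
  const res = await httpGet(`${baseUrl}/api/tags`);
  if (!res.ok) {
    return { server: true, model: false };
  }
  const tags = (await res.json()) as TagsResponse;
  return { server: true, model: isModelAvailable(tags, model) };
}
```

In a runtime with a global `fetch`, the sketch could be called as `checkHealth("http://localhost:11434", "llama3", (url) => fetch(url))`, where the port matches the fixed-port example earlier on this page.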