Azure Data Lake Storage
Azure Data Lake Storage Gen2 is a set of capabilities built on Azure Blob Storage for big data analytics. The Aspire Azure Data Lake Storage hosting integration models Data Lake resources as children of an Azure Storage resource, and the client integration registers DataLakeServiceClient and DataLakeFileSystemClient instances for dependency injection.
Hosting integration
The Azure Data Lake Storage hosting integration models a Data Lake resource as a child of an Azure Storage resource. To add a Data Lake resource, install the 📦 Aspire.Hosting.Azure.Storage NuGet package in your AppHost project:

aspire add azure-storage

Learn more about aspire add in the command reference.
Or, choose a manual installation approach:
For a file-based AppHost, add a package directive:

#:package Aspire.Hosting.Azure.Storage@*

For a project-based AppHost, add a package reference to the project file:

<PackageReference Include="Aspire.Hosting.Azure.Storage" Version="*" />

When the AppHost uses an aspire.config.json file, aspire add azure-storage updates it with the Azure Storage hosting integration package:

{
  "packages": {
    "Aspire.Hosting.Azure.Storage": "13.3.0"
  }
}

Add Azure Data Lake resource
In your AppHost project, call AddDataLake (or addDataLake) on an Azure Storage resource builder to add a Data Lake resource:

C#:

var builder = DistributedApplication.CreateBuilder(args);

var storage = builder.AddAzureStorage("storage");
var dataLake = storage.AddDataLake("datalake");

builder.AddProject<Projects.ExampleProject>()
    .WithReference(dataLake)
    .WaitFor(dataLake);

// After adding all resources, run the app...
builder.Build().Run();

TypeScript:

import { createBuilder } from './.aspire/modules/aspire.mjs';

const builder = await createBuilder();

const storage = await builder.addAzureStorage("storage");
const dataLake = await storage.addDataLake("datalake");

await builder.addProject("api", "../ExampleProject/ExampleProject.csproj")
    .withReference(dataLake)
    .waitFor(dataLake);

// After adding all resources, run the app...
await builder.build().run();

The preceding code:

- Adds an Azure Storage resource named storage.
- Adds a Data Lake resource named datalake as a child of the storage resource.
- Passes a reference to the Data Lake resource to the consuming project and waits for it to be ready.
Add Azure Data Lake file system resource
You can also add a Data Lake file system resource directly from the storage resource using AddDataLakeFileSystem (or addDataLakeFileSystem):

C#:

var builder = DistributedApplication.CreateBuilder(args);

var storage = builder.AddAzureStorage("storage");
var dataLake = storage.AddDataLake("datalake");
var fileSystem = storage.AddDataLakeFileSystem("analytics", "analytics-data");

builder.AddProject<Projects.ExampleProject>()
    .WithReference(dataLake)
    .WithReference(fileSystem)
    .WaitFor(dataLake);

// After adding all resources, run the app...
builder.Build().Run();

TypeScript:

import { createBuilder } from './.aspire/modules/aspire.mjs';

const builder = await createBuilder();

const storage = await builder.addAzureStorage("storage");
const dataLake = await storage.addDataLake("datalake");
const fileSystem = await storage.addDataLakeFileSystem("analytics", {
    dataLakeFileSystemName: "analytics-data"
});

await builder.addProject("api", "../ExampleProject/ExampleProject.csproj")
    .withReference(dataLake)
    .withReference(fileSystem)
    .waitFor(dataLake);

// After adding all resources, run the app...
await builder.build().run();

The AddDataLakeFileSystem (or addDataLakeFileSystem) method takes:

- name: The resource name used in Aspire.
- dataLakeFileSystemName (optional): The actual file system name in Azure. Defaults to the resource name if not specified.
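The defaulting rule for the two parameters can be sketched as a small standalone function. The helpers below are hypothetical and are not part of the Aspire API. The validation applies the Azure container naming rules, which also govern Data Lake file systems; whether Aspire itself validates the name at this point is not stated here, so treat the check as an optional pre-flight step.

```typescript
// Hypothetical helpers for illustration only; they are not part of the Aspire API.

/**
 * Mirrors the documented default: when no explicit file system name is given,
 * the Aspire resource name is used as the Azure file system name.
 */
function resolveFileSystemName(
  resourceName: string,
  dataLakeFileSystemName?: string
): string {
  return dataLakeFileSystemName ?? resourceName;
}

/**
 * Checks a name against the Azure container naming rules: 3 to 63 characters,
 * lowercase letters, digits, and single hyphens, starting and ending with a
 * letter or digit.
 */
function isValidFileSystemName(name: string): boolean {
  return (
    name.length >= 3 &&
    name.length <= 63 &&
    /^[a-z0-9]+(-[a-z0-9]+)*$/.test(name)
  );
}

console.log(resolveFileSystemName("analytics", "analytics-data")); // analytics-data
console.log(resolveFileSystemName("analytics"));                   // analytics
console.log(isValidFileSystemName("Analytics_Data"));              // false
```

Because the resource name doubles as the file system name when the second argument is omitted, a resource name that breaks the Azure naming rules (for example, one containing uppercase letters) is a reason to pass dataLakeFileSystemName explicitly.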
Customize provisioning infrastructure
The Data Lake resource is part of the Azure Storage resource, which is a subclass of AzureProvisioningResource. You can customize the generated Bicep using the ConfigureInfrastructure (or configureInfrastructure) API on the storage resource. For example, the following code sets the storage SKU and adds a tag to the storage account:

C#:

using Azure.Provisioning.Storage;

var builder = DistributedApplication.CreateBuilder(args);

var storage = builder.AddAzureStorage("storage")
    .ConfigureInfrastructure(infra =>
    {
        // Locate the single storage account in the generated infrastructure.
        var storageAccount = infra.GetProvisionableResources()
            .OfType<StorageAccount>()
            .Single();

        storageAccount.Sku = new StorageSku { Name = StorageSkuName.PremiumLrs };
        storageAccount.Tags.Add("workload", "analytics");
    });

var dataLake = storage.AddDataLake("datalake");

builder.AddProject<Projects.ExampleProject>()
    .WithReference(dataLake);

// After adding all resources, run the app...
builder.Build().Run();

TypeScript:

import { createBuilder } from './.aspire/modules/aspire.mjs';

const builder = await createBuilder();

const storage = await builder.addAzureStorage("storage")
    .configureInfrastructure(infra => {
        const storageAccount = infra.getProvisionableResources()
            .filter(r => r.type === "StorageAccount")[0];

        storageAccount.sku = { name: "Premium_LRS" };
        storageAccount.tags["workload"] = "analytics";
    });

const dataLake = await storage.addDataLake("datalake");

await builder.addProject("api", "../ExampleProject/ExampleProject.csproj")
    .withReference(dataLake);

// After adding all resources, run the app...
await builder.build().run();

For more information on customizing Azure Storage provisioning, see Azure Blob Storage: Customize provisioning infrastructure.
Client integration
To get started with the Aspire Azure Data Lake Storage client integration, install the 📦 Aspire.Azure.Storage.Files.DataLake NuGet package in your client-consuming project:

dotnet add package Aspire.Azure.Storage.Files.DataLake

For a file-based app, add a package directive:

#:package Aspire.Azure.Storage.Files.DataLake@*

For a project file, add a package reference:

<PackageReference Include="Aspire.Azure.Storage.Files.DataLake" Version="*" />

Add Azure Data Lake service client
In the Program.cs file of your client-consuming project, call AddAzureDataLakeServiceClient to register a DataLakeServiceClient for dependency injection:

builder.AddAzureDataLakeServiceClient("datalake");

You can then retrieve the DataLakeServiceClient instance using dependency injection:

public class ExampleService(DataLakeServiceClient client)
{
    // Use client...
}

Add Azure Data Lake file system client
You can also register a DataLakeFileSystemClient for accessing a specific file system:

builder.AddAzureDataLakeFileSystemClient("analytics");

You can then retrieve the DataLakeFileSystemClient instance using dependency injection:

public class ExampleService(DataLakeFileSystemClient client)
{
    // Use client...
}

Keyed services
Both client methods have keyed variants for registering multiple clients:

builder.AddKeyedAzureDataLakeServiceClient("datalake1");
builder.AddKeyedAzureDataLakeServiceClient("datalake2");

builder.AddKeyedAzureDataLakeFileSystemClient("analytics");
builder.AddKeyedAzureDataLakeFileSystemClient("archive");

Configuration
The Azure Data Lake Storage client integration supports multiple configuration approaches.
Use a connection string
Provide the connection name when calling AddAzureDataLakeServiceClient:

builder.AddAzureDataLakeServiceClient("datalake");

The connection string is retrieved from the ConnectionStrings section. Two formats are supported:

Service URI (recommended):

{
  "ConnectionStrings": {
    "datalake": "https://{account_name}.dfs.core.windows.net/"
  }
}

When using a service URI, a default credential is used for authentication.

For file system clients, include the file system name:

{
  "ConnectionStrings": {
    "analytics": "https://{account_name}.dfs.core.windows.net/;FileSystemName=analytics-data"
  }
}

Azure Storage connection string:

{
  "ConnectionStrings": {
    "datalake": "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey;EndpointSuffix=core.windows.net"
  }
}

Use configuration providers
The integration loads settings from the Aspire:Azure:Storage:Files:DataLake configuration section:

{
  "Aspire": {
    "Azure": {
      "Storage": {
        "Files": {
          "DataLake": {
            "ServiceUri": "https://{account_name}.dfs.core.windows.net/",
            "DisableHealthChecks": false,
            "DisableTracing": false
          }
        }
      }
    }
  }
}

Use inline delegates
Configure settings programmatically:

builder.AddAzureDataLakeServiceClient(
    "datalake",
    settings => settings.DisableHealthChecks = true);

Configure client options:

builder.AddAzureDataLakeServiceClient(
    "datalake",
    configureClientBuilder: clientBuilder =>
        clientBuilder.ConfigureOptions(
            options => options.Diagnostics.ApplicationId = "myapp"));

Client integration health checks
By default, the integration adds a health check that verifies connectivity to Azure Data Lake Storage. The health check:

- Is enabled when DisableHealthChecks is false (the default).
- Integrates with the /health HTTP endpoint.
Observability and telemetry
Logging
The integration uses these log categories:

- Azure.Core
- Azure.Identity
Tracing
The integration emits OpenTelemetry tracing activities:

- Azure.Storage.Files.DataLake.DataLakeServiceClient
- Azure.Storage.Files.DataLake.DataLakeFileSystemClient
Metrics
The Azure SDK for Data Lake Storage doesn’t currently emit metrics.