Implementing Zero Trust Network Security in Azure at Scale

Arun Malik · June 2026 · 15 min read
Tags: Zero Trust, Azure Networking, Network Security Perimeter, Private Link, AVNM, Service Tags, Defense in Depth

Traditional perimeter-based security models assume that everything inside the corporate network boundary is trustworthy. That assumption has failed repeatedly in practice. Zero Trust inverts the model: trust nothing, verify everything, assume breach. This post describes how I implement Zero Trust network security across Azure infrastructure at enterprise scale, covering the full stack from Network Security Perimeter (NSP) for PaaS isolation through Private Link for data-plane protection, AVNM for centralized policy enforcement, and IP Address Management with Service Tags for precise traffic control. Each layer addresses a specific class of threats while maintaining operational simplicity at scale.

The Problem with Perimeter Security

For decades, network security followed a castle-and-moat architecture: a hardened perimeter (firewalls, DMZs, VPN gateways) protecting a trusted interior. If you were inside the perimeter, you had broad access. If you were outside, you did not. This model breaks down for three reasons that are now obvious but were once controversial:

First, cloud workloads have no meaningful perimeter. An Azure Storage account is accessible from any network by default. A SQL Database exposes a public endpoint. The interior/exterior distinction vanishes when your infrastructure spans dozens of regions and hundreds of subscriptions.

Second, lateral movement is the primary attack vector in modern breaches. Once an attacker compromises a single workload (phishing, supply chain, misconfiguration), they move laterally through the flat internal network. A perimeter-only model offers no resistance to this movement.

Third, the blast radius of a compromised identity is unbounded in a perimeter model. A single set of leaked credentials can access every resource the network can reach.

Zero Trust replaces implicit trust with explicit verification at every layer: identity, device, network, application, and data. In network security specifically, this means every flow must be explicitly authorized, every resource must be isolated by default, and every access decision must be logged and auditable.

[Figure 1 diagram. Layers shown: Network Security Perimeter (NSP); Azure Virtual Network Manager (AVNM); NSG + Service Tags. Protected resources: VM / Container (private subnet); Storage / KV (Private Endpoint); SQL / Cosmos (Private Endpoint). Azure Policy: Deny public access | Enforce encryption | Require NSG]
Figure 1: Defense-in-depth layers for Zero Trust Azure networking. Each layer independently enforces access control, creating multiple barriers an attacker must overcome.

Layer 1: Network Security Perimeter (NSP)

Network Security Perimeter is Azure's newest network security primitive. It creates a logical boundary around PaaS resources that are deployed outside your virtual networks. The core insight behind NSP: Private Link secures the data plane (how you connect to resources), but NSP secures the control plane (who can access resources and from where).

How NSP Works

When you create a Network Security Perimeter and associate PaaS resources with it in enforced mode, all public network access is denied by default. Resources within the same perimeter can communicate freely with each other (intra-perimeter traffic), but any traffic crossing the perimeter boundary requires an explicit access rule.

This is the critical difference from traditional NSG-based security: NSP operates at the PaaS resource level, not the virtual network level. You can secure an Azure Storage account, Key Vault, or Azure AI Search instance without deploying it into a VNet or configuring Private Endpoints for every consumer.

Key Capability: Preventing Data Exfiltration

NSP prevents a common attack pattern: a compromised workload writing sensitive data to an attacker-controlled storage account. Because the attacker's storage account is outside the perimeter, the write fails. Traditional NSGs cannot prevent this because they operate at the IP/port level, not the resource identity level.

NSP Components

Component | Purpose | Scope
Perimeter | Top-level logical boundary defining the trusted resource group | Subscription / Resource Group
Profile | Collection of access rules applied to associated resources | Per-perimeter
Access Rule | Inbound/outbound rules allowing traffic across the boundary | Per-profile
Resource Association | Binding a PaaS resource to a perimeter (with access mode) | Per-resource
Diagnostic Settings | Access logs and metrics for audit and compliance | Per-perimeter

Access Modes: Transition vs. Enforced

NSP supports two access modes. In Transition mode (formerly called Learning mode), the perimeter logs all traffic that would be denied in enforced mode without actually blocking it. This gives you visibility into existing access patterns before cutting off traffic. In Enforced mode, all traffic except intra-perimeter and explicitly allowed flows is denied.

The recommended deployment pattern: associate resources in Transition mode, analyze access logs for 2-4 weeks, create access rules for legitimate traffic, then switch to Enforced mode. This avoids the outage risk of blocking traffic you did not know existed.
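To make the difference between the two modes concrete, here is a minimal sketch of the decision logic in Python. It is a toy model, not the NSP API: the `Perimeter` class, the `evaluate` function, and the resource names are all hypothetical, and it ignores the resource's own firewall settings, which also play a role in practice.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessMode(Enum):
    TRANSITION = "transition"  # record would-be denials, do not block
    ENFORCED = "enforced"      # block anything not explicitly allowed


@dataclass
class Perimeter:
    members: set[str]  # resources associated with the perimeter
    allowed_external: set[str] = field(default_factory=set)  # explicit access rules
    log: list[str] = field(default_factory=list)


def evaluate(perimeter: Perimeter, source: str, target: str, mode: AccessMode) -> bool:
    """Return True if the flow is permitted under this simplified model."""
    intra = source in perimeter.members and target in perimeter.members
    # The external party is whichever end of the flow sits outside the perimeter.
    external = source if source not in perimeter.members else target
    if intra or external in perimeter.allowed_external:
        return True
    perimeter.log.append(f"{mode.value}: {source} -> {target} is not covered by a rule")
    # Transition mode only records the violation; enforced mode blocks it.
    return mode is AccessMode.TRANSITION


perimeter = Perimeter(members={"storage-a", "kv-a"}, allowed_external={"build-agent"})
print(evaluate(perimeter, "unknown-client", "kv-a", AccessMode.TRANSITION))
print(evaluate(perimeter, "unknown-client", "kv-a", AccessMode.ENFORCED))
print(perimeter.log)
```

The log entries collected in Transition mode are what you would turn into access rules before switching to Enforced mode.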

Operational Note

NSP is now generally available in all Azure public cloud regions and Azure Government regions. Private Endpoint traffic is allowed without explicit access rules when both the endpoint and the resource are within the same perimeter.

Layer 2: Private Endpoints and Private Link

If NSP secures the control plane, Private Link secures the data plane. A Private Endpoint creates a network interface inside your VNet with a private IP address that maps to a specific PaaS resource. Traffic between your workload and the resource traverses the Microsoft backbone network; it never touches the public internet.

Why Private Endpoints Matter for Zero Trust

Consider a standard Azure Storage account. Without Private Endpoints, your application connects to mystorageaccount.blob.core.windows.net which resolves to a public IP. Even with firewall rules restricting access to your VNet, the traffic still transits a public endpoint. DNS poisoning, BGP hijacking, or a misconfigured firewall rule could expose this path.

With a Private Endpoint, the same FQDN resolves to 10.0.1.5 (a private IP in your VNet). The connection is entirely private. There is no public attack surface. If you then disable public access on the storage account, the resource becomes invisible to the internet entirely.
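One way to confirm the DNS behaviour described above is to resolve the FQDN from inside the VNet and check that every answer is a private address. The following is a rough sketch using only the Python standard library; the function names are my own, and it checks IPv4 answers only.

```python
import ipaddress
import socket


def resolved_addresses(fqdn: str, port: int = 443) -> set[str]:
    """Resolve an FQDN with the local resolver and return the distinct IPv4 addresses."""
    infos = socket.getaddrinfo(fqdn, port, family=socket.AF_INET, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def all_private(addresses: set[str]) -> bool:
    """True only if at least one address was returned and none of them is public."""
    return bool(addresses) and all(ipaddress.ip_address(a).is_private for a in addresses)


# Example with literal addresses, so it runs without network access:
print(all_private({"10.0.1.5"}))             # private endpoint IP from the text
print(all_private({"10.0.1.5", "8.8.8.8"}))  # one public answer fails the check
```

Run from a workload in the VNet, `all_private(resolved_addresses("mystorageaccount.blob.core.windows.net"))` should hold once the Private Endpoint and its private DNS zone are in place; run from outside, it should not.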

Private Endpoint vs. Service Endpoint

Capability | Private Endpoint | Service Endpoint
Traffic path | Private IP in your VNet, over backbone | Still uses public IP of the service
DNS resolution | Private IP (requires DNS zone) | Public IP (unchanged)
On-premises access | Works over ExpressRoute / VPN | VNet traffic only
Data exfiltration protection | Maps to a specific resource instance | Allows access to any instance of the service type
Cost | Per-endpoint hourly + data processing | Free
Recommendation | Use for production workloads | Legacy; use for cost-sensitive dev/test

Best Practice

Deploy Private Endpoints for all PaaS resources in production. Combine with Azure Policy to deny creation of resources without Private Endpoints (Microsoft.Network/privateEndpoints audit/deny policy). Disable public access on the resource after Private Endpoint connectivity is confirmed.

Layer 3: Network Security Groups and Service Tags

Network Security Groups (NSGs) are the foundational traffic filter in Azure. They operate at Layer 4 (TCP/UDP), allowing you to define allow/deny rules based on source, destination, port, and protocol. In a Zero Trust model, NSGs enforce the principle of least privilege at the network layer: deny all traffic by default, then explicitly allow only the flows your application requires.
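The evaluation model behind this is first match by priority, where a lower number wins. Below is a minimal sketch of that logic; `Rule` and `evaluate_nsg` are illustrative names, and the model matches on destination port only. Real NSGs also carry built-in default rules at priority 65000 and above that allow intra-VNet traffic, which is why an explicit low-priority deny-all rule is usually needed to get a true default deny.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int         # lower number is evaluated first
    port: Optional[int]   # None matches any port
    access: str           # "Allow" or "Deny"


def evaluate_nsg(rules: list[Rule], port: int) -> str:
    """Return the access decision of the first rule (by priority) that matches the port."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port is None or rule.port == port:
            return rule.access
    return "Deny"  # this toy model denies when nothing matches


rules = [
    Rule(priority=4096, port=None, access="Deny"),  # explicit deny-all, evaluated last
    Rule(priority=200, port=443, access="Allow"),   # the one flow the app needs
]
print(evaluate_nsg(rules, 443))
print(evaluate_nsg(rules, 22))
```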

Service Tags: Managed IP Prefix Groups

Service Tags replace hardcoded IP addresses in NSG rules with dynamically managed IP prefix groups. Microsoft maintains these groups and updates them automatically as Azure service IP ranges change. This eliminates the operational burden of tracking and rotating IP addresses manually.

Consider a common scenario: your application needs to call Azure Key Vault. Without Service Tags, you would need to look up the current IP ranges for Key Vault in your region, create NSG rules with those IPs, and update them whenever Microsoft changes the ranges. With Service Tags, you write one rule: allow outbound to AzureKeyVault.WestUS2. Done.

Critical Service Tags for Zero Trust

Service Tag | Scope | Use Case
VirtualNetwork | VNet address space + peered VNets | Intra-VNet communication
AzureLoadBalancer | Azure health probes | Required for LB health checks
Internet | Public IP space outside the virtual network, including Azure-owned public IPs | Deny inbound from Internet (default deny)
Storage.<Region> | Azure Storage IPs in a region | Allow access to regional storage
Sql.<Region> | Azure SQL IPs in a region | Database connectivity
AzureMonitor | Log Analytics, App Insights endpoints | Telemetry egress

Service Tags Are Not Sufficient Alone

Service Tags simplify IP-based ACLs but are not a complete security solution. A Service Tag for Storage includes all Azure Storage accounts, including attacker-controlled ones. Combine Service Tags with Private Endpoints (which pin to a specific resource instance) and NSP (which prevents cross-boundary access) for proper data exfiltration protection.
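The limitation is easy to see if you treat a Service Tag as what it is: a named list of IP prefixes. In the sketch below the prefix is a documentation range chosen purely for illustration (real ranges are published by Microsoft and change over time), and `tag_allows` is a hypothetical helper.

```python
import ipaddress

# Hypothetical prefix for illustration only; not a real Azure Storage range.
SERVICE_TAGS = {"Storage.WestUS2": ["203.0.113.0/24"]}


def tag_allows(tag: str, destination_ip: str) -> bool:
    """True if the destination IP falls inside any prefix of the given tag."""
    ip = ipaddress.ip_address(destination_ip)
    return any(ip in ipaddress.ip_network(prefix) for prefix in SERVICE_TAGS[tag])


our_account_ip = "203.0.113.10"       # front end serving our storage account
attacker_account_ip = "203.0.113.77"  # front end serving someone else's account

print(tag_allows("Storage.WestUS2", our_account_ip))
print(tag_allows("Storage.WestUS2", attacker_account_ip))
```

Both destinations pass, because an IP-based rule has no notion of which account sits behind the address. That is the gap Private Endpoints and NSP close.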

IPAM and Custom Service Tags

IP address management (IPAM), a feature of Azure Virtual Network Manager, provides centralized IP address planning, allocation, and tracking across your Azure environment. For organizations operating at scale, IPAM solves the problem of IP address sprawl: overlapping address spaces, exhausted subnets, and inconsistent allocation across teams.
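Overlap detection, the first of those problems, is simple to illustrate. This is a minimal sketch with made-up allocation names, using the standard library rather than any Azure API:

```python
import ipaddress
from itertools import combinations


def find_overlaps(allocations: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of allocation names whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in allocations.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


allocations = {
    "hub": "10.0.0.0/16",
    "payments": "10.50.0.0/16",
    "team-x": "10.50.4.0/24",  # carved out of the payments range by mistake
}
print(find_overlaps(allocations))
```

A central allocator exists to reject the third entry at request time instead of discovering it when peering fails.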

In the context of Zero Trust, IPAM enables you to create custom service tags based on your own IP address pools. This allows NSG rules that reference your internal service boundaries rather than relying solely on Microsoft-managed tags. For example, you can create a service tag for your "payment processing" subnet range and reference it across all NSGs in your environment, ensuring that only explicitly designated networks can reach payment infrastructure.

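# Reserve the payment range as a named pool in AVNM IPAM.
# Assumes the virtual-network-manager CLI extension is installed and that a
# network manager named avnm-platform (hypothetical) exists in rg-networking.
# Check the flags against `az network manager ipam-pool create --help`.
az network manager ipam-pool create \
  --name PaymentServices \
  --network-manager-name avnm-platform \
  --resource-group rg-networking \
  --address-prefixes "['10.50.0.0/16']"

# Reference the same range in NSG rules across your environment
az network nsg rule create \
  --resource-group rg-networking \
  --nsg-name nsg-web-tier \
  --name AllowToPayment \
  --priority 200 \
  --direction Outbound \
  --access Allow \
  --protocol Tcp \
  --source-address-prefixes VirtualNetwork \
  --destination-address-prefixes 10.50.0.0/16 \
  --destination-port-ranges 443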

Layer 4: Azure Virtual Network Manager (AVNM)

Individual NSG rules work at the subnet or NIC level. They are managed by the team that owns the resource. This creates a fundamental governance problem: a team can modify or delete their NSG rules, bypassing security policies set by the platform team. Azure Virtual Network Manager solves this by introducing Security Admin Rules that operate at a higher precedence than NSG rules and cannot be overridden by resource owners.

Security Admin Rules vs. NSG Rules

Property | Security Admin Rules (AVNM) | NSG Rules
Evaluation order | Evaluated first (higher priority) | Evaluated after admin rules
Override capability | Cannot be overridden by resource owners | Can be modified by anyone with NSG write permissions
Scope | Management group, subscription, or network group | Subnet or NIC
Actions | Allow, Deny, Always Allow | Allow, Deny
Use case | Platform-level guardrails | Application-level access control

[Figure 2 diagram: Inbound Traffic (any source) → AVNM Admin Rules (1st, cannot override) → NSG Rules (2nd, team-managed) → Application Workload. Platform teams set non-overridable guardrails; application teams add granular rules.]
Figure 2: Traffic evaluation order. AVNM Security Admin Rules evaluate before NSGs, providing non-overridable platform guardrails.

Common AVNM Patterns

Block high-risk ports globally: Create an admin rule denying inbound SSH (22) and RDP (3389) from the Internet across all network groups. Because the admin rule is evaluated first and stops denied traffic before any NSG is consulted, an NSG rule that allows these ports has no effect, even if the team has full NSG write permissions.

Enforce network segmentation: Block traffic between production and development network groups. This prevents accidental cross-environment communication even if VNet peering is misconfigured.

Always Allow for exceptions: The "Always Allow" action permits traffic regardless of subsequent deny rules. Use this sparingly for infrastructure services that must remain reachable (Azure Monitor, Key Vault) even when other deny rules are in place.
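The interaction between the three admin actions and the NSG can be summarized in a few lines. This is a simplified sketch of the evaluation order, with a hypothetical `effective_access` function standing in for the platform:

```python
from typing import Optional


def effective_access(admin_action: Optional[str], nsg_action: str) -> str:
    """Combine a matching security admin rule with the NSG's own decision.

    admin_action: "Deny", "Allow", "AlwaysAllow", or None when no admin rule matches.
    nsg_action:   what the NSG alone would decide ("Allow" or "Deny").
    """
    if admin_action == "Deny":
        return "Deny"    # traffic stops here; the NSG is never consulted
    if admin_action == "AlwaysAllow":
        return "Allow"   # delivered regardless of what the NSG says
    return nsg_action    # "Allow" or no match: the NSG makes the final call


# A team opens SSH in its NSG, but the platform denies it in an admin rule.
print(effective_access("Deny", "Allow"))
# Monitoring traffic stays reachable even though the team's NSG denies it.
print(effective_access("AlwaysAllow", "Deny"))
# A plain admin "Allow" still leaves the team free to deny.
print(effective_access("Allow", "Deny"))
```

The third case is the one people tend to miss: Allow hands the decision to the NSG, while Always Allow takes it away.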

Layer 5: Azure Policy for Continuous Enforcement

The previous layers define the security controls. Azure Policy ensures those controls remain in place. Without policy enforcement, security configurations degrade over time: someone disables a firewall rule for debugging, a new resource deploys without a Private Endpoint, an NSG gets deleted during a migration.

Essential Zero Trust Policies

Policy | Effect | What It Prevents
Storage accounts should disable public access | Deny | Creating storage with public endpoints
SQL servers should use Private Link | Audit / Deny | Databases accessible from the internet
Subnets should have an NSG | DeployIfNotExists | Subnets without traffic filtering
Network interfaces should not have public IPs | Deny | VMs with direct internet exposure
VNet peering should only connect approved VNets | Deny | Unauthorized cross-boundary connectivity
Key Vault should disable public access | Deny | Secrets accessible from untrusted networks

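The DeployIfNotExists effect is particularly powerful for Zero Trust: rather than just blocking non-compliant resources, it remediates them. If a subnet is created without an NSG, the policy creates and attaches a default-deny NSG automatically, using the managed identity on the policy assignment. Resources that existed before the assignment are fixed only when you run a remediation task.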
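For readers who have not written a policy definition, the shape of a Deny rule is roughly the following, shown here as a Python dictionary with a toy evaluator. The field alias is my assumption and should be checked against the current alias list, and the evaluator supports only the three operators used here; it is not how Azure Policy is implemented.

```python
from typing import Any, Optional

# Sketch of a policy rule: deny storage accounts whose public network access is not disabled.
POLICY_RULE = {
    "if": {
        "allOf": [
            {"field": "type", "equals": "Microsoft.Storage/storageAccounts"},
            {
                "field": "Microsoft.Storage/storageAccounts/publicNetworkAccess",  # assumed alias
                "notEquals": "Disabled",
            },
        ]
    },
    "then": {"effect": "deny"},
}


def condition_holds(cond: dict[str, Any], resource: dict[str, Any]) -> bool:
    """Evaluate a condition against a flat resource dictionary (toy semantics)."""
    if "allOf" in cond:
        return all(condition_holds(c, resource) for c in cond["allOf"])
    value = resource.get(cond["field"])
    if "equals" in cond:
        return value == cond["equals"]
    if "notEquals" in cond:
        return value != cond["notEquals"]
    raise ValueError(f"unsupported condition: {cond}")


def effect_for(resource: dict[str, Any]) -> Optional[str]:
    """Return the policy effect if the rule matches, otherwise None."""
    return POLICY_RULE["then"]["effect"] if condition_holds(POLICY_RULE["if"], resource) else None


print(effect_for({
    "type": "Microsoft.Storage/storageAccounts",
    "Microsoft.Storage/storageAccounts/publicNetworkAccess": "Enabled",
}))
```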

Layer 6: External Scanning and Continuous Validation

A Zero Trust posture is only as strong as your ability to verify it. External scanning validates that your controls work as intended by testing your infrastructure from an attacker's perspective. Internal configuration audits can miss gaps that external probing reveals.

What to Scan

Integrate external scanning into your CI/CD pipeline. Before a deployment promotes to production, validate that no new public endpoints were introduced. After deployment, run a post-deployment scan to confirm the deployed state matches the expected state.
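A post-deployment check can be as small as a TCP connect test run from outside your network against endpoints that are supposed to be closed. The sketch below is a minimal version of that idea; the host names are placeholders, and a real scanner would cover far more than reachability.

```python
import socket
from typing import Callable, Iterable


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def exposed_endpoints(
    should_be_closed: Iterable[tuple[str, int]],
    probe: Callable[[str, int], bool] = is_reachable,
) -> list[tuple[str, int]]:
    """Return every endpoint that answered even though it is expected to be closed."""
    return [(host, port) for host, port in should_be_closed if probe(host, port)]


# Placeholder targets; in a pipeline, fail the job if the returned list is not empty.
targets = [("vm-web-01.example.com", 22), ("vm-web-01.example.com", 3389)]
# exposed = exposed_endpoints(targets)
```

The `probe` parameter exists so the gate logic can be unit-tested without touching the network.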

Layer 7: Shift Left and Infrastructure as Code

Security controls deployed reactively are always behind the threat. Shift Left means security validation happens at the earliest possible stage: during code review, in CI/CD pipelines, and before infrastructure changes reach production.

IaC Security Validation Pipeline

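# Example: GitHub Actions steps for Terraform security scanning
- name: Run tfsec
  uses: aquasecurity/tfsec-action@v1.0.0
  with:
    soft_fail: false   # fail the job on any finding

# Azure Policy compliance check before deployment
# (assumes an earlier step has logged in to Azure and that RG is set in the environment)
- name: Check Policy Compliance
  run: |
    az policy state trigger-scan --resource-group "$RG"
    az policy state list --resource-group "$RG" \
      --filter "complianceState eq 'NonCompliant'" \
      --query "[].{policy:policyDefinitionName, resource:resourceId}" \
      --output table
    # Fail the step if anything is non-compliant, so the check acts as a gate
    count=$(az policy state list --resource-group "$RG" \
      --filter "complianceState eq 'NonCompliant'" \
      --query "length(@)" --output tsv)
    [ "$count" -eq 0 ]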
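As a complement to the pipeline steps above, a custom gate can inspect the Terraform plan directly. The sketch below assumes the plan has been exported with `terraform show -json` and that storage accounts are managed through the azurerm provider; the function name is my own.

```python
import json


def public_storage_accounts(plan: dict) -> list[str]:
    """Addresses of storage accounts the plan would leave with public network access on."""
    flagged = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "azurerm_storage_account":
            continue
        after = change.get("change", {}).get("after")
        if after is None:
            continue  # the resource is being destroyed, nothing to check
        # The azurerm provider treats this attribute as true when it is not set.
        if after.get("public_network_access_enabled", True):
            flagged.append(change["address"])
    return flagged


plan = json.loads("""
{
  "resource_changes": [
    {"address": "azurerm_storage_account.logs", "type": "azurerm_storage_account",
     "change": {"after": {"public_network_access_enabled": true}}},
    {"address": "azurerm_storage_account.data", "type": "azurerm_storage_account",
     "change": {"after": {"public_network_access_enabled": false}}}
  ]
}
""")
print(public_storage_accounts(plan))
```

In a pipeline, a non-empty result would fail the job before `terraform apply` runs.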

Security gates in the pipeline should check for:

Layer 8: Attestation and Ownership Reviews

Technical controls drift over time. Teams change, projects end, resources become orphaned, and permissions accumulate beyond what is needed. Regular attestation reviews verify that the security posture you built in layers 1-7 still reflects current reality.

What to Attest

Automate as much of this as possible. Azure Resource Graph queries can identify orphaned resources, unused permissions, and stale configurations. Build dashboards that surface compliance drift before it becomes a security incident.
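As one illustration of what such automation might look like, the sketch below flags resources that have no owner or whose last attestation is too old. The `owner` and `last-attested` tag names and the 90-day window are assumptions made for the example, and the input is a plain list of dictionaries such as you might build from a Resource Graph export.

```python
from datetime import date, timedelta


def needs_attestation(resources: list[dict], today: date, max_age_days: int = 90) -> list[str]:
    """Names of resources with no owner tag or with a missing or stale attestation date."""
    cutoff = today - timedelta(days=max_age_days)
    flagged = []
    for resource in resources:
        tags = resource.get("tags", {})
        attested = tags.get("last-attested")  # ISO date string, e.g. "2026-01-15"
        if not tags.get("owner") or attested is None or date.fromisoformat(attested) < cutoff:
            flagged.append(resource["name"])
    return flagged


inventory = [
    {"name": "nsg-web-tier", "tags": {"owner": "team-web", "last-attested": "2026-05-20"}},
    {"name": "pe-legacy-sql", "tags": {"owner": "team-data", "last-attested": "2025-11-02"}},
    {"name": "kv-orphan", "tags": {}},
]
print(needs_attestation(inventory, today=date(2026, 6, 1)))
```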

Putting It All Together

Zero Trust network security in Azure is not a single product or a one-time configuration. It is a layered architecture where each layer addresses a specific threat vector, and the layers reinforce each other:

Layer 8: Attestation & Ownership Reviews (continuous governance)
Layer 7: Shift Left / IaC Security (prevent before deploy)
Layer 6: External Scanning (validate from attacker's view)
Layer 5: Azure Policy (continuous enforcement)
Layer 4: AVNM Security Admin Rules (non-overridable guardrails)
Layer 3: NSGs + Service Tags (network-level ACLs)
Layer 2: Private Endpoints (data-plane isolation)
Layer 1: Network Security Perimeter (PaaS boundary control)
Figure 3: The eight layers of Zero Trust network security. Each layer operates independently and reinforces the others.

[Figure 4 diagram: "Defense-in-Depth: Progressive Security Layers". Ring labels: Attestation, Shift-Left IaC, AVNM, NSP, Private Endpoints, Vuln Scanning, Azure Policy, NSGs. Phases: Phase 1: Foundation, Phase 2: Enhancement, Phase 3: Full ZT. Start at the center and work outward.]
Figure 4: Defense-in-depth as concentric rings. Implementation begins at the core (NSGs) and progressively expands outward through each security layer.

The Critical Insight

No single layer is sufficient. An attacker who bypasses your NSG (layer 3) still faces Private Endpoint isolation (layer 2) and NSP boundaries (layer 1). A misconfigured AVNM rule (layer 4) is caught by Azure Policy (layer 5) and validated by external scanning (layer 6). The layers create redundancy through diversity.

Implementation Roadmap

The following staircase shows how to sequence your Zero Trust adoption. Each phase builds on the previous one, and the layers within each phase can be deployed in parallel.

Phase 1: Foundation
  ✓ NSGs (Network Security Groups)
  ✓ Azure Policy (basic)
  ✓ Vulnerability Scanning
Phase 2: Enhancement
  ✓ Private Endpoints
  ✓ NSP (Network Security Perimeter)
  ✓ AVNM (Virtual Network Manager)
Phase 3: Full Zero Trust
  ✓ Shift-Left IaC Security
  ✓ Runtime Attestation
  ✓ All layers at maximum
(Axis label: Maturity Progression)
Figure 5: Zero Trust implementation roadmap. Start with foundational controls (NSGs, Policy, Scanning), then layer on network isolation (PE, NSP, AVNM), and finally achieve full maturity with shift-left and attestation.

Operational Recommendations

  1. Start with visibility. Deploy NSP in Transition mode and Azure Policy in Audit mode before enforcing. Understand your current traffic patterns before restricting them.
  2. Automate everything. Manual security processes do not scale and they drift. Every control should be expressed as code, deployed through pipelines, and validated continuously.
  3. Measure compliance, not just configuration. A deployed NSG is not security. A validated, tested NSG that blocks the traffic it should block is security. Test your controls from the attacker's perspective.
  4. Plan for failure. Every security control will eventually be misconfigured, bypassed, or degraded. The layered approach ensures no single point of failure compromises your entire posture.
  5. Iterate continuously. Zero Trust is not a destination. New services, new attack vectors, and new Azure features require ongoing adaptation. Build review cycles into your operational rhythm.

References

  1. Microsoft. "What is a network security perimeter?" Azure Documentation, 2026.
  2. Microsoft. "What is Azure Private Link?" Azure Documentation, 2025.
  3. Microsoft. "Azure service tags overview." Azure Documentation, 2025.
  4. Microsoft. "Security admin rules in Azure Virtual Network Manager." Azure Documentation, 2026.
  5. Microsoft. "Azure Policy overview." Azure Documentation, 2025.
  6. Microsoft. "Zero Trust deployment guide for Azure networking." Microsoft Security, 2025.
  7. Rose, S., Borchert, O., Mitchell, S., Connelly, S. "Zero Trust Architecture." NIST Special Publication 800-207, 2020.
  8. Kindervag, J. "Build Security Into Your Network's DNA: The Zero Trust Network Architecture." Forrester Research, 2010.