GCP Shared VPC and GKE Cluster Setup Guide
Configure IAM permissions for GKE clusters using Shared VPC architecture

Overview
Google Cloud Platform’s Shared VPC is a powerful feature that enables centralized network management while maintaining project-level resource isolation.
This comprehensive guide walks through the process of setting up GKE clusters across different service projects using Shared VPC architecture, with detailed IAM configuration steps.
Shared VPC allows organizations to centrally manage network resources while enabling independent resource operations across service projects.
This approach ensures consistent security policies, systematic network separation across different environments (dev, staging, production), and efficient collaboration across teams.
Why Shared VPC Matters
Shared VPC is essential for enterprise-grade GCP deployments where network security, cost optimization, and operational efficiency are critical.
It provides a foundation for multi-environment architectures while maintaining security boundaries and enabling centralized network governance.
For organizations running multiple projects and environments, Shared VPC eliminates the complexity of VPC peering while providing superior network control and security posture management.
What is Shared VPC?
Shared VPC allows multiple projects to connect their resources to a common VPC network, enabling secure and efficient communication using internal IP addresses.
This architecture separates network administration from project administration, providing several key benefits:
Key Benefits of Shared VPC
| Benefit | Description | Business Impact |
|---|---|---|
| Centralized Network Management | Single point of control for network policies, firewall rules, and routing | Reduced operational overhead and consistent security policies |
| Project Isolation | Resources separated by project boundaries while sharing network | Clear cost allocation and security boundaries |
| Simplified Communication | Internal IP communication without VPC peering complexity | Lower latency and reduced network management complexity |
| Environment Separation | Logical separation of dev, staging, and production environments | Improved security and compliance posture |
| Cost Optimization | Shared network resources and reduced data transfer costs | Lower overall infrastructure costs |
Shared VPC Architecture Patterns
Single Host Project Architecture
The most common Shared VPC pattern uses one host project providing network services to multiple service projects:
Multiple Host Projects Architecture
For organizations requiring environment isolation at the network level:
Prerequisites and Project Setup
Example Project Structure
For this guide, we’ll use the following project structure:
| Project Type | Project ID | Purpose | Resources |
|---|---|---|---|
| Host Project | somaz-hp | Network management and shared resources | VPC, Subnets, Firewall Rules, Artifact Registry |
| Service Project | somaz-sp-dev | Development environment | GKE Cluster, Development workloads |
| Service Project | somaz-sp-prod | Production environment | GKE Cluster, Production workloads |
Required APIs and Service Accounts
Before starting the IAM configuration, ensure the following APIs are enabled and understand the service accounts involved:
APIs to Enable:

```bash
# Enable required APIs in all projects
gcloud services enable container.googleapis.com
gcloud services enable compute.googleapis.com
gcloud services enable artifactregistry.googleapis.com
gcloud services enable cloudresourcemanager.googleapis.com

# Verify enabled APIs
gcloud services list --enabled --filter="name:container.googleapis.com OR name:compute.googleapis.com"
```
Key Service Accounts:
| Service Account | Format | Purpose |
|---|---|---|
| Google APIs Service Account | `<project-number>@cloudservices.gserviceaccount.com` | Default service account for Google Cloud services |
| GKE Service Agent | `service-<project-number>@container-engine-robot.iam.gserviceaccount.com` | GKE cluster operations and management |
| Terraform GKE Service Account | `tf-gke-<cluster-name>-<random>@<project>.iam.gserviceaccount.com` | Terraform-managed GKE cluster service account |
IAM Configuration Step by Step
This section provides a comprehensive walkthrough of IAM permissions required for GKE clusters in Shared VPC environments.
Step 1: Set Environment Variables
First, establish environment variables for the service accounts and project information:
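A minimal sketch using the example project IDs from this guide; the project numbers below are placeholders, so substitute your own (look them up with `gcloud projects describe <PROJECT_ID> --format="value(projectNumber)"`):

```bash
# Project IDs from the example structure above
export HOST_PROJECT="somaz-hp"
export SERVICE_PROJECT="somaz-sp-dev"

# Project numbers (placeholders -- replace with your actual values)
export HOST_PROJECT_NUMBER="111111111111"
export SERVICE_PROJECT_NUMBER="222222222222"

# Derived service account addresses (formats from the table above)
export GKE_SERVICE_AGENT="service-${SERVICE_PROJECT_NUMBER}@container-engine-robot.iam.gserviceaccount.com"
export GOOGLE_APIS_SA="${SERVICE_PROJECT_NUMBER}@cloudservices.gserviceaccount.com"
```

The remaining steps reference these variables, so keep them exported in the same shell session.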
Step 2: Enable Required APIs
Ensure all necessary APIs are enabled in both host and service projects:
```bash
# Enable APIs in service project
gcloud services enable container.googleapis.com --project=$SERVICE_PROJECT
gcloud services enable compute.googleapis.com --project=$SERVICE_PROJECT

# Enable APIs in host project
gcloud services enable container.googleapis.com --project=$HOST_PROJECT
gcloud services enable compute.googleapis.com --project=$HOST_PROJECT

# Verify API enablement
gcloud services list --enabled --project=$SERVICE_PROJECT --filter="name:container.googleapis.com"
gcloud services list --enabled --project=$HOST_PROJECT --filter="name:compute.googleapis.com"
```
Step 3: Grant Kubernetes Engine Service Agent Role
Grant the roles/container.serviceAgent role to both host and service project service accounts:
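A sketch of the binding, assuming the `$HOST_PROJECT` and `$SERVICE_PROJECT_NUMBER` variables from Step 1 are set:

```bash
# Grant the GKE service agent of the service project the
# container.serviceAgent role on the host project
gcloud projects add-iam-policy-binding $HOST_PROJECT \
  --member="serviceAccount:service-${SERVICE_PROJECT_NUMBER}@container-engine-robot.iam.gserviceaccount.com" \
  --role="roles/container.serviceAgent"
```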
Step 4: Grant Editor Role to Service Project
Grant editor permissions within the service project:
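A sketch of the binding for the Google APIs service account, assuming the Step 1 variables are set (this account usually holds Editor on its own project by default, so this step mainly matters if the default grant was removed):

```bash
# Google APIs service account needs Editor within the service project
gcloud projects add-iam-policy-binding $SERVICE_PROJECT \
  --member="serviceAccount:${SERVICE_PROJECT_NUMBER}@cloudservices.gserviceaccount.com" \
  --role="roles/editor"
```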
Step 5: Grant Network User Role
Grant network access permissions on the host project:
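A sketch granting `roles/compute.networkUser` on the host project to both service accounts from Step 1:

```bash
# GKE service agent needs to use the shared network
gcloud projects add-iam-policy-binding $HOST_PROJECT \
  --member="serviceAccount:service-${SERVICE_PROJECT_NUMBER}@container-engine-robot.iam.gserviceaccount.com" \
  --role="roles/compute.networkUser"

# Google APIs service account needs the same access
gcloud projects add-iam-policy-binding $HOST_PROJECT \
  --member="serviceAccount:${SERVICE_PROJECT_NUMBER}@cloudservices.gserviceaccount.com" \
  --role="roles/compute.networkUser"
```

For tighter scoping, the same role can be granted on individual subnets instead of the whole project via `gcloud compute networks subnets add-iam-policy-binding`.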
Step 6: Grant Additional Required Roles
Grant additional roles needed for full GKE functionality:
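One role commonly required here is `roles/container.hostServiceAgentUser`, which lets the service project's GKE service agent perform operations (such as managing firewall rules) in the host project; a sketch, assuming the Step 1 variables:

```bash
# Host Service Agent User on the host project for the GKE service agent
gcloud projects add-iam-policy-binding $HOST_PROJECT \
  --member="serviceAccount:service-${SERVICE_PROJECT_NUMBER}@container-engine-robot.iam.gserviceaccount.com" \
  --role="roles/container.hostServiceAgentUser"
```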
Step 7: Organization-Level Permissions for Terraform
For Terraform automation, grant organization-level permissions:
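A sketch of one such grant: `roles/compute.xpnAdmin` (Shared VPC Admin) at the organization level lets a Terraform service account attach service projects to the host project. The organization ID and the Terraform service account name below are placeholders:

```bash
# Placeholders -- replace with your org ID and Terraform SA
export ORG_ID="123456789012"
export TF_SA="terraform@${HOST_PROJECT}.iam.gserviceaccount.com"

# Shared VPC Admin at the organization level
gcloud organizations add-iam-policy-binding $ORG_ID \
  --member="serviceAccount:${TF_SA}" \
  --role="roles/compute.xpnAdmin"
```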
Step 8: Artifact Registry Permissions
After creating the GKE cluster with Terraform, grant Artifact Registry access:
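A sketch of the grant, using the Terraform-managed node service account format from the table above (the `<cluster-name>-<random>` portion is generated per cluster, so copy the actual address from `gcloud iam service-accounts list`):

```bash
# Allow the cluster's node service account to pull images from the
# host project's Artifact Registry
gcloud projects add-iam-policy-binding $HOST_PROJECT \
  --member="serviceAccount:tf-gke-<cluster-name>-<random>@${SERVICE_PROJECT}.iam.gserviceaccount.com" \
  --role="roles/artifactregistry.reader"
```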
Terraform Integration
Terraform Configuration Example
Here’s a comprehensive Terraform configuration for creating GKE clusters in Shared VPC:
```hcl
# terraform/main.tf
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 4.0"
    }
  }
}

provider "google" {
  project = var.service_project_id
  region  = var.region
}

# Data sources for existing network resources
data "google_compute_network" "shared_vpc" {
  name    = var.network_name
  project = var.host_project_id
}

data "google_compute_subnetwork" "gke_subnet" {
  name    = var.subnet_name
  project = var.host_project_id
  region  = var.region
}

# GKE Cluster in Shared VPC
resource "google_container_cluster" "shared_vpc_cluster" {
  name     = var.cluster_name
  location = var.region
  project  = var.service_project_id

  # Network configuration for Shared VPC
  network    = data.google_compute_network.shared_vpc.self_link
  subnetwork = data.google_compute_subnetwork.gke_subnet.self_link

  # IP allocation policy for VPC-native cluster
  ip_allocation_policy {
    cluster_secondary_range_name  = var.cluster_secondary_range_name
    services_secondary_range_name = var.services_secondary_range_name
  }

  # Remove default node pool
  remove_default_node_pool = true
  initial_node_count       = 1

  # Network policy configuration
  network_policy {
    enabled = true
  }

  # Workload Identity
  workload_identity_config {
    workload_pool = "${var.service_project_id}.svc.id.goog"
  }

  # Private cluster configuration
  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false
    master_ipv4_cidr_block  = var.master_ipv4_cidr_block
  }

  # Master authorized networks
  master_authorized_networks_config {
    dynamic "cidr_blocks" {
      for_each = var.authorized_networks
      content {
        cidr_block   = cidr_blocks.value.cidr_block
        display_name = cidr_blocks.value.display_name
      }
    }
  }
}

# Node pool
resource "google_container_node_pool" "primary_nodes" {
  name       = "${var.cluster_name}-node-pool"
  location   = var.region
  cluster    = google_container_cluster.shared_vpc_cluster.name
  project    = var.service_project_id
  node_count = var.node_count

  node_config {
    preemptible  = var.preemptible
    machine_type = var.machine_type

    # Service account for nodes
    service_account = google_service_account.gke_nodes.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]

    # Workload Identity
    workload_metadata_config {
      mode = "GKE_METADATA"
    }
  }

  # Auto-scaling
  autoscaling {
    min_node_count = var.min_node_count
    max_node_count = var.max_node_count
  }

  # Node management
  management {
    auto_repair  = true
    auto_upgrade = true
  }
}

# Service account for GKE nodes
resource "google_service_account" "gke_nodes" {
  account_id   = "tf-gke-${var.cluster_name}-nodes"
  display_name = "GKE Node Service Account for ${var.cluster_name}"
  project      = var.service_project_id
}

# IAM binding for node service account
resource "google_project_iam_binding" "gke_nodes_gcr" {
  project = var.host_project_id
  role    = "roles/artifactregistry.reader"
  members = [
    "serviceAccount:${google_service_account.gke_nodes.email}",
  ]
}
```
Variables Configuration
```hcl
# terraform/variables.tf
variable "service_project_id" {
  description = "The service project ID where GKE cluster will be created"
  type        = string
}

variable "host_project_id" {
  description = "The host project ID containing the Shared VPC"
  type        = string
}

variable "region" {
  description = "The region for the GKE cluster"
  type        = string
  default     = "us-central1"
}

variable "network_name" {
  description = "The name of the Shared VPC network"
  type        = string
}

variable "subnet_name" {
  description = "The name of the subnet for GKE cluster"
  type        = string
}

variable "cluster_name" {
  description = "The name of the GKE cluster"
  type        = string
}

variable "cluster_secondary_range_name" {
  description = "The name of the secondary range for cluster IPs"
  type        = string
}

variable "services_secondary_range_name" {
  description = "The name of the secondary range for services"
  type        = string
}

variable "master_ipv4_cidr_block" {
  description = "The IP range for the GKE master"
  type        = string
  default     = "172.16.0.0/28"
}

variable "authorized_networks" {
  description = "List of authorized networks for GKE master"
  type = list(object({
    cidr_block   = string
    display_name = string
  }))
  default = []
}

variable "node_count" {
  description = "Number of nodes in the node pool"
  type        = number
  default     = 3
}

variable "min_node_count" {
  description = "Minimum number of nodes in the node pool"
  type        = number
  default     = 1
}

variable "max_node_count" {
  description = "Maximum number of nodes in the node pool"
  type        = number
  default     = 10
}

variable "machine_type" {
  description = "Machine type for GKE nodes"
  type        = string
  default     = "e2-medium"
}

variable "preemptible" {
  description = "Whether to use preemptible nodes"
  type        = bool
  default     = false
}
```
Terraform Execution
```bash
# Initialize Terraform
terraform init

# Plan the deployment
terraform plan -var-file="environments/dev.tfvars"

# Apply the configuration
terraform apply -var-file="environments/dev.tfvars"

# Verify cluster creation (replace the placeholders with your values)
gcloud container clusters get-credentials <cluster-name> --region=<region> --project=<service-project-id>
kubectl get nodes
```
Best Practices and Security Guidelines
IAM Security Best Practices
| Practice | Description | Implementation |
|---|---|---|
| Principle of Least Privilege | Grant only the minimum required permissions | Use specific predefined roles instead of primitive roles |
| Service Account Segregation | Create dedicated service accounts for different purposes | Separate accounts for GKE nodes, applications, and CI/CD |
| Regular Access Review | Periodically review and audit IAM permissions | Use Cloud Asset Inventory and IAM Recommender |
| Conditional Access | Use IAM conditions for time-based or resource-specific access | Implement conditions for temporary access or specific resources |
Network Security Configuration
Monitoring and Logging
Troubleshooting Common Issues
Permission Denied Errors
- Cluster creation fails: Verify all service accounts have required roles on host project
- Node pool creation fails: Check compute.networkUser role for GKE service account
- Pod networking issues: Verify secondary IP ranges are properly configured
- Image pull errors: Ensure Artifact Registry reader permissions are granted
Diagnostic Commands
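A few commands that help narrow down the permission and networking errors listed above, assuming the `$HOST_PROJECT` and `$SERVICE_PROJECT` variables from Step 1 are set:

```bash
# Show which roles the GKE service agent holds on the host project
gcloud projects get-iam-policy $HOST_PROJECT \
  --flatten="bindings[].members" \
  --filter="bindings.members:container-engine-robot" \
  --format="table(bindings.role, bindings.members)"

# List subnets (and secondary ranges) the service project can actually use
gcloud container subnets list-usable \
  --project=$SERVICE_PROJECT \
  --network-project=$HOST_PROJECT
```

If a subnet is missing from the second command's output, the `compute.networkUser` grant from Step 5 is the usual culprit.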
Network Configuration Issues
What’s Next?
After successfully setting up GKE clusters with Shared VPC, consider these advanced topics:
- Multi-cluster Service Mesh: Implement Istio across multiple GKE clusters
- GitOps with ArgoCD: Set up continuous deployment pipelines
- Workload Identity: Secure pod-to-GCP service authentication
- Network Policies: Implement fine-grained network security
- Cross-project Monitoring: Set up centralized observability
Advanced Configurations
Recommended Learning Path
- Master the basics: Ensure solid understanding of Shared VPC and GKE fundamentals
- Implement automation: Use Terraform for all infrastructure provisioning
- Security hardening: Apply network policies and Workload Identity
- Observability: Set up comprehensive monitoring and logging
- Advanced networking: Explore service mesh and multi-cluster architectures