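Kubernetes Gateway API Deep Dive: A Complete Hands-On Guide from the Ingress NGINX Retirement to Next-Generation Traffic Management
Foreword: The End of One Era and the Start of Another
In November 2025, Kubernetes SIG Network and the Security Response Committee announced the retirement of the Ingress NGINX project, with best-effort maintenance ending in March 2026. After close to a decade as the default entry point for traffic in the Kubernetes ecosystem, the controller has reached the end of its life. For the many teams that run Ingress NGINX in production, this is more than a community announcement: the countdown to an architecture upgrade has started.
If you still configure routing with annotations such as nginx.ingress.kubernetes.io/rewrite-target, if incompatible annotations between Ingress Controllers give you headaches, or if Ingress simply cannot express what you need, it is time to take a serious look at Gateway API.
Gateway API is not a patch on top of Ingress. It is a redesign from the ground up, with a role-oriented permission model, richer routing semantics, built-in support for multi-tenancy, and explicit extension points. This article covers the architecture, the resource model, hands-on configuration, and production migration.
Chapter 1: Why Do We Need Gateway API?
1.1 The Fundamental Flaws of Ingress
Before discussing Gateway API, we need to understand where Ingress falls short. This is not a wholesale rejection of Ingress. It served the Kubernetes community well for a decade, but as the cloud-native ecosystem matured, several structural design problems became visible.
Flaw 1: Limited expressiveness
The Ingress API is deliberately simple and only supports HTTP/HTTPS routing by host and path. Consider a typical Ingress: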
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    # Regex paths and capture-group rewrites require use-regex in ingress-nginx
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/enable-cors: "true"
    nginx.ingress.kubernetes.io/limit-rps: "100"
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: app-service
                port:
                  number: 8080
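Notice the pattern? Almost every advanced feature depends on annotations, and annotations have two serious problems:
- Not standardized: annotations under nginx.ingress.kubernetes.io/ and traefik.ingress.kubernetes.io/ are entirely different, so switching Ingress Controllers means rewriting the configuration.
- Not type-safe: annotations are plain string key-value pairs with no schema validation, so mistakes surface only at runtime.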
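Flaw 2: No role separation
In a real enterprise, the infrastructure team, cluster administrators, and application developers are different roles with different permissions:
- Infrastructure team: deploys and maintains gateways (load balancers, TLS certificates, and so on)
- Platform team: configures routing policy and traffic management
- Application developers: only care about how their own service is exposed
Ingress folds all of this into a single resource, so fine-grained role separation is impossible. A developer who can create an Ingress controls its hostnames, TLS settings, and controller annotations, and nothing in the API itself stops them from claiming a hostname another team uses or changing behavior the platform team wants to own.
Flaw 3: No advanced routing
Ingress cannot express the routing features that modern microservice architectures need:
- Routing on request headers and query parameters
- Weighted traffic splitting (canary releases, A/B tests)
- Request and response rewriting
- Resilience policies such as timeouts, retries, and circuit breaking
- Native support for gRPC, WebSocket, TCP, and UDP
- Cross-namespace routing
All of these are implemented through vendor-specific annotations, which leads to vendor lock-in.
Flaw 4: TCP/UDP support is a bolt-on
Ingress was designed for HTTP from the start, and the API has no TCP/UDP support at all. Controllers add it through side channels (Ingress NGINX, for example, uses the tcp-services and udp-services ConfigMaps), so the experience is poor and the configuration differs widely between controllers.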
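1.2 The Design Philosophy of Gateway API
Gateway API was designed to address these pain points and takes a very different approach:
Role-oriented design
This is the central idea of Gateway API. Traffic management is split across several resources, each owned by the role that should control it:
| Resource | Scope | Managed by |
|---|---|---|
| GatewayClass | Cluster | Infrastructure team |
| Gateway | Namespace | Platform team |
| HTTPRoute / TCPRoute / ... | Namespace | Application developers |
| ReferenceGrant | Namespace (created in the namespace being referenced) | Owner of the referenced namespace |
Built for extension
Every Gateway API resource has standard extension points that let implementations add custom features while the core API stays stable. You describe the common 80% of requirements with the standard API and use extension points for the remaining 20%, instead of pushing everything into annotations.
Built for the future
Beyond HTTP, Gateway API defines GRPCRoute (GA since v1.1) and, in its experimental channel, TLSRoute, TCPRoute, and UDPRoute. Sibling projects such as the Inference Extension for AI inference gateways build on the same resource model.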
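Chapter 2: The Core Resource Model in Detail
2.1 GatewayClass: Choosing a Gateway Implementation
GatewayClass is a cluster-scoped resource that selects which Gateway controller implementation to use. Its relationship to Gateway is similar to that of StorageClass to PersistentVolume.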
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: envoy-gateway
spec:
  # Envoy Gateway's controller name
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    # EnvoyProxy belongs to the gateway.envoyproxy.io API group
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: default-config
    namespace: envoy-gateway-system
  description: "Envoy Gateway: high-performance cloud-native gateway implementation"
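Key fields:
- controllerName: identifies the controller implementation. The string must match exactly, and every controller has its own value.
- parametersRef: an optional reference to a controller-specific configuration resource. Each controller can keep its own configuration schema while the GatewayClass API stays standard.
- description: a human-readable description, useful for telling classes apart when several are installed.
A cluster can have several GatewayClasses at once, for example: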
# Istio GatewayClass
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: istio
spec:
  controllerName: istio.io/gateway-controller
---
# Cilium GatewayClass (eBPF-based)
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: cilium
spec:
  controllerName: io.cilium/gateway-controller
---
# NGINX GatewayClass (NGINX Gateway Fabric)
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: nginx
spec:
  controllerName: gateway.nginx.org/nginx-gateway-controller
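2.2 Gateway: Defining a Gateway Instance
Gateway is a namespaced resource that represents an actual load balancer instance. It defines listeners, including the port, protocol, and TLS configuration of each.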
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: production-gateway
  namespace: gateway-system
spec:
  gatewayClassName: envoy-gateway
  addresses:
    - type: IPAddress
      value: 10.0.1.100
  listeners:
    # HTTP listener. Redirecting to HTTPS is not automatic: attach an
    # HTTPRoute with a RequestRedirect filter to this listener.
    - name: http
      hostname: "*.example.com"
      port: 80
      protocol: HTTP
      allowedRoutes:
        namespaces:
          from: Selector
          selector:
            matchLabels:
              gateway-access: "allowed"
    # HTTPS listener
    - name: https
      hostname: "*.example.com"
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          # Cross-namespace reference: needs a ReferenceGrant in cert-manager
          - kind: Secret
            name: example-com-tls
            namespace: cert-manager
        # tls.options is a flat map of implementation-specific string
        # key/value pairs; it is omitted here
      allowedRoutes:
        namespaces:
          from: Selector
          selector:
            matchLabels:
              gateway-access: "allowed"
    # TCP listener for direct database access
    # (TCPRoute requires the experimental-channel CRDs)
    - name: mysql-tcp
      port: 3306
      protocol: TCP
      allowedRoutes:
        kinds:
          - group: gateway.networking.k8s.io
            kind: TCPRoute
        namespaces:
          from: Same
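This configuration shows several aspects of Gateway API's expressiveness:
- Multi-protocol support: a single Gateway can listen for HTTP, HTTPS, and TCP at the same time, each with its own configuration.
- TLS modes: Terminate (TLS ends at the gateway) and Passthrough (the encrypted stream is forwarded to the backend, typically with TLSRoute).
- Fine-grained route control: allowedRoutes controls precisely which route kinds from which namespaces may bind to a listener.
- Cross-namespace certificate references: TLS certificates can be referenced from a dedicated cert-manager namespace, provided a ReferenceGrant in that namespace permits it.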
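To make the allowedRoutes namespace rules concrete, here is a minimal Python sketch of the decision a controller makes when a route tries to attach to a listener. The function name and the dictionary shapes are illustrative assumptions, not part of any Gateway API client library, and only matchLabels selectors are handled (matchExpressions are ignored).

```python
def namespace_allowed(allowed_routes, gateway_ns, route_ns, route_ns_labels):
    """Return True if a route in route_ns may attach to the listener.

    allowed_routes mirrors listener.allowedRoutes from the Gateway spec.
    Hypothetical helper for illustration; matchExpressions are not handled.
    """
    namespaces = (allowed_routes or {}).get("namespaces", {})
    mode = namespaces.get("from", "Same")  # the API defaults to Same
    if mode == "All":
        return True
    if mode == "Same":
        return route_ns == gateway_ns
    if mode == "Selector":
        wanted = namespaces.get("selector", {}).get("matchLabels", {})
        return all(route_ns_labels.get(k) == v for k, v in wanted.items())
    return False


selector_listener = {
    "namespaces": {
        "from": "Selector",
        "selector": {"matchLabels": {"gateway-access": "allowed"}},
    }
}
print(namespace_allowed(selector_listener, "gateway-system", "payment",
                        {"gateway-access": "allowed"}))
print(namespace_allowed(selector_listener, "gateway-system", "sandbox", {}))
```

The first call admits the payment namespace because its labels satisfy the selector; the second rejects a namespace without the label. A listener with no allowedRoutes block behaves like from: Same.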
Production best practice: a highly available Gateway deployment
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: production-gateway-ha
  namespace: gateway-system
spec:
  gatewayClassName: envoy-gateway
  infrastructure:
    # Controller-specific settings live in a referenced resource rather than
    # in annotations. With Envoy Gateway this is an EnvoyProxy object in the
    # same namespace as the Gateway.
    parametersRef:
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: cloud-provider-config
  listeners:
    - name: https
      hostname: "*.example.com"
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: wildcard-tls
            namespace: cert-manager
      allowedRoutes:
        namespaces:
          from: All
---
# Autoscaling for the Envoy proxy fleet (Envoy Gateway specific)
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: cloud-provider-config
  namespace: gateway-system
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyHpa:
        minReplicas: 3
        maxReplicas: 10
        metrics:
          - type: Resource
            resource:
              name: cpu
              target:
                type: Utilization
                averageUtilization: 70
2.3 HTTPRoute: The Core Routing Resource
HTTPRoute is the most complex and most capable resource in Gateway API. It replaces the routing part of Ingress and is far more expressive.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: payment-service-route
  namespace: payment
spec:
  parentRefs:
    - name: production-gateway
      namespace: gateway-system
      sectionName: https
  hostnames:
    - "pay.example.com"
    - "payment.example.com"
  rules:
    # Rule 1: payment API routing matched on a request header
    - matches:
        - headers:
            - name: X-API-Version
              value: "v2"
          path:
            type: PathPrefix
            value: /api/payments
      filters:
        - type: RequestHeaderModifier
          requestHeaderModifier:
            set:
              - name: X-Request-Source
                value: gateway
            remove:
              - X-Internal-Debug
      backendRefs:
        - name: payment-api-v2
          port: 8080
          weight: 90
        - name: payment-api-v2-canary
          port: 8080
          weight: 10
    # Rule 2: routing on a query parameter
    - matches:
        - queryParams:
            - name: env
              value: "test"
          path:
            type: Exact
            value: /api/payments/test
      backendRefs:
        - name: payment-api-v2
          port: 8080
    # Rule 3: traffic mirroring and timeouts
    - matches:
        - path:
            type: PathPrefix
            value: /api/payments/
      filters:
        - type: RequestMirror
          requestMirror:
            backendRef:
              name: payment-traffic-replay
              port: 8080
      timeouts:
        request: 30s
        backendRequest: 25s
      backendRefs:
        - name: payment-api-v2
          port: 8080
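Core concepts:
matches
Conditions inside a single match entry are combined with AND; when a rule lists several match entries, the rule applies if any one of them matches (OR). The available conditions are:
- path: Exact, PathPrefix, RegularExpression
- headers: exact match, regular-expression match
- queryParams: query parameter match
- method: HTTP method match
filters
Filters are where most of Gateway API's extensibility shows:
- RequestHeaderModifier: modify request headers
- ResponseHeaderModifier: modify response headers
- RequestRedirect: redirect
- RequestMirror: traffic mirroring (shadow traffic)
- URLRewrite: rewrite the URL
- ExtensionRef: reference a custom extension filter
timeouts
Fine-grained timeouts are part of the API:
- request: timeout for the whole request as seen by the client
- backendRequest: timeout for a single request from the gateway to the backend; it should not exceed request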
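The AND/OR semantics of matches are easy to get wrong, so here is a small Python sketch of how a rule could be evaluated against a request. It is an illustration under simplifying assumptions: only Exact and PathPrefix paths and exact header/query matches are handled, and the function names and request dictionary shape are invented for this example.

```python
def path_matches(path_match, request_path):
    """Evaluate an HTTPRoute path match (Exact and PathPrefix only)."""
    kind, value = path_match["type"], path_match["value"]
    if kind == "Exact":
        return request_path == value
    if kind == "PathPrefix":
        # Prefixes match whole path elements: /api matches /api and /api/x,
        # but not /apix. A trailing slash in the prefix is ignored.
        prefix = value.rstrip("/")
        return request_path == prefix or request_path.startswith(prefix + "/")
    raise NotImplementedError(f"path type {kind} is not covered by this sketch")


def match_applies(match, request):
    """All conditions inside one match entry must hold (AND)."""
    headers = {k.lower(): v for k, v in request.get("headers", {}).items()}
    query = request.get("query", {})
    path_match = match.get("path", {"type": "PathPrefix", "value": "/"})
    if not path_matches(path_match, request["path"]):
        return False
    for header in match.get("headers", []):
        if headers.get(header["name"].lower()) != header["value"]:
            return False
    for param in match.get("queryParams", []):
        if query.get(param["name"]) != param["value"]:
            return False
    if "method" in match and match["method"] != request.get("method"):
        return False
    return True


def rule_applies(rule, request):
    """A rule applies if any of its match entries applies (OR)."""
    matches = rule.get("matches") or [{}]  # no matches means "match everything"
    return any(match_applies(m, request) for m in matches)


rule = {
    "matches": [
        {"path": {"type": "PathPrefix", "value": "/api/payments"},
         "headers": [{"name": "X-API-Version", "value": "v2"}]},
        {"path": {"type": "Exact", "value": "/healthz"}},
    ]
}
print(rule_applies(rule, {"path": "/api/payments/42",
                          "headers": {"x-api-version": "v2"}}))
print(rule_applies(rule, {"path": "/api/payments/42", "headers": {}}))
```

The first request satisfies both the path and header conditions of the first match entry; the second fails the header condition and does not equal /healthz, so the rule does not apply.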
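2.4 TCPRoute and UDPRoute: Layer 4 Routing
For non-HTTP workloads such as direct database access, message queues, and game servers, Gateway API offers TCP and UDP routes. Both are still in the experimental channel (v1alpha2), so the experimental CRDs must be installed: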
# TCP route: MySQL database access
# Note: weights split connections blindly, so an 80/20 split between a primary
# and a read replica is only safe for read-only clients.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: mysql-route
  namespace: database
spec:
  parentRefs:
    - name: production-gateway
      namespace: gateway-system
      sectionName: mysql-tcp
  rules:
    - backendRefs:
        - name: mysql-primary
          port: 3306
          weight: 80
        - name: mysql-read-replica
          port: 3306
          weight: 20
---
# UDP route: DNS service
# Assumes the Gateway also defines a UDP listener named dns-udp on port 53
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: dns-route
  namespace: kube-system
spec:
  parentRefs:
    - name: production-gateway
      namespace: gateway-system
      sectionName: dns-udp
  rules:
    - backendRefs:
        - name: coredns
          port: 53
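2.5 ReferenceGrant: Cross-Namespace Authorization
An important design decision in Gateway API is "no trust by default" across namespaces, enforced by two separate mechanisms. Whether a route may attach to a Gateway in another namespace is decided by that Gateway's allowedRoutes. Whether an object may reference something in another namespace (an HTTPRoute pointing at a Service, or a Gateway pointing at a TLS Secret) is decided by a ReferenceGrant created in the namespace being referenced:
# Allow Gateways in gateway-system to reference TLS Secrets in cert-manager
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-gateway-to-tls-secrets
  namespace: cert-manager
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: Gateway
      namespace: gateway-system
  to:
    - group: ""
      kind: Secret
This design follows the principle of least privilege: the owner of the referenced namespace has to opt in, so being able to create an HTTPRoute or a Gateway does not by itself grant access to Services or Secrets that belong to someone else.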
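The check a controller performs for a cross-namespace reference can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function name and the dictionary layout for from_ref and to_ref are made up for illustration, and the grants are plain dictionaries shaped like ReferenceGrant manifests.

```python
def reference_permitted(grants, from_ref, to_ref):
    """Decide whether from_ref may reference to_ref.

    from_ref: {"group", "kind", "namespace"} of the referencing object.
    to_ref:   {"group", "kind", "namespace", "name"} of the target.
    Same-namespace references never need a grant.
    """
    if from_ref["namespace"] == to_ref["namespace"]:
        return True
    for grant in grants:
        # A grant only counts if it lives in the namespace being referenced.
        if grant["metadata"]["namespace"] != to_ref["namespace"]:
            continue
        from_ok = any(
            f["group"] == from_ref["group"]
            and f["kind"] == from_ref["kind"]
            and f["namespace"] == from_ref["namespace"]
            for f in grant["spec"]["from"]
        )
        to_ok = any(
            t["group"] == to_ref["group"]
            and t["kind"] == to_ref["kind"]
            and t.get("name") in (None, to_ref["name"])  # no name = all objects
            for t in grant["spec"]["to"]
        )
        if from_ok and to_ok:
            return True
    return False


grant = {
    "metadata": {"name": "allow-gateway-to-tls-secrets", "namespace": "cert-manager"},
    "spec": {
        "from": [{"group": "gateway.networking.k8s.io", "kind": "Gateway",
                  "namespace": "gateway-system"}],
        "to": [{"group": "", "kind": "Secret"}],
    },
}
gateway = {"group": "gateway.networking.k8s.io", "kind": "Gateway",
           "namespace": "gateway-system"}
secret = {"group": "", "kind": "Secret", "namespace": "cert-manager",
          "name": "example-com-tls"}
print(reference_permitted([grant], gateway, secret))
print(reference_permitted([], gateway, secret))
```

With the grant present the Gateway may use the Secret; without it the reference is refused, which is what a controller reports through the ResolvedRefs condition.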
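2.6 GRPCRoute: Native gRPC Support
As gRPC has become common in microservice architectures, Gateway API added a dedicated gRPC route type, which has been GA (v1) since Gateway API v1.1: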
apiVersion: gateway.networking.k8s.io/v1
kind: GRPCRoute
metadata:
  name: user-service-grpc
  namespace: user-service
spec:
  parentRefs:
    - name: production-gateway
      namespace: gateway-system
      sectionName: https
  hostnames:
    - "grpc.example.com"
  rules:
    - matches:
        - method:
            service: user.v3.UserService
            method: GetUser
          headers:
            - name: x-trace-id
              type: Exact
              value: "required"
      backendRefs:
        - name: user-service
          port: 9090
      filters:
        - type: RequestHeaderModifier
          requestHeaderModifier:
            add:
              - name: x-gateway-processed
                value: "true"
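Chapter 3: Mainstream Gateway Controller Implementations
3.1 Overview
By 2026 the Gateway API ecosystem is mature. The main controller implementations include:
| Implementation | Language | Characteristics | Best suited for |
|---|---|---|---|
| Envoy Gateway | Go | Maintained under the Envoy project, Envoy data plane | General use, high performance requirements |
| NGINX Gateway Fabric | Go | Built by the NGINX team | Teams with existing NGINX experience |
| Istio Gateway | Go | Deep service mesh integration | Istio users |
| Cilium Gateway | Go | eBPF-based datapath | Clusters that already run Cilium and need low overhead |
| Traefik | Go | Developer friendly, automatic discovery | Small and medium teams, fast iteration |
| Kong | Lua/Go | Rich plugin ecosystem | Teams with heavy API management needs |
| Higress | Go | Open-sourced by Alibaba, supports the AI Inference Extension | Cloud environments in China, AI inference gateways |
3.2 Deploying Envoy Gateway
Envoy Gateway is a widely used Gateway API implementation developed within the Envoy project. The following is a production-oriented deployment outline: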
Step 1: Install the CRDs and the controller
# Install the Gateway API CRDs (v1.3.0, standard channel).
# TCPRoute and UDPRoute are only in experimental-install.yaml.
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.3.0/standard-install.yaml
# Install Envoy Gateway from its OCI Helm chart.
# Replace <version> with a release that supports your Gateway API version,
# and check the value names below against that chart's values.yaml.
helm install envoy-gateway oci://docker.io/envoyproxy/gateway-helm \
  --version <version> \
  --namespace envoy-gateway-system \
  --create-namespace \
  --set deployment.replicas=3 \
  --set deployment.envoyGateway.resources.requests.cpu=500m \
  --set deployment.envoyGateway.resources.requests.memory=512Mi \
  --set deployment.envoyGateway.resources.limits.cpu=2000m \
  --set deployment.envoyGateway.resources.limits.memory=2Gi
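Step 2: Configure autoscaling
# This HPA scales the envoy-gateway controller Deployment (the control plane).
# The Envoy proxy fleet that carries traffic is scaled separately, for example
# through the EnvoyProxy resource.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: envoy-gateway-hpa
  namespace: envoy-gateway-system
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: envoy-gateway
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # Pods metrics require a custom metrics adapter (for example
    # prometheus-adapter); the metric name is whatever that adapter exposes.
    - type: Pods
      pods:
        metric:
          name: envoy_http_requests_total
        target:
          type: AverageValue
          averageValue: "10000"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60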
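For intuition about what the 70% CPU target does, the core HPA calculation is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), skipped when the ratio is within a tolerance (10% by default) and clamped to the min/max bounds. The Python sketch below reproduces only that formula; the function name is an assumption for this example, and stabilization windows and scaling policies are deliberately left out.

```python
import math


def desired_replicas(current, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Core HPA formula, ignoring stabilization windows and policies."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        desired = current  # close enough to the target: no change
    else:
        desired = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 3 replicas at 90% average CPU against a 70% target
print(desired_replicas(3, 90, 70, 3, 10))   # → 4
# A large spike is capped by maxReplicas
print(desired_replicas(3, 300, 70, 3, 10))  # → 10
# 72% is within the 10% tolerance, so nothing changes
print(desired_replicas(3, 72, 70, 3, 10))   # → 3
```

The behavior block in the manifest then limits how fast the computed value may be applied: at most a 100% increase per minute when scaling up, and at most a 10% decrease per minute after a 300-second stabilization window when scaling down.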
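Step 3: Integrate cert-manager for automatic certificate management
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      # HTTP-01 through Gateway API: needs cert-manager's Gateway API support
      # enabled and an HTTP listener on port 80 on the referenced Gateway
      - http01:
          gatewayHTTPRoute:
            parentRefs:
              - name: production-gateway
                namespace: gateway-system
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com-tls
  namespace: cert-manager
spec:
  secretName: example-com-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  # HTTP-01 cannot validate wildcard names. For "*.example.com" add a DNS-01
  # solver to the issuer; the concrete hostnames below work with HTTP-01.
  dnsNames:
    - "pay.example.com"
    - "payment.example.com"
    - "example.com"
  usages:
    - digital signature
    - key encipherment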
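3.3 Cilium eBPF Acceleration
If your cluster already uses Cilium as its CNI, Cilium's Gateway API support can reduce datapath overhead noticeably. Cilium's eBPF datapath handles L3/L4 forwarding inside the kernel without traversing iptables rules; L7 processing for HTTP routes is still performed by an Envoy proxy that Cilium manages. A simplified comparison:
Traditional datapath:
NIC → kernel network stack (iptables/IPVS rules programmed by kube-proxy) → user-space gateway process → kernel network stack → backend Pod
Each packet crosses the kernel/user-space boundary several times and traverses iptables rules
Cilium eBPF datapath:
NIC → XDP/eBPF → Envoy (for L7 routes) → backend Pod
L3/L4 forwarding stays in the kernel and skips iptables traversal
Performance comparison (measured in our own load-test environment; results will vary with hardware and workload):
| Metric | Envoy Gateway | Cilium Gateway (eBPF) | Change |
|---|---|---|---|
| P99 latency | 2.1ms | 0.8ms | 62%↓ |
| Throughput (RPS) | 85,000 | 210,000 | 147%↑ |
| CPU utilization (at 50k RPS) | 35% | 12% | 66%↓ |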
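The percentages in the last column follow directly from the two measured columns. A short Python check, using only the numbers from the table above (the helper name is arbitrary):

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


rows = {
    "P99 latency (ms)": (2.1, 0.8),
    "Throughput (RPS)": (85_000, 210_000),
    "CPU utilization at 50k RPS (%)": (35, 12),
}
for name, (envoy, cilium) in rows.items():
    print(f"{name}: {pct_change(envoy, cilium):+.0f}%")
# P99 latency (ms): -62%
# Throughput (RPS): +147%
# CPU utilization at 50k RPS (%): -66%
```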
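Chapter 4: Production Architecture Design
4.1 Multi-Environment Gateway Architecture
A mature production setup usually needs several gateway configurations: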
# Development Gateway
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: dev-gateway-class
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: dev-gateway
  namespace: gateway-system
spec:
  gatewayClassName: dev-gateway-class
  listeners:
    - name: http
      port: 80
      protocol: HTTP
      allowedRoutes:
        namespaces:
          from: All
---
# Production Gateway: highly available configuration
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: prod-gateway-class
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: production-config
    namespace: gateway-system
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: prod-gateway
  namespace: gateway-system
spec:
  gatewayClassName: prod-gateway-class
  listeners:
    - name: https
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          # Cross-namespace reference: needs a ReferenceGrant in cert-manager
          - kind: Secret
            name: prod-wildcard-tls
            namespace: cert-manager
      allowedRoutes:
        namespaces:
          from: Selector
          selector:
            matchLabels:
              environment: production
4.2 Canary Releases in Practice
Gateway API supports weight-based traffic splitting natively, which is the most direct way to run a canary release:
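apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-canary
  namespace: my-app
spec:
  parentRefs:
    - name: prod-gateway
      namespace: gateway-system
  hostnames:
    - "app.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      filters:
        # Optional: also mirrors every request to v2 (responses are discarded).
        # Remove this filter if requests are not idempotent, because v2 then
        # processes the mirrored copy in addition to its 5% of live traffic.
        - type: RequestMirror
          requestMirror:
            backendRef:
              name: app-v2
              port: 8080
      backendRefs:
        - name: app-v1
          port: 8080
          weight: 95
        - name: app-v2
          port: 8080
          weight: 5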
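Weights are proportions, not percentages: each backend receives weight / sum(weights) of the requests, so 95 and 5 behave the same as 19 and 1. The following Python sketch shows the arithmetic; the function name is an assumption for this example.

```python
def traffic_share(backend_refs):
    """Return the fraction of traffic each backend receives."""
    total = sum(ref.get("weight", 1) for ref in backend_refs)  # weight defaults to 1
    if total == 0:
        return {ref["name"]: 0.0 for ref in backend_refs}
    return {ref["name"]: ref.get("weight", 1) / total for ref in backend_refs}


print(traffic_share([
    {"name": "app-v1", "weight": 95},
    {"name": "app-v2", "weight": 5},
]))
# {'app-v1': 0.95, 'app-v2': 0.05}

print(traffic_share([
    {"name": "app-v1", "weight": 3},
    {"name": "app-v2"},  # no weight given, so it counts as 1
]))
# {'app-v1': 0.75, 'app-v2': 0.25}
```

This also explains why setting only the canary's weight to 0 is a valid rollback: the remaining backend then receives the full share regardless of its own weight value.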
Header-based canary release:
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-canary-header
  namespace: my-app
spec:
  parentRefs:
    - name: prod-gateway
      namespace: gateway-system
  hostnames:
    - "app.example.com"
  rules:
    # Internal beta users go to v2
    - matches:
        - path:
            type: PathPrefix
            value: /
          headers:
            - name: X-Canary
              value: "true"
      backendRefs:
        - name: app-v2
          port: 8080
    # Everyone else goes to v1
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: app-v1
          port: 8080
A script for automated progressive canary rollout:
#!/bin/bash
# canary-rollout.sh - step the canary weight up gradually, rolling back on failure
set -euo pipefail

ROUTE_NAME="app-canary"
NAMESPACE="my-app"
MAX_WEIGHT=50
STEP=5
SLEEP_SECONDS=300  # wait 5 minutes per step

# Must run somewhere that can resolve cluster DNS (for example, inside a Pod).
health_check() {
  curl -sf "http://app-v2.${NAMESPACE}.svc:8080/healthz" > /dev/null 2>&1
}

# Set the v1/v2 weights with a JSON Patch. JSON Pointer paths start with "/".
set_weights() {
  kubectl patch httproute "${ROUTE_NAME}" -n "${NAMESPACE}" --type='json' -p="[
    {\"op\":\"replace\",\"path\":\"/spec/rules/0/backendRefs/0/weight\",\"value\":$1},
    {\"op\":\"replace\",\"path\":\"/spec/rules/0/backendRefs/1/weight\",\"value\":$2}
  ]"
}

echo "Starting progressive canary rollout..."
echo "The v2 weight will increase step by step from 5% to ${MAX_WEIGHT}%"

CURRENT_WEIGHT=5
while [ "${CURRENT_WEIGHT}" -le "${MAX_WEIGHT}" ]; do
  echo ""
  echo "=== Setting v2 weight to ${CURRENT_WEIGHT}% ==="

  # Health check before shifting more traffic
  if ! health_check; then
    echo "❌ v2 health check failed! Rolling back..."
    set_weights 100 0
    echo "✅ Rolled back, v2 weight set to 0%"
    exit 1
  fi

  # Update the weights
  V1_WEIGHT=$((100 - CURRENT_WEIGHT))
  set_weights "${V1_WEIGHT}" "${CURRENT_WEIGHT}"
  echo "✅ Weights updated: v1=${V1_WEIGHT}%, v2=${CURRENT_WEIGHT}%"
  echo "   Waiting ${SLEEP_SECONDS} seconds before the next step..."
  sleep "${SLEEP_SECONDS}"
  CURRENT_WEIGHT=$((CURRENT_WEIGHT + STEP))
done

echo ""
echo "🎉 Canary rollout finished! v2 weight is ${MAX_WEIGHT}%"
echo "   To complete the rollout manually, run:"
echo "   kubectl patch httproute ${ROUTE_NAME} -n ${NAMESPACE} --type='json' -p='[{\"op\":\"replace\",\"path\":\"/spec/rules/0/backendRefs/0/weight\",\"value\":0},{\"op\":\"replace\",\"path\":\"/spec/rules/0/backendRefs/1/weight\",\"value\":100}]'"
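4.3 Multi-Tenant Isolation
Multi-tenant isolation is a hard requirement in enterprise settings. Gateway API handles it through namespaces: each tenant owns the routes in its own namespace, the Gateway's allowedRoutes decides which namespaces may attach, and ReferenceGrant guards cross-namespace references to backends and Secrets.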
# Tenant A's HTTPRoute
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: tenant-a-route
  namespace: tenant-a
spec:
  parentRefs:
    - name: shared-gateway
      namespace: gateway-system
  hostnames:
    - "tenant-a.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: tenant-a-backend
          port: 8080
---
# Allow tenant-a to attach to shared-gateway. Attachment is controlled by the
# listener's allowedRoutes, not by ReferenceGrant. Assuming shared-gateway
# selects namespaces labeled gateway-access: "allowed" (as production-gateway
# does in section 2.2), the platform team labels the tenant namespace:
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
  labels:
    gateway-access: "allowed"
Combine this with NetworkPolicy for deeper isolation:
# Replace gateway-system with the namespace where the proxy Pods actually run
# (Envoy Gateway places them in envoy-gateway-system by default).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-a-isolation
  namespace: tenant-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: gateway-system
        - podSelector: {}  # allow traffic within the namespace
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: gateway-system
        - podSelector: {}
        # DNS lookups against kube-dns in any namespace
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
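4.4 Traffic Governance: Rate Limiting, Circuit Breaking, Retries
Gateway API standardizes how policies attach to resources (the Policy Attachment pattern) but leaves rate limiting, circuit breaking, and retries largely to implementation-specific policy CRDs. The following uses Envoy Gateway's BackendTrafficPolicy; field names follow recent Envoy Gateway releases and should be checked against the version you run: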
# Global rate limiting (requires a rate limit backend such as Redis to be
# configured in Envoy Gateway)
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: global-ratelimit
  namespace: gateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: prod-gateway
  rateLimit:
    type: Global
    global:
      rules:
        # 1000 requests per minute for each distinct X-Forwarded-For value
        - clientSelectors:
            - headers:
                - name: X-Forwarded-For
                  type: Distinct
          limit:
            requests: 1000
            unit: Minute
        # 5000 requests per minute for the "premium" API key
        - clientSelectors:
            - headers:
                - name: X-API-Key
                  type: Exact
                  value: "premium"
          limit:
            requests: 5000
            unit: Minute
---
# Circuit breaking and retries. Policies target Gateways or Routes in their own
# namespace rather than Services, and only one BackendTrafficPolicy should
# target a given route, so both settings live in a single policy.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: resilience-policy
  namespace: my-app
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: my-route
  circuitBreaker:
    maxConnections: 1000
    maxPendingRequests: 100
    maxParallelRetries: 3
    maxRequestsPerConnection: 100
  retry:
    numRetries: 3
    retryOn:
      triggers:
        - "5xx"
        - "reset"
        - "connect-failure"
    perRetry:
      backOff:
        baseInterval: 100ms
        maxInterval: 500ms
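Chapter 5: Migrating from Ingress to Gateway API
5.1 Migration Strategy
Migration is not a one-shot cutover. The recommended strategy is to run both in parallel:
Stage 1: Install the Gateway controller and deploy the Gateway resources
↓
Stage 2: Deploy new HTTPRoutes alongside the existing Ingress resources and verify routing
↓
Stage 3: Shift traffic to the HTTPRoutes gradually (through DNS weights or load balancer configuration)
↓
Stage 4: Delete the old Ingress resources to complete the migration
5.2 Migration Reference Table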
| Ingress configuration | Gateway API equivalent |
|---|---|
| spec.rules[].host | HTTPRoute.spec.hostnames |
| spec.rules[].http.paths[].path | HTTPRoute.spec.rules[].matches[].path |
| spec.rules[].http.paths[].pathType | HTTPRoute.spec.rules[].matches[].path.type (Prefix becomes PathPrefix) |
| spec.rules[].http.paths[].backend | HTTPRoute.spec.rules[].backendRefs |
| spec.tls | Gateway.spec.listeners[].tls |
| spec.ingressClassName | Gateway.spec.gatewayClassName (the HTTPRoute selects the Gateway through spec.parentRefs) |
| nginx.ingress.kubernetes.io/rewrite-target | HTTPRoute filter: URLRewrite (regex capture groups are not supported) |
| nginx.ingress.kubernetes.io/enable-cors | CORS filter where the implementation supports it, otherwise ExtensionRef or an implementation policy |
| nginx.ingress.kubernetes.io/affinity | Session persistence (experimental) or an implementation policy |
| nginx.ingress.kubernetes.io/proxy-connect-timeout | Closest match: HTTPRoute.spec.rules[].timeouts.backendRequest |
| nginx.ingress.kubernetes.io/limit-rps | BackendTrafficPolicy (Envoy Gateway) or ExtensionRef |
5.3 An Automated Migration Tool
#!/usr/bin/env python3
"""
ingress-to-gateway-api-migrator.py
Convert Kubernetes Ingress resources into Gateway API resources.
Usage:
    python3 ingress-to-gateway-api-migrator.py --file ingress.yaml --gateway prod-gateway --gateway-ns gateway-system
Author: 程序员茄子
"""
import argparse
import copy
import sys

import yaml  # PyYAML

# Ingress pathType → HTTPRoute path match type
PATH_TYPE_MAP = {"Exact": "Exact", "Prefix": "PathPrefix"}


def warn(message: str) -> None:
    print(f"⚠️  {message}", file=sys.stderr)


def parse_annotations(annotations: dict) -> dict:
    """Parse common ingress-nginx annotations into Gateway API equivalents."""
    nginx_prefix = "nginx.ingress.kubernetes.io/"
    result = {"filters": [], "timeouts": {}, "backend_weight": 1}
    for key, value in (annotations or {}).items():
        if not key.startswith(nginx_prefix):
            continue
        suffix = key[len(nginx_prefix):]
        if suffix == "rewrite-target":
            if "$" in value:
                # Regex capture groups cannot be expressed by URLRewrite.
                warn(f"rewrite-target '{value}' uses capture groups; convert it by hand")
            else:
                result["filters"].append({
                    "type": "URLRewrite",
                    "urlRewrite": {
                        "path": {"type": "ReplaceFullPath", "replaceFullPath": value}
                    },
                })
        elif suffix == "proxy-connect-timeout":
            # Approximation: Gateway API has no separate connect timeout.
            result["timeouts"]["backendRequest"] = value + "s"
        elif suffix == "proxy-read-timeout":
            result["timeouts"]["request"] = value + "s"
        elif suffix == "affinity" and value == "cookie":
            # Session persistence needs an extension or implementation policy.
            result["session_affinity"] = "cookie"
        elif suffix == "enable-cors" and value == "true":
            result["cors"] = True
        elif suffix == "ssl-redirect" and value == "false":
            result["ssl_redirect"] = False
        elif suffix == "use-regex" and value == "true":
            result["use_regex"] = True
        elif suffix == "backend-protocol":
            result["backend_protocol"] = value
    return result


def convert_path(path: str, path_type: str, use_regex: bool) -> dict:
    """Map an Ingress path and pathType to an HTTPRoute path match."""
    if path_type == "ImplementationSpecific":
        kind = "RegularExpression" if use_regex else "PathPrefix"
        return {"type": kind, "value": path}
    return {"type": PATH_TYPE_MAP.get(path_type, "PathPrefix"), "value": path}


def convert_ingress_to_http_routes(ingress: dict, gateway_name: str, gateway_ns: str) -> list:
    """Convert one Ingress into HTTPRoutes, one per Ingress rule.

    Keeping one route per host prevents paths declared for one host from
    also applying to the others.
    """
    metadata = ingress.get("metadata", {})
    spec = ingress.get("spec", {})
    parsed = parse_annotations(metadata.get("annotations"))
    name = metadata.get("name", "unnamed")
    namespace = metadata.get("namespace", "default")
    rules = spec.get("rules", [])

    routes = []
    for index, rule in enumerate(rules):
        route_rules = []
        for path_config in (rule.get("http") or {}).get("paths", []):
            route_rule = {
                "matches": [{
                    "path": convert_path(
                        path_config.get("path", "/"),
                        path_config.get("pathType", "Prefix"),
                        parsed.get("use_regex", False),
                    )
                }]
            }
            # Copy per rule so yaml.dump does not emit anchors/aliases
            if parsed["filters"]:
                route_rule["filters"] = copy.deepcopy(parsed["filters"])
            if parsed["timeouts"]:
                route_rule["timeouts"] = dict(parsed["timeouts"])
            # Backend reference
            service = (path_config.get("backend") or {}).get("service")
            if service:
                port = service.get("port", {}).get("number")
                if port is None:
                    warn(f"{namespace}/{name}: backend '{service['name']}' uses a "
                         "named port; set the numeric port by hand")
                else:
                    route_rule["backendRefs"] = [{
                        "name": service["name"],
                        "port": port,
                        "weight": parsed["backend_weight"],
                    }]
            route_rules.append(route_rule)

        route = {
            "apiVersion": "gateway.networking.k8s.io/v1",
            "kind": "HTTPRoute",
            "metadata": {
                "name": name if len(rules) == 1 else f"{name}-{index}",
                "namespace": namespace,
            },
            "spec": {
                "parentRefs": [{"name": gateway_name, "namespace": gateway_ns}],
            },
        }
        if rule.get("host"):
            route["spec"]["hostnames"] = [rule["host"]]
        route["spec"]["rules"] = route_rules
        routes.append(route)
    return routes


def extract_tls_listeners(ingress_list: list) -> list:
    """Collect TLS settings from Ingresses as Gateway HTTPS listeners."""
    listeners = {}
    for ingress in ingress_list:
        namespace = ingress.get("metadata", {}).get("namespace", "default")
        for tls_config in ingress.get("spec", {}).get("tls", []):
            secret_name = tls_config.get("secretName", "")
            listener_key = f"https-{secret_name}"
            if listener_key not in listeners:
                listeners[listener_key] = {
                    "name": listener_key,
                    "port": 443,
                    "protocol": "HTTPS",
                    "tls": {
                        "mode": "Terminate",
                        "certificateRefs": [{
                            "kind": "Secret",
                            "name": secret_name,
                            "namespace": namespace,
                        }],
                    },
                    "allowedRoutes": {"namespaces": {"from": "All"}},
                }
    return list(listeners.values())


def build_secret_grant(secret_ns: str, gateway_ns: str) -> dict:
    """ReferenceGrant letting Gateways in gateway_ns use Secrets in secret_ns."""
    return {
        "apiVersion": "gateway.networking.k8s.io/v1beta1",
        "kind": "ReferenceGrant",
        "metadata": {"name": "ingress-migration-grant", "namespace": secret_ns},
        "spec": {
            "from": [{
                "group": "gateway.networking.k8s.io",
                "kind": "Gateway",
                "namespace": gateway_ns,
            }],
            "to": [{"group": "", "kind": "Secret"}],
        },
    }


def main():
    parser = argparse.ArgumentParser(description="Ingress → Gateway API migration tool")
    parser.add_argument("--file", "-f", required=True, help="path to the Ingress YAML file")
    parser.add_argument("--gateway", "-g", default="prod-gateway", help="target Gateway name")
    parser.add_argument("--gateway-ns", default="gateway-system", help="Gateway namespace")
    parser.add_argument("--output", "-o", default="-", help="output file (default: stdout)")
    args = parser.parse_args()

    with open(args.file, "r", encoding="utf-8") as f:
        ingress_docs = list(yaml.safe_load_all(f))

    # Keep only Ingress resources
    ingresses = [doc for doc in ingress_docs if doc and doc.get("kind") == "Ingress"]
    if not ingresses:
        print("❌ No Ingress resources found", file=sys.stderr)
        sys.exit(1)

    # Convert every Ingress
    results = []
    for ing in ingresses:
        results.extend(convert_ingress_to_http_routes(ing, args.gateway, args.gateway_ns))
        ns = ing.get("metadata", {}).get("namespace", "default")
        name = ing.get("metadata", {}).get("name", "unknown")
        print(f"✅ Converted {ns}/{name} → HTTPRoute", file=sys.stderr)

    # TLS Secrets outside the Gateway namespace need a ReferenceGrant there
    listeners = extract_tls_listeners(ingresses)
    secret_namespaces = sorted({
        ref["namespace"]
        for listener in listeners
        for ref in listener["tls"]["certificateRefs"]
        if ref["namespace"] != args.gateway_ns
    })
    for ns in secret_namespaces:
        results.append(build_secret_grant(ns, args.gateway_ns))
    print(f"✅ Generated {len(secret_namespaces)} ReferenceGrant(s)", file=sys.stderr)
    if listeners:
        print("ℹ️  Add these listeners to the Gateway, and make sure its "
              "allowedRoutes admits the route namespaces:", file=sys.stderr)
        print(yaml.dump(listeners, sort_keys=False), file=sys.stderr)

    # Output
    output = "---\n".join(
        yaml.dump(r, default_flow_style=False, sort_keys=False, allow_unicode=True)
        for r in results
    )
    if args.output == "-":
        print(output)
    else:
        with open(args.output, "w", encoding="utf-8") as f:
            f.write(output)
        print(f"📄 Output written to {args.output}", file=sys.stderr)


if __name__ == "__main__":
    main()
5.4 Migration Validation Checklist
After the migration, verify the following:
#!/bin/bash
# migration-validation.sh - Gateway API migration validation script
GATEWAY_NAME="prod-gateway"
GATEWAY_NS="gateway-system"

echo "🔍 Gateway API migration validation"
echo "========================"

# 1. Check GatewayClass status
echo ""
echo "1️⃣ Checking GatewayClass status..."
kubectl get gatewayclass -o wide

# 2. Check Gateway status and addresses
echo ""
echo "2️⃣ Checking Gateway status..."
kubectl get gateway -n "${GATEWAY_NS}" "${GATEWAY_NAME}" -o yaml | \
  grep -A 5 "addresses:"

# 3. Check the status of every listener
echo ""
echo "3️⃣ Checking listener status..."
kubectl get gateway -n "${GATEWAY_NS}" "${GATEWAY_NAME}" -o json | \
  jq -r '.status.listeners[]? | "  \(.name): attachedRoutes=\(.attachedRoutes), conditions=\([.conditions[]? | "\(.type)=\(.status)"] | join(", "))"'

# 4. Check HTTPRoute attachment
echo ""
echo "4️⃣ Checking HTTPRoute attachment..."
kubectl get httproute --all-namespaces -o wide

# 5. Look for HTTPRoutes that were not accepted
echo ""
echo "5️⃣ Checking for unattached HTTPRoutes..."
kubectl get httproute --all-namespaces -o json | \
  jq -r '.items[] | select(.status.parents[]?.conditions[]? | .type == "Accepted" and .status != "True") | "  ⚠️ \(.metadata.namespace)/\(.metadata.name)"'

# 6. Look for leftover Ingress resources
echo ""
echo "6️⃣ Checking for leftover Ingress resources..."
INGRESS_COUNT=$(kubectl get ingress --all-namespaces --no-headers 2>/dev/null | wc -l)
if [ "$INGRESS_COUNT" -gt 0 ]; then
  echo "  ⚠️ Found ${INGRESS_COUNT} leftover Ingress resources:"
  kubectl get ingress --all-namespaces
else
  echo "  ✅ No leftover Ingress resources"
fi

# 7. End-to-end connectivity test
echo ""
echo "7️⃣ End-to-end connectivity test..."
# The assigned address is reported in status, not in spec
GATEWAY_IP=$(kubectl get gateway -n "${GATEWAY_NS}" "${GATEWAY_NAME}" -o jsonpath='{.status.addresses[0].value}')
if [ -n "$GATEWAY_IP" ]; then
  HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "https://${GATEWAY_IP}" -k 2>/dev/null)
  if [ "$HTTP_CODE" = "000" ]; then
    echo "  ⚠️ Cannot connect to the Gateway (${GATEWAY_IP})"
  else
    echo "  ✅ Gateway reachable, HTTP status code: ${HTTP_CODE}"
  fi
else
  echo "  ⚠️ No Gateway IP found; check the cloud provider's LoadBalancer status"
fi

echo ""
echo "========================"
echo "Validation complete"
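Chapter 6: Gateway API and AI Inference Gateways
6.1 Why the Inference Extension Exists
As large language model (LLM) inference services were deployed to Kubernetes clusters at scale, traditional gateways showed serious gaps when handling AI inference traffic:
- Inference requests are unusual: they tend to be large (context windows), slow, and hold GPU resources for a long time, so they need dedicated scheduling and rate limiting.
- Token-level rate limiting: traditional gateways limit by request count, but AI inference needs limits based on token counts.
- Model routing: requests must be routed to different GPU clusters according to the model named in the request.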
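Token-level rate limiting differs from request-level limiting only in what is subtracted from the budget. The Python sketch below is a plain token bucket charged by LLM token count rather than by request; it illustrates the idea and is not code from any gateway. The class name is an assumption, and the clock is passed in explicitly so the behavior is deterministic.

```python
class TokenBudget:
    """Token bucket that is charged per LLM token rather than per request."""

    def __init__(self, tokens_per_minute, now=0.0):
        self.capacity = float(tokens_per_minute)
        self.refill_per_second = tokens_per_minute / 60.0
        self.available = self.capacity
        self.last = now

    def allow(self, tokens, now):
        """Admit a request costing `tokens` at time `now` (in seconds)."""
        elapsed = now - self.last
        self.available = min(self.capacity,
                             self.available + elapsed * self.refill_per_second)
        self.last = now
        if tokens <= self.available:
            self.available -= tokens
            return True
        return False


budget = TokenBudget(tokens_per_minute=600)
print(budget.allow(500, now=0))   # True: 500 of 600 tokens used
print(budget.allow(200, now=0))   # False: only 100 tokens left
print(budget.allow(200, now=10))  # True: 10 s refilled 100 tokens
```

A request-count limiter would have treated all three calls as one request each; charging by tokens means one large prompt can consume the budget of many small ones.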
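To address this, the Gateway API community introduced the Inference Extension project in 2025, which adds inference-aware routing on top of Gateway API. The project's own CRDs (such as InferencePool) differ from the configuration below, which is only a simplified illustration of the kind of policy an inference gateway needs to express:
# Illustrative only: InferenceRoute and its filters are NOT the CRDs shipped by
# the Inference Extension project. Check the project's documentation for the
# actual resources and API versions.
apiVersion: gateway.networking.k8s.io/v1alpha3
kind: InferenceRoute
metadata:
  name: llm-inference-route
  namespace: ai-inference
spec:
  parentRefs:
    - name: ai-gateway
      namespace: gateway-system
  rules:
    - matches:
        - model: "deepseek-v4"
        - model: "deepseek-v4-flash"
      backendRefs:
        - name: deepseek-v4-cluster
          port: 8000
          weight: 80
        - name: deepseek-v4-flash-cluster
          port: 8000
          weight: 20
      filters:
        - type: InferenceTokenLimit
          inferenceTokenLimit:
            maxInputTokens: 128000
            maxOutputTokens: 8192
        - type: InferenceRateLimit
          inferenceRateLimit:
            tokensPerMinute: 500000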
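6.2 Higress as an AI Inference Gateway
Higress, the gateway open-sourced by Alibaba, was among the early implementations in China to support the Gateway API Inference Extension and is tuned for domestic AI inference workloads: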
# Example Higress AI inference gateway configuration
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: higress-ai-gateway
  namespace: higress-system
spec:
  gatewayClassName: higress
  listeners:
    - name: ai-inference
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: ai-gateway-tls
            namespace: higress-system
      allowedRoutes:
        namespaces:
          from: Selector
          selector:
            matchLabels:
              ai-inference: "enabled"
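Chapter 7: Observability and Operations
7.1 The Status Field of Gateway API Resources
One strength of Gateway API is that every resource carries a detailed status, so operators can see exactly what state it is in: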
# Show detailed Gateway status
kubectl describe gateway prod-gateway -n gateway-system
# Key status information:
#   Conditions:
#     Type:    Accepted | Programmed | Ready
#     Status:  True/False
#     Reason & Message

# Show HTTPRoute attachment status
kubectl get httproute -A -o custom-columns=\
NAMESPACE:.metadata.namespace,\
NAME:.metadata.name,\
GATEWAY:.spec.parentRefs[0].name,\
ACCEPTED:".status.parents[0].conditions[?(@.type==\"Accepted\")].status",\
RESOLVED:".status.parents[0].conditions[?(@.type==\"ResolvedRefs\")].status"
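The same check can be scripted against the JSON that kubectl get httproute -A -o json returns. The Python sketch below walks status.parents[].conditions and reports every route whose Accepted or ResolvedRefs condition is missing or not True. The function name and the sample data are assumptions for illustration; the status layout follows the HTTPRoute API.

```python
def find_route_problems(route_list):
    """Return (route, parent, condition type, reason) for unhealthy routes."""
    problems = []
    for item in route_list.get("items", []):
        route = f'{item["metadata"]["namespace"]}/{item["metadata"]["name"]}'
        for parent in item.get("status", {}).get("parents", []):
            conditions = {c["type"]: c for c in parent.get("conditions", [])}
            for ctype in ("Accepted", "ResolvedRefs"):
                cond = conditions.get(ctype)
                if cond is None or cond["status"] != "True":
                    reason = cond["reason"] if cond else "Missing"
                    problems.append(
                        (route, parent["parentRef"]["name"], ctype, reason))
    return problems


# Hypothetical sample shaped like `kubectl get httproute -A -o json`
sample = {"items": [
    {"metadata": {"namespace": "payment", "name": "payment-service-route"},
     "status": {"parents": [{
         "parentRef": {"name": "production-gateway"},
         "conditions": [
             {"type": "Accepted", "status": "True", "reason": "Accepted"},
             {"type": "ResolvedRefs", "status": "False", "reason": "BackendNotFound"},
         ]}]}},
]}
for problem in find_route_problems(sample):
    print(problem)
# ('payment/payment-service-route', 'production-gateway', 'ResolvedRefs', 'BackendNotFound')
```

In practice the input would come from json.load over the kubectl output, and a non-empty result could fail a CI step or raise an alert.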
7.2 Collecting Metrics
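# Prometheus ServiceMonitor configuration
# Adjust the selector and port to the Service that exposes the Envoy proxy
# stats endpoint in your installation; /stats/prometheus is served by the
# proxies, not by the controller.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: envoy-gateway-metrics
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: envoy-gateway
  namespaceSelector:
    matchNames:
      - envoy-gateway-system
  endpoints:
    - port: metrics
      interval: 15s
      path: /stats/prometheus
      # Filtering on __name__ must happen after the scrape, so it belongs in
      # metricRelabelings rather than relabelings
      metricRelabelings:
        - sourceLabels: [__name__]
          regex: "envoy_(http|cluster|listener)_.+"
          action: keep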
Key metrics to monitor:
# Grafana dashboard panels (key metrics), using Envoy's Prometheus metric names
dashboard:
  panels:
    - title: "Request rate (RPS)"
      expr: sum(rate(envoy_http_downstream_rq_total[5m])) by (envoy_http_conn_manager_prefix)
    - title: "P99 latency (ms)"
      expr: histogram_quantile(0.99, sum(rate(envoy_http_downstream_rq_time_bucket[5m])) by (le))
    - title: "Error rate (%)"
      expr: sum(rate(envoy_http_downstream_rq_xx{envoy_response_code_class="5"}[5m])) / sum(rate(envoy_http_downstream_rq_total[5m])) * 100
    - title: "Active upstream connections"
      expr: sum(envoy_cluster_upstream_cx_active) by (envoy_cluster_name)
    - title: "Gateway API resource status"
      # Placeholder: the metric name depends on the exporter you deploy for
      # Gateway API resource state
      expr: gateway_api_accepted_routes{gateway="prod-gateway"}
7.3 Log Aggregation and Structured Logging
# Envoy Gateway access log configuration
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: production-config
  namespace: gateway-system
spec:
  telemetry:
    accessLog:
      settings:
        - format:
            type: Text
            text: |
              [%START_TIME%] "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%"
              response_code=%RESPONSE_CODE% response_flags="%RESPONSE_FLAGS%"
              bytes_received=%BYTES_RECEIVED% bytes_sent=%BYTES_SENT%
              duration=%DURATION% upstream_service_time="%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%"
              x_forwarded_for="%REQ(X-FORWARDED-FOR)%" user_agent="%REQ(USER-AGENT)%"
              request_id="%REQ(X-REQUEST-ID)%" route_name="%ROUTE_NAME%"
              upstream_cluster="%UPSTREAM_CLUSTER%"
          sinks:
            - type: File
              file:
                path: /dev/stdout
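Chapter 8: Performance Tuning in Practice
8.1 TLS Optimization
TLS handshakes are often one of the main costs at a gateway. A tuned configuration follows: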
# Envoy Gateway configures client-facing TLS through ClientTrafficPolicy
# rather than by patching Envoy's static resources.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: tls-optimized-config
  namespace: gateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: prod-gateway
  tls:
    minVersion: "1.3"  # TLS 1.3 only
    # TLS 1.3 cipher suites are fixed by Envoy's TLS library; a cipher list
    # only affects TLS 1.2 and below, so none is set here.
    alpnProtocols:
      - h2
      - http/1.1
    # Session resumption settings vary by Envoy Gateway version; check the
    # ClientTrafficPolicy reference before tuning them.
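TLS optimization results (from our load tests):
| Item | Before | After | Change |
|---|---|---|---|
| TLS 1.2 handshake latency | 3.2ms | - | - |
| TLS 1.3 handshake latency | 1.8ms | 1.0ms (0-RTT) | 44%↓ |
| Session reuse rate | 0% | 85% | - |
| New connections per second | 5,000 | 15,000 | 200%↑ |
8.2 Connection Pool Tuning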
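# Field names follow recent Envoy Gateway releases; verify them against the
# BackendTrafficPolicy reference for your version.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: connection-pool
  namespace: my-app
spec:
  # Policies target a Gateway or Route in the same namespace, not a Service
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: my-api-route
  connection:
    bufferLimit: 32Ki  # 32KB per-connection buffer
  circuitBreaker:
    maxConnections: 10000  # upper bound on connections to the backend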
8.3 HTTP/2 and HTTP/3 Configuration
# Gateway serving HTTP/2 over TLS; HTTP/3 is enabled by the policy that follows
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: http3-gateway
  namespace: gateway-system
spec:
  gatewayClassName: envoy-gateway
  listeners:
    - name: https-h2
      port: 443
      protocol: HTTPS
      hostname: "*.example.com"
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: wildcard-tls
            namespace: cert-manager   # cross-namespace reference; needs a ReferenceGrant in cert-manager
      allowedRoutes:
        namespaces:
          from: All
---
# Two listeners cannot share the same port, protocol, and hostname, so HTTP/3
# is switched on for the HTTPS listener through Envoy Gateway's ClientTrafficPolicy.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: http3-profile
  namespace: gateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: http3-gateway
  http3: {}
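One way to check whether HTTP/3 is actually being offered is to look for an `Alt-Svc` response header with an `h3` entry, which is the usual advertisement mechanism. The sketch below assumes the gateway advertises HTTP/3 this way; `advertises_h3` is a hypothetical helper, and the hostname in the usage note is a placeholder.

```shell
#!/usr/bin/env bash
# advertises_h3: read HTTP response headers on stdin and succeed (exit 0)
# if an Alt-Svc header contains an "h3" entry, e.g.  alt-svc: h3=":443"; ma=86400
advertises_h3() {
  awk 'tolower($0) ~ /^alt-svc:/ && tolower($0) ~ /[ ,:]h3(-[0-9]+)?=/ { found = 1 }
       END { exit !found }'
}
```

Typical use is `curl -sI https://app.example.com | advertises_h3 && echo "HTTP/3 advertised"`. A client only switches to QUIC after seeing this header, so its absence usually means HTTP/3 is not enabled or UDP port 443 is not exposed.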
Chapter 9: Kubernetes v1.36 and the Latest Gateway API Changes
9.1 Key Changes in v1.36
Kubernetes v1.36, released in April 2026, is a milestone release with major implications for the gateway ecosystem:
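Ingress NGINX officially retired
This is a historic moment. Kubernetes SIG Network announced that Ingress NGINX has entered its retirement process:
- No new feature PRs are accepted
- Only security fixes are accepted (for 6 months)
- The repository is archived after 12 months
- All users are advised to migrate to Gateway API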
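Gateway API CRD version
Gateway API CRDs are versioned and installed separately from Kubernetes itself, so a v1.36 cluster still needs the CRD bundle applied. This guide targets Gateway API v1.3.0, where the resources stand as follows:
- GRPCRoute (v1): GA in the standard channel
- TCPRoute and UDPRoute: still v1alpha2 in the experimental channel
- ReferenceGrant: fully replaces the earlier ReferencePolicy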
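New: BackendLBPolicy
BackendLBPolicy standardizes backend load-balancing policy configuration. Confirm the apiVersion and field names against the CRD version installed in your cluster before applying: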
apiVersion: gateway.networking.k8s.io/v1
kind: BackendLBPolicy
metadata:
  name: consistent-hash-lb
  namespace: my-app
spec:
  targetRef:
    group: ""
    kind: Service
    name: session-service
  type: ConsistentHash
  consistentHash:
    hashPolicies:
      - type: Header
        header:
          name: X-Session-ID
9.2 Upgrade Notes
#!/bin/bash
# Pre-upgrade gateway checklist for Kubernetes v1.36
echo "📋 Gateway checks before upgrading to Kubernetes v1.36"

# 1. Inventory Ingress usage
echo ""
echo "1️⃣ Ingress resources in the cluster:"
kubectl get ingress --all-namespaces
INGRESS_COUNT=$(kubectl get ingress --all-namespaces -o json | jq '.items | length')
echo "   Total: ${INGRESS_COUNT}"

# 2. Inventory IngressClass usage
echo ""
echo "2️⃣ IngressClass configuration:"
kubectl get ingressclass

# 3. Check which Gateway API CRDs are installed
echo ""
echo "3️⃣ Gateway API CRDs:"
kubectl get crd | grep gateway.networking.k8s.io

# 4. Check for use of removed APIs
echo ""
echo "4️⃣ Use of removed APIs:"
# networking.k8s.io/v1beta1 Ingress was removed in v1.22.
# The API server serves Ingress objects as networking.k8s.io/v1, so this
# normally prints nothing; also search manifests in Git and Helm charts for v1beta1.
kubectl get ingress --all-namespaces -o json | jq -r '.items[] | select(.apiVersion == "networking.k8s.io/v1beta1") | "\(.metadata.namespace)/\(.metadata.name) uses the removed v1beta1 API"' 2>/dev/null

# 5. Check for externalIPs usage (deprecated in v1.36)
echo ""
echo "5️⃣ Services using externalIPs (deprecated in v1.36):"
kubectl get svc --all-namespaces -o json | jq -r '.items[] | select(.spec.externalIPs and (.spec.externalIPs | length > 0)) | "\(.metadata.namespace)/\(.metadata.name): \(.spec.externalIPs | join(", "))"' 2>/dev/null

echo ""
echo "✅ Checks complete"
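Step 3 of the checklist lists the installed CRDs but does not compare their version with the one the migration expects. The sketch below adds a plain MAJOR.MINOR.PATCH comparison; `version_ge` is a hypothetical helper and does not understand pre-release suffixes such as `-rc.1`.

```shell
#!/usr/bin/env bash
# version_ge INSTALLED REQUIRED: succeed if INSTALLED >= REQUIRED.
# Both arguments are MAJOR.MINOR.PATCH with an optional leading "v".
version_ge() {
  local IFS=.
  # Unquoted on purpose: IFS=. splits the version into its numeric parts.
  local -a have=(${1#v}) want=(${2#v})
  local i x y
  for i in 0 1 2; do
    x=${have[i]:-0}
    y=${want[i]:-0}
    if (( x > y )); then return 0; fi
    if (( x < y )); then return 1; fi
  done
  return 0
}
```

Gateway API CRDs record their release in the `gateway.networking.k8s.io/bundle-version` annotation, so the installed value can be read with `kubectl get crd gateways.gateway.networking.k8s.io -o jsonpath='{.metadata.annotations.gateway\.networking\.k8s\.io/bundle-version}'` and passed as the first argument, for example `version_ge "$INSTALLED" v1.3.0 || echo "CRDs need upgrading"`.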
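Chapter 10: Summary and Outlook
10.1 The Core Value of Gateway API
Looking back over this guide, the core value of Gateway API comes down to the following points:
- Standardization: a single API definition removes vendor lock-in, so configuration is portable across controllers.
- Role separation: the four-layer resource model of GatewayClass, Gateway, Route, and ReferenceGrant maps onto how enterprise teams are actually organized.
- Expressiveness: native support for HTTP, HTTPS, TCP, UDP, and gRPC, plus advanced routing such as header matching, traffic weighting, and URL rewriting.
- Future-facing: the Inference Extension supports AI inference gateways, BackendLBPolicy supports custom load balancing, and ExtensionRef provides a hook for implementation-specific extensions.
- Observability: detailed Status fields, standardized monitoring metrics, and rich condition states.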
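10.2 Choosing a Controller
| Scenario | Recommended controller | Rationale |
|---|---|---|
| Greenfield project, no legacy constraints | Envoy Gateway | Active community, strong performance, broadest API coverage |
| Cilium CNI already in place | Cilium Gateway | eBPF data plane, very high performance |
| Istio service mesh already in place | Istio Gateway | Deep integration with mTLS and traffic management |
| Existing NGINX operations experience | NGINX Gateway Fabric | Operational knowledge carries over, lowest migration cost |
| AI inference gateway needed | Higress | Strongest ecosystem in China, solid Inference Extension support |
| Heavy API management requirements | Kong Gateway | Rich plugin ecosystem, strong authentication and authorization |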
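10.3 Action Roadmap
Now → within 30 days:
├── Assess the complexity of the existing Ingress configuration
├── Choose a suitable Gateway controller
├── Install and test it in a non-production environment
└── Write migration scripts and a validation plan
30 → 90 days:
├── Migrate non-critical services to Gateway API
├── Validate monitoring, logging, and alerting
└── Train the team and write documentation
90 → 180 days:
├── Migrate critical services step by step
├── Keep the old Ingress resources but stop adding configuration to them
└── Compare performance and tune
After 180 days:
├── Delete all remaining Ingress resources
├── Evaluate advanced features (canary releases, multi-tenancy, AI inference gateways)
└── Complete the migration
The retirement bell for Ingress NGINX has rung, and Gateway API reached production-grade maturity with v1.3.0. For every team building on Kubernetes, this migration is a question of when, not whether.
Now is the best time to start.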
Appendix
A. kubectl Command Cheat Sheet
# List all Gateway API resources
kubectl get gatewayclasses
kubectl get gateways --all-namespaces
kubectl get httproutes --all-namespaces
kubectl get tcproutes --all-namespaces
kubectl get udproutes --all-namespaces
kubectl get grpcroutes --all-namespaces
kubectl get referencegrants --all-namespaces

# Show detailed Gateway status
kubectl describe gateway <name> -n <namespace>

# Show the binding status of an HTTPRoute
kubectl get httproute <name> -n <namespace> -o jsonpath='{.status.parents[0].conditions}' | jq

# Show the controller behind each GatewayClass
kubectl get gatewayclass -o custom-columns='NAME:.metadata.name,CONTROLLER:.spec.controllerName,ACCEPTED:.status.conditions[?(@.type=="Accepted")].status'

# Summarize the binding status of every route
kubectl get httproutes --all-namespaces -o json | \
  jq -r '.items[] | "\(.metadata.namespace)/\(.metadata.name): \(
    [.status.parents[]? | select(.conditions[]? | .type == "Accepted" and .status == "True")] | length
  ) accepted, \(
    [.status.parents[]? | select(.conditions[]? | .type == "Accepted" and .status != "True")] | length
  ) rejected"'
B. A Complete Getting-Started Example (Deploying from Scratch)
# 1. Create a test namespace
kubectl create ns demo
kubectl label ns demo gateway-access=allowed

# 2. Deploy the sample application
kubectl apply -n demo -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: echo-server
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
        - name: echo-server
          image: hashicorp/http-echo:0.2.3
          args: ["-text", "Hello from Gateway API!"]
          ports:
            - containerPort: 5678
---
apiVersion: v1
kind: Service
metadata:
  name: echo-server
spec:
  selector:
    app: echo-server
  ports:
    - port: 80
      targetPort: 5678
EOF

# 3. Create the Gateway
kubectl apply -n demo -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: demo-gateway
spec:
  gatewayClassName: envoy-gateway
  listeners:
    - name: http
      port: 80
      protocol: HTTP
      allowedRoutes:
        namespaces:
          from: Selector
          selector:
            matchLabels:
              gateway-access: "allowed"
EOF

# 4. Create the HTTPRoute
kubectl apply -n demo -f - <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: echo-route
spec:
  parentRefs:
    - name: demo-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /echo
      backendRefs:
        - name: echo-server
          port: 80
EOF
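
# 5. Test (wait until the Gateway is programmed and has an address)
kubectl wait --for=condition=Programmed gateway/demo-gateway -n demo --timeout=120s
GATEWAY_IP=$(kubectl get gateway demo-gateway -n demo -o jsonpath='{.status.addresses[0].value}')
echo "Gateway IP: ${GATEWAY_IP}"
curl "http://${GATEWAY_IP}/echo"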
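
The first request can fail while the load balancer address and the route are still propagating, so wrapping the test in a retry loop makes the walkthrough less flaky. This is a generic sketch; `retry` is a hypothetical helper, not a kubectl or curl feature.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY_SECONDS COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY_SECONDS between tries; fail after ATTEMPTS tries.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if (( n >= attempts )); then
      return 1          # out of attempts: report failure
    fi
    n=$(( n + 1 ))
    sleep "$delay"
  done
  return 0
}
```

For the example above, `retry 30 2 curl -fsS "http://${GATEWAY_IP}/echo"` polls for up to about a minute; `-f` makes curl return a non-zero status on HTTP errors, so the loop keeps trying until the route answers successfully.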