In today's fast-paced software development world, automation is essential. In this article, I'll show you how to build a comprehensive DevOps pipeline that can:
- Provision a complete AWS infrastructure with Terraform.
- Deploy an EKS cluster with private and public subnets, RDS PostgreSQL, and a fully configured networking stack.
- Install key Kubernetes components via Helm, including cert-manager (integrated with Let's Encrypt), ingress-nginx, ArgoCD, and SonarQube.
- Automate infrastructure and application workflows with GitLab CI/CD pipelines.
Let's dive in!
Architecture Overview
Our solution consists of the following layers:
1. Infrastructure Provisioning:
Using Terraform modules, we will create:
- A VPC with public/private subnets, route tables, a NAT gateway, an Internet Gateway, and an Elastic IP.
- An EKS cluster with worker nodes (using the well-established Terraform module).
- An RDS PostgreSQL instance for persistent storage.
- DNS management in Route53 for certificate validation.
2. Kubernetes Deployments via Helm:
Once the cluster is running, we will deploy:
- cert-manager (with a ClusterIssuer configured for Let's Encrypt) to manage TLS certificates.
- ingress-nginx as our ingress controller.
- ArgoCD for GitOps-based continuous delivery.
- SonarQube for static code analysis.
3. CI/CD Pipelines with GitLab:
We set up two GitLab pipelines:
- One to manage the Terraform code (init, validate, plan, and apply).
- Another to build, test (via SonarQube), and deploy the application (from new-app).
1. Provisioning AWS Infrastructure with Terraform
We start by creating a modular Terraform project. Here is the simplified project structure:
.
├── modules
│   ├── vpc
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── eks
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── rds
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── helm
│   ├── ingress-nginx-values.yaml
│   ├── argocd-values.yaml
│   ├── sonarqube-values.yaml
│   └── cert-manager-values.yaml
├── main.tf
├── variables.tf
├── outputs.tf
└── providers.tf
providers.tf
We configure the AWS, Kubernetes, and Helm providers. Note that the Kubernetes provider uses outputs from our EKS module:
// providers.tf
provider "aws" {
  region = var.aws_region
}

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}

provider "helm" {
  kubernetes {
    host                   = module.eks.cluster_endpoint
    cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
    token                  = data.aws_eks_cluster_auth.cluster.token
  }
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_name
}
variables.tf
Define the variables for the AWS, VPC, EKS, and RDS configuration:
// variables.tf
variable "aws_region" {
  description = "AWS region for resources"
  type        = string
  default     = "us-west-2"
}

variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "public_subnet_cidrs" {
  description = "List of public subnet CIDRs"
  type        = list(string)
  default     = ["10.0.1.0/24", "10.0.2.0/24"]
}

variable "private_subnet_cidrs" {
  description = "List of private subnet CIDRs"
  type        = list(string)
  default     = ["10.0.101.0/24", "10.0.102.0/24"]
}

variable "availability_zones" {
  description = "List of availability zones to use"
  type        = list(string)
  default     = ["us-west-2a", "us-west-2b"]
}

variable "cluster_name" {
  description = "EKS Cluster name"
  type        = string
  default     = "my-eks-cluster"
}

variable "db_username" {
  description = "RDS PostgreSQL username"
  type        = string
}

variable "db_password" {
  description = "RDS PostgreSQL password"
  type        = string
  sensitive   = true
}

variable "db_name" {
  description = "RDS PostgreSQL database name"
  type        = string
  default     = "mydatabase"
}
main.tf (Root Module)
In the root module we call the child modules and define the Helm deployments:
// main.tf
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.0"
    }
  }
}

// VPC Module
module "vpc" {
  source               = "./modules/vpc"
  vpc_cidr             = var.vpc_cidr
  public_subnet_cidrs  = var.public_subnet_cidrs
  private_subnet_cidrs = var.private_subnet_cidrs
  availability_zones   = var.availability_zones
  aws_region           = var.aws_region
}
// EKS Module (using a well-known module from the Terraform Registry)
// Note: version 18+ of this module renamed "subnets" to "subnet_ids" and
// "node_groups" to "eks_managed_node_groups"; the arguments below follow
// the v18+ input names.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = ">= 18.0.0"

  cluster_name    = var.cluster_name
  cluster_version = "1.24"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnet_ids

  eks_managed_node_groups = {
    default = {
      desired_size   = 2
      max_size       = 3
      min_size       = 1
      instance_types = ["t3.medium"]
      subnet_ids     = module.vpc.private_subnet_ids
    }
  }
}
// RDS Module for PostgreSQL
module "rds" {
  source             = "./modules/rds"
  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = module.vpc.private_subnet_ids
  db_username        = var.db_username
  db_password        = var.db_password
  db_name            = var.db_name
  aws_region         = var.aws_region
}

// Deploy cert-manager via Helm
resource "helm_release" "cert_manager" {
  name             = "cert-manager"
  repository       = "https://charts.jetstack.io"
  chart            = "cert-manager"
  version          = "v1.9.1"
  namespace        = "cert-manager"
  create_namespace = true

  set {
    name  = "installCRDs"
    value = "true"
  }
}

// Create a ClusterIssuer for Let's Encrypt (for automatic TLS certs)
resource "kubernetes_manifest" "letsencrypt_clusterissuer" {
  manifest = {
    "apiVersion" = "cert-manager.io/v1"
    "kind"       = "ClusterIssuer"
    "metadata" = {
      "name" = "letsencrypt-prod"
    }
    "spec" = {
      "acme" = {
        "email"  = "your-email@example.com" // Replace with your email
        "server" = "https://acme-v02.api.letsencrypt.org/directory"
        "privateKeySecretRef" = {
          "name" = "letsencrypt-prod"
        }
        "solvers" = [
          {
            "dns01" = {
              "route53" = {
                "region" = var.aws_region
                // Optionally, you can specify hosted zone details here.
              }
            }
          }
        ]
      }
    }
  }

  depends_on = [helm_release.cert_manager]
}
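// ---------------------------------------------------------------------------
// Added sketch (not in the original post): the Route53 dns01 solver above also
// needs IAM permissions for cert-manager (e.g. via IRSA), which this article
// does not cover. If you prefer to avoid that wiring, an http01 solver routed
// through the nginx ingress class is a common alternative. Names below are
// illustrative assumptions.
// ---------------------------------------------------------------------------
resource "kubernetes_manifest" "letsencrypt_http01_clusterissuer" {
  manifest = {
    "apiVersion" = "cert-manager.io/v1"
    "kind"       = "ClusterIssuer"
    "metadata"   = { "name" = "letsencrypt-http01" }
    "spec" = {
      "acme" = {
        "email"               = "your-email@example.com"
        "server"              = "https://acme-v02.api.letsencrypt.org/directory"
        "privateKeySecretRef" = { "name" = "letsencrypt-http01" }
        "solvers" = [
          {
            // Solve ACME challenges with temporary Ingress objects on nginx
            "http01" = { "ingress" = { "class" = "nginx" } }
          }
        ]
      }
    }
  }

  depends_on = [helm_release.cert_manager, helm_release.ingress_nginx]
}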
// Deploy ingress-nginx via Helm
resource "helm_release" "ingress_nginx" {
  name             = "ingress-nginx"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  version          = "4.7.0"
  namespace        = "ingress-nginx"
  create_namespace = true

  values = [file("${path.module}/helm/ingress-nginx-values.yaml")]
}

// Deploy SonarQube via Helm
resource "helm_release" "sonarqube" {
  name             = "sonarqube"
  repository       = "https://SonarSource.github.io/helm-chart-sonarqube"
  chart            = "sonarqube"
  version          = "9.6.0"
  namespace        = "sonarqube"
  create_namespace = true

  values = [file("${path.module}/helm/sonarqube-values.yaml")]
}

// Deploy ArgoCD via Helm with ingress exposure
resource "helm_release" "argocd" {
  name             = "argocd"
  repository       = "https://argoproj.github.io/argo-helm"
  chart            = "argo-cd"
  version          = "4.10.2"
  namespace        = "argocd"
  create_namespace = true

  values = [file("${path.module}/helm/argocd-values.yaml")]
}
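The project tree also lists a root outputs.tf, which isn't shown in the original post. A minimal sketch, assuming you only want to surface the values needed to reach the cluster and the database (the names mirror the module outputs used above):

// outputs.tf (sketch)
output "vpc_id" {
  description = "ID of the provisioned VPC"
  value       = module.vpc.vpc_id
}

output "cluster_endpoint" {
  description = "EKS API server endpoint"
  value       = module.eks.cluster_endpoint
}

output "rds_endpoint" {
  description = "Connection endpoint of the RDS PostgreSQL instance"
  value       = module.rds.rds_endpoint
}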
2. Terraform Modules
VPC Module
This module creates the VPC, subnets, NAT gateway, Internet Gateway, and the associated route tables.
modules/vpc/main.tf:
resource "aws_vpc" "this" {cidr_block = var.vpc_cidrenable_dns_support = trueenable_dns_hostnames = truetags = {Name = "terraform-vpc"}
}
resource "aws_internet_gateway" "this" {vpc_id = aws_vpc.this.idtags = {Name = "terraform-igw"}
}
resource "aws_subnet" "public" {count = length(var.public_subnet_cidrs)vpc_id = aws_vpc.this.idcidr_block = var.public_subnet_cidrs[count.index]availability_zone = element(var.availability_zones, count.index)map_public_ip_on_launch = truetags = {Name = "terraform-public-${count.index}"}
}
resource "aws_subnet" "private" {count = length(var.private_subnet_cidrs)vpc_id = aws_vpc.this.idcidr_block = var.private_subnet_cidrs[count.index]availability_zone = element(var.availability_zones, count.index)map_public_ip_on_launch = falsetags = {Name = "terraform-private-${count.index}"}
}
// NAT Gateway resources (using the first public subnet)
resource "aws_eip" "nat" {vpc = true
}
resource "aws_nat_gateway" "this" {allocation_id = aws_eip.nat.idsubnet_id = aws_subnet.public[0].idtags = {Name = "terraform-nat"}
}
// Public route table
resource "aws_route_table" "public" {vpc_id = aws_vpc.this.idroute {cidr_block = "0.0.0.0/0"gateway_id = aws_internet_gateway.this.id}tags = {Name = "terraform-public-rt"}
}
resource "aws_route_table_association" "public" {count = length(aws_subnet.public)subnet_id = aws_subnet.public[count.index].idroute_table_id = aws_route_table.public.id
}
// Private route table (using NAT Gateway)
resource "aws_route_table" "private" {vpc_id = aws_vpc.this.idroute {cidr_block = "0.0.0.0/0"nat_gateway_id = aws_nat_gateway.this.id}tags = {Name = "terraform-private-rt"}
}
resource "aws_route_table_association" "private" {count = length(aws_subnet.private)subnet_id = aws_subnet.private[count.index].idroute_table_id = aws_route_table.private.id
}
modules/vpc/variables.tf:
variable "vpc_cidr" {description = "VPC CIDR block"type = string
}
variable "public_subnet_cidrs" {description = "List of public subnet CIDRs"type = list(string)
}
variable "private_subnet_cidrs" {description = "List of private subnet CIDRs"type = list(string)
}
variable "availability_zones" {description = "List of availability zones"type = list(string)
}
variable "aws_region" {description = "AWS region"type = string
}
modules/vpc/outputs.tf:
output "vpc_id" {value = aws_vpc.this.id
}
output "public_subnet_ids" {value = aws_subnet.public[*].id
}
output "private_subnet_ids" {value = aws_subnet.private[*].id
}
EKS Module
We leverage the popular Terraform EKS module. You can wrap it as shown here, or call it directly from the root module.
modules/eks/main.tf:
module "eks" {source = "terraform-aws-modules/eks/aws"version = ">= 18.0.0"cluster_name = var.cluster_namecluster_version = var.cluster_versionsubnets = var.subnetsvpc_id = var.vpc_idnode_groups = var.node_groupstags = {Environment = "dev"}
}
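The original post labels the next block variables.tf, but it actually shows the wrapper's outputs; the input variables themselves are not included. A minimal variables.tf sketch, assuming only the arguments used above:

// modules/eks/variables.tf (sketch)
variable "cluster_name" {
  type = string
}

variable "cluster_version" {
  type    = string
  default = "1.24"
}

variable "vpc_id" {
  type = string
}

variable "subnets" {
  description = "Private subnet IDs for the cluster and node groups"
  type        = list(string)
}

variable "node_groups" {
  description = "Managed node group definitions, passed through to the EKS module"
  type        = any
  default     = {}
}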
modules/eks/outputs.tf:
output "cluster_name" {value = module.eks.cluster_name
}
output "cluster_endpoint" {value = module.eks.cluster_endpoint
}
output "cluster_certificate_authority_data" {value = module.eks.cluster_certificate_authority_data
}
RDS Module
This module provisions an RDS PostgreSQL instance in the private subnets.
modules/rds/main.tf:
resource "aws_db_subnet_group" "this" {name = "${var.db_name}-subnet-group"subnet_ids = var.private_subnet_idstags = {Name = "${var.db_name}-subnet-group"}
}
resource "aws_db_instance" "this" {allocated_storage = 20engine = "postgres"engine_version = "13.7"instance_class = "db.t3.micro"name = var.db_nameusername = var.db_usernamepassword = var.db_passworddb_subnet_group_name = aws_db_subnet_group.this.namevpc_security_group_ids = var.db_security_group_idsskip_final_snapshot = truepublicly_accessible = falsetags = {Name = "${var.db_name}-rds"}
}
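Note that db_security_group_ids defaults to an empty list and the root module does not pass one, so the instance would fall back to the VPC's default security group. A hedged sketch of a dedicated security group you could create in the root module and pass in; the name and the "allow PostgreSQL from anywhere inside the VPC" rule are illustrative assumptions:

resource "aws_security_group" "rds" {
  name        = "rds-postgres-sg"
  description = "Allow PostgreSQL from inside the VPC"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

You would then pass db_security_group_ids = [aws_security_group.rds.id] in the rds module call.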
modules/rds/variables.tf:
variable "vpc_id" {type = string
}
variable "private_subnet_ids" {type = list(string)
}
variable "db_username" {type = string
}
variable "db_password" {type = stringsensitive = true
}
variable "db_name" {type = string
}
variable "db_security_group_ids" {description = "List of security group IDs for the DB instance"type = list(string)default = []
}
modules/rds/outputs.tf:
output "rds_endpoint" {value = aws_db_instance.this.endpoint
}
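How the application consumes the database endpoint is left open in the original post. One option, sketched here under the assumption that your workloads read their connection settings from a Secret named app-db (the name and keys are illustrative), is to create the Secret from Terraform using the outputs above:

// Root module (sketch): expose the RDS connection details to the cluster
resource "kubernetes_secret" "app_db" {
  metadata {
    name      = "app-db"
    namespace = "default"
  }

  // The Kubernetes provider base64-encodes these values for you
  data = {
    DB_HOST     = module.rds.rds_endpoint
    DB_NAME     = var.db_name
    DB_USER     = var.db_username
    DB_PASSWORD = var.db_password
  }
}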
3. Helm Deployments in EKS
Our Helm releases manage the Kubernetes resources. The following values files (in the helm/ directory) customize the installations.
ingress-nginx-values.yaml
controller:
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
  admissionWebhooks:
    enabled: true
argocd-values.yaml
server:
  ingress:
    enabled: true
    ingressClassName: "nginx"
    hosts:
      - argocd.example.com # Replace with your domain
    annotations:
      kubernetes.io/ingress.class: "nginx"
      cert-manager.io/cluster-issuer: "letsencrypt-prod"
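The values above only expose the ArgoCD UI; to actually drive GitOps you register an Application that points at a repository of manifests. A hedged sketch (the repository URL, path, and target namespace are placeholders, not from the original post):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: new-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitlab.com/your-group/new-app.git   # placeholder
    targetRevision: master
    path: k8s
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true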
sonarqube-values.yaml
service:
  type: ClusterIP

ingress:
  enabled: true
  ingressClassName: "nginx"
  hosts:
    - sonarqube.example.com # Replace with your domain
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
cert-manager-values.yaml
If you need further customization of cert-manager, add it here. Our Helm release already enables the CRDs by setting installCRDs=true.
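For example, a minimal values sketch, assuming you only want to pin resource requests for the controller (purely illustrative; the file can also stay empty):

resources:
  requests:
    cpu: 50m
    memory: 64Mi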
4. GitLab CI/CD Pipelines
We now set up two pipelines, one for the Terraform code and one for the application. This separation keeps infrastructure changes and application deployments independent.
Terraform Pipeline (.gitlab-ci.terraform.yml)
# .gitlab-ci.terraform.yml
image: hashicorp/terraform:latest

stages:
  - init
  - validate
  - plan
  - apply

variables:
  TF_ROOT: "./terraform"
  TF_IN_AUTOMATION: "true"
  AWS_DEFAULT_REGION: "us-west-2"

before_script:
  - cd $TF_ROOT
  - terraform --version

terraform_init:
  stage: init
  script:
    - terraform init
  artifacts:
    paths:
      - .terraform/
    expire_in: 1 hour

terraform_validate:
  stage: validate
  script:
    - terraform validate

terraform_plan:
  stage: plan
  script:
    - terraform plan -out=tfplan
  artifacts:
    paths:
      - tfplan
    expire_in: 1 hour

terraform_apply:
  stage: apply
  when: manual
  script:
    - terraform apply -auto-approve tfplan
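Running Terraform from CI also implies remote state, which the original post doesn't configure. A minimal sketch of an S3 backend with DynamoDB locking, added to the root Terraform configuration (the bucket and table names are placeholders you would create beforehand):

terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket" # placeholder
    key            = "eks-platform/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "terraform-locks"           # placeholder
    encrypt        = true
  }
}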
Application Pipeline (.gitlab-ci.app.yml)
This pipeline validates the YAML files, builds a Docker image for your application, runs a SonarQube analysis, and deploys to Kubernetes:
# .gitlab-ci.app.yml
stages:
  - validate
  - build
  - test
  - deploy

# Validate YAML files
yaml_lint:
  stage: validate
  image: cytopia/yamllint:latest
  script:
    - yamllint .
  only:
    - merge_requests
    - master

# Build Docker image
build_app:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  variables:
    DOCKER_DRIVER: overlay2
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin $CI_REGISTRY
    - docker build -t $CI_REGISTRY_IMAGE:new-app .
    - docker push $CI_REGISTRY_IMAGE:new-app
  only:
    - merge_requests
    - master

# SonarQube analysis
sonarqube_scan:
  stage: test
  image: sonarsource/sonar-scanner-cli:latest
  script:
    - sonar-scanner -Dsonar.projectKey=new-app -Dsonar.sources=. -Dsonar.host.url=$SONAR_HOST_URL -Dsonar.login=$SONAR_TOKEN
  only:
    - merge_requests
    - master

# Deploy to Kubernetes
deploy_app:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    - kubectl apply -f k8s/
  only:
    - master
  environment:
    name: production
    url: http://your-app-domain.example.com
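The deploy job assumes a k8s/ directory of manifests in the repository, which the original post doesn't show. A minimal sketch of what it might contain, tying the built image, the nginx ingress, and the Let's Encrypt issuer together (the image reference, hostname, and port are placeholders):

# k8s/app.yaml (sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: new-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: new-app
  template:
    metadata:
      labels:
        app: new-app
    spec:
      containers:
        - name: new-app
          image: registry.gitlab.com/your-group/new-app:new-app   # placeholder
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: new-app
spec:
  selector:
    app: new-app
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: new-app
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - your-app-domain.example.com
      secretName: new-app-tls
  rules:
    - host: your-app-domain.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: new-app
                port:
                  number: 80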
5. Final Steps and Execution
1. Terraform workflow:
Initialize: terraform init
Validate: terraform validate
Plan: terraform plan -out=tfplan
Apply (with manual approval): terraform apply -auto-approve tfplan
2. Helm deployments:
Once the EKS cluster is ready, the Helm provider deploys cert-manager (with the Let's Encrypt ClusterIssuer), ingress-nginx, SonarQube, and ArgoCD.
Update your DNS (Route53) so that your domains (e.g. argocd.example.com, sonarqube.example.com) point to the ingress load balancer; a sketch of this step follows after this list.
3. GitLab CI/CD pipelines:
The Terraform pipeline automates infrastructure changes.
The application pipeline builds, tests (via SonarQube), and deploys your application to EKS.
4. Protect your secrets:
Configure the AWS credentials, the SonarQube token, and the Docker registry credentials as GitLab CI/CD variables.
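For the DNS step, here is a hedged Terraform sketch that points a hostname at the load balancer created by ingress-nginx. The controller Service name, the hosted zone ID, and the hostname are assumptions for illustration; in your environment they come from your chart values and your Route53 hosted zone:

// Read the NLB hostname that AWS assigned to the ingress-nginx controller Service
data "kubernetes_service" "ingress_nginx" {
  metadata {
    name      = "ingress-nginx-controller" // default name for a release called "ingress-nginx"
    namespace = "ingress-nginx"
  }
  depends_on = [helm_release.ingress_nginx]
}

// CNAME the ArgoCD hostname to the ingress load balancer
resource "aws_route53_record" "argocd" {
  zone_id = "Z0123456789EXAMPLE" // placeholder hosted zone ID
  name    = "argocd.example.com"
  type    = "CNAME"
  ttl     = 300
  records = [data.kubernetes_service.ingress_nginx.status[0].load_balancer[0].ingress[0].hostname]
}

A matching record for sonarqube.example.com follows the same pattern.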
Conclusion
This project provides a robust end-to-end solution that brings together infrastructure provisioning with Terraform, Kubernetes application management with Helm, and automated pipelines with GitLab CI/CD. By combining these tools you can deliver a cloud-native, secure, and scalable environment with minimal manual intervention.
Keep in mind that a real deployment will likely need further customization and security hardening (for example, tighter IAM policies or additional monitoring). Use this guide as a solid starting point and iterate as your project evolves.
Feel free to share your experience or ask questions in the comments, or reach out directly at Mohamed.ElEmam.Hussin@gmail.com or Mohamed ElEmam | LinkedIn. Happy coding and happy DevOps-ing!