17 posts tagged 'kubectl'

EKS version update

Daily/Prog 2023. 2. 14. 20:39

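* Full text of the AWS email

Starting February 15, 2023, Amazon EKS will no longer support Kubernetes 1.21. After that date, you will no longer be able to create new 1.21 clusters, and all EKS clusters running Kubernetes 1.21 will be updated to the latest platform version of Kubernetes 1.22.
...
We recommend updating your 1.21 clusters to Kubernetes 1.22 or later. You can minimize how often you need to perform version updates by updating your clusters directly to Kubernetes 1.24, the latest version.

This was my first EKS version update, and it was on a cluster that is in production, so downtime would be fatal for the company and fatal for my life expectancy... I kept putting it off while handling other work, and suddenly the due date was right around the corner... Fortunately we run separate development and production clusters, so I could use the development cluster to check whether any downtime occurs. To cut to the conclusion: with Fargate, there was no downtime.

Since I was touching it anyway, it would have been nice to go from 1.21 straight to 1.24 as the email suggests, but a cluster can only be updated one minor version at a time. So before updating, I went through the update notes for each version.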

https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html

Updating an Amazon EKS cluster Kubernetes version - Amazon EKS

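In summary:

Keep the kubectl client and server versions in sync
Check for v1beta APIs that are being removed and update the existing YAML files
Update the AWS Load Balancer Controller to 2.4.1 or later

That seems to be about it. kubectl also gets installed locally by tools such as Lens and Docker, so some setup is needed to make sure the latest version is the one that actually shows up. The kubectl version must be within one minor version (+/-1) of the EKS cluster's API version. For example, a 1.23 kubectl client works with Kubernetes 1.22, 1.23, and 1.24 clusters... or so they say, but I had no major issues using kubectl client 1.24 against server 1.21... Anyway, the newest versions (1.25+) seem to be stricter. The only YAML change I needed was one rbac.authorization.k8s.io/v1beta1 reference, and when I skipped the ALB Controller update, it went into an endless restart loop... heh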
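The first two checks can be sketched in shell. This is only a minimal sketch under a few assumptions: `minor_of`, `skew_ok`, and `list_beta_apis` are hypothetical helper names (not part of kubectl or eksctl), the major version is assumed to be 1, and the directory passed to `list_beta_apis` is wherever your manifests live.

```shell
#!/bin/sh
# Hypothetical helpers, not part of kubectl or eksctl.

# Print the minor number of a version such as "1.23", "v1.23.7" or "1.22+".
minor_of() (
  v=${1#v}                # drop a leading "v"
  v=${v#*.}               # drop the major part ("1.")
  v=${v%%.*}              # drop the patch part, if any
  echo "${v%%[!0-9]*}"    # drop suffixes such as "+"
)

# Succeed when client and server minor versions differ by at most one.
skew_ok() (
  d=$(( $(minor_of "$1") - $(minor_of "$2") ))
  [ "${d#-}" -le 1 ]      # ${d#-} strips the sign, giving the absolute value
)

# List manifest lines that still reference a v1beta API version.
list_beta_apis() {
  grep -rnE '^[[:space:]]*apiVersion:[[:space:]]*[^#[:space:]]*v1beta[0-9]+' "$1"
}

# Usage (versions and directory are examples):
#   skew_ok 1.23 1.22 && echo "supported skew"
#   list_beta_apis ./manifests
```

The version strings can be read off `kubectl version`; the helpers only compare them, so nothing here talks to the cluster.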

EKS v1.22 update steps

(I currently use Fargate for the project workloads and a node group for monitoring.)

1. Update the cluster version

$ eksctl upgrade cluster \
--name my-cluster \
--version 1.22 \
--approve
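Since a cluster moves only one minor version per update, getting from 1.21 to 1.24 means running this step once per version. A minimal sketch of that sequence, assuming major version 1; `upgrade_path` is a hypothetical helper, and the `eksctl` call in the usage comment reuses the flags shown above.

```shell
# Hypothetical helper: print every minor version after the current one,
# up to and including the target (1.x versions only).
# $1 = current version, $2 = target version.
upgrade_path() (
  m=$(( ${1#1.} + 1 ))
  while [ "$m" -le "${2#1.}" ]; do
    echo "1.$m"
    m=$(( m + 1 ))
  done
)

# Usage: one upgrade per printed version, checking the cluster in between.
#   for v in $(upgrade_path 1.21 1.24); do
#     eksctl upgrade cluster --name my-cluster --version "$v" --approve
#   done
```

In practice the remaining steps (node group, CRDs, controller, add-ons) belong between each pass, so a blind loop like the one in the comment is more of a checklist than something to run unattended.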

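2. Update the node group version (optional)

$ eksctl upgrade nodegroup \
--name=node-group-name \
--cluster=my-cluster \
--region=region-code \
--kubernetes-version=1.22

For the node group, extra pods showed up, as many as there were volumes, and I got a scare thinking it was an error, but about 10 minutes later, once the update finished, everything cleaned itself up. I did reinstall Prometheus, though..;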

3. Update the Custom Resources

$ helm repo update
$ kubectl apply -k "github.com/aws/eks-charts/stable/aws-load-balancer-controller/crds?ref=master"
customresourcedefinition.apiextensions.k8s.io/ingressclassparams.elbv2.k8s.aws configured
customresourcedefinition.apiextensions.k8s.io/targetgroupbindings.elbv2.k8s.aws configured

4. Update the AWS Load Balancer Controller to 2.4.1 or later

$ helm upgrade aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=my-cluster \
--set serviceAccount.create=false \
--set serviceAccount.name=aws-load-balancer-controller \
--set region=my-region \
--set vpcId=vpc-1234567890 \
--set image.repository=my-region-id.dkr.ecr.my-region.amazonaws.com/amazon/aws-load-balancer-controller

5. Update the [Add-ons] in the EKS console

Amazon VPC CNI
CoreDNS
kube-proxy

If you do not choose a conflict resolution option when updating, the error below occurs. Choosing [Override] in the options resolves it.

Conflicts found when trying to apply. Will not continue due to resolve conflicts mode. Conflicts: ConfigMap kube-proxy-config - .data.config DaemonSet.apps kube-proxy - .spec.template.spec.containers[name="kube-proxy"].image DaemonSet.apps kube-proxy - .spec.template.spec.containers[name="kube-proxy"].image
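The same update can also be done from the AWS CLI. As far as I can tell, the console's [Override] choice corresponds to the `OVERWRITE` value of `--resolve-conflicts`. A minimal sketch, where `update_addon_overwrite` is a hypothetical wrapper and the add-on version is left as a placeholder to look up:

```shell
# Hypothetical wrapper around `aws eks update-addon`.
# $1 = cluster name, $2 = add-on name (vpc-cni, coredns, kube-proxy),
# $3 = add-on version to install.
update_addon_overwrite() {
  aws eks update-addon \
    --cluster-name "$1" \
    --addon-name "$2" \
    --addon-version "$3" \
    --resolve-conflicts OVERWRITE
}

# Usage (the version is a placeholder; valid values can be listed with
# `aws eks describe-addon-versions --addon-name kube-proxy --kubernetes-version 1.22`):
#   update_addon_overwrite my-cluster kube-proxy <addon-version>
```

Overwriting discards any manual changes made to the add-on's managed fields, so it is worth checking the conflict list in the error first.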

6. Restart each project deployment (and confirm the pods land on nodes with the updated version)

$ kubectl rollout restart -n my-namespace deployment/my-project
...

EKS v1.23 update

From v1.23, if you use node groups (EC2), you need to install the EBS CSI driver. Create it from [Add-ons] in the EKS console. For the IAM role required at creation, see the page below.

https://docs.aws.amazon.com/eks/latest/userguide/csi-iam-role.html

Creating the Amazon EBS CSI driver IAM role for service accounts - Amazon EKS

If the EBS CSI driver is not installed with the proper permissions, volume mounts fail with an error like this:

failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-c1234a61-1234-4321-1234-12347d4f1234": could not create volume in EC2: NoCredentialProviders: no valid providers in chain caused by: EnvAccessKeyNotFound: failed to find credentials in the environment. SharedCredsLoad: failed to load profile, . EC2RoleRequestError: no EC2 instance role found caused by: RequestError: send request failed caused by: Get "http://111.111.111.111/latest/meta-data/iam/security-credentials/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

EKS v1.24 update

I did not change any settings for v1.24, but PSP (PodSecurityPolicy) is removed in v1.25, so it can be replaced with PSS (Pod Security Standards). The PSS setup is rather long, and I have no idea when I will get to v1.25, so I am putting it off until then.

https://aws.amazon.com/ko/blogs/containers/implementing-pod-security-standards-in-amazon-eks/

Implementing Pod Security Standards in Amazon EKS | Amazon Web Services

It looks like a simple update, but it was a truly nerve-racking job... lol

Attribution-NonCommercial-NoDerivs

WRITTEN BY

: 손가락귀신
If you don't get your act together, you get punished.

Adding an IAM user to EKS

Server/AWS 2022. 8. 4. 21:23

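In general, IAM users or roles other than the account that created the EKS cluster can be added with fine-grained permissions; here I went through a simple user addition. I can't remember why I installed eksctl back then, but thank you~

If eksctl is not installed... https://docs.aws.amazon.com/ko_kr/eks/latest/userguide/eksctl.html

Installing or updating eksctl - Amazon EKS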

With eksctl, you simply specify the cluster, the ARN of the IAM account to add, and the group referenced by the ClusterRoleBinding.

$ eksctl create iamidentitymapping \
    --cluster cluster-name \
    --region=ap-northeast-2 \
    --arn arn:aws:iam::111122223333:user/ggamzzak \
    --group system:masters \
    --profile my-profile

2022-08-02 17:25:32 [ℹ]  eksctl version 0.75.0
2022-08-02 17:25:32 [ℹ]  using region ap-northeast-2
2022-08-02 17:25:32 [ℹ]  adding identity "arn:aws:iam::111122223333:user/ggamzzak" to auth ConfigMap

Identities in the "system:masters" group are bound to the cluster-admin superuser role and can perform any operation.

Checking configmap/aws-auth shows that the entry has been added:

$ kubectl describe configmap -n kube-system aws-auth
Name:         aws-auth
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>

Data
====
mapUsers:
----
- groups:
  - system:masters
  userarn: arn:aws:iam::111122223333:user/ggamzzak 

mapRoles:
...

Or:

$ eksctl get iamidentitymapping --cluster my-cluster --region=ap-northeast-2 --profile my-profile
2022-08-02 20:52:23 [ℹ]  eksctl version 0.75.0
2022-08-02 20:52:23 [ℹ]  using region ap-northeast-2
ARN                                                             USERNAME                        GROUPS                                                                          ACCOUNT
...
arn:aws:iam::111122223333:user/ggamzzak                                                         system:masters

Now go test with the added IAM account~
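One way to run that test is to point kubeconfig at the new user's credentials and ask the API server what the identity may do. A minimal sketch, assuming the added user has an AWS CLI profile configured locally; `check_admin_access` and the profile name in the usage line are hypothetical.

```shell
# Hypothetical helper: write a kubeconfig entry for the given profile, then
# ask the API server whether that identity may perform any verb on any resource.
# $1 = cluster name, $2 = region, $3 = AWS CLI profile of the added IAM user.
check_admin_access() {
  aws eks update-kubeconfig --name "$1" --region "$2" --profile "$3" &&
    kubectl auth can-i '*' '*'
}

# Usage (prints "yes" for a member of system:masters):
#   check_admin_access my-cluster ap-northeast-2 new-user-profile
```

Note that `aws eks update-kubeconfig` switches the current context, so switch back afterwards if you normally work with a different identity.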

EKS zero-downtime deployment

Server/AWS 2022. 1. 11. 19:57

$ kubectl get pods -o wide -w -n exapi
NAME                         READY   STATUS    RESTARTS   AGE   IP            NODE                                                     NOMINATED NODE                                READINESS GATES
oauth-b88bb75fb-cbpfb        1/1     Running             0          12m     10.1.19.219   fargate-ip-10-1-19-219.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-b88bb75fb-cbpfb        1/1     Running             0          12m     10.1.19.219   fargate-ip-10-1-19-219.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-b88bb75fb-zx68l        1/1     Running             0          9m47s   10.1.27.53    fargate-ip-10-1-27-53.ap-northeast-2.compute.internal    <none>                                        1/1
oauth-b88bb75fb-zx68l        1/1     Running             0          9m47s   10.1.27.53    fargate-ip-10-1-27-53.ap-northeast-2.compute.internal    <none>                                        1/1
oauth-6889b9d9d8-89f8d       0/1     Pending             0          0s      <none>        <none>                                                   <none>                                        0/1
oauth-6889b9d9d8-89f8d       0/1     Pending             0          1s      <none>        <none>                                                   3e4a3b4284-f3a1e8e9245847e19d248aa203cec099   0/1
oauth-6889b9d9d8-89f8d       0/1     Pending             0          65s     <none>        fargate-ip-10-1-22-204.ap-northeast-2.compute.internal   3e4a3b4284-f3a1e8e9245847e19d248aa203cec099   0/1
oauth-6889b9d9d8-89f8d       0/1     ContainerCreating   0          65s     <none>        fargate-ip-10-1-22-204.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-89f8d       1/1     Running             0          100s    10.1.22.204   fargate-ip-10-1-22-204.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-89f8d       1/1     Running             0          101s    10.1.22.204   fargate-ip-10-1-22-204.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-89f8d       1/1     Running             0          102s    10.1.22.204   fargate-ip-10-1-22-204.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-89f8d       1/1     Running             0          2m11s   10.1.22.204   fargate-ip-10-1-22-204.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-89f8d       1/1     Running             0          3m34s   10.1.22.204   fargate-ip-10-1-22-204.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-6889b9d9d8-89f8d       1/1     Running             0          3m34s   10.1.22.204   fargate-ip-10-1-22-204.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-b88bb75fb-zx68l        1/1     Terminating         0          14m     10.1.27.53    fargate-ip-10-1-27-53.ap-northeast-2.compute.internal    <none>                                        1/1
oauth-6889b9d9d8-89f8d       1/1     Running             0          3m34s   10.1.22.204   fargate-ip-10-1-22-204.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-6889b9d9d8-8n2mf       0/1     Pending             0          0s      <none>        <none>                                                   <none>                                        0/1
oauth-6889b9d9d8-8n2mf       0/1     Pending             0          1s      <none>        <none>                                                   16625974aa-a093eb0abc954f958086de0c456c0ea2   0/1
oauth-b88bb75fb-zx68l        0/1     Terminating         0          14m     10.1.27.53    fargate-ip-10-1-27-53.ap-northeast-2.compute.internal    <none>                                        1/1
oauth-b88bb75fb-zx68l        0/1     Terminating         0          14m     10.1.27.53    fargate-ip-10-1-27-53.ap-northeast-2.compute.internal    <none>                                        1/1
oauth-b88bb75fb-zx68l        0/1     Terminating         0          14m     10.1.27.53    fargate-ip-10-1-27-53.ap-northeast-2.compute.internal    <none>                                        1/1
oauth-6889b9d9d8-8n2mf       0/1     Pending             0          50s     <none>        fargate-ip-10-1-18-160.ap-northeast-2.compute.internal   16625974aa-a093eb0abc954f958086de0c456c0ea2   0/1
oauth-6889b9d9d8-8n2mf       0/1     ContainerCreating   0          50s     <none>        fargate-ip-10-1-18-160.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-8n2mf       1/1     Running             0          85s     10.1.18.160   fargate-ip-10-1-18-160.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-8n2mf       1/1     Running             0          86s     10.1.18.160   fargate-ip-10-1-18-160.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-8n2mf       1/1     Running             0          87s     10.1.18.160   fargate-ip-10-1-18-160.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-8n2mf       1/1     Running             0          116s    10.1.18.160   fargate-ip-10-1-18-160.ap-northeast-2.compute.internal   <none>                                        0/1
oauth-6889b9d9d8-8n2mf       1/1     Running             0          3m19s   10.1.18.160   fargate-ip-10-1-18-160.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-6889b9d9d8-8n2mf       1/1     Running             0          3m19s   10.1.18.160   fargate-ip-10-1-18-160.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-6889b9d9d8-8n2mf       1/1     Running             0          3m19s   10.1.18.160   fargate-ip-10-1-18-160.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-b88bb75fb-cbpfb        1/1     Terminating         0          20m     10.1.19.219   fargate-ip-10-1-19-219.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-b88bb75fb-cbpfb        0/1     Terminating         0          20m     10.1.19.219   fargate-ip-10-1-19-219.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-b88bb75fb-cbpfb        0/1     Terminating         0          20m     10.1.19.219   fargate-ip-10-1-19-219.ap-northeast-2.compute.internal   <none>                                        1/1
oauth-b88bb75fb-cbpfb        0/1     Terminating         0          20m     10.1.19.219   fargate-ip-10-1-19-219.ap-northeast-2.compute.internal   <none>                                        1/1

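Runs niiicely~ It looks normal, but...

I thank myself for trusting the defaults and diligently digging myself into a hole.

The very first thing to do for a rolling update is to state the strategy explicitly. Leaving it out because I wanted to test with the defaults cost me dearly.

$ vi my-deployment.yaml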
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: exapi
  name: api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  strategy:
    type: RollingUpdate
    rollingUpdate:
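      maxSurge: 0         # 0: no extra pod is created first (faster rollout)
      maxUnavailable: 1   # 0 would keep each old pod until its replacement is ready (slower)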

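maxSurge is the maximum number of pods that may be created above the desired replica count during a rollout, and maxUnavailable is the maximum number of pods that may be unavailable. Setting maxSurge/maxUnavailable to 0/1 terminates an old pod while its replacement is still Pending, whereas 1/0 terminates it only after the new pod is Running; I measured a difference of about 30 seconds between the two.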
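When strategy is omitted, Kubernetes defaults both values to 25%, rounding maxSurge up and maxUnavailable down. With replicas: 2 that comes out as one extra pod and zero unavailable pods, which matches the watch output above, where a new pod reaches Running before an old one is terminated. The arithmetic, as a small sketch with hypothetical helper names:

```shell
# Hypothetical helpers reproducing the rounding rules for percentage values.
# $1 = replicas, $2 = percentage (without the % sign).
max_surge_pods()       { echo $(( ($1 * $2 + 99) / 100 )); }   # rounds up
max_unavailable_pods() { echo $((  $1 * $2 / 100 )); }         # rounds down

# Usage with the defaults (25% / 25%) and replicas: 2
#   max_surge_pods 2 25         # prints 1
#   max_unavailable_pods 2 25   # prints 0
```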

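It looks like it works fine, but during a deployment there are about 5 to 10 seconds of 502 Bad Gateway errors, followed by 504 Gateway Time-out. First I looked for the moment the 502 errors occur.

When you deploy with kubectl rollout restart, a new pod is added and, once it is Running, an existing pod is deleted. I found that the 502 errors occurred at the moment a pod went into Terminating, which is also when the ALB target starts draining. After scouring the internet, I found ways to minimize the 502 errors and finally got it working.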
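To pin down when the errors happen, it helps to poll the endpoint in a second terminal while the rollout runs and tally the status codes afterwards. A minimal sketch, assuming `curl` is available; the helper names and the URL in the usage line are hypothetical.

```shell
# Hypothetical helpers for watching status codes during a rollout.

# Request the URL once per second and print one HTTP status code per line.
# $1 = URL, $2 = number of requests.
probe_status() (
  i=0
  while [ "$i" -lt "$2" ]; do
    # -m 5 caps each request at 5 seconds; a timeout is reported as 000.
    curl -s -o /dev/null -m 5 -w '%{http_code}\n' "$1"
    sleep 1
    i=$(( i + 1 ))
  done
)

# Turn a stream of status codes into "code=count" lines.
summarize_codes() {
  sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# Usage (URL is a placeholder):
#   probe_status https://api.example.com/ 300 | tee codes.log | summarize_codes
```

Keeping the raw log (`codes.log` here) makes it possible to line the 502s up against the timestamps in the `kubectl get pods -w` output.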

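Minimizing 502 Bad Gateway / 504 Gateway Time-out errors

1. ingress: connection-draining settings

Give the load balancer connection-draining options so that existing connections keep being served. (Already the default behavior. No effect.) Note that the service.beta.kubernetes.io/... annotations below are Service annotations for a Classic Load Balancer, which may well be why they changed nothing on an ALB Ingress; on an ALB, draining is governed by the target group's deregistration delay.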

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  ...
  annotations:
    ...
    service.beta.kubernetes.io/aws-load-balancer-connection-draining-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-connection-draining-timeout: "30"
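For an ALB managed by the AWS Load Balancer Controller, the draining time is the target group's deregistration delay (300 seconds by default), which can be set through the `alb.ingress.kubernetes.io/target-group-attributes` annotation. A minimal sketch; `set_deregistration_delay` is a hypothetical helper, and the namespace and ingress name in the usage line are taken from the examples in this post.

```shell
# Hypothetical helper: set the ALB target group deregistration delay.
# $1 = namespace, $2 = ingress name, $3 = delay in seconds.
# Caution: this replaces the whole target-group-attributes annotation, so add
# any attributes you already use (for example stickiness) to the value.
set_deregistration_delay() {
  kubectl annotate ingress "$2" -n "$1" --overwrite \
    "alb.ingress.kubernetes.io/target-group-attributes=deregistration_delay.timeout_seconds=$3"
}

# Usage:
#   set_deregistration_delay exapi api-ingress 30
```

The same key and value can of course be written directly under `annotations:` in the ingress manifest instead.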

2. pod: readiness-gate settings

Enabling the readiness gate keeps the existing targets from being terminated until the new target is registered as healthy. (502 errors still occur, but it prevents the absurd situation where the ALB has no healthy target left at all.) A must!

$ kubectl label namespace {mynamespace} elbv2.k8s.aws/pod-readiness-gate-inject=enabled

apiVersion: apps/v1
kind: Deployment
metadata:
spec:
  template:
    spec:
      containers:
      - name: api
      readinessGates:
      - conditionType: target-health.alb.ingress.k8s.aws/api-ingress_api-nodeport_80

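Label the namespace where the readiness gate should be enabled, then insert target-health.alb.ingress.k8s.aws/{ingress-name}_{service-name}_{port} as shown above. Be careful: if this is misconfigured, the ALB may fail to recognize the target group! (The AWS Load Balancer Controller v2 documentation describes this conditionType as the legacy format; with v2, the namespace label is what makes the controller inject its own readiness gate into new pods.)

3. pod: preStop settings

A lifecycle hook adds a delay before the pod is stopped (preStop), so the existing connections can finish while no new connections are routed to the pod. (This is the most upvoted answer online, and in practice the preStop setting alone resolved my 502 errors.)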

kind: Deployment
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: api
        lifecycle:
          preStop:
            exec:
              command: ["sleep", "60"]

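4. pod: terminationGracePeriodSeconds settings

Sets the upper limit for shutting the pod down: the countdown includes the preStop hook, and when it runs out the containers are killed forcibly. (Set it to about 10 seconds more than the preStop sleep; the default is 30 seconds.)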

kind: Deployment
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: web-api
      ...
      terminationGracePeriodSeconds: 70
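The grace period starts counting when termination begins and covers the preStop hook as well, so it has to be longer than the sleep or the hook gets cut short. A tiny sanity check for the two numbers, with a hypothetical helper name and the values used in this post:

```shell
# Hypothetical check: the grace period should cover the preStop sleep plus
# a margin for the application itself to shut down.
# $1 = preStop sleep (s), $2 = terminationGracePeriodSeconds, $3 = margin (s).
grace_period_ok() {
  [ "$2" -ge $(( $1 + $3 )) ]
}

# Usage:
#   grace_period_ok 60 70 10 && echo "enough time to drain"
```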

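5. pod: livenessProbe / readinessProbe settings

If the livenessProbe, which reports whether the pod is alive, fails, the pod becomes subject to the restart policy. If the readinessProbe, which reports whether the pod is ready, fails, the pod is removed from all Service endpoints. (Both are said to reduce 502 errors by keeping connections away from unhealthy pods, but I could not really tell whether they helped.)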

kind: Deployment
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: api
      ...
        livenessProbe:
          httpGet:
            path: /
            port: 80
          periodSeconds: 4
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /
            port: 80
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 10

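Summary

Of these five settings, I used three to fully solve zero-downtime deployment:

preStop
readinessGates
terminationGracePeriodSeconds

I configured readinessGates first, so I am not sure whether it is strictly required. (I plan to test this soon.)
The preStop setting is what solved zero-downtime deployment. (You may need values that fit your own service and specs.)
Leaving terminationGracePeriodSeconds unset caused no problems. The Kubernetes default is 30 seconds, which cuts a longer preStop sleep short, but in my case that still seemed to be enough.

One more thing... when testing with kubectl rollout, the existing pods may behave in ways you do not expect. Always deleting the existing pods and creating them fresh before a test seems to help in getting the results you are after.

I had been cursing EKS for days because even zero-downtime deployment wouldn't work... what a relief. ^_______________^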

EKS ingress https

Server/AWS 2021. 12. 28. 23:37

If you need an ALB on EKS that redirects http to https, configure an ingress like the one below.

- You need the ARN of an ACM certificate beforehand.

$ vi ingress.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: example
  name: example-ingress
  annotations:
    # Ingress Core Settings
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/target-group-attributes: stickiness.enabled=true,stickiness.lb_cookie.duration_seconds=60
    # SSL Settings
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:ap-northeast-2:111122223333:certificate/11fceb32-9cc2-4b45-934f-c8903e4f9e12
    alb.ingress.kubernetes.io/ssl-redirect: '443'
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-nodeport
                port:
                  number: 80

I only added the three lines under # SSL Settings to the existing http configuration.

Delete the existing ingress and create it again.

$ kubectl delete -f ingress.yaml
ingress.networking.k8s.io "example-ingress" deleted

$ kubectl apply -f ingress.yaml
ingress.networking.k8s.io/example-ingress created
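Once the ALB has been provisioned, the redirect can be checked from the command line: with `ssl-redirect`, a plain http request should come back as a 301 whose Location header points at https. A minimal sketch, assuming `curl` and `awk`; `is_https_redirect` is a hypothetical helper and the host name is a placeholder.

```shell
# Hypothetical helper: read HTTP response headers on stdin and succeed
# when they describe a 301 redirect to an https:// URL.
is_https_redirect() {
  awk '
    NR == 1 && $2 == "301"                         { status = 1 }
    tolower($1) == "location:" && $2 ~ /^https:/   { target = 1 }
    END { exit !(status && target) }
  '
}

# Usage:
#   curl -sI http://alb.example.com/ | is_https_redirect && echo "redirects to https"
```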

CodeBuild / CodePipeline

Server/AWS 2021. 12. 18. 00:09

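I am thinking about CI/CD on EKS: whether to use only AWS resources or a general-purpose CI/CD tool... There are plenty of options such as ArgoCD, Terraform, and Jenkins, but I decided to stick to AWS resources where possible. When I deployed to ECS in the past, Gradle handled everything: building the Docker image, uploading it to ECR, and deploying. You only had to run a task locally, so apart from writing a very long build.gradle there were no real problems. The pain points back then were that Docker had to be running on the deploying developer's PC, that the ECR credentials ended up in git, and that deployment errors were hard to spot right away... if I'm nitpicking, that's about it. Most developers use Docker anyway, the git repository was private, and deployment errors were handled with notifications... so I was happy enough with it, but this time I had a hunch that I really ought to use proper tooling. If there's time, trying everything out is a good thing, so I gave CodeCommit, CodeBuild, and CodePipeline a try.

AWS CI/CD resources used

CodeCommit
- git repository / source location
CodeBuild
- Fetches the source code and runs test/build.
- Uses its own compute to build the Docker image, upload it to ECR, and deploy to EKS, as described in a build specification file (buildspec.yml)
CodePipeline
- When a source change is detected in the git repository, uploads the source artifact to S3 and then runs CodeBuild

Creating the CodeBuild project

Create a build project in the AWS Management Console

Choose a managed image and compute
Role name: reuse a single role if you manage several build projects
Privileged: required to build Docker images or when builds need elevated privileges
VPC / compute type: determines build performance
Environment variables: key/value pairs you do not want exposed in the buildspec
Buildspec name: path to the actual buildspec file
Cache type: Local (DockerLayerCache, SourceCache, CustomCache)
Defaults for everything else

Writing buildspec.yml

Write buildspec.yml at the path configured when creating the build project. There are four phases by default:

install: set up the build environment
pre_build: things the build needs, such as ECR login or installing package dependencies
build: run the build
post_build: handle build output, push the image to ECR, send build notifications, and so on

There are no strict rules for what goes in each phase, but keeping them separate lets you see later how long each section took, so divide them as you see fit.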

$ vi buildspec.yml

version: 0.2

phases:
  install:
    runtime-versions:
      java: corretto11
    commands:
      - curl -o kubectl https://amazon-eks.s3-us-west-2.amazonaws.com/1.21.2/2021-07-05/bin/linux/amd64/kubectl
      - chmod +x ./kubectl
      - mv ./kubectl /usr/local/bin/kubectl
      - aws --version
      - aws eks update-kubeconfig --region ap-northeast-2 --name example-cluster
      - kubectl get po -n exapi
  pre_build:
    commands:
      - echo pre_build Logging in to Amazon ECR...
      - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - REPOSITORY_URI=${REPO_ECR}
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG=${COMMIT_HASH:=latest}
  build:
    commands:
      - echo Build started on `date`
      - gradle --version
      - chmod +x gradlew
      - ./gradlew :ApiFacility:build
      - docker build -t $REPOSITORY_URI:latest ./Api
      - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker images...
      - docker push -a $REPOSITORY_URI
      #- kubectl apply -f ./AWS/example-deployment.yaml
      - kubectl rollout restart -f ./AWS/example-deployment.yaml

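The spec defined above does the following:

install: Java environment, kubectl installation, kubeconfig setup
pre_build: ECR login, ECR path, image tag definition
build: Gradle build, Docker image build
post_build: ECR image push, rolling update of the EKS deployment

As in the spec above, private values such as REPO_ECR can be referenced by the variable names set as environment variables when configuring the build project. It got a bit long because of the echo lines and the image tag handling, so adapt it to your needs...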
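One caveat about the pre_build login: `aws ecr get-login` exists only in AWS CLI v1, and v2 replaced it with `get-login-password`. A minimal sketch of the newer form, in case the build image ships a newer CLI; the helper names are hypothetical, and the repository URI is assumed to have the usual `<registry-host>/<repository>` shape.

```shell
# Hypothetical helpers for logging in to ECR without `aws ecr get-login`.

# Print the registry host part of a repository URI
# (everything before the first "/").
ecr_registry_host() {
  echo "${1%%/*}"
}

# Log Docker in to the registry that hosts the given repository.
# $1 = repository URI, $2 = region.
ecr_login() {
  aws ecr get-login-password --region "$2" |
    docker login --username AWS --password-stdin "$(ecr_registry_host "$1")"
}

# Usage in pre_build:
#   ecr_login "$REPOSITORY_URI" "$AWS_DEFAULT_REGION"
```

Piping the password through stdin also keeps it out of the build log, which the `$(aws ecr get-login ...)` form does not guarantee.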

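Creating the CodePipeline

Create a pipeline in the AWS Management Console

Role name: reuse a single role if you manage several pipelines
Artifact bucket location: specify it manually if needed
Source stage: detect changes with CloudWatch Events. For the output artifact format you can keep the default and pass zipped data, but I chose Full clone, which simply passes metadata. (This requires adding the codecommit:GitPull permission.)
Add a build stage, then skip the deploy stage

Deployment test

Once the CodePipeline is created it runs immediately; otherwise trigger a build manually or push to git... and follow it through to confirming the rolling update. From here on it is a feast of errors. Looking at the CodeBuild flow, almost every step needs permissions on the resource involved, so add each permission to the codebuildrole you created (or that was created for you).

[Link: collection of errors]

There were many mistakes so meticulous it was as if I were hunting for errors on purpose; I am also a little worried that awscli being version 1.2 might cause problems later... My impression after building CI/CD purely from AWS resources such as CodeCommit, CodeBuild, CodePipeline, ECR, and EKS is... meh. lol It feels automated, yet there is a nagging sense that the number of things to manage keeps growing... What remains is a bit of satisfaction at having finished the setup? lol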
