Heptapod: the complete guide from Docker to CI/CD

THUX offers a comprehensive easy guide to get Heptapod (GitLab with Mercurial) deployed using Docker behind a Træfik load balancer. The final setup is completed by activation of the built-in Docker registry and runners for optimal CI/CD.

Goal

The goal of this post is to encourage the adoption of Heptapod by showing the very few steps really needed to get it working with Docker, up to a complete setup with working CI (Continuous Integration) and a registry for the images produced within the CI.

This article is aimed at system administrators with a little knowledge of Docker. No previous knowledge of GitLab is required, as we’ll lead you through the main concepts.

THUX has always used Mercurial as the version control system of choice for our code. In recent years we hosted our repositories on Bitbucket, until Bitbucket’s infamous decision to discontinue Mercurial support forced us to look for an alternative.
We considered switching to Git, but I didn’t like doing that for a non-technical reason, and we definitely like Mercurial more than Git.

Heptapod came to the rescue, especially its Docker image, ready to deploy in no time. Heptapod brings Mercurial SCM support to GitLab, so that any GitLab feature is available to the Mercurial world.
It honestly took some time to get to the final configuration, but in hindsight it was all very simple and can be replicated with little effort by anybody following the right directions.

Let’s start from the big picture to offer a sneak peek of the outcome:

We have a Docker host with a Træfik load balancer (version 2+) that handles reachability of all docker instances on ports 80 and 443. Træfik is great as it also manages certificates from Let’s Encrypt with no hassle at all.
Heptapod runs in a docker and we want to reach it on port 22 as well as 443, so that we can hg clone <my-repo> in a natural way. That implies the SSH daemon of the Docker host itself must listen elsewhere: we reach it on port 23.
We have a runner (actually we can have several runners for different scopes) that handles the jobs triggered by a commit, e.g. check the code, run some tests, build a Docker image…
Heptapod makes it effortless to inspect the result of each job: a little icon shows the state of the job and a click on it takes us to the console where the job is running.
We have a registry to host our images, which in our case is exposed under a separate name and is reachable via Træfik as well.

Docker

We take for granted that you have a server with docker and docker-compose installed. Similarly, I assume you have 2 DNS entries pointing to this server: one for Heptapod and one for the registry. The working configuration and a streamlined README can be downloaded from the public area of our Heptapod.

SSH

As anticipated, we want to use Mercurial via SSH on the standard port 22, so we moved the SSH daemon of the Docker server to port 23 in /etc/ssh/sshd_config:

Port 23

followed by a systemctl reload ssh.service
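Once the host’s own sshd listens on port 23, administrative logins need that port while Mercurial traffic keeps using 22. As a convenience, a small helper along these lines could record the port in the client’s SSH configuration. This is only a sketch: the function name add_admin_ssh_alias, the alias and the host name are illustrative assumptions and not part of the original setup.

```shell
# Hypothetical helper: append a Host block so that "ssh <alias>" reaches
# the Docker host's own sshd on port 23 (port 22 now belongs to Heptapod).
# Usage: add_admin_ssh_alias <alias> <hostname> [config-file]
add_admin_ssh_alias() {
    alias_name=$1
    host_name=$2
    config_file=${3:-$HOME/.ssh/config}
    printf 'Host %s\n    HostName %s\n    Port 23\n' \
        "$alias_name" "$host_name" >> "$config_file"
}
```

For example, add_admin_ssh_alias dockerhost docker.example.com followed by ssh dockerhost would connect on port 23 without having to remember the -p option.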

Træfik

Træfik is a very useful load balancer, well suited to the Docker environment. We run it in a docker and it forwards HTTP requests to other servers; in our case, to other dockers.
It has several different “providers”, that is to say, different ways to configure its routing. Here we use only the “docker” provider, which reads the Docker socket to discover the “labels” each docker publishes. This way, the configuration of the routing to a particular docker is delegated to that same docker and is defined in its docker-compose.yml.
You can see how this mechanism is used to publish Træfik’s internal dashboard:

traefik.http.routers.api.rule: Host(`monitor.thux.it`)
traefik.http.routers.api.service: api@internal

which you can read as: all traffic to monitor.thux.it is to be routed to the service api@internal

version: "3.3"

services:
  traefik:
    image: traefik:v2.2
    restart: always
    ports:
      - "80:80"    # <== http
      - "443:443"  # <== https
    command:
      #### These are the CLI commands that will configure Traefik and tell it how to work! ####
      - --api.dashboard=true  # <== Enabling the dashboard to view services, middlewares, routers, etc...
      - --log.level=DEBUG  # <== Setting the level of the logs from traefik
      - --accessLog.filePath=/logs/access.log
      - --log.filePath=/logs/traefik.log
      ## Provider Settings - https://docs.traefik.io/providers/docker/#provider-configuration ##
      - --providers.docker=true  # <== Enabling docker as the provider for traefik
      - --providers.docker.exposedbydefault=false  # <== Don't expose every container to traefik, only expose enabled ones
      - --providers.docker.network=web  # <== Operate on the docker network named web
      ## Entrypoints Settings - https://docs.traefik.io/routing/entrypoints/#configuration ##
      - --entrypoints.web.address=:80  # <== Defining an entrypoint for port :80 named web
      - --entrypoints.web-secured.address=:443  # <== Defining an entrypoint for https on port :443 named web-secured
      ## Certificate Settings (Let's Encrypt) - https://docs.traefik.io/https/acme/#configuration-examples ##
      - --certificatesresolvers.leresolver.acme.tlschallenge=true  # <== Enable TLS-ALPN-01 to generate and renew ACME certs
      - --certificatesresolvers.leresolver.acme.email=myemail@domain.com  # <== Setting email for certs
      - --certificatesresolvers.leresolver.acme.storage=/letsencrypt/acme.json  # <== Defining acme file to store cert information
    volumes:
      - ./logs:/logs  # <== Volume for logs
      - ./letsencrypt:/letsencrypt  # <== Volume for certs (TLS)
      - /var/run/docker.sock:/var/run/docker.sock  # <== Volume for docker admin
      - ./dynamic.yaml:/dynamic.yaml  # <== Volume for dynamic conf file
    networks:
      - web  # <== Placing traefik on the network named web, to access containers on this network
    labels:
      #### Labels define the behavior and rules of the traefik proxy for this container ####
      traefik.enable: "true"
      traefik.http.routers.api.rule: Host(`monitor.thux.it`)
      traefik.http.routers.api.service: api@internal
      traefik.http.routers.api.middlewares: redirect-to-https
      traefik.http.middlewares.redirect-to-https.redirectscheme.scheme: https
      # let’s define a redirect to https valid for any request:
      traefik.http.routers.http-catchall.rule: hostregexp(`{host:.+}`)
      traefik.http.routers.http-catchall.entrypoints: web
      traefik.http.routers.http-catchall.middlewares: redirect-to-https
      traefik.http.routers.traefik.rule: Host(`monitor.thux.it`)
      traefik.http.routers.traefik.service: api@internal
      traefik.http.routers.traefik.tls.certresolver: leresolver

networks:
  web:
    external: true

Note that we tell Træfik to request certificates from Let’s Encrypt.
Three lines of configuration (the certificatesresolvers options) are enough to forget about SSL certificates from now on. As soon as Træfik learns from the labels of a docker that a certificate is needed, it requests it from Let’s Encrypt for us and renews it when needed.
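Before the first start it may help to prepare the host directories that the Træfik service bind-mounts. The sketch below is an assumption about how one could do it, not part of the original configuration: prepare_traefik_dirs is a hypothetical helper name. It creates ./logs and ./letsencrypt and an empty acme.json with mode 600, since Træfik refuses an ACME storage file with more open permissions.

```shell
# Hypothetical helper: create the host directories bind-mounted by the
# Traefik service and an empty ACME storage file with mode 600, the
# permissions Traefik expects on acme.json.
# Usage: prepare_traefik_dirs [base-dir]
prepare_traefik_dirs() {
    base_dir=${1:-.}
    mkdir -p "$base_dir/logs" "$base_dir/letsencrypt"
    touch "$base_dir/letsencrypt/acme.json"
    chmod 600 "$base_dir/letsencrypt/acme.json"
}

# The "web" network is declared as external, so Compose will not create it.
# On the Docker host it would be created once, before the first start:
#   docker network create web
```

Calling prepare_traefik_dirs from the folder that holds the Træfik docker-compose.yml is enough; running it again later does not change an existing acme.json apart from resetting its mode.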

Heptapod

Let’s move on to the core of our challenge: Heptapod.
As of this writing, the current version is 0.15.1 but 1 will soon be available. The current version is built on GitLab 13.1.5 and Mercurial 5.4.2, while the latest release of GitLab is 13.2.1.

You can configure GitLab via the file gitlab.rb (Ruby syntax) or via the environment variable GITLAB_OMNIBUS_CONFIG, which overrides any configuration found in the default config template. I’ll follow this second possibility as it is easier to copy/paste and to keep up to date when you update your Docker image.
My docker-compose.yml starts as follows:

version: '3.4'

services:
  heptapod:
    image: octobus/heptapod:0.15.1
    hostname: hg.thux.dev
    restart: always
    networks:
      - web
    ports:
      - "22:22"
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'https://gitlab.thux.it'
        gitlab_rails['gitlab_email_from'] = 'hg@thux.it'
        gitlab_rails['gitlab_email_display_name'] = 'Thux Mercurial Server'
        gitlab_rails['gitlab_email_reply_to'] = 'name@example.com'
        gitlab_rails['smtp_enable'] = true
        gitlab_rails['smtp_address'] = "smtp.example.com"
        gitlab_rails['smtp_port'] = 25
        gitlab_rails['smtp_authentication'] = "plain"
        gitlab_rails['smtp_enable_starttls_auto'] = true
        gitlab_rails['smtp_domain'] = "thux.it"
        # Registry settings
        registry_external_url 'https://registry.gitlab.thux.it'
        gitlab_rails['registry_enabled'] = true
        registry['enable'] = true
        gitlab_shell['migration'] = { enabled: true, features: ["hg"] }
        nginx['listen_port'] = 80
        nginx['listen_https'] = false
        nginx['proxy_set_headers'] = {
          "Host" => "$$http_host_with_default",
          "X-Real-IP" => "$$remote_addr",
          "X-Forwarded-For" => "$$proxy_add_x_forwarded_for",
          "X-Forwarded-Proto" => "https",
          "X-Forwarded-Ssl" => "on",
          "Upgrade" => "$$http_upgrade",
          "Connection" => "$$connection_upgrade"
        }
        registry_nginx['listen_https'] = false
        registry_nginx['listen_port'] = 80
    volumes:
      - ./gitlab/config:/etc/gitlab
      - ./gitlab/logs:/var/log/gitlab
      - ./gitlab/data:/var/opt/gitlab

Before delving into the GitLab configuration, we can note:

This configuration asks the Docker daemon to forward port 22 of the host to port 22 of this docker instance.
The real data is kept outside the docker, in a directory of the host that is bind-mounted into the docker instance so as to be available to Heptapod. The volumes section should be pretty clear.
Logs are kept in a bind-mounted volume as well.
The docker is attached to an external network named web, which is the network Træfik forwards traffic to.
The environment variable GITLAB_OMNIBUS_CONFIG is defined using the YAML literal style (|), so that newlines are preserved. A double $$ is needed to pass a single $ to GitLab and avoid early interpolation by docker-compose.
gitlab/config is an empty folder the first time we start, but GitLab fills it with gitlab.rb and auto-generated secrets. It is declared as a VOLUME, and if you don’t bind-mount a host folder it is regenerated each time, which causes problems for the runners (when I did that I couldn’t even open the Runners page).
Both the registry_nginx[...] and the nginx[...] settings are needed!
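Forgetting to double a $ is easy when pasting nginx variables into GITLAB_OMNIBUS_CONFIG. A rough check like the following could flag suspicious lines before starting the service. It is only a sketch: find_unescaped_dollars is a hypothetical helper, and it deliberately ignores the ${VAR} form, so intentional Compose substitutions written with braces are not reported.

```shell
# Hypothetical lint: print lines of a Compose file that contain a single "$"
# followed by a letter or underscore, i.e. a variable that Compose would try
# to interpolate itself. Properly escaped "$$name" references are not reported.
# Usage: find_unescaped_dollars <compose-file>
# Exit status 0 means at least one suspicious line was found.
find_unescaped_dollars() {
    grep -nE '(^|[^$])\$[A-Za-z_]' "$1"
}
```

Running find_unescaped_dollars docker-compose.yml on the file above should print nothing, because every nginx variable is written with $$.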

GitLab configuration

As stated above we preferred to configure gitlab via variable GITLAB_OMNIBUS_CONFIG so we don’t need to mount a conf file and when we upgrade to a newer version of heptapod we’re sure any default will correctly be picked from the template. We can inspect the default configuration reading the file /etc/gitlab/gitlab.rb:

docker-compose exec heptapod cat /etc/gitlab/gitlab.rb

That file is exactly the default, while configuration from environment variable GITLAB_OMNIBUS_CONFIG is applied on top of that. You will need to look at it to know which values you can customize. In case you prefer to edit it directly remember to bind-mount it.

The first section is very easy to understand: it just enables email and configures the relay server.
The second section enables the registry service and configures how it is served. As per our big picture, we decided to let Træfik handle the SSL certificates, so the connection between Træfik and GitLab is on port 80: that is what the nginx part of the configuration does. The result can be inspected by running:

# docker-compose exec heptapod cat /var/opt/gitlab/nginx/conf/gitlab-registry.conf

whose output is:

# This file is managed by gitlab-ctl. Manual changes will be erased!
server {
  listen *:80;
  server_name registry.gitlab.thux.it;
  ...
  location / {
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto https;
    proxy_set_header X-Forwarded-Ssl on;
    ...
    proxy_pass http://localhost:5000;
  }
}

So far we have not configured anything for Træfik, which still knows nothing about our docker services. It’s time to add one more section to our docker-compose file:

    labels:
      traefik.enable: "true"
      traefik.http.routers.hg.rule: Host(`gitlab.thux.it`)
      traefik.http.routers.hg.entrypoints: web-secured
      traefik.http.routers.hg.tls.certresolver: leresolver
      traefik.http.routers.hg.tls: "true"
      traefik.http.routers.hg.service: hg
      traefik.http.routers.registry.rule: Host(`registry.gitlab.thux.it`)
      traefik.http.routers.registry.entrypoints: web-secured
      traefik.http.routers.registry.service: registry
      traefik.http.routers.registry.tls.certresolver: leresolver
      traefik.http.routers.registry.tls: "true"

      traefik.http.services.hg.loadbalancer.server.port: 80
      traefik.http.services.registry.loadbalancer.server.port: 80

Starting from the last 2 lines: the docker informs Træfik that it offers 2 services on port 80, named hg and registry; these service names are used in the router definitions above. Each of the two blocks defines a router (again named hg and registry) with its entry point on port 443 (named web-secured in our Træfik docker-compose), how to get a certificate and which service it is attached to. Both services are handled by nginx, which proxies them according to their domain name, as we saw in the excerpt above.

If we start this docker with the commands:

mkdir -p gitlab/config gitlab/logs gitlab/data
docker-compose pull
docker-compose up -d

… and wait long enough (a few minutes) for the services to start, we should see a page prompting for the password of the root user (“Please create a password for your new account”). That is where we set it. Going to the administration area we should see something similar to this:

Please note that Container registry appears enabled.
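Because the first start takes a few minutes, it can be handy to poll until the web interface answers instead of reloading the page by hand. A generic retry loop, sketched below, is one way to do that. The name wait_until is a hypothetical helper, and the URL in the usage note simply assumes the host name used in this article.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds.
# Usage: wait_until <max-attempts> <delay-seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt failed.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For instance, wait_until 60 10 curl -fsS -o /dev/null https://gitlab.thux.it/users/sign_in would retry every 10 seconds for up to about 10 minutes, relying on curl -f to fail while GitLab is still answering with an error status.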

Should you need to fiddle around with gitlab-ctl, call it via exec. The proper way to invoke the help is:

docker-compose exec heptapod gitlab-ctl help

Shared Runners

So far so good! Shared Runners are enabled but we have none yet. Shared Runners are a key part of continuous integration: they are the servers that execute the jobs. Each of our projects has a configuration file named .gitlab-ci.yml that dictates which jobs are run on each commit and under which conditions. We could decide to check code quality, run tests or produce a Docker image… we can run some jobs on each commit and build an image only on commits to the default branch. The configuration of .gitlab-ci.yml is outside the scope of this article, but I’ll give a couple of examples that generate Docker images, as they go along with different Runner configurations.
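To make the idea of “some jobs on each commit, an image build only on the default branch” more concrete, here is a minimal sketch that writes a two-stage pipeline skeleton. Everything in it is an assumption for illustration: the helper name write_ci_skeleton, the job names and the placeholder test command are not taken from our projects, and since Heptapod maps Mercurial branches to GitLab refs the branch condition may need adapting.

```shell
# Hypothetical helper: write a two-stage pipeline skeleton into <dir>.
# The job names and the test command are illustrative placeholders.
# Usage: write_ci_skeleton [dir]
write_ci_skeleton() {
    target_dir=${1:-.}
    cat > "$target_dir/.gitlab-ci.yml" <<'EOF'
stages:
  - test
  - build

# Runs on every commit
run_tests:
  stage: test
  image: python:3.8
  script:
    - pip install -r requirements.txt
    - python -m compileall -q .

# Runs only for commits on the project's default branch
build_image:
  stage: build
  image: docker
  script:
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG" .
  rules:
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
EOF
}
```

The heredoc delimiter is quoted so that the shell leaves the $CI_... variables untouched; they are expanded later by the runner.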

Let’s start from the docker-compose configurations for our runner:

  heptapod-runner:
    image: octobus/heptapod-runner:0.3.0
    restart: always
    networks:
      - web
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./gitlab-runner:/etc/gitlab-runner

Please note:

The configuration lives in gitlab-runner, which is a folder, not a file. We don’t need to provide a file as it is created by the registration process.
We bind-mount /var/run/docker.sock, i.e. the runner sees the same Docker daemon that runs the runner itself. This is just one of at least 3 possibilities. I’ll register and configure two runners:
- Thux Socket Binding: takes advantage of this shared socket
- Thux Priviledged: uses dind (docker in docker) to have a completely independent Docker daemon
We will use job tags to select the correct runner for each job.
No labels are defined, as runners are not exposed to the outside. In fact you can run runners anywhere, on any server. They start working for your Heptapod once you have registered them.

Registering the runner

To register a runner we need to get the token from the administration area, following the “GitLab runner” link, and run:

# docker-compose exec heptapod-runner gitlab-runner register

The registration process will prompt us with some simple questions and the result will be written to the config.toml. You can add more runners specialized in different jobs. After adding 2 runners and a little bit of hand editing, this is the resulting configuration:

# cat gitlab-runner/config.toml
concurrent = 1
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "ThuxRunnerPriviledged"
  url = "https://hg.thux.dev/"
  token = "SecretToken"
  executor = "docker"
  environment = ["DOCKER_DRIVER=overlay2"]
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
  [runners.docker]
    tls_verify = false
    image = "docker:19.03.12"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/certs/client", "/cache"]
    shm_size = 0

[[runners]]
  name = "Thux Runner Socket Binding"
  url = "https://hg.thux.dev/"
  token = "SecretToken"
  executor = "docker"
  environment = ["DOCKER_DRIVER=overlay2"]
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
  [runners.docker]
    tls_verify = false
    image = "docker:19.03.12"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
    shm_size = 0

Please note:

Thux Priviledged has
- privileged = true
- a volume for the certificates
Thux Runner Socket Binding has the socket in the volumes
The previous points were hand-edited after the registration. Upon saving the file, the changes are applied automatically.
DOCKER_DRIVER=overlay2 is already the default on all my Docker servers. Don’t set it if your underlying filesystem is btrfs (use btrfs in that case).
You can’t see it here, but while registering you are given the opportunity to set the tags that are used to couple jobs and runners. I added dind to “Thux Priviledged” and python to the other.
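The interactive questions and part of the hand editing could probably be avoided by registering non-interactively, since gitlab-runner register accepts the same answers as command-line options. The wrapper below is a sketch under that assumption: register_runner and the COMPOSE override are hypothetical names, and the token is whatever the administration area shows.

```shell
# Hypothetical wrapper around "gitlab-runner register --non-interactive".
# COMPOSE can be overridden (e.g. COMPOSE=echo) to preview the command line.
# Usage: register_runner <url> <registration-token> <description> <tags> [extra flags...]
register_runner() {
    url=$1
    token=$2
    description=$3
    tags=$4
    shift 4
    ${COMPOSE:-docker-compose} exec -T heptapod-runner \
        gitlab-runner register --non-interactive \
        --url "$url" \
        --registration-token "$token" \
        --description "$description" \
        --tag-list "$tags" \
        --executor docker \
        --docker-image docker:19.03.12 \
        "$@"
}
```

The privileged runner would then be registered with register_runner https://hg.thux.dev/ "$TOKEN" ThuxRunnerPriviledged dind --docker-privileged --docker-volumes /certs/client --docker-volumes /cache, and the socket-binding one with the python tag and --docker-volumes /var/run/docker.sock:/var/run/docker.sock --docker-volumes /cache.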

A third option, that may be more secure, uses kaniko, a builder of docker images that doesn’t require root privileges.

Working example

You can test with a simple project. Let’s start with a fake Python project. No code and random dependencies:

printf 'ipython\nrequests\n' > requirements.txt

The Dockerfile is very basic but uses BuildKit (a toolkit for converting source code to build artifacts in an efficient, expressive and repeatable manner; note that the syntax = line is a directive, not a comment), which in turn implies caching is working:

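# syntax = docker/dockerfile:experimental
FROM python:3.8 as production-stage
# Unprivileged user that runs the application ("app" is an assumed default)
ARG APP_USER=app
COPY requirements.txt /requirements.txt
RUN \
    mkdir /code/ \
    && useradd --create-home ${APP_USER} \
    && pip install --no-cache-dir -r /requirements.txt
COPY . /code/
WORKDIR /code/
USER ${APP_USER}:${APP_USER}
CMD ["ipython"]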

Let’s now move on to preparing a couple of .gitlab-ci.yml for different runners.

Example with dind

build_image:
  image: docker
  tags:
    - dind
  services:
    - name: "docker:19.03.12-dind"
      command: ["--experimental"]
  variables:
    DOCKER_BUILDKIT: 1
    IMAGE_NAME: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
    DOCKER_DRIVER: overlay2
    # Create the certificates inside this directory for both the server
    # and client. The certificates used by the client will be created in
    # /certs/client so we only need to share this directory with the
    # volume mount in `config.toml`.
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    - mkdir $HOME/.docker
    - "echo -e '{\n \"experimental\": \"enabled\"\n}' | tee $HOME/.docker/config.json"
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
    - docker build -t $IMAGE_NAME .
    - docker push $IMAGE_NAME

Some remarks:

The docker image is used to execute this job. Its entrypoint realizes that no DOCKER_HOST is defined and that there is no Docker socket, so it points the client to the docker service provided by the docker:19.03.12-dind image: tcp://docker:2376 with TLS when client certificates are available under DOCKER_TLS_CERTDIR (as here), tcp://docker:2375 otherwise
DOCKER_BUILDKIT enables BuildKit
“experimental”, both in server and client, gives access to experimental features that you may not need
We believe this is not the correct way to use BuildKit with dind: in my case the build time is much longer than with the other runner. Here is an article that explains how to use both dind and caching to speed up image builds

Inspect result

Add our .gitlab-ci.yml, commit and push.
A little icon will appear in the project page indicating the state of the job; clicking on that icon takes you to a pipeline page where you can investigate what is going on.

Example with bind-mount

build_image:
  image: docker
  tags:
    - python
  variables:
    DOCKER_BUILDKIT: 1
    IMAGE_NAME: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
  script:
    - mkdir $HOME/.docker
    - "echo -e '{\n \"experimental\": \"enabled\"\n}' | tee $HOME/.docker/config.json"
    - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
    - docker build -t $IMAGE_NAME .
    - docker push $IMAGE_NAME

Some remarks:

the python tag makes Thux Runner Socket Binding pick up the job
the docker command (CLI) is provided by the docker image, whose entrypoint, finding no DOCKER_HOST variable, looks for the /var/run/docker.sock socket (which we passed as a bind-mount from the main Docker daemon)
the special user gitlab-ci-token is granted access to the repository
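Both examples pass the token with -p $CI_BUILD_TOKEN. GitLab also exposes the same job token as CI_JOB_TOKEN, the current name of that variable, and the docker CLI can read the password from standard input so that it does not appear on the command line. A possible variant of the login step is sketched below; registry_login and the DOCKER override are hypothetical names introduced only for this illustration.

```shell
# Hypothetical helper for a CI job script: log in to the project's registry
# without putting the token on the command line. DOCKER can be overridden
# with another command or shell function for a dry run.
registry_login() {
    printf '%s' "$CI_JOB_TOKEN" |
        ${DOCKER:-docker} login -u gitlab-ci-token --password-stdin "$CI_REGISTRY"
}
```

Inside a job, the body of the function could replace the docker login line of the script section as a single piped command.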

This time let’s follow the icons:

to land in the terminal where we see the job execution logs

Final remarks

We want to thank the Octobus team for the effort made to add Mercurial to GitLab in this wonderful way. I found the team very helpful, and a special mention goes to Georges Racinet. I would say that good and prompt support is not a minor detail. They are also very responsive on the Mercurial mailing list, and that makes it a perfect match.
We started investigating Heptapod just to have a shared repo for our code and we ended up with a much more up-to-date setup than we were used to.

You can download the ready-to-use configuration from the public area of our Heptapod site.

This article is written and curated by Alessandro Dentella
