
[perfsonar-user] [Request For Comments] perfSonar Toolkits Docker container


  • From: Victor Orlikowski <>
  • To: "" <>
  • Subject: [perfsonar-user] [Request For Comments] perfSonar Toolkits Docker container
  • Date: Wed, 16 Oct 2019 13:09:05 +0000

perfSONAR folks,

In my de-lurk post, I present to you a variant on the perfSONAR
Testpoint Docker container that we've ginned up here at Duke. We've
been running it for a little while now, and, while it has some rough
edges, we'd like to present it to the wider community so that we can
get input on potential improvements (as well as work on getting it
considered for upstreaming).

I'll also be talking about our choice of deployment, to give you
some ideas about how you might choose to deploy this yourselves.

First off - here's a pointer to our repo (forked from the upstream
Testpoint repo):

https://github.com/LinuxAtDuke/perfsonar-toolkit-docker

And here's a pointer to the diffs against the Testpoint repo:

https://github.com/perfsonar/perfsonar-testpoint-docker/compare/master...LinuxAtDuke:master

To build, you should be able to merely do:

git clone https://github.com/LinuxAtDuke/perfsonar-toolkit-docker.git
cd perfsonar-toolkit-docker
docker build .

You may optionally tag the build; we do:

docker tag ${BUILD_ID} perfsonar-toolkit
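
As a convenience, the build and the tag can also be done in one step; the following should be equivalent (the image name is just the one we happen to use):

docker build -t perfsonar-toolkit .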

This effort was motivated, in no small part, by issues we'd had with
some perfSONAR version updates in the past; we'd surprisingly often
run into situations where an automatic update would break our
deployment (particularly its integration with MaDDash).

By building the Toolkit into a Docker container, we've become
comfortable disabling auto-updates in our "production"
containers. When a new version of the Toolkit is released, we build
and test the new version in parallel with our running production
containers; if we're comfortable with how the new version behaves,
we simply switch the production containers over to the new version
with minimal fuss.

We now need to discuss the nature of our deployment, to make things
a bit clearer.

Our primary "edge" perfSonar server is a "decently robust"
Supermicro server that has dual Intel x710 NICs (as well as a third
"management" NIC). One of the two x710 NICs is connected to our
"production" network, one hop off of our edge router. The other x710
NIC is connected to our "Software-Defined Science Network" - and is
*also* only one hop off of our edge router. The server is running
Ubuntu 18.04 LTS, and acts as a Docker host.

The Docker host has had NTP configured with at least 4 upstreams.
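
For reference, a minimal /etc/ntp.conf sketch with classic ntpd (the upstream names below are placeholders; chrony or whatever your site prefers works just as well):

server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 2.pool.ntp.org iburst
server 3.pool.ntp.org iburst
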
The only network stack tuning we performed is:

root@perfsonar-tel-docker-host-01:/home/psonar# cat /etc/sysctl.d/90-perfSonar-network-tuning.conf
# allow testing with buffers up to 128MB
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
# increase Linux autotuning TCP buffer limit to 64MB
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
# recommended default congestion control is bbr
net.ipv4.tcp_congestion_control=bbr
# recommended for hosts with jumbo frames enabled
net.ipv4.tcp_mtu_probing=1
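
These settings can be applied without a reboot; with a reasonably recent procps, either of the following should pick up the file:

sysctl --system
sysctl -p /etc/sysctl.d/90-perfSonar-network-tuning.conf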

We've also disabled processor frequency scaling; here's how we
modified /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="splash quiet intel_iommu=on intel_idle.max_cstate=0 processor.max_cstate=1 idle=halt"

Note that "intel_iommu" is enabled; this will be important in a
moment.

Our final bit of tuning is in sysfs and systemd:

root@perfsonar-tel-docker-host-01:/home/psonar# cat /etc/tmpfiles.d/perfsonar.conf
w /sys/devices/system/cpu/cpufreq/policy0/scaling_governor - - - - performance
w /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs - - - - 8
w /sys/bus/pci/devices/0000:01:00.1/sriov_numvfs - - - - 8

root@perfsonar-tel-docker-host-01:/home/psonar# systemctl status ondemand
● ondemand.service - Set the CPU Frequency Scaling governor
Loaded: loaded (/lib/systemd/system/ondemand.service; disabled; vendor preset: enabled)
Active: inactive (dead)

We had previously run:

systemctl disable ondemand

The first tmpfiles entry ensures that the kernel's CPU frequency
scaling governor is "performance"; we do this, along with setting
the boot options above, to keep NTP as stable as possible.
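
To apply and verify that without a reboot, something along these lines should work:

systemd-tmpfiles --create
cat /sys/devices/system/cpu/cpufreq/policy*/scaling_governor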

Note that the perfsonar.conf in tmpfiles.d has two more entries;
these, along with enabling SR-IOV support in the BIOS and specifying
"intel_iommu=on" in the kernel command line, allow us to use the
SR-IOV functions of the Intel x710 cards to define 8 virtual
functions (virtual NICs) per physical card (each individual card
supports as many as 64 virtual functions).
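
A quick sanity check that the virtual functions actually came up (device addresses as in the tmpfiles.d entries above):

cat /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
lspci | grep -i "virtual function"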

This is important for our A/B testing rubric.

When we started this effort, we attempted to use Docker's "macvlan"
networking stack (https://docs.docker.com/network/macvlan/) with
both the Testpoint and Toolkit suites. While it does allow us to
run multiple containers, each with what appears to be an individual
physical interface bridged into it, "macvlan" is based on Linux
bridging - and is therefore only capable of supporting about 3-4
Gbps of throughput (rather unfortunately for us).
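
For anyone running at lower speeds, for whom that limit is acceptable, a macvlan-based deployment looks roughly like this (subnet, gateway, and parent interface below are placeholders):

docker network create -d macvlan --subnet=192.0.2.0/24 --gateway=192.0.2.1 -o parent=enp2s0f0 ps_net
docker run -d --network=ps_net --ip=192.0.2.10 --name test_toolkit0 perfsonar-toolkit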

That's where the SR-IOV virtual interfaces come in.

We plumb individual virtual interfaces into distinct Docker
container instances, allowing us to run "production" perfSONAR
Toolkit instances on the same hardware as "testing" instances; in
that way, we can qualify new releases of the Toolkit suite on the
same hardware where we will run it in production. While the 8
defined SR-IOV interfaces must all share the 10 Gbps of throughput
that the physical card supports, we only ever use 1 or 2 of these
virtual interfaces in practice (1, when there is no new release to
test), so the deployed Docker containers are generally able to
perform tests at full line rate.

SR-IOV affords us the ability to do A/B testing - but, if one
wanted to run our Docker container with a non-SR-IOV-capable NIC
(for ease of deployment/management), that should work as well.

Let's take a look at our scripts for deploying "production
instances"; they are quick and dirty, but get the job done:

root@perfsonar-tel-docker-host-01:/home/psonar# cat create_prod_perfsonar.sh
#!/bin/sh

# Create desired volumes
docker volume create prod_toolkit0_etc_data
docker volume create prod_toolkit0_var_data
docker volume create prod_toolkit0_pki_data
docker volume create prod_toolkit0_postgresql_data
docker volume create prod_toolkit0_cassandra_data

# Spin up container
docker run -d --net=none --name prod_toolkit0 \
  -h perfsonar-tel-prod-01.oit.duke.edu \
  -v prod_toolkit0_etc_data:/etc/perfsonar \
  -v prod_toolkit0_var_data:/var/lib/perfsonar \
  -v prod_toolkit0_pki_data:/etc/pki \
  -v prod_toolkit0_postgresql_data:/var/lib/pgsql \
  -v prod_toolkit0_cassandra_data:/var/lib/cassandra \
  perfsonar-toolkit

# Plumb in interface and get going!
/home/psonar/setup_docker_physical_interface_prod_primary.sh

Walking through - we first create named Docker volumes (so that we
can persist test definitions and data across container instance
invocations). We then spin up the container instance itself; note
that we use "--net=none". This creates the container without any
network interfaces, other than localhost loopback.

Finally, we use a secondary script to plumb in the interface; let's
take a look at it:

root@perfsonar-tel-docker-host-01:/home/psonar# cat /home/psonar/setup_docker_physical_interface_prod_primary.sh
#!/bin/sh
CONTAINER_NAME="prod_toolkit0"
CONTAINER_INTERFACE="enp2s10"
CONTAINER_IP="152.3.227.81/31"
CONTAINER_GATEWAY="152.3.227.80"

CONTAINER_CERT="/home/psonar/certs/perfsonar-tel-prod-01_oit_duke_edu_cert.cer"
CONTAINER_KEY="/home/psonar/certs/perfsonar-tel-prod-01.oit.duke.edu.key"

export CONTAINER_NAME
pid=$(docker inspect -f '{{.State.Pid}}' ${CONTAINER_NAME})

mkdir -p /var/run/netns
ln -s /proc/$pid/ns/net /var/run/netns/$pid
ip link set ${CONTAINER_INTERFACE} netns $pid
ip netns exec $pid ip addr add ${CONTAINER_IP} dev ${CONTAINER_INTERFACE}
ip netns exec $pid ip link set ${CONTAINER_INTERFACE} mtu 9000 up
ip netns exec $pid ethtool -G ${CONTAINER_INTERFACE} rx 4096 tx 4096
ip netns exec $pid ip route del default
ip netns exec $pid ip route add default via ${CONTAINER_GATEWAY}
rm /var/run/netns/$pid

docker cp ${CONTAINER_CERT} ${CONTAINER_NAME}:/etc/pki/tls/certs/localhost.crt
docker cp ${CONTAINER_KEY} ${CONTAINER_NAME}:/etc/pki/tls/private/localhost.key
docker exec -it ${CONTAINER_NAME} supervisorctl restart httpd
docker exec -it ${CONTAINER_NAME} /usr/lib/perfsonar/scripts/add_psadmin_user --auto

Walking through again, we define environment variables for the
container's name, the interface we're choosing to plumb into the
container, the IP address we want to use, the upstream gateway, and
the cert and key for TLS.

We then get the PID of the Docker container, and perform the
gymnastics required to talk to the network namespace associated with
it. Next, we plumb the network interface into that namespace
("ip link set ... netns"), set the IP address, tune the interface
(MTU and ring buffers), and set up the default route.
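
A quick check from the host (assuming the iproute tools are present in the image) confirms the interface and route landed where we expect:

docker exec -it prod_toolkit0 ip addr show enp2s10
docker exec -it prod_toolkit0 ip route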

Finally, we copy the TLS cert and key into the container, restart httpd
(both to read the cert and key, and to get httpd to become aware of
the newly-plumbed-in interface), and create the web admin user
(presuming that this is the first run of the container; so long as
we preserve the named Docker volumes, we will not be prompted for
this information again).
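
For completeness, relaunching a container while preserving those volumes looks roughly like this ("docker volume create" is effectively a no-op for volumes that already exist, so the create script can simply be re-run):

docker stop prod_toolkit0
docker rm prod_toolkit0
/home/psonar/create_prod_perfsonar.sh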

Note that we restart httpd using supervisorctl; we have supervisord
set up so that we can monitor and restart services, as needed:

root@perfsonar-tel-docker-host-01:/home/psonar# docker exec -it prod_toolkit0 bash
[root@perfsonar-tel-prod-01 /]# supervisorctl
cassandra RUNNING pid 20, uptime 4 days, 21:42:44
config_daemon RUNNING pid 17, uptime 4 days, 21:42:44
httpd RUNNING pid 665, uptime 4 days, 21:42:40
ls_registration_daemon RUNNING pid 12, uptime 4 days, 21:42:44
owampd RUNNING pid 15, uptime 4 days, 21:42:44
postgresql RUNNING pid 10, uptime 4 days, 21:42:44
pscheduler-archiver RUNNING pid 14, uptime 4 days, 21:42:44
pscheduler-runner RUNNING pid 18, uptime 4 days, 21:42:44
pscheduler-scheduler RUNNING pid 13, uptime 4 days, 21:42:44
pscheduler-ticker RUNNING pid 9, uptime 4 days, 21:42:44
psconfig_pscheduler_agent RUNNING pid 19, uptime 4 days, 21:42:44
rsyslog RUNNING pid 16, uptime 4 days, 21:42:44
twampd RUNNING pid 11, uptime 4 days, 21:42:44
supervisor>

The scripts are similar for our SDSN Docker container; here, you can
see them running on the same host:

root@perfsonar-tel-docker-host-01:/home/psonar# docker ps
CONTAINER ID   IMAGE               COMMAND                  CREATED      STATUS      PORTS   NAMES
8ff50b429882   perfsonar-toolkit   "/bin/sh -c '/usr/bi…"   4 days ago   Up 4 days           sdsn_toolkit0
3cc522582710   perfsonar-toolkit   "/bin/sh -c '/usr/bi…"   4 days ago   Up 4 days           prod_toolkit0

If you'd like to test against them, you can find them at:

https://perfsonar-tel-prod-01.oit.duke.edu/toolkit/

and

https://perfsonar-tel-sdsn-01.oit.duke.edu/toolkit/

Now - for the caveats and rough edges... ;)
1) NTP reports as "unsynced" in the Toolkit web interface.
This is due to the underlying Docker host handling NTP for the
containers - and the Toolkit web interface is wholly unaware of that
fact.

If this work is accepted upstream, we'd need a means of exposing
the fact that NTP is being handled by the underlying Docker host.
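
In the meantime, the host's sync state is easy enough to confirm out-of-band; with ntpd, for example:

ntpq -p
timedatectl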

2) Tests are occasionally duplicated on container restart.
This has *something* to do with how I'm handling state persistence
(maybe I'm not cleaning up something that I should when I tear
down/re-launch a container with a set of persistent volumes) - but,
somewhat randomly, tests sometimes get duplicated when I restart a
container.

Performing a "save" of the test definitions from the web UI after a
restart *seems* to sort that out - but, I haven't been able to
replicate the problem *reliably enough* to get a good handle on
debugging it.
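
One thing that helps when it does happen: pscheduler can dump what is actually on the schedule from inside the container, which makes duplicates easy to spot (exact output varies by version):

docker exec -it prod_toolkit0 pscheduler schedule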

That's all for now - share, enjoy, discuss. ;)
I'd personally like to see this taken up and used, so that it's
quicker and easier for folks to deploy and manage the Toolkit in
their environments.

Victor
--
Victor J. Orlikowski <> vjo@(ee.|cs.)?duke.edu


