{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "stuck-visibility", "metadata": {}, "source": [ "# Working with several networks\n", "\n", "When one single network isn't enough.\n", "\n", "---\n", "\n", "- Website: https://discovery.gitlabpages.inria.fr/enoslib/index.html\n", "- Instant chat: https://framateam.org/enoslib\n", "- Source code: https://gitlab.inria.fr/discovery/enoslib\n", "\n", "---\n", "\n", "## Prerequisites\n", "\n", "
\n", " Make sure you've run the one-time setup for your environment.\n", "
\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "authentic-affairs", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "code", "execution_count": null, "id": "turkish-prison", "metadata": {}, "outputs": [], "source": [ "import enoslib as en\n", "\n", "# Enable rich logging\n", "_ = en.init_logging()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "organic-outreach", "metadata": {}, "source": [ "We reserve two nodes, each with at least two network interfaces. The first network interface of each node uses the Grid'5000 production network (which is not isolated), while the second network interface is configured to use a VLAN.\n", "\n", "- To find out which machines have at least two network cards, refer to the [hardware page of Grid'5000](https://www.grid5000.fr/w/Hardware)\n", "- To learn more about VLANs on Grid'5000, refer to this [page](https://www.grid5000.fr/w/KaVLAN)\n", "\n", "
\n", "Beware: the number of VLANs is limited. Here we want a routed VLAN, and there are only 6 routed VLANs per site (3 are monosite and 3 are multisite).\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "numerous-cooper", "metadata": {}, "outputs": [], "source": [ "SITE = \"rennes\"\n", "\n", "network = en.G5kNetworkConf(type=\"prod\", roles=[\"public\"], site=SITE)\n", "private = en.G5kNetworkConf(type=\"kavlan\", roles=[\"private\"], site=SITE)\n", "\n", "conf = (\n", " en.G5kConf.from_settings(job_name=\"enoslib_several_networks\")\n", " .add_network_conf(network)\n", " .add_network_conf(private)\n", " .add_machine(\n", " roles=[\"server\", \"xp\"],\n", " cluster=\"paravance\",\n", " nodes=1,\n", " primary_network=network,\n", " secondary_networks=[private],\n", " )\n", " .add_machine(\n", " roles=[\"client\", \"xp\"],\n", " cluster=\"paravance\",\n", " nodes=1,\n", " primary_network=network,\n", " secondary_networks=[private],\n", " )\n", " .finalize()\n", ")\n", "conf" ] }, { "cell_type": "code", "execution_count": null, "id": "derived-tradition", "metadata": {}, "outputs": [], "source": [ "provider = en.G5k(conf)\n", "roles, networks = provider.init()\n", "roles" ] }, { "attachments": {}, "cell_type": "markdown", "id": "sound-collins", "metadata": {}, "source": [ "## Get the network information of your nodes\n", "\n", "First we retrieve the network information by syncing the Host descriptions with the remote machines.\n", "Syncing the information will populate every single Host datastructure with some actual information (e.g. number of cores, network information).\n", "This relies on Ansible fact gathering and is provider agnostic. 
\n", "Note that Grid'5000 also exposes a lot of node information through its [REST API](https://api.grid5000.fr), but only static information." ] }, { "cell_type": "code", "execution_count": null, "id": "contained-library", "metadata": {}, "outputs": [], "source": [ "roles = en.sync_info(roles, networks)\n", "roles" ] }, { "attachments": {}, "cell_type": "markdown", "id": "dense-angel", "metadata": {}, "source": [ "We can now filter the addresses of a node by network." ] }, { "cell_type": "code", "execution_count": null, "id": "conceptual-avenue", "metadata": {}, "outputs": [], "source": [ "server = roles[\"server\"][0]\n", "server.filter_addresses(networks=networks[\"private\"])" ] }, { "cell_type": "code", "execution_count": null, "id": "enclosed-saskatchewan", "metadata": {}, "outputs": [], "source": [ "ip_address = server.filter_addresses(networks=networks[\"private\"])[0]\n", "str(ip_address.ip.ip)" ] }, { "cell_type": "code", "execution_count": null, "id": "configured-specialist", "metadata": {}, "outputs": [], "source": [ "server.filter_addresses(networks=networks[\"public\"])" ] }, { "attachments": {}, "cell_type": "markdown", "id": "piano-hollow", "metadata": {}, "source": [ "## A simple load generation tool\n", "\n", "We are using [flent](https://flent.org/), a convenient client for netperf that can run different network benchmarks.\n", "\n", "Roughly speaking, flent connects to a netperf server, starts a benchmark and collects metrics in various formats (CSV, images, ...).\n", "That makes it a good candidate when you need quick insight into the performance of the network between your nodes.\n", "\n", "The goal of this part is to run a benchmark of TCP traffic on the `private` network, so we need to instruct `flent` to connect to the `netperf` server on the relevant address." 
] }, { "cell_type": "code", "execution_count": null, "id": "middle-feeding", "metadata": {}, "outputs": [], "source": [ "with en.actions(roles=roles) as a:\n", " a.apt_repository(\n", " repo=\"deb http://deb.debian.org/debian $(lsb_release -c -s) main contrib non-free\",\n", " state=\"present\",\n", " )\n", " a.apt(\n", " name=[\"flent\", \"netperf\", \"python3-setuptools\", \"python3-matplotlib\"],\n", " state=\"present\",\n", " update_cache = \"yes\"\n", " )" ] }, { "attachments": {}, "cell_type": "markdown", "id": "detected-sport", "metadata": {}, "source": [ "---\n", "Checking the routes on the nodes. Make sure the `private` network goes through the `private` interface." ] }, { "cell_type": "code", "execution_count": null, "id": "permanent-organ", "metadata": {}, "outputs": [], "source": [ "routes = en.run_command(\"ip route list\", roles=roles)\n", "print(\"\\n-Routes-\\n\")\n", "print(\"\\n\\n\".join([f\"{r.host} => {r.stdout}\" for r in routes]))" ] }, { "cell_type": "code", "execution_count": null, "id": "motivated-brass", "metadata": {}, "outputs": [], "source": [ "server_address = str(server.filter_addresses(networks=networks[\"private\"])[0].ip.ip)\n", "\n", "with en.actions(pattern_hosts=\"server\", roles=roles) as a:\n", " a.shell(\"netperf\", background=True) # this is somehow idempotent .. 
will fail silently if netperf is already started\n", " a.wait_for(port=12865, state=\"started\", task_name=\"Waiting for netperf to be ready\")\n", " \n", "\n", "with en.actions(pattern_hosts=\"client\", roles=roles) as a:\n", " a.shell(\n", " \" flent tcp_upload -p totals \"\n", " \" -l 60 \"\n", " f\" -H { server_address } \"\n", " \" -t 'tcp_upload test' \"\n", " \" -o result.png\"\n", " )\n", " a.fetch(src=\"result.png\", dest=\"result\")" ] }, { "cell_type": "code", "execution_count": null, "id": "408fe1d5-9a9e-43d5-a3b1-c371da5d4e2e", "metadata": {}, "outputs": [], "source": [ "with en.actions(pattern_hosts=\"client\", roles=roles) as a:\n", " a.fetch(src=\"result.png\", dest=\"/tmp/result\")\n", " r = a.results\n", "r" ] }, { "cell_type": "code", "execution_count": null, "id": "offshore-positive", "metadata": {}, "outputs": [], "source": [ "from IPython.display import Image\n", "Image(f\"/tmp/result/{roles['client'][0].alias}/result.png\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "widespread-variable", "metadata": {}, "source": [ "---\n", "Forcing the flent client to be bound on the right network (not really necessary if the routes are set correctly).\n", "It's an opportunity to use host variables so let's do it ;)\n", "\n", "`flent` has an option for this `--local-bind `" ] }, { "cell_type": "code", "execution_count": null, "id": "fundamental-converter", "metadata": {}, "outputs": [], "source": [ "for h in roles[\"client\"]:\n", " h.extra.update({\"local_bind\": h.filter_addresses(networks=networks[\"private\"])[0].ip.ip})\n", "roles[\"client\"][0]" ] }, { "cell_type": "code", "execution_count": null, "id": "advance-hearts", "metadata": {}, "outputs": [], "source": [ "server_address = str(server.filter_addresses(networks=networks[\"private\"])[0].ip.ip)\n", "\n", "with en.actions(pattern_hosts=\"server\", roles=roles) as a:\n", " a.shell(\"netperf\", background=True) # this is somehow idempotent .. 
will fail silently if netperf is already started\n", " a.wait_for(port=12865, state=\"started\", task_name=\"Waiting for netperf to be ready\")\n", " \n", "\n", "with en.actions(pattern_hosts=\"client\", roles=roles) as a:\n", " a.shell(\n", " \" flent tcp_upload -p totals \"\n", " \" -l 60 \"\n", " f\" -H { server_address } \"\n", " \"--local-bind {{ local_bind }} \"\n", " \" -t 'tcp_upload test' \"\n", " \" -o result_bind.png\"\n", " )\n", " a.fetch(src=\"result_bind.png\", dest=\"/tmp/result\")" ] }, { "cell_type": "code", "execution_count": null, "id": "beautiful-music", "metadata": {}, "outputs": [], "source": [ "from IPython.display import Image\n", "Image(f\"/tmp/result/{roles['client'][0].alias}/result_bind.png\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "025ebb6c-1651-43b5-9e75-3bc9984fb93a", "metadata": {}, "source": [ "## Checking that the network traffic flows through the right interface :)" ] }, { "cell_type": "code", "execution_count": null, "id": "70e383e9-a920-4a92-9a70-b0046ac1dc6e", "metadata": {}, "outputs": [], "source": [ "# we enable the statistics on all known interfaces\n", "# note that this seems incompatible with --epoch :( :(\n", "with en.Dstat(nodes=roles[\"xp\"], options=\"--full\") as d:\n", "    backup_dir = d.backup_dir\n", "    with en.actions(pattern_hosts=\"server\", roles=roles) as a:\n", "        a.shell(\"netperf\", background=True) # this is somehow idempotent .. will fail silently if netperf is already started\n", "        a.wait_for(port=12865, state=\"started\", task_name=\"Waiting for netperf to be ready\")\n", "\n", "\n", "    with en.actions(pattern_hosts=\"client\", roles=roles) as a:\n", "        a.shell(\n", "            \" flent tcp_upload -p totals \"\n", "            \" -l 60 \"\n", "            f\" -H { server_address } \"\n", "            \"--local-bind {{ local_bind }} \"\n", "            \" -t 'tcp_upload test' \"\n", "            \" -o result_bind.png\"\n", "        )\n", "        a.fetch(src=\"result_bind.png\", dest=\"result\")" ] }, { "cell_type": "code", "execution_count": null, "id": "961491cf-2379-4d7f-a5c5-dd21544c9f58", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import seaborn as sns \n", "\n", "print(backup_dir)\n", "\n", "# create a dictionary: host -> pd.DataFrame\n", "results = dict()\n", "for host in roles[\"xp\"]:\n", "    result = pd.DataFrame()\n", "    host_dir = backup_dir / host.alias\n", "    csvs = host_dir.rglob(\"*.csv\")\n", "    for csv in csvs:\n", "        print(csv)\n", "        df = pd.read_csv(csv, skiprows=5, index_col=False)\n", "        df[\"host\"] = host.alias\n", "        df[\"csv\"] = csv\n", "        result = pd.concat([result, df], axis=0)\n", "    results[host] = result" ] }, { "cell_type": "code", "execution_count": null, "id": "bc874abc-6eaf-4052-911c-946add6752f2", "metadata": {}, "outputs": [], "source": [ "results[roles[\"xp\"][0]]" ] }, { "cell_type": "code", "execution_count": null, "id": "ec3e8cf1-4ca9-47c4-89c0-b19cc3746b03", "metadata": {}, "outputs": [], "source": [ "from itertools import product\n", "import matplotlib.pyplot as plt \n", "\n", "for host, result in results.items():\n", " interfaces = host.filter_interfaces()\n", " # interfaces = [eno1, enos2]\n", " keys_in_csv = [fmt % interface for interface, fmt in product(interfaces, [\"net/%s:recv\", \"net/%s:send\"])]\n", " # keys_in_csv = ['net/eno2:recv', 'net/eno2:send', 'net/eno1:recv', 'net/eno1:send']\n", " print(keys_in_csv)\n", " plt.figure()\n", " # melt makes the data tidy\n", " # 0, {recv, send}, value_0\n", " 
# 1, {recv, send}, value_1\n", " sns.lineplot(data=result.melt(value_vars = keys_in_csv, ignore_index=False).reset_index(), x=\"index\", y=\"value\", hue=\"variable\")\n", " plt.title(f\"{host.alias} \\n ~ traffic should be on {host.filter_interfaces(networks=networks['private'])} ~\")" ] }, { "attachments": {}, "cell_type": "markdown", "id": "functional-netherlands", "metadata": {}, "source": [ "## Emulating the network conditions" ] }, { "attachments": {}, "cell_type": "markdown", "id": "junior-fifth", "metadata": {}, "source": [ "We'll illustrate how network constraints can be set on specific network interfaces on the nodes of the experiment.\n", "To do so, EnOSlib provides two services:\n", "- the Netem service, which is a wrapper around [netem](https://wiki.linuxfoundation.org/networking/netem).\n", "- the NetemHTB service, which provides a high-level interface to finer-grained [HTB-based network emulation](https://tldp.org/HOWTO/Traffic-Control-HOWTO/classful-qdiscs.html)\n", "\n", "More information can be found in the EnOSlib documentation: https://discovery.gitlabpages.inria.fr/enoslib/apidoc/netem.\n", "\n", "EnOSlib lets you set the constraints easily on a dedicated network by specifying only its logical name." 
] }, { "cell_type": "code", "execution_count": null, "id": "colored-tourism", "metadata": {}, "outputs": [], "source": [ "netem = en.Netem()\n", "# symetric constraints:\n", "# node1|10ms ---> 10ms|node2|10ms --> 10ms|node1\n", "netem.add_constraints(\"delay 10ms\", roles[\"xp\"], symetric=True, networks=networks[\"private\"])" ] }, { "cell_type": "code", "execution_count": null, "id": "friendly-carpet", "metadata": {}, "outputs": [], "source": [ "netem.deploy()" ] }, { "attachments": {}, "cell_type": "markdown", "id": "nutritional-liability", "metadata": {}, "source": [ "---\n", "There's a convenient method that lets you quickly check the network conditions (at least the RTT latency)." ] }, { "cell_type": "code", "execution_count": null, "id": "about-actress", "metadata": {}, "outputs": [], "source": [ "netem.validate()" ] }, { "cell_type": "code", "execution_count": null, "id": "efa57562-7d8e-48ac-9d57-2026864fea12", "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "server_alias = roles['server'][0].alias\n", "print(server_alias)\n", "print(Path(f\"_tmp_enos_/{server_alias[:-3]}.fpingout\").read_text())\n", "\n", "print(\"...8<\"*20)\n", "client_alias = roles['client'][0].alias\n", "print(client_alias)\n", "\n", "print(Path(f\"_tmp_enos_/{client_alias[:-3]}.fpingout\").read_text())" ] }, { "attachments": {}, "cell_type": "markdown", "id": "intermediate-unknown", "metadata": {}, "source": [ "## Clean" ] }, { "cell_type": "code", "execution_count": null, "id": "recovered-corpus", "metadata": {}, "outputs": [], "source": [ "provider.destroy()" ] }, { "cell_type": "code", "execution_count": null, "id": "459d0b3f-82da-4869-8b5c-faaea8735ad4", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", 
"nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.5" }, "toc-showcode": false, "toc-showmarkdowntxt": false }, "nbformat": 4, "nbformat_minor": 5 }