AWS NAT Route Controller

Note: This is alpha software

Overview

This controller discovers tagged NAT Instances and Routing Tables, filters down to the healthy NAT Instances, and allocates egress routes through the available NAT Instances for each Routing Table. To ensure that only one router updates the routes, instances are sorted by LaunchDate and the oldest healthy instance is considered the leader.

The following tags are expected on both EC2 Instance and Routing Table resources:

Key	Description	Default
`aws-nat-router/id`	Multiple controllers can watch multiple resources	`squid`
`aws-nat-router/zone`	Used to simplify zone lookup for Instances and Routing Tables	`-`

Allocation algorithm

Currently, the router prefers to allocate each routing table to a NAT Instance in the same zone. If there is no healthy NAT Instance in that zone, it allocates the routing table to whichever healthy NAT Instance has the fewest routing tables. If there are multiple healthy NAT Instances in a zone, it tries to spread the routing tables equally across all available NAT Instances.
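A rough Go sketch of the allocation rules described above is shown here. It is an illustration under stated assumptions rather than the controller's real implementation: the `NAT` and `RouteTable` types, the `allocate` and `leastLoaded` helpers, and the sort-by-ID ordering used for deterministic results are all hypothetical. The input is assumed to contain only healthy NAT Instances with non-empty IDs.

```go
package main

import "sort"

// NAT and RouteTable are hypothetical, simplified resource views.
type NAT struct{ ID, Zone string }
type RouteTable struct{ ID, Zone string }

// allocate maps each route table ID to a NAT Instance ID.
// It prefers the least-loaded NAT in the table's own zone and
// falls back to the least-loaded NAT in any zone.
func allocate(nats []NAT, tables []RouteTable) map[string]string {
	alloc := make(map[string]string, len(tables))
	if len(nats) == 0 {
		return alloc
	}
	// Work on sorted copies so the result does not depend on input order.
	sortedNATs := append([]NAT(nil), nats...)
	sort.Slice(sortedNATs, func(i, j int) bool { return sortedNATs[i].ID < sortedNATs[j].ID })
	sortedTables := append([]RouteTable(nil), tables...)
	sort.Slice(sortedTables, func(i, j int) bool { return sortedTables[i].ID < sortedTables[j].ID })

	load := make(map[string]int, len(sortedNATs)) // tables per NAT so far
	for _, rt := range sortedTables {
		best := leastLoaded(sortedNATs, load, rt.Zone, true)
		if best == "" {
			best = leastLoaded(sortedNATs, load, rt.Zone, false)
		}
		alloc[rt.ID] = best
		load[best]++
	}
	return alloc
}

// leastLoaded returns the ID of the NAT with the fewest tables,
// optionally restricted to the given zone. It returns "" if no
// candidate matches.
func leastLoaded(nats []NAT, load map[string]int, zone string, sameZoneOnly bool) string {
	best := ""
	for _, n := range nats {
		if sameZoneOnly && n.Zone != zone {
			continue
		}
		if best == "" || load[n.ID] < load[best] {
			best = n.ID
		}
	}
	return best
}
```

Because the load counter is updated after every assignment, two tables in a zone with two NAT Instances end up on different instances, which matches the "equally across" behaviour described above.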

Terraform Instance Profile

aws-nat-router should run on each NAT Instance, which requires the following IAM permissions:

actions = [
  "ec2:DescribeInstances",
  "ec2:DescribeRouteTables",
  "ec2:CreateRoute",
  "ec2:ReplaceRoute",
  "ec2:ModifyInstanceAttribute", # to disable SourceDestChecks on Instances launched through ASGs
]

A more complete Instance Role setup would look like this:

resource "aws_iam_instance_profile" "router" {
  name = "nat-router-role"
  role = "${aws_iam_role.router.name}"
}

resource "aws_iam_role" "router" {
  name               = "nat-router-role"
  assume_role_policy = "${data.aws_iam_policy_document.assume_ec2_role.json}"
}

data "aws_iam_policy_document" "assume_ec2_role" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role_policy" "ec2_router" {
  name   = "nat-router-role-ec2"
  role   = "${aws_iam_role.router.name}"
  policy = "${data.aws_iam_policy_document.ec2_router.json}"
}

data "aws_iam_policy_document" "ec2_router" {
  statement {
    sid = "1"

    actions = [
      "ec2:DescribeInstances",
      "ec2:DescribeRouteTables",
      "ec2:CreateRoute",
      "ec2:ReplaceRoute",
      "ec2:ModifyInstanceAttribute",
    ]

    resources = [
      "*",
    ]
  }
}

Systemd Unit

[Unit]
Description=AWS NAT Router
Documentation=https://github.com/so0k/aws-nat-router
Requires=network-online.target
After=network-online.target
[Service]
# -z: request a file modified later than the given filename modification date (mtime)
ExecStartPre=/usr/bin/curl -sLo /usr/local/bin/aws-nat-router \
  -z /usr/local/bin/aws-nat-router \
  https://github.com/so0k/aws-nat-router/releases/download/0.1.4/aws-nat-router
ExecStartPre=/usr/bin/chmod +x /usr/local/bin/aws-nat-router
Environment=LOG_LEVEL=INFO
ExecStart=/usr/local/bin/aws-nat-router \
  --vpc-id ${vpc_id} \
  --cluster-id ${cluster_id} \
  --ec2-election \
  --timeout 500ms \
  --interval 5s
Restart=always
RestartSec=10
# amount of time (seconds) systemd waits after start before marking it as failed
TimeoutStartSec=20

Todo

runOnce implementation:

Discover Tagged Instances
Discover Tagged Routing Tables
Implement TCP HealthCheck
Filter down to only Healthy NAT Instances
Implement PreventSourceDestCheck
Allocate Routing Tables to Instances
Update Routing Tables with allocations
Implement recovery actions (Restart or Terminate unhealthy nodes)

Controller implementation:

Use AWS secrets from commandline args / env vars or ec2 Role
Take region / vpc-id / cluster-id arguments for discovery
Take interval arguments and loop runOnce on interval

Deployment:

The controller is meant to run on EC2 Instances, prior to k8s bootstrap, thus we can't use Docker / Kubernetes as a deployment mechanism.

Add GitHub release to CircleCI
Add Sample Systemd unit file

Reference

Based on AWS nat_monitor.sh
