
Passing Environment Variables to EC2 Instances using AWS CDK

Tags: aws, deployment, shell • Categories: Web Development


I was updating an older project that ran on dokku and hosted an Elixir/Phoenix application with a Postgres database. Uptime and other normally important things didn’t matter, so I just wanted to host the database and application on a single small EC2 instance. However, the application got complex enough (SQS queues, some lambdas, etc.) that I wanted to pass the SQS endpoints to the EC2 instance automatically via environment variables.

Should be easy, right? This feels like a very common use case.

I struggled to find a solution to this problem that Just Worked. What I wanted to do was:

  • Build SQS queues and other AWS resources in CDK
  • From CDK, pass the SQS endpoints and other important variables to the EC2 instance
  • Make sure the vars were available on first run
  • And if the box restarted

UserData is the way to pass data to an EC2 instance. However, userData commands only execute when the instance is first created, so throwing exports into the userData isn’t enough. Here’s how I solved it.

Adding Environment Variables to /etc/profile.d/*.sh

In your userData you’ll want to write a /etc/profile.d/*.sh file which exports the variables. /etc/profile.d/*.sh files are sourced automatically by /etc/profile in all common shells (sh, bash, zsh).
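For illustration, here’s roughly what the generated file could contain. The variable names match the stack below, but the queue URLs are placeholders:

```shell
# Hypothetical /etc/profile.d/cdk_variables.sh; the queue URLs are placeholders,
# real values come from the CDK-generated SQS queues
export SQS_COMPLETED_SCRAPES_URL="https://sqs.us-east-1.amazonaws.com/123456789012/completed-scrapes"
export SQS_PENDING_SCRAPES_URL="https://sqs.us-east-1.amazonaws.com/123456789012/pending-scrapes"
export SQS_FAILED_SCRAPES_URL="https://sqs.us-east-1.amazonaws.com/123456789012/failed-scrapes"
```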

I explain shell access below, but be aware of ssm start-session! It does not act like a normal login shell by default, so you won’t see your environment variables.

I couldn’t find any out of the box way to write the export bash script, so I wrote helpers:

import { Stack, StackProps, aws_ec2 } from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class AppStack extends Stack {
  constructor(scope: Construct, id: string, props: StackProps) {
    super(scope, id, props);

    // ... generate sqs and other stuff

    const envVars = {
      SQS_COMPLETED_SCRAPES_URL: completedScrapesSQS.queueUrl,
      SQS_PENDING_SCRAPES_URL: pendingScrapesSQS.queueUrl,
      SQS_FAILED_SCRAPES_URL: failedScrapesSQS.queueUrl,
      // ... more vars
    };
    const profiledPath = "/etc/profile.d/cdk_variables.sh";

    const userData = aws_ec2.UserData.forLinux();
    userData.addCommands(this.generateEnvUserData(envVars, profiledPath));

    const ec2Instance = new aws_ec2.Instance(this, 'Instance', {
      userData: userData,
      // ... rest of ec2 configuration
    });
  }

  generateEnvUserData(envVars: { [key: string]: string }, profiledPath: string): string {
    const envExports = Object.keys(envVars)
      .map(key => `export ${key}=\\"${envVars[key]}\\"`)
      .join('\\n');

    return `
touch ${profiledPath}
chmod +x ${profiledPath}
echo -e "${envExports}" > ${profiledPath}
`;
  }
}

Tip: when you are iterating on userData, set userDataCausesReplacement: true in the EC2 instance config so the instance is recreated whenever the userData changes.


EC2 shell access

Next, I wanted to get SSH access to my box to make sure that the environment variables were working.

SSM (AWS Systems Manager) is the way to do this. It’s the most secure way (no public port exposure) to get login access.

aws ssm start-session --target i-08b330e292cd8ccbc

However, this does not act like a traditional login session: /etc/profile.d/ is not loaded, sh is used instead of bash, a custom user account is used for login, and it’s not an SSH connection. This prevents a lot of other tools (like TablePlus, dokku, and ansible, all of which I was using on this project!) that rely on the ability to specify an SSH config from working without additional configuration. There’s a great thread on the topic.

You can force a bash session with some additional parameters:

aws ssm start-session --target i-08b330e292cd8ccbc --document-name AWS-StartInteractiveCommand --parameters command="bash -l"
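Once in, you can sanity-check the exports. A login shell (-l) sources /etc/profile, which in turn sources /etc/profile.d/*.sh, so one of the variables from the stack above should print (assuming the userData ran):

```shell
# Inside the session: a login shell picks up the CDK-written exports
bash -lc 'echo "$SQS_PENDING_SCRAPES_URL"'
```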

However, I was curious how to set up traditional SSH access in AWS. You’ll need port 22 exposed, but outside of that the tricky part is creating an SSH key pair and getting it onto your local machine.

  • This example code was the most recent and complete I found
  • aws_ec2.CfnKeyPair is the class you want to use to generate a key pair
  • Pass keyName to your EC2 instance creation
  • To pull a key, use aws ssm get-parameter --with-decryption --name /ec2/keypair/YOUR_KEY_ID --query Parameter.Value --output text, not aws secretsmanager get-secret-value, which was the old way and is still floating around in CDK examples
    • You can list all available parameters with aws ssm describe-parameters

Here’s what it looks like:

const key = new aws_ec2.CfnKeyPair(this, 'AdminInstanceKeyPair', {
  keyName: 'ec2-key-pair',
});

const ec2Instance = new aws_ec2.Instance(this, 'Instance', {
  keyName: key.keyName,
  // ...
});

new cdk.CfnOutput(this, "Download Key Command", { value: `aws ssm get-parameter --with-decryption --name /ec2/keypair/${key.attrKeyPairId} --query Parameter.Value --output text > ${key.keyName}.pem && chmod 400 ${key.keyName}.pem` })

new cdk.CfnOutput(this, 'ssh command', { value: `ssh -i ${key.keyName}.pem -o IdentitiesOnly=yes ubuntu@` + ec2Instance.instancePublicIp })

Note that the default login username changes across Linux distros:

  • Ubuntu is ubuntu
  • Amazon Linux is ec2-user
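With key-based SSH working, tools that rely on an SSH config (ansible, TablePlus, etc.) just need a standard entry. A sketch, assuming Ubuntu and the key pair generated above; the host alias, IP, and key path are placeholders:

```
# Hypothetical ~/.ssh/config entry; Host alias, HostName, and key path are placeholders
Host scraper-ec2
  HostName 203.0.113.10
  User ubuntu
  IdentityFile ~/ec2-key-pair.pem
  IdentitiesOnly yes
```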

SSM Documents

The --document-name in the ssm start-session command above references a publicly available parameterized script that SSM can run.

You can list out all available public SSM documents using:

aws ssm list-documents --document-filter-list "key=Owner,value=Public"

You can get a specific document using:

aws ssm get-document --name AWS-StartInteractiveCommand

This document looks like this:

  {
      "Name": "AWS-StartInteractiveCommand",
      "CreatedDate": "2020-11-30T20:27:51.602000-07:00",
      "DocumentVersion": "1",
      "Status": "Active",
      "Content": "{\n  \"schemaVersion\": \"1.0\",\n  \"description\": \"Document to run single interactive command on an instance\",\n  \"sessionType\": \"InteractiveCommands\",\n  \"parameters\": {\n    \"command\": {\n      \"type\": \"String\",\n      \"description\": \"The command to run on the instance\"\n    }\n  },\n  \"properties\": {\n    \"windows\": {\n      \"commands\": \"{{command}}\",\n      \"runAsElevated\": false\n    },\n    \"linux\": {\n      \"commands\": \"{{command}}\",\n      \"runAsElevated\": false\n    },\n    \"macos\": {\n      \"commands\": \"{{command}}\",\n      \"runAsElevated\": false\n    }\n  }\n}",
      "DocumentType": "Session",
      "DocumentFormat": "JSON"
  }

The Content key is decoded and executed on a per-platform basis. The command parameter is substituted via the CLI flag --parameters command="bash -l".
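The substitution itself is plain templating. A sketch of the idea using sed; the template string and command here are stand-ins, not what SSM runs internally:

```shell
# Fill the {{command}} placeholder the way --parameters command="..." does
template='bash -c "{{command}}"'
echo "$template" | sed 's/{{command}}/echo hello/'
# -> bash -c "echo hello"
```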

This is neat, but overly complex. This is why platforms like FlightControl exist: AWS provides an extensive set of primitives but does not intend to make them easy to use. There are so many paper cuts, weird APIs, and rabbit holes to go down to accomplish anything simple.