
Summary: Modifying rsyslog to Add Cloud Instance Metadata

By Rich

Rich here.

Quick note: I basically wrote an entire technical post for Tool of the Week, so feel free to skip down if that’s why you’re reading.

Ah, summer. As someone who works at home and has children, I’m learning the pains of summer break. Sure, it’s a wonderful time without homework fights and after-school activities, but it also means all five of us are in the house nearly every day. It’s a bit distracting. I mean, do you have any idea how to tell a 3-year-old you cannot ditch work to play Disney Infinity on the Xbox?

Me neither, which explains my productivity slowdown.

I’ve actually been pretty busy at ‘real work’, mostly building content for our new Advanced Cloud Security course (it’s sold out, but we still have room in our Hands-On class). Plus a bunch of recent cloud security assessments for various clients. I have been seeing some interesting consistencies, and will try to write those up after I get these other projects knocked off. People are definitely getting a better handle on the cloud, but they still tend to make similar mistakes.

With that, let’s jump right in…

Top Posts for the Week

Tool of the Week

I’m going to detour a bit and focus on something all you admin types are very familiar with: rsyslog. Yes, this is the default system logger for a big chunk of the Linux world, something most of us don’t think that much about. But as I build out a cloud logging infrastructure I found I needed to dig into it to make some adjustments, so here is a trick to insert critical Amazon metadata into your logs (usable on other platforms, but I can only show so many examples).

Various syslog-compatible tools generate standard log files and allow you to ship them off to a remote collector. That’s the core of a lot of performance and security monitoring. By default log lines look something like this:

 Jun 24 00:21:27 ip-172-31-40-72 sudo: ec2-user : TTY=pts/0 ; PWD=/var/log ; USER=root ; COMMAND=/bin/cat secure

That’s the entry recorded when I used sudo to dump the security log (/var/log/secure) on a Linux instance. See a problem?

This log entry includes the host name of the instance (here derived from its internal IP address), but in the cloud a host name or IP address isn’t nearly as canonical as in traditional infrastructure. Both can be quite ephemeral, especially if you use Auto Scaling groups and the like. Ideally you capture the instance ID (or its equivalent on other platforms), and perhaps also some other metadata such as the internal or external IP address currently associated with the instance. Fortunately it isn’t hard to fix this up.

The first step is to capture the metadata you want. In AWS just visit:

 http://169.254.169.254/latest/meta-data/

to get it all. Or use something like:

 curl http://169.254.169.254/latest/meta-data/instance-id

to get the instance ID. Then you have a couple of options. One is to change the host name to be the instance ID. Another is to add it to every entry by changing the rsyslog configuration (/etc/rsyslog.conf on CentOS systems), as in the example below, which puts the instance ID in front of the host name. Yes, this means INSTANCEID needs to be set as an environment variable for rsyslog. Templates only expand message properties, not environment variables, so the first line copies the value into a message variable with getenv(). I haven’t tested this because I need to post the Summary before I finish, so you might need a little more text manipulation to make it work… but this should be close:

 set $!instanceid = getenv("INSTANCEID");
 template(name="forwardFormat" type="string"
          string="<%PRI%>%TIMESTAMP:::date-rfc3339% %$!instanceid%-%HOSTNAME% %syslogtag:1:32%%msg:::sp-if-no-1st-sp%%msg%"
         )
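
Since the instance ID never changes for the life of an instance, one alternative to the environment variable is to write the ID into the template as literal text when the instance boots. Here is a minimal sketch in shell. The function names, the 2-second timeouts, and the drop-in file name in the final comment are my own assumptions, not anything defined by rsyslog or AWS. It asks for an IMDSv2 session token first, because instances set to require IMDSv2 reject the plain curl request, and falls back to the plain request otherwise.

```shell
#!/bin/sh
# Sketch only: function names and the output path are illustrative assumptions.

IMDS="http://169.254.169.254/latest"

# Print the instance ID. Tries IMDSv2 (session token) first, then falls back
# to the plain metadata request. Returns non-zero if the request fails.
fetch_instance_id() {
    token=$(curl -sf --max-time 2 -X PUT \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 60" \
        "$IMDS/api/token") || token=""
    if [ -n "$token" ]; then
        curl -sf --max-time 2 -H "X-aws-ec2-metadata-token: $token" \
            "$IMDS/meta-data/instance-id"
    else
        curl -sf --max-time 2 "$IMDS/meta-data/instance-id"
    fi
}

# Print a forwarding template with the instance ID baked in as literal text
# in front of the host name. Refuses to run with an empty ID.
render_template() {
    instance_id=$1
    [ -n "$instance_id" ] || return 1
    cat <<EOF
template(name="forwardFormat" type="string"
         string="<%PRI%>%TIMESTAMP:::date-rfc3339% ${instance_id}-%HOSTNAME% %syslogtag:1:32%%msg:::sp-if-no-1st-sp%%msg%"
        )
EOF
}

# Possible use at boot (hypothetical file name):
#   id=$(fetch_instance_id) && render_template "$id" > /etc/rsyslog.d/10-instance-id.conf
```

Run this at boot rather than at image build time, or every instance launched from the image will log the build machine’s ID. The template only does anything once a forwarding action references it by name, and rsyslogd -N1 is a quick way to check the generated configuration before you restart the service.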

There are obviously a ton of ways you could slice this, and you need to add it to your server build configurations to make it work (using Ansible/Chef/Puppet/Packer/whatever). But the key is to capture and embed the instance ID and whatever other metadata you need. If you don’t care about strict syslog compatibility, you have more options. The nice thing about this approach is that it will capture all messages from all the system sources you normally log, and you don’t need to modify individual message formats.

If you use something like the native Amazon/Azure/Google instance logging tools… you don’t need to bother with any of this. Those tools tend to capture the relevant metadata for you (e.g., using Amazon’s CloudWatch Logs agent, Azure’s Log Analytics, or Google’s Stackdriver). Check the documentation to make sure you configure them correctly. But many clients want to leverage existing log management, so this is one way to get the essential data.

Securosis Blog Posts this Week

Other Securosis News and Quotes

Another quiet week…

Training and Events
