I want to discuss deployment tradeoffs in Database Activity Monitoring, focusing on the advantages and disadvantages of hardware appliances. It might seem minor, but the delivery model makes a big first impression on customers. It’s the first difference they notice when comparing DAM products, and it’s impressive – those rows of blinking, whirring 1U & 2U machines, neatly racked, do stick with you. They cluster in groups in your data center, with lots of cool lights, logos, and deafening fans. Sometimes called “pizza boxes” by the older IT crowd, these are basic commodity computers with 1-2 processors, memory, redundant power supplies, and a disk drive or two. Inexpensive and fast, appliances account for more than half the world’s DAM deployments.

When choosing between solutions, first impressions make a huge difference to buying decisions, and this positive impression is a big reason appliances have been a strong favorite for years. Everything is self-contained, including basic operation and data storage, and much of the monitoring complexity is hidden from view. System sizing – choosing the right processor(s), memory, and disk – is the vendor’s concern, so the customer doesn’t have to worry about it or take responsibility (even if they do have to provide all the actual data…). Further cementing the positive impression, the initial deployment is easier for an average customer, with much less work to get up and running.

And what’s not to like? There are several compelling advantages to appliances, namely:

  • Fast and Inexpensive: The appliance is dedicated to monitoring. You don’t need to share resources across multiple applications (or worry another application will impact monitoring), and the platform can be tailored to its task. Hardware is chosen to fit the requirements of the vendor’s code, and configuration can be tuned to well-known processor, memory, and disk demands. Stripped-down Linux kernels are commonly used to avoid unneeded OS features. Commodity hardware can be chosen by the vendor, based purely on cost/performance considerations. Given equal resources, appliances perform slightly better than software simply because they have been optimized by the vendor and are unburdened by irrelevant features.
  • Deployment: The beauty of appliances is that they are simple to deploy. This is the most obvious advantage, even though it is mostly relevant in the short term. Slide it into the rack, connect the cables, power it up, and you get immediate functionality. Most of the sizing and capacity planning is done for you. Much of the basic configuration is in place already, and network monitoring and discovery are available with little to no effort. The box has been tested, and in some cases the vendor pre-configures policies, reports, and network settings before shipping the hardware. You get to skip a lot of work on each installation. Granted, you only get the basics, and every installation requires customization, but this makes a powerful first impression during competitive analysis.
  • Avoid Platform Bias: “We use HP-UX for all our servers,” or “We’re an IBM shop,” or “We standardized on SQL Server databases.” All the hardware and software is bundled within the appliance and largely invisible to the customer, which helps avoid religious wars over configuration and sidesteps most compatibility concerns. This makes IT’s job easier and avoids concerns about hardware/OS policies. DAM provides a straightforward business function, and can be evaluated simply on how well it performs that function.
  • Data Security: The appliance is secured prior to deployment. User and administrative accounts need to be set up, but the network interfaces, web interfaces, and data repositories are all set up by the vendor. There are fewer moving parts and areas to configure, making appliances more secure than their software counterparts when they are delivered, and simplifying security management.
  • Non-relational Storage: To handle high database transaction rates, non-relational storage within the appliance is common. Raw SQL queries from the database are stored in flat files, one query per line. Not only can records be stored faster in simple files, but the appliance itself avoids the burden of running a relational database. The tradeoff here is very fast storage at the expense of slower analysis and reporting.
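
To make the flat-file tradeoff concrete, here is a minimal Python sketch of one-query-per-line storage. It is an illustration only, not any vendor's implementation: the function names `append_query` and `count_by_verb` and the file name are made up for this example. Writing is a simple append, while even a basic report has to scan every line because there is no index to lean on.

```python
import os
import tempfile
from collections import Counter


def append_query(path: str, sql: str) -> None:
    """Append one captured SQL statement to the log as a single line."""
    # Collapse internal whitespace so a multi-line statement stays on one line.
    one_line = " ".join(sql.split())
    with open(path, "a", encoding="utf-8") as f:
        f.write(one_line + "\n")


def count_by_verb(path: str) -> Counter:
    """Count statements by leading SQL verb; requires a full scan of the file."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split(None, 1)
            if parts:
                counts[parts[0].upper()] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample traffic, written to a throwaway directory.
    log_path = os.path.join(tempfile.mkdtemp(), "queries.log")
    for stmt in ["SELECT * FROM accounts",
                 "update accounts\n   set balance = 0",
                 "select 1"]:
        append_query(log_path, stmt)
    print(count_by_verb(log_path))
```

The write path does no parsing beyond whitespace cleanup, which is why it is fast; the reporting path pays for that by reading the whole file each time.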

A typical appliance-based DAM installation consists of two flavors of appliances. The first and most common is small ‘node’ machines deployed regionally – or within particular segments of a corporate network – and focused on collecting events from ‘local’ databases. The second flavor of appliance is administration ‘servers’; these are much larger and centrally located, and provide event storage and command and control interfaces for the nodes. This two-tier hierarchy separates event collection from administrative tasks such as policy management, data management, and reporting. Event processing – analysis of events to detect policy violations – occurs either at the node or server level, depending on the vendor. Each node sends (at least) all notable events to its upstream server for storage, reporting, and analysis. In some configurations all analysis and alerting is performed at the ‘server’ layer.
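
The node/server split can be sketched in a few lines of Python. This is a toy model under stated assumptions: the class names (`Event`, `Node`, `Server`) and the `touches_schema` policy are hypothetical, and it models the variant where the node does the analysis and forwards only notable events upstream.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Event:
    database: str
    user: str
    sql: str


@dataclass
class Server:
    """Central appliance: stores whatever the nodes forward."""
    stored: List[Event] = field(default_factory=list)

    def receive(self, event: Event) -> None:
        self.stored.append(event)


@dataclass
class Node:
    """Collector appliance: sees local events and forwards the notable ones."""
    server: Server
    is_notable: Callable[[Event], bool]
    seen: int = 0

    def collect(self, event: Event) -> None:
        self.seen += 1
        if self.is_notable(event):
            self.server.receive(event)


def touches_schema(event: Event) -> bool:
    # Hypothetical policy for illustration: flag DDL statements.
    return event.sql.strip().upper().startswith(("DROP", "ALTER", "CREATE"))


if __name__ == "__main__":
    server = Server()
    node = Node(server=server, is_notable=touches_schema)
    node.collect(Event("hr", "alice", "SELECT * FROM staff"))
    node.collect(Event("hr", "bob", "DROP TABLE staff"))
    print(node.seen, len(server.stored))
```

Moving the `is_notable` check from `Node.collect` into `Server.receive` would model the other configuration, where nodes forward everything and all analysis happens at the server layer.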

But, of course, appliances are not perfect. Appliance market share is being eroded by software and software-based “virtual appliances”. Appliances have been the preferred deployment model for DAM for the better part of the last decade, but may not be for much longer. There are several key reasons for this shift:

  • Data Storage: Commodity hardware means data is stored on single or redundant SATA disks. Some compliance efforts require storing events for a year or more, but most appliances only support up to 90 days of event storage – and in practice this is often more like 30-45 days. Most nodes rely heavily on central servers for mid-to-long-term storage of events for reports and forensic analysis. Depending on how large the infrastructure is, these server appliances can run out of capacity and performance, requiring multiple servers per deployment. Some servers use SAN for event storage, while others are simply incapable of storing 6-12 months of data. Many vendors suggest compatible SIEM or log management systems to handle data storage (and perhaps analysis of ‘old’ data).
  • Virtualization: You can’t deploy a physical appliance in a virtual network. There’s no TAP or SPAN port to plug into. The virtual topology of the network often makes it impossible to deploy an appliance, even with a software agent to collect events. Virtualization of networks and servers has undercut appliance deployments, and spawned the ‘virtual appliance’ options I will discuss later. For now, I will simply note that a virtual appliance is not a physical appliance at all, but instead an entire software stack extracted from the physical platform and deployed in a virtual machine container, controlled by a Virtual Machine Manager just like any other server or application. This trend is becoming even more prevalent as IT shops adopt cloud services, which are inherently virtualized.
  • Scalability: It’s easy enough to scale as you add more databases – with appliances, you just add another node. But that’s expensive. Databases grow in size and number – sometimes you can simply add a new data collection agent and point it at an existing appliance. In other cases your network topology may not allow that, or demands may outgrow the appliance, requiring purchase of additional hardware.
  • Flexibility: One size does not fit all. Monitoring solutions are resource-constrained by the policies they need to enforce. Nodes with many rules and policies require additional processing capacity. Behavioral and dynamic monitoring require plenty of memory to build and maintain profiles. Compliance projects demand large volumes of storage. Requirements change, and it’s simply harder to re-provision appliances to support changes in the volume of database activity, or in security and compliance requirements.
  • Disaster Recovery: In the event of disaster or another data center outage, appliances must be physically moved to a new data center. Redeploying software or virtual machines – on whatever hardware is available – is cheaper and faster than replacing physical hardware, which might need to be purchased and shipped from the vendor. And even standby nodes cost money.
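
The data storage gap is easy to check with back-of-the-envelope arithmetic. The Python sketch below uses purely hypothetical inputs (a 1 TB disk, 1,000 events per second, 300 bytes per stored event); none of these figures come from a vendor or a measurement, so substitute your own. The helper names are also made up for this example.

```python
SECONDS_PER_DAY = 86_400


def retention_days(disk_gb: float, events_per_sec: float, bytes_per_event: float) -> float:
    """Days of events that fit on a disk of the given size (1 GB = 1e9 bytes)."""
    bytes_per_day = events_per_sec * bytes_per_event * SECONDS_PER_DAY
    return disk_gb * 1e9 / bytes_per_day


def disk_needed_gb(days: float, events_per_sec: float, bytes_per_event: float) -> float:
    """Disk capacity in GB required to retain the given number of days."""
    return events_per_sec * bytes_per_event * SECONDS_PER_DAY * days / 1e9


if __name__ == "__main__":
    # Hypothetical workload: 1,000 events/sec at 300 bytes each.
    print(f"{retention_days(1000, 1000, 300):.1f} days on a 1 TB disk")
    print(f"{disk_needed_gb(365, 1000, 300):.1f} GB for one year")
```

With these assumed numbers a 1 TB disk holds a little over 38 days of events, while a one-year retention requirement needs close to 9.5 TB, which is the kind of gap that pushes long-term storage onto central servers, SAN, or a log management system.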

Appliances offer many compelling advantages, but deciding whether appliances are right for you requires careful consideration of your goals and database environment. The important takeaway here is that the advantages of appliances are most pronounced early in the buying cycle. On the other hand, as we’ll see in the next section, deploying software requires more up-front work during installation, configuration, and hardware procurement. Early on, appliances are much easier – their limitations often appear only much later. Don’t forget that the deployment will last longer than the initial evaluation, and keep the rest of the product lifecycle in mind as you figure out how you like it. Ease of deployment is very important, but long-term product satisfaction has more to do with the ease of day-to-day operations management, so weigh that in your selection process.