If you made it this far, we know your old platform is akin to an old junker automobile: every day you drive to work in a noisy, uncomfortable, costly vehicle that may or may not get you where you need to be, and every time you turn around you’re spending more money to fix something. With cars, figuring out what you want, shopping, getting financing, and then dealing with car salespeople is no picnic either, but in the end you do it to make your life a bit easier and yourself more comfortable. It is important to remember this because, at this stage of SIEM replacement, it feels like we have gone through a lot of work just so we can do more work to roll out the new platform. Let’s step back for a moment and focus on what’s important: getting stuff done as simply and easily as possible.
Now that you are moving to something else, how do you get there? The migration process is not easy, and it takes effort to move from the incumbent to the new platform. We have outlined a disciplined and objective process to determine whether it is worth moving to a new security management platform. Now we will outline a process for implementing the new platform and transitioning from the incumbent to the new SIEM. You need to implement the new platform and migrate your existing environment to it while maintaining service levels and without exposing your organization to additional risk. This may involve supporting two systems for a short while or, in a hybrid architecture, using two systems indefinitely. Either way, when a customer puts their head on the block to select a new platform, the migration needs to go smoothly. There is no such thing as a ‘flash’ cutover. We recommend you start deploying the new SIEM long before you get rid of the old. At best, you will deprecate portions of the older system after newer replacement capabilities are online, but you will likely want the older system as a fallback until all new functions have been vetted and tuned. We have learned the importance of this staging process the hard way. Ignore it at your own peril, keeping in mind that your security management platform supports several key business functions.
We offer a migration plan for moving to the new security management platform. It covers data collection as well as migrating/reviewing policies, reports, and deployment architectures. We break the migration process into two phases: planning and implementation. Your plan needs to be very clear and specific about when things get installed, how data gets migrated, when you cut over from old systems to new, and who performs the work. The Planning step leverages much of the work done up to this point in evaluating replacement options – you just need to adapt it for migration.
- Review: Go back through the documents you created earlier. First consider your platform evaluation documents, which will help you understand what the current system provides and key deficiencies to address. These documents become the priority list for the migration effort, the basis for your migration task list. Next leverage what you learned during the PoC. To evaluate your new security management platform provider you conducted a mini deployment. Use what you learned from that exercise – particularly what worked and what didn’t – as input for subsequent planning, and address the issues you identified.
- Focus on incremental success: What do you install first? Do you work top down or bottom up? Will you keep both systems operational throughout the entire migration, or shut down portions of the old as each node migrates? We recommend using your deployment model as a guide. You can learn more about these models by checking out Understanding and Selecting a SIEM. When using a mesh deployment model, it is often easiest to make sure a single node/location is fully functional before moving on to the next. With ring architectures it is generally best to get the central SIEM platform operational, and then gradually add nodes around it until you reach the scalability limit of the central node. Hierarchical models are best deployed top-down, with the central server first, followed by regional aggregation nodes in order of criticality, down to the collector level. Break the project up to establish incremental successes and avoid dead ends.
- Allocate resources: Who does the work? When will they do it? How long will it take to deploy the platform, data collectors, and/or log management support system(s)? This is also the time to engage professional services and enlist the new vendor’s assistance. The vendor presumably does these implementations all day long, so they should have expertise in estimating these timelines. You may also want to engage them to perform some (or all) of the work in tandem with your staff, at least for the first few locations until you get the process down.
- Define the timeline: Estimate the time it will take to deploy the servers, install the collectors, and implement your policies. Include time for testing and verification. There is likely to be some ‘guesstimation’, but the PoC and your prior experience with SIEM give you reasonable metrics to plan from. You did document the PoC, right? Plan the project commencement date and publish it to the team. Solicit feedback and adjust before commencing, because you need shared accountability with the operations team(s) to make sure everyone has a vested interest in success.
- Preparation: We recommend you do as much work as possible before you begin migration, including construction of the rules and policies you will rely on to generate alerts and reports. Specify in advance any policies, reports, user accounts, data filters, backup schedules, data encryption, and related services you can. You already have a rule base so leverage it to get going. Of course you’ll tune things as you go, but why reinvent the wheel or rush unnecessarily? Keep in mind that you will always find something you failed to plan for – often an unexpected problem – that sets your schedule back. Preparation helps spot missing tasks and makes deployment go faster.
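To make the top-down ordering for hierarchical models concrete, here is a minimal sketch of how a migration task list might be sequenced. The node names, tier labels, and criticality scores are all hypothetical, and ordering collectors by criticality within their tier is our assumption rather than a rule; adapt it to your own inventory.

```python
from dataclasses import dataclass

# Assumption for this sketch: lower tier number means "deploy earlier".
TIER_ORDER = {"central": 0, "regional": 1, "collector": 2}


@dataclass(frozen=True)
class Node:
    name: str         # hypothetical node name
    tier: str         # "central", "regional", or "collector"
    criticality: int  # higher means more critical; the scale is made up


def deployment_order(nodes):
    """Sort nodes top-down by tier, most critical first within each tier."""
    return sorted(
        nodes,
        key=lambda n: (TIER_ORDER[n.tier], -n.criticality, n.name),
    )


nodes = [
    Node("collector-nyc-01", "collector", 2),
    Node("regional-emea", "regional", 3),
    Node("central-siem", "central", 5),
    Node("regional-amer", "regional", 5),
    Node("collector-lon-01", "collector", 4),
]

ordered = deployment_order(nodes)
for step, node in enumerate(ordered, start=1):
    print(f"{step}. [{node.tier}] {node.name}")
```

Each position in the resulting list is a natural checkpoint: get that node fully working, record the result, and only then move on to the next one.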
It is helpful for team morale, not to mention the confidence of upper management, to demonstrate the value of the new platform early on. So you should plan some “quick wins” into the migration process where possible. Delivering what you already have in the incumbent platform may be critical to long-term success, but completely uninspiring to the people deciding your bonus. If there are key facets of the new platform that can be delivered early in the implementation process, it is worth your time to do so.
The migration need not (and in fact generally should not) be an all-at-once exercise – you have the luxury of doing one piece at a time in the order that best suits your requirements.
- Deploy platform(s): This varies based on the deployment model as discussed above, but typically you install the main security management platforms first. This includes basic system configuration, identity management and access control integration, and basic network configuration. Once that is complete, connect to a couple of data sources and other aggregation points to make sure the system is operating correctly.
- Deploy supporting services: Deploy the data collectors and make sure event collection is working correctly. If you use a flat deployment model, configure the platform to collect events for the first set of deployment tasks. If you use a Log Management/SIEM hybrid or regional data aggregators, install the additional aggregation points and get them feeding data into the primary SIEM system to confirm proper information flow – at a small scale before ramping up event traffic. If you are moving to a new platform for real-time analysis, make sure event collection happens properly. Your only concern right now should be getting data into the system in a timely fashion – tune it later.
- Install policies and reports: Next deploy the rules that comb through events and find anomalies. Hopefully you created as many as possible during the PoC and planning stages, and perhaps you can leverage your initial implementation. For real-time analysis you need to tune those rules to optimize performance. Remember that each additional rule incurs significant processing cost. It’s math – correlating multiple data sources against many rules multiplies the work the system must do, reducing effective performance and throughput. Look for ways to create rules with fewer comparisons, and balance fine-tuning rules for specific problems against more generic rules that catch many problems. Sometimes you can throw hardware at the problem (with a bigger server) to handle more events, but it is always useful to strive for more efficient policies.
- Test and verify: Are your reports being generated properly? Are the correct alerts being generated in a timely fashion? Generate copies of the reports, send them to the team for review, and compare them against the existing platform (which is still operational, right?). For alerts and forensic analysis it makes sense to rerun your “Red Team” drill from the PoC to make sure you catch anomalies and confirm the accuracy of your results. Verify you get what you need – now is the time to find and fix any problems with the system, before you start depending on the new platform.
- Stakeholder sign-off: Get it in writing – trust us, this will save aggravation in the future when someone from Ops says: “Hey, where is XYZ? I still need it!” Have the compliance, security, and IT ops teams sign off on completion of the project – they own it now too (remember shared accountability?). Make sure the group is satisfied and/or all issues are documented – if not fully solved – by this point.
- Decommission: Now you can retire the older system. You may choose to run the incumbent SIEM for a few months after the new system is fully operational, just in case. But there are not many reasons to keep the older system around long-term, and plenty of reasons to send it packing. Older agents and sensors should be removed, user accounts dedicated to the older platform locked down, and hardware and virtual server real estate reclaimed. Once again, someone will need to be assigned the work with an agreed-on time frame for completion. Trouble ticketing systems are a handy way to schedule these tasks and get automated completion reports.
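For the test-and-verify step, one simple way to compare the two platforms while both are still running is to export the alerts each one raised over the same time window and diff them. The sketch below assumes you can reduce each alert to a comparable key; the (rule name, source host) pairs and the sample data are hypothetical, and real exports will need normalization first.

```python
def compare_alerts(incumbent, replacement):
    """Bucket alert keys by which platform raised them in the same window."""
    old, new = set(incumbent), set(replacement)
    return {
        "matched": sorted(old & new),          # raised by both platforms
        "missing_on_new": sorted(old - new),   # gaps to investigate before cutover
        "only_on_new": sorted(new - old),      # new coverage, or new noise
    }


# Hypothetical alert keys: (rule name, source host)
incumbent_alerts = [
    ("brute_force_login", "web01"),
    ("malware_beacon", "ws17"),
    ("priv_escalation", "db02"),
]
new_alerts = [
    ("brute_force_login", "web01"),
    ("priv_escalation", "db02"),
    ("dns_tunnel", "ws09"),
]

result = compare_alerts(incumbent_alerts, new_alerts)
for bucket, alerts in result.items():
    print(bucket, alerts)
```

Anything under `missing_on_new` is a candidate for rule tuning or a missing data source, and the full comparison is useful evidence to attach when you ask stakeholders to sign off.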