A friend of mine just left the world of consulting. I asked him what the biggest change in his thinking had been. He said:
Something is going to go wrong. It’s not a matter of if, it’s when. When that bad thing goes wrong everything hinges on how you detect and respond to it.
As consultants, so much of our job is focused on reducing risk for our clients, but that's all we can do. We can reduce the risk; we can't take it all away. That means that no matter what we do, or what our clients do, there will always be some risk. On a long enough timescale there is guaranteed to be a breach, data loss, a denial of service, or some other security incident.
When that incident occurs, how will you react?
Too often we focus only on the response phase, but four phases matter in any incident.
Detection
Do you have the systems in place to detect a breach? Many breaches aren't detected until months after they happen. If an attacker spends months in your system gaining persistence, exfiltrating data, and setting up back doors, it will be very, very difficult to evict them from your network. There are a lot of cutting-edge solutions out there that attempt to use AI or ML to detect anomalous behavior, and some of them are very impressive.
However, I think the best place to start is getting your logging strategy ironed out. I've seen too many clients who log everything, but whose errors and warnings are so frequent that the logs are all but worthless. Tuning your logs to the point where anomalous behavior is truly anomalous should be your first goal. Once you have that done, strategize other ways of detection. Honeypots and intrusion detection systems are both good ideas, but be sure to match solutions to your organization's needs.
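To make the "truly anomalous" idea concrete, here is a minimal sketch of one way to flag an unusual spike once your logs are quiet enough to have a meaningful baseline. Everything in it is an assumption for illustration: the function name `anomalous_windows`, the hourly bucketing, the 24-hour trailing baseline, and the threshold of three standard deviations are hypothetical choices, not a recommendation for any particular tool.

```python
from statistics import mean, stdev

def anomalous_windows(error_counts, baseline_size=24, threshold=3.0):
    """Return indices of windows whose error count sits far above the trailing baseline.

    error_counts: errors per time window (hourly buckets are assumed here).
    baseline_size and threshold are illustrative defaults, not tuned values.
    """
    flagged = []
    for i in range(baseline_size, len(error_counts)):
        baseline = error_counts[i - baseline_size:i]
        mu = mean(baseline)
        # Floor the deviation at 1.0 so a perfectly flat baseline
        # neither divides by zero nor flags trivial changes.
        sigma = max(stdev(baseline), 1.0)
        if (error_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged

# Made-up hourly error counts: steady noise, then one spike at index 25.
hourly_errors = [5, 6, 4, 5, 7, 5, 6, 4, 5, 6, 5, 4, 6, 5, 7, 5, 4, 6,
                 5, 5, 6, 4, 5, 6, 5, 48, 6]
print(anomalous_windows(hourly_errors))
```

A check like this only works if the baseline is quiet. If every hour already contains hundreds of ignored errors, the spike disappears into the noise, which is why tuning the logs comes first.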
Response
Once you've detected an incident, how will you respond? Each incident will require a different response. Enumerating the likely scenarios and creating a response playbook for each can be invaluable in a crisis. Involve different groups and teams to get their perspectives. Tabletop exercises can be a great way to open lines of communication and brainstorm ideal responses to different threats.
Analysis
After the incident, it's time to take stock. Perform a Root Cause Analysis or follow the Five Whys to understand what happened and how it happened. The only thing worse than making a mistake is making the same mistake twice. Analyze your process, response, and detection, not just the root vulnerability the attacker used to breach your systems. This incident is going to cost you, so make it count. Hopefully this doesn't happen often, so squeeze every lesson you can out of it.
Remediation
Once you understand how the incident happened, it is time to put systems in place to reduce the likelihood of it happening again. This may mean patching a system, increasing logging or detection capabilities, adding a tool or appliance, improving your process or communication channels, or staffing up the security team.
There are two sides to the security coin. First, we must do everything we can to protect the data that is entrusted to us and to secure our systems and enterprises as well as we can. Second, we must understand that we live in the real world, and in that world attackers have an extended timeline they can use to attack our systems. With enough time, all systems fail. When those systems fail, we must do everything we can to detect, respond, analyze, and remediate the issues.