TL;DR
- IR maps to CC7.3 (security event evaluation, the triage discipline) and CC7.4 (defined response program with containment, mitigation, recovery, communication)
- On-prem IR programs need pre-staged forensic capacity (forensic workstation, write blockers, imaging tools, segregated VLAN) because you cannot snapshot a workload to another availability zone mid-incident
- Detection stack typically combines Splunk or Wazuh for host SIEM, Security Onion for network monitoring, and commercial EDR for endpoint visibility
- Physical incident response at the colocation facility is the provider's job, not the tenant's. Your role as a colocation tenant is to be informed when something happens in the provider's domain and to coordinate any downstream impact on your systems
- Document communication lines with the colocation provider in the IR plan up front: who to call, how they notify you, what evidence they supply. Leaseweb Canada and OVH Cloud are two archetypal Canadian providers with documented contact paths
3:17am. The on-call engineer's phone lights up with a notification from the colocation NOC: a badge access anomaly was detected at the facility, the provider's security team is investigating, no impact on tenant systems is suspected. The engineer logs the notification in the incident queue, sets a follow-up to read the provider's incident report when it arrives, and goes back to sleep. There is no IAM role to rotate, no security group to flip, no AZ to fail over to. There is also no containment decision to make, because the incident is happening in a building the tenant does not operate.
Standard IR is the focus. Communication lines with the colo are the addition.
When your data center is operated by someone else, physical incident response is the provider's responsibility. Your IR program runs the same standard digital playbook a cloud-native team would run, plus a defined communication protocol with the provider so notifications flow in both directions when an event crosses the boundary.
The 3am scenario above is the standard pattern in well-run colocation tenancy. The provider runs physical IR. The tenant runs digital IR on the systems they actually operate, receives notifications when something at the facility might affect them, and incorporates those notifications into their own incident communication flow. Under the 2017 Trust Services Criteria, the tenant's IR program lives under CC7.3 (security event evaluation) and CC7.4 (incident response execution). The provider's physical IR is covered by the provider's own SOC 2 or CSAE 3000 report, which the tenant relies on as part of subservice vendor management.
How IR Maps to the Trust Services Criteria
CC7.3 and CC7.4 are adjacent criteria that cover two halves of the same loop.
CC7.3 is the evaluation criterion. It asks whether the organization evaluates security events to determine if they could have, or did, result in a failure to meet objectives, and whether action is taken when that threshold is crossed. In practical terms, CC7.3 is the triage discipline: the process that turns raw detections from the SIEM, the network monitoring platform, the ticketing queue, and inbound notifications from subservice providers into either closed cases or escalated incidents. It's the difference between clearing an alert and creating a case.
CC7.4 is the execution criterion. It asks whether confirmed incidents are handled through a defined response program that understands, contains, remediates, and communicates. The Points of Focus under CC7.4 cover the full lifecycle: roles and responsibilities, containment strategy, mitigation of ongoing effects, resolution, restoration of operations, remediation of identified vulnerabilities, communication, and periodic evaluation of whether the program itself is working.
Read together, CC7.3 asks: did you evaluate what came in? CC7.4 asks: what did you do once you confirmed it was an incident? The Phase 1 posts on logging and SIEM for bare metal and on ticketing, SLA, vulnerability, and incident response cover the upstream detection layer and the ticket mechanics. This post focuses on the IR program itself for environments where the hardware is in a colocation facility somebody else operates and the tenant is responsible for everything from the operating system upward.
Scope: What an On-Prem IR Program Actually Has to Cover
The lens that matters for an on-prem or colocation tenant is which incidents are theirs to respond to and which ones flow in as notifications from the provider.
- Digital incidents on tenant systems. Credential compromise, lateral movement, malware, ransomware detonation, insider misuse, web application exploitation, database exfiltration, EDR detections on workforce or production hosts. Detection comes from the SIEM, the IDS, EDR, or application logs. Containment is an identity, network, or host action the tenant performs on systems they operate. This is the bulk of the IR program.
- Tenant-owned hardware events. Compromise of out-of-band management interfaces (Dell iDRAC, HPE iLO, Supermicro IPMI) on servers the tenant owns and racks in the cage, suspected firmware tampering on tenant hardware, unauthorized boot media on tenant machines. Detection is harder because most SIEM rules don't watch the out-of-band layer, but the management plane belongs to the tenant and the response is the tenant's responsibility.
- Subservice provider notifications. Inbound notifications from the colocation provider about events in their domain: physical access incidents at the facility, environmental or power events, upstream network incidents, planned maintenance with security implications, or facility-level security findings. The tenant's job is to receive the notification, evaluate downstream impact, log it in the incident queue if any tenant-side action is needed, and incorporate the provider's incident report into the tenant's own evidence trail when the event touched tenant systems.
Physical IR at the facility is not a tenant scope item
Badge access anomalies at the building, unauthorized physical entry to common areas, fire suppression discharges, environmental control failures, and physical security incidents at the perimeter are the colocation provider's responsibility. The provider has its own IR program for these, audited under its own SOC 2 or CSAE 3000 report. The tenant's role is to be informed and to factor the provider's communication lines into the tenant's own IR plan, not to drill physical scenarios as if they were tenant-owned.
Technology: The Detection and Response Stack
On-prem teams typically build the detection layer from a mix of open source and commercial tools, and the stack is remarkably consistent across mid-market SaaS, MSPs, defence and manufacturing shops, and healthcare environments.
Host and log detection. Splunk is the enterprise standard where budget and scale support it. Wazuh is the agent-based open source SIEM common in mid-market shops, pulling host logs, file integrity events, and CIS benchmark deviations into a single console. Both feed the triage queue with the host-level signal.
Network detection. Security Onion is the common companion for network monitoring, running Zeek and Suricata against a span port or network tap, with full packet capture available when storage allows it. For environments where east-west traffic is the hardest thing to see, it's often the difference between catching lateral movement and missing it.
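To make the handoff from network detection to triage concrete, here is a minimal sketch that tails Suricata's eve.json event log (the JSON stream a Security Onion sensor's Suricata instance writes) and surfaces alert events. The log path is an assumption that varies by deployment, and a real pipeline would route these into the SIEM or ticketing queue rather than print them:

```python
import json
import time

EVE_LOG = "/nsm/suricata/eve.json"  # path is an assumption; deployments vary

def tail_alerts(path):
    """Yield Suricata alert events as they are appended to eve.json."""
    with open(path, "r") as f:
        f.seek(0, 2)  # start at end of file; only new events matter
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.5)
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed or partially written lines
            if event.get("event_type") == "alert":
                yield event

for alert in tail_alerts(EVE_LOG):
    sig = alert["alert"]["signature"]
    src, dst = alert.get("src_ip"), alert.get("dest_ip")
    print(f"[triage] {sig}: {src} -> {dst}")
```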
Endpoint detection and response. A commercial EDR such as CrowdStrike, SentinelOne, or Microsoft Defender for Endpoint sits on production and workforce endpoints, providing process telemetry and containment actions that agent-based SIEMs don't handle as cleanly.
Out-of-band management telemetry on tenant hardware. BMC, iDRAC, iLO, and IPMI audit trails belong in the IR evidence trail when the tenant owns the hardware. The management plane is typically a dedicated network whose logs nobody forwards anywhere by default, and that blind spot is exactly where an attacker in a serious hardware compromise will operate. Forwarding management-plane logs into the SIEM closes the gap.
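As a sketch of what closing that gap can look like, the following pulls BMC event log entries over the Redfish API and emits one syslog line per entry for the SIEM to index. The BMC address, service account, CA bundle path, and the iDRAC-style log service path are all assumptions; Redfish paths differ by vendor and firmware, so check the hardware's own API documentation:

```python
import logging
import logging.handlers

import requests

# All of these values are illustrative assumptions: the BMC address,
# credentials, CA bundle, and log service path vary by vendor and firmware.
BMC = "https://10.0.99.10"
SEL_PATH = "/redfish/v1/Managers/iDRAC.Embedded.1/LogServices/Sel/Entries"
SIEM_SYSLOG = ("siem.internal.example", 514)

log = logging.getLogger("bmc-forwarder")
log.setLevel(logging.INFO)
log.addHandler(logging.handlers.SysLogHandler(address=SIEM_SYSLOG))

resp = requests.get(
    BMC + SEL_PATH,
    auth=("svc-bmc-readonly", "example-password"),  # read-only service account
    verify="/etc/pki/bmc-ca.pem",  # BMC certs are often self-signed; pin the CA
    timeout=10,
)
resp.raise_for_status()

for entry in resp.json().get("Members", []):
    # One syslog line per event log entry so SIEM rules can match on it.
    log.info(
        "bmc_sel host=%s created=%s severity=%s message=%s",
        BMC, entry.get("Created"), entry.get("Severity"), entry.get("Message"),
    )
```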
Ticketing as the response system of record. JIRA, ServiceNow, Linear, or Azure DevOps Boards is where the incident ticket lives and where the evidence accumulates. The ticket links to the alert that triggered it, carries the containment and eradication steps, attaches forensic artifacts, and closes with the post-incident review. The ticketing and SLA workflow guide walks through how IR tickets thread through the same SLA model used for vulnerability remediation.
Pre-Staging Forensic Capacity
This is the single biggest operational difference between cloud IR and on-prem IR, and where most on-prem programs have the largest gap. In a cloud environment, forensic capacity is elastic: snapshot a suspect workload, spin up a forensic instance in a different account or availability zone, investigate without touching production. Compute and storage are on tap.
Capacity that isn't there before the event is capacity you don't have during the event
On-prem forensic capacity has to be pre-staged: a clean forensic workstation, write blockers, imaging tools, a memory capture kit, a segregated forensic VLAN, and printed chain of custody templates stored with the kit. The discipline is the same one that governs backup and disaster recovery: stage the capacity before the event that needs it.
On-prem environments don't get the cloud option. Forensic capacity has to be pre-staged before an incident happens, because once an incident is in progress, the team will not have time to provision a clean analysis environment. At minimum, the program needs a clean forensic workstation (with write blockers, imaging tools such as FTK Imager or dd, and Volatility for memory analysis), storage capacity to hold a disk image from the largest production system in scope plus headroom for a second concurrent case, a segregated forensic VLAN enforced at the edge firewall appliance, memory capture tooling pre-approved for remote capture, and printed chain of custody templates stored with the kit. A chain of custody filled out after the fact is a legal liability, not evidence.
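As one concrete piece of the kit's workflow, here is a hedged sketch of the acquisition-verification step: stream-hash a captured disk image and write a machine-readable acquisition record alongside it. The paths, case ID format, and helper names are hypothetical, and the record complements rather than replaces the printed chain of custody form:

```python
import hashlib
import json
import os
from datetime import datetime, timezone

def sha256_of(path, chunk=1024 * 1024):
    """Stream-hash a large disk image without loading it into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def record_acquisition(image_path, case_id, examiner):
    """Write a machine-readable acquisition record next to the image."""
    record = {
        "case_id": case_id,
        "examiner": examiner,
        "image": os.path.basename(image_path),
        "size_bytes": os.path.getsize(image_path),
        "sha256": sha256_of(image_path),
        "acquired_utc": datetime.now(timezone.utc).isoformat(),
    }
    with open(image_path + ".acquisition.json", "w") as f:
        json.dump(record, f, indent=2)
    return record

# Hypothetical usage after imaging a suspect disk with dd or FTK Imager:
# record_acquisition("/forensics/CASE-2024-007/sda.img", "CASE-2024-007", "j.doe")
```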
The Six-Phase IR Playbook
The phases below map cleanly to the NIST SP 800-61 lifecycle most IR programs already use: Preparation, Detection and Analysis, Containment, Eradication, Recovery, and Post-Incident Activity. (NIST groups Containment, Eradication, and Recovery into a single phase; the playbook below splits them so each has its own owner and evidence.) Functional naming keeps the program readable across frameworks.
1. Detection. The first moment a security event becomes something the team is aware of. Detection sources include Splunk or Wazuh alerts, Security Onion network detections, EDR detections, application logs, user reports, out-of-band management plane events, and inbound notifications from the colocation provider. Every source needs a path into the triage queue. A source that drops events into a mailbox nobody monitors is not a detection source.
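A minimal sketch of that path, assuming a generic REST ticketing endpoint (the URL and field names here are invented for illustration; JIRA, ServiceNow, and Linear each have their own API shapes). Every detection source, including inbound provider notifications, funnels through one function that opens a triage case:

```python
import requests

# The ticketing endpoint and field names are assumptions for illustration.
TRIAGE_API = "https://tickets.internal.example/api/triage-cases"

def open_triage_case(source, title, detail, observables):
    """Create one triage case per detection, whatever the source.

    Sources include SIEM alerts, Security Onion detections, EDR
    detections, application logs, user reports, BMC events, and
    inbound notifications from the colocation provider.
    """
    case = {
        "source": source,
        "title": title,
        "detail": detail,
        "observables": observables,  # IPs, hosts, accounts reviewed
        "state": "open",
    }
    resp = requests.post(TRIAGE_API, json=case, timeout=10)
    resp.raise_for_status()
    return resp.json()["id"]

# A provider notification gets the same treatment as a SIEM alert:
# open_triage_case("colo-provider", "Badge access anomaly at facility",
#                  "Provider NOC notification, no tenant impact suspected", [])
```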
2. Triage (the CC7.3 control). Every event that warrants investigation gets a case with a timestamp, the detection that triggered it, the events and observables reviewed, and the conclusion. Even when the conclusion is "no action needed, false positive from automated scanning," the documented triage is what the auditor is looking for. A clean dashboard with zero alerts is a red flag. A dashboard with alerts that were investigated and closed with rationale is evidence of a functioning monitoring program.
From real engagements: a SaaS team had deployed a SIEM and IDS, and the monitoring was working. Real events were showing up, including SQL injection attempts and automated scanning. The team was clearing alerts from the dashboard without documenting any investigation, which meant they had strong monitoring evidence but an empty response trail. The fix was a simple process change: create a case for every alert that warrants investigation, even when the conclusion is no action needed. Weekly dashboard reviews during the observation period built a rhythm of documented triage that accumulated into a strong CC7.3 evidence trail.
3. Containment. Containment happens on systems the tenant operates: network isolation, credential rotation, disabling an account, killing a malicious process, quarantining a host through the EDR, or pulling a tenant-owned server off the network through its switch port. The containment decision has to balance preserving evidence (especially memory) against stopping the damage, and the playbook should state that tradeoff explicitly so the on-call responder isn't making it alone at 2am.
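One common tenant-side containment action is administratively shutting the switch port behind a suspect tenant-owned server, which stops lateral movement while leaving the host's memory intact for capture. A sketch using the third-party Netmiko library, with the switch model, port name, and credentials as illustrative assumptions:

```python
from netmiko import ConnectHandler  # third-party: pip install netmiko

def isolate_switch_port(switch_ip, port, credentials):
    """Containment: administratively shut the access port for a suspect
    host, preserving its memory state for forensic capture."""
    conn = ConnectHandler(
        device_type="cisco_ios",  # assumption; match the actual switch
        host=switch_ip,
        username=credentials["user"],
        password=credentials["password"],
    )
    output = conn.send_config_set([f"interface {port}", "shutdown"])
    conn.save_config()
    conn.disconnect()
    return output  # attach to the incident ticket as containment evidence

# isolate_switch_port("10.0.1.2", "GigabitEthernet1/0/12",
#                     {"user": "ir-netops", "password": "..."})
```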
4. Eradication. Remove the root cause: uninstall malware, rebuild a host from a known-good image, patch an exploited vulnerability, rotate compromised credentials and keys, or reflash a suspect BMC on a tenant-owned server. The step should not begin until containment is confirmed and forensic capture is complete where warranted, and the ticket should record each decision point.
5. Recovery. Restore the affected systems to production in a verified state: a clean backup restore, staged re-enablement, a verification check against the restored system, and close monitoring for recurrence. The same backup capacity that serves disaster recovery on-prem serves recovery here. Verification belongs in the ticket, not in someone's head.
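A sketch of what a scripted verification check can look like, assuming a hash manifest was captured at backup time (the manifest format and paths are invented for illustration); the output belongs in the recovery step of the ticket:

```python
import hashlib
import json

def verify_restore(manifest_path):
    """Compare restored files against a known-good hash manifest captured
    when the backup was taken. Assumed format: {"/path/file": "sha256hex"}."""
    with open(manifest_path) as f:
        expected = json.load(f)
    failures = []
    for path, want in expected.items():
        h = hashlib.sha256()
        try:
            with open(path, "rb") as f:
                while block := f.read(1024 * 1024):
                    h.update(block)
        except FileNotFoundError:
            failures.append((path, "missing"))
            continue
        if h.hexdigest() != want:
            failures.append((path, "hash mismatch"))
    return failures

# failures = verify_restore("/backups/CASE-2024-007/manifest.json")
# An empty list is the verification evidence; anything else blocks re-enablement.
```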
6. Lessons Learned. A short post-incident review captures the timeline, root cause, what went well, what didn't, and the changes required to prevent recurrence. Those changes flow out as tickets: a detection rule update, a playbook revision, a configuration change, a new test in the tabletop library. Mature programs get stronger from real-world input, and the post-incident review is the mechanism that makes it happen.
Communication Lines With the Colocation Provider
When the data center is operated by someone else, the IR plan has to capture how notifications flow in both directions across the boundary. The tenant doesn't run physical IR for the facility. The tenant does need a documented communication protocol so the provider's incidents reach the tenant in time to assess downstream impact, and so the tenant's incidents can be escalated to the provider when they touch shared infrastructure.
Document the communication lines before the incident, not during it
For Canadian colocation tenancy, Leaseweb Canada and OVH Cloud are the two archetypal examples. Both publish contact paths for support, abuse, and security escalations. A mature IR plan captures the inbound and outbound contact details, the categories of events the provider commits to notify the tenant about, and the response time the provider commits to for each category. This is a planning step, not a drill.
The communication protocol the tenant captures in the IR plan covers four things:
- Inbound notifications from the provider to the tenant. What categories of provider-side events trigger a notification (physical access incidents, environmental events, upstream network incidents, planned maintenance with security implications, facility-level security findings), the channel the notification arrives on (email, phone, portal), and the committed response time.
- Outbound escalations from the tenant to the provider. Who at the provider to contact for an abuse or DDoS incident originating upstream, network connectivity failures, suspected facility-level issues affecting the tenant's cage, or any tenant incident that warrants provider awareness. Phone numbers, email addresses, ticketing portal URLs, and the committed response time per channel.
- Evidence the provider supplies for tenant audit purposes. The provider's own SOC 2 or CSAE 3000 trust report covers the physical IR controls. Maintenance and incident notification records, badge access logs where contracted, network flow data where contracted, and the trust report itself become tenant-side evidence for the subservice vendor management story.
- Roles on the tenant side that handle provider communication. One named person on the tenant team who receives provider notifications, one named backup, and a documented routing rule that puts provider notifications into the same triage queue as internal detections so they don't sit in a mailbox nobody reads. A minimal sketch of that routing rule follows this list.
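A minimal sketch of that routing rule, assuming an IMAP notification mailbox and a generic triage API (all hostnames, credentials, and field names are illustrative): poll the mailbox on a schedule and file each unread provider notification as a triage case.

```python
import email
import imaplib

import requests

# Mailbox, credentials, and API details are illustrative assumptions.
IMAP_HOST = "mail.internal.example"
TRIAGE_API = "https://tickets.internal.example/api/triage-cases"

def route_provider_notifications():
    """File each unread provider notification as a triage case so it
    lands in the same queue as internal detections."""
    imap = imaplib.IMAP4_SSL(IMAP_HOST)
    imap.login("colo-notifications", "example-password")
    imap.select("INBOX")
    _, data = imap.search(None, "UNSEEN")
    for num in data[0].split():
        _, msg_data = imap.fetch(num, "(RFC822)")
        msg = email.message_from_bytes(msg_data[0][1])
        requests.post(TRIAGE_API, json={
            "source": "colo-provider",
            "title": msg["Subject"],
            "detail": f"From {msg['From']} at {msg['Date']}",
            "state": "open",
        }, timeout=10).raise_for_status()
        imap.store(num, "+FLAGS", "\\Seen")  # mark routed, never just read
    imap.logout()
```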
The provider relationship also shapes what's in scope for the tenant's IR evidence. The physical controls inside the data center are a subservice organization activity, meaning the provider's own SOC 2 or CSAE 3000 report carries the physical security criteria for the shared facility. The user entity controls the tenant runs on top are still the tenant's: access to the cage, the configuration of the out-of-band management plane on tenant-owned hardware, and the communication protocols when a facility event intersects tenant systems. The subservice vendor management post walks through that boundary in depth.
Tabletops: Drill the Scenarios the Tenant Actually Owns
IR tabletops should drill the scenarios the tenant team actually has to respond to, not the ones that belong to the colocation provider. Most cloud-native libraries cycle through credential leak, ransomware, and DDoS. An on-prem or colocation tenant library extends that with scenarios the tenant team owns end to end, plus a communication drill that exercises the inbound and outbound notification flow with the provider.
A realistic tenant tabletop library:
- Credential compromise and lateral movement
- Ransomware detonation on an internal host
- Web application exploitation of a public-facing service
- Insider data exfiltration
- Suspected firmware or BMC compromise on a tenant-owned server
- Stolen or missing tenant equipment (production host, backup media, workforce laptop)
- Privacy incident involving personal information subject to PIPEDA or Law 25 breach notification
- Communication drill with the colocation provider. Walk through receiving an inbound notification from the provider about an event in their domain, deciding what action the tenant needs to take, and logging the notification in the incident queue. Walk through escalating an outbound incident to the provider when shared infrastructure is involved. The drill exercises the IR plan's communication protocol, not the provider's physical IR.
Running two or three of these per quarter, rotating through the library over a year, produces both a better-prepared team and the CC7.4 evidence that the program is periodically evaluating itself. A tabletop that produced updates to the playbook, detection rules, or communication templates is a stronger audit artifact than one that produced a meeting note.
Evidence: What the Auditor Samples
IR evidence under CC7.3 and CC7.4 follows the same three-part continuous evidence pattern used across the rest of the program.
Configuration evidence proves the program exists: the incident response policy, the playbook library, the roles and responsibilities documented in the Security Program Manual, the colocation provider communication protocol and contact list, the detection rule inventory, and the forensic capacity register. Configuration evidence shows the program is designed.
Execution history proves the process runs on cadence: the triage case log covering the observation period, the incident ticket history with timestamps and state transitions, the weekly or monthly monitoring review records, the tabletop exercise schedule and results, the post-incident review records, and the playbook revision history. Execution history shows the program is operating.
Representative samples prove the output is meaningful. The auditor will sample specific artifacts: a closed triage case with documented investigation notes, a confirmed incident walked end to end from detection through post-incident review, a tabletop exercise with the follow-up changes tracked to completion, and, where applicable, an inbound notification from the colocation provider that produced a documented tenant-side response. Representative samples show the program is producing audit-grade output, not just paperwork.
The failure mode to avoid: a well-written policy, a thoughtful playbook library, and an empty ticket history. Auditors read that as a program that exists on paper but doesn't run.
People: Who Owns What When Things Go Wrong
The IR roles that hold up under audit are fewer and clearer than most org charts suggest.
- An incident commander makes the containment and communication decisions during an active incident on tenant systems. One person, always named, with a documented backup. For most on-prem shops, this is the senior infrastructure or security lead, with a fractional security team member as the designated backup on retainer.
- A technical lead executes the containment, eradication, and recovery steps. Often the same person who owns the affected system day to day.
- A communications lead handles internal updates, customer communications, and regulator notifications where required. On small teams this is often the CTO or the head of customer success.
- An evidence custodian maintains the chain of custody, the forensic artifacts, and the incident ticket. Separating this role from the technical lead is a meaningful segregation-of-duties control, even when the team is small.
- A provider liaison handles inbound notifications from the colocation provider and outbound escalations to the provider. On small teams the incident commander often holds this hat too, but the routing rule that gets provider notifications into the triage queue belongs to whoever holds the role.
Programs run on cadence, not intention
The biggest failure mode isn't poor technical execution. It's a program where the on-call rotation doesn't know who the incident commander is, the colocation provider contact list lives in one engineer's notebook, and the forensic workstation hasn't been powered on in six months.
Where This Lands in an Effective Security Program
Teams that pass on-prem SOC 2 cleanly on CC7.3 and CC7.4 aren't the ones with the fanciest EDR or the most tabletops on the calendar. They're the ones whose program is honest about operational reality: detection sources routed into a single triage queue, forensic capacity pre-staged before it's needed, a clear boundary between tenant-owned IR and the colocation provider's physical IR, communication lines with the provider documented in advance, tabletops that drive real changes, and evidence captured as a byproduct of running the process rather than assembled the week before the audit.
Build the program once with a workflow that matches how the team actually runs. Map frameworks onto it without restarting. The same IR program satisfies the response outcomes in SOC 2, the incident management outcomes in ISO 27001, and the incident handling controls in CPCSC and ITSP.10.171. Extend, don't restart.
Build an IR Program That Holds Up at 3am
Truvo designs incident response programs as part of an effective security program with documented colocation communication lines and continuous audit evidence.
Further Reading
- SOC 2 Readiness for Bare Metal SaaS: cluster overview and what an on-prem engagement actually looks like
- SOC 2 Logging and SIEM for Bare Metal: the detection layer that feeds the triage queue
- SOC 2 Ticketing, SLA, Vulnerability, and Incident Response: how incident tickets thread through the same SLA model as vulnerability remediation
- SOC 2 Vendor Management When Your Data Center Is a Subservice Organization: where the boundary with the colocation provider sits
- How to Choose a SOC 2 Consultant: what to look for if the IR program needs outside help
How CC7.3 and CC7.4 Points of Focus Show Up in Incident Response
The 2017 Trust Services Criteria (with revised Points of Focus, 2022) list the characteristics auditors evaluate when assessing whether incident response is suitably designed and operating effectively. CC7.3 carries three core Points of Focus. CC7.4 carries eleven. Here's how each one maps to the on-prem IR program described above, paraphrased in Truvo's words.
CC7.3, Evaluating Security Events
Responds to Security Incidents. The program has documented procedures for responding to security incidents and periodically evaluates whether those procedures are still effective. In practice this is the playbook library plus the tabletop cadence, and the tabletops are how the program confirms the playbooks still work.
Communicates and Reviews Detected Security Events. Detected events are communicated to and reviewed by the people responsible for the security program, and action is taken where required. In the workflow above this is the triage case, the escalation path, and the weekly review rhythm that ensures no detection sits uninvestigated.
Develops and Implements Procedures to Analyze Security Incidents. There are procedures for analyzing incidents and determining system impact. For on-prem this is the forensic capability plus the analysis step inside the ticket, with scope notes that name what was examined and what the findings were.
CC7.4, Executing Incident Response
Assigns Roles and Responsibilities. Roles for designing, running, and executing the IR program are assigned, including external resources where needed. Incident commander, technical lead, communications lead, evidence custodian, provider liaison, and the fractional security team on retainer all belong here.
Obtains Understanding of Nature of Incident and Determines Containment Strategy. The program understands how the incident occurred and which resources were affected, and uses that understanding to pick a containment approach and response time frame. This is the moment the incident commander decides whether to preserve memory before pulling a host off the network, and the rationale lives in the ticket.
Contains, Mitigates, and Resolves Security Incidents. Three adjacent Points of Focus cover the active-response phase: procedures to contain incidents that actively threaten objectives, to mitigate effects while they're in progress, and to resolve them through closure of vulnerabilities, removal of unauthorized access, and other remediation. Network isolation, credential rotation, EDR quarantine, and tenant-side containment actions are the typical tools, captured as state transitions on the incident ticket.
Restores Operations. Procedures restore data and business operations to an interim state that meets objectives. Recovery from backups, staged re-enablement, and verification before handing a system back to production.
Remediates Identified Vulnerabilities and Communicates Remediation Activities. Vulnerabilities surfaced during the incident are remediated through documented activity, and those activities are communicated in line with the IR program. An unpatched component, a misconfigured firewall rule, or a compromised BMC on tenant hardware flows into the change and patch program as a tracked ticket.
Develops and Implements Communication of Security Incidents. Protocols communicate timely information about incidents and actions taken to affected parties. Internal updates, customer notifications, regulatory notifications under PIPEDA, Law 25, or sectoral regimes, and inbound or outbound coordination with the colocation provider all flow from this Point of Focus.
Evaluates the Effectiveness of Incident Response and Periodically Evaluates Incidents. The design of IR activities is periodically evaluated for effectiveness, and management reviews incidents across security, availability, processing integrity, confidentiality, and privacy, identifying system changes based on patterns and root causes. Quarterly tabletop results, post-incident reviews, and an annual program review serve the first. A quarterly incident trend review that feeds the risk register closes the loop on the second.
Explore further in Framework Explorer (CC7.3 · CC7.4) for the full requirement, implementation guidance, evidence types, and cross-framework mappings.
Source: AICPA TSP Section 100, 2017 Trust Services Criteria with Revised Points of Focus (2022). Point of Focus characteristics described in Truvo's words and mapped to an on-prem incident response implementation pattern. Consult the source document for the official AICPA text.
Frequently Asked Questions
What do SOC 2 CC7.3 and CC7.4 require for incident response?
CC7.3 requires the organization to evaluate security events to determine whether they could have, or did, result in a failure to meet objectives, and to take action when that threshold is crossed. CC7.4 requires the organization to respond to confirmed incidents through a defined response program that understands, contains, remediates, and communicates. Neither criterion prescribes specific tools or timelines. Both expect the program to match how the team actually operates and to produce continuous evidence across the observation period.
Is physical incident response at the colocation facility the tenant's responsibility?
No. Physical incident response at the facility, including badge access incidents, unauthorized physical entry, environmental and power events, and facility-level security incidents, is the colocation provider's responsibility. The provider runs its own IR program for these scenarios and has them audited under its own SOC 2 or CSAE 3000 report. The tenant's role is to receive notifications when something at the facility might affect tenant systems, evaluate the downstream impact, and incorporate the notification into the tenant's own incident communication flow. The tenant's IR program drills the scenarios the tenant actually owns.
How is on-prem incident response different from cloud incident response?
Forensic capacity has to be pre-staged instead of spun up on demand, because on-prem teams cannot provision a clean analysis environment in another availability zone mid-incident. Detection typically runs on Splunk, Wazuh, Security Onion, and commercial EDR rather than cloud-native services. Out-of-band management telemetry from BMC, iDRAC, iLO, and IPMI on tenant-owned hardware belongs in the IR evidence trail. And the IR plan needs documented communication lines with the colocation provider so notifications flow in both directions when an event crosses the boundary between tenant-owned systems and the facility.
How do we coordinate incident response with a colocation provider like Leaseweb Canada or OVH Cloud?
Document the communication protocol in the IR plan up front, before any incident happens. The protocol covers inbound notifications from the provider (what events trigger a notification, the channel, the response time), outbound escalations from the tenant to the provider (who to contact for an upstream network issue or any tenant incident that warrants provider awareness), the evidence the provider supplies for tenant audit purposes, and the named tenant-side roles that handle provider communication. When the colocation provider is Leaseweb Canada or OVH Cloud, the contact paths are published and can be captured in the IR plan rather than figured out mid-incident.
What evidence should be staged before a SOC 2 IR walkthrough?
The incident response policy, the playbook library, the roles and responsibilities documentation, the colocation provider communication protocol and contact list, the forensic capacity register, the triage case log for the observation period, the incident ticket history with timestamps and state transitions, the tabletop exercise schedule and results, and the post-incident review records. Auditors sample across these artifacts, so continuity across the observation period matters more than depth in any single record.
About the Author
Former security architect for Bank of Canada and Payments Canada. 20+ years building compliance programs for critical infrastructure.