Introduction
A Security Operations Center (SOC) is the nerve center of an organization’s cybersecurity posture. Whether you are building a SOC from scratch inside a fast-growing fintech company, or maturing an existing team at an enterprise, the fundamentals remain the same: the right people, effective processes, and well-tuned technology must work in concert to detect, investigate, and respond to threats before they become breaches.
In this article, I draw on my experience deploying and operating SOC capabilities at PostEx — a fintech and logistics company — to walk through the practical realities of building a high-performance SOC. We will cover SOC tiers and team structure, SIEM deployment and tuning, alert triage workflows, escalation procedures, and the KPIs that actually matter to security managers and CISOs.
The goal of a SOC is not to generate alerts. The goal is to reduce the time between a threat entering your environment and the moment you contain it.
SOC Tier Structure and Responsibilities
Most mature SOCs operate on a tiered model that routes alerts to the appropriate level of analyst based on complexity and required expertise. This prevents your senior engineers from drowning in low-fidelity noise while ensuring high-fidelity alerts receive immediate expert attention.
Tier 1 — Alert Monitoring and Initial Triage
Tier 1 analysts are the first eyes on every alert generated by the SIEM. Their primary responsibilities include:
- Monitoring SIEM dashboards for active alerts in real time
- Performing initial alert enrichment — querying threat intel, OSINT, and internal asset databases
- Determining whether an alert is a true positive, false positive, or requires escalation
- Documenting initial findings and opening tickets for escalation
- Executing pre-defined response playbooks for known alert types
At this tier, speed is paramount. Average time to acknowledge should be under 15 minutes for high-severity alerts. The tools Tier 1 analysts use daily include the SIEM (Splunk, ELK, or QRadar), EDR console, threat intelligence platforms, and ticketing systems like JIRA or ServiceNow.
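The acknowledgement target above is easy to check mechanically. The following is a minimal sketch, assuming alerts are available as dictionaries with hypothetical `id`, `severity`, `created_at`, and `acked_at` keys (not the schema of any particular SIEM or ticketing system); it flags high-severity alerts that were acknowledged late or are still waiting past the 15-minute mark.

```python
from datetime import datetime, timedelta

# The 15-minute target from the text, applied to HIGH and CRITICAL alerts.
ACK_SLA = timedelta(minutes=15)
SLA_SEVERITIES = {"HIGH", "CRITICAL"}


def ack_breaches(alerts, now):
    """Return IDs of high-severity alerts that missed the acknowledgement SLA.

    Each alert is a dict with hypothetical keys: id, severity, created_at,
    and acked_at (None while nobody has picked the alert up).
    """
    breaches = []
    for alert in alerts:
        if alert["severity"] not in SLA_SEVERITIES:
            continue
        # Alerts still waiting for an analyst are measured against "now".
        reference = alert["acked_at"] or now
        if reference - alert["created_at"] > ACK_SLA:
            breaches.append(alert["id"])
    return breaches


# Made-up sample data for illustration.
now = datetime(2024, 1, 1, 12, 0)
alerts = [
    {"id": "A1", "severity": "HIGH",
     "created_at": datetime(2024, 1, 1, 11, 30), "acked_at": datetime(2024, 1, 1, 11, 40)},
    {"id": "A2", "severity": "CRITICAL",
     "created_at": datetime(2024, 1, 1, 11, 40), "acked_at": None},
    {"id": "A3", "severity": "LOW",
     "created_at": datetime(2024, 1, 1, 9, 0), "acked_at": None},
]
print(ack_breaches(alerts, now))  # ['A2']
```

A check like this could run on a schedule and post breaches to the team channel, so a missed acknowledgement surfaces before the shift lead has to go looking for it.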
Tier 2 — Incident Investigation and Analysis
Tier 2 analysts handle escalated incidents and perform deeper forensic analysis. They are expected to:
- Conduct host and network forensics on impacted systems
- Correlate events across multiple log sources and timeframes
- Determine attack scope, root cause, and affected assets
- Recommend or execute containment and remediation actions
- Produce detailed incident reports for stakeholders
Tier 2 analysts should be proficient in log analysis, memory forensics, network packet analysis (Wireshark, Zeek), and malware behavior analysis. Their work feeds directly into the detection engineering pipeline by identifying detection gaps and the root causes of false positives.
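Correlating events across log sources usually starts with putting them on one timeline. As a rough sketch of that step, assuming each source has already been exported as a time-sorted list of dictionaries with hypothetical `time`, `host`, `source`, and `detail` keys, the merge and a per-host window filter could look like this:

```python
import heapq
from datetime import datetime


def build_timeline(*sources):
    """Merge per-source event lists, each already sorted by time, into one timeline."""
    return list(heapq.merge(*sources, key=lambda event: event["time"]))


def host_window(timeline, host, start, end):
    """Keep only events for one host inside the investigation window."""
    return [e for e in timeline if e["host"] == host and start <= e["time"] <= end]


# Made-up events from three hypothetical sources.
edr = [
    {"time": datetime(2024, 1, 1, 10, 2), "host": "ws-17", "source": "edr",
     "detail": "powershell.exe spawned by winword.exe"},
]
proxy = [
    {"time": datetime(2024, 1, 1, 10, 1), "host": "ws-17", "source": "proxy",
     "detail": "GET to newly registered domain"},
    {"time": datetime(2024, 1, 1, 10, 5), "host": "ws-22", "source": "proxy",
     "detail": "GET to intranet portal"},
]
auth = [
    {"time": datetime(2024, 1, 1, 10, 4), "host": "ws-17", "source": "auth",
     "detail": "4624 logon to file server"},
]

timeline = build_timeline(edr, proxy, auth)
start, end = datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 10)
for event in host_window(timeline, "ws-17", start, end):
    print(event["time"].time(), event["source"], event["detail"])
```

In practice the SIEM does this merge for you in a search, but having the same logic in a script is useful when part of the evidence lives outside the SIEM, for example in a disk image or a packet capture.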
Tier 3 — Threat Hunting and Detection Engineering
Tier 3 is proactive. Rather than waiting for alerts, Tier 3 analysts actively hunt for threats that have bypassed automated detection. They also own the detection engineering function — writing, testing, and tuning detection rules that feed the SIEM.
The cycle from Tier 3 to Tier 1 is critical: threat hunters discover novel attack techniques, detection engineers encode them as rules, and Tier 1 monitors the alerts those rules generate at scale. This continuous improvement loop is what separates a mature SOC from a reactive one.
SIEM as the SOC Foundation
The SIEM is the central nervous system of any SOC. At PostEx, we deployed Splunk Enterprise to centralize log aggregation from:
- Local network infrastructure — switches, access points, internal servers
- Global network components — WAN links, remote sites, cloud-connected infrastructure
- Core routers — MikroTik and Cisco devices feeding NetFlow and syslog
- Infrastructure assets — Windows endpoints via Universal Forwarders, Linux servers via syslog-ng, VMware ESXi hosts
Prior to Splunk, we operated an ELK Stack + Wazuh SIEM that covered 500+ endpoints. The Splunk migration expanded visibility significantly, adding network device telemetry, router logs, and cloud workload data that was previously out of scope.
Critical Log Sources
Not all log sources are created equal. In our environment, the following sources gave the highest signal-to-noise ratio, ranked by detection value:
1. Windows Security Event Logs (Event IDs 4624, 4625, 4672, 4688, 4698, 4702)
2. DNS Query Logs (C2 detection, DNS tunneling)
3. Firewall/Proxy Logs (lateral movement, exfiltration detection)
4. EDR Telemetry (process execution, file writes, network connections)
5. Active Directory Logs (privilege escalation, Kerberoasting)
6. NetFlow/IPFIX (internal network anomalies)
7. VPN Authentication Logs (credential stuffing, impossible travel)
8. Web Application Firewall Logs (SQLi, XSS, brute force)
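To illustrate why the Windows Security Event Log sits at the top of this list, here is a minimal sketch of one classic detection built on Event IDs 4625 (failed logon) and 4624 (successful logon): a burst of failures followed by a success for the same account. The event dictionaries, the threshold of 5, and the 10-minute window are assumptions for the example, not values from our production rules.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

FAILED_LOGON = 4625
SUCCESSFUL_LOGON = 4624


def brute_force_successes(events, threshold=5, window=timedelta(minutes=10)):
    """Return (user, time) pairs where a success follows >= threshold recent failures.

    events: iterable of dicts sorted by time, with hypothetical keys
    event_id, user, and time.
    """
    recent_failures = defaultdict(deque)
    hits = []
    for event in events:
        user, when = event["user"], event["time"]
        failures = recent_failures[user]
        # Drop failures that have aged out of the window.
        while failures and when - failures[0] > window:
            failures.popleft()
        if event["event_id"] == FAILED_LOGON:
            failures.append(when)
        elif event["event_id"] == SUCCESSFUL_LOGON:
            if len(failures) >= threshold:
                hits.append((user, when))
            failures.clear()
    return hits


# Made-up sample: alice fails five times and then succeeds, bob fails twice.
base = datetime(2024, 1, 1, 9, 0)
events = [{"event_id": FAILED_LOGON, "user": "alice", "time": base + timedelta(minutes=m)}
          for m in range(5)]
events += [{"event_id": FAILED_LOGON, "user": "bob", "time": base + timedelta(minutes=m)}
           for m in (1, 2)]
events += [
    {"event_id": SUCCESSFUL_LOGON, "user": "alice", "time": base + timedelta(minutes=5)},
    {"event_id": SUCCESSFUL_LOGON, "user": "bob", "time": base + timedelta(minutes=6)},
]
events.sort(key=lambda e: e["time"])
hits = brute_force_successes(events)
print(hits)
```

In a real deployment this logic would live in a SIEM correlation search rather than a script; the sketch only shows the shape of the rule.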
Alert Triage Workflow
A well-designed alert triage workflow is the difference between an overwhelmed SOC drowning in noise and an efficient team that closes incidents quickly. The following is the workflow we implemented at PostEx:
ALERT TRIAGE WORKFLOW
[SIEM Alert Generated]
          |
          v
[Tier 1 Acknowledges < 15 min for HIGH/CRITICAL]
          |
          v
[Initial Enrichment]
  - Threat Intel lookup (IP, domain, hash)
  - Asset lookup (owner, criticality, exposure)
  - User context (role, recent activity, anomaly score)
          |
          v
[Decision]
  False Positive? ---> [Document FP reason, tune rule, close]
  True Positive?  ---> [Escalate to Tier 2, open incident ticket]
  Uncertain?      ---> [Escalate with context, set SLA timer]
          |
          v
[Tier 2 Investigates]
  - Forensic deep-dive
  - Scope determination
  - Containment decision
          |
          v
[Containment & Remediation]
  - Isolate host / block IP / revoke credentials
  - Patch, clean, or rebuild affected systems
  - Verify containment effectiveness
          |
          v
[Post-Incident Report]
  - Timeline reconstruction
  - Root cause analysis
  - Detection gap identification
  - Lessons learned -> Detection Engineering
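The decision step in the workflow above can be sketched in a few lines. This is an illustration only: the two enrichment flags and the rule that conflicting or missing evidence counts as "uncertain" are my assumptions for the example, and a real playbook would weigh more signals (asset criticality, user context, anomaly score).

```python
from dataclasses import dataclass

# Next steps copied from the three decision branches in the workflow.
NEXT_STEP = {
    "false_positive": "Document FP reason, tune rule, close",
    "true_positive": "Escalate to Tier 2, open incident ticket",
    "uncertain": "Escalate with context, set SLA timer",
}


@dataclass
class Enrichment:
    intel_hit: bool      # threat intel flagged the IP, domain, or hash
    known_benign: bool   # matches a documented false-positive pattern


def classify(enrichment: Enrichment) -> str:
    """Return a verdict; conflicting or missing evidence stays 'uncertain'."""
    if enrichment.intel_hit and not enrichment.known_benign:
        return "true_positive"
    if enrichment.known_benign and not enrichment.intel_hit:
        return "false_positive"
    return "uncertain"


for case in (Enrichment(True, False), Enrichment(False, True), Enrichment(False, False)):
    verdict = classify(case)
    print(verdict, "->", NEXT_STEP[verdict])
```

Keeping "uncertain" as an explicit outcome matters: it gives Tier 1 a legitimate way to hand off an alert with context instead of forcing a guess.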
MITRE ATT&CK Integration
Mapping SOC operations to the MITRE ATT&CK framework provides a structured vocabulary for describing adversary behavior and measuring your detection coverage. Every detection rule in our environment is tagged with a MITRE ATT&CK technique ID.
| ATT&CK Tactic | Technique | Detection Source | Coverage |
|---|---|---|---|
| Initial Access | T1190 Exploit Public-Facing App | WAF, Web Logs | 🟢 High |
| Execution | T1059 Command & Scripting Interpreter | EDR, Windows Events | 🟢 High |
| Persistence | T1053 Scheduled Task/Job | Windows Events 4698/4702 | 🟢 High |
| Privilege Escalation | T1078 Valid Accounts | AD Logs, SIEM | 🟡 Medium |
| Defense Evasion | T1070 Indicator Removal | EDR, File Monitoring | 🟡 Medium |
| Lateral Movement | T1021 Remote Services | Firewall, NetFlow | 🟢 High |
| Exfiltration | T1041 Exfil Over C2 Channel | Proxy, DNS Logs | 🟡 Medium |
| C2 | T1071 Application Layer Protocol | DNS, HTTP Logs | 🟡 Medium |
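Because every rule carries a technique ID, coverage questions become simple queries. The sketch below assumes the table above is available as tuples and uses a made-up list of rule records (the rule names and the `attack_ids` key are hypothetical); it lists the techniques below the target coverage level and the techniques in scope that no rule is tagged with yet.

```python
from collections import Counter

# Rows from the coverage table above: (tactic, technique ID, coverage level).
COVERAGE = [
    ("Initial Access", "T1190", "High"),
    ("Execution", "T1059", "High"),
    ("Persistence", "T1053", "High"),
    ("Privilege Escalation", "T1078", "Medium"),
    ("Defense Evasion", "T1070", "Medium"),
    ("Lateral Movement", "T1021", "High"),
    ("Exfiltration", "T1041", "Medium"),
    ("C2", "T1071", "Medium"),
]

# Hypothetical detection rules, each tagged with ATT&CK technique IDs.
RULES = [
    {"name": "waf_sqli_burst", "attack_ids": ["T1190"]},
    {"name": "office_spawns_shell", "attack_ids": ["T1059"]},
    {"name": "encoded_powershell", "attack_ids": ["T1059"]},
    {"name": "schtasks_created", "attack_ids": ["T1053"]},
]


def below_target(rows, target="High"):
    """Return (tactic, technique) pairs whose coverage is not at the target level."""
    return [(tactic, tid) for tactic, tid, level in rows if level != target]


def untagged_techniques(rows, rules):
    """Return technique IDs from the table that no rule is tagged with."""
    counts = Counter(tid for rule in rules for tid in rule["attack_ids"])
    return [tid for _, tid, _ in rows if counts[tid] == 0]


print(below_target(COVERAGE))
print(untagged_techniques(COVERAGE, RULES))
```

The same two lists make a reasonable starting backlog for detection engineering.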
SOC KPIs That Matter
Measuring SOC effectiveness is critical for justifying investment and identifying improvement areas. The metrics that carry the most weight with CISOs and security managers are:
- Mean Time to Detect (MTTD): Average time from threat entry to alert generation. Target: <60 minutes.
- Mean Time to Respond (MTTR): Average time from detection to containment. Target: <4 hours for critical incidents.
- Mean Time to Notify (MTTN): Average time from alert to analyst notification. Our automation achieved <2 minutes.
- False Positive Rate: Percentage of alerts that are not genuine threats. We reduced this by 40% through rule tuning.
- Dwell Time: How long an attacker was inside the environment before detection, tracked per incident rather than as an average like MTTD. Target: <24 hours.
- Escalation Rate: Percentage of Tier 1 alerts escalated to Tier 2. A persistently high rate usually points to rule quality issues.
- Incidents Closed Per Analyst Per Week: Team productivity metric.
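Most of these metrics fall out of a handful of timestamps. Here is a minimal sketch of the arithmetic, assuming incident and alert records exported as dictionaries with hypothetical keys (`entered_at`, `detected_at`, `contained_at`, `verdict`, `escalated`); your ticketing system will name them differently.

```python
from datetime import datetime
from statistics import mean


def minutes(delta):
    return delta.total_seconds() / 60


def soc_kpis(incidents, alerts):
    """Compute MTTD, MTTR, false positive rate, and escalation rate."""
    return {
        "mttd_min": mean(minutes(i["detected_at"] - i["entered_at"]) for i in incidents),
        "mttr_min": mean(minutes(i["contained_at"] - i["detected_at"]) for i in incidents),
        "false_positive_rate": sum(a["verdict"] == "false_positive" for a in alerts) / len(alerts),
        "escalation_rate": sum(a["escalated"] for a in alerts) / len(alerts),
    }


# Made-up sample records for illustration.
incidents = [
    {"entered_at": datetime(2024, 1, 1, 8, 0), "detected_at": datetime(2024, 1, 1, 8, 30),
     "contained_at": datetime(2024, 1, 1, 10, 30)},
    {"entered_at": datetime(2024, 1, 2, 9, 0), "detected_at": datetime(2024, 1, 2, 10, 30),
     "contained_at": datetime(2024, 1, 2, 11, 30)},
]
alerts = [
    {"verdict": "false_positive", "escalated": False},
    {"verdict": "false_positive", "escalated": False},
    {"verdict": "true_positive", "escalated": True},
    {"verdict": "true_positive", "escalated": False},
]
kpis = soc_kpis(incidents, alerts)
print(kpis)
```

For the sample data, MTTD is (30 + 90) / 2 = 60 minutes and MTTR is (120 + 60) / 2 = 90 minutes. The hard part in practice is not the arithmetic but getting a trustworthy "threat entry" timestamp, which usually comes out of the post-incident timeline reconstruction.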
SOC Automation Opportunities
Manual triage does not scale. As your log volume grows, the only way to maintain response quality is to automate repetitive tasks. At PostEx, we implemented n8n-based playbook automation for the following use cases:
- Alert notification: Wazuh alerts automatically routed to Telegram with enrichment data, achieving an MTTN under 2 minutes
- IP reputation enrichment: automatic VirusTotal and AbuseIPDB lookups on every external IP in an alert
- JIRA ticket creation: high-severity alerts automatically create JIRA tickets with pre-populated fields
- Account lockout response: brute force detection triggers an automatic AD account review workflow
- Hash reputation lookup: suspicious file hashes automatically checked against MalwareBazaar and VirusTotal
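For the IP reputation use case, the first step is deciding which addresses in an alert are worth a lookup at all. The sketch below, using only the Python standard library, keeps public addresses and skips private, loopback, and malformed values. The `lookup` argument is a hypothetical callable standing in for whatever wraps your VirusTotal or AbuseIPDB client; no real API call is made here.

```python
import ipaddress


def external_ips(values):
    """Return the unique public IP addresses found in the given field values."""
    found = []
    for value in values:
        try:
            ip = ipaddress.ip_address(value)
        except ValueError:
            continue  # not an IP address at all
        if ip.is_global and str(ip) not in found:
            found.append(str(ip))
    return found


def enrich(values, lookup):
    """Call a caller-supplied reputation lookup once per external IP."""
    return {ip: lookup(ip) for ip in external_ips(values)}


# Made-up alert fields and a stub lookup for illustration.
fields = ["10.0.0.5", "8.8.8.8", "not-an-ip", "8.8.8.8", "1.1.1.1"]
print(external_ips(fields))  # ['8.8.8.8', '1.1.1.1']
print(enrich(fields, lookup=lambda ip: {"score": None}))
```

Deduplicating and filtering before the lookup also keeps you inside the rate limits that reputation services apply to API keys.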
Example: field mapping for a Wazuh alert webhook to Telegram via n8n (each value is the path of a field in the Wazuh alert JSON)
{
  "webhook": "https://your-n8n-instance/webhook/wazuh-alerts",
  "payload": {
    "alert_level": "rule.level",
    "rule_id": "rule.id",
    "rule_description": "rule.description",
    "agent_name": "agent.name",
    "source_ip": "data.srcip",
    "timestamp": "timestamp"
  }
}
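The mapping above pairs each payload field with a dotted path into the alert. As a rough sketch of how such a mapping can be resolved outside n8n, assuming the alert arrives as nested JSON, the following walks each path and builds a flat payload plus a short notification text. The sample alert is made up, and the message format is my own choice for the example rather than what our workflow sends.

```python
# Payload field -> dotted path into the alert, as in the mapping above.
FIELD_MAP = {
    "alert_level": "rule.level",
    "rule_id": "rule.id",
    "rule_description": "rule.description",
    "agent_name": "agent.name",
    "source_ip": "data.srcip",
    "timestamp": "timestamp",
}


def resolve(alert, dotted_path, default=None):
    """Walk a nested dict along a dotted path such as 'rule.level'."""
    current = alert
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def build_payload(alert):
    """Flatten the alert into the fields named in FIELD_MAP."""
    return {name: resolve(alert, path) for name, path in FIELD_MAP.items()}


def format_message(payload):
    """Render a short, human-readable notification line."""
    return (f"[level {payload['alert_level']}] {payload['rule_description']} "
            f"on {payload['agent_name']} from {payload['source_ip']}")


# Made-up alert for illustration.
alert = {
    "timestamp": "2024-01-01T10:00:00Z",
    "rule": {"level": 10, "id": "5712", "description": "sshd brute force"},
    "agent": {"name": "web-01"},
    "data": {"srcip": "8.8.8.8"},
}
payload = build_payload(alert)
print(format_message(payload))
```

Returning a default for missing paths is deliberate: not every alert carries a source IP, and a notification with a blank field is better than a dropped one.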
Lessons Learned
After 3+ years running SOC operations in a fintech environment, the most important lessons I have learned are:
- Tune before you scale. Adding more log sources without tuning existing rules will bury your analysts in noise.
- Document everything. Incident reports, playbooks, runbooks — institutional knowledge is fragile without documentation.
- Measure your blind spots. Use ATT&CK Navigator to visualize detection coverage gaps and prioritize detection engineering work.
- Automate the boring parts. Alert enrichment, ticketing, and notification should all be automated so analysts can focus on analysis.
- Threat hunt regularly. Automated detection will never catch everything. Schedule structured threat hunting exercises monthly.
Conclusion
Building a high-performance SOC is a continuous journey, not a one-time project. The organizations that succeed are those that treat their SOC as a living system — constantly measuring, tuning, and improving. Whether you are standing up your first SIEM or scaling a mature operation, the principles remain constant: centralize visibility, reduce noise, automate response, and hunt for what your rules cannot see.
References
- NIST SP 800-61r2, Computer Security Incident Handling Guide
- MITRE ATT&CK Framework (attack.mitre.org)
- Splunk Security Essentials Documentation
- Wazuh Documentation (documentation.wazuh.com)