About
A showcase of interesting debugging sessions and other technical writeups related to software development or security challenges.
From these case studies, we can extract:
- Reusable methodologies to apply in similar scenarios;
- Disclosed information that can spark new bug reports or patches. Consider how oftentimes interface errors are aggravated by insufficient, misleading, or unintended messages.
Inspiration
Computer Security
- GoogleCTF 2022 - eldar (333 pt / 14 solves)
- 34c3 CTF minbashmaxfun writeup
- Solving warsaw’s Java Crackme 3 – ReWolf's blog
- Reddish - HackTheBox Writeup - IppSec
- Fuzzing Browsers for weird XSS Vectors - LiveOverflow
- Hacking with Environment Variables
- x86matthew - Exploiting a Seagate service to create a SYSTEM shell (CVE-2022-40286)
- Kernel Pwning with eBPF: a Love Story
- Project Zero: Down the Rabbit-Hole...
Data Analysis
- 24-core CPU and I can’t move my mouse
- Active Benchmarking: bonnie++
- Analysing Petabytes of Websites
- Using FOIA Data and Unix to halve major source of parking tickets
- Battelle Publishes Open Source Binary Visualization Tool
- How to Implement a Simple USB Driver for FreeBSD
- Wireshark visualization TIPS & tricks by Megumi Takeshita - SharkFest’19
- Visualizing Commodore 1541 Disk Contents – Part 2: Errors - pagetable.com
- Unraveling The JPEG
Systems Programming
- Debugging memory corruption: who the hell writes “2” into my stack?! - Unity Technologies Blog
- Tracking down a segfault in grep
- Debugging an evil Go runtime bug - marcan.st
- Hunting down a non-determinism-bug in our Rust Wasm build
- USB Debugging and Profiling Techniques
- The select story - removal of a compiler optimization
- The hunt for the cluster-killer Erlang bug | by Dániel Szoboszlay | Klarna Engineering
Contraptions Programming
- GIF MD5 hashquine - Rogdham
- Polyglot Assembly - vojtechkral.github.io
- Palindromic 64 bit ELF binaries
- Accidentally Turing-Complete
- codemix/ts-sql: A SQL database implemented purely in TypeScript type annotations - GitHub
- sed maze solver - GitHub
- AES in Scratch
Yak Shaving
- Rabbit Holes: The Secret to Technical Expertise
- blog_deploy_yak_shave.md
- Everything I googled in a week as a professional software engineer
Development Challenges
- Exposing Kafka brokers inside a k8s cluster via load balancers. After scaling up the number of brokers, an external application could not send batched requests to some brokers.
- There was a misconfiguration where public addresses were not set for the load balancers associated with the extra Kafka nodes. As a result, cluster-internal addresses were exposed, which were unreachable from outside the cluster.
- I found this challenge interesting due to several confounding factors:
- Log messages only reported timeouts when expiring request batches (and a batch might not be ready to send for reasons other than a lost network connection);
- Broker connection issues were ignored (Kafka client silently removed unreachable nodes, which was only confirmed with a debugger);
- Downscaling Kafka didn’t roll back to a correct state (the rolling update triggered via Flux reconciliation simply brought down the extra nodes without restarting any remaining nodes, so one of them could still be part of the misconfigured set);
- Although we had readiness probes for these brokers, they only covered reachability inside the cluster. External healthchecks would have helped.
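An external healthcheck of the kind mentioned above could be sketched as follows. This is only an illustration, assuming the probe runs from a machine outside the cluster and that the list of advertised broker addresses is obtained separately; the function name `unreachable_brokers` is hypothetical. A successful TCP connect shows reachability only, not that the broker is healthy at the Kafka protocol level.

```python
import socket


def unreachable_brokers(advertised, timeout=2.0):
    """Return the (host, port) pairs that cannot be reached over TCP.

    `advertised` is an iterable of (host, port) pairs, i.e. the addresses
    the brokers hand out to clients (assumed to be collected elsewhere).
    """
    failed = []
    for host, port in advertised:
        try:
            # create_connection resolves the name and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers DNS failures, refused connections and timeouts.
            failed.append((host, port))
    return failed
```

Running such a probe after every scaling operation would flag a broker that advertises a cluster-internal address, since that address would fail to resolve or connect from outside.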
- Sorting paginated search results for a web interface. These results were retrieved from multiple databases, running distinct database engines. It would be highly inefficient to retrieve the full result sets in a single request.
- The solution I developed was to asynchronously perform a ranged SQL query against each database. At the application level, we merged and sorted the result sets of these queries. Whenever we returned results from a given database to the web interface, we incremented and cached the corresponding range offset, so that requesting the next page would fetch the next ranged result set.
- I found this challenge interesting due to implementing an algorithm from scratch for a complex use case which was not contemplated by the frameworks we were using.
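The merge step described above can be sketched roughly as follows. This is not the original implementation: in-memory sorted lists stand in for the databases, `fetch_range` is a hypothetical placeholder for a ranged SQL query, and every source is assumed to return rows already sorted by the same key.

```python
import asyncio
import heapq
import itertools


async def fetch_range(rows, offset, limit):
    """Stand-in for a ranged query (ORDER BY key LIMIT limit OFFSET offset)."""
    await asyncio.sleep(0)  # yield control, as a real database call would
    return rows[offset:offset + limit]


async def next_page(sources, offsets, page_size):
    """Return (page, new_offsets) for the next page across all sources.

    `sources` maps a source name to its sorted rows; `offsets` maps a source
    name to how many of its rows were already returned to the client.
    """
    names = list(sources)
    # Query every source concurrently, each from its own cached offset.
    chunks = await asyncio.gather(
        *(fetch_range(sources[n], offsets.get(n, 0), page_size) for n in names)
    )
    # Tag each row with its source so the offsets can be advanced afterwards.
    tagged = [[(row, name) for row in chunk] for name, chunk in zip(names, chunks)]
    # k-way merge of the sorted chunks, keeping only one page of results.
    page = list(itertools.islice(heapq.merge(*tagged), page_size))
    new_offsets = dict(offsets)
    for _, name in page:
        new_offsets[name] = new_offsets.get(name, 0) + 1
    return [row for row, _ in page], new_offsets
```

Fetching `page_size` rows from each source is enough to build a correct page, because no single source can contribute more than `page_size` rows to it. Only the rows actually returned advance an offset, so rows fetched but not shown are fetched again for the next page.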
- Managing an application’s lifecycle with the service manager systemd. When the process was stopped through our service, some subprocesses did not perform a clean shutdown, and a manual subprocess start was required afterwards. However, stopping the application manually resulted in all subprocesses shutting down successfully.
- The root cause was found by comparing the system calls made during the two shutdown procedures. The service sent a kill signal to the parent process and to each child, while the manual stop sent a kill signal only to the parent process, which in turn sent each subprocess a network request containing a command to shut down gracefully. After reconfiguring the service to send a kill signal only to the parent process, the issue was solved.
- I found this challenge interesting due to requiring low-level analysis, since there was no evidence of this behaviour in typical indicators such as application logs.
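The system call comparison could be automated along these lines. This is a rough sketch that assumes the two shutdowns were captured as text traces in the `kill(pid, SIGNAL) = result` form printed by strace; the writeup does not say which tracer was used, and the helper names are hypothetical.

```python
import re

# Matches calls such as "kill(1234, SIGTERM) = 0" in a trace line.
KILL_RE = re.compile(r"\bkill\((\d+),\s*(SIG\w+)\)")


def kill_targets(trace):
    """Return the set of (pid, signal) pairs found in a trace text."""
    return {(int(pid), sig) for pid, sig in KILL_RE.findall(trace)}


def extra_kills(service_trace, manual_trace):
    """Return the kill() calls made by the service stop but not the manual stop."""
    return kill_targets(service_trace) - kill_targets(manual_trace)
```

Given a service trace that signals the parent and two children, and a manual trace that signals only the parent, `extra_kills` returns the two child entries, which is the difference that pointed to the root cause here.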
- Running applications on distinct hosts, although they did not have support for this scenario. An endpoint of an application (host A) returned an address for another application in the sub-network (host B). This address was consumed by both A and an external application in a VPN network (host C). A sub-network address couldn’t be resolved by C, while a VPN network address couldn’t be resolved by host A.
- The solution I applied was to add a NAT OUTPUT rule in the firewall of the endpoint host, causing locally-generated packets to a given IP and port in the VPN network range to be sent to a sub-network IP and port instead. This allowed A to communicate with B, while setting an address reachable by C.
- I found this challenge interesting due to requiring cross-cutting knowledge in networking, allowing us to continue using our applications in the scenario we needed.
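As an illustration of the NAT OUTPUT rule, the helper below builds the equivalent iptables command line. It is a sketch under assumptions: the writeup only mentions "the firewall", so iptables is a guess, the helper name is hypothetical, and the addresses and ports in the usage note are placeholders rather than the real ones.

```python
import ipaddress


def dnat_output_rule(vpn_ip, vpn_port, subnet_ip, subnet_port, proto="tcp"):
    """Build an iptables command that redirects locally-generated packets.

    Packets sent from this host to vpn_ip:vpn_port are rewritten so that
    they go to subnet_ip:subnet_port instead.
    """
    # Fail early on malformed addresses; ip_address raises ValueError.
    ipaddress.ip_address(vpn_ip)
    ipaddress.ip_address(subnet_ip)
    return [
        "iptables", "-t", "nat", "-A", "OUTPUT",
        "-d", vpn_ip, "-p", proto, "--dport", str(vpn_port),
        "-j", "DNAT", "--to-destination", f"{subnet_ip}:{subnet_port}",
    ]
```

For example, `dnat_output_rule("10.8.0.5", 8080, "192.168.1.20", 8080)` returns an argument list that could be passed to `subprocess.run` as root on host A. The endpoint can then advertise the VPN address that C is able to reach, while A's own traffic to that address is redirected to B inside the sub-network.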
More…
- Debugging: Methodologies, Case Studies
- Reverse Engineering: Methodologies, Case Studies