Eduardo Silva — HPC • Robotics • Graphics • Systems

PROJECTS

Case-study style overview

Search, filter, and open any project for outcomes, responsibilities, and technical detail.

2024

Fluid Solver Optimization (CPU, OpenMP, CUDA)

Performance / Systems Engineer

Reduced CPU runtime from 30.85 s to 3.69 s (8.36×); best OpenMP speedup ~10.77× at ~16–24 threads; CUDA kernels reached ~90.16 GiB/s.

Optimized a 3D Stable Fluids solver end-to-end: (1) CPU locality and ILP (loop reordering, tiling, replacing divisions with multiplications), (2) shared-memory parallelism with OpenMP (collapse, reductions, static scheduling; Red/Black Gauss-Seidel), (3) GPU acceleration with CUDA (persistent device memory, kernelized solver steps, Nsight-guided tuning). Profiled bottlenecks and scalability with perf/gprof and Nsight.

C/C++ • OpenMP • CUDA • Linux • perf • gprof

HPC • Profiling • OpenMP • CUDA • Optimization • Scalability
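
To illustrate the Red/Black Gauss-Seidel ordering named in the description, here is a minimal serial sketch in Python. It is not the project's C/CUDA code: the function name `lin_solve_red_black`, the 2D grid (the project is 3D), and the fixed boundary cells are assumptions made for brevity. Cells of one colour only read cells of the other colour, which is what would allow each half-sweep to be parallelized; the reciprocal `inv_c` shows the division-to-multiplication idea.

```python
def lin_solve_red_black(x, x0, a, c, sweeps=20):
    """Relax c*x[i][j] - a*(sum of 4 neighbours) = x0[i][j] on the interior, in place.

    Hypothetical 2D sketch; boundary cells of x are left untouched.
    """
    n = len(x)
    inv_c = 1.0 / c  # one division up front, multiplications inside the loops
    for _ in range(sweeps):
        for colour in (0, 1):  # 0 = "red" half-sweep, 1 = "black" half-sweep
            for i in range(1, n - 1):
                # first interior column j with (i + j) % 2 == colour
                start = 1 + ((i + 1 + colour) % 2)
                for j in range(start, n - 1, 2):
                    neighbours = x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1]
                    x[i][j] = (x0[i][j] + a * neighbours) * inv_c
    return x
```

Within one colour the updates are independent of each other, so the two inner loops are the natural place for a parallel loop in a compiled implementation.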

2025

MPI Allgather+Merge on a Cluster (Bruck vs Circulant)

HPC / Parallel Software Engineer

Implemented and benchmarked three Allgather+merge variants at up to 640 ranks; the parallel variants delivered ~1.5×–2× average speedup, with Circulant typically 10–30% faster than Bruck.

Built a sequential baseline (MPI_Allgather + p-way merge) and two allgather-merge algorithms (generalized Bruck and Circulant). Optimized merge and memory traffic (loser-tree merge, SoA layout, ping-pong buffers), implemented non-blocking and pipelined communication, and analyzed scaling limits (eager threshold/rendezvous handshakes, synchronization walls) on the Hydra cluster at TUW over multiple node/process configurations and message sizes.

C/C++ • MPI (OpenMPI) • Linux • Cluster Benchmarking

MPI • HPC • Collectives • Benchmarking • Performance Analysis • C/C++
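
As a rough picture of the Bruck communication pattern mentioned above, the sketch below simulates a Bruck-style allgather in plain Python, with no MPI involved. The function name `bruck_allgather` and the list-of-lists representation are assumptions for illustration. The final p-way merge uses `heapq.merge` as a stand-in for the loser-tree merge, so this is closer in spirit to the baseline's merge step than to the optimized variants.

```python
import heapq

def bruck_allgather(blocks):
    """Simulate a Bruck-style allgather; blocks[r] is rank r's local sorted block."""
    p = len(blocks)
    held = [[blocks[r]] for r in range(p)]  # each rank starts with its own block
    dist = 1
    while dist < p:
        count = min(dist, p - dist)  # the last round may carry fewer blocks
        # rank r receives the first `count` blocks held by rank (r + dist) % p
        incoming = [held[(r + dist) % p][:count] for r in range(p)]
        for r in range(p):
            held[r].extend(incoming[r])
        dist *= 2
    # rank r now holds blocks r, r+1, ..., r+p-1 (mod p); rotate into rank order
    return [[held[r][(i - r) % p] for i in range(p)] for r in range(p)]

def gather_and_merge(blocks):
    """Every simulated rank ends up with the same fully merged sequence."""
    return [list(heapq.merge(*gathered)) for gathered in bruck_allgather(blocks)]
```

The loop runs ceil(log2 p) rounds, which is the property that makes this pattern attractive for small messages; the simulation works for any p, not only powers of two.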

2023

Graphics Engine / OpenGL Rendering

C++ Developer

Lean pipeline with VBO rendering and expressive geometry primitives.

Built an XML-driven engine featuring VBO-based rendering, Catmull-Rom splines, Bézier patches, and common lighting/mapping techniques.

C++ • OpenGL • GLSL • XML

C/C++ • OpenGL • Engine • Geometry
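
For readers unfamiliar with the Catmull-Rom splines listed above, the following is a small sketch of evaluating one uniform Catmull-Rom segment. It is written in Python rather than the engine's C++, and the function name `catmull_rom` and the tuple-based points are assumptions. The curve passes through `p1` at `t = 0` and `p2` at `t = 1`, with `p0` and `p3` shaping the tangents.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Point on the uniform Catmull-Rom segment between p1 and p2, for t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2 * b
               + (c - a) * t
               + (2 * a - 5 * b + 4 * c - d) * t2
               + (-a + 3 * b - 3 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )
```

Sampling `t` over a sliding window of four control points would trace a path such as a camera or object trajectory.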

2025

Autonomous Racing (F1TENTH, ROS2)

Robotics / Software Engineer • Team project

1st place and 3rd place in autonomous racing challenges; delivered real-time navigation with SLAM+LiDAR and robust control in ROS2.

Integrated SLAM, obstacle avoidance, and path tracking (Pure Pursuit, Disparity Extender) with ROS2 tooling, telemetry, and reproducible test runs for rapid iteration.

ROS2 • C++/Python • LiDAR • RViz • Foxglove • Navigation

ROS2 • SLAM • LiDAR • Control • Real-time • C/C++
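
The Pure Pursuit tracker mentioned above reduces to a short geometric formula, sketched here without any ROS2 dependencies. The function name `pure_pursuit_steering`, the vehicle-frame convention (x forward, y left), and the example wheelbase are assumptions, not values from the project.

```python
import math

def pure_pursuit_steering(goal_x, goal_y, wheelbase):
    """Steering angle (rad) toward a lookahead point given in the vehicle frame.

    Assumes x points forward and y points left; positive result steers left.
    """
    lookahead_sq = goal_x ** 2 + goal_y ** 2
    if lookahead_sq == 0.0:
        raise ValueError("lookahead point coincides with the vehicle")
    curvature = 2.0 * goal_y / lookahead_sq  # arc through the origin and the goal
    return math.atan(wheelbase * curvature)  # bicycle-model steering angle
```

For example, `pure_pursuit_steering(1.0, 1.0, 0.33)` steers left, while a goal straight ahead yields zero steering.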

2021

Custom Anti-Cheat System

Developer / Operator • Tournament ops • Independent

Tournament ops at scale with automated integrity checks.

Hosted a 180-player online event and integrated third-party API signals into an anti-cheat workflow and reporting pipeline.

C# • SMTP • Automation • Reporting

API Integration • Automation • Ops

2023

DNS System (Resolver + Primary/Secondary + Zone Transfer)

Software Engineer • Computer Communications

Implemented a DNS-like distributed naming service with recursive/iterative resolution, caching, and TCP-based zone transfer between primary and secondary servers.

Built a multi-component DNS system in Python: client queries over UDP; resolver performs iterative/recursive resolution via root/top-level/authoritative servers; primary and secondary servers support replication via a custom TCP zone-transfer protocol. Implemented structured config/data parsing, logging, and a cache layer; organized components using an MVC-style module split.

Python 3.10 • Sockets (UDP/TCP) • Linux • Logging • Config Parsing

Networking • Client/Server • UDP • TCP • Protocols • Distributed Systems
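
The iterative resolution path described above (root, then top-level, then authoritative server, with a cache in front) can be sketched in memory as follows. No sockets are involved, and the `SERVERS` table, the server names, and the function `resolve_iterative` are hypothetical illustrations rather than the project's actual data format or code.

```python
# Hypothetical in-memory "servers": each has direct answers and/or referrals.
SERVERS = {
    "root": {"referrals": {"com.": "tld-com"}},
    "tld-com": {"referrals": {"example.com.": "auth-example"}},
    "auth-example": {"answers": {"www.example.com.": "192.0.2.10"}},
}

def resolve_iterative(name, servers, cache, root="root", max_hops=8):
    """Return (address or None, number of servers queried)."""
    if name in cache:
        return cache[name], 0  # cache hit: no server is contacted
    server = root
    for hops in range(1, max_hops + 1):
        zone = servers[server]
        answer = zone.get("answers", {}).get(name)
        if answer is not None:
            cache[name] = answer
            return answer, hops
        # follow the most specific referral whose zone contains the name
        matches = [s for s in zone.get("referrals", {})
                   if name == s or name.endswith("." + s)]
        if not matches:
            return None, hops
        server = zone["referrals"][max(matches, key=len)]
    return None, max_hops
```

A first lookup of `www.example.com.` walks three servers; a repeated lookup is served from the cache with zero queries.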

2025

OTT Streaming over an Application-Layer Overlay (CORE)

Software / Networking Engineer • Network Services Engineering

Built a prototype CDN-style OTT service: overlay nodes + PoPs distribute RTP video streams to multiple clients with dynamic route costs and basic failure recovery.

Implemented an application-layer overlay in Python on the CORE emulator. A bootstrapper provisions neighbor sets from a JSON topology; nodes exchange periodic HELLOs and routing info, compute path costs from latency and packet loss, and forward streams along best routes. Clients select the best PoP using RTT + path cost, then control playback via RTSP and receive video via RTP over UDP. Added teardown propagation and recovery behaviors on node/PoP failure.

Python • Sockets (UDP/TCP) • Threads • CORE emulator • RTP/RTSP

Networking • Overlay Networks • RTP • RTSP • UDP • Monitoring
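
One way to picture route selection from latency and packet loss is a shortest-path search over per-link costs. The sketch below is only an assumption about how such a cost could be combined: the formula in `link_cost`, the `loss_weight` value, and the function `best_route` are illustrative and not taken from the project, which exchanges routing information between nodes rather than computing routes centrally.

```python
import heapq

def link_cost(latency_ms, loss, loss_weight=1000.0):
    """Assumed cost model: latency plus a penalty proportional to loss ratio."""
    return latency_ms + loss_weight * loss

def best_route(links, source, target):
    """Dijkstra over undirected links {(u, v): (latency_ms, loss)}.

    Returns (total_cost, path) or None if the target is unreachable.
    """
    graph = {}
    for (u, v), (latency_ms, loss) in links.items():
        cost = link_cost(latency_ms, loss)
        graph.setdefault(u, []).append((v, cost))
        graph.setdefault(v, []).append((u, cost))
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, step in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return None
```

With this cost model a low-latency but lossy link can lose out to a slower, cleaner path, which is the kind of trade-off dynamic route costs are meant to capture.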

2024

PictuRAS (Image Processing Web App, Microservices)

Full-Stack / Software Engineer

Built and deployed an image-processing web app using a microservices + API Gateway architecture; supports projects, tool-based image transforms, and payments.

Implemented the PictuRAS platform end-to-end: React frontend, Node.js/Express services backed by MongoDB, API Gateway entry point, Stripe payments, and containerized deployment. Tools run as independent services, enabling an extensible pipeline for image operations. Designed the system around scalability/elasticity and maintainability, and documented architecture views (arc42).

React • Node.js • Express • MongoDB • Stripe • Docker

Full-Stack • Microservices • API Gateway • Web • Cloud