Session: Kraken — Senior Software Engineer, AI Infrastructure
Status: Critique: CURRENT (Pass 2); Phases 0-2 and CL DONE
Created: 2026-05-01
JD source: JDs/Senior Software Engineer – AI Infrastructure @ Kraken.pdf
Output folder: output/Kraken_AI_Infrastructure/
JD Info
| Field | Value |
|---|---|
| Company | Kraken (Payward Inc.) — global crypto exchange |
| Role | Senior Software Engineer — AI Infrastructure |
| Department | Engineering, AI & Machine Learning |
| Location | Remote — Switzerland eligible (Dennis lives in Bern → DIRECT match) |
| Format | 2-page resume + 1-page CL |
| Bundle (primary) | ML / AI Engineer |
| Bundle (secondary) | Data Platform / Infra |
Requirements Table
| # | Requirement | Status | Evidence / Bridge |
|---|---|---|---|
| 1 | 5+ yrs building/operating high-scale production systems | DIRECT | 11+ yrs (Swisscom + Bosch + Vizrt + Fraunhofer + Generali) |
| 2 | Strong proficiency in Rust and systems-level programming | GAP (bridge LOW-MED) | C++ systems work at Vizrt (distributed video transcoding); Python/Java production. NO Rust production. |
| 3 | Distributed systems, reliability, performance optimization | DIRECT | Vizrt distributed transcoding; Swisscom Kafka/Teradata at scale; Bosch 24/7 fab |
| 4 | Services serving millions of users / high-throughput | DIRECT | Swisscom (Switzerland's largest telco, ~6M customers); Vizrt (CNN/BBC/Al Jazeera) |
| 5 | ML infra / model serving / MLOps | DIRECT | Bosch BS-1: containerized ML inference in 24/7 fab; Swisscom K8s/AWS infra |
| 6 | Observability, monitoring, failure recovery | DIRECT | Bosch BS-4: ELK + Grafana + Prometheus + Loki; on-call SLA at Swisscom |
| 7 | Cross-team collaboration | DIRECT | Component Owner at Swisscom; App Owner at Bosch |
| 8 | High ownership in high-stakes prod | DIRECT | 24/7 fab ML deployment; Component/App Owner roles |
| 9 | NTH: agent/LLM-powered systems | BRIDGE (MED) | Swisscom GenAI/custom GPTs (per user memory); ARTUS NLP at Fraunhofer |
| 10 | NTH: high-perf networking, async, low-latency | BRIDGE (MED) | Vizrt real-time A/V transcoding; Kafka streaming at Swisscom |
| 11 | NTH: container orchestration, cloud-native | DIRECT | K8s × 2 employers; AWS migration with CloudFormation/IaC |
| 12 | NTH: evaluation frameworks, model perf monitoring at scale | BRIDGE (MED) | Anomaly detection PoC at Bosch; observability stack |
| 13 | NTH: 0→1 / platform-building | DIRECT | Introduced ELK observability at Bosch; introduced CI/CD at Fraunhofer/Generali |
| 14 | NTH: crypto / blockchain | BRIDGE (HIGH) | Long-term Kraken customer since 2017 (BTC + ETH); Solidity smart-contract dev in free time; active user of Kraken / Kraken Pro / Krak apps. Genuine enthusiast — strong CL hook. |
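Summary: 9 of 14 requirements are DIRECT matches, 4 are BRIDGE (one HIGH, three MED), and 1 is a GAP. The single hard gap is Rust production experience. The crypto domain is covered by a HIGH-confidence bridge rather than direct experience, which is acceptable (Kraken invites enthusiasts).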
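As a cross-check, the Status column of the Requirements Table can be tallied mechanically. The following is a minimal Python sketch, assuming the statuses are transcribed by hand in row order; the `statuses` list and `tally` name are illustrative and not part of the session tooling.

```python
from collections import Counter

# Status column of the Requirements Table, rows 1-14, transcribed by hand.
# Confidence qualifiers such as "(MED)" are dropped to keep only the category.
statuses = [
    "DIRECT", "GAP", "DIRECT", "DIRECT", "DIRECT", "DIRECT", "DIRECT",
    "DIRECT", "BRIDGE", "BRIDGE", "DIRECT", "BRIDGE", "DIRECT", "BRIDGE",
]

tally = Counter(statuses)

# Print one line per category, most frequent first.
for status, count in tally.most_common():
    print(f"{status}: {count} of {len(statuses)}")
```

Keeping the category separate from the bridge confidence makes it easy to re-run the tally whenever a row is reclassified.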
ATS Keywords (extracted from JD)
Tier 1 (must appear in resume):
- Rust (handle carefully — see Honest Framing below)
- ML inference, model serving, MLOps, model deployment
- distributed systems, reliability engineering, performance optimization
- observability, monitoring, failure recovery
- Kubernetes, container orchestration, cloud-native
- production systems, high-throughput, scalable systems
- AI agents, agent systems, LLM
- async, low-latency
Tier 2 (nice to embed):
- Python (Dennis primary), C++ (Vizrt evidence)
- Kafka, Airflow, Apache Iceberg, AWS
- CI/CD, GitLab, Jenkins, Docker, Ansible
- Prometheus, Grafana, ELK
- DevSecOps, security compliance
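A keyword-coverage check along these lines could confirm that Tier 1 terms actually appear in the resume text. This is a minimal sketch only: the `coverage` function, the case-insensitive substring matching, and the sample excerpt are assumptions made for illustration, and they do not describe how any real ATS scores a resume.

```python
def coverage(keywords, text):
    """Split keywords into (matched, missing) using case-insensitive substring matching."""
    lowered = text.lower()
    matched = [kw for kw in keywords if kw.lower() in lowered]
    missing = [kw for kw in keywords if kw.lower() not in lowered]
    return matched, missing


# A subset of the Tier 1 list above.
tier1 = [
    "model serving", "MLOps", "distributed systems",
    "observability", "Kubernetes", "low-latency",
]

# Hypothetical resume excerpt, invented for this example.
sample = (
    "Deployed model serving infrastructure on Kubernetes with full "
    "observability for distributed systems in 24/7 production."
)

matched, missing = coverage(tier1, sample)
print(f"{len(matched)}/{len(tier1)} matched; missing: {missing}")
```

Substring matching is deliberately naive: it would miss variants such as "low latency" without the hyphen, so the missing list is best read as a prompt for manual review rather than a verdict.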
Gap Assessment
| Gap | Bridge framing | Confidence | Decision |
|---|---|---|---|
| Rust production | "Systems-level proficiency in C++ (Vizrt distributed video transcoding); building toward Rust" — list Rust ONLY in a "Learning" row, never alongside production languages | LOW-MED | Bridge honestly; do NOT inflate. Skills section must reflect this. |
| Crypto/blockchain | Long-term Kraken customer since 2017; Solidity smart-contract dev in free time; active Kraken/Kraken Pro/Krak app user. | HIGH (genuine enthusiast) | Lead the CL with this. Optionally add a small "Crypto/Blockchain — Solidity (smart contracts), Kraken (long-term user)" line in resume Skills if space permits. |
| Direct LLM serving infra | "Containerized ML inference in 24/7 production (Docker, K8s, Ansible)" — closest analog | MED | Use as proxy; do not claim "LLM serving experience". |
| Trillion-row workloads / millions QPS | "Production data infrastructure at Switzerland's largest telco" — implies scale without overclaim | MED | Frame via Swisscom/Bosch fab context. |
Company Context
Kraken (founded 2011) is one of the world's largest crypto exchanges, now with ~200+ Rust engineers and "millions of lines of Rust across hundreds of services" per their engineering blog (Oxidizing Kraken Parts 1 & 2). They've made a deliberate, multi-year bet on Rust for backend services, migrating from PHP and modernizing core infrastructure.
The AI Infrastructure team specifically powers AI agent systems in production. In Nov 2025 Kraken open-sourced Kraken CLI — the first crypto exchange CLI built natively for AI agents (Rust binary, MCP server compatible with Claude Code/Cursor/Codex, paper-trading engine). This team builds the inference, orchestration, and execution layers behind that.
Mission: "Accelerate the global adoption of crypto so everyone can achieve financial freedom and inclusion." Strong crypto-ethos culture — they explicitly value crypto conviction.
Why this team: Production-oriented, deeply systems-focused, building 0→1 agent infrastructure at high scale.
Framing Strategy
Lead narrative: "Production AI/ML infrastructure engineer — owns model inference, container orchestration, and observability in 24/7 high-stakes environments; brings full cloud-native data platform depth from Switzerland's largest telco."
Reframing map (selected):
- BS-1 (ML inference containerization, Bosch fab) → "Designed and deployed model inference infrastructure (Docker, Kubernetes, Ansible) into 24/7 production — image classification serving with zero-downtime constraint."
- SW-3 (K8s + GitLab CI/CD, Swisscom) → "Architected and operate Kubernetes-deployed Python services with full GitLab CI/CD automation in agile DevOps environment."
- SW-1 (AWS migration) → "Re-architected legacy ETL stack to cloud-native AWS infrastructure (S3, Glue, Athena/Iceberg, Redshift, Airflow, CloudFormation IaC) — scalable, observable, serverless data layer."
- SW-2 (Component Owner) → "Component Owner for business-critical pipelines under on-call SLA — full reliability engineering ownership at scale."
- BS-4 (ELK/Grafana/Prometheus) → "Designed observability stack (ELK + Kafka, Grafana, Prometheus, Loki) for high-volume 24/7 production — anomaly detection and monitoring built from zero."
- SW-GenAI (corrected — no LangChain) → "Built custom GPTs and LiteLLM-based LLM API integrations to automate engineering workflows (code review, documentation, pipeline triage) in a spec-driven (Kiro) development environment at Swisscom." — LiteLLM is a strong AI-infra signal (model-gateway abstraction).
- VZ-1 (Vizrt distributed transcoding) → Python-led framing: "Built distributed real-time backend components in Python (with legacy C++ modules) for Vizrt's broadcast platform serving CNN, BBC, Al Jazeera at scale." C++ mentioned as legacy context only — do not lead with or bold C++.
Honest framing on Rust + C/C++ (per user feedback 2026-05-01):
- Rust: DO NOT include alongside production languages. Optional: brief "Rust (active learning)" only if it doesn't crowd the line — otherwise omit; rely on systems-level / distributed-systems signal from Vizrt and bridge in CL.
- C/C++: Per user feedback, do NOT lead with or bold C++. It's been many years and the user is not confident. Mention only as legacy context (e.g., "Python (with legacy C++ modules)"); if listed in skills, place last with no emphasis. Python and Java are the strong signals.
GenAI / agent toolchain (CORRECTED 2026-05-01 — LangChain was a fabrication): Verified tools: Kiro (AI IDE / Spec-Driven Development), VS Code + Copilot, LiteLLM (LLM API gateway — created/used APIs), custom GPTs with fed domain knowledge. DO NOT list LangChain, LangGraph, or LlamaIndex anywhere — they have not been used. Apple and Infineon resume outputs contain LangChain as a fabrication and need cleanup later.
Emphasize: MLOps in 24/7 production, Kubernetes ownership × 2 employers, observability stack, distributed systems, async/streaming (Kafka, A/V real-time), platform-building initiative.
Downplay / omit: BDD, RPA, IBM ODM, Tibco Spotfire, BI/dashboard framing, semiconductor domain specifics, test automation as primary identity.
User focus directives: None given — using bundle Priority Matrix defaults.
Critique Context (for /critique later)
Reviewer persona: Kraken Engineering hiring manager — likely a Rust-fluent senior infra engineer or EM. Will weight (a) production systems credibility, (b) Rust signal honesty (won't tolerate inflation), (c) MLOps maturity, (d) crypto enthusiasm in CL, (e) ability to operate 0→1 in fast-moving teams.
Competitive landscape: Pool likely includes Rust-native backend engineers from FAANG / crypto-native firms (Coinbase, Binance, Polygon) and ML infra engineers from AI labs. Dennis competes by leading with MLOps + production reliability + cloud-native depth — and being honest about Rust as building.
Domain vocabulary: model inference, orchestration, execution layer, agent systems, model serving, evaluation frameworks, guardrails, async, Tokio, MCP, observability, SLO, latency budget, throughput, p99.
Cover Letter Plan
Institution type: Crypto-native, Rust-heavy, production-engineering-focused.
Length: 1 page, 250-300 words.
Paragraph structure:
- Hook (2-3 sentences): Open with Kraken-customer-since-2017 line + Solidity in free time — establishes genuine crypto-native identity from sentence one. Then pivot to professional fit: I've followed Oxidizing Kraken Parts 1 & 2 and the Kraken CLI launch — the AI Infrastructure team's mandate is what I'd want to work on regardless of company.
- Production ML credibility: Bosch BS-1 — designed and deployed ML inference into a 24/7 semiconductor fab; the operational constraint (no maintenance windows, hardware-in-the-loop) is what shapes how I think about model-serving infrastructure.
- Cloud-native + observability + scale: Swisscom (Switzerland's largest telco) — owning K8s-deployed Python data services on AWS, Kafka-based streaming, plus the observability stack at Bosch (ELK + Prometheus + Grafana). Tie to "high request throughput, observability, failure recovery."
- Honest on Rust: One short, candid sentence — systems-level background is C++ (Vizrt distributed transcoding); building Rust depth currently. No inflation.
- Close: Switzerland-based (location match); long-time Krakenite as a customer, would be excited to be one as an engineer.
Hooks (specific to research):
- Long-term Kraken customer since 2017 (BTC + ETH); active user of Kraken / Kraken Pro / Krak apps — primary CL opener
- Solidity smart-contract development in free time — concrete proof of crypto-native engineering interest, not just trading
- "Oxidizing Kraken Parts 1 & 2" — millions of lines of Rust across hundreds of services, async Tokio migration in 2020-21
- Kraken CLI (Nov 2025) — first crypto CLI built for AI agents, MCP-native
- Mission: financial freedom and inclusion via crypto
Jargon level: High — technical reader. Use Tokio, async, MCP, model inference, p99, observability comfortably.
Avoid in CL: SCEDAS / maritime / BDD / RPA / Tibco / semiconductor domain depth (mention Bosch, but lead with the ML deployment angle, not the wafer/fab specifics).
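The length target in this plan (250-300 words) and the em-dash count tracked for the cover letter can be checked mechanically before submission. The sketch below assumes the cover letter is available as plain text (for example, extracted from the compiled PDF, since LaTeX commands in the .tex source would inflate the count); `check_cover_letter` and the filler draft are hypothetical.

```python
def check_cover_letter(text, min_words=250, max_words=300):
    """Count whitespace-separated words and em-dashes in plain cover-letter text."""
    words = len(text.split())
    return {
        "words": words,
        "within_budget": min_words <= words <= max_words,
        # "\u2014" is the em-dash character, written as an escape for clarity.
        "em_dashes": text.count("\u2014"),
    }


# Hypothetical draft: 283 filler words plus two tokens that each contain an em-dash.
draft = " ".join(["word"] * 283 + ["left\u2014right", "a\u2014b"])

report = check_cover_letter(draft)
print(report)
```

Because words are split on whitespace, a token such as "left\u2014right" counts as one word; a stricter counter could split on the em-dash as well.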
Bundle Selection Rationale
- Primary: ML/AI Engineer (bundle_ml_ai_engineer.md). JD title and team mission are AI Infrastructure / agent systems / model inference. Priority Matrix and Reframing Map align directly.
- Secondary: Data Platform/Infra (bundle_data_platform.md). Used for the distributed systems / observability / Kubernetes / cloud-native framing. Use to bridge 1-2 bullets toward the systems-engineering side of the JD (e.g., reframe SW-3 with platform-leaning language; pull BS-4 observability framing).
Output Files (planned)
- e2e_kraken_ai_infra_resume.tex: 2-page resume
- e2e_kraken_ai_infra_cover_letter.tex: 1-page cover letter
- critique_kraken_ai_infra.md: critique output
Bullet Plan (CONFIRMED 2026-05-01)
Final: 18 variable bullets across 5 positions (2 added during page-fill gate: SW-4 + GN-2).
| Position | Bullets | IDs | Notes |
|---|---|---|---|
| Swisscom | 6 | SW-2, SW-1, SW-GenAI (corrected: LiteLLM/Kiro/custom GPTs — no LangChain), SW-3, SW-6, SW-4 | SW-4 added for page fill |
| Bosch | 4 | BS-1, BS-4, BS-3, BS-2 | BS-1 leads (24/7 ML inference) |
| Fraunhofer | 3 | FC-2, FC-1, FC-3 | |
| Vizrt | 2 | VZ-1 (Python-led, C++ legacy parenthetical), VZ-2 | C++ unbold per user feedback |
| Generali | 3 | GN-1, GN-2 (added), GN-3 | GN-2 added for page fill |
Skills section: 5 groups including a Crypto / Web3 line (Solidity smart contracts, Ethereum, Kraken long-term user) — confirmed by user. C++ kept in languages but unbold.
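Forced exclusions (initial plan): SW-4 (B2B dashboards; weak for AI infra), SW-5 (Security Champion; only 2025/26 per memory, off-theme), BS-5 (Tibco; irrelevant), FC-4 (grant proposal; weak), GN-2 (UiPath RPA; irrelevant). SW-4 and GN-2 were later re-added during the page-fill gate, so the final exclusions are SW-5, BS-5, and FC-4.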
Budget Gate: Target 20-21 from resume_reference.md; user accepted 16 for quality > quantity. Skills section will absorb the slack (slightly fuller skills block compensates for fewer bullets). PASS.
Status
- Phase 0: DONE
- Phase 1: DONE (18 bullets final, after page-fill adjustment from 16)
- Phase 2: DONE — compiled, 2 pages, 18 bullets, all char counts within budget
- CL: DONE — compiled, 1 page, ~285 words, 2 em-dashes, all hooks verified
- Critique: CURRENT (Pass 2 = 84.5/100; all Pass 1 Tier 1 fixes verified applied)
Critique summary (Pass 2):
- Score trajectory: Pass 1 81.5 → Pass 2 84.5 (+3.0). Converged near theoretical max ~86; hard ceiling ~88 (Rust gap).
- All three Pass 1 Tier 1 fixes verified in compiled PDF: summary crypto/Solidity hook lands at recruiter-glance speed; B3 carries "agent assistants" + "LLM API gateway, model routing"; B6 reframed to ML/analytics consumers (no B2B dashboards).
- ATS match 76% → ~80%. Compile clean (2pp resume + 1pp CL). AI fingerprint clean (em-dashes 1+2, no banned words, no -ing endings).
- No Tier 1 fixes remaining. Tier 2 polish optional: (a) add "agent orchestration / guardrails" to skills group #1, (b) CL active-bridge closer, (c) trim B4 -7 chars.
- Verdict: Submit-ready as-is. Tier 2 only if a polish round desired.
Output Files:
- e2e_kraken_ai_infra_resume.tex/.pdf: 2 pages, 176KB
- e2e_kraken_ai_infra_cover_letter.tex/.pdf: 1 page, 143KB
- resume.cls: copied locally for compilation
Hook verification log (CL):
- "Oxidizing Kraken" — verified via blog.kraken.com (Feb 2021, Simon Chemouil)
- "Kraken CLI MCP-native for Claude, Cursor, Codex" — verified via github.com/krakenfx/kraken-cli
- Kraken customer since 2017 + Solidity — personal claim from user memory (user_crypto.md)
Next: /clear then /critique output/Kraken_AI_Infrastructure/session_kraken_ai_infra.md