# Google — Senior Data Engineer (Merchant Data Science) — Interview Loop Prep Brief

> Status: PASSED Hiring Assessment 2026-06-20; recruiting reviewing for next steps.
> Role: Merchant Data Science, Merchant Shopping Org. Zürich / Mountain View.
> JD level chip = "Mid"; title = Senior → **clarify L4 vs L5 with recruiter** (see §5).
> Comp posted US $156–227k + 15% bonus + equity; **Zürich band NOT posted → verify it clears CHF 180k all-in** (see §5).

This brief maps the expected loop rounds (confirm the exact list with the recruiter, §5) onto your real Swisscom (SW-*) and Bosch (BS-*) evidence.
Accuracy rule still applies: own your domains/products/pipelines; never claim solo ownership of company-scale platforms.

---

## 1. The funnel from here

1. ✅ Resume screen
2. ✅ Hiring Assessment (result valid for 24 months)
3. **Recruiter call** — level, comp, logistics. *Your leverage point — settle level + comp here.*
4. **Technical phone screen** — ~45 min, shared Google doc: coding + SQL
5. **Virtual onsite loop** — 4–5 rounds (below)
6. **Hiring Committee** — packet review off your loop; confirms level
7. **Offer + comp review**

Timeline ≈ 6–10 weeks; 1–2 week silences are normal.

---

## 2. The onsite loop — what to expect for a Google DE

| Round | What it tests | Your prep weight |
|---|---|---|
| Coding ×1–2 | Python, data structures/algorithms, moderate (not hardcore competitive) | **HIGH — warm up most** |
| SQL + data modeling | Window functions, joins, dimensional modeling, schema design | MED — your wheelhouse, just refresh |
| Pipeline / system design | Design a batch+stream data product at scale: ingestion, modeling, quality, lineage, serving | MED — strong story material |
| Googleyness & Leadership | Behavioral / ownership / ambiguity / collaboration | MED — prep STAR stories |

The JD's center of gravity is **"build durable data products + self-serve tools + automated pipelines + data quality/reliability monitoring"** — that's design + modeling + quality, all things you do. The coding rounds are the part furthest from your daily work, so that's where the hours go.

---

## 3. Round-by-round, mapped to YOUR evidence

### 3a. Coding (the gap to close)
- Format: live Python in a shared doc, no autocomplete, talk while you code.
- Scope: arrays/strings/hashmaps, two-pointers, sliding window, sorting, recursion/trees, occasionally graphs/heaps. Google leans clean problem-solving + communication over obscure tricks.
- **Action:** ~2–3 weeks of LeetCode Easy→Medium in Python (§6 compresses the core of this into two weeks). Target ~40–60 mediums. Focus patterns: hashmap counting, intervals, BFS/DFS, top-K/heap. Practice *narrating* your approach and stating complexity (Big-O) out loud.
- Leverage: you write Python daily (SW-2, SW-3, SW-6 PySpark) — you're fluent, you just need interview-shaped reps. Don't over-study DP/hard graph theory; not the Google DE bar.
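
Two of the focus patterns above (hashmap counting with a top-K heap, and intervals) can be sketched as follows. This is a warm-up sketch, not a known Google question; the function names and inputs are hypothetical. Each docstring states the complexity, which is the part worth practising out loud.

```python
import heapq
from collections import Counter


def top_k_frequent(items, k):
    """Return the k most frequent items, most frequent first.

    Counting is O(n); heapq.nlargest over m distinct keys is O(m log k).
    """
    counts = Counter(items)
    return heapq.nlargest(k, counts, key=counts.get)


def merge_intervals(intervals):
    """Merge overlapping closed intervals. Sorting dominates: O(n log n)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend its end if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged
```

When narrating, say why the heap beats a full sort (only `k` items are kept) and why sorting by start makes a single pass enough for the interval merge.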

### 3b. SQL + data modeling (your strength)
- Likely: write non-trivial SQL (window functions `ROW_NUMBER/RANK/LAG`, CTEs, multi-join aggregations, dedup), then design a dimensional model (star/snowflake, fact vs dimension, SCD types).
- **Your evidence:** SW-2 (Oracle→Kafka→Teradata DWH ownership), SW-1 (Athena/Iceberg lakehouse), SW-7 (Data Mesh data products + metadata). You model dimensional warehouses for a living.
- **Action:** refresh window-function fluency and be ready to talk SCD Type 2, grain selection, idempotent loads. Prep a crisp answer for "how do you model X for analytics consumers" — the JD is explicitly about self-serve data products.
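
For the window-function refresh, a minimal self-contained drill using Python's built-in `sqlite3` (assuming the bundled SQLite is 3.25 or newer, which is when window functions were added). The `merchant_events` table and its columns are hypothetical. It dedups redelivered events with `ROW_NUMBER`, then uses `LAG` to compare each event with the merchant's previous one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE merchant_events (
    merchant_id TEXT, event_id TEXT, amount REAL, loaded_at TEXT
);
INSERT INTO merchant_events VALUES
    ('m1', 'e1', 10.0, '2026-01-01'),
    ('m1', 'e1', 10.0, '2026-01-02'),  -- duplicate delivery of e1
    ('m1', 'e2', 25.0, '2026-01-03'),
    ('m2', 'e3',  5.0, '2026-01-01');
""")

DEDUP_SQL = """
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY merchant_id, event_id
               ORDER BY loaded_at DESC
           ) AS rn
    FROM merchant_events
)
SELECT merchant_id, event_id, amount, loaded_at,
       amount - LAG(amount) OVER (
           PARTITION BY merchant_id ORDER BY loaded_at
       ) AS delta_vs_prev
FROM ranked
WHERE rn = 1            -- keep only the latest copy of each event
ORDER BY merchant_id, loaded_at
"""

rows = conn.execute(DEDUP_SQL).fetchall()
for row in rows:
    print(row)
```

The grain here is one row per `(merchant_id, event_id)`; stating the grain before writing the query is the habit interviewers tend to look for in the modeling half of the round.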

### 3c. Pipeline / system design (your strongest story)
- Prompt shape: "Design a data product/pipeline that ingests merchant event data and serves it for analytics/ML at scale."
- **Hit these beats (all from your real work):**
  - Ingestion: batch + streaming — **Kafka event ingestion** (SW-2)
  - Storage/format: **S3 + Athena + Apache Iceberg** open table format, partitioning (SW-1)
  - Orchestration: **Airflow / Step Functions**, idempotency, backfills (SW-1)
  - Modeling: dimensional + **reusable data products** in a **Data Mesh** (SW-7) — *say "governed data products within Swisscom's company-wide Data Mesh," not "I built the Data Mesh"*
  - Quality/reliability: **automated validation, data quality checks, monitoring** — JD calls this out explicitly; tie to your Component Owner SLA + on-call (SW-2) and Bosch observability PoC (BS-4: ELK/Grafana/Prometheus/Loki)
  - Governance/metadata: **active metadata management + data catalog** (SW-7) — discoverability for self-serve
  - Serving: tables/views for BI + ML consumers
- This round is where your seniority shows. Practice drawing it on a whiteboard/doc in 35 min with trade-offs (cost vs latency, batch vs stream, schema evolution with Iceberg).
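
The idempotency, backfill, and quality-check beats above can be illustrated with a small sketch. It is an in-memory stand-in, not Airflow or Iceberg code; `PartitionedTable`, `validate`, and `load_partition` are hypothetical names. The idea it shows: validate the batch first, then overwrite the whole date partition, so a rerun or backfill of the same day converges on the same state.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionedTable:
    """In-memory stand-in for a date-partitioned table (hypothetical)."""
    partitions: dict = field(default_factory=dict)

    def overwrite_partition(self, ds, rows):
        # Replace the whole partition so reruns and backfills converge
        # on the same state instead of appending duplicates.
        self.partitions[ds] = list(rows)


def validate(rows):
    """Return a list of human-readable data quality failures."""
    failures = []
    if not rows:
        failures.append("partition is empty")
    ids = [r["event_id"] for r in rows]
    if len(ids) != len(set(ids)):
        failures.append("duplicate event_id")
    if any(r["amount"] is None or r["amount"] < 0 for r in rows):
        failures.append("missing or negative amount")
    return failures


def load_partition(table, ds, rows):
    """Validate first, then overwrite; a failed check leaves the table untouched."""
    failures = validate(rows)
    if failures:
        raise ValueError(f"{ds}: " + "; ".join(failures))
    table.overwrite_partition(ds, rows)
```

In the design round, the trade-off to mention is that partition overwrite is simple and safe for batch backfills, while late-arriving streaming data usually needs a keyed merge instead.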

### 3d. Googleyness & Leadership (behavioral)
Prep 5–6 STAR stories. Strong candidates from your KB:
- **Ownership/autonomy:** Component Owner of business-critical Fulfillment pipelines under on-call SLA (SW-2). JD: "operate with high autonomy, own from conception to impact."
- **Driving a migration:** legacy Teradata/Oracle → AWS cloud-native (SW-1). Trade-offs, sequencing, risk.
- **Ambiguity / non-routine problem:** Data Mesh / data-product modeling where requirements weren't pre-defined (SW-7).
- **Production reliability under pressure:** ML inference into a 24/7 semiconductor fab — zero-downtime constraint (BS-1).
- **Cross-functional collaboration:** B2B stakeholders + Product Owner backlog (SW-4). JD: "collaborate with data scientists, engineers, PMs."
- **Raising the bar / quality:** anomaly-detection + observability PoC at Bosch (BS-4), Security Champion rigor (SW-5).

---

## 4. JD-to-evidence quick map (for tailoring answers)

| JD signal | Your proof |
|---|---|
| Design data pipelines, dimensional modeling, sync+async | SW-2 (Kafka async + batch into Teradata DWH) |
| Spark / Dataflow-class processing | SW-6 (PySpark distributed processing) |
| Build scalable data products + self-serve tools | SW-7 (Data Mesh data products, metadata/catalog) |
| Automated validation, data quality, reliability monitoring | SW-2 (SLA/on-call), BS-4 (ELK/Grafana/Prometheus/Loki) |
| ML for production workflows (preferred) | BS-1 (containerized ML inference in 24/7 fab) |
| Coding in 1+ languages, 5 yrs | Python (primary), Java, some C#; PySpark, SQL |
| Stakeholder partnering | SW-4 (B2B products, PO collaboration) |
| BI / notebooks / Tableau / Power BI (preferred) | SW-4 dashboards; Bosch Spotfire co-ownership (TAF 2022) |

**Honest gaps to be ready for** (don't fabricate):
- Google-internal stack (Flume, Dremel, Borg) plus BigQuery, the public product built on Dremel: you won't know the internal tools; say so and map to your equivalents (PySpark↔Flume pipelines, Athena/Iceberg↔Dremel/BigQuery, K8s↔Borg, Kafka↔Pub/Sub). They expect external candidates to learn internal tools.
- No giant-consumer-scale (billions of rows/day) claim unless true — frame your scale honestly (national telco data volumes).

---

## 5. Recruiter call — settle these BEFORE the loop

You're interviewed *at* a level; Hiring Committee confirms it off that loop. Getting slotted wrong is hard to undo, so pin it down now.

**Script:**
- *Level:* "The posting chip said 'Mid' but the title is Senior. Can you confirm whether this is L4 or L5? I'm currently Staff (Engineer IV) at Swisscom, so I want to make sure I'm being considered at the right level." → **You should be pushing for L5.** An L4 offer is likely a lateral/down move vs your current Staff title and your CHF 180k bar.
- *Comp:* "The US band was posted but not the Zürich one — can you share the Zürich base range and the total-comp picture (base + bonus + equity)?" → Your bar: **clears CHF 180k all-in**, hybrid ≤2–3 days, Bern-commutable to Zürich.
- *Location/logistics:* confirm Zürich-based, hybrid expectation, and that a Bern commute / remote-days split works.
- *Process:* "What does the loop look like and what's the timeline?" — get the round list so you prep the right things.

If they only have an L4 budget or the Zürich comp won't clear CHF 180k all-in, that's a decision point *before* you invest weeks in the loop.

---

## 6. Two-week prep plan (if loop gets scheduled)

- **Week 1:** LeetCode Python — 4–5 mediums/day across hashmap/two-pointer/sliding-window/BFS-DFS. Refresh SQL window functions. Draft + rehearse the §3c pipeline design end-to-end.
- **Week 2:** Mock the design round out loud (record yourself, 35 min). Write out the 5–6 STAR stories in §3d. Do 2–3 timed mock coding rounds (talk while coding). Light review of dimensional modeling / SCD / idempotency.
- **Day before:** re-read this brief, the JD, and your submitted resume so your stories match what they have.

---

_Generated 2026-06-20. Source: live JD (output/Google_Senior_Data_Engineer/JD_*.txt), experience_swisscom.md, bundle_data_engineer.md._