Watch the full debrief

Two voices. One question. The insider reaction you don't usually see.

Also on YouTube 5–7 min 2026

Question decoded

"Your Reels recommendation model passed offline evaluation but online engagement dropped 8% after launch. Walk me through how you diagnose this."

Competency tested

Production ML Ownership

Who asks it

Bar Raiser · HM · Peer

What they're really asking

Do you own models in production or just train them?

Answers compared

The answer that fails — and why

Candidate answer Does not raise the bar — Production ML Ownership

I'd start by looking at the usual suspects — data distribution shift between training and serving, maybe a feature engineering mismatch, or possibly label leakage in the training set. I'd also check whether the serving pipeline was preprocessing inputs the same way as training. Position bias is another thing worth considering if the new model is surfacing content the old ranker never showed. Once I narrowed it down I'd file a bug, work with the data team to pull logs, and iterate from there.

Bar Raiser evaluation

⚑ Enumerates failure classes but applies no prioritisation or sequence

⚑ No mention of existing monitoring, dashboards, or alerting infrastructure

⚑ Diagnosis deferred to the data team — no personal ownership of the investigation

⚑ No mechanism described to prevent this class of failure from recurring

Prefer to hear it? Watch the video for the two-voice delivery with live reaction commentary.

Meta debrief · MLE loop · Bar Raiser evaluation Below Bar

Meta Value: Production ML Ownership

Does not demonstrate Production ML Ownership.

✗ Lists failure classes without prioritisation; no evidence of systematic triage under time pressure

✗ No reference to existing monitoring or feature distribution dashboards — treats diagnosis as ad hoc

✗ Defers log analysis to data team; candidate does not own the investigation personally

✗ No instrumentation plan offered; same failure class would recur undetected at next launch

interview101.com · Production ML Ownership · Meta MLE · Bar Raiser debrief reference

→ Now here's what a strong answer actually sounds like

The answer that works — in full

Strong answer Raises the bar — Production ML Ownership

First thing I do is check whether this is a serving incident or a model quality issue — I pull the prediction score distribution from our serving logs and compare it against the offline holdout distribution. If those diverge, I know I have a serving skew problem before I even look at features. My priority order is: serving preprocessing mismatch first — it's the most common cause and fastest to verify — then feature distribution drift, then label leakage. I'd have a monitoring dashboard that already surfaces feature statistics at serving time versus training time, so I can isolate the failing feature in under an hour rather than guessing. If it's position bias, I check whether our training labels were collected under the previous ranker's distribution and run a swap test. Within 24 hours I'd have a root cause hypothesis, a rollback decision point, and a fix scoped. Going forward, I'd add a serving-versus-training skew alert to the launch checklist so no model ships without those checks in place.

Bar Raiser evaluation

✓ Leads with a concrete first diagnostic action — prediction score distribution check

✓ Prioritises failure classes explicitly with rationale, not just enumeration

✓ Owns the investigation end-to-end including the rollback decision point

✓ Closes with a systemic instrumentation fix to prevent recurrence at launch

Meta debrief · MLE loop · Bar Raiser evaluation Raises Bar

Meta Value: Production ML Ownership

Strong signal. Raises the bar.

✓ Opens with a concrete first action — prediction distribution comparison — not a brainstorm

✓ Explicit prioritisation of failure classes with rationale; reflects real production triage discipline

✓ Owns rollback decision point personally; no delegation of the investigation to other teams

✓ Proposes serving-versus-training skew alert on the launch checklist — systemic, not one-off fix

interview101.com · Production ML Ownership · Meta MLE · Bar Raiser debrief reference

Fix your answer before your loop

Run your story through these three questions

1

Do you name a specific first diagnostic action, not a list of possibilities?

If not, you sound like a researcher brainstorming, not a production owner triaging.

2

Do you explain why you check failure classes in the order you choose?

Without rationale, prioritisation looks arbitrary — the Bar Raiser needs to see your mental model.

3

Do you end with an instrumentation fix that prevents this failure class recurring?

Diagnosis without prevention shows you treat production incidents as one-off events, not systemic risks.

Get your personalized report

How do your real stories score?

Get a personalized report scored against the interview rubric Meta uses for your role.

Get your Meta Machine Learning Engineer report →

More Meta Machine Learning Engineer debriefs