About The Role
This is a high-ownership applied ML role focused on speech under real production constraints. You will improve Southeast Asian (SEA) speech performance across languages, accents, code-switching, and noisy audio while meeting latency, cost, and reliability requirements. You will be trusted with production-impacting changes and expected to operate with maturity, initiative, and speed.
What This Role Is Really About
You are not here just to run notebooks.
You are here to:
- Take ownership of model and pipeline improvements that move core speech metrics.
- Move from experiments to deployed improvements without being micromanaged.
- Identify failure modes and edge cases in real-world speech data.
- Ship models, features, or tuning that measurably improve accuracy, robustness, or latency.
- Think beyond BLEU/WER and understand customer and business impact.
You should be comfortable in an environment where:
- Requirements and evaluation criteria evolve.
- Data is messy, multilingual, and imperfect.
- Speed matters, but quality and safety matter too.
- You must make decisions with incomplete labels and signals.
Responsibilities
- Experiment with and tune speech/ASR models for SEA languages and accents.
- Design and run experiments under realistic production constraints (latency, cost, memory).
- Work on inference optimisation and GPU utilisation.
- Develop strategies for multilingual and code-switching scenarios.
- Collaborate with engineering to integrate models into production pipelines.
- Build evaluation suites and datasets for tracking model performance.
- Document approaches, experiments, and tradeoffs.
What We Expect From You
- Founding Mindset
  - You think in terms of shipped improvements, not just paper metrics.
  - You ask “how will this behave in production?” before trying a new approach.
  - You act like speech quality is your responsibility.
  - You balance research depth with shipping velocity.
  - You don’t wait for others to point out model failures; you go find them.
- Maturity
  - You communicate clearly about what is known, unknown, and risky.
  - You admit when an experiment failed and extract learning.
  - You take feedback from both researchers and engineers without ego.
  - You stay calm under pressure when a model behaves unexpectedly in production.
  - You follow through on investigations into failure modes.
- Initiative
  - You propose new hypotheses, architectures, or data strategies.
  - You investigate root causes behind model errors instead of just tweaking hyperparameters.
  - You improve evaluation pipelines and diagnostics.
  - You refine data curation and annotation processes.
  - You continuously balance performance and cost optimisations.
- ML / Speech Competence
  - Solid Python and PyTorch fundamentals.
  - Understanding of speech and ASR basics.
  - Experience with model training, fine-tuning, and evaluation.
  - Familiarity with GPU inference and optimisation workflows.
  - Practical ML engineering mindset, not just theory.
Bonus
- Experience with multilingual or low-resource speech.
- Exposure to on-device or low-latency inference.
- Experience shipping ML models into production systems.
What Success Looks Like
- You own improvements to a specific speech use case or language.
- You ship at least one measurable improvement in accuracy, robustness, or latency.
- You identify and document notable failure modes and mitigation strategies.
- You contribute to model evaluation and monitoring infrastructure.
What You Gain
- Real-world applied ML experience under production constraints.
- Direct collaboration with founders and senior engineers.
- A portfolio of experiments and shipped improvements in production.
- A path towards an applied ML or speech-focused engineering role.
Who Should Not Apply
- If you only want to work on toy datasets and offline benchmarks.
- If you avoid messy data and hard debugging.
- If you prefer purely research environments detached from production.
- If you are looking for a low-intensity internship.
Who Will Thrive Here
- Builders who love shipping ML to production.
- Systems thinkers who see the whole pipeline, not just the model.
- Calm debuggers of strange model behaviour.
- High-agency individuals who care about real-world impact.