AI Inference Engineer

Perplexity AI

We are looking for an AI Inference Engineer to join our growing team. Our current stack includes Python, C++, TensorRT-LLM, and Kubernetes. You will have the opportunity to work on large-scale deployment of machine learning models for real-time inference.


Responsibilities

  • Develop APIs for AI inference that will be used by both internal and external customers.
  • Benchmark and address bottlenecks throughout our inference stack.
  • Improve the reliability and observability of our systems and respond to system outages.
  • Explore novel research and implement LLM inference optimizations.

Qualifications

  • Required:
    • Experience with ML systems and deep learning frameworks (e.g., PyTorch, TensorFlow, ONNX).
    • Familiarity with common LLM architectures and inference optimization techniques (e.g., continuous batching, quantization).
    • Experience deploying reliable, distributed, real-time model-serving systems at scale.
  • Optional:
    • Understanding of GPU architectures or experience with GPU kernel programming using CUDA.

Compensation

The cash compensation range for this role is $190,000 - $240,000.


About Perplexity

At Perplexity, we've experienced tremendous growth and adoption since publicly launching the world's first fully functional conversational answer engine just over a year ago. Our AI-powered search assistant has:

  • 10 million monthly active users as of early 2024.
  • Over 1 million mobile app installs across iOS and Android devices.
  • 500 million queries served globally in 2023 alone.

Our Growth & Funding

To support our rapid expansion, we've raised significant funding from some of the most respected investors in technology:

  • January 2024: $73.6 million Series B round led by IVP, with participation from NVIDIA, Jeff Bezos' investment fund, NEA, Databricks, and other prominent firms.
  • April 2024: $62.7 million Series B1 round led by Daniel Gross, valuing Perplexity at over $1 billion.

Prominent Investors

Our investor base includes:

  • IVP, NEA, NVIDIA, Databricks
  • Jeff Bezos, Bessemer Venture Partners
  • Elad Gil, Nat Friedman, Naval Ravikant, Tobi Lutke, and many other visionary individuals.

Join us to shape the future of AI-powered systems!

Location

    San Francisco Bay Area

Job type

  • Full-time

Role

Engineering

Keywords

  • AI
  • Inference