Backend Software Engineer, Inference Optimization Department of an Internet Giant
Hireio, Inc.
Singapore
SGD 80,000 - 100,000
Job description
About the Team
Our Trust and Safety (TnS) R&D team is fast-growing and is responsible for building machine learning models and systems that identify and defend against internet abuse and fraud on our platform. Through the team's continuous efforts, our globally popular app is able to provide the best user experience and bring joy to everyone in the world.
Our mission is to build a bridge between algorithm models and business scenarios, to apply TnS's algorithmic capabilities to those scenarios efficiently and stably, and to extend those capabilities to wherever this app needs them.
Responsibilities
Work closely with business teams to optimize the integration plan for algorithm applications, improve efficiency in evaluating and using algorithm applications across various business scenarios, and reduce the cost of managing and optimizing algorithm applications in different business scenarios.
Be responsible for the architectural design, development, and performance tuning of algorithm applications, solving technical challenges such as high concurrency, high reliability, and high scalability. Work includes multiple sub-areas: resource scheduling, task orchestration, model training, model inference, model management, dataset management, workflow orchestration, etc.
Be responsible for researching and introducing cutting-edge engineering technologies related to algorithms.
Qualifications
Bachelor's degree in Computer Science or Engineering, or equivalent practical experience.
Mastery of at least one of the following languages in a Linux environment: C/C++, Python, Go.
Mastery of at least one state-of-the-art machine learning framework (e.g., TensorFlow, PyTorch).
Strong analytical and problem-solving abilities.
Preferred Qualifications
Experience working at large-scale tech companies.
Familiarity with model optimization techniques such as quantization and pruning.
Practical experience in performance optimization/tuning of deep learning model training/inference.
Practical experience in CUDA programming and TensorRT.
Familiarity with LLM inference frameworks such as vLLM and TensorRT-LLM.