The successful candidate will implement, test, optimize, and deploy multimodal generative foundation models with a focus on the audio modality.
Design, implement, and maintain data pipelines for various purposes.
Demonstrate proficiency in generative AI techniques, including familiarity with large language models (LLMs).
Establish best practices in software engineering applied to generative AI.
Keep up to date with emerging AI trends, tools, and frameworks.
Manage the codebase using version control such as Git (for example, on GitLab).
Ensure AI solutions adhere to security and privacy standards. Address data privacy risks, model vulnerabilities, and ethical considerations.
Create detailed architecture documentation, including diagrams, flowcharts, and technical specifications.
Contribute to open-source releases and publications.
Qualifications and Experience
To qualify for this position, you will need to meet the following requirements:
Master’s degree in Computer Science, Computer Engineering, or equivalent, with 5+ years of experience in architecting, designing, developing, and deploying AI solutions.
Expertise in audio-related fields: audio (speech, sound, or music) generation, text-to-speech (TTS) synthesis, text-to-music generation, text-to-sound generation, speech recognition, speech/audio representation learning, video-to-audio generation, audio-visual learning, audio language models, lip sync, etc.
Experience with at least one of the following ML frameworks: PyTorch, TensorFlow.