As a University of Applied Learning, the Singapore Institute of Technology (SIT) works closely with industry in its research pursuits. This position is situated within the SIT x NVIDIA AI Centre (SNAIC).
This role is part of an industry innovation project with a large consumer goods company, where you will develop an evaluation framework for vision-language models (VLMs) with applications in the personal care sector. The research focuses on fine-grained VLM capabilities such as spatial reasoning, temporal grounding, event tracking, and domain knowledge, evaluated using a curated multimodal dataset.
Key Responsibilities
- Manage the research project together with the Principal Investigator (PI) and industry partner to ensure all project deliverables are met
- Design and implement evaluation frameworks and metrics for vision-language models
- Develop annotated video datasets and capability-tagged evaluation tasks
- Build end-to-end evaluation pipelines and failure-mode analysis tools to assess VLM performance across reasoning dimensions
- Prepare technical reports, publications, and industry-facing deliverables
- Mentor student assistants
- Communicate with internal and external parties to ensure project deliverables are met
- Perform any other ad hoc duties as assigned by the supervisor
Requirements
- PhD in Computer Science or a related field
- Expertise in computer vision and vision-language models
- Experience with ML evaluation metrics and benchmarking
- Proficiency in Python and deep learning frameworks (e.g., PyTorch)
- Interest in applied, industry-collaborative research