Research Engineer/Fellow (Applied AI research in computer vision, vision-language models) (XBJ)

Posting Start Date: 27/02/2026

Schemes of Service: Research

Division: Infocomm Technology

Employment Type: Fixed Term

Job Purpose

As a University of Applied Learning, SIT works closely with industry in its research pursuits. Our research staff will have the opportunity to develop applied research skill sets relevant to industry needs while contributing to research projects at SIT. The primary responsibility of this role is to contribute to an industry innovation research project as part of the research team, focusing on applied AI research in vision-language models (VLMs) and multimodal AI for real-world applications.

Key Responsibilities

Participate in and manage the research project with the Principal Investigator (PI), Co-PI, and research team members to ensure all project deliverables are met.
Conduct applied AI research in vision-language models, multimodal understanding and generation, and agentic AI.
Develop multimodal AI solutions for semantic understanding and reasoning applications.
Collaborate with industry partners to develop AI solutions addressing real-world challenges.
Develop and maintain research prototypes and software systems.
Publish impactful research findings and present work for knowledge sharing.
Work independently and collaboratively to ensure high-quality project delivery.

Job Requirements

Have relevant competence in the areas of computer vision, multimodal AI, or vision-language models.
Have a Bachelor’s or Master’s degree in computer science, data science, AI, or related fields.
Knowledge of large vision-language models (LVLMs), multimodal learning, or visual generation will be advantageous.

Key Competencies

Proficiency in programming languages (e.g., Python and C++), and familiarity with AI frameworks (e.g., PyTorch).
Familiarity with multimodal AI frameworks, large language models (LLMs), or vision-language model development.
Self-directed learner who believes in continuous learning and development.
Proficient in technical writing and presentation.
Possess strong analytical and critical thinking skills.
Show strong initiative and take ownership of work.
