We are looking for a Research Engineer to join our MOE AcRF Tier 1-funded research project at the Singapore Institute of Technology. You will be part of an interdisciplinary team spanning accounting and NLP/AI, working to design, build, and validate a pipeline that detects inconsistencies and contradictions between firms' written financial filings (10-K/10-Q) and their spoken earnings call transcripts. The role involves implementing retrieval-augmented natural language inference (NLI) models and large language models (LLMs), processing large-scale financial text data, and supporting empirical analysis of the resulting contradiction scores.
---
Job Responsibilities
- Drive the technical execution of the research project under the guidance of the PI and Co-PI, ensuring project deliverables and timelines are met.
- Design and implement the retrieval-augmented NLI pipeline for inconsistency and contradiction detection across corporate disclosures (earnings call transcripts and SEC filings).
- Perform large-scale text preprocessing, cleaning, segmentation, and indexing of earnings call transcripts and financial filing corpora.
- Implement claim extraction modules using LLM-based approaches to identify key assertions in corporate disclosures.
- Build semantic retrieval mechanisms (e.g., similarity search over Sentence-BERT embeddings) for matching claim-context pairs across disclosure channels.
- Implement and evaluate contradiction detection models using NLI frameworks and LLM prompting techniques.
- Aggregate model outputs into firm-quarter level scores and prepare structured datasets for empirical analysis.
- Support human-in-the-loop validation by coordinating with student annotators and preparing annotation guidelines.
- Maintain reproducible research workflows using version control (Git) and systematic documentation of processes, code, and findings.
---
Job Requirements
- Bachelor’s or Master’s degree in Computer Science, Information Systems, Data Science, Artificial Intelligence, or a closely related field.
- Strong programming skills in Python, with experience in relevant libraries such as HuggingFace Transformers, PyTorch/TensorFlow, and scikit-learn.
- Prior experience or coursework in natural language processing, machine learning, or information retrieval.
- Familiarity with version control tools (e.g., Git/GitHub) and collaborative development practices.
- Experience with financial data or financial text analysis will be advantageous.
- Familiarity with LLM APIs, prompt engineering, and retrieval-augmented generation (RAG) workflows will be advantageous.
- Good communication skills and the ability to work both independently and collaboratively.