Ghasem Nobari — AI & Computational Sciences

Project deep dives

The systems behind the work — across Novartis, Optum, Nokia Bell Labs, Leadbook and NUS. Each diagram shows, step by step, how the system actually works.

Novartis · AI Innovation Centre · 2020–present

Director, Data Science, AI & Computational Sciences — taking AI from prototype to validated production across discovery, R&D and enterprise; partnerships with NVIDIA, Microsoft and OpenAI.

[Diagram: Material approval · evidence. Material / claim received (needs approved evidence) → evidence docs, internal (lab reports) and external (FDA · approvals), ranked by score → document elements (text · table · chart · image · slide), ranked by score → human reviewer in the loop; ≥ 90% → auto-approve]

01 · Novartis

Enterprise GenAI · Healthcare

Horizon MAP

An end-to-end multimodal platform that automates medical-claim and material validation. The hard part was the inputs: messy, unstructured lab reports with embedded charts and tables. Custom-built models parse those charts and tables into structured fields, a library of ~20 specialist models is orchestrated over them, and a unified evaluation layer with human-in-the-loop gates and full audit trails turns a multi-week manual review into a near-real-time, traceable workflow.

Parse unstructured reports · charts · tables → Orchestrate ~20 specialist models → Evaluate · human gate · audit

40 → 1 weeks to validate

~20 models orchestrated

multimodal: text · charts · tables

Custom chart/table parsers, Multimodal LLMs + OCR, Model orchestration, Evaluation harness, HITL review, Audit & governance
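The human gate can be pictured as a confidence threshold. What follows is a minimal Python sketch, not the platform's implementation: it assumes each piece of evidence carries a score in [0, 1] and that a claim's confidence is the score of its best supporting evidence. The 90% auto-approve threshold comes from the diagram; `Evidence`, `route_claim` and the aggregation rule are hypothetical.

```python
from dataclasses import dataclass

AUTO_APPROVE_THRESHOLD = 0.90  # from the diagram: >= 90% -> auto-approve


@dataclass
class Evidence:
    source: str   # e.g. "FDA label", "Lab report"
    score: float  # assumed relevance/confidence in [0, 1]


def route_claim(evidence: list[Evidence]) -> dict:
    """Route a claim to auto-approval or human review and return an audit record."""
    # Assumption: the claim is as strong as its best-scoring evidence.
    best = max(evidence, key=lambda e: e.score, default=None)
    confidence = best.score if best else 0.0
    decision = "auto-approve" if confidence >= AUTO_APPROVE_THRESHOLD else "human-review"
    return {
        "decision": decision,
        "confidence": confidence,
        "supporting_source": best.source if best else None,
        "evidence_considered": [e.source for e in evidence],
    }
```

Returning the full record rather than a bare decision is what would let every approval be traced back to the evidence that justified it.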

[Diagram: Generative · brand-safe. Brief + rules → generate → brand guardrails → on-brand output (copy · layout · visuals)]

02 · Novartis

Generative AI · Marketing

Horizon-X

A sibling to Horizon MAP that turns the same governed foundation toward creation: an LLM system that generates marketing and communication materials — copy, layout and on-brand visuals — from a short brief and a set of brand and compliance rules, with review built in. The same orchestration and guardrails that validate documents are reused to produce them.

Brief + brand & compliance rules → Generate with an LLM → On-brand copy · layout · visuals

Brief → deck in minutes

On-brand: guardrailed output

Reuse of Horizon stack

Generative LLMs, Brand / compliance guardrails, Template synthesis, Human review

[Diagram: Autonomous R&D loop. Ideation (agents propose hypotheses) → library agent over tools · MCP (search · fetch), queried by researchers and agents → 10+ LLMs: fast-7B (S), mid-70B (M), frontier (L), reasoning (L), code (M), embed (S) → personas: Orchestrator, Planner, Searcher, Coder, Critic, Writer → in-silico benchmark · DGX + cloud → validated cycle report ↻ feeds the next R&D cycle]

03 · Novartis

Agentic · Autonomous R&D

Multi-Agent R&D

A first-of-its-kind agentic system for continuous, autonomous R&D, running as an iterative loop: ideation, then agents that build their own library of papers and tools end-to-end, exposed through a library agent over tools and MCP that both other agents and human researchers can query. Specialist personas — orchestration, planning, search, code, critique, writing — run across 10+ heterogeneous LLMs, with shared memory, human checkpoints, GPU/cloud orchestration, in-silico benchmarking and a comprehensive report after every cycle.

Ideate + build the library → Experiment + benchmark on GPU/cloud → Report · loop the cycle

80–100+ GPUs · DGX + cloud

10+ LLMs · agent personas

24/7 autonomous cycles

Agentic personas, Self-built paper + tool library, Library agent · MCP, RAG, vLLM serving, Eval (hallucination / task), NVIDIA DGX

[Diagram: Lab-in-the-loop · antibody design. RL loop on DGX Cloud: de novo design → MD sim → ML emulator → wet-lab → feedback]

04 · Novartis

AI-Driven Drug Discovery · Galaxy Award ’25

Biologics AI Platform

An integrated antibody-discovery platform built around a reinforcement loop. Large-scale molecular-dynamics simulations are orchestrated across ~80 GPUs on DGX; their trajectories feed feature extraction that fine-tunes ML emulators of biophysical and developability properties: stability, aggregation and manufacturability. Predictions are benchmarked against wet-lab results (lab-in-the-loop), and the gap between prediction and measurement drives the next round. The emulators cut compute cost while preserving accuracy; the platform also covers de novo design and binding-affinity optimisation.

MD simulation · 80 GPU → Trajectories to features → Fine-tune ML emulators → Benchmark vs lab, then repeat

≈ 40% faster screening

~80 GPU MD orchestration

2025 Novartis Galaxy Award

Molecular dynamics, ML emulators, Reinforcement loop, Generative de novo design, Affinity optimisation, NVIDIA DGX Cloud

[Diagram: Agentic RAG · evidence. Knowledge base → multi-agent RAG → extract + verify → developability risk. Sources: VH/VL · clinical · FDA · patents]

05 · Novartis

Agentic AI · Immunogenicity

Immunogenicity Evidence Extraction

A multi-agent RAG system that mines and validates immunogenicity evidence from VH/VL sequence-level signals all the way to clinical findings, FDA approvals and labels, patents and published studies. Extractor agents pull candidate evidence; critic agents challenge and cross-check it; a validation step enforces structured, dataset-level QA with traceable provenance — feeding downstream developability and risk assessment with evidence you can audit back to source.

Retrieve from sequences, labels and literature → Extract, then critic agents challenge → Validate with traceable provenance

Agent: extractor + critic

VH/VL → clinic: evidence span

Traceable: provenance + QA

Multi-agent RAG, Knowledge bases, Agent critics, Structured QA, Provenance tracking
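The extract, challenge, validate pattern can be illustrated with a toy Python sketch. It stands in for the real agents with plain functions: the extractor is a keyword match and the critic only checks that a quoted span appears verbatim in the cited source. All names here (`Finding`, `extract`, `critic_accepts`, `validate`) are hypothetical and the logic is deliberately simplistic.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    doc_id: str  # provenance: which source document
    quote: str   # provenance: the exact supporting span
    claim: str   # what the extractor says the span shows


def extract(docs: dict[str, str], keyword: str) -> list[Finding]:
    """Toy extractor: propose every sentence that mentions the keyword."""
    findings = []
    for doc_id, text in docs.items():
        for sentence in text.split("."):
            sentence = sentence.strip()
            if keyword.lower() in sentence.lower():
                findings.append(Finding(doc_id, sentence, f"mentions {keyword}"))
    return findings


def critic_accepts(finding: Finding, docs: dict[str, str]) -> bool:
    """Toy critic: reject a finding whose quote is not verbatim in the cited source."""
    return bool(finding.quote) and finding.quote in docs.get(finding.doc_id, "")


def validate(findings: list[Finding], docs: dict[str, str]) -> list[Finding]:
    """Keep only findings that survive the critic, so each traces back to source."""
    return [f for f in findings if critic_accepts(f, docs)]
```

Carrying `doc_id` and `quote` on every finding is the part that matters: a downstream reader can audit any retained claim against the exact span it came from.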

[Diagram: Multi-omics → ranked targets. Example ranking: Target ONC-A .95 · Target ONC-B .88 · Biomarker MS-1 .80 · Target ONC-C .72 · Biomarker MS-2 .64]

06 · Novartis

Multi-Omics · Target Discovery

Target ID & Biomarkers

AI-driven therapeutic-target identification that integrates multi-omics and high-dimensional biology with graph-based learning to surface and rank novel targets. A companion biomarker-discovery effort (imaging + proteomics) advances predictive signatures across neuroimmunology programmes, including progressive multiple sclerosis.

Integrate multi-omics → Graph-based learning → Rank novel targets

Oncology: target identification

MRI + omics: predictive signatures

MS: neuroimmunology

Multi-omics, Graph neural networks, High-dimensional biology, Imaging, Proteomics

More from this era: PKS, drug-report extraction (text · tables · charts); Microsoft AI Empowerment (2021); Responsible-AI governance (EU AI Act)

Optum · UnitedHealth · 2018–2020

Principal Data Scientist, AI Innovation (iLab) — production AI for affordability, payment integrity and clinical decision support inside a regulated US health plan.

[Diagram: Claims → spend driver. Embed claims → temporal drift → across layers (facility · provider · member) → flag driver]

07 · Optum

Payment Integrity · Custom Embeddings

Bi² — Behaviour Signal Intelligence

A system that transforms raw claim data into temporal signals using a custom-built embedding model, then tracks how those embeddings move through space and time across different layers of the healthcare system — facility, provider, member. Where an embedding drifts away from its neighbours, a previously invisible medical-spend driver surfaces — turning claims into prioritised, explainable affordability levers, with dynamic dashboards to explore them.

Embed claims with a custom model → Track drift across space · time · layers → Surface the spend driver

2 US patents

3 layers: facility · provider · member

self-supervised: custom embeddings

Custom embedding model, Temporal / spatial drift, Self-supervised learning, Dynamic dashboards, TensorFlow · Spark
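The idea of an embedding drifting away from its neighbours can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented method: embeddings are taken as given per time period, "neighbours" are simply all other entities in the same layer, and drift is the growth in Euclidean distance to their centroid. `distance_to_peers` and `flag_drift` are hypothetical names.

```python
import math

Vector = list[float]


def centroid(vectors: list[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def distance_to_peers(entity: str, embeddings: dict[str, Vector]) -> float:
    """Euclidean distance from one entity's embedding to the centroid of the others."""
    peers = [v for name, v in embeddings.items() if name != entity]
    return math.dist(embeddings[entity], centroid(peers))


def flag_drift(before: dict[str, Vector], after: dict[str, Vector],
               min_increase: float) -> list[str]:
    """Entities whose distance to their peers grew by more than `min_increase`."""
    flagged = []
    for entity in before:
        if entity not in after:
            continue  # entity absent in the later period: nothing to compare
        growth = distance_to_peers(entity, after) - distance_to_peers(entity, before)
        if growth > min_increase:
            flagged.append(entity)
    return flagged
```

One caveat with a plain centroid: a single entity that moves far also shifts everyone else's reference point, so a robust variant would likely use a median or nearest-neighbour set instead.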

[Diagram: Clinical signal → cost-saving levers. Example ranking: Site-of-care shift .94 · Generic substitution .88 · Readmission avoidance .80 · Care-gap closure .72 · Duplicate-test flag .64]

08 · Optum

Affordability · Cost Ideation

CIS — Clinical Intelligence System

An engine that automates medical cost-saving ideation. It reads clinical and claims signal across the population and translates it into a ranked set of actionable, explainable affordability levers — surfacing where care can be delivered better and cheaper, and handing analysts a prioritised worklist instead of a blank page.

Read clinical + claims signal → Generate cost-saving ideas → Rank affordability levers

1 US patent

ranked: actionable levers

explainable: analyst worklist

Clinical + claims signal, Cost-saving ideation, Ranking / prioritisation, Explainable AI

[Diagram: Payment integrity · NLP + XAI. Claim queue → NLP read → XAI explain → decision (✓ pay · ⚑ review), with a live claims-per-hour counter]

09 · Optum

Payment Integrity · NLP + XAI

AutoD — Automatic Payment Decision Helper

A high-throughput assistant for claim review and payment integrity. NLP reads each claim, an explainable-AI layer surfaces the reasons behind every call, and the system recommends pay-or-review under strict audit and compliance constraints — keeping a human in control while clearing the routine volume fast.

NLP reads the claim → XAI explains the call → Recommend · pay or review

NLP + XAI: explainable review

audit-grade: compliance constraints

HITL: human stays in control

NLP / NLU, Explainable AI (XAI), Payment integrity, Audit & compliance

More from this era: Unsupervised fraud & anomaly detection (claims-scale); Provider behavior monitoring (self-supervised); Dynamic dashboards (D3 · Plotly)

Nokia Bell Labs · 2016–2018

Lead Data Scientist, Innovation & AI — multimodal AI, large-scale media systems, and human-sensory interfaces (EEG / ECG / EMG / eye-tracking).

[Diagram: Text · image · video → one index, with live counters for indexed items and modalities]

08 · Nokia Bell Labs

Multimodal AI · IEEE

Smart News Aggregation

The largest smart news-aggregation system built in the lab: it ingests news as text, image and video, classifies it with multimodal models, and organises everything under shared, machine-generated topic labels so the whole corpus becomes searchable and linkable across formats. The indexing core (ANNOTATE) was demonstrated at Mobile World Congress 2018 and published at IEEE Big Data.

Ingest text · image · video → Multimodal classification → Unified index + search

MWC ’18: live demonstration

IEEE Big Data 2018

3 modalities, one index

Multimodal classification, Topic models, Semantic search, Spark, Azure / AWS, Docker · Kubernetes

[Diagram: Gaze → top-5. Read gaze → descend hierarchy → confidence > 80% → top-5. Sensors: EEG · ECG · EMG · eye]

09 · Nokia Bell Labs

Human-Sensory AI · HCI

Mind-Reader & Sensory Interfaces

Work on AI-driven, passive human–computer interaction using EEG, ECG, EMG and eye-tracking. The flagship demo: a “mind-reader.” You think of an object and stand before a gaze-tracking screen; as images cycle, the model reads where your eyes settle and walks you down the ImageNet hierarchy — narrowing from broad categories to specifics — until confidence passes 80% and it returns its top-5 guesses. No clicks, no typing; intent inferred from gaze.

Read gaze on cycling images → Descend the ImageNet hierarchy → Confidence > 80%, then top-5

4 sensor modalities

>80% confidence to commit

Passive: no clicks · no typing

Eye-tracking, EEG · ECG · EMG, ImageNet hierarchy, Bayesian narrowing, Passive interaction
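The gaze-driven descent can be sketched as a small Bayesian update over a category tree. This Python sketch is illustrative only: it assumes gaze is summarised as dwell seconds per displayed category and treats the dwell share as the likelihood. The 80% stopping rule and the top-5 output come from the description above; `FLOOR`, `update`, `guess` and the `observe` callback are hypothetical.

```python
FLOOR = 0.01  # assumed minimum likelihood, so no object is ever ruled out entirely


def leaves_under(tree: dict[str, list[str]], node: str) -> list[str]:
    """All leaf objects below a category (a leaf is its own only member)."""
    children = tree.get(node, [])
    if not children:
        return [node]
    return [leaf for child in children for leaf in leaves_under(tree, child)]


def update(posterior: dict[str, float], tree: dict[str, list[str]],
           dwell: dict[str, float]) -> dict[str, float]:
    """Bayes update: weight each object by the gaze share of the category shown for it."""
    total = sum(dwell.values()) or 1.0
    likelihood = {}
    for category, seconds in dwell.items():
        for leaf in leaves_under(tree, category):
            likelihood[leaf] = max(seconds / total, FLOOR)
    weighted = {leaf: p * likelihood.get(leaf, FLOOR) for leaf, p in posterior.items()}
    norm = sum(weighted.values())
    return {leaf: w / norm for leaf, w in weighted.items()}


def guess(tree, root, observe, threshold=0.8, k=5):
    """Descend the hierarchy until one object passes `threshold`, then return the top k."""
    leaves = leaves_under(tree, root)
    posterior = {leaf: 1 / len(leaves) for leaf in leaves}  # uniform prior
    node = root
    while tree.get(node) and max(posterior.values()) <= threshold:
        dwell = observe(tree[node])       # dwell seconds per child category shown
        posterior = update(posterior, tree, dwell)
        node = max(dwell, key=dwell.get)  # walk into the most-watched child
    return sorted(posterior, key=posterior.get, reverse=True)[:k]
```

Here `observe` would wrap the eye tracker; in a test it can be any function that maps the displayed categories to dwell times.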

[Diagram: Video → labelled object chunks. Video in → segment → DL label → object chunks (face · logo · scene · object · caption)]

10 · Nokia Bell Labs

Multimodal · Media at Scale

Automated Video Transformation

A scalable pipeline that turns raw video into structured, multimodal object chunks. It ingests media at scale, segments each video, and runs fast deep-learning models to label every segment — finding the inner similarities and relations across a library so that video becomes searchable, linkable content rather than an opaque stream.

Ingest + segment video → Fast DL segment labelling → Linked multimodal chunks

scalable: optimised ingestion

fast DL: segment labelling

linked: inner similarities

Video segmentation, Deep learning, Media ingestion, Similarity / relations, Spark

[Diagram: Meaning · infer → compact → personalise. Content → infer meaning → compact rep. → personalise. Labels: hier. LDA · disambig. · transfer · per-user]

11 · Nokia Bell Labs

NLP · Personalised Meaning

Smart Communication — Meaning Transformation

A system to represent, infer and communicate meaning for personalised content. Unsupervised hierarchical topic models — with experiments in deep generative networks — learn a compact representation of meaning that can be transferred and re-expressed per user, with evolving disambiguation of senses and generalised topic labelling across multimedia, so the same message adapts to each recipient.

Infer meaning · hierarchical topics → Compact, transferable representation → Personalised re-expression

hier. LDA: unsupervised topics

compact: transferable meaning

per-user: personalised output

Hierarchical LDA, Deep generative (GAN), Sense disambiguation, Topic labelling, Spark

More from this era: Indoor navigation & IoT context (2 patents); EEG / EMG passive interaction (UX research)

Leadbook · 2014–2016 · Singapore

Senior Data Scientist & R&D Lead — built one of Asia’s largest B2B intelligence platforms, end to end from data engineering to learned recommendation.

[Diagram: B2B intelligence graph · matched, with live counters for contacts and companies]

10 · Leadbook

B2B Intelligence · 11 Patents

B2B Graph & Prospect Recommender

One of Asia’s largest B2B intelligence graphs — tens of millions of verified company and contact records merged from across the web — with a prospect recommender built on a patented Company–Product–Customer “deep relationship” model that learns which new prospects resemble a customer’s best existing ones. The engineering ran from distributed crawling and entity-matching to real-time lookup.

Crawl & merge millions of records → Model company–product–customer ties → Recommend the right prospects

44M+ verified contacts

11.5M companies

11 Singapore patents

Deep Relationship Model, Spark · Hadoop, Elasticsearch (custom plugin), Go proxy, React extension, Torch
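The "prospects that resemble your best customers" idea can be shown with a simple stand-in. The Python sketch below is not the patented deep relationship model; it swaps in plain cosine similarity against the average feature vector of the best customers, and assumes each company is already described by a numeric feature vector. `cosine` and `recommend` are hypothetical names.

```python
import math

Vector = list[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(best_customers: dict[str, Vector], prospects: dict[str, Vector],
              k: int = 3) -> list[str]:
    """Rank prospects by similarity to the average profile of the best customers."""
    vectors = list(best_customers.values())
    profile = [sum(col) / len(vectors) for col in zip(*vectors)]
    ranked = sorted(prospects, key=lambda name: cosine(prospects[name], profile),
                    reverse=True)
    return ranked[:k]
```

A learned model would replace both the hand-built features and the averaging step, but the ranking interface would look much the same.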

More from this era: Real-time lookup proxy (Go); Contact-lookup browser extension (React.js); Persona-scoring plugin (Elasticsearch); Feature classifier (Torch); Email-validation service

NUS & Singapore · 2008–2014

PhD & Researcher, School of Computing (A*STAR SINGA) — NLP, topic models and privacy-preserving data systems, alongside ventures and AI-for-good prototypes.

[Diagram: Discussions → aspect / action topics. Example topics: pricing · complaint .92; feature · request .85; support · praise .77; delivery · issue .69; UX · suggestion .61]

11 · NUS (PhD)

NLP · AAAI

Aspect–Action Discussion Graph

Doctoral work that takes millions of flat user comments and posts and turns them into a temporal aspect–action graph: a joint aspect–action topic model infers what people are talking about and what they intend to do — without labelled data — and arranges it into a structured, time-aware hierarchy of who said what about which aspect, when. Published at AAAI; the foundation of a self-supervised discussion-analysis and prediction system, supervised by Prof. Chua Tat-Seng.

Millions of flat posts → Joint aspect–action topic model → Temporal aspect/action graph

AAAI ’14: peer-reviewed

Self-supervised: no labels needed

Flat → graph: temporal structure

NLP, Joint aspect–action topic model, Temporal graphs, Self-supervised learning

[Diagram: Sensor → haptic alert. Ultrasonic sensor → on-device model → obstacle range → haptic alert, real-time on the ring]

12 · Singapore

Accessibility · Wearable

Sonar Ring for the Blind

A wearable smart ring that fuses on-device AI vision with ultrasonic / sonar ranging to perceive obstacles and open space, then guides blind and low-vision users with intuitive, real-time directional feedback. Designed and prototyped in Singapore.

On-device vision → Ultrasonic ranging → Directional haptic feedback

On-device: private by design

Real-time: obstacle sensing

Wearable: ring form factor

Edge AI, Computer vision, Ultrasonic sensing, Haptics, Sensor fusion
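The ranging step rests on standard time-of-flight arithmetic: the pulse travels to the obstacle and back, so the distance is half the round trip. The Python sketch below shows that calculation plus one possible mapping from distance to vibration strength. The ring's firmware is not described here, so the 2 m range, the linear mapping and both function names are assumptions.

```python
SPEED_OF_SOUND_M_S = 343.0  # dry air at about 20 °C


def echo_to_distance_m(echo_seconds: float) -> float:
    """Distance to an obstacle from the round-trip time of an ultrasonic pulse."""
    return SPEED_OF_SOUND_M_S * echo_seconds / 2  # halve: out and back


def haptic_intensity(distance_m: float, max_range_m: float = 2.0) -> float:
    """Vibration strength from 0.0 (nothing in range) to 1.0 (obstacle at the sensor).

    Assumption: intensity rises linearly as the obstacle gets closer.
    """
    if distance_m >= max_range_m:
        return 0.0
    return 1.0 - max(distance_m, 0.0) / max_range_m
```

For example, a 10 ms echo corresponds to roughly 1.7 m, which under these assumptions would produce a faint vibration.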

[Diagram: Camera → personalised health hue. Shelf camera → health scoring → your profile → colour-coded hue]

13 · Singapore (GTC ’15)

Health AI · Vision

Health-Aware Shelf Scanner

Point a phone at a supermarket shelf and a vision-and-health model recognises each product and paints a personalised health “hue” over it — guiding shoppers toward better choices for their own profile. It runs on the first structured database of Singapore food labels, which we built and processed to train the model.

Recognise each product → Score for your profile → Paint a health hue

1st structured SG label DB

Personal: per-profile scoring

Phone: live camera overlay

Vision, OCR, Nutrition modelling, Custom dataset, Mobile / AR overlay

[Diagram: Location → crowd tasks. Where you are → contextual topics → match tasks → crowd shoppers]

14 · Singapore

Venture · Startup Weekend

MysteryShopper

A crowd-sourced, location-based mystery-shopping platform that matches tasks to the right people and places using contextual topic models — turning everyday shoppers into a distributed sensing network for retail insight. Winner, Startup Weekend Singapore.

Contextual topic models → Match task · person · place → Distributed sensing

Winner: Startup Weekend SG

Geolocation-aware matching

Crowd: distributed network

Topic models, Geo-matching, Crowdsourcing, Mobile
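Matching a task to a person and a place can be sketched as a distance filter followed by a topic ranking. In this Python illustration the haversine formula is standard, but everything else is an assumption: topic weights are given as plain dictionaries in place of a trained contextual topic model, and the 2 km radius, the record layout and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (latitude, longitude) points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def topic_overlap(a: dict[str, float], b: dict[str, float]) -> float:
    """Shared topic mass: sum of the smaller weight over topics both sides have."""
    return sum(min(weight, b[topic]) for topic, weight in a.items() if topic in b)


def match_tasks(shopper: dict, tasks: list[dict], radius_km: float = 2.0) -> list[dict]:
    """Tasks within `radius_km` of the shopper, best topic match first."""
    nearby = [
        t for t in tasks
        if haversine_km(shopper["lat"], shopper["lon"], t["lat"], t["lon"]) <= radius_km
    ]
    return sorted(nearby, key=lambda t: topic_overlap(shopper["topics"], t["topics"]),
                  reverse=True)
```

Filtering by distance first keeps the topic comparison to the handful of tasks a shopper could realistically reach.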

More from this era: Clinical anonymization, PASS (DASFAA 2010); Maritime geolocation (graph · topic models); Visual model for the blind (GTC 2015)

From prototype to validated production.

Let’s build something worth doing.