Berlin · Open to ML roles

Most AI engineers are good at training models. Far fewer are good at shipping them.

Abu Bakar Siddik Nayem

Production AI across NLP, computer vision, and MLOps. From 60M-user recommendation engines to research in Nature and IEEE.

// about

The gap between a model and a product is where most AI projects die.

I close that gap. Four years of production AI — recommendation engines for 60M+ Grameenphone subscribers, computer vision pipelines at TRACKBOX.BE, research published in Nature Scientific Reports and IEEE.

What connects it: a bias toward production. Not marginal benchmark gains, but systems that work at scale, under constraints, for real users.

Available

Open to ML Engineering roles

Berlin-based · Remote OK · German Work Permit

nayemabs.de@gmail.com

Berlin, Germany

Central European Time · UTC+1

M.Sc. Artificial Intelligence

BTU Cottbus-Senftenberg · 2025 – Present

nayem @ berlin
const me = {
  papers: 7,
  users: '60M+',
  stack: ['PyTorch', 'AWS', 'GCP'],
};
60M+ users served
7 papers published
92% detection accuracy
5+ years in production AI

// projects

Systems that shipped.

Production work, research, and open-source tools — each with a specific problem and a measurable outcome.

TRACKBOX.BE

Spatiotemporal Ball Detection

92% accuracy at 10px threshold — 2× the YOLOv8 baseline

Standard object detectors fail on fast-moving balls: motion blur, occlusions, and perspective collapse. The solution: treat video as a sequence, not a frame. A frozen DINOv2 backbone (via RF-DETR encoder) extracts spatial features from 5-frame windows at 10 FPS. A 4-layer Temporal Transformer then reasons across those frames to predict ball visibility and exact pixel position in the final frame.

  • RF-DETR encoder with frozen DINOv2 backbone (86M params) — 3-stage progressive unfreezing
  • Temporal Transformer: 4 layers, 8 heads, d_model=512, custom positional encoding for frame sequences
  • Dual loss: Focal Loss for visibility + Masked MSE for position (only computed on visible frames)
  • GCP Batch training infrastructure — 26K+ samples, W&B tracking across 25+ architecture variants
  • F1 Score 0.9032, RMSE 15.23px, Position-aware @25px = 0.85, @50px = 0.88
  • Evaluated against: YOLOv8 fine-tuned (45% baseline), TimeSformer, Cross-Attention variants, TOTNet (7.19px RMSE)
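The dual loss in the list above can be sketched in plain Python. This is a minimal illustration, not the project's training code: the function names, the focal-loss defaults (gamma=2.0, alpha=0.25), and the `pos_weight` balancing term are assumptions, and a real implementation would operate on PyTorch tensors.

```python
import math

def focal_loss(p_visible, target, gamma=2.0, alpha=0.25):
    """Binary focal loss for one visibility probability in (0, 1).
    gamma and alpha are common defaults, assumed here for illustration."""
    eps = 1e-7
    p = min(max(p_visible, eps), 1.0 - eps)
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # Easy examples (p_t near 1) are down-weighted by (1 - p_t) ** gamma.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def masked_mse(pred_xy, true_xy, visible):
    """Mean squared pixel distance, averaged over visible samples only."""
    errors = [
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty), v in zip(pred_xy, true_xy, visible)
        if v == 1
    ]
    return sum(errors) / len(errors) if errors else 0.0

def dual_loss(p_visible, pred_xy, true_xy, visible, pos_weight=1.0):
    """Visibility term plus position term; pos_weight is a hypothetical knob."""
    vis = sum(focal_loss(p, v) for p, v in zip(p_visible, visible)) / len(visible)
    return vis + pos_weight * masked_mse(pred_xy, true_xy, visible)

# Second sample is not visible, so its large position error is ignored.
position_term = masked_mse([(0, 0), (10, 10)], [(3, 4), (100, 100)], [1, 0])
```

Masking matters because a frame with no visible ball has no meaningful ground-truth position; including it would push the regressor toward an arbitrary placeholder coordinate.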
92% @ 10px threshold · vs 45% YOLOv8 baseline · F1 0.9032 · 26K+ training samples
PyTorch · DINOv2 · RF-DETR · Temporal Transformer · GCP Batch · W&B · OpenCV

Think Flagship IXP

Grameenphone Recommendation Engine

Real-time personalisation for 60M+ subscribers at sub-100ms

Bangladesh's largest telco needed a recommendation engine that could serve 60M+ subscribers in real time, surface relevant short-form video, and integrate ad targeting — without sacrificing latency. The architecture is a two-tower retrieval + reranking pipeline built with Milvus for vector search, gRPC microservices for inter-service communication, and a multilingual sentiment layer covering English, Bangla, and Banglish.

  • Two-tower model: user tower (behaviour history, demographics) + item tower (content embeddings, metadata)
  • Milvus vector database for sub-100ms candidate retrieval across 60M+ user profiles
  • gRPC microservices architecture with Docker + Kubernetes GitOps deployment (GitLab CI/CD)
  • Channel rotation with fallback logic — ensures diversity, prevents filter bubbles
  • Multilingual sentiment analysis on Bangla, Banglish, and English UGC for toxicity filtering
  • Ad-targeting engine integration: downstream revenue signal fed back into re-ranking score
  • Load tested with psutil + memory_profiler under peak concurrent request scenarios
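The retrieve-then-rerank flow described above can be sketched as two small functions. Everything here is illustrative: the brute-force dot-product scan stands in for the Milvus vector search, and the item ids, `revenue_signal` mapping, and `revenue_weight` blend are hypothetical, not the production scoring formula.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def retrieve(user_vec, item_vecs, k):
    """Stage 1: top-k items by dot product of user and item embeddings."""
    scored = [(item_id, dot(user_vec, vec)) for item_id, vec in item_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def rerank(candidates, revenue_signal, revenue_weight=0.3):
    """Stage 2: blend similarity with a per-item revenue signal.
    The linear blend and its weight are assumptions for illustration."""
    blended = [
        (item_id, sim + revenue_weight * revenue_signal.get(item_id, 0.0))
        for item_id, sim in candidates
    ]
    blended.sort(key=lambda pair: pair[1], reverse=True)
    return blended

# Toy embeddings; in a two-tower model these come from the user and item towers.
user_vec = [1.0, 0.0]
item_vecs = {"clip_a": [0.9, 0.1], "clip_b": [0.8, 0.6], "clip_c": [0.1, 0.9]}

candidates = retrieve(user_vec, item_vecs, k=2)
ranked = rerank(candidates, revenue_signal={"clip_b": 1.0})
```

Splitting the work this way keeps the expensive step (similarity over the whole catalogue) simple enough to hand to a vector index, while business signals are applied only to the short candidate list.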
60M+ subscribers · <100ms retrieval · 3 languages · gRPC + Kubernetes
PyTorch · Milvus · FastAPI · gRPC · LangChain · Kubernetes · GitLab CI/CD · transformers

Personal

HealthRAG

Production bilingual RAG backend — English ↔ Japanese medical retrieval

Healthcare professionals in Japan and internationally need rapid access to multilingual clinical guidelines. HealthRAG is a production FastAPI service that ingests EN/JA medical documents, indexes them with FAISS using a multilingual sentence-transformer (384-dim, 50+ languages), and serves cross-lingual semantic search — query in English, retrieve from Japanese documents and vice versa.

  • Multilingual embedding: paraphrase-multilingual-MiniLM-L12-v2 — single 384-dim space, 50+ languages
  • FAISS in-memory index with disk persistence — scales to ~1M vectors/node at sub-ms latency
  • 4 swappable LLM backends: GPT-4o-mini, Claude Sonnet, offline template, or custom endpoint
  • 4 translation backends: Google, googletrans, Argostranslate (fully offline), or mock
  • Auto language detection via langdetect on ingest — no manual tagging required
  • Multi-platform Docker image (linux/amd64 + arm64) published to GHCR via GitHub Actions CI/CD
  • CPU inference ~20ms/doc — no GPU required for production serving
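One way swappable backends like those listed above might be wired is a small registry keyed by name. This is a sketch under assumptions: the `Backend` signature, the `register` decorator, and the `answer` helper are hypothetical names, and only an offline template backend is implemented so the example runs without network access or API keys.

```python
from typing import Callable, Dict, List

# Hypothetical signature: (question, retrieved passages) -> answer text.
Backend = Callable[[str, List[str]], str]
BACKENDS: Dict[str, Backend] = {}

def register(name: str):
    """Decorator that adds a backend function to the registry under `name`."""
    def wrap(fn: Backend) -> Backend:
        BACKENDS[name] = fn
        return fn
    return wrap

@register("template")
def template_backend(question: str, passages: List[str]) -> str:
    """Offline backend: no model call, just formats the retrieved passages."""
    if not passages:
        return f"No documents matched: {question}"
    bullets = "\n".join(f"- {p}" for p in passages)
    return f"Q: {question}\nRelevant passages:\n{bullets}"

def answer(question: str, passages: List[str], backend: str = "template") -> str:
    """Dispatch to the selected backend, failing loudly on unknown names."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend {backend!r}; available: {sorted(BACKENDS)}")
    return BACKENDS[backend](question, passages)

reply = answer("What is the adult dose?", ["Guideline 4.2: 500 mg twice daily."])
```

A hosted-model backend would register under another name with the same signature, so the retrieval code never needs to know which one is active.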
50+ languages · ~20ms/doc CPU · ~1M vectors FAISS · Multi-platform Docker
FastAPI · FAISS · LangChain · SentenceTransformers · Docker · GitHub Actions · pytest

CCDS, Independent University Bangladesh

BOLM: Bangladesh Open LULC Map

891M annotated pixels across 4,392 km² — Nature Scientific Reports 2025

Annotated satellite data for developing countries is scarce: the global LULC community has extensive coverage of North America and Europe but almost nothing for South Asia. BOLM addresses this with pixel-level land use/land cover annotations across the Dhaka metropolitan area, drawn on Bing imagery at 2.22m/pixel: 11 classes, 891 million annotated pixels, and three-stage GIS expert validation.

  • 4,392 km² coverage — Dhaka metro and surrounding rural/peri-urban zones
  • 891 million annotated pixels across 11 classes: farmland, water, forest, urban structure, urban built-up, rural built-up, road, meadow, marshland, brick factory, unrecognized
  • 3-stage annotation: Bing imagery (2.22m/pixel) + QGIS + GIS expert validation
  • Benchmarks: DeepLabV3+, HRNetv2, U-Net, UnimatchV2, Segmenter ViT-16 — best IoU 0.50 (UnimatchV2)
  • Companion paper accepted at ICIP 2025 (arXiv:2505.21915)
  • Part of a 7-paper arc: FCN-8 (SLAAI 2020) → SIGML/Sensors 2021 → ICPR 2020 → Scientific Reports 2025
891M pixels · 4,392 km² · 11 LULC classes · Nature Scientific Reports
DeepLabV3+ · U-Net · HRNetv2 · PyTorch · GDAL · QGIS · Remote Sensing

Personal / BTU Coursework

Llama 3.2-1B Emotion Classifier

Multi-label emotion detection via LoRA — F1 Macro 0.7011 on 5 classes

Fine-tuning LLMs for classification is expensive. LoRA (Low-Rank Adaptation) changes that: it trains ~0.1% of the parameters and often comes close to full fine-tune quality. This project fine-tunes Meta's Llama 3.2-1B for multi-label emotion detection (anger, fear, joy, sadness, surprise) using LoRA rank=16 and 4-bit NF4 quantization, reducing VRAM by ~75% while hitting F1 Macro 0.7011.

  • LoRA config: rank=16, alpha=32, target modules q_proj + v_proj — ~0.1% of total parameters trained
  • 4-bit NF4 quantization via BitsAndBytes — ~75% VRAM reduction vs full precision
  • F1 Macro: 0.7011 | F1 Micro: 0.7183 | Hamming Loss: 0.188 | Jaccard (macro): 0.544
  • Per-class F1: fear 0.791, surprise 0.746, sadness 0.701, anger 0.667, joy 0.600
  • 30 epochs, batch size 2 + gradient accumulation, lr 5e-5 with linear warmup
  • PEFT + BitsAndBytes + Transformers + Datasets — reproducible stack pushed to HuggingFace Hub
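The "~0.1% of parameters" figure can be sanity-checked with a little arithmetic. The model dimensions below are assumptions taken from the published Llama 3.2-1B configuration rather than from this page, the total parameter count is approximate, and any classification head trained alongside the adapters is not counted.

```python
def lora_params(d_in, d_out, rank):
    """A LoRA adapter adds two matrices: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

# Assumed Llama 3.2-1B dimensions (published config, not stated on this page).
HIDDEN = 2048                  # model width; q_proj maps HIDDEN -> HIDDEN
KV_DIM = 512                   # v_proj output: 8 key/value heads x head_dim 64
NUM_LAYERS = 16
TOTAL_PARAMS = 1_236_000_000   # approximate base model size

RANK = 16                      # from the LoRA config above

per_layer = lora_params(HIDDEN, HIDDEN, RANK) + lora_params(HIDDEN, KV_DIM, RANK)
trainable = per_layer * NUM_LAYERS
fraction = trainable / TOTAL_PARAMS

print(f"{trainable:,} trainable LoRA params, {fraction:.2%} of the base model")
```

Under these assumptions the adapters on q_proj and v_proj come to roughly 1.7M weights, which is in line with the stated ~0.1%. Note that alpha=32 only scales the adapter output and does not change the parameter count.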
F1 Macro 0.7011 · F1 Micro 0.7183 · ~0.1% params trained · 75% VRAM reduction
Llama 3.2 · LoRA · PEFT · BitsAndBytes · HuggingFace · transformers · PyTorch · 4-bit NF4

Acme AI / CCDS IUB

Livestock Weight Prediction System

Image-to-weight estimation for cattle — $246,000 Gates Foundation grant

Weighing livestock on developing-world farms requires expensive equipment most farmers cannot afford. This project builds an end-to-end pipeline: photograph a cow, get a weight estimate. It covers cattle semantic segmentation (DeepLabV3+ on MMSegmentation), rear-view pose estimation, morphometric feature extraction, and a production FastAPI + Celery API for async batch inference.

  • Cattle segmentation: DeepLabV3+ on MMSegmentation (OpenMMLab) — repo: livestock-segmentation (2★ GitHub)
  • Pose estimation from rear view (cattle-pose-rear) — keypoint detection for girth, height, length measurement
  • Weight regression from body morphometrics → kg estimate
  • Production API: FastAPI + Celery + RabbitMQ for async batch jobs (cattle-web — 4★, 2 forks)
  • Optional Flower monitoring dashboard for task queue observability
  • 12K-image public dataset released on Kaggle — 4K+ downloads
  • Funded by Bill & Melinda Gates Foundation ($246,000 grant) through CCDS, IUB
$246K Gates Foundation · 12K-image dataset · 4K+ Kaggle downloads · FastAPI + Celery
DeepLabV3+ · MMSegmentation · PyTorch · FastAPI · Celery · RabbitMQ · OpenCV

// experience

The pattern: I follow problems through.

From Dhaka to Berlin — research labs, AI startups, and product companies.

Software Engineer (ML)

TRACKBOX.BE

Aug 2025 – Present
Remote
  • Soccer analytics platform — computer vision and NLP pipelines for match intelligence
  • Custom CNN+Transformer hybrid: 92% detection accuracy at 10-pixel threshold
  • NL2SQL analytics layer: query success from 20% to 95% across 7 iterations
PyTorch · YOLO · NL2SQL · OpenCV · Python

M.Sc. Artificial Intelligence

Brandenburg University of Technology Cottbus-Senftenberg

Apr 2025 – Present
Cottbus, Germany
  • Focus: deep learning, NLP, computer vision, and autonomous systems

Senior AI Engineer

Think Flagship IXP

Nov 2024 – Jul 2025
Remote
  • Multi-stage recommendation system for Grameenphone — 60M+ subscribers
  • User/item tower architecture with real-time feature serving under 100ms
  • Multilingual sentiment analysis: English, Bangla, Banglish
  • Ad-targeting engine integration with downstream revenue impact
PyTorch · LangChain · AWS · Redis · PostgreSQL

AI Engineer

AinoviQ IT Limited

Oct 2023 – Oct 2024
Dhaka, Bangladesh
  • Virtual try-on system using generative adversarial networks (GANs)
  • Garment segmentation and pose estimation pipeline for e-commerce
  • Reduced inference latency by optimizing model serving on GCP
GANs · PyTorch · GCP · OpenCV · FastAPI

Lead ML Engineer

Acme AI Ltd.

Dec 2021 – Feb 2024
Dhaka, Bangladesh
  • Livestock health monitoring using satellite imagery and deep learning
  • Medical imaging pipelines for diagnostic assistance
  • Built and led the ML team — hired and mentored 4 engineers
  • Authored technical articles on RAG, GraphRAG, and open-source LLMs
PyTorch · AWS · MLflow · Docker · Remote Sensing

Research Associate

Independent University Bangladesh

May 2019 – Nov 2022
Dhaka, Bangladesh
  • Published in Nature Scientific Reports (2025), Sensors (2021), IEEE ICPR (2020)
  • BOLM: 12K-image land use/land cover benchmark dataset for Dhaka
  • Disaster response image classification with attention mechanisms
  • Funded by Bill & Melinda Gates Foundation and Australia Awards
Research · Deep Learning · Remote Sensing · Python · PyTorch

B.Sc. Computer Science

Independent University Bangladesh

Sep 2015 – Apr 2019
Dhaka, Bangladesh
  • Graduated with distinction — foundations in algorithms, ML, and systems

// skills

The tools, not the buzzwords.

Production-tested across NLP, computer vision, and infrastructure. Only what I have shipped.

NLP & GenAI

Large Language Models · LoRA / QLoRA · RAG / GraphRAG · LangChain · LlamaIndex · HuggingFace · Prompt Engineering · Multilingual NLP · Sentiment Analysis · NL2SQL

Computer Vision

YOLO (v5–v11) · DETR / RF-DETR · GANs · OpenCV · Object Detection · Image Segmentation · Pose Estimation · Spatio-temporal Models · Remote Sensing · Video Analytics

MLOps & Infrastructure

AWS (EC2, S3, SageMaker) · Google Cloud Platform · Docker · Terraform · MLflow · CI/CD Pipelines · FAISS / Milvus · Model Serving · GDPR / HIPAA / ISO · Monitoring

Languages & Frameworks

Python · PyTorch · TensorFlow · FastAPI · TypeScript · PostgreSQL · Redis · NumPy / Pandas · Scikit-learn · Linux

// spoken languages

Bengali: Native
English: C1 · Professional
German: A2 · Learning

// contact

Let's build something.

I would welcome the opportunity to talk through how my background maps onto what you are building. Available for ML engineering roles in Berlin and remotely.

nayemabs.de@gmail.com
Berlin, Germany — German Work Permit

// built with Next.js · deployed on Vercel //