PORTFOLIO / 2026

Anu
Bilegdemberel

This is the dread of the endless learning curve. I sacrifice so much time and mental energy grinding through failed outputs, broken dependencies, and dead ends just to build something that works. But when the script finally runs successfully, I don't feel like a visionary genius. I just feel drained. I'm left sitting alone, staring at my screen, realizing that solving this one problem just unlocked ten new, harder problems I now have to figure out.

Turning LLMs into reliable systems — through agent orchestration, cloud deployment, and security-first AI architecture.

Available for work
Seoul, South Korea
CURRENTLY
GenAI Engineer
FOCUS
Python · LangGraph · RAG · AWS · Kubernetes · OpenAI API
Kookmin University
Software Engineering & Marketing
2023 — Present
LANGUAGES
English — proficient
Korean — TOPIK level 6
Mongolian — native
Chinese — 12 years of study
BACKGROUND
AI Agent Development Hackathon — 2nd place · 2026.01
~1 year work experience — California
~6 months LLM research — Silicon Valley

Flagship works

01 · SAP HANA workload triage from noisy telemetry

Problem

HANA already logs sessions, statements, and waits in detail. Operators still drowned in alerts, root-cause analysis took too long, and governance slipped when people acted on half-formed theories.

Solution

A Python pipeline: extracts features from SQL and SQLScript, combines rules with confidence-ranked root-cause guesses, spells out impact, and runs remediations only from an allowlist with human approval and rollback. If an LLM is plugged in, it polishes narrative text; it does not decide what is safe or what ranks first.
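The allowlist-plus-approval gate can be sketched as follows. The action names, the allowlist shape, and the decision strings are illustrative assumptions, not the production interface:

```python
# Illustrative allowlist: only listed actions can ever run, and high-risk
# ones still wait for a human. Entries here are hypothetical examples.
ALLOWLIST = {
    "kill_session": {"risk": "high"},   # requires explicit human approval
    "refresh_stats": {"risk": "low"},   # safe to run unattended
}

def plan_remediation(action: str, approved: bool = False) -> str:
    """Decide what to do with a proposed action; never run anything off-list."""
    entry = ALLOWLIST.get(action)
    if entry is None:
        return "rejected: not on allowlist"
    if entry["risk"] == "high" and not approved:
        return "pending: human approval required"
    return "approved"
```

The point of the design is that an unknown action is rejected before any safety or ranking logic runs, so the LLM never gets a chance to invent a remediation.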

Evaluation & metrics

  • Offline eval: precision@k on root cause, false-alert checks, zero tolerance for allowlist violations, and a completeness rubric for narratives
  • scripts/eval_run.py writes baseline vs improved profiles (e.g. alert dedup) to reports/eval-baseline-vs-v2.md
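The precision@k metric from the offline eval is simple to state precisely; this is a minimal sketch, with the root-cause labels being illustrative data:

```python
def precision_at_k(predicted: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k predicted root causes that are truly relevant."""
    top_k = predicted[:k]
    if not top_k:
        return 0.0
    return sum(1 for cause in top_k if cause in relevant) / len(top_k)

# e.g. ranked guesses ["lock_wait", "cpu_spike", "io_stall"] against the
# labeled root cause {"lock_wait"} give precision@2 of 0.5
```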

Security & reliability

No open-ended auto-fix. High-risk steps need a human. Decisions and actions append to an audit log. Secrets stay in .env or BTP destinations, not in git.

System architecture

Telemetry (HANA / staging / CSV fixtures)
    → Ingest → store
    → Detect → rank (rules + confidence) → impact
    → Plan actions (allowlist YAML)
    → Safety (approval gate, rollback)
    → Narrative (template-first) → audit log
SAP HANA & BTP (target) · Python 3.11+ · SQL / SQLScript · pytest

02 · Customer-facing agents with evals before deploy

Problem

Every prompt or tool change nudged the agent off course. There was no steady signal before the build went to customers.

Solution

Multi-step LangGraph flows, golden-set regression tests, and LLM-as-judge checks in CI and staging, so each deploy had a clear pass/fail bar on the paths that matter.
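A golden-set regression gate of this kind can be sketched in a few lines. The cases, the pass bar, and the exact-match check standing in for an LLM-as-judge are all illustrative assumptions:

```python
# Hypothetical golden cases: input utterances paired with the tool the
# agent is expected to pick. Real sets would cover every core flow.
GOLDEN_SET = [
    {"input": "cancel my order", "expected_tool": "cancel_order"},
    {"input": "where is my package", "expected_tool": "track_shipment"},
]

def run_regression(agent, pass_bar: float = 0.95) -> bool:
    """Fail the deploy when tool choice drifts on the golden cases.

    `agent` is any callable mapping an input string to a tool name;
    in CI this would wrap the staged LangGraph workflow.
    """
    passed = sum(
        1 for case in GOLDEN_SET
        if agent(case["input"]) == case["expected_tool"]
    )
    return passed / len(GOLDEN_SET) >= pass_bar
```

CI then blocks the deploy whenever `run_regression` returns `False`, which is what turns "the agent feels off" into a pass/fail bar.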

Evaluation & metrics

  • Core flows: regression detection went from days to hours
  • Pass rates and latency budgets documented per workflow version

Security & reliability

Tool tokens scoped to least privilege, PII stripped from traces, rate limits and idempotency keys on anything that writes or calls out.

System architecture

User / API
    → Ingress & auth
    → Orchestrator (LangGraph state machine)
    → Tool sandbox (scoped credentials, timeouts)
    → Model + structured output
    → Audit log / metrics export
Python · LangGraph · Kubernetes · PostgreSQL

03 · RAG with policy filters for internal teams

Problem

Plain vector search fetched answers that sounded right but were not. Docs changed between releases and did not always show up in retrieval. Access rules were not the same for every team.

Solution

Hybrid retrieval (dense plus keyword), metadata filters that match access policy, and a pipeline that refreshes embeddings on versioned chunks so answers stay grounded and citeable.
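The hybrid step can be sketched with reciprocal rank fusion over the two rankings plus an access-policy filter. The document IDs, ACL shape, and team names are illustrative; the production index uses pgvector and BM25 rather than in-memory lists:

```python
def hybrid_retrieve(dense: list[str], keyword: list[str],
                    acl: dict[str, set[str]], team: str,
                    k: int = 60) -> list[str]:
    """Fuse dense and keyword rankings with reciprocal rank fusion (RRF),
    keeping only documents the requesting team is allowed to see."""
    scores: dict[str, float] = {}
    for ranking in (dense, keyword):
        for rank, doc in enumerate(ranking):
            # Standard RRF: each list contributes 1 / (k + rank + 1)
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    allowed = [doc for doc in scores if team in acl.get(doc, set())]
    return sorted(allowed, key=lambda doc: scores[doc], reverse=True)
```

Applying the policy filter at the query layer (not just in the prompt) is what keeps a team from ever seeing chunks it has no right to retrieve.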

Evaluation & metrics

  • Roughly 40% fewer off-topic or over-broad pulls vs a single-index baseline
  • Grounded answer rate on a labeled eval set, checked each release

Security & reliability

Tenants separated at index and query layers, no raw PII inside embedding payloads, audit trail on what chunks were retrieved.

System architecture

Sources (docs, tickets, policies)
    → Chunk + enrich metadata
    → pgvector + BM25 hybrid index
    → Retrieval router (policy + filters)
    → LLM with citations + refusal paths
FastAPI · OpenAI API · pgvector · Redis

Work experience

AI Instructor

Part-time

2026.02 — Present

Remote

Teaching RAG, n8n workflow automation, and prompt engineering on Lambda Global’s AI learning platform—courses aimed at non-technical and industry professionals building practical AI literacy.

RAG · n8n · Prompt engineering · Curriculum

UI/UX & Frontend

Intern

2025.09 — 2026.02

Manhattan Beach, CA

Bestia Group LLC

Agent-assisted operational CRM: secure Next.js surface for dense lead and property workflows, OAuth-backed channels, and scoped LLM help with server-side boundaries and latency budgets.

Next.js · TypeScript · Figma · Twilio

Flutter & UI/UX

Contract · full-time

2025.03 — 2025.09

Los Angeles, CA

Handiers Inc.

Brand-consistent design ops and Figma-driven handoff across consumer and pro apps; data-informed agent workflows for UX, IA, and product specs, plus Vertex AI job intelligence and payout-ready commerce flows.

Flutter · Figma · Vertex AI · gRPC

Product Engineer

Seasonal

2026.01 — 2026.02

Seoul, South Korea

Shinhan Bank — Industry–Academic Collaboration Project

Scholarship Foundation program — a multi-role PWA ecosystem (scholars, mentors, alumni) plus a 74+ page admin CRM built on design tokens, FastAPI services, and Supabase; automated reporting and operations replaced manual cycles.

Next.js · FastAPI · Supabase · PWA

Let's Connect

Open to roles where AI meets product—production LLMs and agents, end-to-end ownership, and measurable outcomes for users and the business.

© 2026 Anu Bilegdemberel. All rights reserved.