AI-Powered HR Analytics System

Achievement Log

2025-01 Week 3–4: Analyzed 7 MongoDB collections, catalogued 15 common HR queries, evaluated Sonnet vs Haiku (81% vs 54%). Built PromptBuilder with schema injection, reducing field hallucination from 19% to 1.5%.

2025-02 Week 1–2: Built 47-pattern Simple Engine (94% accuracy on 120 test queries). Internal alpha with 5 analysts: 85% first-attempt success rate. Discovered 3.7× budget overrun, which triggered the Redis caching initiative.

2025-02 Week 3–4: Implemented two-level Redis caching (Bedrock calls -62%). Added compound indexes (p99: 4.2s → 320ms). Expanded few-shot examples to 8. Projected cost now 30% under budget.

2025-03 Week 1–2: Multi-stage Docker build, GitHub Actions OIDC pipeline, App Runner VPC deployment. 3 analyst onboarding sessions.

2025-03 Week 3–4: 150 production queries collected, 18 negative items reviewed, 12 new Simple Engine patterns added. App Runner concurrency fixed.

Final: 85% first-attempt, 70% cost reduction, p99 2.1s under 8 concurrent users.

Overview

Conversational analytics platform enabling Turkish-speaking HR teams to query MongoDB HR data in natural language. Built on Amazon Bedrock with Claude 3.5 Sonnet, the dual-engine system routes simple queries through a deterministic local engine and complex ones through LLM-generated MongoDB aggregation pipelines. Achieved 85% first-attempt success rate and reduced report generation from 2–3 days to under 8 seconds.

Core Technologies

Amazon Bedrock & Claude 3.5 Sonnet
MongoDB & Mongoose
FastAPI & Python
Docker & AWS App Runner
Redis (ElastiCache) & Caching Layer
Tableau & Visual Analytics
Bash Scripting & OIDC Automation

Implementation & Architecture

Dual-Engine Query Routing Architecture

The Simple Engine, with 47 Turkish regex patterns across 12 intent categories, runs first and fills slots for dates and department names. On a miss, the Complex Engine uses asyncio.gather to run schema fetching and prompt assembly concurrently before the Bedrock invocation. PipelineValidator whitelists operators, validates field names against the live schema with fuzzy correction, and blocks JavaScript execution operators.
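The routing and validation flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the single regex pattern, the stage whitelist, the stub coroutines fetch_schema and load_few_shot_examples, and the function names are all assumptions made for the example. The Bedrock call itself is left out, as is field-name validation with fuzzy correction.

```python
import asyncio
import re
from typing import Any, Iterator, Optional

# Hypothetical pattern table; the real Simple Engine's 47 patterns are not shown in the source.
SIMPLE_PATTERNS = [
    (re.compile(r"^(?P<department>\w+) departmanında kaç çalışan var\??$"),
     "headcount_by_department"),
]

# Assumed whitelist of pipeline stages, for illustration only.
ALLOWED_STAGES = {"$match", "$group", "$sort", "$limit", "$project", "$unwind", "$count"}
# MongoDB operators that execute JavaScript.
JS_OPERATORS = {"$where", "$function", "$accumulator"}


def try_simple_engine(question: str) -> Optional[dict]:
    """Return intent and filled slots when a deterministic pattern matches, else None."""
    for pattern, intent in SIMPLE_PATTERNS:
        match = pattern.match(question.strip())
        if match:
            return {"engine": "simple", "intent": intent, "slots": match.groupdict()}
    return None


async def fetch_schema() -> dict:
    """Stand-in for reading the live MongoDB schema (hypothetical data)."""
    await asyncio.sleep(0)
    return {"employees": ["department", "salary", "hire_date"]}


async def load_few_shot_examples() -> list:
    """Stand-in for loading the few-shot examples used during prompt assembly."""
    await asyncio.sleep(0)
    return ['Soru: Kaç çalışan var? Pipeline: [{"$count": "n"}]']


async def route(question: str) -> dict:
    """Try the Simple Engine first; on a miss, prepare the Complex Engine prompt."""
    hit = try_simple_engine(question)
    if hit is not None:
        return hit
    # Both coroutines run concurrently; the prompt is assembled once both finish.
    schema, examples = await asyncio.gather(fetch_schema(), load_few_shot_examples())
    prompt = f"Schema: {schema}\nExamples: {examples}\nQuestion: {question}"
    return {"engine": "complex", "prompt": prompt}  # the LLM call would follow here


def find_operators(node: Any) -> Iterator[str]:
    """Yield every $-prefixed key found anywhere in a nested pipeline."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key.startswith("$"):
                yield key
            yield from find_operators(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_operators(item)


def validate_pipeline(pipeline: list) -> list:
    """Return a list of problems; an empty list means the pipeline passed."""
    errors = []
    for stage in pipeline:
        for name in stage:
            if name not in ALLOWED_STAGES:
                errors.append(f"stage not allowed: {name}")
    for op in find_operators(pipeline):
        if op in JS_OPERATORS:
            errors.append(f"blocked operator: {op}")
    return errors
```

Calling asyncio.run(route(question)) returns either the Simple Engine hit or the assembled prompt. In this sketch the validator walks the whole pipeline recursively, so a JavaScript operator nested inside an otherwise allowed stage is still caught.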

Turkish Language Intent Processing

Normalization pipeline: Turkish Unicode preservation, number word resolver ('iki yüz' → 200), locative case suffix stripper ('İstanbul'da' → 'İstanbul'), and colloquial abbreviation expander. 8 Turkish few-shot examples in the Bedrock system prompt. RapidFuzz fuzzy matching normalizes department names against live MongoDB slugs.
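A rough sketch of three of these normalization steps is shown below, under stated assumptions. The word tables, the regex, the slug list, and the function names are illustrative and not taken from the project. To keep the example dependency-free, the standard library's difflib stands in for RapidFuzz, so the similarity scores differ from what the real matcher would produce.

```python
import re
from difflib import get_close_matches
from typing import Optional

# Turkish number words below one hundred (naive: "bir" and "yüz" have other meanings too).
SMALL = {"bir": 1, "iki": 2, "üç": 3, "dört": 4, "beş": 5, "altı": 6, "yedi": 7,
         "sekiz": 8, "dokuz": 9, "on": 10, "yirmi": 20, "otuz": 30, "kırk": 40,
         "elli": 50, "altmış": 60, "yetmiş": 70, "seksen": 80, "doksan": 90}
MULTIPLIERS = {"yüz": 100, "bin": 1000}

# Locative suffix written after an apostrophe on proper nouns: 'da, 'de, 'ta, 'te.
LOCATIVE = re.compile(r"(\w+)['’](?:da|de|ta|te)\b")

# Hypothetical department slugs; the real ones are read from MongoDB.
DEPARTMENT_SLUGS = ["insan-kaynaklari", "satis", "muhasebe", "bilgi-islem"]


def words_to_number(words: list) -> int:
    """Convert a run of Turkish number words, e.g. ['iki', 'yüz'], to an integer."""
    total = current = 0
    for word in words:
        if word == "yüz":
            current = (current or 1) * 100
        elif word == "bin":
            total += (current or 1) * 1000
            current = 0
        else:
            current += SMALL[word]
    return total + current


def resolve_number_words(text: str) -> str:
    """Replace each maximal run of number words with its digits."""
    out, run = [], []
    for token in text.split():
        if token in SMALL or token in MULTIPLIERS:
            run.append(token)
            continue
        if run:
            out.append(str(words_to_number(run)))
            run = []
        out.append(token)
    if run:
        out.append(str(words_to_number(run)))
    return " ".join(out)


def strip_locative(text: str) -> str:
    """Drop an apostrophe-attached locative suffix, keeping the Unicode stem intact."""
    return LOCATIVE.sub(r"\1", text)


def normalize_department(name: str, slugs: list, cutoff: float = 0.8) -> Optional[str]:
    """Map a user-typed department name to the closest known slug, or None."""
    # Note: str.lower() turns "İ" into "i" plus a combining dot, so real code
    # would need Turkish-aware case handling before this step.
    candidate = name.strip().lower().replace(" ", "-")
    matches = get_close_matches(candidate, slugs, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Here resolve_number_words("iki yüz") gives "200" and strip_locative("İstanbul'da") gives "İstanbul", matching the examples in the text. With RapidFuzz itself, the lookup would more likely go through rapidfuzz.process.extractOne with a score_cutoff on a 0 to 100 scale.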

CI/CD Pipeline & Container Registry

GitHub Actions with OIDC role assumption (no static IAM keys). Multi-stage Docker build (280 MB final image, 62% smaller than single-stage). ECR with git SHA tagging and 10-image lifecycle policy. App Runner zero-downtime rolling deployment on every main branch push.
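As an illustration of the 10-image lifecycle policy, the snippet below builds an ECR lifecycle policy document in Python. The function name and the rule description are assumptions, and the project's actual policy text is not shown in the source, so treat this as one plausible shape rather than the deployed configuration.

```python
import json


def build_lifecycle_policy(max_images: int = 10) -> str:
    """Return ECR lifecycle policy JSON that expires images beyond the newest max_images."""
    policy = {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Keep only the {max_images} most recent images",
                "selection": {
                    "tagStatus": "any",
                    "countType": "imageCountMoreThan",
                    "countNumber": max_images,
                },
                "action": {"type": "expire"},
            }
        ]
    }
    return json.dumps(policy, indent=2)


policy_text = build_lifecycle_policy()
```

The resulting string could be passed as lifecyclePolicyText to the ECR put_lifecycle_policy call in boto3, or stored alongside the infrastructure definitions. Either choice is an assumption about how the pipeline applies it.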

Project Outcomes

✓ Report generation time dropped from 2–3 days to under 8 seconds for 85% of HR analytical questions.
✓ 85% first-attempt success rate on complex Turkish multi-stage aggregation queries; 96% by second attempt.
✓ 70% reduction in monthly Bedrock costs through Redis caching, schema-aware prompt truncation, and the Simple Engine.
✓ MongoDB p99 aggregation latency reduced by about 92% (4.2s → 320ms) via compound indexes and stage reordering.
✓ 8 concurrent analyst sessions handled without degradation after App Runner concurrency optimization.
✓ Fully containerized CI/CD with OIDC-based GitHub Actions: zero static credentials, zero-downtime rolling deployments.