Baby photo of Norman Bui

Norman Bui

Computer Science Graduate from Carleton University

Seeking new grad software roles. I have worked on 9 software teams, building backend systems, AI tooling, data workflows, and developer productivity software in games, design software, health, geospatial, and public-sector environments.

Available June 2026

Based in Ottawa or Toronto, open to remote or hybrid new grad roles

New Grad Software Engineering

Backend systems, AI tooling, and product-minded engineering for teams that care about shipping well.

Best fit for backend, developer tooling, and applied AI roles. Recent work spans Autodesk repair workflows, EA infrastructure reliability, and Riverbreak's text-layout detection pipeline across browser ML, evaluation data, and interactive visualization.

9 software teams
6 internships
3 part-time roles
3.9 GPA

Role Fit

Where I add value fastest

  • Backend Engineering
  • Developer Tools
  • Applied AI
  • Data Infrastructure
  • Production systems with real reliability constraints
  • Internal tooling that improves developer speed
  • Applied AI features that still need predictable behavior

Recent Work

Three quick signals

Autodesk: Repair flows for Fusion manufacturing data
Electronic Arts: CI and load-testing reliability, plus agentic QA work
Kongsberg Geospatial: Real-time geospatial apps and automated airspace workflows

Electronic Arts

Software Developer Intern

Built AI-assisted testing and load-testing workflows for The Sims

  • Kotlin
  • AI Agents
  • Automated Testing
  • Analyzed Servo (LLM-powered automated game testing system) by evaluating model performance, reviewing codebase architecture, and providing technical recommendations for system optimization and soak testing workflows
  • Debugged authentication failures in the CI/CD load-testing pipeline and improved test coverage, raising the pipeline success rate from 35% to 100%
  • Integrated 10+ new game features into load testing frameworks, implementing automated test coverage and tracking performance metrics to ensure accurate testing under high-traffic conditions

Autodesk

Software Developer Intern

Built validation and repair systems for Fusion manufacturing data

  • TypeScript
  • Java
  • GraphQL
  • Built data validation and repair infrastructure for Fusion platform, establishing foundational framework to prevent missing manufacturing data across cloud workflows
  • Developed an O(n) algorithm with intelligent concurrency controls to repair missing primary relationships, achieving 8x throughput through optimized parallel processing
  • Designed and deployed API endpoints to automate data validation and repair operations in manufacturing workflows, reducing production failures through proactive issue detection
  • Implemented permission validation and error handling protocols, eliminating existing security vulnerabilities for sensitive data
  • Maintained 100% test coverage across all code changes using Mocha, Chai, and Sinon
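
A minimal sketch of the linear-pass-plus-bounded-concurrency idea from the repair bullet above. It is illustrative Python rather than the Fusion code (the stack above is TypeScript/Java), and the record shape, `find_missing_primaries`, `repair_missing_primaries`, the `promote` callback, and `max_workers` are all hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def find_missing_primaries(records):
    """Single O(n) pass: map each item lacking a primary to its first relationship."""
    has_primary = {}
    first_candidate = {}
    for rec in records:
        item = rec["item_id"]
        has_primary[item] = has_primary.get(item, False) or rec["is_primary"]
        first_candidate.setdefault(item, rec["rel_id"])
    return {item: first_candidate[item] for item, ok in has_primary.items() if not ok}


def repair_missing_primaries(records, promote, max_workers=4):
    """Promote one relationship per broken item using a bounded worker pool."""
    missing = find_missing_primaries(records)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order; max_workers caps concurrent promote calls
        return list(pool.map(lambda pair: promote(*pair), missing.items()))
```

Capping the pool size is one simple way to keep parallel repairs from overwhelming a downstream service; the right limit depends on the system being called.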

Public Health Agency of Canada

Data Scientist Intern

Built literature-review and data pipelines for epidemiology research

  • Python
  • LLM
  • RAG
  • MDM
  • Developed LLM-powered literature review workflow for epidemiological research in collaboration with Imperial College London and WHO, building MDM infrastructure to enable standardized global health data access
  • Conducted embedding model evaluation using Ollama to validate RAG system performance, confirming optimal model configuration and establishing baseline metrics for future improvements
  • Designed automated academic paper collection pipeline using GPT-4o, developing proof-of-concept for scalable PDF retrieval and metadata extraction

Bedarra Corporation

Student Researcher

Researched RAG systems for obscure programming language support

  • Python
  • RAG
  • LangChain
  • FAISS
  • Sentence Transformers
  • Researched and developed RAG systems to answer queries about obscure programming languages (Factor, K)
  • Integrated LangChain orchestration to retrieve relevant code examples and documentation, providing context-aware responses for obscure language queries

Kongsberg Geospatial

Software Developer Intern

Built real-time geospatial demo apps and automation workflows

  • C++
  • Qt
  • Python
  • GIS
  • Developed demo apps using TerraLens SDK as the sole developer on the Solutions Engineering team, successfully delivering custom solutions to major clients in aviation, defense, and government sectors
  • Integrated LiDAR, WMTS, and S-57 data and optimized render pipeline to achieve consistent 60+ FPS with real-time object tracking, while implementing system monitoring (GPU, CPU, RAM) and modernizing Qt UI/UX
  • Developed automated airspace classification framework, removing manual processing and standardizing data categorization
  • Automated VFR and terrain data processing workflows, reducing aggregation time by more than 40%

Health Canada

Data Scientist Intern

Built chemical data and modeling pipelines for risk assessment

  • Python
  • ML
  • NLP
  • ETL
  • Rebuilt organization-wide federated chemical search system using BS4 and lxml, implementing automated fault tolerance and rate limiting that increased query success rate by 20%
  • Architected scalable web scraping framework using TDD practices and unittest, expanding accessible chemical data sources by 113% and achieving 100% test coverage
  • Built random forest classifiers with pandas, NumPy, SciPy, and scikit-learn to accurately predict chemical toxicity
  • Engineered automated knowledge graph pipeline using Neo4j, CoreNLP, and Stanza to extract semantic triples from medical literature, streamlining chemical assessment workflows
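
The fault tolerance and rate limiting mentioned in the federated search bullet can be sketched roughly as below. This is an assumption-laden illustration, not the Health Canada implementation: `RateLimiter`, `fetch_with_retry`, the fixed-interval limiting strategy, and the exponential backoff parameters are all hypothetical, and `fetch` stands in for whatever function actually retrieves a page.

```python
import time


class RateLimiter:
    """Enforce a minimum interval between calls (simple fixed-interval limiter)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


def fetch_with_retry(fetch, url, limiter, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url), retrying failures with exponential backoff."""
    last_error = None
    for attempt in range(retries):
        limiter.wait()
        try:
            return fetch(url)
        except Exception as exc:  # a real scraper would catch narrower errors
            last_error = exc
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Passing `clock` and `sleep` in as parameters keeps the limiter testable without real delays.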

Department of National Defence

Data Scientist Intern

Built a RAG chatbot for querying complex tax documentation

  • Python
  • LLM
  • RAG
  • NLP
  • Built chatbot implementing RAG architecture to enable intelligent querying across 20+ tax documents
  • Developed NLP pipeline using NLTK, integrating tokenization, contextual analysis, and NER to enhance text understanding and information extraction accuracy
  • Architected vector search system using FAISS and SQLite3, optimizing embedding storage and retrieval for queries
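
A rough sketch of the storage-plus-retrieval split behind the vector search bullet. To stay dependency-free it swaps FAISS for a brute-force cosine scan, so it shows the shape of the design rather than the actual system; the table layout, `build_store`, `cosine`, and `search` are hypothetical, and the embeddings are assumed to be produced elsewhere.

```python
import json
import math
import sqlite3


def build_store(chunks):
    """Store (text, embedding) pairs in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, embedding TEXT)")
    conn.executemany(
        "INSERT INTO chunks (text, embedding) VALUES (?, ?)",
        [(text, json.dumps(vec)) for text, vec in chunks],
    )
    return conn


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(conn, query_vec, k=2):
    """Brute-force top-k by cosine similarity; a FAISS index would replace this scan."""
    rows = conn.execute("SELECT text, embedding FROM chunks").fetchall()
    scored = [(cosine(query_vec, json.loads(emb)), text) for text, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]
```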

NAV CANADA

Software Developer Intern

Improved ATC tooling performance and deployment automation

  • C++
  • Qt
  • PowerShell
  • GIS
  • Optimized data management algorithm for internal data tool, resulting in a 60% decrease in storage utilization and 30% boost in automated data analysis performance
  • Resolved multiple bugs in ATC software, achieving an overall 16% improvement in frame rate
  • Automated network configurations with PowerShell, reducing manual work and setup time by more than 95%

Correctional Service Canada

Software Developer Intern

Improved accessibility and UX for a public victim services portal

  • C#
  • .NET
  • JavaScript
  • ASP.NET
  • Improved web accessibility using the WET framework to ensure WCAG 2.1 Level AA compliance, enhancing UX for 100+ daily users in victim services application
  • Developed ASP.NET solutions to resolve UI bugs and implement frontend improvements for victim info delivery application

Languages

  • Python
  • TypeScript
  • JavaScript
  • C++
  • C#
  • Kotlin
  • Java

Frameworks & Runtimes

  • Next.js
  • Node.js
  • Express.js
  • FastAPI
  • Flask
  • React Native
  • Expo
  • Electron
  • Streamlit
  • Qt
  • .NET
  • ASP.NET

AI / ML / Data

  • LLMs
  • Agentic AI
  • LLM fine-tuning
  • RAG
  • LangGraph
  • LangChain
  • FAISS
  • Ollama
  • Qwen
  • LoRA
  • Hugging Face
  • PEFT
  • Sentence Transformers
  • scikit-learn
  • IsolationForest
  • Anomaly Detection
  • Telemetry Simulation
  • Synthetic Data
  • Feature Attribution
  • pandas
  • NumPy
  • NLTK

Databases & APIs

  • PostgreSQL
  • PostGIS
  • pgvector
  • MySQL
  • MongoDB
  • SQLite
  • Prisma
  • Neo4j
  • Firebase
  • GraphQL

Tools & Practices

  • CI/CD
  • TDD
  • Docker
  • Docker Compose
  • Ubuntu
  • Linux CLI
  • Grafana
  • Observability
  • Networking
  • Backup Automation
  • Mocha
  • Chai
  • Sinon
  • pytest
  • unittest
  • WCAG 2.1
  • Cloudflare Workers
  • PowerShell

Carleton University

Bachelor of Computer Science (Honours) · 3.9/4.0 GPA

  • Relevant coursework:
    Theory & Algorithms: Algorithms I & II, Discrete Structures I & II, Graph Analytics
    Systems & Security: Operating Systems, Systems Programming, Applied Cryptography
    Software Engineering: Software Engineering, Object-Oriented Programming, Software Quality Assurance, Programming Paradigms, Data Structures, Human-Computer Interaction (UI/UX)
    Data & Applications: Database Management Systems, Web Development, Mobile Multimedia
    Math & Stats: Calculus I & II, Linear Algebra, Statistical Modeling
  • Awards: President's Scholarship, Henry Marshall Tory Scholarship, Chalmers Jack MacKenzie Scholarship, Harry S. Southam Scholarship, Dean's Honour List (2022, 2023, 2024, 2025)

AI/ML

Telemetry, detection, and applied model evaluation

Orbital Refueling Simulator

Synthetic spacecraft telemetry, anomaly monitoring, and LLM explainer prototype

Orbital Refueling Simulator generates synthetic autonomous refueling telemetry, injects anomaly scenarios, compares deterministic engineering rules with phase-aware ML anomaly scoring, and fine-tunes an LLM explainer over grounded monitoring outputs.

  • Python
  • Streamlit
  • scikit-learn
  • IsolationForest
  • LLM fine-tuning
  • LoRA
  • Telemetry simulation
  • Regression tests
Orbital refueling telemetry dashboard preview with mission phase, scenario alerts, signal charts, and anomaly score
  • Hybrid monitoring design: Kept hard-limit deterministic rules separate from advisory ML scoring so explicit threshold breaches and subtle multivariate drift are both visible
  • Scenario-driven simulation: Modeled a 410-second refueling mission across nine phases with nominal telemetry plus nine injected anomaly cases
  • Fine-tuned explainer layer: Trained a small Qwen LoRA adapter to explain grounded rule alerts, advisory ML scores, and top contributing telemetry signals
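
The hybrid monitoring design above can be illustrated with a small sketch that keeps deterministic alerts and advisory scoring in separate functions. It is not the project code: a per-phase z-score stands in for the IsolationForest scoring, and the signal name, limits, and function names are hypothetical.

```python
from statistics import mean, pstdev

HARD_LIMITS = {"tank_pressure_kpa": (100.0, 400.0)}  # hypothetical hard limits


def rule_alerts(sample):
    """Deterministic checks: any value outside its hard limit raises an alert."""
    alerts = []
    for signal, (low, high) in HARD_LIMITS.items():
        value = sample[signal]
        if not low <= value <= high:
            alerts.append(f"{signal}={value} outside [{low}, {high}]")
    return alerts


def advisory_score(sample, nominal_by_phase):
    """Largest absolute z-score against nominal history for the sample's phase."""
    nominal = nominal_by_phase[sample["phase"]]
    score = 0.0
    for signal, history in nominal.items():
        spread = pstdev(history) or 1.0  # avoid dividing by zero on flat history
        score = max(score, abs(sample[signal] - mean(history)) / spread)
    return score


def monitor(sample, nominal_by_phase):
    """Report rule alerts and the advisory score side by side, never merged."""
    return {
        "alerts": rule_alerts(sample),
        "advisory_score": advisory_score(sample, nominal_by_phase),
    }
```

Returning the two outputs separately means a sample can be inside every hard limit and still carry a high advisory score, which is the case the rules alone would miss.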

TwinQuery

Agentic Text-to-SQL and RAG assistant for building-stock digital twins

TwinQuery is a digital-twin query assistant that lets non-technical users ask plain-English questions over building-stock and retrofit datasets. It combines a safe Text-to-SQL pipeline with local RAG over retrofit guidance, serving answers grounded in both database rows and retrieved documents through a FastAPI backend and Streamlit/PyDeck interface.

  • Python
  • LangGraph
  • FastAPI
  • Streamlit
  • PostgreSQL/PostGIS
  • Ollama
  • RAG
  • Docker
  • Safe Text-to-SQL: Deterministic guardrails, fallback query templates, and PostgreSQL/PostGIS execution over Ottawa building-footprint geometry
  • Hybrid RAG synthesis: Grounded LLM answers in both database rows and document sources retrieved locally from retrofit guidance
  • Geospatial interface: 2D/3D PyDeck visualization with highlighted building polygons, SQL transparency panel, and tested API response schemas
TwinQuery interface showing map query mode with 3D building extrusion over Ottawa and generated SQL
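
One possible shape for the deterministic guardrails and fallback templates in the Safe Text-to-SQL bullet is sketched below. The table allow-list, forbidden-keyword pattern, fallback query, and row cap are hypothetical, and the keyword matching is deliberately conservative; a fuller version would likely parse the SQL instead of using regular expressions.

```python
import re

ALLOWED_TABLES = {"buildings", "retrofits"}  # hypothetical table names
FORBIDDEN = re.compile(r"\b(insert|update|delete|drop|alter|create|grant|truncate)\b", re.I)
FALLBACK_SQL = "SELECT id, address FROM buildings LIMIT 10"  # hypothetical template


def is_safe(sql):
    """Accept only a single read-only SELECT over allow-listed tables."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:  # reject stacked statements
        return False
    if not stripped.lower().startswith("select"):
        return False
    if FORBIDDEN.search(stripped):
        return False
    tables = re.findall(r"\b(?:from|join)\s+([a-z_][a-z0-9_]*)", stripped, re.I)
    return bool(tables) and all(t.lower() in ALLOWED_TABLES for t in tables)


def guard(sql, max_rows=100):
    """Return the query with a row cap if it passes, else the fallback template."""
    if not is_safe(sql):
        return FALLBACK_SQL
    stripped = sql.strip().rstrip(";")
    if not re.search(r"\blimit\s+\d+\b", stripped, re.I):
        stripped += f" LIMIT {max_rows}"
    return stripped
```

Because the check is keyword-based, it will also reject harmless queries that contain a forbidden word inside a string literal; falling back to a template in that case trades coverage for safety.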

Riverbreak

Typographic river detection and layout demo

Riverbreak detects typographic rivers in justified text and compares heuristic detection with a small U-Net segmentation model. The project combines a browser line-breaking demo, synthetic data generation, hand-labeled evaluation data, and a prototype reranker that scores layouts by river severity.

  • Python
  • PyTorch
  • U-Net
  • ONNX Runtime
  • JavaScript
  • HTML Canvas
Riverbreak project card showing justified text with detected whitespace rivers
  • Browser-first demo: Built an inspectable text layout simulator with real-time heuristic overlays and detector-guided line-break comparison
  • ML benchmark: Trained a small U-Net on synthetic weak labels, then compared it against the heuristic baseline on hand-labeled validation and test samples
  • End-to-end artifacts: Shipped the ONNX browser model, precomputed sample overlays, benchmark reports, and reproducible repository checks
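
To make the heuristic baseline concrete, here is a simplified sketch of river detection on monospaced text: it flags columns where a space repeats across consecutive lines. The real detector likely works on rendered layout rather than character grids, so the exact-column rule, `find_rivers`, and `min_run` are assumptions made for illustration.

```python
def find_rivers(lines, min_run=3):
    """Return (column, start_line, length) for vertical runs of spaces."""
    width = max((len(line) for line in lines), default=0)
    rivers = []
    for col in range(width):
        run_start, run_len = 0, 0
        # The trailing empty sentinel row flushes a run that reaches the last line.
        for row, line in enumerate(list(lines) + [""]):
            is_gap = col < len(line) and line[col] == " "
            if is_gap:
                if run_len == 0:
                    run_start = row
                run_len += 1
            else:
                if run_len >= min_run:
                    rivers.append((col, run_start, run_len))
                run_len = 0
    return rivers
```

A run of three aligned gaps is an arbitrary threshold here; a severity score could weight longer runs more heavily.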

SWE

Full-stack products, desktop apps, and shipped tools

Latch

Local-first macOS focus blocker

Latch is a local-first macOS focus blocker built with Electron. It blocks distracting sites with a privileged helper, keeps blocklists on-device, and optionally uses a Chromium extension for a friendlier blocked-page experience.

  • Electron
  • TypeScript
  • macOS
  • Local-first
  • Native helper
  • Chromium extension
Latch focus session screen showing timer, active blocklist, and session controls
  • Local-first architecture: Kept blocklists and session state on-device with no account system or cloud dependency
  • System-level enforcement: Paired the Electron app with a one-time privileged helper to block distracting sites across Chromium browsers
  • Focus-session reliability: Supported timed sessions, always-on blocking, and crash recovery so active sessions survive interruptions
  • Practical distribution: Packaged the desktop app, helper, Chromium extension bundle, and native messaging host into a single macOS release flow

CruxOS

Full-stack product for climbing performance decisions

CruxOS is a full-stack decision-support system for intermediate climbers, combining fast mobile session logging with a web app for analysis, reports, and weekly training guidance. It turns training and recovery data into deterministic, explainable insights that support better performance decisions.

  • Next.js
  • TypeScript
  • Prisma
  • SQLite
  • Mobile + Web
  • Deterministic insights
CruxOS product card showing web analysis and mobile logging
  • Full-stack system design: Structured CruxOS as a product, not just a tracker, with shared data flowing from fast session capture into deeper analysis and reporting workflows
  • Mobile + web architecture: Separated low-friction logging on mobile from richer review and trend analysis on the web to support the full training loop
  • Deterministic insight engine: Translated training, recovery, and workload signals into clear weekly recommendations instead of opaque black-box outputs
  • Typed data model: Used Prisma and SQLite to keep sessions, recovery signals, and reports consistent across mobile capture and web review
  • Real-world usefulness: Built for intermediate climbers who need help deciding when to push, recover, or adjust training to improve performance
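
A deterministic insight engine of the kind described above can be as simple as ordered rules that return both a recommendation and the reason for it. The sketch below is illustrative only: the 0-10 recovery scale, the thresholds, and `weekly_recommendation` are hypothetical and not taken from CruxOS.

```python
def weekly_recommendation(this_week_load, last_week_load, avg_recovery):
    """Map workload change and recovery (0-10 scale) to (recommendation, reason)."""
    ratio = this_week_load / last_week_load if last_week_load else 1.0
    if avg_recovery < 4:
        return "recover", f"average recovery {avg_recovery:.1f} is below 4"
    if ratio > 1.3:
        return "adjust", f"load rose {ratio:.2f}x week over week (cap 1.3x)"
    return "push", f"recovery {avg_recovery:.1f} and load ratio {ratio:.2f} are within range"
```

Returning the reason string alongside the label is what keeps the output explainable: the same inputs always produce the same advice and the same justification.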

SocialSaplings

Hackathon-winning reforestation platform

SocialSaplings is a hackathon-winning reforestation platform that recommends suitable tree species from location data and visualizes planting impact for users.

  • HTML/SCSS
  • JavaScript
  • Bootstrap
  • Node.js
  • Express.js
  • Firebase
SocialSaplings screenshot
  • Hackathon outcome: Won 1st Place at KuriusHacks while leading a cross-functional team building a reforestation platform
  • Recommendation engine: Integrated location and environmental APIs to analyze user geodata and suggest suitable tree species for planting
  • Impact visualization: Built a Google Maps dashboard for reforestation metrics and global environmental impact data

Miscellaneous

Self-hosted operations and reliability work

ThinkPad Home Lab

Ubuntu home lab for containerized services

Ubuntu home lab running containerized media and automation services with persistent storage, segmented networking, observability, and automated operations for reliable 24/7 use.

  • Ubuntu
  • Docker
  • Docker Compose
  • CI/CD
  • Grafana
  • Linux CLI
  • Networking
My ThinkPad
  • Platform engineering: Engineered containerized media and automation services with persistent storage, segmented networking, and automated deployments
  • Observability: Implemented monitoring for uptime, disk forecasting, service health, and incident recovery across 24/7 Linux infrastructure
  • Operations automation: Automated updates, backups, permission repair, and container lifecycle management through scripts and CI/CD workflows
