  • Canada

hydrangeas20/README.md

Hi, I'm Hamda 👋

I'm an empirical AI safety researcher and ML systems engineer. I build benchmarks and evaluation pipelines that test how robust, interpretable, and reliable language models actually are, covering fine-tuning attack resistance, mechanistic interpretability, scaling laws, and adversarial robustness. I publish what I find, including the results that don't confirm my hypotheses.

I'm the founder of Plum AI Labs, where this research lives alongside two first-author publications. I also write about LLM evaluation and AI safety at Applied Alignment.

What I work on

  • Fine-tuning attack resistance & safety evaluation — does alignment hold up under adversarial fine-tuning, and do our evaluations actually measure what we think they measure?
  • Mechanistic interpretability — logit lens, activation patching, sparse autoencoders on transformers built from scratch
  • Scaling laws & GPU systems — recovering empirical scaling relationships, benchmarking torch.compile and Triton kernels against eager PyTorch
  • Production ML systems — pipelines, monitoring, and infrastructure that actually ship

Featured work

Project | What it does
--- | ---
SafetyLens | Measures divergence between eval-framed and deployment-framed model behaviour across model scale
AudioGuard | Adversarial robustness evaluation for audio classifiers: FGSM, PGD, adversarial training
JAX Interpretability | Mechanistic interpretability on a transformer built from scratch: logit lens, activation patching, SAEs
ScaleTrace | Empirically recovers Chinchilla-style scaling laws from a small training grid
KernelBench | GPU kernel benchmarking: PyTorch eager vs. torch.compile vs. Triton
e2e-ml-pipeline | Production-grade ML pipeline with DVC, MLflow, and Dockerized FastAPI deployment
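
For readers unfamiliar with what recovering a scaling law from a small grid involves, here is a minimal sketch. It is not ScaleTrace's code: it assumes a simplified single-variable power law, loss ≈ a · N^(-alpha), with no irreducible-loss term, whereas the full Chinchilla-style form also models dataset size. The function name `fit_power_law` and the data are hypothetical and synthetic.

```python
import math

def fit_power_law(sizes, losses):
    """Fit loss ~= a * size**(-alpha) by ordinary least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of log(loss) against log(size)
    intercept = mean_y - slope * mean_x    # log(a)
    return math.exp(intercept), -slope     # (a, alpha)

# Synthetic grid generated from a known law (illustrative values, not real results).
true_a, true_alpha = 10.0, 0.3
sizes = [1e5, 3e5, 1e6, 3e6, 1e7]
losses = [true_a * n ** (-true_alpha) for n in sizes]

a, alpha = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, alpha = {alpha:.3f}")
```

On noiseless synthetic data the fit returns the generating parameters. Real training runs are noisy and usually need the irreducible-loss term, which calls for a nonlinear fit rather than this log-log regression.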

Background

Infrastructure & AI Systems Engineer by day, independent researcher the rest of the time. My engineering background (Python, PyTorch, JAX, Docker, Kubernetes, AWS, distributed data systems) is what lets me build and run the experiments behind the research, not just theorize about them.

Let's talk

If you're working on AI safety evaluation, interpretability, or adversarial robustness — or just want to talk shop about empirical ML research — I'd love to connect.

Pinned

  1. e2e-ml-pipeline (Public)

    Production-grade end-to-end ML pipeline using DVC for data/model versioning and MLflow for experiment tracking + model registry, with automated workflows and Dockerized FastAPI deployment for servi…

    Python

  2. audioguard (Public)

    Adversarial robustness evaluation for audio classification models using FGSM, PGD, and adversarial training, with reproducible benchmarking and robustness analysis.

    Jupyter Notebook

  3. jax-interpretability (Public)

    Mechanistic interpretability experiments for transformer language models built from scratch in JAX/Flax. Investigates internal representations using Logit Lens, Activation Patching, and Sparse Auto…

    Jupyter Notebook

  4. scaletrace (Public)

    Empirical study of neural scaling laws using transformers trained on a corpus of Python standard library modules. Investigates how model size, dataset size, and compute influence language modeling …

    Jupyter Notebook

  5. kernelbench (Public)

    GPU systems benchmark comparing PyTorch eager execution and torch.compile across common deep learning operations. Measures execution time, throughput, TFLOPS, memory bandwidth, and kernel-level spe…

    Jupyter Notebook
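
As a rough illustration of the FGSM attack named in the audioguard description, here is a minimal sketch on a toy two-feature logistic classifier. It is not taken from the repository: the model, weights, input, and `eps` value are all hypothetical, and a real audio classifier would compute the input gradient with automatic differentiation instead of the closed form used here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Return the predicted class (0 or 1) of a logistic classifier."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, eps):
    """One FGSM step: x_adv = x + eps * sign(d loss / d x).

    For binary cross-entropy on a logistic model, d loss / d x = (p - y) * w.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input, chosen only for illustration.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, -0.5], 1
eps = 0.6

x_adv = fgsm(x, y, w, b, eps)
print("clean prediction:", predict(x, w, b))
print("adversarial prediction:", predict(x_adv, w, b))
```

Each feature moves by at most `eps`, so the perturbation stays inside an L-infinity ball around the clean input. PGD repeats this step several times with a projection back onto that ball.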