    Project 02 · Case study

    Yelp AI Review Classifier

    Fine-tuned DistilBERT paired with a local Qwen3-8B baseline for zero-cost sentiment classification: 81% test accuracy, plus a domain-shift analysis that mattered more than the headline number.

    Python · DistilBERT · Qwen3-8B · Transformers · PyTorch · Hugging Face · NLP · Machine Learning
    Case study

    The problem

    I wanted to see how far you could push sentiment classification without touching a paid API. Fine-tune a small model, run everything locally, measure honestly.

    The approach

    Fine-tuned DistilBERT on Yelp reviews with Hugging Face Transformers, and paired it with a local Qwen3-8B as the prompt-engineered baseline. Then ran a domain-shift analysis on out-of-distribution reviews (non-restaurant categories) to see where the fine-tuned model's confidence actually held up.
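    The fine-tuning step can be sketched with the Transformers Trainer API. A minimal sketch, assuming the `yelp_polarity` dataset and illustrative hyperparameters (not the exact setup used here); the heavy imports sit inside `main` so the helper can be read without the libraries installed:

    ```python
    def tokenize_reviews(batch, tokenizer):
        # Truncate long Yelp reviews to DistilBERT's 512-token limit.
        return tokenizer(batch["text"], truncation=True, max_length=512)

    def main():
        # Heavy deps imported here so the module loads without them.
        from datasets import load_dataset
        from transformers import (AutoModelForSequenceClassification,
                                  AutoTokenizer, Trainer, TrainingArguments)

        tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
        model = AutoModelForSequenceClassification.from_pretrained(
            "distilbert-base-uncased", num_labels=2)  # negative / positive

        ds = load_dataset("yelp_polarity")  # assumed dataset; swap in your split
        ds = ds.map(lambda b: tokenize_reviews(b, tokenizer), batched=True)

        args = TrainingArguments(
            output_dir="out",
            per_device_train_batch_size=16,
            num_train_epochs=1,
            learning_rate=2e-5,  # illustrative values, not the tuned config
        )
        # Trainer pads each batch dynamically when given the tokenizer.
        Trainer(model=model, args=args, tokenizer=tokenizer,
                train_dataset=ds["train"], eval_dataset=ds["test"]).train()

    if __name__ == "__main__":
        main()
    ```

    Everything downloads and runs locally, which is what keeps the operational cost at zero.
    
    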

    What worked

    81% accuracy on the held-out test set, zero operational cost, all inference local. The domain-shift analysis turned out to be the most useful part: the model's accuracy dropped predictably on non-restaurant reviews, which told me exactly where a production deploy would need retraining data.
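    The domain-shift check itself is simple bookkeeping: group held-out predictions by review category and compare accuracy against mean confidence per group. A minimal sketch (the categories and prediction records are hypothetical, not actual results):

    ```python
    from collections import defaultdict

    def accuracy_by_domain(records):
        """records: iterable of (category, predicted, actual, confidence)."""
        stats = defaultdict(lambda: {"n": 0, "correct": 0, "conf": 0.0})
        for category, pred, actual, conf in records:
            s = stats[category]
            s["n"] += 1
            s["correct"] += int(pred == actual)
            s["conf"] += conf
        return {cat: {"accuracy": s["correct"] / s["n"],
                      "mean_confidence": s["conf"] / s["n"]}
                for cat, s in stats.items()}

    # Hypothetical records: in-domain restaurants vs an OOD category.
    records = [
        ("restaurant", 1, 1, 0.97), ("restaurant", 0, 0, 0.94),
        ("auto_repair", 1, 0, 0.91), ("auto_repair", 0, 0, 0.66),
    ]
    report = accuracy_by_domain(records)
    # A wide gap between mean_confidence and accuracy in a category flags
    # overconfident out-of-distribution behavior worth retraining on.
    ```

    Sorting categories by that confidence-accuracy gap gives a direct priority list for where retraining data buys the most.
    
    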

    What I'd do differently

    I overspent on prompt iteration for the Qwen baseline before I had a good eval harness. Next time the eval framework comes first.
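    "Eval framework first" need not mean much machinery: a harness that scores every prompt variant against one fixed labeled set is enough to make iteration measurable. A sketch with placeholder classifiers (in practice each variant would wrap a Qwen3-8B prompt):

    ```python
    def evaluate(classifier, examples):
        """Score a classifier callable (text -> label) on (text, label) pairs."""
        correct = sum(classifier(text) == label for text, label in examples)
        return correct / len(examples)

    def compare(variants, examples):
        """Rank named classifier variants by accuracy on a fixed eval set."""
        scores = ((name, evaluate(fn, examples)) for name, fn in variants.items())
        return sorted(scores, key=lambda pair: pair[1], reverse=True)

    # Placeholder stand-ins for prompt variants and eval examples.
    examples = [("great food", 1), ("terrible service", 0), ("loved it", 1)]
    variants = {
        "always_positive": lambda text: 1,
        "keyword": lambda text: 0 if "terrible" in text else 1,
    }
    ranking = compare(variants, examples)  # best variant first
    ```

    With this in place, every prompt tweak gets a number instead of a vibe, and the baseline comparison against the fine-tuned model is apples to apples.
    
    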
