Project Overview
An extractive question-answering system that takes a context paragraph and a question, then returns a span of text from the context as the answer. Transformer models (e.g. DistilBERT, BERT) are fine-tuned on the Stanford Question Answering Dataset (SQuAD). The project includes a full training pipeline with a configurable model name, data path, epoch count, and batch size, plus a Flask web app where users enter a context and a question and receive the extracted answer. Evaluation uses Exact Match and F1; metrics are saved to outputs/eval_metrics.txt. The app loads the fine-tuned model from outputs/final if present, otherwise it falls back to the base model.
Dataset / Input Data
Stanford Question Answering Dataset (SQuAD), stored as JSON under data/ and loaded via src/data_loader.
Model / Approach
Task
Extractive QA: the model predicts the start and end token positions of the answer span within the given context.
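The span-selection step can be sketched with toy numbers. In a real model the per-token start/end scores come from the QA head's logits; the values below are made up for illustration:

```python
# Sketch of extractive-QA span selection: given per-token start/end
# scores from a QA head, pick the highest-scoring valid span
# (start <= end, capped at a maximum answer length).

def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) token indices maximizing start + end score."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_logits):
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_score + end_logits[e]
            if score > best_score:
                best_score, best = score, (s, e)
    return best

# Toy context tokens and invented scores peaking at "paris".
tokens = ["the", "capital", "of", "france", "is", "paris", "."]
start_logits = [0.1, 0.2, 0.0, 0.3, 0.1, 4.0, 0.0]
end_logits = [0.0, 0.1, 0.0, 0.2, 0.1, 3.5, 0.2]

s, e = best_span(start_logits, end_logits)
print(" ".join(tokens[s:e + 1]))  # -> paris
```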
Training
Fine-tuning on SQuAD via src.train. Supports the full dataset or smaller subsets (e.g. --max_train_samples 2000, --max_eval_samples 500), with configurable epochs, batch size, and output directory. Checkpoints and the final model are saved under outputs/.
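A quick-run invocation might look like the following. The flags --model_name, --max_train_samples, --max_eval_samples, and --output_dir appear in this README; --data_path, --epochs, and --batch_size are assumed spellings of the other configurable arguments, so check src/train's argument parser for the exact names.

```shell
# Quick training run on a small SQuAD subset (flag names partly assumed).
python -m src.train \
  --model_name distilbert-base-uncased \
  --data_path data/ \
  --epochs 2 \
  --batch_size 16 \
  --max_train_samples 2000 \
  --max_eval_samples 500 \
  --output_dir outputs
```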
Models
DistilBERT (default), BERT, or other Hugging Face QA models. Compare by running training with different --model_name and --output_dir and inspecting eval_metrics.txt.
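A comparison could be scripted as below. The --model_name and --output_dir flags are from this README; writing eval_metrics.txt inside each run's output directory is an assumption, so adjust the paths if metrics land elsewhere.

```shell
# Sketch: train two base models into separate output dirs, then
# print each run's metrics file side by side (paths assumed).
for model in distilbert-base-uncased bert-base-uncased; do
  python -m src.train --model_name "$model" --output_dir "outputs/$model"
done
grep -H . outputs/*/eval_metrics.txt
```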
Evaluation
Exact Match and F1 score; metrics written to outputs/eval_metrics.txt.
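The two metrics can be sketched in plain Python using the standard SQuAD-style answer normalization (lowercasing, stripping punctuation and articles); this is a minimal re-implementation for illustration, not the project's evaluation code:

```python
# Sketch of SQuAD-style Exact Match and token-level F1.
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(pred, gold):
    return float(normalize(pred) == normalize(gold))

def f1_score(pred, gold):
    pred_toks = normalize(pred).split()
    gold_toks = normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))  # 1.0
print(round(f1_score("in Paris France", "Paris"), 2))   # 0.5
```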
Tech Stack & Project Structure
Tools: Python, Hugging Face Transformers (DistilBERT, BERT), and Flask for the web interface.
Structure: app.py, question_answering.ipynb, data/ (SQuAD JSON), src/ (data_loader, preprocess, train, inference), templates/, static/css/, outputs/ (checkpoints, final, eval_metrics.txt).
Results / Output
Checkpoints and the final model are written under outputs/ (the app serves outputs/final when present); Exact Match and F1 scores are written to outputs/eval_metrics.txt.
Highlights
- Extractive QA with Transformers fine-tuned on SQuAD
- Flask web interface for context + question → answer
- CLI training: quick run (small subset) or full training with configurable args
- Support for different base models (e.g. bert-base-uncased, distilbert-base-uncased)
- Evaluation: Exact Match and F1 saved to file
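The web flow in the highlights can be sketched as a minimal Flask app. The qa() stand-in below replaces the real model call (src/inference in this project), and the inline template is a simplification of the templates/ directory:

```python
# Minimal sketch of the Flask flow: a form posts context + question,
# the app extracts an answer and renders it back on the page.
from flask import Flask, request, render_template_string

app = Flask(__name__)

PAGE = """
<form method="post">
  <textarea name="context"></textarea>
  <input name="question">
  <button type="submit">Ask</button>
</form>
<p>{{ answer }}</p>
"""

def qa(context, question):
    # Placeholder: the real app would run the fine-tuned QA model here.
    return context.split(".")[0]

@app.route("/", methods=["GET", "POST"])
def index():
    answer = ""
    if request.method == "POST":
        answer = qa(request.form["context"], request.form["question"])
    return render_template_string(PAGE, answer=answer)

if __name__ == "__main__":
    app.run(debug=True)
```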