StrainsDB: AI-Generated Cannabis Database
strainsdb.org is a Django web application that catalogues cannabis strains, built as an early experiment in AI-powered content generation. Every strain description, terpene profile, and data point was generated using large language models and structured with Pydantic models for consistency.
What It Does
- Strain Browser — Browse hundreds of cannabis strains with detailed descriptions, effects, and growing information.
- Search — Find strains by name, effects, or characteristics.
- Terpene Profiles — Detailed terpene breakdowns for each strain.
- Structured AI Content — All data generated by GPT-3.5, validated through Pydantic models to ensure consistency.
Why I Built It
StrainsDB was a proof of concept: could you build a content-rich website almost entirely with AI-generated data, and could that data be structured and reliable enough to be useful?
The answer turned out to be yes, with the right tooling. The key insight was using the Instructor library to constrain LLM outputs to Pydantic models. Instead of hoping the AI would return well-formatted data, I defined exactly what "well-formatted" meant and let the library enforce it. This produced a database of hundreds of strains with consistent, structured information.
Technical Stack
- Django — Web framework and admin interface.
- GPT-3.5 — Content generation for strain descriptions and profiles.
- Instructor + Pydantic — Structured output validation, ensuring every AI-generated entry matches the expected schema.
- SQLite — Lightweight storage, perfect for a proof of concept.
What It Demonstrated
This project predated the widespread adoption of LLMs for content generation. It showed that AI could produce not just plausible text, but structured, validated data suitable for a real web application — as long as you had the right constraints in place.
The pattern of "AI generates, Pydantic validates" has since become common practice. StrainsDB was one of the early demonstrations of why that pattern works.