WriteRAG - Learn to Write, Not Just Fix Text

The Investment

I completed professional editor training, finished five writing courses, earned a qualification in print and digital media, and worked through a large body of editing material. That was a significant investment, and I wasn't satisfied to let the knowledge sit unused on a drive or a shelf.

So I ran an experiment: can this knowledge be turned into a working system? Not just an AI that gives generic tips, but a tool that delivers structured feedback grounded in specific sources — ones I know and trust.

Before / After

Before: Grammarly flags passive voice → fixes it → I accept → I don't understand why passive voice is a problem → I keep making the same mistake.

After: WriteRAG says, "You wrote 'mistakes were made': passive voice. Rule: Strunk & White, Rule 14: Use the active voice. Why: Passive hides responsibility and weakens writing. Better: 'I made mistakes.'" → I actually understand the principle.

Impact: 85%+ accuracy in detecting violations. 18 rule categories derived from 10+ books and courses. I internalized "show, don't tell" after seven rounds of feedback, and now I catch it myself, without the tool.

How It Works

Step 1: I extracted rules from 10+ professional writing books, added notes from writing-craft courses, and organized the material into 18 categories. Every rule keeps a citation to its source (a specific book or course module), so feedback on a draft can show exactly WHICH rule you violated and WHY it matters. Rules are internalized through repetition, typically within 5–7 iterations.

Step 2: You send your text to a Telegram bot. The AI finds which rules apply to your text. If you write, "The ball was thrown by John," it matches the "passive voice" rule.

Step 3: It returns: "Rule violated: Use active voice (Strunk & White, p.18). Problem: 'was thrown by' is passive. Why it matters: Active is more direct and stronger. Fix: 'John threw the ball.' Example: [shows good/bad versions]."
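
The feedback message in Step 3 follows a fixed order: rule, citation, problem, reason, fix. A minimal sketch of how such a message could be assembled is shown here. The Finding class, its field names, and format_feedback are hypothetical names chosen for illustration, not the bot's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One detected violation (all field names are hypothetical)."""
    rule: str      # e.g. "Use active voice"
    citation: str  # e.g. "Strunk & White, p.18"
    excerpt: str   # the offending phrase from the draft
    problem: str   # what is wrong with the excerpt
    why: str       # why the rule matters
    fix: str       # suggested rewrite


def format_feedback(f: Finding) -> str:
    """Render a finding in the order used by the Step 3 message."""
    return (
        f"Rule violated: {f.rule} ({f.citation}). "
        f"Problem: '{f.excerpt}' {f.problem}. "
        f"Why it matters: {f.why} "
        f"Fix: '{f.fix}'"
    )


finding = Finding(
    rule="Use active voice",
    citation="Strunk & White, p.18",
    excerpt="was thrown by",
    problem="is passive",
    why="Active is more direct and stronger.",
    fix="John threw the ball.",
)
message = format_feedback(finding)
print(message)
```

Keeping the finding as structured data and rendering it at the last moment means the same record can also feed the pattern tracking described later.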

Result: You learn writing principles from professional sources, not generic AI advice. You track patterns: "You violate 'show, don't tell' 80% of the time — here's why it matters."

Technical Architecture

1. Knowledge Base Construction:

Selected top writing sources: On Writing Well, Bird by Bird, The Elements of Style, Save the Cat, storytelling courses
Defined a custom extraction approach for each source and sorted the results into 18 rule categories
Primary categories (6): clarity, structure, voice/tone, show don't tell, pacing, active vs passive voice
Secondary categories: dialogue, character, conflict, description, transitions, opening hooks, etc.
Each rule is structured with name, category, principle, explanation, good/bad examples, and recommendations
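
The rule structure listed above maps naturally onto a small record type. The sketch below is one possible shape, assuming Python dataclasses. The field names follow the list above, while the source field, the embedding_text helper, and the example recommendation are assumptions added for illustration.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field


@dataclass
class Rule:
    name: str
    category: str
    principle: str
    explanation: str
    source: str  # citation, e.g. book and page (assumed field)
    good_examples: list[str] = field(default_factory=list)
    bad_examples: list[str] = field(default_factory=list)
    recommendations: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A rule without both kinds of example cannot be checked objectively,
        # so refuse to create it.
        if not self.good_examples or not self.bad_examples:
            raise ValueError(f"rule {self.name!r} needs a good and a bad example")

    def embedding_text(self) -> str:
        """Text that would be embedded for retrieval (an assumed choice)."""
        return " ".join([self.name, self.principle, *self.bad_examples])


passive = Rule(
    name="Use the active voice",
    category="active_vs_passive",
    principle="Prefer active constructions.",
    explanation="Active is more direct and stronger.",
    source="Strunk & White, p.18",
    good_examples=["John threw the ball."],
    bad_examples=["The ball was thrown by John."],
    recommendations=["Make the actor the subject of the sentence."],  # illustrative
)
payload = asdict(passive)  # plain dict, ready to store as metadata
```

Requiring at least one good and one bad example at construction time is one way to enforce the "no vague principles" policy in code rather than by convention.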

2. Rule Extraction and Structuring:

Manual curation: read books and courses, distilled discrete, actionable rules
AI assistance: Claude helped formalize rules from course notes
Quality validation: each rule tested on real writing samples
No vague principles — only rules that can be identified objectively
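
Testing a rule on real samples can be as simple as comparing a detector's verdicts with hand-labeled sentences. The write-up does not say how accuracy was measured, so the sketch below is only one plausible setup: passive_detector is a deliberately crude regex heuristic and SAMPLES is a tiny illustrative set, both hypothetical.

```python
import re
from typing import Callable, Iterable, Tuple


def passive_detector(sentence: str) -> bool:
    """Crude heuristic: a form of 'to be' followed by a word ending in -ed/-en/-wn."""
    pattern = r"\b(is|are|was|were|been|be)\s+\w+(ed|en|wn)\b"
    return bool(re.search(pattern, sentence, re.IGNORECASE))


def accuracy(detector: Callable[[str], bool],
             samples: Iterable[Tuple[str, bool]]) -> float:
    """Share of samples where the detector agrees with the human label."""
    samples = list(samples)
    correct = sum(detector(text) == label for text, label in samples)
    return correct / len(samples)


# (sentence, violates_rule) pairs labeled by hand
SAMPLES = [
    ("The ball was thrown by John.", True),
    ("John threw the ball.", False),
    ("Mistakes were made.", True),   # irregular participle: the regex misses it
    ("I made mistakes.", False),
]

score = accuracy(passive_detector, SAMPLES)
print(f"accuracy: {score:.2f}")
```

The missed third sample shows why a surface pattern alone is not enough and why a rule that cannot be labeled consistently by a human is not worth keeping.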

3. Vector Database Indexing:

Created embeddings for each rule with OpenAI Embeddings
Stored in Qdrant with metadata (category, source, severity)
Embeddings capture meaning: "passive voice" surfaces passive constructions; "telling not showing" finds exposition dumps
Indexed both positive examples (good writing) and negative examples (violations) alongside each rule
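
To make the indexing step concrete without network calls, the sketch below replaces both external services with stand-ins: embed is a toy word-count vector in place of OpenAI Embeddings, and RuleIndex is a small in-memory class in place of Qdrant. Neither reflects those libraries' real APIs, and a bag-of-words vector matches shared words rather than meaning. The payload keys mirror the metadata listed above, with hypothetical values.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class RuleIndex:
    """Minimal in-memory index: one vector plus one metadata payload per point."""

    def __init__(self) -> None:
        self._points = []  # list of (vector, payload) pairs

    def upsert(self, text: str, payload: dict) -> None:
        self._points.append((embed(text), payload))

    def search(self, query: str, limit: int = 3,
               category: str | None = None) -> list[tuple[float, dict]]:
        q = embed(query)
        scored = [
            (cosine(q, vec), payload)
            for vec, payload in self._points
            if category is None or payload["category"] == category
        ]
        return sorted(scored, key=lambda p: p[0], reverse=True)[:limit]


index = RuleIndex()
index.upsert(
    "The ball was thrown by John. Mistakes were made.",
    {"rule": "Use active voice", "category": "active_vs_passive",
     "source": "Strunk & White, p.18", "severity": "medium"},
)
index.upsert(
    "She was very sad and felt angry.",
    {"rule": "Show, don't tell", "category": "show_dont_tell",
     "source": "hypothetical course module", "severity": "high"},
)

hits = index.search("The window was broken by the kids.")
print(hits[0][1]["rule"])
```

Storing category, source, and severity next to each vector is what lets a search be narrowed by metadata and lets every hit carry its citation.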

4. Text Analysis Pipeline:

User submits text → system builds an embedding
Vector similarity search in Qdrant retrieves the closest rules as candidate violations
AI (Claude) validates matches: "Does this actually violate 'show, don't tell'?"
Returns a ranked list of issues with confidence scores, citations, explanations, and concrete recommendations
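
The retrieve, validate, rank flow above can be sketched as a single function. Everything here is an assumption made for illustration: the Candidate and Issue types, the 0.5 threshold, the choice to multiply similarity by the validator's verdict, and stub_validator, which stands in for the call to Claude so the example runs offline.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    rule: str
    citation: str
    similarity: float  # score from the vector search, 0..1


@dataclass
class Issue:
    rule: str
    citation: str
    confidence: float


def stub_validator(text: str, rule: str) -> float:
    """Stand-in for the LLM check; returns a 0..1 verdict."""
    if rule == "Use active voice":
        passive = re.search(r"\b(was|were|is|are|been)\s+\w+(ed|en|wn)\b", text)
        return 0.9 if passive else 0.1
    return 0.2  # the stub has no opinion on other rules


def analyze(text: str, candidates: list[Candidate],
            validate: Callable[[str, str], float],
            threshold: float = 0.5) -> list[Issue]:
    issues = []
    for c in candidates:
        verdict = validate(text, c.rule)
        if verdict < threshold:
            continue  # retrieval matched, but validation rejected it
        confidence = round(c.similarity * verdict, 3)
        issues.append(Issue(c.rule, c.citation, confidence))
    # Highest confidence first
    return sorted(issues, key=lambda i: i.confidence, reverse=True)


candidates = [
    Candidate("Show, don't tell", "hypothetical citation", 0.41),
    Candidate("Use active voice", "Strunk & White, p.18", 0.78),
]
issues = analyze("The ball was thrown by John.", candidates, stub_validator)
for issue in issues:
    print(issue)
```

Passing the validator in as a function keeps the pipeline testable: the stub can be swapped for a real model call without touching the ranking logic.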

5. Learning Through Repetition:

The system tracks which rules you violate most often
Builds a "mistake profile" over time
Prioritizes feedback: "You keep telling instead of showing — here's another instance"
Progress tracking: see changes in violation frequency week over week
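
A mistake profile of this kind needs little more than counters keyed by week. The sketch below assumes one record call per analyzed draft. The MistakeProfile class, its method names, and the sample dates are hypothetical.

```python
from __future__ import annotations

from collections import Counter, defaultdict
from datetime import date


class MistakeProfile:
    """Counts rule violations per ISO week (illustrative sketch)."""

    def __init__(self) -> None:
        self._by_week = defaultdict(Counter)  # (year, week) -> rule -> drafts
        self._drafts = defaultdict(int)       # (year, week) -> drafts analyzed

    def record(self, day: date, violated_rules: list[str]) -> None:
        year, week, _ = day.isocalendar()
        # Count each rule once per draft, however often it was violated.
        self._by_week[(year, week)].update(set(violated_rules))
        self._drafts[(year, week)] += 1

    def top_rules(self, n: int = 3) -> list[tuple[str, int]]:
        total = Counter()
        for counts in self._by_week.values():
            total.update(counts)
        return total.most_common(n)

    def weekly_rate(self, rule: str) -> dict[tuple[int, int], float]:
        """Share of drafts per week that violated the rule, in week order."""
        return {
            wk: self._by_week[wk][rule] / self._drafts[wk]
            for wk in sorted(self._drafts)
        }


profile = MistakeProfile()
profile.record(date(2024, 1, 1), ["show_dont_tell", "passive_voice"])
profile.record(date(2024, 1, 3), ["show_dont_tell"])
profile.record(date(2024, 1, 8), ["show_dont_tell"])
profile.record(date(2024, 1, 10), [])

print(profile.top_rules(1))
print(profile.weekly_rate("show_dont_tell"))
```

The per-week rate is what supports statements such as "you violate this rule in most of your drafts" and makes week-over-week improvement visible.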

18 Rule Categories

Primary (core):

Clarity & concision: remove needless words, use concrete language, avoid jargon
Structure & flow: logical organization, paragraph unity, effective transitions
Voice & tone: consistent perspective, appropriate formality, authentic voice
Show, don't tell: sensory detail, reveal character through action, avoid exposition dumps
Pacing: balance scene and summary, vary sentence length, control information flow
Active vs passive: prefer active voice, identify passive, use passive intentionally

Secondary:

Dialogue: natural speech, subtext, attribution
Character development: consistency, motivation, arc
Conflict & tension: stakes, obstacles, escalation
Description: relevant detail, avoid info dumps, sensory balance
Opening hooks: capture attention, set tone, create questions
Endings: satisfying resolution, avoid deus ex machina, thematic closure
…and six more specialized categories

What Makes It Different

Every piece of feedback cites a real book page or course module (e.g., Strunk & White p.18, Bird by Bird ch.3). Not AI "opinions," but professional writing wisdom you've paid for — and then forgotten. It builds intuition through repetition, not dependency.

Real Numbers

Performance:

18 rule categories from 10+ books and courses
10–20 seconds per analysis
85%+ accuracy in identifying violations
Rules stick after 5–7 rounds of feedback

What Actually Changed:

Before: Grammarly fixes it; I don't learn "why"
After: I see the exact professional rule I'm violating and why it matters
Identified a systematic weakness: "show, don't tell"
Now I catch passive voice in my own writing without the tool
I apply writing techniques consciously thanks to repetition

Value and Scale

For: one person (me) learning professional writing

Potential market: anyone improving their writing — students, creators, novelists, professionals. 30M+ writers on Medium, Substack, and personal blogs.

What it teaches: not just "fix this," but "why." You internalize principles through repetition. It surfaced a weakness I didn't realize I had ("show, don't tell"), and I now catch passive voice myself.

Core difference: Grammarly/ChatGPT create dependency. WriteRAG builds independence — understand principles and apply them on your own.

Skills Demonstrated

Content curation and knowledge extraction from professional sources
RAG architecture (Qdrant + embeddings)
Rule-based AI systems (hybrid vector search + validation)
Writing theory and pedagogy
Pattern recognition and learning systems
Product design for learning (not just quick fixes)
Quality control for AI outputs (validation layer)

Tech Stack

Technologies: Python, Claude 3.5 Sonnet, OpenAI Embeddings, Qdrant, FastAPI

Knowledge base: 10+ professional books/courses, 18 categories, hundreds of discrete rules

Complexity: 7/10 (RAG, rule extraction, validation layer, pattern tracking)