LangMentor - AI Tutor That Learns from My Mistakes

The Problem

I can't afford regular lessons with tutors, and I don't have time for language schools. Their format isn't always effective anyway. So I built a system for myself that learns my mistakes and strengths, trains me with spaced repetition, tracks my progress automatically, and picks daily topics for me.

Before / After

Before: A tutor twice a week → $240/month → we go page by page through a textbook → no tracking of exactly where things "fall apart" for me. Notes are kept, but the goal is to get through the book.

After: Daily practice in Telegram → $4/month → the AI notices that I keep getting "the subjunctive after expressions of emotion" wrong → it gradually generates 20 more exercises targeted at exactly that → I finally master it.

Bot base: 4 digitized textbooks (A1–B2 for Spanish/English), 2300+ sub-exercises. 95% accuracy in detecting my weak spots. Practice anywhere, anytime. The textbooks are there primarily to provide verified theory and to cover the entire span from A1 to B2.

How It Works

Step 1: The Telegram bot sends an exercise from a digitized textbook. "Fill in the correct form: Espero que tú ___ (venir) mañana."

Step 2: You answer. The AI checks instantly. If there's a mistake, it saves it with context (which rule, what type of error).

Step 3: After 10 mistakes on the subjunctive, the AI sees a pattern and gives MORE practice exactly on that topic. It spaces the drills over time (forgot → reviewed → remembered better).

Result: Unlimited practice precisely on what is hard for you personally. Not "one-size-fits-all lessons." When textbook tasks run out, the AI generates new ones of different types, tailored to your weak spots.
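
Steps 2 and 3 above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the `Mistake` and `MistakeLog` names and their fields are hypothetical, and the threshold of 10 is taken from Step 3.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mistake:
    """One wrong answer, saved with its context (hypothetical fields)."""
    exercise: str
    given: str
    expected: str
    rule: str        # e.g. "subjunctive after expressions of emotion"
    error_type: str  # e.g. "wrong mood"


class MistakeLog:
    def __init__(self, threshold: int = 10) -> None:
        self.threshold = threshold
        self._mistakes: List[Mistake] = []

    def record(self, mistake: Mistake) -> None:
        self._mistakes.append(mistake)

    def weak_rules(self) -> List[str]:
        """Rules with at least `threshold` recorded mistakes, most frequent first."""
        counts = Counter(m.rule for m in self._mistakes)
        return [rule for rule, n in counts.most_common() if n >= self.threshold]
```

A rule that shows up in `weak_rules()` is the signal to schedule more practice on that topic.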

Technical Architecture

1. Content Digitization Pipeline:

OCR + manual proofreading of 4 full textbooks
AI extraction of theory, glossaries, grammar rules, and exercises
YAML structure: each lesson has metadata, goals, content sections, exercise types. 500+ units
Triple verification system: if Claude and GPT agree, the answer is accepted as unambiguously correct; if they disagree, Opus 4.1 is called for re-verification. All variants are stored and shown to the user. Out of 2300+ sub-exercises, only 73 required triple verification
Quality control: comparing extracted content to the source
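
The triple-verification rule above could look something like this. It is a sketch under assumptions: each model is wrapped in a plain callable that returns an answer string, the names `verify_exercise`, `normalize`, and `Verdict` are hypothetical, and treating the third model's answer as the reference on disagreement is my reading of "re-verification", not a documented detail.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Solver = Callable[[str], str]  # hypothetical wrapper around one model's API call


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are not disagreements."""
    return " ".join(answer.lower().split())


@dataclass
class Verdict:
    accepted: str                                       # answer used as the reference
    variants: List[str] = field(default_factory=list)   # every distinct answer seen
    escalated: bool = False                             # True if the third model was called


def verify_exercise(prompt: str, first: Solver, second: Solver, arbiter: Solver) -> Verdict:
    a, b = normalize(first(prompt)), normalize(second(prompt))
    if a == b:
        # Two independent models agree: accept without a third call.
        return Verdict(accepted=a, variants=[a])
    # Disagreement: ask the third model and keep all distinct variants.
    c = normalize(arbiter(prompt))
    variants = list(dict.fromkeys([a, b, c]))  # de-duplicate, preserve order
    return Verdict(accepted=c, variants=variants, escalated=True)
```

Because the third call only happens on disagreement, the expensive model is used for a small fraction of exercises, which matches the 73 escalations reported above.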

2. Lesson Storage and Delivery:

PostgreSQL stores lesson content, my progress, and completed tasks
Lesson YAML files are parsed dynamically on topic request
Telegram interface for seamless on-phone learning
Support for push notifications (daily reminders) and pull mode (self-started sessions)
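
A rough sketch of what the progress storage might look like. Everything here is an assumption for illustration: the table and column names are hypothetical, and `sqlite3` stands in for PostgreSQL only so the example runs without a database server.

```python
import sqlite3

# Hypothetical schema; the real PostgreSQL tables are not described in this write-up.
SCHEMA = """
CREATE TABLE lessons (
    id INTEGER PRIMARY KEY,
    level TEXT NOT NULL,          -- e.g. 'A1' .. 'B2'
    topic TEXT NOT NULL
);
CREATE TABLE attempts (
    id INTEGER PRIMARY KEY,
    lesson_id INTEGER NOT NULL REFERENCES lessons(id),
    exercise_ref TEXT NOT NULL,
    is_correct INTEGER NOT NULL,  -- 0 or 1
    answered_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_attempt(conn: sqlite3.Connection, lesson_id: int,
                   exercise_ref: str, is_correct: bool) -> None:
    conn.execute(
        "INSERT INTO attempts (lesson_id, exercise_ref, is_correct) VALUES (?, ?, ?)",
        (lesson_id, exercise_ref, int(is_correct)),
    )
    conn.commit()


def accuracy_by_topic(conn: sqlite3.Connection) -> dict:
    """Share of correct answers per topic, as a value between 0.0 and 1.0."""
    rows = conn.execute(
        "SELECT l.topic, AVG(a.is_correct) FROM attempts a "
        "JOIN lessons l ON l.id = a.lesson_id GROUP BY l.topic"
    ).fetchall()
    return dict(rows)
```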

3. Error Tracking and Vector Memory:

Every answer is saved with metadata: question type, topic, grammar concept, vocabulary
Mistakes are embedded with OpenAI embeddings and stored in Qdrant
Similarity search surfaces related mistakes, for example "trouble with the subjunctive in past tenses"
The system builds a detailed profile of my mistakes over time
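
The similarity lookup can be illustrated without any external service. This sketch uses plain cosine similarity over in-memory vectors; in the real system the vectors would come from OpenAI embeddings and the search would be done by Qdrant. The function names and the dictionary-based store are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query: Vector, stored: Dict[str, Vector],
                 top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the `top_k` stored mistakes closest to the query, best match first."""
    scored = [(label, cosine(query, vec)) for label, vec in stored.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]
```

Embedding a new mistake and calling `most_similar` against earlier ones is what lets related errors cluster together even when the exercises themselves look different.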

4. Adaptive Learning Engine:

Analyzes error patterns: which rules, which lexical topics, which exercise types
Calculates topic confidence (0–100%) based on recent results
Assigns more practice to low-confidence topics (spaced repetition)
Reduces frequency for mastered concepts (>90% accuracy across several sessions)
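
One way the confidence score and practice allocation could be computed is shown below. The exact formula is not given in this write-up, so the exponential recency weighting, the `decay` and `floor` parameters, and the three-session mastery window are all assumptions chosen for illustration.

```python
from typing import Dict, Sequence


def topic_confidence(results: Sequence[bool], decay: float = 0.8) -> float:
    """Confidence in 0..100 from answers ordered oldest to newest.

    Newer answers weigh more: the newest has weight 1, the one before it
    `decay`, then `decay**2`, and so on. Returns 0.0 when there is no data.
    """
    if not results:
        return 0.0
    newest_first = list(reversed(results))
    weights = [decay ** age for age in range(len(newest_first))]
    earned = sum(w for w, ok in zip(weights, newest_first) if ok)
    return 100.0 * earned / sum(weights)


def is_mastered(session_accuracy: Sequence[float], threshold: float = 90.0,
                min_sessions: int = 3) -> bool:
    """True if each of the last `min_sessions` sessions scored above `threshold`."""
    recent = session_accuracy[-min_sessions:]
    return len(recent) >= min_sessions and all(a > threshold for a in recent)


def practice_weights(confidences: Dict[str, float], floor: float = 5.0) -> Dict[str, float]:
    """Share of practice per topic, proportional to the confidence gap.

    `floor` keeps strong topics in rotation at a low frequency instead of
    dropping them entirely.
    """
    raw = {topic: max(100.0 - c, floor) for topic, c in confidences.items()}
    total = sum(raw.values())
    return {topic: r / total for topic, r in raw.items()}
```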

5. AI Exercise Generation:

When textbook tasks for a topic are exhausted, the AI creates analogous exercises
Claude for complex generation (translations, essays)
GPT-4o-mini for simple drills (fill-ins, conjugations)
Quality check: alignment with the textbook's difficulty level
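
The "textbook first, generated second" rule might be expressed like this. The `Exercise` shape and the `generate` callable (a stand-in for the model call) are hypothetical.

```python
from typing import Callable, Dict, List, Set

Exercise = Dict[str, str]  # hypothetical shape: {"id": ..., "prompt": ...}


def next_exercise(topic: str,
                  pool: Dict[str, List[Exercise]],
                  completed: Set[str],
                  generate: Callable[[str], Exercise]) -> Exercise:
    """Serve the first unseen textbook exercise; fall back to generation when none remain."""
    for exercise in pool.get(topic, []):
        if exercise["id"] not in completed:
            return exercise
    return generate(topic)
```

Keeping the generator behind a callable makes it easy to swap models or to run a quality check on the generated exercise before it is returned.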

6. Cost Optimization with Multiple Models:

Exercise Checking: GPT-4o-mini (cheap and fast) for objective answers (fill-ins, multiple choice)
Translation Checking: GPT-4 for nuanced evaluation
Grammar Explanations: Claude 3.5 Sonnet for detailed, teacher-style explanations
Exercise Generation: GPT-4o-mini for simple, Claude for complex tasks
Current cost: ~$4/month (usage-dependent)
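
The routing described above boils down to a lookup table. In this sketch the `Task` enum and the helper `task_for_check` are hypothetical, and the strings are the model names as written in this document, not exact API model identifiers.

```python
from enum import Enum


class Task(Enum):
    CHECK_OBJECTIVE = "check_objective"      # fill-ins, multiple choice
    CHECK_TRANSLATION = "check_translation"
    EXPLAIN_GRAMMAR = "explain_grammar"
    GENERATE_SIMPLE = "generate_simple"      # fill-ins, conjugations
    GENERATE_COMPLEX = "generate_complex"    # translations, essays


# Human-readable model labels taken from the list above.
ROUTES = {
    Task.CHECK_OBJECTIVE: "GPT-4o-mini",
    Task.CHECK_TRANSLATION: "GPT-4",
    Task.EXPLAIN_GRAMMAR: "Claude 3.5 Sonnet",
    Task.GENERATE_SIMPLE: "GPT-4o-mini",
    Task.GENERATE_COMPLEX: "Claude",
}


def route(task: Task) -> str:
    """Return the model label responsible for a task."""
    return ROUTES[task]


def task_for_check(exercise_type: str) -> Task:
    """Map an exercise type to a checking task; anything non-objective gets nuanced checking."""
    objective = {"fill_in", "multiple_choice", "conjugation"}
    return Task.CHECK_OBJECTIVE if exercise_type in objective else Task.CHECK_TRANSLATION
```

Sending the high-volume objective checks to the cheapest model is what keeps the monthly bill low; the more expensive models are only reached for translations, explanations, and complex generation.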

Why It Works

The AI tracks mistakes you don't notice yourself. The system found that I was failing "the subjunctive after expressions of emotion" — I thought I was "just bad with the subjunctive overall." Targeted practice fixed the real problem.

Real Numbers

Key figures:

4 textbooks digitized, 2300+ sub-exercises
$4/month versus $240/month for a tutor
Instant feedback instead of waiting for the next lesson
95%+ accuracy in detecting weak topics

What Actually Changed:

Before: 1–2 lessons per week, expensive, linear program
After: daily practice tailored to MY mistakes, unlimited tasks
The system catches patterns I wasn't aware of
You can study via Telegram anywhere, no scheduling

Value and Scale

Solves the problem for: 1 person (me) learning Spanish/English on a tight budget

Potential market: 1.5B language learners, 50M on Duolingo (too superficial), and millions who can't afford $50/hour tutors

Unit economics: $4/month vs. $240/month with a human. Professional educational content (not "gamification for its own sake"). The AI finds error patterns you don't notice.

What changed: The system exposed "the subjunctive after expressions of emotion" — a pattern I had no idea about. Now I study anywhere via Telegram, without a schedule.

Tech Stack

Technologies: Python, aiogram (Telegram), Claude 3.5 Sonnet, GPT-4, GPT-4o-mini, Qdrant, OpenAI Embeddings, PostgreSQL, YAML

Content: 4 textbooks digitized, 2300+ structured sub-exercises

Complexity: 8/10 (content extraction, adaptive algorithms, multi-model routing)