The Problem
I can't afford regular lessons with tutors, and I don't have time for language schools. Their format isn't always effective anyway. So I built a system for myself that learns my mistakes and strengths, trains me with spaced repetition, tracks my progress automatically, and picks daily topics for me.
Before / After
Before: A tutor twice a week → $240/month → we go page by page through a textbook → no tracking of what exactly "falls apart" for me. Notes are kept, but the goal is to get through the book.
After: Daily practice in Telegram → $4/month → the AI notices that I confuse "the subjunctive after expressions of emotion" → it gradually generates 20 more exercises targeted at exactly that → I finally master it.
Bot base: 4 digitized textbooks (A1–B2 for Spanish/English), 2300+ exercises. 95% accuracy in detecting my weak spots. Practice anywhere, anytime. The textbooks are there primarily to provide verified theory and to cover the entire span from A1 to B2.
How It Works
Step 1: The Telegram bot sends an exercise from a digitized textbook. "Fill in the blank: Espero que tú ___ (venir) mañana."
Step 2: You answer. The AI checks instantly. If there's a mistake, it saves it with context (which rule, what type of error).
Step 3: After 10 mistakes on the subjunctive, the AI sees a pattern and gives MORE practice exactly on that topic. It spaces the drills over time (forgot → reviewed → remembered better).
Result: Unlimited practice precisely on what is hard for you personally. Not "one-size-fits-all lessons." When textbook tasks run out, the AI generates new ones of different types, tailored to your weak spots.
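Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the `Mistake` fields, the `topics_needing_practice` helper, and the `PATTERN_THRESHOLD` constant are hypothetical names, and the threshold simply reuses the "10 mistakes" figure from Step 3.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical constant: mirrors the "after 10 mistakes" figure in Step 3.
PATTERN_THRESHOLD = 10


@dataclass(frozen=True)
class Mistake:
    """One wrong answer plus the context saved with it (assumed fields)."""
    exercise_id: str
    rule: str        # which grammar rule was involved
    error_type: str  # what kind of error it was
    given: str       # what the learner typed
    expected: str    # the verified correct answer


def topics_needing_practice(mistakes, threshold=PATTERN_THRESHOLD):
    """Return rules whose mistake count reached the threshold, most frequent first."""
    counts = Counter(m.rule for m in mistakes)
    return [rule for rule, n in counts.most_common() if n >= threshold]


# Ten misses on the same rule, one unrelated miss.
log = [
    Mistake(f"ex-{i}", "subjunctive after expressions of emotion",
            "wrong mood", "vienes", "vengas")
    for i in range(10)
]
log.append(Mistake("ex-99", "ser vs. estar", "wrong verb", "es", "está"))

print(topics_needing_practice(log))
```

Only the rule that crossed the threshold is returned, so a single slip on another topic does not trigger extra drills.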
Technical Architecture
1. Content Digitization Pipeline:
- OCR + manual proofreading of 4 full textbooks
- AI extraction of theory, glossaries, grammar rules, and exercises
- YAML structure: each lesson has metadata, goals, content sections, exercise types. 500+ units
- Triple verification system: if Claude and GPT agree, the answer is treated as unambiguously correct; if they disagree, Opus 4.1 is called for re-verification. All variants are stored and shown to the user. Out of 2300+ sub-exercises, only 73 required triple verification
- Quality control: comparing extracted content to the source
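The triple verification step could look something like the sketch below. It is written under assumptions: the three checkers are passed in as plain callables (stand-ins for the real model calls), `verify_answer` and `normalize` are hypothetical names, and treating the arbiter's answer as the primary one on disagreement is my guess, since the write-up only says all variants are kept.

```python
from typing import Callable

# A checker takes an exercise prompt and returns that model's answer.
# In the real pipeline these would wrap model API calls (not shown here).
Checker = Callable[[str], str]


def normalize(answer: str) -> str:
    """Compare answers case-insensitively, ignoring extra whitespace."""
    return " ".join(answer.lower().split())


def verify_answer(prompt: str, first: Checker, second: Checker,
                  arbiter: Checker) -> dict:
    """Ask two checkers; call the arbiter only when they disagree."""
    a, b = first(prompt), second(prompt)
    if normalize(a) == normalize(b):
        return {"answer": a, "variants": [a], "escalated": False}

    c = arbiter(prompt)
    # Keep every distinct variant so all of them can be shown to the user.
    variants, seen = [], set()
    for candidate in (a, b, c):
        key = normalize(candidate)
        if key not in seen:
            seen.add(key)
            variants.append(candidate)
    # Assumption: the arbiter's answer is used as the primary one.
    return {"answer": c, "variants": variants, "escalated": True}


# Usage with stub checkers in place of real models.
result = verify_answer(
    "Espero que tú ___ (venir) mañana.",
    first=lambda p: "vengas",
    second=lambda p: "vienes",
    arbiter=lambda p: "vengas",
)
print(result)
```

Because the arbiter runs only on disagreement, the more expensive third call is paid for only on the small share of exercises where the first two models differ.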
2. Lesson Storage and Delivery:
- PostgreSQL stores lesson content, my progress, and completed tasks
- Lesson YAML files are parsed dynamically on topic request
- Telegram interface for seamless on-phone learning
- Support for push notifications (daily reminders) and pull mode (self-starting sessions)
3. Error Tracking and Vector Memory:
- Every answer is saved with metadata: question type, topic, grammar concept, vocabulary
- Mistakes are embedded with OpenAI embeddings and stored in Qdrant
- Similarity search surfaces related misses: "trouble with the subjunctive in past tenses"
- The system builds a detailed profile of my mistakes over time
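To illustrate the idea behind the similarity search, here is a small in-memory stand-in. It does not use Qdrant or OpenAI embeddings: the `MistakeMemory` class is hypothetical, the three-dimensional vectors are toy values invented for the example, and cosine similarity is assumed as the distance metric.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class MistakeMemory:
    """Toy replacement for a vector store: keeps (vector, payload) pairs in a list."""

    def __init__(self):
        self._items = []

    def add(self, vector, payload):
        self._items.append((list(vector), payload))

    def search(self, query, limit=3):
        """Return up to `limit` (score, payload) pairs, most similar first."""
        scored = [(cosine(query, vec), payload) for vec, payload in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:limit]


memory = MistakeMemory()
# Toy vectors; a real system would store embeddings of the mistake text.
memory.add([0.9, 0.1, 0.0], {"topic": "subjunctive", "note": "past subjunctive"})
memory.add([0.8, 0.2, 0.1], {"topic": "subjunctive", "note": "after 'espero que'"})
memory.add([0.0, 0.1, 0.9], {"topic": "vocabulary", "note": "kitchen nouns"})

for score, payload in memory.search([1.0, 0.0, 0.0], limit=2):
    print(f"{score:.2f}  {payload['topic']}: {payload['note']}")
```

A new mistake's vector is used as the query, and the nearest stored mistakes show whether it belongs to an existing cluster of related errors.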
4. Adaptive Learning Engine:
- Analyzes error patterns: which rules, which lexical topics, which exercise types
- Calculates topic confidence (0–100%) based on recent results
- Assigns more practice to low-confidence topics (spaced repetition)
- Reduces frequency for mastered concepts (>90% accuracy across several sessions)
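One way the confidence score and topic selection might fit together is sketched below. Everything here is an assumption made for illustration: the `WINDOW` size, the weight values, and the names `TopicStats`, `practice_weight`, and `pick_topic` are hypothetical. It also simplifies two things: confidence is computed over a window of recent answers rather than "several sessions", and weighted random sampling stands in for a full spaced-repetition scheduler.

```python
import random
from collections import deque

WINDOW = 20      # hypothetical: number of recent answers that count
MASTERED = 90.0  # from the ">90% accuracy" rule in the list above


class TopicStats:
    """Tracks recent right/wrong answers for one topic."""

    def __init__(self, window=WINDOW):
        self.recent = deque(maxlen=window)  # old answers fall off automatically

    def record(self, correct):
        self.recent.append(bool(correct))

    @property
    def confidence(self):
        """Share of recent answers that were correct, as 0-100."""
        if not self.recent:
            return 0.0
        return 100.0 * sum(self.recent) / len(self.recent)


def practice_weight(confidence):
    """Lower confidence gives a higher weight; mastered topics keep a small one."""
    if confidence > MASTERED:
        return 5.0  # still reviewed occasionally
    return max(100.0 - confidence, 10.0)


def pick_topic(stats, rng=random):
    """Sample the next topic, favoring low-confidence ones."""
    topics = list(stats)
    weights = [practice_weight(stats[t].confidence) for t in topics]
    return rng.choices(topics, weights=weights, k=1)[0]


stats = {"subjunctive": TopicStats(), "articles": TopicStats()}
for outcome in [False, False, True, False]:
    stats["subjunctive"].record(outcome)
for outcome in [True] * 10:
    stats["articles"].record(outcome)

print({t: s.confidence for t, s in stats.items()})
print(pick_topic(stats))
```

Keeping a small nonzero weight for mastered topics means they still come back occasionally instead of disappearing from practice entirely.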
5. AI Exercise Generation:
- When textbook tasks for a topic are exhausted, the AI creates analogous exercises
- Claude for complex generation (translations, essays)
- GPT-4o-mini for simple drills (fill-ins, conjugations)
- Quality check: alignment with the textbook's difficulty level
6. Cost Optimization with Multiple Models:
- Exercise Checking: GPT-4o-mini (cheap and fast) for objective answers (fill-ins, multiple choice)
- Translation Checking: GPT-4 for nuanced evaluation
- Grammar Explanations: Claude 3.5 Sonnet for detailed, teacher-style explanations
- Exercise Generation: GPT-4o-mini for simple, Claude for complex tasks
- Current cost: ~$4/month (usage-dependent)
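The routing described above amounts to a lookup table from task type to model. The sketch below shows that shape only: the `Task` enum, `ROUTES`, and both helper functions are hypothetical names, and the strings are the model labels used in this write-up, not API model identifiers.

```python
from enum import Enum


class Task(Enum):
    CHECK_OBJECTIVE = "check_objective"      # fill-ins, multiple choice
    CHECK_TRANSLATION = "check_translation"
    EXPLAIN_GRAMMAR = "explain_grammar"
    GENERATE_SIMPLE = "generate_simple"      # fill-ins, conjugations
    GENERATE_COMPLEX = "generate_complex"    # translations, essays


# Labels as written in the list above; real code would map these to API model IDs.
ROUTES = {
    Task.CHECK_OBJECTIVE: "GPT-4o-mini",
    Task.CHECK_TRANSLATION: "GPT-4",
    Task.EXPLAIN_GRAMMAR: "Claude 3.5 Sonnet",
    Task.GENERATE_SIMPLE: "GPT-4o-mini",
    Task.GENERATE_COMPLEX: "Claude",
}

SIMPLE_TYPES = {"fill_in", "conjugation"}
COMPLEX_TYPES = {"translation", "essay"}


def generation_task(exercise_type):
    """Map an exercise type to the matching generation task."""
    if exercise_type in SIMPLE_TYPES:
        return Task.GENERATE_SIMPLE
    if exercise_type in COMPLEX_TYPES:
        return Task.GENERATE_COMPLEX
    raise ValueError(f"unknown exercise type: {exercise_type!r}")


def model_for(task):
    """Look up which model label handles a task."""
    return ROUTES[task]


print(model_for(Task.CHECK_OBJECTIVE))
print(model_for(generation_task("essay")))
```

Keeping the mapping in one table makes it easy to swap a model for a cheaper one and see the effect on the monthly bill without touching the rest of the bot.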
Why It Works
The AI tracks mistakes you don't notice yourself. The system found that I was failing "the subjunctive after expressions of emotion" — I thought I was "just bad with the subjunctive overall." Targeted practice fixed the real problem.
Real Numbers
Throughput:
- 4 textbooks digitized, 2300+ sub-exercises
- $4/month versus $240/month for a tutor
- Instant feedback instead of waiting for the next lesson
- 95%+ accuracy in detecting weak topics
What Actually Changed:
- Before: 1–2 lessons per week, expensive, linear program
- After: daily practice tailored to MY mistakes, unlimited tasks
- The system catches patterns I wasn't aware of
- You can study via Telegram anywhere, no scheduling
Value and Scale
Solves the problem for: 1 person (me) learning Spanish/English on a tight budget
Potential market: 1.5B language learners, 50M on Duolingo (too superficial), millions who can't afford $50/hour tutors
Unit economics: $4/month vs. $240/month with a human. Professional educational content (not "gamification for its own sake"). The AI finds error patterns you don't notice.
What changed: The system exposed "the subjunctive after expressions of emotion" — a pattern I had no idea about. Now I study anywhere via Telegram, without a schedule.
Tech Stack
Technologies: Python, aiogram (Telegram), Claude 3.5 Sonnet, GPT-4, GPT-4o-mini, Qdrant, OpenAI Embeddings, PostgreSQL, YAML
Content: 4 textbooks digitized, 2300+ structured sub-exercises
Complexity: 8/10 (content extraction, adaptive algorithms, multi-model routing)