Domain-Specific Embedding Optimization for Mathematics Education: The AEVC Algorithm for Cognitive Diagnosis in Multilingual Contexts

Charles Kin Leung, Growth Education Limited

PAPER · v1.1 · 2026-02-03 · human

Interdisciplinary Sciences · Data Science & Artificial Intelligence · Machine learning

Abstract

Background: Generic embedding models (e.g., BERT, GPT) fail to capture domain-specific pedagogical attributes in educational applications. The problem is acute in mathematics education, where error diagnosis requires fine-grained categorization beyond semantic similarity. In multilingual contexts such as Hong Kong's English-Medium Instruction (EMI) system, the limitation is exacerbated because mathematical logic errors are conflated with linguistic comprehension deficits.

Methods: We propose the Adaptive Educational Vector Calculation (AEVC) algorithm, a novel feature engineering approach that injects syllabus-aligned attributes into 768-dimensional semantic embeddings via Gated Pedagogical Activation (GPA), a learnable gating mechanism inspired by transformer attention. AEVC is integrated with a CPLA (Conceptual-Procedural-Linguistic-Attention) diagnostic model and implemented in a Retrieval-Augmented Generation (RAG) architecture tailored to the HKDSE (Hong Kong Diploma of Secondary Education) curriculum.

Results: On more than 1,200 HKDSE mathematics problems, AEVC outperforms generic embedding models (BERT, GPT-3.5) by 35% in error attribution precision, achieving 87% accuracy. By isolating the Linguistic (L) component, the system finds that 22% of student failures in word problems stem from semantic attrition rather than mathematical deficiency, which enables targeted intervention.

Conclusion: AEVC establishes a new paradigm for domain-specific embedding optimization in educational AI, demonstrating that pedagogical priors can significantly enhance the performance of generic language models. The framework is deployed as a scalable, privacy-conscious microservice architecture, providing a blueprint for precision pedagogy in multilingual educational settings.
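
Illustrative sketch: The abstract describes Gated Pedagogical Activation only at a high level, so the following is a minimal sketch of one way a learnable gate could inject attribute features into a 768-dimensional embedding, not the paper's specification. It assumes a sigmoid gate computed from the concatenated embedding and attribute vector, followed by a residual update. The class name `GatedFusion`, the 16-dimensional attribute vector, and the randomly initialized (untrained) weights are all hypothetical.

```python
import numpy as np

EMBED_DIM = 768  # embedding width stated in the abstract
ATTR_DIM = 16    # hypothetical size of the syllabus-attribute vector


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GatedFusion:
    """Assumed form: e' = e + g * (W_p a), with g = sigmoid(W_g [e; a] + b_g)."""

    def __init__(self, embed_dim=EMBED_DIM, attr_dim=ATTR_DIM, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.02
        # Random weights stand in for parameters that would be learned.
        self.w_proj = rng.normal(0.0, scale, (embed_dim, attr_dim))
        self.w_gate = rng.normal(0.0, scale, (embed_dim, embed_dim + attr_dim))
        self.b_gate = np.zeros(embed_dim)

    def __call__(self, embedding, attributes):
        # Project the attribute vector into the embedding space.
        projected = self.w_proj @ attributes
        # Per-dimension gate in (0, 1) decides how much attribute signal passes.
        gate_input = np.concatenate([embedding, attributes])
        gate = sigmoid(self.w_gate @ gate_input + self.b_gate)
        # Residual update: the semantic embedding is preserved, attributes are added.
        return embedding + gate * projected


rng = np.random.default_rng(1)
embedding = rng.normal(size=EMBED_DIM)  # stand-in for a semantic embedding
attributes = np.zeros(ATTR_DIM)
attributes[3] = 1.0                     # hypothetical one-hot syllabus tag
fusion = GatedFusion()
fused = fusion(embedding, attributes)
print(fused.shape)
```

With this residual form, an all-zero attribute vector leaves the embedding unchanged, so problems without syllabus tags fall back to the generic representation.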

Keywords

Semantic Embeddings · AEVC Algorithm · Embedding Optimization · Retrieval-Augmented Generation · Cognitive Diagnosis · Mathematics Education
