GPT-4 solves math olympiad problems. Llama-7B does not. The difference? Roughly 100x the parameters and billions of dollars of training. But what if you could "steal" the reasoning ability of a large model and transfer it to a small one?
Not by copy-pasting answers, which is trivial and does not work, but by copy-pasting the way of thinking. Not the "what" but the "how".
That is exactly what reasoning distillation is about: teaching a compact model not merely to imitate outputs but to reproduce the process of logical reasoning. The results are impressive: after distillation, a 7B model can reach 50-70% of GPT-4's performance on reasoning tasks, at about 1% of the cost.
Classical vs Reasoning Distillation
The problem with classical Knowledge Distillation (Hinton et al., 2015):
```python
import torch.nn.functional as F


class ClassicKnowledgeDistillation:
    """
    The classical approach: the student imitates the teacher's soft labels.

    The problem for reasoning:
    - Teacher: "Answer: 42"
    - Student learns to predict "42"
    - Student does NOT learn WHY it is 42
    """

    def distillation_loss(self, student_logits, teacher_logits, temperature=2.0):
        """
        KL divergence between temperature-softened distributions.

        Limitation: transfers "what to answer", not "how to think".
        """
        # log_softmax is the numerically stable way to get log-probabilities
        student_log_soft = F.log_softmax(student_logits / temperature, dim=-1)
        teacher_soft = F.softmax(teacher_logits / temperature, dim=-1)
        # Scale by T^2 to keep gradient magnitudes comparable across temperatures
        return F.kl_div(
            student_log_soft,
            teacher_soft,
            reduction='batchmean'
        ) * (temperature ** 2)
```
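As a quick sanity check, the same temperature-scaled KL loss can be exercised standalone with random logits in place of real model outputs (a sketch, not from the article):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    student_log_soft = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_soft = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(student_log_soft, teacher_soft,
                    reduction="batchmean") * (temperature ** 2)

torch.manual_seed(0)
student = torch.randn(4, 10)  # batch of 4, vocabulary of 10
teacher = torch.randn(4, 10)

loss = distillation_loss(student, teacher)
print(loss.item())  # non-negative, since KL divergence >= 0
# A student that matches the teacher exactly gets (near) zero loss:
print(distillation_loss(teacher, teacher).item())  # approx. 0
```

This also makes the failure mode concrete: the loss only sees the final-token distribution, so a student can drive it to zero without learning any intermediate reasoning.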
```python
class ReasoningDistillation:
    """
    Reasoning distillation: the student learns the process.

    Teacher: "Step 1: ... Step 2: ... Step 3: ... Answer: 42"
    Student learns to reproduce the ENTIRE reasoning process.

    Advantage: the student learns to think, not just to answer.
    """

    def generate_reasoning_trace(self, teacher_model, question: str) -> dict:
        """Generate a full reasoning trace from the teacher."""
        prompt = f"""Solve this problem step by step.
Show all your reasoning clearly.
Problem: {question}
Let's solve this step by step:
"""
        response = teacher_model.generate(prompt, max_tokens=512)
        return {
            "question": question,
            "reasoning": response,
            "answer": self.extract_answer(response)
        }

    def train_student(self, student_model, reasoning_traces):
        """Fine-tune the student on reasoning traces."""
        # Training data format:
        #   input:  question
        #   output: full reasoning trace + answer
        for trace in reasoning_traces:
            training_input = f"Question: {trace['question']}\nLet's think step by step:"
            training_output = trace['reasoning']
            # Standard language modeling objective
            loss = student_model.compute_lm_loss(training_input, training_output)
            loss.backward()
```
Reasoning Distillation Methods
1. Chain-of-Thought Distillation:
```python
class ChainOfThoughtDistillation:
    """
    The baseline method: the teacher generates CoT traces,
    the student is fine-tuned on them.

    A simple idea with strong results.
    """

    def __init__(self, teacher_model, student_model):
        self.teacher = teacher_model
        self.student = student_model

    def generate_cot_dataset(self, questions: list[str],
                             num_samples_per_question: int = 3) -> list[dict]:
        """
        Generate reasoning traces for the training dataset.

        Multiple samples per question give:
        - diverse reasoning paths
        - self-consistency filtering
        """
        dataset = []
        for question in questions:
            samples = [self.generate_single_trace(question)
                       for _ in range(num_samples_per_question)]
            # Keep only consistent samples (majority answer)
            answers = [s["answer"] for s in samples]
            most_common = max(set(answers), key=answers.count)
            filtered = [s for s in samples if s["answer"] == most_common]
            dataset.extend(filtered)
        return dataset

    def generate_single_trace(self, question: str) -> dict:
        prompt = f"""Solve this problem step by step.
Be clear and precise in your reasoning.
Format each step on a new line.
Problem: {question}
Solution:"""
        response = self.teacher.generate(
            prompt,
            temperature=0.7,  # some diversity across samples
            max_tokens=512
        )
        return {
            "question": question,
            "cot": response,
            "answer": self.extract_final_answer(response)
        }

    def fine_tune_student(self, dataset: list[dict]) -> None:
        """Fine-tune the student on CoT traces."""
        training_examples = [
            {
                "input": f"Problem: {item['question']}\n\nLet's solve step by step:",
                "output": item["cot"]
            }
            for item in dataset
        ]
        # Standard supervised fine-tuning
        self.student.supervised_fine_tune(training_examples)
```
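The self-consistency filter in `generate_cot_dataset` is worth isolating: of several sampled traces, only those agreeing with the majority answer survive. A minimal standalone illustration (the sample data here is invented):

```python
from collections import Counter

# Three sampled traces for the same question; one has an arithmetic slip.
samples = [
    {"cot": "...so the total is 12", "answer": "12"},
    {"cot": "...which gives 12", "answer": "12"},
    {"cot": "...mistakenly getting 14", "answer": "14"},
]

answers = [s["answer"] for s in samples]
most_common = Counter(answers).most_common(1)[0][0]
filtered = [s for s in samples if s["answer"] == most_common]

print(most_common)    # "12"
print(len(filtered))  # 2 traces kept, the outlier is dropped
```

The assumption is that independent reasoning paths converging on the same answer are more likely to be correct, so the outlier trace is treated as noise.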
2. Step-by-Step Distillation (Hsieh et al., 2023):
```python
class StepByStepDistillation:
    """
    An improved method: distill each reasoning step separately.

    Idea: break the reasoning into atomic steps and
    teach the student every step individually.
    """

    def __init__(self, teacher_model, student_model, label_model=None):
        self.teacher = teacher_model
        self.student = student_model
        # Optional smaller model for initial labeling
        self.label_model = label_model or teacher_model

    def generate_stepwise_labels(self, question: str, answer: str) -> dict:
        """Generate step-by-step reasoning anchored to the known final answer."""
        prompt = f"""Given the question and correct answer, generate step-by-step reasoning.
Question: {question}
Correct Answer: {answer}
Generate clear reasoning steps that lead to this answer:
Step 1:"""
        response = self.teacher.generate(prompt, max_tokens=512)
        # Parse into individual steps
        steps = self.parse_steps(response)
        return {
            "question": question,
            "steps": steps,
            "final_answer": answer
        }

    def parse_steps(self, reasoning: str) -> list[dict]:
        """Parse free-form reasoning into structured steps."""
        steps = []
        current_step = {"number": 0, "text": "", "operation": None, "result": None}
        for line in reasoning.split('\n'):
            if line.startswith("Step"):
                if current_step["text"]:
                    steps.append(current_step)
                step_num = int(line.split(":")[0].replace("Step", "").strip())
                current_step = {
                    "number": step_num,
                    "text": line.split(":", 1)[1].strip() if ":" in line else "",
                    "operation": self.extract_operation(line),
                    "result": self.extract_result(line)
                }
            else:
                current_step["text"] += " " + line.strip()
        if current_step["text"]:
            steps.append(current_step)
        return steps

    def train_with_rationale_loss(self, dataset: list[dict]):
        """
        Multi-task training:
        1. Predict the next step given the previous steps.
        2. Predict the final answer.
        """
        for item in dataset:
            # Step prediction loss, accumulated over all steps
            step_loss = 0.0
            for i, step in enumerate(item["steps"][1:], 1):
                prefix = self.format_prefix(item["question"], item["steps"][:i])
                step_loss = step_loss + self.student.compute_lm_loss(prefix, step["text"])
            # Answer prediction loss
            full_reasoning = self.format_full_reasoning(item)
            answer_loss = self.student.compute_lm_loss(
                full_reasoning,
                f"The answer is: {item['final_answer']}"
            )
            # Combined loss
            total_loss = step_loss + answer_loss
            total_loss.backward()
```
3. Self-Taught Reasoner (STaR):
```python
class STaRDistillation:
    """
    Self-Taught Reasoner (Zelikman et al., 2022).

    Iterative process:
    1. The student attempts to reason.
    2. If the answer is correct, keep the reasoning.
    3. If not, rationalize (generate reasoning with the answer as a hint).
    4. Fine-tune on the successful traces.
    5. Repeat.
    """

    def __init__(self, model, max_iterations: int = 5):
        self.model = model
        self.max_iterations = max_iterations

    def star_iteration(self, training_data: list[dict]) -> list[dict]:
        """One STaR iteration."""
        successful_traces = []
        for item in training_data:
            question = item["question"]
            gold_answer = item["answer"]
            # 1. Try direct reasoning
            trace = self.model.generate_reasoning(question)
            predicted = self.extract_answer(trace)
            if predicted == gold_answer:
                # Success: keep this trace
                successful_traces.append({
                    "question": question,
                    "reasoning": trace,
                    "answer": gold_answer
                })
            else:
                # 2. Rationalization: generate with a hint
                rationalized = self.rationalize(question, gold_answer)
                successful_traces.append({
                    "question": question,
                    "reasoning": rationalized,
                    "answer": gold_answer
                })
        return successful_traces

    def rationalize(self, question: str, correct_answer: str) -> str:
        """
        Generate reasoning while already knowing the correct answer.

        This is "cheating", but it lets the model learn correct
        reasoning patterns on problems it cannot yet solve.
        """
        prompt = f"""Given the question and correct answer, generate the reasoning.
Question: {question}
Correct Answer: {correct_answer}
Step-by-step reasoning that leads to this answer:"""
        return self.model.generate(prompt)

    def train(self, initial_data: list[dict]):
        """The full STaR training loop."""
        current_data = initial_data
        for iteration in range(self.max_iterations):
            print(f"STaR Iteration {iteration + 1}")
            # Generate traces
            traces = self.star_iteration(current_data)
            # Fine-tune on successful traces
            self.model.fine_tune(traces)
            # Evaluate
            accuracy = self.evaluate(current_data)
            print(f"Accuracy: {accuracy:.2%}")
            if accuracy > 0.95:
                break
```
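To make the success/rationalize branching concrete, here is a toy walk-through of one STaR iteration with a stubbed "model" (the stub functions and questions are invented for illustration; a real run would call an actual LLM):

```python
def stub_reasoning(question):
    # Pretend the model only solves the addition question correctly.
    if "2 + 2" in question:
        return "2 + 2 = 4. The answer is 4"
    return "The answer is 7"

def stub_rationalize(question, answer):
    # Rationalization: the gold answer is given as a hint.
    return f"Working backwards from {answer}: ... The answer is {answer}"

def extract_answer(trace):
    return trace.rsplit("The answer is", 1)[-1].strip()

data = [{"question": "What is 2 + 2?", "answer": "4"},
        {"question": "What is 3 * 3?", "answer": "9"}]

traces = []
for item in data:
    trace = stub_reasoning(item["question"])
    if extract_answer(trace) == item["answer"]:
        traces.append({"question": item["question"], "reasoning": trace})
    else:  # wrong answer: rationalize with the gold answer as a hint
        traces.append({"question": item["question"],
                       "reasoning": stub_rationalize(item["question"], item["answer"])})

print(len(traces))  # 2: every question yields a usable training trace
```

The point of rationalization is visible here: even the question the model failed still contributes a correct-looking trace to the next fine-tuning round.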
4. Orca-style Progressive Learning:
```python
class OrcaDistillation:
    """
    Orca: progressive learning from complex explanation traces
    (Mukherjee et al., Microsoft, 2023).

    Key ideas:
    1. System prompts that elicit different kinds of explanations.
    2. Progressive curriculum: simple tasks first, then complex ones.
    3. Explanation tuning instead of answer tuning.
    """

    EXPLANATION_PROMPTS = {
        "step_by_step": """You are an AI assistant. Think step by step
and show all your reasoning before giving the final answer.""",
        "explain_like_teacher": """You are a patient teacher.
Explain your reasoning in a way a student would understand.
Break down complex concepts.""",
        "critical_thinking": """You are a critical thinker.
Consider multiple perspectives, identify assumptions,
and evaluate the strength of arguments.""",
        "expert_analysis": """You are a domain expert.
Provide detailed technical analysis with precise terminology."""
    }

    def generate_diverse_explanations(self, question: str,
                                      teacher_model) -> list[dict]:
        """Generate explanations in several different styles."""
        explanations = []
        for prompt_type, system_prompt in self.EXPLANATION_PROMPTS.items():
            response = teacher_model.generate(
                prompt=question,
                system_prompt=system_prompt,
                temperature=0.7
            )
            explanations.append({
                "question": question,
                "explanation_type": prompt_type,
                "response": response
            })
        return explanations

    def progressive_curriculum(self, data: list[dict]) -> list[list[dict]]:
        """
        Split the data into difficulty levels.

        Curriculum:
        1. Simple single-step reasoning
        2. Multi-step with 2-3 steps
        3. Complex multi-hop reasoning
        4. Creative/abstract reasoning
        """
        levels = {"simple": [], "medium": [], "complex": [], "advanced": []}
        for item in data:
            complexity = self.estimate_complexity(item["question"])
            levels[complexity].append(item)
        return [levels["simple"], levels["medium"],
                levels["complex"], levels["advanced"]]

    def estimate_complexity(self, question: str) -> str:
        """Estimate question difficulty (simplified length/keyword heuristic)."""
        word_count = len(question.split())
        keywords = ["compare", "analyze", "evaluate", "explain why"]
        if word_count < 20 and not any(k in question.lower() for k in keywords):
            return "simple"
        elif word_count < 50:
            return "medium"
        elif word_count < 100:
            return "complex"
        else:
            return "advanced"
```
Reasoning Verification
Checking the quality of the distilled reasoning:
```python
class ReasoningVerifier:
    """
    Verifies reasoning quality in a distilled model.

    Three aspects:
    1. Correctness  - is the answer right?
    2. Faithfulness - does the reasoning actually lead to the answer?
    3. Coherence    - is the reasoning logically connected?
    """

    def __init__(self, verifier_model=None):
        self.verifier = verifier_model

    def verify_correctness(self, prediction: str, gold: str) -> bool:
        """Simple answer comparison."""
        pred_answer = self.extract_answer(prediction)
        gold_answer = self.extract_answer(gold)
        return self.normalize(pred_answer) == self.normalize(gold_answer)

    def verify_faithfulness(self, reasoning: str, answer: str) -> float:
        """
        Does the reasoning really lead to the answer?

        Intuition: compare P(answer | reasoning) against
        P(answer | question only); approximated here by an LLM-judge score.
        """
        if self.verifier is None:
            return self.heuristic_faithfulness(reasoning, answer)
        prompt = f"""Given this reasoning, does it logically lead to the stated answer?
Reasoning: {reasoning}
Stated Answer: {answer}
Rate faithfulness from 0 to 1:"""
        score = self.verifier.generate(prompt)
        return float(score)

    def verify_coherence(self, reasoning: str) -> float:
        """
        Is the reasoning logically connected?

        Checks that:
        - each step makes sense
        - consecutive steps are logically linked
        - there are no contradictions
        """
        steps = self.parse_steps(reasoning)
        if len(steps) < 2:
            return 1.0  # a single step is trivially coherent
        coherence_scores = []
        for i in range(1, len(steps)):
            prev_step, curr_step = steps[i - 1], steps[i]
            # Check whether the current step follows from the previous one
            prompt = f"""Does step {i + 1} logically follow from step {i}?
Step {i}: {prev_step}
Step {i + 1}: {curr_step}
Rate logical connection (0-1):"""
            score = self.verifier.generate(prompt)
            coherence_scores.append(float(score))
        return sum(coherence_scores) / len(coherence_scores)

    def comprehensive_evaluation(self, model, test_data: list[dict]) -> dict:
        """Full evaluation of reasoning quality."""
        results = {"correctness": [], "faithfulness": [],
                   "coherence": [], "avg_steps": []}
        for item in test_data:
            prediction = model.generate(item["question"])
            results["correctness"].append(
                self.verify_correctness(prediction, item["answer"]))
            results["faithfulness"].append(
                self.verify_faithfulness(prediction, item["answer"]))
            results["coherence"].append(self.verify_coherence(prediction))
            results["avg_steps"].append(len(self.parse_steps(prediction)))
        return {
            "accuracy": sum(results["correctness"]) / len(results["correctness"]),
            "avg_faithfulness": sum(results["faithfulness"]) / len(results["faithfulness"]),
            "avg_coherence": sum(results["coherence"]) / len(results["coherence"]),
            "avg_reasoning_steps": sum(results["avg_steps"]) / len(results["avg_steps"])
        }
```
Contrastive Reasoning Distillation
Teaching the model to tell good reasoning from bad:
```python
import torch


class ContrastiveReasoningDistillation:
    """
    Contrastive learning for reasoning.

    Idea: show both correct and incorrect reasoning,
    and let the student learn to distinguish them.
    """

    def generate_contrastive_pairs(self, question: str,
                                   correct_answer: str,
                                   teacher_model) -> dict:
        """Generate a positive and a negative example."""
        # Positive: correct reasoning
        positive = self.generate_correct_reasoning(
            teacher_model, question, correct_answer)
        # Negative: plausible but incorrect reasoning
        negative = self.generate_incorrect_reasoning(
            teacher_model, question, correct_answer)
        return {
            "question": question,
            "positive": positive,
            "negative": negative,
            "correct_answer": correct_answer
        }

    def generate_incorrect_reasoning(self, model, question: str,
                                     correct_answer: str) -> str:
        """
        Generate plausible but incorrect reasoning.

        Error types:
        1. Arithmetic error
        2. Wrong operation
        3. Missing step
        4. Wrong interpretation
        """
        prompt = f"""Generate a plausible but INCORRECT reasoning for this problem.
The reasoning should look convincing but have a subtle error.
Problem: {question}
(The correct answer is {correct_answer}, but generate reasoning that leads to a DIFFERENT answer)
Incorrect reasoning:"""
        return model.generate(prompt)

    def contrastive_loss(self, student, positive: str, negative: str,
                         margin: float = 1.0) -> torch.Tensor:
        """
        Margin loss: push good and bad reasoning apart so that
        score(positive) - score(negative) > margin.
        """
        pos_score = student.score_reasoning(positive)
        neg_score = student.score_reasoning(negative)
        return torch.relu(margin - (pos_score - neg_score))

    def train_with_contrastive(self, student, contrastive_data: list[dict]):
        """Training loop with contrastive learning."""
        for item in contrastive_data:
            # Standard LM loss on the positive example
            lm_loss = student.compute_lm_loss(item["question"], item["positive"])
            # Contrastive loss
            contrast_loss = self.contrastive_loss(
                student, item["positive"], item["negative"])
            # Combined
            total_loss = lm_loss + 0.1 * contrast_loss
            total_loss.backward()
```
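The margin loss is easy to verify numerically with hand-picked scores (a sketch; in the class above the scores would come from `student.score_reasoning`):

```python
import torch

def contrastive_loss(pos_score, neg_score, margin=1.0):
    # Zero loss once the gap exceeds the margin; linear penalty otherwise.
    return torch.relu(margin - (pos_score - neg_score))

# Good reasoning already scores 2.0 higher than bad: gap > margin, loss is zero.
print(contrastive_loss(torch.tensor(3.0), torch.tensor(1.0)).item())  # 0.0
# Scores too close (gap 0.2 < margin 1.0): loss pushes them apart.
print(f"{contrastive_loss(torch.tensor(1.2), torch.tensor(1.0)).item():.1f}")  # 0.8
```

Note the loss is flat (zero gradient) once the margin is satisfied, which is why it is combined with the standard LM loss rather than used alone.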
Specialized Domain Distillation
Reasoning distillation for specific domains:
```python
class DomainSpecificDistillation:
    """
    Distillation for concrete domains.

    Each domain has its own reasoning patterns:
    - Math:    symbolic manipulation
    - Code:    execution traces
    - Legal:   precedent-based reasoning
    - Medical: differential diagnosis
    """

    DOMAIN_PROMPTS = {
        "math": """You are a mathematician. Solve using formal notation.
Show each algebraic step clearly. Verify your answer.""",
        "code": """You are a programmer. Think about:
1. Edge cases
2. Time/space complexity
3. Test with examples
Write clean, documented code.""",
        "legal": """You are a legal expert. Consider:
1. Relevant statutes and precedents
2. Arguments for both sides
3. Standard of proof
Cite sources.""",
        "medical": """You are a physician. For diagnosis:
1. List symptoms
2. Consider differential diagnoses
3. Recommend tests
4. Explain reasoning
Never give medical advice."""
    }

    def domain_specific_dataset(self, domain: str,
                                questions: list[str],
                                teacher_model) -> list[dict]:
        """Generate domain-specific reasoning traces."""
        system_prompt = self.DOMAIN_PROMPTS[domain]
        dataset = []
        for question in questions:
            response = teacher_model.generate(
                prompt=question,
                system_prompt=system_prompt,
                temperature=0.3  # lower for consistency
            )
            dataset.append({
                "domain": domain,
                "question": question,
                "reasoning": response,
                "answer": self.extract_answer(response)
            })
        return dataset


class MathReasoningDistillation(DomainSpecificDistillation):
    """Specialized distillation for mathematics."""

    def generate_math_trace(self, problem: str, teacher) -> dict:
        """Generate a math reasoning trace with verification."""
        # Generate a solution
        solution = teacher.generate(
            f"Solve step by step: {problem}",
            system_prompt=self.DOMAIN_PROMPTS["math"]
        )
        # Verify with symbolic computation where possible
        verification = self.verify_math(problem, solution)
        return {
            "problem": problem,
            "solution": solution,
            "verified": verification["correct"],
            "verification_details": verification
        }

    def verify_math(self, problem: str, solution: str) -> dict:
        """Verify a mathematical solution."""
        # Could use SymPy, the WolframAlpha API, etc.
        return {"correct": True, "method": "symbolic"}


class CodeReasoningDistillation(DomainSpecificDistillation):
    """Specialized distillation for code."""

    def generate_code_trace(self, problem: str, teacher) -> dict:
        """Generate a code reasoning trace with execution."""
        # Generate a solution
        solution = teacher.generate(
            f"Write code to solve: {problem}",
            system_prompt=self.DOMAIN_PROMPTS["code"]
        )
        # Extract the code and execute it in a sandbox
        code = self.extract_code(solution)
        execution = self.execute_safely(code)
        return {
            "problem": problem,
            "solution": solution,
            "code": code,
            "execution_result": execution,
            "verified": execution["success"]
        }
```
Benchmark Results
Comparing methods across benchmarks:
| Method | Model Size | GSM8K | MATH | ARC | StrategyQA |
|--------|-----------|-------|------|-----|------------|
| Baseline 7B | 7B | 11% | 4% | 52% | 61% |
| CoT Distillation | 7B | 48% | 18% | 68% | 72% |
| Step-by-Step | 7B | 53% | 22% | 71% | 75% |
| STaR | 7B | 56% | 24% | 73% | 76% |
| Orca-style | 13B | 62% | 29% | 78% | 81% |
| GPT-4 (teacher) | ~1.7T | 92% | 42% | 95% | 89% |
Key insight: a 7B model can reach 50-70% of GPT-4's performance on reasoning tasks.
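That headline number follows directly from the GSM8K column of the table above:

```python
teacher = 0.92  # GPT-4 on GSM8K
students = {"CoT Distillation": 0.48, "Step-by-Step": 0.53, "STaR": 0.56}

ratios = {name: acc / teacher for name, acc in students.items()}
for name, r in ratios.items():
    print(f"{name}: {r:.0%} of teacher")
# CoT Distillation: 52% of teacher
# Step-by-Step: 58% of teacher
# STaR: 61% of teacher
```

The harder MATH benchmark lands in the same band (24/42 for STaR is about 57%), so the 50-70% range holds across tasks rather than being a single-benchmark artifact.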
A Practical Pipeline
An end-to-end reasoning distillation pipeline:
```python
from tqdm import tqdm


class ReasoningDistillationPipeline:
    """A complete pipeline for reasoning distillation."""

    def __init__(self,
                 teacher_api: str,  # "openai" or "anthropic"
                 student_model: str = "mistral-7b",
                 domain: str = "general"):
        self.teacher = self.init_teacher(teacher_api)
        self.student = self.init_student(student_model)
        self.domain = domain

    def run(self,
            questions: list[str],
            output_dir: str,
            num_epochs: int = 3):
        """Run the full pipeline."""
        # 1. Generate reasoning traces
        print("Generating reasoning traces...")
        traces = self.generate_traces(questions)
        # 2. Filter and clean
        print("Filtering traces...")
        cleaned = self.filter_traces(traces)
        # 3. Progressive training
        print("Training student...")
        self.progressive_train(cleaned, num_epochs)
        # 4. Evaluate
        print("Evaluating...")
        results = self.evaluate()
        # 5. Save
        self.save(output_dir, results)
        return results

    def generate_traces(self, questions: list[str]) -> list[dict]:
        """Step 1: generate traces from the teacher."""
        traces = []
        for q in tqdm(questions):
            for _ in range(3):  # multiple samples per question
                trace = self.teacher.generate_reasoning(q)
                traces.append({"question": q, "trace": trace})
        return traces

    def filter_traces(self, traces: list[dict]) -> list[dict]:
        """Step 2: quality filtering."""
        filtered = []
        for trace in traces:
            # Check the answer format
            if not self.check_answer_format(trace["trace"]):
                continue
            # Check reasoning depth
            if self.count_steps(trace["trace"]) < 2:
                continue
            # Basic hallucination check
            if self.detect_hallucination(trace["trace"]):
                continue
            filtered.append(trace)
        return filtered

    def progressive_train(self, data: list[dict], epochs: int):
        """Step 3: progressive curriculum training."""
        # Sort by complexity (trace length as a proxy)
        sorted_data = sorted(data, key=lambda x: len(x["trace"]))
        # Split into levels
        n = len(sorted_data)
        levels = [
            sorted_data[:n // 3],            # easy
            sorted_data[n // 3:2 * n // 3],  # medium
            sorted_data[2 * n // 3:]         # hard
        ]
        for level_idx, level_data in enumerate(levels):
            print(f"Training on level {level_idx + 1}...")
            for epoch in range(epochs):
                self.train_epoch(level_data)

    def evaluate(self) -> dict:
        """Step 4: comprehensive evaluation."""
        # Load test sets (avoid shadowing the stdlib math module)
        gsm8k = self.load_benchmark("gsm8k")
        math_bench = self.load_benchmark("math")
        return {
            "gsm8k_accuracy": self.eval_on_benchmark(gsm8k),
            "math_accuracy": self.eval_on_benchmark(math_bench),
            "avg_reasoning_steps": self.measure_reasoning_depth(),
            "faithfulness": self.measure_faithfulness()
        }
```
What Doesn't Distill
Limitations of reasoning distillation:
```python
class DistillationLimitations:
    """What CANNOT be transferred effectively through distillation."""

    LIMITATIONS = {
        "world_knowledge": """
            Factual knowledge requires parameters.
            You can distill reasoning patterns;
            you cannot distill "who wrote Hamlet".
        """,
        "context_length": """
            An architectural constraint.
            If the teacher has a 128K context and the student 4K,
            long-context reasoning does not distill.
        """,
        "very_complex_reasoning": """
            Reasoning with 10+ steps is still a problem.
            Errors accumulate, and a smaller model has
            less headroom for error correction.
        """,
        "creative_novel_reasoning": """
            Genuinely new approaches to problems.
            The student can imitate patterns, not invent new ones.
        """,
        "meta_reasoning": """
            Reasoning about reasoning.
            "Why is this approach better?" is hard to distill.
        """
    }

    @staticmethod
    def what_distills_well():
        return [
            "Arithmetic reasoning patterns",
            "Logical inference templates",
            "Step decomposition strategies",
            "Verification habits",
            "Common problem-solving heuristics"
        ]
```
Research Ideas
For a bachelor's thesis:
- Distill GPT-4 reasoning on GSM8K and compare against baseline fine-tuning
- Analyze which types of problems distillation works best on
- Visualize reasoning quality before and after distillation
For a master's thesis:
- Multi-step reasoning distillation with verification
- Domain-specific distillation (code, legal, medical)
- Contrastive reasoning distillation
- Comparison of different teacher models (GPT-4 vs Claude vs Gemini)
For a PhD:
- Theoretical bounds on reasoning transfer
- What makes reasoning "distillable"?
- Novel distillation objectives beyond imitation
- Compositional reasoning distillation
- Self-improving reasoning via iterative distillation
Why This Matters in Practice
GPT-4 costs $0.06 per 1K input tokens. For a production service with millions of requests, that is tens of thousands of dollars per day. A self-hosted fine-tuned 7B model runs at roughly $0.001 per 1K tokens.
Reasoning distillation lets you:
- Reach 50-70% of GPT-4's reasoning performance
- At 1-2% of the cost
- With the option of self-hosting (privacy, control)
- Without depending on an external API
This is not a quality compromise. It is the practical economics of AI deployment.
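The cost claim above is simple back-of-the-envelope arithmetic using the article's per-token prices (the request volume and tokens-per-request are illustrative assumptions):

```python
gpt4_per_1k = 0.06        # $/1K tokens (article's GPT-4 figure)
selfhosted_per_1k = 0.001  # $/1K tokens (article's self-hosted 7B figure)

# Assumed workload: 1M requests/day, ~500 tokens each
daily_tokens = 1_000_000 * 500

gpt4_daily = daily_tokens / 1000 * gpt4_per_1k
selfhosted_daily = daily_tokens / 1000 * selfhosted_per_1k

print(f"GPT-4:       ${gpt4_daily:,.0f}/day")        # $30,000/day
print(f"Self-hosted: ${selfhosted_daily:,.0f}/day")  # $500/day
print(f"Cost ratio:  {selfhosted_daily / gpt4_daily:.1%}")  # 1.7%
```

The ratio lands inside the "1-2% of the cost" range quoted above, independent of the assumed request volume, since it is just the ratio of the two per-token prices.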
If you are preparing academic work on reasoning distillation, from a term paper to a dissertation, the SKP-Degree team at skp-degree.com.ua can help with the research and implementation. Write to us on Telegram: @kursovi_diplomy. We have hands-on experience with Orca, WizardMath, and our own distillation pipelines.
Keywords: reasoning distillation, knowledge distillation, chain-of-thought, CoT, STaR, Orca, small models, fine-tuning, LLM efficiency, edge deployment, academic research, thesis, master's thesis, term paper.