Deepseek prover task #733

plaguss · 2024-06-14T12:12:14Z

Description

⚠️ WIP

This PR implements tasks to replicate the paper: DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data.

Note:
The prompts differ with the original implementation as most of prompt formatting is done in via the system prompt. This yielded better results while trying with Llama3 70B.

Examples

Base Example:

from distilabel.steps.tasks import DeepSeekProverAutoFormalization
from distilabel.llms.huggingface import InferenceEndpointsLLM

prover_autoformal = DeepSeekProverAutoFormalization(
    llm=InferenceEndpointsLLM(
        model_id="deepseek-ai/deepseek-math-7b-instruct",
        tokenizer_id="deepseek-ai/deepseek-math-7b-instruct",
    ),
)

Few-shot setting:

prover_autoformal = DeepSeekProverAutoFormalization(
    llm=InferenceEndpointsLLM(
        model_id="deepseek-ai/deepseek-math-7b-instruct",
        tokenizer_id="deepseek-ai/deepseek-math-7b-instruct",
    ),
    examples=[
        "theorem amc12a_2019_p21 (z : ℂ) (h₀ : z = (1 + Complex.I) / Real.sqrt 2) :\n\n((∑ k : ℤ in Finset.Icc 1 12, z ^ k ^ 2) * (∑ k : ℤ in Finset.Icc 1 12, 1 / z ^ k ^ 2)) = 36 := by\n\nsorry",
        "theorem amc12a_2015_p10 (x y : ℤ) (h₀ : 0 < y) (h₁ : y < x) (h₂ : x + y + x * y = 80) : x = 26 := by\n\nsorry"
    ]
)

Scorer:

from distilabel.steps.tasks import DeepSeekProverScorer
from distilabel.llms.huggingface import InferenceEndpointsLLM

prover_scorer = DeepSeekProverAutoFormalization(
    llm=InferenceEndpointsLLM(
        model_id="deepseek-ai/deepseek-math-7b-instruct",
        tokenizer_id="deepseek-ai/deepseek-math-7b-instruct",
    ),
)

Pending tasks:

Add example in the paper section with a full pipeline (without training).

Closes #732

codspeed-hq · 2024-06-14T12:18:15Z

CodSpeed Performance Report

Merging #733 will not alter performance

_{Comparing deepseek-prover (1c2d7fc) with develop (9d6a152)}

Summary

✅ 1 untouched benchmarks

… examples

Add deepseek prover autoformalization task

6fc9acd

plaguss self-assigned this Jun 14, 2024

plaguss added the integrations label Jun 14, 2024

plaguss added this to the 1.3.0 milestone Jun 14, 2024

plaguss linked an issue Jun 14, 2024 that may be closed by this pull request

[IMPLEMENTATION] Implement DeepSeek-Prover #732

Open

plaguss marked this pull request as draft June 14, 2024 12:14

plaguss added 4 commits June 14, 2024 16:26

Add task for the scorer as a jinja template to make it easy to maintain

52ce32e

Add deepseek prover scorer task

febd720

Add tests for the scorer task

a3958c8

Redirect import

c758b25

plaguss requested review from gabrielmbmb and alvarobartt June 14, 2024 14:30

plaguss added 3 commits June 14, 2024 17:13

Create a folder for the deepseek-prover templates

d27bf1d

Make generator task more general including few shot examples

6481f41

Remove the few shot argument as we can determine by just checking for…

1c2d7fc

… examples

Base automatically changed from develop to main June 18, 2024 12:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deepseek prover task #733

Deepseek prover task #733

plaguss commented Jun 14, 2024 •

edited

Loading

codspeed-hq bot commented Jun 14, 2024 •

edited

Loading

Deepseek prover task #733

Are you sure you want to change the base?

Deepseek prover task #733

Conversation

plaguss commented Jun 14, 2024 • edited Loading

Description

codspeed-hq bot commented Jun 14, 2024 • edited Loading

CodSpeed Performance Report

Merging #733 will not alter performance

Summary

plaguss commented Jun 14, 2024 •

edited

Loading

codspeed-hq bot commented Jun 14, 2024 •

edited

Loading