Commit 3772e47

Fixing typos in core documentation (#61)
1 parent 0776768 commit 3772e47

4 files changed: +4 −4 lines changed

tinker_cookbook/recipes/chat_sl/README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -37,5 +37,5 @@ Performance can be further improved by training longer with a higher `lora_rank`
 
 The base classes in [tinker_cookbook/supervised/data.py](../../supervised/data.py) support loading new data in the following way:
 - `SupervisedDatasetFromHFDataset` loads dataset on huggingface hub with a postprocessing function
-- `StreamingSupervisedDatasetFromHFDataset` works simiarly, but supports streaming
+- `StreamingSupervisedDatasetFromHFDataset` works similarly, but supports streaming
 - `FromConversationFileBuilder` supports data loading from a JSONL file
```

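To make the JSONL path concrete, here is a minimal sketch of loading one conversation per line; the `messages` key and the role/content message layout are assumptions about the file format, not the schema documented for `FromConversationFileBuilder`.

```python
import json

# Hypothetical sketch: the exact JSONL schema that FromConversationFileBuilder
# expects is an assumption here. We assume one conversation per line, stored
# under a "messages" key as a list of {"role": ..., "content": ...} dicts.
def load_conversations(path: str) -> list[list[dict]]:
    conversations = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            conversations.append(record["messages"])
    return conversations
```
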
tinker_cookbook/recipes/distillation/README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -8,7 +8,7 @@ Specifically, we provide the scripts needed to reproduce our experiments from th
 
 ## Distillation for reasoning
 
-Our results can be reproducing by running:
+Our results can be reproduced by running:
 1. Supervised finetuning on [OpenThoughts3](https://huggingface.co/datasets/open-thoughts/OpenThoughts3-1.2M)
 2. On-policy distillation on [DeepMath](https://huggingface.co/datasets/zwhe99/DeepMath-103K)
```

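As a rough illustration of step 2, here is a minimal sketch of the per-token reverse-KL objective that on-policy distillation typically minimizes on student-sampled tokens; that the recipe uses exactly this loss is an assumption, and `reverse_kl` is a hypothetical helper, not part of tinker_cookbook.

```python
import math

# Minimal sketch of a per-token on-policy distillation objective: reverse KL
# between the student's and teacher's next-token distributions, evaluated on
# tokens the student itself sampled. That the recipe's loss matches this
# exactly is an assumption based on the general technique.
def reverse_kl(student_logprobs: list[float], teacher_logprobs: list[float]) -> float:
    """KL(student || teacher) at one position, given full-vocab log-probs."""
    return sum(
        math.exp(s) * (s - t)
        for s, t in zip(student_logprobs, teacher_logprobs)
    )
```
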
tinker_cookbook/recipes/math_rl/README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,4 +1,4 @@
-# Using Reinforcement Learning to Solve Math Prolems
+# Using Reinforcement Learning to Solve Math Problems
 
 Math problems have been the most active testbed for RL with LLMs. This recipe collects environments and grading functions that allows you to test on several popular math datasets.
```

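For intuition, a grading function of the kind this recipe collects might look like the following sketch; `grade_answer` and the `\boxed{}` answer convention are illustrative assumptions, not the recipe's actual grader.

```python
import re

# Hypothetical grader: exact string match on the last \boxed{...} answer in
# the completion. Real graders are presumably more careful (e.g. about
# mathematically equivalent forms); this only illustrates the interface of
# mapping (completion, reference) to a scalar reward.
def grade_answer(completion: str, reference: str) -> float:
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    if not matches:
        return 0.0  # no final answer to grade
    return 1.0 if matches[-1].strip() == reference.strip() else 0.0
```
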
tinker_cookbook/recipes/preference/README.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 # Learning from Preferences
 
-Many applications involve learnin from preferences beyond scalar rewards. We provide a few examples here:
+Many applications involve learning from preferences beyond scalar rewards. We provide a few examples here:
 
 1. [Shorter](./shorter/): we introduce the `PairwisePreferenceRLDatasetBuilder` abstraction and walk through a simple example that trains a model to generate shorter responses.
 2. [RLHF](./rlhf/): we walk through the standard RLHF pipeline from [1, 2]. This pipeline involves three stages: supervised fine-tuning, reward model learning, and reinforcement learning.
```

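As a pointer to what the reward-model stage optimizes, here is a sketch of the standard Bradley-Terry pairwise loss; `pairwise_preference_loss` is a hypothetical name, and this is the textbook objective rather than code taken from the recipe.

```python
import math

# Sketch of the Bradley-Terry loss used to fit a reward model to pairwise
# preferences (the second stage of the RLHF pipeline above). This is the
# textbook objective, not code from the recipe.
def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """-log sigmoid(r_chosen - r_rejected): small when chosen wins by a margin."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) = log(1 + exp(-m)), written via log1p
    return math.log1p(math.exp(-margin))
```
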