Commit

Add doc comment to type
hendrikvanantwerpen committed Oct 10, 2024
1 parent 851a559 commit ec07a42
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions crates/bpe-openai/src/lib.rs
@@ -42,6 +42,12 @@ static BPE_O200K: LazyLock<Tokenizer> = LazyLock::new(|| {

pub use bpe::*;

/// A byte-pair encoding tokenizer that supports a pre-tokenization regex.
/// The direct methods on this type pre-tokenize the input text and should
/// produce the same output as the tiktoken tokenizers. The type gives access
/// to the regex and underlying byte-pair encoding if needed. Note that using
/// the byte-pair encoding directly does not take the regex into account and
/// may result in output that differs from tiktoken.
pub struct Tokenizer {
    /// The byte-pair encoding for this tokenizer.
    pub bpe: BytePairEncoding,
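
The doc comment's warning, that encoding without the pre-tokenization step may give different output, can be illustrated with a small self-contained sketch. Nothing here is the `bpe-openai` API: `encode_direct`, `pretokenize`, and `encode_pretokenized` are hypothetical names, the encoder is a greedy longest-match stand-in rather than real byte-pair encoding, and the splitter is a hand-written approximation of a regex such as ` ?\S+` rather than the pattern tiktoken uses.

```rust
/// Toy stand-in for a byte-pair encoder (NOT the crate's algorithm):
/// greedy longest match against `vocab`, falling back to one character.
fn encode_direct<'a>(vocab: &[&str], text: &'a str) -> Vec<&'a str> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < text.len() {
        let rest = &text[pos..];
        // Length of the longest vocabulary entry that prefixes the remaining text.
        let len = vocab
            .iter()
            .filter(|t| !t.is_empty() && rest.starts_with(**t))
            .map(|t| t.len())
            .max()
            // No entry matches: emit the next character on its own.
            .unwrap_or_else(|| rest.chars().next().map_or(1, char::len_utf8));
        out.push(&rest[..len]);
        pos += len;
    }
    out
}

/// Hypothetical pre-tokenizer: starts a new piece at every space that
/// follows a non-space character, so each piece keeps its leading space.
fn pretokenize(text: &str) -> Vec<&str> {
    let mut pieces = Vec::new();
    let mut start = 0;
    let mut prev_is_space = true;
    for (i, c) in text.char_indices() {
        let is_space = c == ' ';
        if is_space && !prev_is_space {
            pieces.push(&text[start..i]);
            start = i;
        }
        prev_is_space = is_space;
    }
    if start < text.len() {
        pieces.push(&text[start..]);
    }
    pieces
}

/// Encodes each pre-tokenized piece separately, so no token can span
/// a piece boundary.
fn encode_pretokenized<'a>(vocab: &[&str], text: &'a str) -> Vec<&'a str> {
    pretokenize(text)
        .into_iter()
        .flat_map(|piece| encode_direct(vocab, piece))
        .collect()
}
```

With the toy vocabulary `["a", "b", " ", "b a"]` and the input `"ab ab"`, `encode_direct` produces `["a", "b a", "b"]`, while `encode_pretokenized` produces `["a", "b", " ", "a", "b"]`. The token `"b a"` crosses the boundary between the pieces `"ab"` and `" ab"`, so it can only appear when the splitting step is skipped.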