Skip to content

Latest commit

 

History

History
205 lines (158 loc) · 5.44 KB

structures.md

File metadata and controls

205 lines (158 loc) · 5.44 KB

Structures

If you want to define the shape of data during runtime, you can use Structure class.

Structures allow you to define and modify arbitrary shape of data to be extracted by LLM. Classes may not be the best fit for this purpose, as declaring or changing them during execution is not possible.

With structures, you can define custom data shapes dynamically, for example based on the user input or context of the processing, to specify the information you need LLM to infer from the provided text or chat messages.

Defining a shape of data

Use Structure::define() to define the structure and pass it to Instructor as response model.

If Structure instance has been provided as a response model, Instructor returns an array in the shape you defined.

Structure::define() accepts array of Field objects.

Let's first define the structure, which is a shape of the data we want to extract from the message.

<?php
use Cognesy\Instructor\Extras\Structure\Field;
use Cognesy\Instructor\Extras\Structure\Structure;

enum Role : string {
    case Manager = 'manager';
    case Line = 'line';
}

$structure = Structure::define('person', [
    Field::string('name'),
    Field::int('age'),
    Field::enum('role', Role::class),
]);
?>

Following types of fields are currently supported:

  • Field::bool() - boolean value
  • Field::int() - int value
  • Field::string() - string value
  • Field::float() - float value
  • Field::enum() - enum value
  • Field::structure() - for nesting structures

Optional fields

Fields can be marked as optional with $field->optional(). By default, all defined fields are required.

<?php
$structure = Structure::define('person', [
    //...
    Field::int('age')->optional(),
    //...
]);
?>

Descriptions for guiding LLM inference

Instructor includes field descriptions in the content of instructions for LLM, so you can use them to provide explanations, detailed specifications or requirements for each field.

You can also provide extra inference instructions for LLM at the structure level with $structure->description(string $description)

<?php
$structure = Structure::define('person', [
    Field::string('name', 'Name of the person'),
    Field::int('age', 'Age of the person')->optional(),
    Field::enum('role', Role::class, 'Role of the person'),
], 'A person object');
?>

Nesting structures

You can use Field::structure() to nest structures in case you want to define more complex data shapes.

<?php
$structure = Structure::define('person', [
    Field::string('name','Name of the person'),
    Field::int('age', 'Age of the person')->validIf(
        fn($value) => $value > 0, "Age has to be positive number"
    ),
    Field::structure('address', [
        Field::string('street', 'Street name')->optional(),
        Field::string('city', 'City name'),
        Field::string('zip', 'Zip code')->optional(),
    ], 'Address of the person'),
    Field::enum('role', Role::class, 'Role of the person'),
], 'A person object');
?>

Validation of structure data

Instructor supports validation of structures.

You can define field validator with:

  • $field->validator(callable $validator) - $validator has to return an instance of ValidationResult
  • $field->validIf(callable $condition, string $message) - $condition has to return false if validation has not succeeded, $message with be provided to LLM as explanation for self-correction of the next extraction attempt

Let's add a simple field validation to the example above:

<?php
$structure = Structure::define('person', [
    // ...
    Field::int('age', 'Age of the person')->validIf(
        fn($value) => $value > 0, "Age has to be positive number"
    ),
    // ...
], 'A person object');
?>

Extracting data

Now, let's extract the data from the message.

<?php
use Cognesy\Instructor\Instructor;

$text = <<<TEXT
    Jane Doe lives in Springfield. She is 25 years old and works as a line worker. 
    McDonald's in Ney York is located at 456 Elm St, NYC, 12345.
    TEXT;

$person = (new Instructor)->respond(
    messages: $text,
    responseModel: $structure,
);

dump($person->toArray());
// array [
//   "name" => "Jane Doe"
//   "age" => 25
//   "address" => array [
//     "city" => "Springfield"
//   ]
//   "role" => "line"
// ]
?>

Working with Structure objects

Structure object properties can be accessed using get() and set() methods, but also directly as properties.

<?php
$person = Structure::define('person', [
    Field::string('name'),
    Field::int('age'),
    Field::structure('role', [
        Field::string('name'),
        Field::int('level'),
    ])
]);

// Setting properties via set()
$person->set('name', 'John Doe');
$person->set('age', 30);
$person->get('role')->set('name', 'Manager');
$person->get('role')->set('level', 1);

// Setting properties directly 
$person->name = 'John Doe';
$person->age = 30;
$person->role->name = 'Manager';
$person->role->level = 1;

// Getting properties via get()
$name = $person->get('name');
$age = $person->get('age');
$role = $person->get('role')->get('name');
$level = $person->get('role')->get('level');

// Getting properties directly
$name = $person->name;
$age = $person->age;
$role = $person->role->name;
$level = $person->role->level;
?>