AutoFeedback

The AutoFeedback feature in the Log10 python package (version 0.6.7) leverages AI to automatically generate feedback based on your existing feedback.

The example showed in this doc is a feedback task for assessing summarization works based on TLDR feedback rubrics. It evaluates a summary on four axes: overall, accuracy, coverage, and coherence. You can customize for your own feedback task. Here's a demo video (opens in a new tab).

For more information on advanced AutoFeedback models using synthetic data and fine-tuning, or for information on our hosted AutoFeedback options, please get in touch with us at [email protected]

Get a feedback task ID

To use the AutoFeedback feature, you need to identify the task for which you want to generate feedback. This task ID can be found via CLI or from dashboard. Please refer to the feedback docs and example code (opens in a new tab) to create tasks and feedback.

In the dashboard, go to Feedback, Task, then find the ID of a task from side panel: Feedback task screenshot

Using CLI:

log10 feedback-task list

$ export TASK_ID=<TASK ID FROM LIST OR DASHBOARD>

If you want to see the details of a task $TASK_ID from CLI:

log10 feedback-task get $TASK_ID

(optional) Show feedback with task ID

You can filter feedback by task ID using CLI.

log10 feedback list --task_id $TASK_ID

Example output, listing 5 feedback with a task id:

                                                                       Feedback
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ ID                                 ┃ Task Name                          ┃ Feedback                           ┃ Completion ID                       ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 5525bf49-a870-465a-b6b7-f92b1879b… │ Demo TLDR Dataset Summary grading  │ {"note": "The summary conveys the  │ 431ee175-3a37-436a-bd46-5f3c2b7976… │
│                                    │ w/ note                            │ main idea of the post.",           │                                     │
│                                    │                                    │ "overall": 7, "accuracy": 7,       │                                     │
│                                    │                                    │ "coverage": 7, "coherence": 7}     │                                     │
│ 9874ca7b-1c01-4785-8331-4d6ede97c… │ Demo TLDR Dataset Summary grading  │ {"note": "•explicit purpose        │ 447c3b69-6aea-4f4d-95e6-cbf37adca8… │
│                                    │ w/ note                            │ statement will make summary        │                                     │
│                                    │                                    │ perfect. ", "overall": 6,          │                                     │
│                                    │                                    │ "accuracy": 7, "coverage": 6,      │                                     │
│                                    │                                    │ "coherence": 7}                    │                                     │
│ ad7dd317-3a19-4633-bf80-31e26cd49… │ Demo TLDR Dataset Summary grading  │ {"note": "Missing details of       │ 1d2396bf-df44-4eaf-a98a-a7efc6035f… │
│                                    │ w/ note                            │ why.", "overall": 4, "accuracy":   │                                     │
│                                    │                                    │ 7, "coverage": 4, "coherence": 7}  │                                     │
│ 4b5edcb1-ca79-42a8-a0f5-bfad56646… │ Demo TLDR Dataset Summary grading  │ {"note": "Should mention they're   │ 264a3ca1-7bcc-4679-b8ad-8ad9d06970… │
│                                    │ w/ note                            │ male.", "overall": 6, "accuracy":  │                                     │
│                                    │                                    │ 7, "coverage": 6, "coherence": 7}  │                                     │
│ 62b4a794-22bd-454d-9563-b34f8eb11… │ Demo TLDR Dataset Summary grading  │ {"note": "•summary is sufficient   │ 32a5de77-2cff-409f-b20c-7a44b2bfde… │
│                                    │ w/ note                            │ to provide a gist of OP's          │                                     │
│                                    │                                    │ dilemma.\n\n•an explicit purpose   │                                     │
│                                    │                                    │ statement will make the summary    │                                     │
│                                    │                                    │ better. ", "overall": 6,           │                                     │
│                                    │                                    │ "accuracy": 7, "coverage": 6,      │                                     │
│                                    │                                    │ "coherence": 7}                    │                                     │
└────────────────────────────────────┴────────────────────────────────────┴────────────────────────────────────┴─────────────────────────────────────┘

Predict feedback

To automatically generate feedback, use the log10 feedback predict command and pass in the task ID. It samples the existing feedback (default is 5 random samples, but this can be adjusted with the num_samples option) as few shot examples in the prompt to make more prediction.

Pass --completion_id If you have been using log10 to log your completions, you can pass your completion_id to the predict function.

log10 feedback predict --task_id $TASK_ID --completion_id $COMPLETION_ID

Example output, where you can find the summary feedback in JSON format at the bottom:

[2024-03-12 11:26:48 - LOG10 - INFO] Getting 5 feedback for task 04405cbc-3420-4901-97b6-f8b7ed248205
[2024-03-12 11:26:52 - LOG10 - INFO] Sampled completion ids: ['addb2ce3-49b6-41ba-8252-9d4103005608', '447c3b69-6aea-4f4d-95e6-cbf37adca846', '2966d9b0-cb43-4a44-9f02-bd431efe5824', '98d12905-e479-4c18-96e3-c88d0dee38ba', 'd6a614cc-bf67-4211-a4d5-5c9bf5f6b479']
[2024-03-12 11:26:53 - LOG10 - INFO] text='{"prompt": "While we\'ve only been in a proper relationship for 2 months we\'ve been romantically involved one way or another for the past 6 months. Yesterday she said there was something she wanted to ask me but felt horrible asking it so never ended up asking me. Today I convinced her that it was best for our interests that she just comes out and says it. She was unable to say it over Skype so she just sent me a text, the text said \\"Sometimes I get slightly paranoid that like sometimes I think that you might like the idea of having a girlfriend or just having a girlfriend more than you actually like me as a person\\".\\n\\nFirst and foremost I love this girl so much, I\'d do anything for her and it hurt me a little that she thinks this. I tried to explain to her how much I like her and there\'s no one else in the world I\'d want to be with but no matter what I say she says the doubt is still in her mind.\\n\\nI have a feeling that she might be insecure about our relationship because she told me that someone had told her that I wanted to have sex with her when I first met her (this is from one of my friends crazy ex\'s and is not true). I really like her and I want to make this work but what ever I say she doesn\'t seem to listen to me, how can I make her believe that I think she is the most amazing person in the world and I want to be with her, not the idea of a relationship?", "response": "My girlfriend thinks I like the idea of a girlfriend more than her."}'
[2024-03-12 11:26:53 - LOG10 - INFO] Using predict llm_call: summary_feedback_llm_call
{
  "note": "The summary captures the essence of the situation but omits key details about the cause of the girlfriend's insecurity and the narrator's efforts to reassure her.",
  "axes": {
    "overall": "5",
    "accuracy": "6",
    "coverage": "4",
    "coherence": "7"
  }
}

Pass content directly by using --content or with a file using --file:

log10 feedback predict --task_id $TASK_ID --content "{\"prompt\": \"<your-prompt-here>\", \"response\": \"<model-response-here>\"}"
 
log10 feedback predict --task_id $TASK_ID --file content.json

Note that the content format is:

{
    "prompt": "<your-prompt-here>",
    "response": "<model-response-here>"
}

Predict CLI

$ log10 feedback predict --help
Usage: log10 feedback predict [OPTIONS]
 
Options:
  --task_id TEXT         Feedback task ID
  --content TEXT         Completion content
  -f, --file TEXT        File containing completion content
  --completion_id TEXT   Completion ID
  --num_samples INTEGER  Number of samples to use for few-shot learning

Custom AutoFeedback

In addition to the summarization feedback task, you can generate other types of AutoFeedback. All you need to do is simply define your prompt template to incorporate the feedback. If you want to access your logged completions on log10.io, you also need to define a function to flatten the messages.

Please refer to the functions defined for summary feedback:

define prompt template summary_feedback_llm_call (opens in a new tab) and init AutoFeedbackICL with it (opens in a new tab).

auto_feedback_icl = AutoFeedbackICL(task_id, num_samples=num_samples, predict_func=summary_feedback_llm_call)

flatten log10 completion messages flatten_messages (opens in a new tab)

from log10.completions.completions import _get_completion
 
completion = _get_completion(completion_id)
messages = flatten_messages(completion.json()["data"])

(optional) use predict SDK

fd = auto_feedback_icl.predict(text="<your-prompt-response-content>")
# or pass completion_id in log10.io
fd = auto_feedback_icl.predict(completion_id="<your-completion-id>")

Completion-derived Autofeedback (alpha)

For a limited time, AutoFeedback (normally Enterprise tier) is enabled for all accounts.

Log10 automatically generates feedback based on logged completions.

Creating a sample task for Autofeedback

A sample task with associated feedback can be generated from the onboarding screen. This will add the task and associated feedback to your organization.

The onboarding screen can be accessed from the Add Feedback menu on the Feedback page.

Clicking on Summary grading creates a summarization task with feedback already associated. To generate feedback, log completions with the tag log10/summary-grading to your organization. Summary onboarding task

Every completion that matches a task's tag selector will trigger feedback generation.

Task with tag selector

For this task, completions with the tag log10/summary-grading will trigger autofeedback generation.

Generated feedback can be accepted, rejected, or edited.

Accept or reject autogenerated feedback

The next piece of generated feedback will be displayed automatically after the current feedback is accepted or rejected.

This workflow is optimal for providing feedback for a large volume of completions.

Alternatively, generated feedback can be accepted or rejected directly from the inbox. Accept or reject feedback from the inbox

Generated feedback that is accepted or edited will be available in the feedback list and will be incorporated in future generations of feedback. Rejected feedback will be used as negative examples for future generations of feedback.

Accessing AutoFeedback programmatically

Generated AutoFeedback entries can be access programmatically via Log10's graphql API using the completion id. This can be useful when surfacing quality scoring in your own applications before a reviewer has approved them within the log10 platform. Here is an example function to fetch AutoFeedback:

from log10.feedback.autofeedback import get_autofeedback
import json
 
af_pred = get_autofeedback(completion_id)
print(af_pred)

Task definition Log matching