Summary

  • Default behavior: Human review scores are visible to all project members once saved. Multiple reviewers scoring the same row can see each other’s answers.
  • Multi-user review: When enabled, each reviewer gets independent score storage with automatic averaging, but reviewers can still see aggregated scores from others.

How human review works by default

When you configure human review scores in a project, all team members can see and edit the same score fields on experiment rows.
  • Visibility: Once a reviewer saves their score, it appears immediately in the experiment table for all other project members to see.
  • Overwriting: If multiple people score the same row, the last person to save overwrites the previous score.
  • No isolation: There is no mechanism to hide one reviewer’s input from another reviewer.
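The overwriting behavior described above can be pictured as a single shared slot per row and score field. The following is a conceptual sketch only, not the product's implementation or API: `shared_scores` and `save_score` are hypothetical names invented for illustration.

```python
# Conceptual model only (hypothetical names, not the product's API):
# by default there is one shared slot per (row, score name).
shared_scores: dict[tuple[str, str], float] = {}


def save_score(row_id: str, score_name: str, reviewer: str, value: float) -> None:
    """Save a score; any earlier value in the same field is replaced."""
    # The reviewer's identity does not affect where the value is stored,
    # so a later save from anyone overwrites the earlier one.
    shared_scores[(row_id, score_name)] = value


save_score("row-1", "accuracy", "alice", 0.8)
save_score("row-1", "accuracy", "bob", 0.4)  # overwrites Alice's score

print(shared_scores[("row-1", "accuracy")])  # 0.4
```

In this model, Alice's 0.8 is no longer recoverable from the score field after Bob saves, which is the situation multi-user review is meant to avoid.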

How multi-user review works

To use multi-user review, enable it in Settings > Feature Flags > Multi-user human review. Once the feature is enabled, the scoring behavior changes:
  • Independent storage: Each reviewer’s scores are stored separately as dedicated “review” spans attached to the parent span being reviewed.
  • No overwriting: Multiple reviewers can score the same row without overwriting each other’s work.
  • Automatic averaging: Scores from all reviewers are automatically averaged and displayed on the parent span in the experiment table.
  • Partial visibility: In the review UI, reviewers can see both their own individual scores and the aggregate scores across all reviewers.
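The storage model above can be sketched as one review record per reviewer attached to a parent span, with the parent showing an aggregate. This is an illustrative model under stated assumptions, not the product's data schema or API: the class and method names (`ReviewSpan`, `ParentSpan`, `save_review`, `aggregate`) are hypothetical, and the aggregate is assumed to be a plain arithmetic mean.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Optional


@dataclass
class ReviewSpan:
    """Hypothetical per-reviewer record attached to a parent span."""
    reviewer: str
    scores: dict[str, float] = field(default_factory=dict)


@dataclass
class ParentSpan:
    """Hypothetical span under review; holds one ReviewSpan per reviewer."""
    row_id: str
    reviews: dict[str, ReviewSpan] = field(default_factory=dict)

    def save_review(self, reviewer: str, score_name: str, value: float) -> None:
        # Each reviewer writes only to their own review span,
        # so saving never touches another reviewer's scores.
        span = self.reviews.setdefault(reviewer, ReviewSpan(reviewer))
        span.scores[score_name] = value

    def aggregate(self, score_name: str) -> Optional[float]:
        # Assumption: the displayed aggregate is an unweighted mean
        # over every reviewer who scored this field.
        values = [r.scores[score_name] for r in self.reviews.values()
                  if score_name in r.scores]
        return mean(values) if values else None


row = ParentSpan("row-1")
row.save_review("alice", "accuracy", 1.0)
row.save_review("bob", "accuracy", 0.5)

print(row.reviews["alice"].scores["accuracy"])  # 1.0 (Alice's score is kept)
print(row.aggregate("accuracy"))                # 0.75
```

Note that in this model the aggregate is still readable by every reviewer, which matches the partial visibility described above: individual scores are stored separately, but the review is not blind.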

Current limitations

Not blind review: In both modes, reviewers can see scores from other reviewers. There is no confidential or blind review option.