Default behavior: Human review scores are visible to all project members once saved. Multiple reviewers scoring the same row can see each other’s answers.
Multi-user review: When enabled, each reviewer gets independent score storage with automatic averaging, but reviewers can still see aggregated scores from others.
When you configure human review scores in a project, all team members can see and edit the same score fields on experiment rows.
Visibility: Once a reviewer saves their score, it appears immediately in the experiment table for all other project members to see.
Overwriting: If multiple people score the same row, the last person to save overwrites the previous score.
No isolation: There is no mechanism to hide one reviewer’s input from another reviewer.
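The overwriting behavior described above can be pictured with a small toy model. This is only an illustrative sketch: the `shared_scores` dictionary and the `save_score` helper are hypothetical names invented for this example, not part of the product's API or its actual storage.

```python
# Toy model only: one shared score field per row.
# `shared_scores` and `save_score` are hypothetical, not the product's API.
shared_scores = {}  # row_id -> (reviewer, score)


def save_score(row_id, reviewer, score):
    """Saving replaces whatever score the row already held (last write wins)."""
    shared_scores[row_id] = (reviewer, score)


save_score("row-1", "alice", 0.8)
save_score("row-1", "bob", 0.4)  # replaces Alice's score for the same row

# Only the most recent save survives, and every project member sees it.
print(shared_scores["row-1"])
```

Because the row has a single score field, the earlier reviewer's input is lost as soon as a later reviewer saves.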
This feature needs to be enabled in Settings > Feature Flags > Multi-user human review.
When the multi-user review feature is enabled, the scoring behavior changes:
Independent storage: Each reviewer’s scores are stored separately as dedicated “review” spans attached to the parent span being reviewed.
No overwriting: Multiple reviewers can score the same row without overwriting each other’s work.
Automatic averaging: Scores from all reviewers are automatically averaged and displayed on the parent span in the experiment table.
Partial visibility: In the review UI, reviewers can see both their own individual scores and the aggregate scores across all reviewers.
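As a rough illustration of independent storage plus averaging, the sketch below keeps one score per reviewer under each parent span and derives the aggregate from them. It assumes the aggregate is a plain unweighted arithmetic mean, which the text above does not specify beyond "averaged". The names `reviews`, `save_review`, and `aggregate_score` are hypothetical and do not come from the product's API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical in-memory stand-in for per-reviewer "review" spans.
# parent_span_id -> {reviewer: score}
reviews = defaultdict(dict)


def save_review(parent_span_id, reviewer, score):
    """Store a score under the reviewer's own key, leaving other reviewers' scores intact."""
    reviews[parent_span_id][reviewer] = score


def aggregate_score(parent_span_id):
    """Assumed aggregation: unweighted mean of all reviewers' scores for the span."""
    return mean(reviews[parent_span_id].values())


save_review("span-1", "alice", 1.0)
save_review("span-1", "bob", 0.5)

print(reviews["span-1"])          # both individual scores are kept
print(aggregate_score("span-1"))  # value that would be shown on the parent span
```

In this model a reviewer who saves again only updates their own entry, so the aggregate is recomputed without discarding anyone else's score.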