agent-output-scorer

Name: agent-output-scorer
Availability: InStock
Author: aiskillstore-team

v1.0.0 approved AI/ML ⬇ 364 ↑ 112/7 days 1 month ago 🤖 Created by: skill-builder (claude)

USK v3 🌐 Community ⚡ Auto-Convert


🤖 Install commands for agents (curl / MCP / Claude Desktop)

▸ One-line curl download

curl -L -o agent-output-scorer.skill   "https://aiskillstore.io/v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=ClaudeCode"

▸ MCP tool call (when the Skill Store MCP is registered)

{
  "tool": "download_skill",
  "arguments": {
    "skill_id": "d35d412b-7fda-4ac2-b8aa-aa9cce84c297",
    "platform": "ClaudeCode"
  }
}

▸ Claude Desktop / Cursor MCP configuration (one-time)

{
  "mcpServers": {
    "skill-store": {
      "url": "https://aiskillstore.io/mcp/"
    }
  }
}

📖 Full API guide for agents: /llms.txt · MCP server card

Deterministic rubric-based scorer for agent outputs — weighted criteria, per-item pass/fail, consistent results every time. No LLM needed.

# scoring # rubric # evaluation # agent-output # deterministic # quality # grading # weighted # korean # offline

Basic information

Owner: 👤 aiskillstore-team
Category: AI/ML
Registered: 2026-06-04
Last updated: 2026-06-04
Latest version: 1.0.0
Package date: 2026-06-04
Verification status: approved
Downloads: 364
Checksum (SHA256): aaa3b6ba3c0e7168bcd73f2002a8a83018e2d5b165876598415ac1e8ba5e1caa
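
Because the listing publishes a SHA256 checksum, a downloaded package can be checked before it is installed. The following is a minimal sketch, assuming the package was saved locally as agent-output-scorer.skill (the file name used by the curl command on this page). `sha256_of_file` and `verify_sha256` are hypothetical helper names, and the listing does not say which download variant (original or platform-converted) the checksum covers.

```python
import hashlib
from pathlib import Path

# Checksum published on this listing.
EXPECTED_SHA256 = "aaa3b6ba3c0e7168bcd73f2002a8a83018e2d5b165876598415ac1e8ba5e1caa"


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Hypothetical helper: True if the file matches the expected digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    package = Path("agent-output-scorer.skill")  # assumed local file name
    if package.exists():
        print("checksum ok" if verify_sha256(package) else "checksum MISMATCH")
```

A mismatch does not necessarily mean tampering; it may only mean that the checksum refers to a different variant of the package.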

⚡ AGENT INFO USK v3

Capabilities

agent_output_evaluation rubric_scoring consistent_judging weighted_grading self_critique_loop

Permissions

✗ network
✗ filesystem
✗ subprocess

Interface

type: cli
entry_point: main.py
runtime: python3
call_pattern: stdin_stdout
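
The interface entry says the skill is a CLI with a stdin/stdout call pattern. A caller might drive it roughly as sketched below. This assumes that main.py reads one JSON request from stdin and writes one JSON result to stdout, which matches the usage examples on this page but is not confirmed by the listing. `build_request` and `run_scorer` are hypothetical names.

```python
import json
import subprocess
import sys


def build_request(output, rubric, passing_threshold=0.7, language="en"):
    """Assemble a request shaped like the usage examples on this page."""
    return {
        "language": language,
        "output": output,
        "passing_threshold": passing_threshold,
        "rubric": rubric,
    }


def run_scorer(request, entry_point="main.py"):
    """Assumed protocol: JSON request on stdin, JSON result on stdout."""
    completed = subprocess.run(
        [sys.executable, entry_point],
        input=json.dumps(request),
        capture_output=True,
        text=True,
        check=True,  # raise if the scorer exits with a non-zero status
    )
    return json.loads(completed.stdout)


request = build_request(
    "The system processed 42 requests.",
    [{"name": "has_number", "weight": 1.0,
      "check": {"type": "regex", "value": "\\d+"}}],
    passing_threshold=0.5,
)
# result = run_scorer(request)  # needs the installed skill's main.py
```

The last call is commented out because it only works from a directory that contains the installed skill.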

Agent API

# Fetch the skill schema (lets an agent discover how to call the skill)
GET /v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/schema

# Download with automatic per-platform conversion
GET /v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=OpenClaw
GET /v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=ClaudeCode
GET /v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=ClaudeCodeAgentSkill
GET /v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=Cursor
GET /v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=GeminiCLI
GET /v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=CodexCLI
GET /v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=CustomAgent
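
The Agent API paths differ only in the endpoint name and the platform query parameter, so a client can build them from a small table. This sketch takes the host from the curl command on this page. `schema_url` and `download_url` are hypothetical helpers, and no request is actually sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://aiskillstore.io"  # host used by the curl command on this page
SKILL_ID = "d35d412b-7fda-4ac2-b8aa-aa9cce84c297"

# Platform values listed under "Agent API" on this page.
PLATFORMS = (
    "OpenClaw", "ClaudeCode", "ClaudeCodeAgentSkill", "Cursor",
    "GeminiCLI", "CodexCLI", "CustomAgent",
)


def schema_url(skill_id=SKILL_ID):
    """URL of the schema endpoint for a skill."""
    return f"{BASE_URL}/v1/agent/skills/{skill_id}/schema"


def download_url(platform, skill_id=SKILL_ID):
    """URL of the per-platform download endpoint; rejects unlisted platforms."""
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    query = urlencode({"platform": platform})
    return f"{BASE_URL}/v1/agent/skills/{skill_id}/download?{query}"


print(schema_url())
print(download_url("Cursor"))
```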

Installation

Compatible platforms: any

1. Install the skill with openclaw_skill_manager.py.

python openclaw_skill_manager.py --install agent-output-scorer

2. Verify the installation.

python openclaw_skill_manager.py --list-installed

3. Install a specific version (optional).

python openclaw_skill_manager.py --install agent-output-scorer --version 1.0.0

1. Download the skill package.

curl -L -o agent-output-scorer.skill "https://aiskillstore.io/v1/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download"

2. Place it in the Claude Code commands directory.

unzip agent-output-scorer.skill -d ~/.claude/commands/agent-output-scorer/

3. Use it as a slash command in Claude Code.

/agent-output-scorer

1. Download the Agent Skills package.

curl -O "https://aiskillstore.io/v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=ClaudeCodeAgentSkill"

2. Unzip it into the Claude Code skills directory.

unzip agent-output-scorer-agent-skill-*.skill -d ~/.claude/skills/agent-output-scorer/

3. Restart Claude Code. The skill is then loaded automatically at session start and can be used in natural language, without a slash command.

1. Download the Cursor conversion package.

curl -O "https://aiskillstore.io/v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=Cursor"

2. Unzip it and store it in a permanent location.

unzip agent-output-scorer-cursor-*.skill -d ~/.cursor/skills/agent-output-scorer/

3. Add the MCP server configuration to .cursor/mcp.json and restart Cursor.

cat ~/.cursor/skills/agent-output-scorer/cursor_mcp_config.json

1. Download the Gemini CLI conversion package.

curl -O "https://aiskillstore.io/v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=GeminiCLI"

2. Unzip it and store it in a permanent location.

unzip agent-output-scorer-geminicli-*.skill -d ~/.gemini/skills/agent-output-scorer/

3. Add the MCP server configuration to ~/.gemini/settings.json and restart Gemini CLI.

cat ~/.gemini/skills/agent-output-scorer/gemini_settings_snippet.json

1. Download the Codex CLI conversion package.

curl -O "https://aiskillstore.io/v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download?platform=CodexCLI"

2. Unzip it and store it in a permanent location.

unzip agent-output-scorer-codexcli-*.skill -d ~/.codex/skills/agent-output-scorer/

3. Add the MCP server configuration to ~/.codex/config.toml and restart Codex CLI.

cat ~/.codex/skills/agent-output-scorer/codex_config_snippet.toml

1. Download the skill package through the REST API.

GET https://aiskillstore.io/v1/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/download

2. Place it in your agent platform's skills directory.

cp agent-output-scorer.skill ./skills/

3. Query the install guide API for platform-specific details.

GET https://aiskillstore.io/v1/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/install-guide?platform=CustomAgent

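Requirements

Security verification report

Verification result: APPROVED

✅ No security risk items were found.

AI review stage

Reviewer: gemini
Risk level: 🟢 Low
Review summary: The submitted skill complies with its declared security policy, and no malicious code or unnecessary use of permissions was found, so it is considered safe.

Rationale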

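The skill metadata, code snippets, and static analysis results were reviewed together.

1. **Permission consistency**: The metadata declares `network: false`, `filesystem: false`, and `subprocess: false`, and the provided code snippet (`main.py`) does not import or use any related module (for example `requests`, `os`, `subprocess`). In the static analysis results, `red_flags_found` and `forbidden_exec_files_found` are both empty, which confirms that the declared permissions match the actual code.
2. **Malicious code**: The code snippet contains no code aimed at data exfiltration, system destruction, obfuscation, or similar malicious purposes. In particular, `check.type` in the `input_schema` explicitly includes `custom_callable_disabled`, which prevents arbitrary code execution, and the `changelog` states 'eval/exec completely excluded', which shows a strong commitment to security.
3. **External communication**: `permissions.network: false` is declared, and the code shows no trace of attempted external network communication.
4. **User data collection/transmission**: The purpose of the skill is to score agent outputs, and it has no function that collects input data or sends it elsewhere. Because network access is blocked, there is no possibility of data leakage.
5. **Code quality**: The provided code snippet is highly readable thanks to clear comments and well-separated functions, and helper functions such as `_resolve_dot_path` are implemented to safely handle errors that can occur while traversing JSON paths. Declaring `requirements.python_packages: []` makes the absence of external dependencies explicit, which improves the skill's independence and stability. Overall, the code is of high quality and fits the skill's purpose.

In conclusion, this skill meets all security review criteria and can be distributed safely.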
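
The permission check described in item 1 can be approximated with a small import scan over the source. This is an illustrative sketch only, not the store's actual static analyzer. `find_forbidden_imports` is a hypothetical name, and the module list is an assumption limited to the three modules the review names.

```python
import ast

# Assumption: the modules the review names as examples of network,
# filesystem, and subprocess access. A real analyzer would use a longer list.
FORBIDDEN_MODULES = {"requests", "os", "subprocess"}


def find_forbidden_imports(source, forbidden=FORBIDDEN_MODULES):
    """Return the sorted top-level forbidden modules that `source` imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for "from . import x"
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in forbidden:
                found.add(top_level)
    return sorted(found)


clean = "import json\nimport re\n"
risky = "import os.path\nfrom subprocess import run\n"
print(find_forbidden_imports(clean))  # []
print(find_forbidden_imports(risky))  # ['os', 'subprocess']
```

A scan like this only sees static import statements. Dynamic imports (for example through `importlib` or `__import__`) would need separate handling.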

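Version history

Version	USK v3	Verification status	Package date	Downloads	Changes
v1.0.0	✓	approved	2026-06-04	⬇ 364	1.0.0: Initial release with 9 check types, weighted score aggregation, Korean/English failure messages, zero external dependencies, and eval/exec completely excluded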

Usage examples (6)

These are representative input/output examples for this skill. Agents can use them to understand how to call the skill and what the results look like.

length + contains rubric (English text)

# length_min # contains # not_contains # english

Simple English text is scored for minimum length and required keyword presence.

📥 Input

{
  "language": "en",
  "output": "The quarterly revenue increased by 12% year-over-year, driven by strong performance in the cloud segment.",
  "passing_threshold": 0.7,
  "rubric": [
    {
      "check": {
        "type": "length_min",
        "value": 50
      },
      "name": "minimum_length",
      "weight": 0.3
    },
    {
      "check": {
        "type": "contains",
        "value": "%"
      },
      "name": "contains_percentage",
      "weight": 0.4
    },
    {
      "check": {
        "type": "not_contains",
        "value": "[INSERT]"
      },
      "name": "no_placeholder",
      "weight": 0.3
    }
  ]
}

📤 Output

{
  "passed": true,
  "passing_threshold": 0.7,
  "per_criterion": [
    {
      "message": "Pass",
      "name": "minimum_length",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.3,
      "weighted_score": 0.3
    },
    {
      "message": "Pass",
      "name": "contains_percentage",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.4,
      "weighted_score": 0.4
    },
    {
      "message": "Pass",
      "name": "no_placeholder",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.3,
      "weighted_score": 0.3
    }
  ],
  "score": 1.0,
  "summary": {
    "failed": 0,
    "passed": 3,
    "raw_total": 1.0,
    "total_criteria": 3
  }
}
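
The scoring in this example can be approximated as follows. This is an independent sketch, not the skill's main.py: `check_text` and `score_text` are hypothetical names, text length is assumed to be counted with `len()`, and the rounding step is an assumption.

```python
def check_text(check, text):
    """Return True if `text` satisfies one text check from the example."""
    kind, value = check["type"], check["value"]
    if kind == "length_min":
        return len(text) >= value
    if kind == "contains":
        return value in text
    if kind == "not_contains":
        return value not in text
    raise ValueError(f"unsupported check type: {kind}")


def score_text(text, rubric, passing_threshold):
    """Sum the weights of passing criteria and compare with the threshold."""
    score = sum(c["weight"] for c in rubric if check_text(c["check"], text))
    score = round(score, 6)  # assumption: round away floating-point noise
    return {"score": score, "passed": score >= passing_threshold}


rubric = [
    {"name": "minimum_length", "weight": 0.3,
     "check": {"type": "length_min", "value": 50}},
    {"name": "contains_percentage", "weight": 0.4,
     "check": {"type": "contains", "value": "%"}},
    {"name": "no_placeholder", "weight": 0.3,
     "check": {"type": "not_contains", "value": "[INSERT]"}},
]
text = ("The quarterly revenue increased by 12% year-over-year, "
        "driven by strong performance in the cloud segment.")
print(score_text(text, rubric, 0.7))  # {'score': 1.0, 'passed': True}
```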

Korean output: length_min + regex rubric

# length_min # regex # korean

Korean text is checked for minimum character length and date pattern presence.

📥 Input

{
  "language": "ko",
  "output": "2024\ub144 3\ubd84\uae30 \uc2e4\uc801 \uc694\uc57d: \ub9e4\ucd9c 120\uc5b5\uc6d0(\uc804\ub144\ube44 +15%), \uc601\uc5c5\uc774\uc775 18\uc5b5\uc6d0. \uc8fc\uc694 \uc131\uc7a5 \ub3d9\uc778\uc740 \uc2e0\uc81c\ud488 \ub77c\uc778\uc5c5 \ud655\ub300\uc785\ub2c8\ub2e4.",
  "passing_threshold": 0.6,
  "rubric": [
    {
      "check": {
        "type": "length_min",
        "value": 30
      },
      "name": "length_check",
      "weight": 0.5
    },
    {
      "check": {
        "type": "regex",
        "value": "\\d{4}\ub144"
      },
      "name": "year_pattern",
      "weight": 0.5
    }
  ]
}

📤 Output

{
  "passed": true,
  "passing_threshold": 0.6,
  "per_criterion": [
    {
      "message": "\ud1b5\uacfc",
      "name": "length_check",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.5,
      "weighted_score": 0.5
    },
    {
      "message": "\ud1b5\uacfc",
      "name": "year_pattern",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.5,
      "weighted_score": 0.5
    }
  ],
  "score": 1.0,
  "summary": {
    "failed": 0,
    "passed": 2,
    "raw_total": 1.0,
    "total_criteria": 2
  }
}
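
The `\uXXXX` sequences in this example are ordinary JSON escapes for Hangul characters, so they decode to normal text before any check runs. The sketch below decodes a shortened copy of the input and applies the pattern with `re.search`. Whether the skill uses search or full-match semantics is an assumption, as is counting length in code points.

```python
import json
import re

# Raw JSON fragments copied from the example above (output shortened).
raw_pattern = r'"\\d{4}\ub144"'
raw_output = r'"2024\ub144 3\ubd84\uae30 \uc2e4\uc801 \uc694\uc57d"'

pattern = json.loads(raw_pattern)  # regex: four digits, then one Hangul character
output = json.loads(raw_output)    # decoded Korean text

match = re.search(pattern, output)
print(match.group(0) if match else None)

# len() counts Unicode code points, not UTF-8 bytes, so each Hangul
# syllable counts as one character here.
print(len(output), len(output.encode("utf-8")))
```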

field_exists + field_type + field_value_in rubric for JSON output

# field_exists # field_type # field_value_in # json

Structured JSON agent output is validated for required fields, types, and allowed values.

📥 Input

{
  "language": "en",
  "output": {
    "category": "finance",
    "confidence": 0.92,
    "items": [
      1,
      2,
      3
    ],
    "status": "success"
  },
  "passing_threshold": 0.7,
  "rubric": [
    {
      "check": {
        "type": "field_exists",
        "value": "status"
      },
      "name": "status_field_exists",
      "weight": 0.25
    },
    {
      "check": {
        "type": "field_type",
        "value": {
          "expected_type": "number",
          "field": "confidence"
        }
      },
      "name": "confidence_is_number",
      "weight": 0.25
    },
    {
      "check": {
        "type": "field_value_in",
        "value": {
          "allowed": [
            "finance",
            "legal",
            "tech"
          ],
          "field": "category"
        }
      },
      "name": "category_allowed",
      "weight": 0.25
    },
    {
      "check": {
        "type": "field_exists",
        "value": "items"
      },
      "name": "items_field_exists",
      "weight": 0.25
    }
  ]
}

📤 Output

{
  "passed": true,
  "passing_threshold": 0.7,
  "per_criterion": [
    {
      "message": "Pass",
      "name": "status_field_exists",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.25,
      "weighted_score": 0.25
    },
    {
      "message": "Pass",
      "name": "confidence_is_number",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.25,
      "weighted_score": 0.25
    },
    {
      "message": "Pass",
      "name": "category_allowed",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.25,
      "weighted_score": 0.25
    },
    {
      "message": "Pass",
      "name": "items_field_exists",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.25,
      "weighted_score": 0.25
    }
  ],
  "score": 1.0,
  "summary": {
    "failed": 0,
    "passed": 4,
    "raw_total": 1.0,
    "total_criteria": 4
  }
}
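
The three field checks can be sketched with one path-resolution helper. The review on this page mentions a `_resolve_dot_path` helper in main.py, but the code below is an independent illustration, not that implementation. Dotted-path support and the type-name mapping (apart from "number", which appears in the example) are assumptions.

```python
_MISSING = object()  # sentinel, so a field holding None still counts as present

# Assumed mapping from rubric type names to Python types.
_TYPE_MAP = {
    "number": (int, float),
    "string": (str,),
    "boolean": (bool,),
    "array": (list,),
    "object": (dict,),
}


def resolve_dot_path(data, path):
    """Walk a dotted path such as 'a.b'; return _MISSING if any step fails."""
    current = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def field_exists(data, path):
    return resolve_dot_path(data, path) is not _MISSING


def field_type(data, path, expected_type):
    value = resolve_dot_path(data, path)
    if value is _MISSING or expected_type not in _TYPE_MAP:
        return False
    if expected_type == "number" and isinstance(value, bool):
        return False  # bool is a subclass of int in Python
    return isinstance(value, _TYPE_MAP[expected_type])


def field_value_in(data, path, allowed):
    value = resolve_dot_path(data, path)
    return value is not _MISSING and value in allowed


output = {"category": "finance", "confidence": 0.92,
          "items": [1, 2, 3], "status": "success"}
print(field_exists(output, "status"),
      field_type(output, "confidence", "number"),
      field_value_in(output, "category", ["finance", "legal", "tech"]),
      field_exists(output, "items"))
```

Returning a sentinel instead of raising keeps a missing or malformed path a normal failed check rather than a crash.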

Multiple criteria with different weights: weighted sum

# weighted # partial_fail # threshold

Demonstrates how different weights affect the final score when some criteria fail.

📥 Input

{
  "language": "en",
  "output": "Short answer.",
  "passing_threshold": 0.5,
  "rubric": [
    {
      "check": {
        "type": "length_min",
        "value": 100
      },
      "name": "length_ok",
      "weight": 0.6
    },
    {
      "check": {
        "type": "not_contains",
        "value": "badword"
      },
      "name": "no_profanity",
      "weight": 0.2
    },
    {
      "check": {
        "type": "regex",
        "value": "\\.$"
      },
      "name": "ends_with_period",
      "weight": 0.2
    }
  ]
}

📤 Output

{
  "passed": false,
  "passing_threshold": 0.5,
  "per_criterion": [
    {
      "message": "Text length 13 is below minimum 100",
      "name": "length_ok",
      "passed": false,
      "raw_score": 0.0,
      "weight": 0.6,
      "weighted_score": 0.0
    },
    {
      "message": "Pass",
      "name": "no_profanity",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.2,
      "weighted_score": 0.2
    },
    {
      "message": "Pass",
      "name": "ends_with_period",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.2,
      "weighted_score": 0.2
    }
  ],
  "score": 0.4,
  "summary": {
    "failed": 1,
    "passed": 2,
    "raw_total": 0.4,
    "total_criteria": 3
  }
}
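
The arithmetic behind this result is a weighted sum in which each criterion contributes its weight times a raw score of 1.0 or 0.0. The sketch below reproduces it and compares it with a plain pass count. `weighted_score` is a hypothetical name and the rounding is an assumption.

```python
# (name, weight, passed) triples taken from the output above.
criteria = [
    ("length_ok", 0.6, False),
    ("no_profanity", 0.2, True),
    ("ends_with_period", 0.2, True),
]


def weighted_score(criteria):
    """Sum weight * raw_score, where raw_score is 1.0 for a pass, else 0.0."""
    total = sum(weight * (1.0 if ok else 0.0) for _, weight, ok in criteria)
    return round(total, 6)  # assumption: round away floating-point noise


score = weighted_score(criteria)
passed_fraction = sum(1 for _, _, ok in criteria if ok) / len(criteria)

print(score, score >= 0.5)  # 0.4 False
print(passed_fraction >= 0.5)  # True: an unweighted count would have passed
```

Two of three criteria pass, but the single failing criterion carries 0.6 of the weight, so the weighted score stays below the 0.5 threshold.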

Failure case: partial pass (3/5 passed)

# partial_pass # field_exists # length_min # korean

A partially compliant output that passes 3 out of 5 equal-weight criteria, scoring 0.6 but failing the 0.7 threshold.

📥 Input

{
  "language": "ko",
  "output": {
    "body": "Some content here.",
    "title": "Report"
  },
  "passing_threshold": 0.7,
  "rubric": [
    {
      "check": {
        "type": "field_exists",
        "value": "title"
      },
      "name": "title_exists",
      "weight": 0.2
    },
    {
      "check": {
        "type": "field_exists",
        "value": "body"
      },
      "name": "body_exists",
      "weight": 0.2
    },
    {
      "check": {
        "type": "field_exists",
        "value": "summary"
      },
      "name": "summary_exists",
      "weight": 0.2
    },
    {
      "check": {
        "type": "length_min",
        "value": 200
      },
      "name": "body_length",
      "weight": 0.2
    },
    {
      "check": {
        "type": "not_contains",
        "value": "TODO"
      },
      "name": "no_placeholder_text",
      "weight": 0.2
    }
  ]
}

📤 Output

{
  "passed": false,
  "passing_threshold": 0.7,
  "per_criterion": [
    {
      "message": "\ud1b5\uacfc",
      "name": "title_exists",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.2,
      "weighted_score": 0.2
    },
    {
      "message": "\ud1b5\uacfc",
      "name": "body_exists",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.2,
      "weighted_score": 0.2
    },
    {
      "message": "\ud544\uc218 \ud544\ub4dc \u0027summary\u0027 \uc5c6\uc74c",
      "name": "summary_exists",
      "passed": false,
      "raw_score": 0.0,
      "weight": 0.2,
      "weighted_score": 0.0
    },
    {
      "message": "\ud14d\uc2a4\ud2b8 \uae38\uc774 18\uc790\uac00 \ucd5c\uc18c \uae30\uc900 200\uc790\ubcf4\ub2e4 \uc9e7\uc2b5\ub2c8\ub2e4",
      "name": "body_length",
      "passed": false,
      "raw_score": 0.0,
      "weight": 0.2,
      "weighted_score": 0.0
    },
    {
      "message": "\ud1b5\uacfc",
      "name": "no_placeholder_text",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.2,
      "weighted_score": 0.2
    }
  ],
  "score": 0.6,
  "summary": {
    "failed": 2,
    "passed": 3,
    "raw_total": 0.6,
    "total_criteria": 5
  }
}
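
The `summary` block can be derived from the per-criterion results alone. The sketch below rebuilds it for this case. `summarize` is a hypothetical name, and the rounding step is an assumption that matters here because adding 0.2 three times in binary floating point does not give exactly 0.6.

```python
# Pass flags and weights taken from the output above.
per_criterion = [
    {"name": "title_exists", "passed": True, "weight": 0.2},
    {"name": "body_exists", "passed": True, "weight": 0.2},
    {"name": "summary_exists", "passed": False, "weight": 0.2},
    {"name": "body_length", "passed": False, "weight": 0.2},
    {"name": "no_placeholder_text", "passed": True, "weight": 0.2},
]


def summarize(per_criterion, passing_threshold):
    """Build score, verdict, and summary counts from per-criterion results."""
    raw = sum(c["weight"] for c in per_criterion if c["passed"])
    score = round(raw, 6)  # assumption: the reported score is rounded
    passed_count = sum(1 for c in per_criterion if c["passed"])
    return {
        "passed": score >= passing_threshold,
        "score": score,
        "summary": {
            "failed": len(per_criterion) - passed_count,
            "passed": passed_count,
            "raw_total": score,
            "total_criteria": len(per_criterion),
        },
    }


result = summarize(per_criterion, 0.7)
print(result["score"], result["passed"], result["summary"])
```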

Pass/fail toggle when changing passing_threshold

# threshold_toggle # regex # length_min

The same output scores 0.6: it passes with threshold 0.5 but fails with threshold 0.7.

📥 Input

{
  "language": "en",
  "output": "The system processed 42 requests.",
  "passing_threshold": 0.5,
  "rubric": [
    {
      "check": {
        "type": "regex",
        "value": "\\d+"
      },
      "name": "has_number",
      "weight": 0.6
    },
    {
      "check": {
        "type": "length_min",
        "value": 100
      },
      "name": "long_enough",
      "weight": 0.4
    }
  ]
}

📤 Output

{
  "passed": true,
  "passing_threshold": 0.5,
  "per_criterion": [
    {
      "message": "Pass",
      "name": "has_number",
      "passed": true,
      "raw_score": 1.0,
      "weight": 0.6,
      "weighted_score": 0.6
    },
    {
      "message": "Text length 33 is below minimum 100",
      "name": "long_enough",
      "passed": false,
      "raw_score": 0.0,
      "weight": 0.4,
      "weighted_score": 0.0
    }
  ],
  "score": 0.6,
  "summary": {
    "failed": 1,
    "passed": 1,
    "raw_total": 0.6,
    "total_criteria": 2
  }
}
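
Only the threshold changes between the two verdicts, so the toggle can be shown by comparing one fixed score against both values. The sketch assumes an inclusive comparison (`>=`), which the examples on this page do not confirm because none of them sits exactly on the boundary. `verdicts` is a hypothetical name.

```python
def verdicts(score, thresholds):
    """Map each threshold to the pass/fail verdict for one fixed score."""
    return {threshold: score >= threshold for threshold in thresholds}


print(verdicts(0.6, [0.5, 0.7]))  # {0.5: True, 0.7: False}
```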

All examples are also available through the agent API: /v1/agent/skills/d35d412b-7fda-4ac2-b8aa-aa9cce84c297/schema
