# Classify Document

> Upload a PDF and receive document classification, extracted metadata, and a partial /check request body when supported

```
POST /api/v1/classify
```

Upload a PDF and receive document classification (scope, motion type) plus a partial request body for the [`/check`](/api-reference/check) endpoint when the PDF is a supported federal filing. The API extracts page count, word count, structural metadata, and document type from the PDF automatically. You must add filing context fields (`is_pro_se`, `pmc_completed`, `opposing_party_pro_se`, `filing_role`) before sending to `/check`.

The response includes a `supported` boolean indicating whether the document is a classifiable federal court filing. Non-federal documents, blank templates, court rules or practices, and non-legal documents return `supported: false` with `confidence: "low"` and no `check_request`.

Uploaded PDFs are stored server-side for quality improvement.

**Note:** Requests may take 10 to 30 seconds to complete. Format extraction (font, margins, line spacing) has a 15-second timeout; fillable forms and interactive PDFs may hit this limit, in which case format checks are skipped and a warning is returned.

## Try it

A sample filing PDF is available in the [console playground](https://console.courtrules.app/playground?tab=classify). Upload it to see classification, extracted metadata, and a ready-to-run `/check` request body.

## Request

<ParamField body="judge_slug" type="string" required>
  Judge slug identifier (e.g. `gary-r-brown`)
</ParamField>

<ParamField body="district_id" type="string" required>
  District identifier (e.g. `edny`)
</ParamField>

<ParamField body="pdf_base64" type="string" required>
  Base64-encoded PDF file.
</ParamField>

<ParamField body="filename" type="string">
  Original filename of the PDF (e.g. `motion_for_summary_judgment.pdf`). Include it whenever it is
  available. Descriptive filenames like `memo_in_support_sj.pdf` provide a strong classification
  signal; generic names like `document.pdf` are accepted but add no signal.
</ParamField>
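
The request fields above can be assembled with a small helper. This is a minimal sketch, not part of any official client: `build_classify_payload` is a hypothetical name, and the `%PDF-` header check is a local sanity check assumed for illustration, not a rule the API documents.

```python
import base64
from typing import Optional


def build_classify_payload(
    pdf_bytes: bytes,
    judge_slug: str,
    district_id: str,
    filename: Optional[str] = None,
) -> dict:
    """Assemble the JSON body for POST /api/v1/classify (hypothetical helper)."""
    # Assumption: a local sanity check only. Most PDFs begin with this marker;
    # the API performs its own validation.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("input does not look like a PDF")
    payload = {
        "judge_slug": judge_slug,
        "district_id": district_id,
        "pdf_base64": base64.b64encode(pdf_bytes).decode("ascii"),
    }
    # filename is optional, so include it only when the caller has one
    if filename:
        payload["filename"] = filename
    return payload


payload = build_classify_payload(
    b"%PDF-1.7 placeholder bytes",
    "gary-r-brown",
    "edny",
    filename="memo_in_support_sj.pdf",
)
print(sorted(payload))
```

Keeping `filename` out of the payload when it is unknown avoids sending a generic placeholder that adds no classification signal.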

<RequestExample>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  # Base64-encode the PDF as a single line (GNU base64 wraps its output by
  # default, and embedded newlines would break the JSON string)
  PDF_B64=$(base64 < motion_for_sj.pdf | tr -d '\n')

  curl -X POST https://api.courtrules.app/api/v1/classify \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{
      \"judge_slug\": \"carol-bagley-amon\",
      \"district_id\": \"edny\",
      \"pdf_base64\": \"$PDF_B64\",
      \"filename\": \"memo_in_support_sj.pdf\"
    }"
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import base64
  from pathlib import Path

  import requests

  headers = {"Authorization": "Bearer YOUR_API_KEY"}

  # Read and encode the PDF
  pdf_b64 = base64.b64encode(Path("motion_for_sj.pdf").read_bytes()).decode()

  # Classify the PDF
  resp = requests.post(
      "https://api.courtrules.app/api/v1/classify",
      headers=headers,
      json={
          "judge_slug": "carol-bagley-amon",
          "district_id": "edny",
          "pdf_base64": pdf_b64,
          "filename": "memo_in_support_sj.pdf",
      },
  )
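  resp.raise_for_status()
  result = resp.json()

  # check_request is present only when the document is supported
  if result["classification"]["supported"]:
      # Add filing context, then send to /check
      check_body = {
          **result["check_request"],
          "is_pro_se": False,
          "pmc_completed": True,
          "opposing_party_pro_se": False,
          "filing_role": "movant",
      }
      check_resp = requests.post(
          "https://api.courtrules.app/api/v1/check",
          headers=headers,
          json=check_body,
      )
  else:
      # Unsupported document: show the reasoning and skip /check
      print(result["classification"]["reasoning"])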
  ```

  ```typescript Node.js theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import { readFileSync } from "fs";

  const pdfBase64 = readFileSync("motion_for_sj.pdf").toString("base64");

  const classifyResp = await fetch("https://api.courtrules.app/api/v1/classify", {
    method: "POST",
    headers: {
      Authorization: "Bearer YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      judge_slug: "carol-bagley-amon",
      district_id: "edny",
      pdf_base64: pdfBase64,
      filename: "memo_in_support_sj.pdf",
    }),
  });
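  if (!classifyResp.ok) {
    throw new Error(`classify failed with status ${classifyResp.status}`);
  }
  const { classification, check_request } = await classifyResp.json();

  // check_request is present only when the document is supported
  if (classification.supported) {
    // Add filing context, then send to /check
    const checkResp = await fetch("https://api.courtrules.app/api/v1/check", {
      method: "POST",
      headers: {
        Authorization: "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        ...check_request,
        is_pro_se: false,
        pmc_completed: true,
        opposing_party_pro_se: false,
        filing_role: "movant",
      }),
    });
  } else {
    // Unsupported document: show the reasoning and skip /check
    console.log(classification.reasoning);
  }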
  ```
</RequestExample>

<ResponseExample>
  ```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
  {
    "classification": {
      "document_scope": "brief_support",
      "motion_type": "Rule_56",
      "confidence": "high",
      "supported": true,
      "reasoning": "Document is titled 'Memorandum of Law in Support of Motion for Summary Judgment', contains Rule 56 references, and includes a statement of undisputed material facts."
    },
    "document": {
      "page_count": 31,
      "word_count": 10000,
      "caption": {
        "present": true,
        "has_court_name": true,
        "has_case_title": true,
        "has_docket_number": true,
        "has_document_designation": true
      },
      "sections": {
        "has_toc": true,
        "has_toa": true,
        "has_certificate_of_compliance": false,
        "has_certificate_of_service": true,
        "has_numbered_paragraphs": false,
        "has_56_1_statement": true,
        "has_56_1_counterstatement": false,
        "has_proposed_amended_pleading": false,
        "has_verbatim_discovery_text": false,
        "has_conferral_certification": false,
        "has_notice_of_motion": true,
        "has_memorandum_of_law": true,
        "has_supporting_affidavits": true,
        "has_pro_se_sj_notice": false
      }
    },
    "check_request": {
      "judge_slug": "carol-bagley-amon",
      "district_id": "edny",
      "document_scope": "brief_support",
      "motion_type": "Rule_56",
      "document": {
        "page_count": 31,
        "word_count": 10000,
        "caption": {
          "present": true,
          "has_court_name": true,
          "has_case_title": true,
          "has_docket_number": true,
          "has_document_designation": true
        },
        "sections": {
          "has_toc": true,
          "has_toa": true,
          "has_certificate_of_compliance": false,
          "has_certificate_of_service": true,
          "has_numbered_paragraphs": false,
          "has_56_1_statement": true,
          "has_56_1_counterstatement": false,
          "has_proposed_amended_pleading": false,
          "has_verbatim_discovery_text": false,
          "has_conferral_certification": false,
          "has_notice_of_motion": true,
          "has_memorandum_of_law": true,
          "has_supporting_affidavits": true,
          "has_pro_se_sj_notice": false
        }
      }
    },
    "page_count_analysis": {
      "total_pdf_pages": 35,
      "front_matter_pages": 2,
      "back_matter_pages": 2,
      "body_page_count": 31,
      "method": "llm_structural_analysis"
    },
    "format_detection": {
      "confidence": "high",
      "confidence_reason": "Font and margin data extracted successfully from PDF structure",
      "warning": null,
      "font_family": "TimesNewRoman",
      "font_name_raw": "ABCDEF+TimesNewRomanPSMT",
      "margins_per_side": { "top": 1.0, "bottom": 1.0, "left": 1.0, "right": 1.0 },
      "line_spacing_pt": 24.0
    }
  }
  ```
</ResponseExample>

## Response fields

<ResponseField name="classification" type="object">
  Document type classification

  <Expandable title="Classification">
    <ResponseField name="document_scope" type="string">
      Detected document type (e.g. `brief_support`, `letter`, `affidavit`). Omitted when `supported` is `false`.
    </ResponseField>

    <ResponseField name="motion_type" type="string">
      Detected motion type, if applicable (e.g. `Rule_56`, `discovery`). Omitted for non-motion documents.
    </ResponseField>

    <ResponseField name="confidence" type="string">
      Classification confidence: `high`, `medium`, or `low`. A value of `low` means the document is not a supported filing type.
    </ResponseField>

    <ResponseField name="supported" type="boolean">
      Whether the document is a classifiable federal court filing. Derived from confidence: `true` when confidence is `high` or `medium`, `false` when `low`.
    </ResponseField>

    <ResponseField name="reasoning" type="string">
      Brief explanation of how the document was classified
    </ResponseField>
  </Expandable>
</ResponseField>

<ResponseField name="document" type="object">
  Extracted document metadata including page count, word count, and structural elements detected in
  the PDF (caption, sections, format, privacy).
</ResponseField>

<ResponseField name="check_request" type="object">
  Present only when `supported` is `true`. A partial request body for
  [`POST /api/v1/check`](/api-reference/check). Includes `judge_slug`, `district_id`,
  `document_scope`, `motion_type` (if detected), and the full `document` object. You must add
  `is_pro_se`, `pmc_completed`, `opposing_party_pro_se`, and `filing_role` before sending.
</ResponseField>

<ResponseField name="page_count_analysis" type="object">
  Breakdown of how body page count was computed from total PDF pages.

  <Expandable title="Page count analysis">
    <ResponseField name="total_pdf_pages" type="integer">
      Total pages in the uploaded PDF.
    </ResponseField>

    <ResponseField name="front_matter_pages" type="integer">
      Pages of front matter (cover, TOC, TOA) excluded from body count.
    </ResponseField>

    <ResponseField name="back_matter_pages" type="integer">
      Pages of back matter (certificate of service, certificate of compliance, exhibit list) excluded from body count.
    </ResponseField>

    <ResponseField name="body_page_count" type="integer">
      Effective body page count used for limit checks (total minus front minus back).
    </ResponseField>

    <ResponseField name="method" type="string">
      Analysis method used for page counting.
    </ResponseField>
  </Expandable>
</ResponseField>
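
The body page count arithmetic described above (total minus front matter minus back matter) can be checked client-side. The sketch below uses the values from the response example; `body_page_count` is a hypothetical helper name, and whether the two figures can ever disagree in practice is not stated here.

```python
def body_page_count(analysis: dict) -> int:
    """Recompute the body page count from a page_count_analysis object."""
    return (
        analysis["total_pdf_pages"]
        - analysis["front_matter_pages"]
        - analysis["back_matter_pages"]
    )


# Values copied from the response example on this page
analysis = {
    "total_pdf_pages": 35,
    "front_matter_pages": 2,
    "back_matter_pages": 2,
    "body_page_count": 31,
    "method": "llm_structural_analysis",
}

# 35 - 2 - 2 = 31, which matches the reported body_page_count
print(body_page_count(analysis) == analysis["body_page_count"])
```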

<ResponseField name="format_detection" type="object">
  Detected font, margin, and line spacing from the PDF. Used for formatting compliance checks.

  <Expandable title="Format detection">
    <ResponseField name="confidence" type="string">
      `high`, `medium`, or `low`. When `low`, format checks are skipped.
    </ResponseField>

    <ResponseField name="confidence_reason" type="string">
      Why the confidence level was assigned.
    </ResponseField>

    <ResponseField name="warning" type="string | null">
      Human-readable warning when format checks will be skipped or uncertain.
    </ResponseField>

    <ResponseField name="font_family" type="string | null">
      Font family name (e.g. `TimesNewRoman`). Null when detection failed.
    </ResponseField>

    <ResponseField name="font_name_raw" type="string | null">
      Original font name as embedded in the PDF.
    </ResponseField>

    <ResponseField name="margins_per_side" type="object | null">
      Margin measurements per side in inches (`top`, `bottom`, `left`, `right`). Null when detection failed.
    </ResponseField>

    <ResponseField name="line_spacing_pt" type="number | null">
      Median line spacing in points. Null when detection failed.
    </ResponseField>
  </Expandable>
</ResponseField>
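
One way to surface `format_detection` to a user is a one-line summary that tolerates the nullable fields. This is an illustrative sketch only: `summarize_format` and its output wording are hypothetical, and it relies solely on the fields documented above.

```python
def summarize_format(fd: dict) -> str:
    """Return a one-line, human-readable summary of a format_detection object."""
    # When confidence is low, format checks are skipped, so report why
    if fd.get("confidence") == "low":
        reason = fd.get("warning") or fd.get("confidence_reason") or "no reason given"
        return f"format checks skipped: {reason}"

    parts = []
    if fd.get("font_family") is not None:
        parts.append(f"font {fd['font_family']}")
    if fd.get("line_spacing_pt") is not None:
        parts.append(f"line spacing {fd['line_spacing_pt']} pt")
    margins = fd.get("margins_per_side")
    if margins is not None:
        sides = ", ".join(
            f"{side} {margins[side]} in" for side in ("top", "bottom", "left", "right")
        )
        parts.append(f"margins {sides}")
    return "; ".join(parts) or "no format data"


# Values copied from the response example on this page
sample = {
    "confidence": "high",
    "confidence_reason": "Font and margin data extracted successfully from PDF structure",
    "warning": None,
    "font_family": "TimesNewRoman",
    "font_name_raw": "ABCDEF+TimesNewRomanPSMT",
    "margins_per_side": {"top": 1.0, "bottom": 1.0, "left": 1.0, "right": 1.0},
    "line_spacing_pt": 24.0,
}
print(summarize_format(sample))
```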

<ResponseField name="warnings" type="string[]">
  Conditions that may affect result quality. Present only when something noteworthy occurred during
  analysis (e.g., format extraction failed, document was truncated).
</ResponseField>
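
Warnings can appear both in the top-level `warnings` array and in `format_detection.warning`, so a client may want to gather them in one place. A minimal sketch, assuming both fields may be absent or null; `collect_warnings` is a hypothetical helper, and the sample warning text is made up for illustration.

```python
def collect_warnings(result: dict) -> list:
    """Gather warning strings from a /classify response into one list."""
    # The top-level array is present only when something noteworthy occurred
    found = list(result.get("warnings") or [])
    # format_detection.warning is a single string or null
    fmt_warning = (result.get("format_detection") or {}).get("warning")
    if fmt_warning and fmt_warning not in found:
        found.append(fmt_warning)
    return found


# Hypothetical response fragment; the warning text is invented for this example
result = {
    "format_detection": {"confidence": "low", "warning": "Format extraction timed out"},
}
print(collect_warnings(result))
```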

## Error responses

| Code  | Meaning                                                            |
| ----- | ------------------------------------------------------------------ |
| `400` | Invalid request (missing `pdf_base64`, invalid `judge_slug`, etc.) |
| `403` | District not accessible                                            |
| `404` | Judge not found in the specified district                          |
| `429` | Rate limit exceeded (5 requests per minute)                        |
| `502` | Classification failed (timeout, upstream error)                    |
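
A client may want to retry some of these codes. The sketch below is one possible policy, not documented API behavior: treating `429` and `502` as retryable is an assumption, and the 12-second base delay is simply 60 seconds divided by the 5 requests per minute limit. `classify_with_retry` and the `send` callable are hypothetical names; `send` performs one request and returns a `(status_code, body)` tuple, which keeps the retry logic independent of any HTTP library.

```python
import time
from typing import Callable, Tuple

# Assumption: rate limits and upstream failures may be transient; 4xx client
# errors such as 400, 403, and 404 are returned to the caller immediately.
RETRYABLE = {429, 502}


def classify_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,
    base_delay: float = 12.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts:
            break
        # Linear backoff: wait longer after each failed attempt
        sleep(base_delay * attempt)
    return status, body


# Demonstration with canned responses instead of real HTTP calls
responses = iter([(429, {}), (200, {"classification": {"supported": True}})])
delays = []
status, body = classify_with_retry(lambda: next(responses), sleep=delays.append)
print(status, delays)
```

Injecting `sleep` makes the policy easy to test without waiting; in production code the default `time.sleep` applies.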

## Usage pattern

The typical supported-document flow is classify, then check:

```
POST /classify  →  get check_request  →  POST /check
```

When present, the `check_request` field contains everything the classifier can extract from the PDF. Before sending it to [`/check`](/api-reference/check), add the filing context fields that only the filer knows: `is_pro_se`, `pmc_completed`, `opposing_party_pro_se`, and `filing_role`. When `supported` is `false`, show the classification reasoning to the user and do not call `/check`.
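
The flow above can be captured in a helper that returns a complete `/check` body for supported documents and `None` otherwise. This is a sketch under the field names documented on this page; `build_check_body` is a hypothetical name, and the sample responses are trimmed for brevity.

```python
from typing import Optional


def build_check_body(
    result: dict,
    *,
    is_pro_se: bool,
    pmc_completed: bool,
    opposing_party_pro_se: bool,
    filing_role: str,
) -> Optional[dict]:
    """Merge filer-supplied context into check_request, or return None."""
    # check_request is present only when the document is supported
    if not result["classification"]["supported"]:
        return None
    return {
        **result["check_request"],
        "is_pro_se": is_pro_se,
        "pmc_completed": pmc_completed,
        "opposing_party_pro_se": opposing_party_pro_se,
        "filing_role": filing_role,
    }


# Trimmed sample responses for illustration
supported = {
    "classification": {"supported": True, "reasoning": "Titled as a memorandum of law."},
    "check_request": {"judge_slug": "carol-bagley-amon", "district_id": "edny"},
}
unsupported = {
    "classification": {"supported": False, "reasoning": "Blank template."},
}

context = dict(
    is_pro_se=False,
    pmc_completed=True,
    opposing_party_pro_se=False,
    filing_role="movant",
)
check_body = build_check_body(supported, **context)
skipped = build_check_body(unsupported, **context)
print(check_body is not None, skipped is None)
```

Returning `None` forces the caller to handle the unsupported case explicitly, for example by showing `classification.reasoning` to the user instead of calling `/check`.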
