Sensible’s new classification API automates document identification, labeling, and routing without needing any additional configuration.
Previously, you had to specify a document type in order to send an extraction API call, which could cause bottlenecks in the extraction workflow. The classification API intelligently matches your uploaded document to the document types and reference documents already stored in your Sensible account, removing the need to determine the document type ahead of time.
Using the classification API
First, POST your document to the /classify endpoint.
curl --request POST \
--url https://api.sensible.so/v0/classify \
--header 'accept: application/json' \
--header 'authorization: Bearer REPLACE_WITH_SENSIBLE_TOKEN' \
--header 'content-type: application/pdf' \
--data-binary '@REPLACE_WITH_LOCAL_PATH_TO_DOCUMENT'
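If you'd rather call the endpoint from code, the same request can be sketched in Python with only the standard library. This is a minimal sketch rather than an official client: the URL and headers are copied from the curl command above, while the function names `build_classify_request` and `classify` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
CLASSIFY_URL = "https://api.sensible.so/v0/classify"


def build_classify_request(token: str, pdf_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST request for the /classify endpoint."""
    return urllib.request.Request(
        CLASSIFY_URL,
        data=pdf_bytes,  # the raw PDF bytes are the request body
        headers={
            "accept": "application/json",
            "authorization": f"Bearer {token}",
            "content-type": "application/pdf",
        },
        method="POST",
    )


def classify(token: str, path: str) -> dict:
    """Read a local PDF, send it to /classify, and return the parsed JSON."""
    with open(path, "rb") as pdf_file:
        request = build_classify_request(token, pdf_file.read())
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping the request construction separate from the network call makes it easy to inspect or log the request before sending it.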
Sensible matches the document to one of the document types you’ve already defined in your Sensible account, and to the reference documents for that document type.
Then, Sensible returns the classification results, including similarity scores for each document type in your account, and for each reference document within the highest-scoring document type.
{
  "document_type": {
    "id": "65170934-59e9-4fb8-9b42-2b4h84d27490",
    "name": "home_policy_declaration_pages",
    "score": 0.7496934783865323
  },
  "reference_documents": [{
    "id": "73d2f0b3-7dd3-4bc6-9920-c4674fkl35642",
    "name": "prudential_home_declaration_page_sample",
    "score": 0.7496934783865323
  }],
  "classification_summary": [{
    "id": "c33646c1-7638-4086-a4e8-grfdb32164d5",
    "name": "mortgage_applications",
    "score": 0.6200881986672334
  }, {
    "id": "4485c242-ec7e-4b73-8d14-63589dwe9854",
    "name": "tax_forms",
    "score": 0.5174036153924795
  }, {
    "id": "f214dfeb-0005-4aa2-909f-9l9k34a4b698",
    "name": "bank_statements",
    "score": 0.4883282718837503
  }]
}
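To make the response shape concrete, here is a small Python sketch that reads the fields shown above. It assumes the response looks like the sample (the `id` fields are omitted for brevity), and `summarize_classification` is a hypothetical helper, not part of Sensible's API. The "margin" it computes, the gap between the winning type and the next-closest one, is an illustrative extra rather than something the API returns.

```python
import json

# Abbreviated copy of the sample response above (ids omitted).
SAMPLE_RESPONSE = """
{
  "document_type": {"name": "home_policy_declaration_pages", "score": 0.7496934783865323},
  "reference_documents": [
    {"name": "prudential_home_declaration_page_sample", "score": 0.7496934783865323}
  ],
  "classification_summary": [
    {"name": "mortgage_applications", "score": 0.6200881986672334},
    {"name": "tax_forms", "score": 0.5174036153924795},
    {"name": "bank_statements", "score": 0.4883282718837503}
  ]
}
"""


def summarize_classification(response: dict) -> dict:
    """Pull out the winning type, the runner-up, and the gap between them."""
    best = response["document_type"]
    # Sort the other document types from most to least similar.
    others = sorted(
        response.get("classification_summary", []),
        key=lambda doc_type: doc_type["score"],
        reverse=True,
    )
    runner_up = others[0] if others else None
    # The reference document that most resembles the uploaded document.
    closest = max(
        response.get("reference_documents", []),
        key=lambda ref: ref["score"],
        default=None,
    )
    return {
        "best_type": best["name"],
        "best_score": best["score"],
        "runner_up": runner_up["name"] if runner_up else None,
        "margin": best["score"] - runner_up["score"] if runner_up else None,
        "closest_reference": closest["name"] if closest else None,
    }


summary = summarize_classification(json.loads(SAMPLE_RESPONSE))
print(summary["best_type"], summary["runner_up"])
```

A small margin suggests the document resembles more than one type, which may be worth a manual check before acting on the result.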
With the document type now identified, you can use this information in several different ways, including:
- In an extraction workflow
  - Not all uploaded documents contain information relevant to your workflow, and processing them wastes time and API calls. With the classification API, determine the correct documents to extract data from, then make a request to one of Sensible’s extraction APIs.
  - In practice: A shipping logistics company may require its clients to self-upload customs forms for their shipments. Using Sensible’s classification API, the company can screen uploaded documents to ensure they are the correct forms, and flag incorrect uploads on the client’s end.
- Outside an extraction workflow
  - Whether document extraction needs to happen today, tomorrow, or never, the classification API’s response can be saved to a system of record or used to automate document delivery to the correct workflow or endpoint.
  - In practice: To process loan applications, banks need to review their customers’ supporting documents. Some of these documents may require verification or review by multiple teams. Using Sensible’s classification API, the bank ensures that documents are routed correctly to process loan applications quickly and efficiently.
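Both patterns above come down to a routing decision on the classification response. The following Python sketch shows one way that decision might look. Everything specific in it is an assumption for illustration: the `MIN_SCORE` threshold, the `ROUTES` table, and the destination names are hypothetical, and a suitable threshold would depend on your own documents.

```python
# Hypothetical threshold: below this score, treat the upload as unrecognized.
MIN_SCORE = 0.7

# Hypothetical routing table: document type name -> next step in your system.
ROUTES = {
    "home_policy_declaration_pages": "extract",
    "mortgage_applications": "loan_review_team",
    "tax_forms": "archive_only",
}


def route_document(response: dict, routes: dict = ROUTES, min_score: float = MIN_SCORE) -> str:
    """Decide what to do with a document based on its classification response."""
    doc_type = response["document_type"]
    # Screen out low-confidence matches so they don't consume extraction calls.
    if doc_type["score"] < min_score:
        return "flag_for_review"
    # Types without a configured route are also flagged rather than guessed.
    return routes.get(doc_type["name"], "flag_for_review")


decision = route_document(
    {"document_type": {"name": "home_policy_declaration_pages", "score": 0.7496934783865323}}
)
print(decision)
```

Only documents routed to "extract" would then be sent on to an extraction API; the rest can be flagged back to the uploader or delivered to another team.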
How it works
Like our question and list extraction methods, the classification API calculates OpenAI embeddings over your query document and compares them against the reference documents in each document type in your Sensible account. We average the similarity scores across all reference documents in each document type and return the document type with the highest average score. We also return the individual similarity scores for all reference documents in the selected document type.
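The averaging step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Sensible's implementation: it uses cosine similarity as the similarity measure (the exact measure isn't specified here), and tiny hand-written vectors stand in for real OpenAI embeddings. The function and variable names are hypothetical.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def classify_by_average_similarity(query: list, reference_embeddings: dict) -> dict:
    """Average per-reference scores within each type and pick the best type."""
    averages = {}
    per_reference = {}
    for doc_type, references in reference_embeddings.items():
        scores = {name: cosine_similarity(query, vec) for name, vec in references.items()}
        per_reference[doc_type] = scores
        averages[doc_type] = sum(scores.values()) / len(scores)
    best_type = max(averages, key=averages.get)
    return {
        "document_type": best_type,
        "score": averages[best_type],
        # Individual scores are reported only for the winning type.
        "reference_documents": per_reference[best_type],
        "classification_summary": averages,
    }


# Toy 3-dimensional "embeddings"; real embeddings have far more dimensions.
toy_references = {
    "tax_forms": {"sample_a": [1.0, 0.0, 0.0], "sample_b": [0.0, 1.0, 0.0]},
    "bank_statements": {"sample_c": [1.0, 1.0, 0.0]},
}
result = classify_by_average_similarity([1.0, 0.0, 0.0], toy_references)
print(result["document_type"], round(result["score"], 4))
```

Note that averaging means a single very close reference document can be outweighed by dissimilar ones in the same type, as with `tax_forms` in the toy data.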
The classification API simplifies document preprocessing by removing the guesswork and manual selection from the extraction process. Simply upload your document and start extracting sooner.
Ready to start classifying your documents? Try the classification API today, or request a demo from a Sensible expert.