Changelog
Stay up to date with the latest changes and improvements to Tensorlake
Table Recognition now parses ~1,500-cell tables (with structure preserved)
A new model is live. It reliably extracts very large, dense tables from PDFs (including scans) while preserving header hierarchy, row/column spans, and cell boundaries, with fast HTML/CSV export and bounding boxes (bbox) for citations.
- Robust on ~1,500-cell tables; resilient to complex layouts and scanned documents.
- Preserves header hierarchy and row/column spans; faithful HTML outputs.
- Improved cell boundary detection and multi-row/multi-col header parsing.
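To make "row/column spans preserved in HTML, with CSV export" concrete, here is a minimal sketch of how a spanned HTML table can be flattened into a rectangular grid and written as CSV. It uses only the Python standard library and is not Tensorlake's implementation or API. The class and function names (`TableGridParser`, `html_table_to_grid`, `grid_to_csv`) and the sample table are hypothetical, and the sketch assumes a single, non-nested table.

```python
import csv
import io
from html.parser import HTMLParser


class TableGridParser(HTMLParser):
    """Collect <td>/<th> cells and their spans from one non-nested HTML table."""

    def __init__(self):
        super().__init__()
        self.rows = []      # each row is a list of (text, rowspan, colspan)
        self._cell = None   # [text_parts, rowspan, colspan] while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            a = dict(attrs)
            self._cell = [[], int(a.get("rowspan") or 1), int(a.get("colspan") or 1)]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            parts, rowspan, colspan = self._cell
            if not self.rows:           # tolerate a cell without an enclosing <tr>
                self.rows.append([])
            self.rows[-1].append(("".join(parts).strip(), rowspan, colspan))
            self._cell = None


def html_table_to_grid(html):
    """Expand spans so every grid slot holds the text of the cell covering it."""
    parser = TableGridParser()
    parser.feed(html)
    grid = {}  # (row, col) -> text
    for r, cells in enumerate(parser.rows):
        c = 0
        for text, rowspan, colspan in cells:
            while (r, c) in grid:       # skip slots claimed by a rowspan from above
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


def grid_to_csv(grid):
    """Serialize the rectangular grid as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


# Hypothetical sample: a two-level header with one rowspan and one colspan.
SAMPLE_HTML = """
<table>
  <tr><th rowspan="2">Region</th><th colspan="2">Revenue</th></tr>
  <tr><th>Q1</th><th>Q2</th></tr>
  <tr><td>North</td><td>10</td><td>12</td></tr>
</table>
"""
```

Calling `grid_to_csv(html_table_to_grid(SAMPLE_HTML))` repeats each spanned header in every slot it covers, so the two header rows become `Region,Revenue,Revenue` and `Region,Q1,Q2`. Repeating the text is one design choice among several; leaving covered slots empty is equally valid if downstream code tracks spans separately.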
DocumentAI API v2
V2 of the DocumentAI API is fully in production in the Python SDK and on the Playground, offering unified document processing with advanced structured extraction, page classification, and enrichment capabilities.
- Unified Parse and Jobs API
- Advanced Structured Extraction with JSON Schema
- Page Classification and Signature Detection
Advanced Schema Extraction
Extract structured data from any document using Pydantic schemas, with improved accuracy and multi-format support.
- Research paper metadata extraction
- Pydantic schema support
- Multi-format document support
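As an illustration of what a Pydantic schema for research paper metadata might look like, the sketch below defines a small model and derives its JSON Schema. It assumes Pydantic v2 (`model_json_schema`, `model_validate`). The model name `PaperMetadata` and its fields are hypothetical, and the sketch deliberately does not show any Tensorlake SDK call, since how a schema is passed to the API is not described here.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class PaperMetadata(BaseModel):
    """Hypothetical extraction target for a research paper (field names are illustrative)."""

    title: str = Field(description="Full title of the paper")
    authors: List[str] = Field(description="Author names in byline order")
    year: Optional[int] = Field(default=None, description="Publication year, if stated")
    doi: Optional[str] = Field(default=None, description="DOI, if present in the document")


# A Pydantic model can be exported as a JSON Schema dictionary.
schema = PaperMetadata.model_json_schema()

# The same model validates a candidate extraction result (made-up values).
record = PaperMetadata.model_validate(
    {"title": "An Example Paper", "authors": ["A. Author", "B. Author"], "year": 2024}
)
```

Fields without a default (`title`, `authors`) are listed as required in the generated schema, while `year` and `doi` are optional and default to `None` when the document does not state them.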