PDF Processing Configuration Strategies
Overview
Wabee AI offers three strategies for splitting PDF files uploaded to an agent in the chat window, so users can tune how their agents process and analyze document content. This document describes each strategy and the use cases it suits.
Where to Configure
The file split strategy is configured in the agent settings when you update an agent's configuration. Set it to one of three options: Semantic, Title-based (the default), or Form-based.

Available Strategies
1. Semantic Chunking
Strategy Name: semantic
Description
Semantic chunking divides the document into coherent sections based on the semantic meaning of the content. This strategy uses advanced natural language processing techniques to identify logical breaks in the text.
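The idea of breaking text where the meaning shifts can be illustrated with a minimal sketch. It is not Wabee's implementation, which this document does not describe. As a stand-in for real semantic similarity (normally computed with embeddings), it uses plain word overlap between neighboring sentences; the function names and the threshold value are assumptions chosen for illustration.

```python
import re


def _words(sentence: str) -> set[str]:
    """Lowercased word set used as a crude proxy for meaning."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Word-overlap similarity in [0, 1]; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def semantic_chunks(sentences: list[str], threshold: float = 0.1) -> list[list[str]]:
    """Group consecutive sentences; start a new chunk when overlap
    with the previous sentence falls below the (hypothetical) threshold."""
    chunks: list[list[str]] = []
    for sentence in sentences:
        if chunks and jaccard(_words(chunks[-1][-1]), _words(sentence)) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return chunks


sentences = [
    "The contract term is five years.",
    "The contract term may be renewed.",
    "Payment is due monthly.",
]
chunks = semantic_chunks(sentences)
```

Here the first two sentences share vocabulary and stay together, while the third starts a new chunk. A production system would likely replace `jaccard` with an embedding-based similarity.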
Use Cases
- Long-form articles or reports
- Academic papers
- Legal documents
Benefits
- Preserves context within chunks
- Improves relevance of information retrieval
- Enhances the quality of AI-generated responses
2. Form-based Chunking
Strategy Name: form
Description
Form-based chunking is designed specifically for documents with structured layouts, such as forms, invoices, or templated reports. This strategy identifies and extracts information based on the visual structure of the document.
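To give a feel for the field-value output such a strategy aims at, here is a simplified sketch. It does not analyze visual layout as the strategy described above does; it only pairs labels and values on text lines of the assumed form "Label: value". The regular expression and the function name are hypothetical.

```python
import re

# Assumption: each field appears on its own line as "Label: value".
FIELD = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def extract_fields(lines: list[str]) -> dict[str, str]:
    """Collect label/value pairs from lines that look like form fields.
    Lines without a colon are ignored."""
    fields: dict[str, str] = {}
    for line in lines:
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


invoice_lines = [
    "Invoice Number: INV-001",
    "Total: 42.50 USD",
    "Thank you for your business",
]
fields = extract_fields(invoice_lines)
```

Real forms and invoices often place labels and values in separate columns or table cells, which is why a layout-aware strategy is needed in practice.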
Use Cases
- Invoices and receipts
- Application forms
- Structured reports with consistent layouts
Benefits
- Accurately captures field-value pairs
- Preserves tabular data structure
- Ideal for documents with repetitive layouts
3. Title-based Chunking (Default)
Strategy Name: title (default)
Description
Title-based chunking splits the document into sections based on identified titles or headings. This strategy is effective for well-structured documents with clear section demarcations.
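The splitting step can be sketched as follows. This is an illustration only, not Wabee's actual heading detection: it assumes headings are numbered lines such as "1. Installation" or "2.1 Setup", and the pattern and function name are hypothetical.

```python
import re

# Assumption: a heading is a line starting with a section number,
# e.g. "1. Installation" or "2.1 Setup".
HEADING = re.compile(r"^\d+(\.\d+)*\.?\s+\S")


def title_chunks(lines: list[str]) -> list[dict]:
    """Split lines into chunks, one per detected heading.
    Text before the first heading becomes a chunk with title None."""
    chunks: list[dict] = []
    current = {"title": None, "body": []}
    for line in lines:
        if HEADING.match(line):
            # Close the previous chunk if it holds anything.
            if current["title"] is not None or current["body"]:
                chunks.append(current)
            current = {"title": line.strip(), "body": []}
        elif line.strip():
            current["body"].append(line.strip())
    if current["title"] is not None or current["body"]:
        chunks.append(current)
    return chunks


doc = [
    "Preface text",
    "1. Installation",
    "Run the installer.",
    "",
    "2. Usage",
    "Open the app.",
]
chunks = title_chunks(doc)
```

Each chunk keeps its heading alongside its body text, which is what lets retrieval results point back to a named section.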
Use Cases
- Technical documentation
- Business reports
- User manuals
Benefits
- Maintains document structure
- Facilitates easy navigation of content
- Suitable for a wide range of document types
Choosing the Right Strategy
When creating or updating an agent, consider the following factors to select the most appropriate file split strategy:
- Document Type: Match the strategy to the typical structure of your documents.
- Content Complexity: For documents with varied content, semantic chunking might be more effective.
- Information Extraction Needs: If you need to extract specific fields, form-based chunking could be ideal.
- Processing Speed: Title-based chunking is generally faster than the other strategies and suits most use cases.
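The factors above can be condensed into a small decision helper. The strategy names ("semantic", "form", "title") come from this document; the function, its parameters, and the order in which the factors are checked are assumptions made for illustration and are not part of any Wabee API.

```python
VALID_STRATEGIES = ("semantic", "form", "title")
DEFAULT_STRATEGY = "title"  # documented default


def choose_strategy(needs_field_extraction: bool = False,
                    varied_content: bool = False) -> str:
    """Suggest a file split strategy from two hypothetical yes/no questions."""
    if needs_field_extraction:
        # Specific fields to pull out (invoices, application forms).
        return "form"
    if varied_content:
        # Mixed or complex prose without reliable headings.
        return "semantic"
    # Well-structured documents, or when processing speed matters most.
    return DEFAULT_STRATEGY


suggestion = choose_strategy(varied_content=True)
```

The helper returns a suggestion only; the chosen name still has to be set in the agent settings.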