πŸ“š DocChat User Guide

Your Intelligent Document Chat Assistant

πŸ“‘ Table of Contents

1️⃣ Overview & How It Works

What is DocChat?

DocChat is an intelligent document chat assistant powered by advanced AI technology. It allows you to have natural conversations about your documents, asking questions and receiving accurate, cited answers based on your document collection.

How Does It Work?

DocChat uses a technology called Retrieval-Augmented Generation (RAG):

  1. Document Indexing: Your documents are processed and stored in a searchable format with intelligent chunking and embedding
  2. Multi-Language Query Expansion: Your question is automatically expanded into multiple query variants across different languages (English, Dutch, French)
  3. Intelligent Search: The system searches your documents using advanced similarity matching to find the most relevant information
  4. AI Response Generation: A large language model (LLM) reads the relevant document sections and generates a comprehensive answer in your selected language
  5. Citation & References: All answers include references to specific document sections [Block X] so you can verify the information
πŸ’‘ Key Benefit: Unlike generic AI chatbots, DocChat answers are always grounded in your actual documents, making responses accurate, verifiable, and trustworthy.

Supported Document Types

2️⃣ Getting Started

Step 1: Index Your Documents

Before asking questions, your documents need to be indexed:

  1. Click the "πŸ”„ Index Documents" button in the sidebar
  2. Wait for the indexing process to complete (progress bar will show status)
  3. Once complete, the document count will update at the top
βœ… Tip: Documents are automatically indexed on application startup if configured. You only need to re-index when new documents are added.

Step 2: Upload Additional Documents (Optional)

To add individual documents:

  1. Click "Choose File" under "Upload Document"
  2. Select your file
  3. Click "Upload"
  4. The document will be automatically indexed

Step 3: Select Your Settings

Step 4: Ask Your Question

Type your question in the chat input and press Enter or click Send!

3️⃣ Operating Modes

DocChat offers three operating modes, each optimized for different use cases:

🎯 Basic RAG

Best for: Quick questions

Speed: Fast (10-30 seconds)

How it works: Searches for relevant chunks and generates an answer based on the top matches.

Use when: You need quick answers to specific questions

πŸ“š Extensive

Best for: Detailed analysis

Speed: Medium (1-3 minutes)

How it works: Retrieves full documents, preprocesses them with a small LLM, then generates comprehensive answers.

Use when: You need thorough, detailed information from multiple documents

πŸ“– Full Reading

Best for: Complete overview

Speed: Slow (5-10 minutes)

How it works: Reads ALL documents in your collection (or selected sources) and provides a comprehensive synthesis.

Use when: You need a complete understanding across your entire document collection

⚠️ Mode Selection Tip: Start with Basic RAG for most questions. Use Extensive when Basic doesn't provide enough detail. Reserve Full Reading for rare cases when you need a complete overview.

Comparing Modes

Feature Basic RAG Extensive Full Reading
Speed ⚑ Fast ⚑⚑ Medium ⚑⚑⚑ Slow
Documents Searched Top 5-15 chunks Top 10-20 docs ALL documents
Detail Level Focused Detailed Comprehensive
Best Use Case Quick facts Analysis Overview

4️⃣ UI Features & Options

Settings Section

πŸ€– LLM Model

Choose the AI model for generating responses:

⚠️ Note: GPT-5 models (GPT-5, GPT-5 Mini, GPT-5 Pro) are experimental and may occasionally return empty responses. Use GPT-4o for production work.

🌍 Output Language

Select the language for AI responses:

βœ… Use Reranking

When enabled, search results are reranked using a cross-encoder model for improved relevance. Recommended: Keep enabled for better accuracy.

πŸ“ Include Chat History

When enabled, the AI remembers previous messages in the conversation and can answer follow-up questions with context. Recommended: Keep enabled for natural conversations.

πŸ” Enable Web Search Augmentation

When enabled, if no relevant documents are found, the system will search the web for additional information. Use with caution as web results are not from your document collection.

πŸ“„ Full Document Mode

Skip preprocessing step and send full documents to the main LLM. Useful when you want complete document content without summarization.

🎯 Reference Relevance Filter

When enabled, filters out low-relevance document references based on the threshold score. Helps focus on the most relevant sources.

πŸ“Š Top K

Number of document chunks/documents to retrieve. Higher values retrieve more information but may increase processing time. Default: 5 for Basic, 10 for Extensive.

Source Selection

πŸ—‚οΈ Select Sources

Restrict your search to specific documents or folders:

  1. Click "πŸ—‚οΈ Select Sources"
  2. Browse the document tree
  3. Check/uncheck documents or folders
  4. Use "Select All" or "Clear All" for quick selection
  5. Click "Apply" to confirm
βœ… Pro Tip: Selecting specific sources significantly improves speed and relevance when you know which documents contain your answer.

Manual Keywords

πŸ”Ž Manual Keywords

Add specific terms to enhance your search:

Example Usage:
Query: "What are the requirements?"
Keywords: calibration, equipment, validation
Result: System focuses on calibration-related requirements

Custom Instructions

πŸ“ Custom Instructions

Provide specific guidance for the AI:

5️⃣ Best Practices for Formulating Queries

βœ… Good Query Practices

1. Be Specific and Clear

❌ Vague:
"Tell me about safety"
βœ… Better:
"What are the safety procedures for handling chemical waste in the laboratory?"

2. Use Natural Language

Write questions as you would ask a colleague:

3. Provide Context When Needed

Example:
"In the context of biosafety level 2 procedures, what PPE is required for handling biological samples?"

4. Use Follow-Up Questions

With chat history enabled, you can ask follow-up questions:

First: "What are the waste disposal procedures?"
Then: "How often should waste containers be emptied?"
Then: "Who is responsible for this?"

5. Leverage Language Flexibility

Ask questions in any language - the system will search all documents:

🎯 Query Optimization Tips

Combine with Source Selection

If you know which documents contain the answer, select them first:

  1. Select relevant source documents/folders
  2. Ask your question
  3. Get faster, more focused results

Use Keywords for Technical Topics

For specialized domains, add technical keywords:

Query: "What are the requirements?"
Keywords: GLP, validation, ISO
Result: Focuses on GLP validation requirements

Choose the Right Mode

Question Type Recommended Mode
Simple fact: "What is the expiry date?" Basic RAG
Procedure: "How do I perform calibration?" Extensive
Overview: "Summarize all safety policies" Full Reading
Comparison: "Compare methods A and B" Extensive

⚠️ What to Avoid

6️⃣ Important Caveats & User Responsibility

⚠️ CRITICAL DISCLAIMER: DocChat is an AI-powered assistant tool. While it provides intelligent and helpful responses, users bear full responsibility for verifying and validating all information before making decisions or taking actions based on AI responses.

πŸ€– AI Limitations

1. AI Can Make Mistakes

Large Language Models (LLMs) can occasionally:

Your Responsibility: Always verify AI responses by checking the cited document blocks [Block X] and reviewing the original source documents.

2. Document Interpretation Limits

3. Language Translation Considerations

πŸ“‹ User Responsibilities

βœ… Required User Actions

  1. Verify Citations: Check the [Block X] references in responses
  2. Review Source Documents: Read the original documents for critical decisions
  3. Cross-Check Information: Validate important information from multiple sources
  4. Apply Domain Expertise: Use your professional judgment to evaluate AI responses
  5. Report Issues: If you notice inaccuracies, report them to improve the system

⚠️ Critical Use Cases

For high-stakes decisions involving:

ALWAYS review original source documents and consult with qualified professionals. Do not rely solely on AI-generated responses for critical decisions.

πŸ”’ Data Privacy & Security

Document Handling

Session Data

βš™οΈ System Limitations

Performance Considerations

Document Processing Limits

βœ… Best Practices for Responsible Use

The "Trust but Verify" Approach:

  1. Use DocChat to quickly find relevant information
  2. Check the cited block references [Block X]
  3. Review the original source documents
  4. Apply your professional expertise and judgment
  5. Validate critical information through proper channels

When to Use DocChat

When NOT to Rely Solely on DocChat

Remember: DocChat is a powerful assistant tool, not a replacement for human expertise, judgment, and verification. Use it to enhance your work, not to bypass necessary due diligence.

πŸ“ž Need Help?

For additional support or to report issues:

Version Information: DocChat uses state-of-the-art RAG technology with multi-language support, powered by OpenAI models (GPT-4o, GPT-4o Mini, GPT-5 experimental), Anthropic Claude Sonnet 4.5, and Google Gemini 2.0 Flash.

Β© 2025 DocChat - Intelligent Document Chat Assistant