- Automatic browser geolocation capture on event creation - Reverse geocoding via Nominatim API for place names - Full-text search with SQLite FTS5 - Calendar view for browsing past entries - DateNavigator component for day navigation - SearchModal with Ctrl+K shortcut - QuickAddWidget with Ctrl+J shortcut - Starlight documentation site with GitHub Pages deployment - Multiple AI provider support (Groq, OpenAI, Anthropic, Ollama, LM Studio) - Multi-user registration support BREAKING: Events now include latitude/longitude/placeName fields
8.9 KiB
8.9 KiB
Full-Text Search Feature Research
Overview
This document outlines research and implementation ideas for adding full-text search to DearDiary, an AI-powered daily journaling app.
1. Feature Description
Core Functionality
- Search across diary content (journal titles and bodies)
- Search across raw event content
- Filter by date range
- Sort by relevance (BM25) or date
- Real-time instant search as user types
User Stories
- Quick Recall: User types "meeting with Sarah" → sees matching diary entries and events
- Date-based Search: User searches "vacation" → filters to summer 2024 entries
- Deep Search: User searches for specific phrase → finds exact match in event content
2. Technical Approach
Option A: SQLite FTS5 (Recommended for v1)
Pros:
- Zero external dependencies
- Built into SQLite (already in use)
- BM25 ranking built-in
- Real-time indexing (update on insert)
- Lowest implementation complexity
- No additional infrastructure
Cons:
- No typo tolerance (unless using trigram/token helpers)
- Limited to SQLite (migration cost if switching DB)
- Single-node only (fine for self-hosted)
Implementation:
-- FTS5 virtual table for journals
CREATE VIRTUAL TABLE journal_fts USING fts5(
title,
content,
content_rowid='rowid',
tokenize='porter unicode61'
);
-- FTS5 virtual table for events
CREATE VIRTUAL TABLE event_fts USING fts5(
content,
type,
content_rowid='rowid',
tokenize='porter unicode61'
);
-- Triggers to keep FTS in sync
CREATE TRIGGER journal_ai AFTER INSERT ON Journal BEGIN
INSERT INTO journal_fts(rowid, title, content)
VALUES (NEW.rowid, NEW.title, NEW.content);
END;
Performance: FTS5 handles 100k+ rows easily on SQLite. For typical personal journaling (10 years = ~3650 entries, ~10k events), performance will be sub-100ms.
Option B: External Search (Typesense/Meilisearch)
Pros:
- Typo tolerance (fuzzy search)
- Better ranking algorithms
- Scalable to millions of records
- REST API, language-agnostic
Cons:
- Additional infrastructure (Docker service)
- Sync complexity (real-time indexing)
- More complex setup for self-hosted users
- Resource overhead (CPU/RAM for search service)
Recommendation: Defer to v2. External search only becomes necessary when:
- User wants fuzzy/typo-tolerant search
- Dataset exceeds 500k+ records
- Multi-language support needed
3. Indexing Strategy
Fields to Index
| Table | Field | Indexed | Reason |
|---|---|---|---|
| Journal | title | Yes | Primary search target |
| Journal | content | Yes | Full diary text |
| Journal | date | Yes | Filtering |
| Event | content | Yes | Raw event text |
| Event | type | Yes | Filter by event type |
| Event | date | Yes | Date filtering |
What NOT to Index
Event.metadata- JSON blob, search within JSON handled separately if neededEvent.mediaPath- File paths, not searchable contentUserfields - Not needed for user-facing search
Sync Strategy
- On Insert/Update: Write to main table, then update FTS via trigger
- On Delete: FTS trigger removes from index
- Reindex: Manual endpoint for recovery/debugging
4. Database Schema Changes
Prisma Schema Addition
// Optional: Search history for "recent searches" feature
model SearchHistory {
id String @id @default(uuid())
userId String
query String
createdAt DateTime @default(now())
@@index([userId, createdAt])
}
Note: FTS5 tables are virtual and managed via raw SQL, not Prisma models. We'll use prisma.$executeRaw for FTS operations.
Migration Steps
- Create FTS5 virtual tables (raw SQL)
- Create triggers for auto-sync
- Backfill existing data
- Add SearchHistory model (optional)
5. API Changes
New Endpoints
GET /api/v1/search?q=<query>&type=diary|event|all&from=2024-01-01&to=2024-12-31&sort=relevance|date&page=1&limit=20
Response:
interface SearchResult {
type: 'diary' | 'event';
id: string;
date: string;
title?: string; // For diaries
content: string; // Truncated/preview
highlight?: string; // Matched text with <mark> tags
score: number; // BM25 relevance
}
interface SearchResponse {
data: {
results: SearchResult[];
total: number;
page: number;
limit: number;
} | null;
error: null;
}
Optional Endpoints
GET /api/v1/search/history // Recent searches
DELETE /api/v1/search/history // Clear history
POST /api/v1/search/reindex // Force reindex (admin)
6. UI/UX Considerations
Search Modal
- Trigger: Cmd/Ctrl+K keyboard shortcut (standard pattern)
- Position: Centered modal with overlay
- Features:
- Instant search as you type (debounced 150ms)
- Filter tabs: All | Diaries | Events
- Date range picker (quick presets: Today, This Week, This Month, This Year)
- Results show date, type, preview with highlighted matches
Sidebar (Alternative)
- Persistent search box in navigation
- Results in scrollable list below
- Less intrusive, always visible
Result Cards
┌─────────────────────────────────────────┐
│ 📅 2024-03-15 [Diary] │
│ Meeting with Sarah about project... │
│ ─────────────────────────────────────── │
│ ...discussed timeline and <mark>budget</mark>... │
└─────────────────────────────────────────┘
UX Details
- Empty state: Show recent diaries/events when no query
- No results: Friendly message + suggestions
- Loading: Subtle spinner (search should be <100ms)
- Keyboard: Arrow keys to navigate results, Enter to open
Mobile Considerations
- Tap search icon in header → full-screen search
- Larger touch targets for filters
7. Performance Considerations
Query Performance
- FTS5 BM25 queries: ~50-100ms for 10k records
- Add LIMIT to prevent unbounded results
- Use connection pooling if many concurrent searches
Write Performance
- Triggers add ~5-10ms per insert/update
- Batch backfill for existing data (1000 rows/batch)
Caching Strategy
- Cache recent searches (Redis optional, or in-memory)
- Cache FTS index in memory (SQLite mmap)
Scaling Thresholds
- < 10k entries: No optimization needed
- 10k-100k: Consider FTS5 optimization (tokenizer, prefix search)
-
100k: Consider external search
8. Implementation Complexity
Complexity Assessment: MEDIUM
| Component | Complexity | Notes |
|---|---|---|
| FTS5 setup | Low | Raw SQL, one-time |
| Triggers | Low | Auto-sync, minimal code |
| API endpoint | Low | Standard CRUD pattern |
| Frontend modal | Medium | Keyboard shortcuts, state |
| Filters/Date | Medium | Multiple filter combinations |
| Backfill | Low | One-time script |
Phased Implementation
Phase 1 (MVP - 2-3 days)
- FTS5 tables + triggers
- Basic search API
- Simple modal UI with text input
Phase 2 (Enhancements - 1-2 days)
- Date filtering
- Type filtering (diary/event)
- Result highlighting
Phase 3 (Polish - 1 day)
- Search history
- Keyboard navigation
- Mobile responsive
9. Priority Recommendation
Recommended Priority: MEDIUM-HIGH
Rationale:
- Search is a core journaling feature (user wants to find past entries)
- Competitor apps (Day One, Journey) have robust search
- Implementation complexity is manageable (medium)
- Zero external dependencies (SQLite FTS5)
Factors Supporting High Priority
- User Value: High - helps users find meaningful memories
- Implementation Cost: Medium - achievable in 1 week
- Dependency Risk: Low - no external services needed
- Future-proofing: FTS5 is mature, well-supported
Factors Against Very High Priority
- Current core features (capture, generate) are stable
- Small dataset users may not notice missing search
- Can be added post-MVP without breaking changes
10. Open Questions / Further Research
- Typo tolerance: Is exact match sufficient, or do users expect fuzzy search?
- Search ranking: Should recent results be boosted higher?
- Multi-language: Support languages other than English (tokenizer considerations)
- Export/Import: Should search index be rebuilt on data import?
- Shared access: Multi-user search (future consideration)
11. Summary
| Aspect | Recommendation |
|---|---|
| Search Engine | SQLite FTS5 (built-in) |
| UI Pattern | Cmd/Ctrl+K modal |
| Features | Instant search, date filter, type filter, relevance sort |
| Complexity | Medium (3-5 days) |
| Priority | Medium-High |
| Schema Changes | FTS5 via raw SQL + optional SearchHistory model |
| API Changes | New /search endpoint with query params |