feat: v0.1.0 - geolocation capture, calendar, search, Starlight docs site
- Automatic browser geolocation capture on event creation - Reverse geocoding via Nominatim API for place names - Full-text search with SQLite FTS5 - Calendar view for browsing past entries - DateNavigator component for day navigation - SearchModal with Ctrl+K shortcut - QuickAddWidget with Ctrl+J shortcut - Starlight documentation site with GitHub Pages deployment - Multiple AI provider support (Groq, OpenAI, Anthropic, Ollama, LM Studio) - Multi-user registration support BREAKING: Events now include latitude/longitude/placeName fields
This commit is contained in:
316
todo/search.md
Normal file
316
todo/search.md
Normal file
@@ -0,0 +1,316 @@
|
||||
# Full-Text Search Feature Research
|
||||
|
||||
## Overview
|
||||
|
||||
This document outlines research and implementation ideas for adding full-text search to DearDiary, an AI-powered daily journaling app.
|
||||
|
||||
---
|
||||
|
||||
## 1. Feature Description
|
||||
|
||||
### Core Functionality
|
||||
- Search across diary content (journal titles and bodies)
|
||||
- Search across raw event content
|
||||
- Filter by date range
|
||||
- Sort by relevance (BM25) or date
|
||||
- Real-time instant search as user types
|
||||
|
||||
### User Stories
|
||||
1. **Quick Recall**: User types "meeting with Sarah" → sees matching diary entries and events
|
||||
2. **Date-based Search**: User searches "vacation" → filters to summer 2024 entries
|
||||
3. **Deep Search**: User searches for specific phrase → finds exact match in event content
|
||||
|
||||
---
|
||||
|
||||
## 2. Technical Approach
|
||||
|
||||
### Option A: SQLite FTS5 (Recommended for v1)
|
||||
|
||||
**Pros:**
|
||||
- Zero external dependencies
|
||||
- Built into SQLite (already in use)
|
||||
- BM25 ranking built-in
|
||||
- Real-time indexing (update on insert)
|
||||
- Lowest implementation complexity
|
||||
- No additional infrastructure
|
||||
|
||||
**Cons:**
|
||||
- No typo tolerance (unless using trigram/token helpers)
|
||||
- Limited to SQLite (migration cost if switching DB)
|
||||
- Single-node only (fine for self-hosted)
|
||||
|
||||
**Implementation:**
|
||||
```sql
|
||||
-- FTS5 virtual table for journals
|
||||
CREATE VIRTUAL TABLE journal_fts USING fts5(
|
||||
title,
|
||||
content,
|
||||
content_rowid='rowid',
|
||||
tokenize='porter unicode61'
|
||||
);
|
||||
|
||||
-- FTS5 virtual table for events
|
||||
CREATE VIRTUAL TABLE event_fts USING fts5(
|
||||
content,
|
||||
type,
|
||||
content_rowid='rowid',
|
||||
tokenize='porter unicode61'
|
||||
);
|
||||
|
||||
-- Triggers to keep FTS in sync
|
||||
CREATE TRIGGER journal_ai AFTER INSERT ON Journal BEGIN
|
||||
INSERT INTO journal_fts(rowid, title, content)
|
||||
VALUES (NEW.rowid, NEW.title, NEW.content);
|
||||
END;
|
||||
```
|
||||
|
||||
**Performance:** FTS5 handles 100k+ rows easily on SQLite. For typical personal journaling (10 years = ~3650 entries, ~10k events), performance will be sub-100ms.
|
||||
|
||||
### Option B: External Search (Typesense/Meilisearch)
|
||||
|
||||
**Pros:**
|
||||
- Typo tolerance (fuzzy search)
|
||||
- Better ranking algorithms
|
||||
- Scalable to millions of records
|
||||
- REST API, language-agnostic
|
||||
|
||||
**Cons:**
|
||||
- Additional infrastructure (Docker service)
|
||||
- Sync complexity (real-time indexing)
|
||||
- More complex setup for self-hosted users
|
||||
- Resource overhead (CPU/RAM for search service)
|
||||
|
||||
**Recommendation:** Defer to v2. External search only becomes necessary when:
|
||||
- User wants fuzzy/typo-tolerant search
|
||||
- Dataset exceeds 500k+ records
|
||||
- Multi-language support needed
|
||||
|
||||
---
|
||||
|
||||
## 3. Indexing Strategy
|
||||
|
||||
### Fields to Index
|
||||
|
||||
| Table | Field | Indexed | Reason |
|
||||
|-------|-------|---------|--------|
|
||||
| Journal | title | Yes | Primary search target |
|
||||
| Journal | content | Yes | Full diary text |
|
||||
| Journal | date | Yes | Filtering |
|
||||
| Event | content | Yes | Raw event text |
|
||||
| Event | type | Yes | Filter by event type |
|
||||
| Event | date | Yes | Date filtering |
|
||||
|
||||
### What NOT to Index
|
||||
- `Event.metadata` - JSON blob, search within JSON handled separately if needed
|
||||
- `Event.mediaPath` - File paths, not searchable content
|
||||
- `User` fields - Not needed for user-facing search
|
||||
|
||||
### Sync Strategy
|
||||
1. **On Insert/Update**: Write to main table, then update FTS via trigger
|
||||
2. **On Delete**: FTS trigger removes from index
|
||||
3. **Reindex**: Manual endpoint for recovery/debugging
|
||||
|
||||
---
|
||||
|
||||
## 4. Database Schema Changes
|
||||
|
||||
### Prisma Schema Addition
|
||||
|
||||
```prisma
|
||||
// Optional: Search history for "recent searches" feature
|
||||
model SearchHistory {
|
||||
id String @id @default(uuid())
|
||||
userId String
|
||||
query String
|
||||
createdAt DateTime @default(now())
|
||||
|
||||
@@index([userId, createdAt])
|
||||
}
|
||||
```
|
||||
|
||||
Note: FTS5 tables are virtual and managed via raw SQL, not Prisma models. We'll use `prisma.$executeRaw` for FTS operations.
|
||||
|
||||
### Migration Steps
|
||||
1. Create FTS5 virtual tables (raw SQL)
|
||||
2. Create triggers for auto-sync
|
||||
3. Backfill existing data
|
||||
4. Add SearchHistory model (optional)
|
||||
|
||||
---
|
||||
|
||||
## 5. API Changes
|
||||
|
||||
### New Endpoints
|
||||
|
||||
```
|
||||
GET /api/v1/search?q=<query>&type=diary|event|all&from=2024-01-01&to=2024-12-31&sort=relevance|date&page=1&limit=20
|
||||
```
|
||||
|
||||
**Response:**
|
||||
```typescript
|
||||
interface SearchResult {
|
||||
type: 'diary' | 'event';
|
||||
id: string;
|
||||
date: string;
|
||||
title?: string; // For diaries
|
||||
content: string; // Truncated/preview
|
||||
highlight?: string; // Matched text with <mark> tags
|
||||
score: number; // BM25 relevance
|
||||
}
|
||||
|
||||
interface SearchResponse {
|
||||
data: {
|
||||
results: SearchResult[];
|
||||
total: number;
|
||||
page: number;
|
||||
limit: number;
|
||||
} | null;
|
||||
error: null;
|
||||
}
|
||||
```
|
||||
|
||||
### Optional Endpoints
|
||||
```
|
||||
GET /api/v1/search/history // Recent searches
|
||||
DELETE /api/v1/search/history // Clear history
|
||||
POST /api/v1/search/reindex // Force reindex (admin)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. UI/UX Considerations
|
||||
|
||||
### Search Modal
|
||||
- **Trigger**: Cmd/Ctrl+K keyboard shortcut (standard pattern)
|
||||
- **Position**: Centered modal with overlay
|
||||
- **Features**:
|
||||
- Instant search as you type (debounced 150ms)
|
||||
- Filter tabs: All | Diaries | Events
|
||||
- Date range picker (quick presets: Today, This Week, This Month, This Year)
|
||||
- Results show date, type, preview with highlighted matches
|
||||
|
||||
### Sidebar (Alternative)
|
||||
- Persistent search box in navigation
|
||||
- Results in scrollable list below
|
||||
- Less intrusive, always visible
|
||||
|
||||
### Result Cards
|
||||
```
|
||||
┌─────────────────────────────────────────┐
|
||||
│ 📅 2024-03-15 [Diary] │
|
||||
│ Meeting with Sarah about project... │
|
||||
│ ─────────────────────────────────────── │
|
||||
│ ...discussed timeline and <mark>budget</mark>... │
|
||||
└─────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### UX Details
|
||||
- **Empty state**: Show recent diaries/events when no query
|
||||
- **No results**: Friendly message + suggestions
|
||||
- **Loading**: Subtle spinner (search should be <100ms)
|
||||
- **Keyboard**: Arrow keys to navigate results, Enter to open
|
||||
|
||||
### Mobile Considerations
|
||||
- Tap search icon in header → full-screen search
|
||||
- Larger touch targets for filters
|
||||
|
||||
---
|
||||
|
||||
## 7. Performance Considerations
|
||||
|
||||
### Query Performance
|
||||
- FTS5 BM25 queries: ~50-100ms for 10k records
|
||||
- Add LIMIT to prevent unbounded results
|
||||
- Use connection pooling if many concurrent searches
|
||||
|
||||
### Write Performance
|
||||
- Triggers add ~5-10ms per insert/update
|
||||
- Batch backfill for existing data (1000 rows/batch)
|
||||
|
||||
### Caching Strategy
|
||||
- Cache recent searches (Redis optional, or in-memory)
|
||||
- Cache FTS index in memory (SQLite mmap)
|
||||
|
||||
### Scaling Thresholds
|
||||
- < 10k entries: No optimization needed
|
||||
- 10k-100k: Consider FTS5 optimization (tokenizer, prefix search)
|
||||
- > 100k: Consider external search
|
||||
|
||||
---
|
||||
|
||||
## 8. Implementation Complexity
|
||||
|
||||
### Complexity Assessment: **MEDIUM**
|
||||
|
||||
| Component | Complexity | Notes |
|
||||
|-----------|------------|-------|
|
||||
| FTS5 setup | Low | Raw SQL, one-time |
|
||||
| Triggers | Low | Auto-sync, minimal code |
|
||||
| API endpoint | Low | Standard CRUD pattern |
|
||||
| Frontend modal | Medium | Keyboard shortcuts, state |
|
||||
| Filters/Date | Medium | Multiple filter combinations |
|
||||
| Backfill | Low | One-time script |
|
||||
|
||||
### Phased Implementation
|
||||
|
||||
**Phase 1 (MVP - 2-3 days)**
|
||||
- FTS5 tables + triggers
|
||||
- Basic search API
|
||||
- Simple modal UI with text input
|
||||
|
||||
**Phase 2 (Enhancements - 1-2 days)**
|
||||
- Date filtering
|
||||
- Type filtering (diary/event)
|
||||
- Result highlighting
|
||||
|
||||
**Phase 3 (Polish - 1 day)**
|
||||
- Search history
|
||||
- Keyboard navigation
|
||||
- Mobile responsive
|
||||
|
||||
---
|
||||
|
||||
## 9. Priority Recommendation
|
||||
|
||||
### Recommended Priority: **MEDIUM-HIGH**
|
||||
|
||||
**Rationale:**
|
||||
- Search is a core journaling feature (user wants to find past entries)
|
||||
- Competitor apps (Day One, Journey) have robust search
|
||||
- Implementation complexity is manageable (medium)
|
||||
- Zero external dependencies (SQLite FTS5)
|
||||
|
||||
### Factors Supporting High Priority
|
||||
1. **User Value**: High - helps users find meaningful memories
|
||||
2. **Implementation Cost**: Medium - achievable in 1 week
|
||||
3. **Dependency Risk**: Low - no external services needed
|
||||
4. **Future-proofing**: FTS5 is mature, well-supported
|
||||
|
||||
### Factors Against Very High Priority
|
||||
- Current core features (capture, generate) are stable
|
||||
- Small dataset users may not notice missing search
|
||||
- Can be added post-MVP without breaking changes
|
||||
|
||||
---
|
||||
|
||||
## 10. Open Questions / Further Research
|
||||
|
||||
1. **Typo tolerance**: Is exact match sufficient, or do users expect fuzzy search?
|
||||
2. **Search ranking**: Should recent results be boosted higher?
|
||||
3. **Multi-language**: Support languages other than English (tokenizer considerations)
|
||||
4. **Export/Import**: Should search index be rebuilt on data import?
|
||||
5. **Shared access**: Multi-user search (future consideration)
|
||||
|
||||
---
|
||||
|
||||
## 11. Summary
|
||||
|
||||
| Aspect | Recommendation |
|
||||
|--------|----------------|
|
||||
| **Search Engine** | SQLite FTS5 (built-in) |
|
||||
| **UI Pattern** | Cmd/Ctrl+K modal |
|
||||
| **Features** | Instant search, date filter, type filter, relevance sort |
|
||||
| **Complexity** | Medium (3-5 days) |
|
||||
| **Priority** | Medium-High |
|
||||
| **Schema Changes** | FTS5 via raw SQL + optional SearchHistory model |
|
||||
| **API Changes** | New `/search` endpoint with query params |
|
||||
Reference in New Issue
Block a user