feat: v0.1.0 - geolocation capture, calendar, search, Starlight docs site

- Automatic browser geolocation capture on event creation
- Reverse geocoding via Nominatim API for place names
- Full-text search with SQLite FTS5
- Calendar view for browsing past entries
- DateNavigator component for day navigation
- SearchModal with Ctrl+K shortcut
- QuickAddWidget with Ctrl+J shortcut
- Starlight documentation site with GitHub Pages deployment
- Multiple AI provider support (Groq, OpenAI, Anthropic, Ollama, LM Studio)
- Multi-user registration support

BREAKING: Events now include latitude/longitude/placeName fields
lotherk
2026-03-27 02:27:55 +00:00
parent deaf496a7d
commit 0bdd71a4ed
67 changed files with 15201 additions and 355 deletions

todo/export.md Normal file

@@ -0,0 +1,598 @@
# Data Export Feature - DearDiary
Comprehensive research document for implementing a data export feature.
---
## 1. Feature Overview
Allow users to export their diary data in multiple formats with flexible scope and options. This feature enables users to:
- Backup their data locally
- Migrate to other journaling platforms
- Create offline archives
- Share selected entries
---
## 2. Export Formats
### 2.1 Markdown (.md)
**Description**: Human-readable plain text format with frontmatter metadata.
**Technical Approach**:
- Single file: One `.md` file per day or combined
- Use YAML frontmatter for metadata (date, title, event count, generation timestamp)
- Structure:
```markdown
---
date: 2024-01-15
title: A Quiet Morning
event_count: 5
generated_at: 2024-01-15T20:30:00Z
---
# January 15, 2024
## Events
[08:30] Had coffee and read news
[12:00] Team meeting about Q1 goals
## Diary Page
The morning started quietly...
```
**Complexity**: Low - straightforward string generation
**Priority**: High - most versatile, easy to implement
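
A minimal sketch of the Markdown generation described above. The `DayEntry` and `DiaryEvent` shapes and the `renderMarkdown` name are assumptions for illustration, not the actual Prisma models. Times are rendered in UTC, and the title is quoted so that a `:` or `#` in it cannot break the frontmatter.

```typescript
// Hypothetical shapes; the real Prisma models may differ.
interface DiaryEvent {
  createdAt: Date;
  content: string;
}

interface DayEntry {
  date: string; // YYYY-MM-DD
  title: string;
  content: string; // generated diary page
  generatedAt: Date;
  events: DiaryEvent[];
}

// HH:MM in UTC, so output does not depend on the server's time zone.
function formatTime(d: Date): string {
  return d.toISOString().slice(11, 16);
}

// A JSON string literal is also a valid double-quoted YAML scalar.
function yamlString(value: string): string {
  return JSON.stringify(value);
}

function renderMarkdown(entry: DayEntry): string {
  const heading = new Date(`${entry.date}T00:00:00Z`).toLocaleDateString("en-US", {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  });
  const lines = [
    "---",
    `date: ${entry.date}`,
    `title: ${yamlString(entry.title)}`,
    `event_count: ${entry.events.length}`,
    `generated_at: ${entry.generatedAt.toISOString()}`,
    "---",
    `# ${heading}`,
    "## Events",
    ...entry.events.map((e) => `[${formatTime(e.createdAt)}] ${e.content}`),
    "## Diary Page",
    entry.content,
  ];
  return lines.join("\n") + "\n";
}
```

Note that `toISOString()` emits milliseconds (`2024-01-15T20:30:00.000Z`), which differs slightly from the sample frontmatter.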
---
### 2.2 JSON (.json)
**Description**: Machine-readable structured format for programmatic use.
**Technical Approach**:
```json
{
  "exported_at": "2024-01-15T20:30:00Z",
  "user_id": "user-uuid",
  "format_version": "1.0",
  "entries": [
    {
      "date": "2024-01-15",
      "journal": {
        "title": "A Quiet Morning",
        "content": "The morning started quietly...",
        "generated_at": "2024-01-15T20:30:00Z"
      },
      "events": [
        {
          "id": "event-uuid",
          "type": "text",
          "content": "Had coffee and read news",
          "created_at": "2024-01-15T08:30:00Z",
          "metadata": {}
        }
      ]
    }
  ]
}
```
**Complexity**: Low - Prisma query results are plain objects that serialize directly with `JSON.stringify`
**Priority**: High - essential for backups/migrations
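
The payload above could be assembled roughly as follows. This is a sketch under assumptions: the record interfaces and `buildJsonExport` are hypothetical names, and the camelCase input fields stand in for whatever the Prisma models actually expose.

```typescript
// Hypothetical record shapes, standing in for Prisma query results.
interface EventRecord {
  id: string;
  type: string;
  content: string;
  createdAt: Date;
  metadata: Record<string, unknown>;
}

interface JournalRecord {
  title: string;
  content: string;
  generatedAt: Date;
}

interface DayRecord {
  date: string; // YYYY-MM-DD
  journal: JournalRecord | null;
  events: EventRecord[];
}

// Maps camelCase records to the snake_case export schema (format_version 1.0).
function buildJsonExport(userId: string, days: DayRecord[], now: Date = new Date()): string {
  const payload = {
    exported_at: now.toISOString(),
    user_id: userId,
    format_version: "1.0",
    entries: days.map((day) => ({
      date: day.date,
      // Days without a generated diary page export `journal: null`.
      journal: day.journal && {
        title: day.journal.title,
        content: day.journal.content,
        generated_at: day.journal.generatedAt.toISOString(),
      },
      events: day.events.map((e) => ({
        id: e.id,
        type: e.type,
        content: e.content,
        created_at: e.createdAt.toISOString(),
        metadata: e.metadata,
      })),
    })),
  };
  return JSON.stringify(payload, null, 2);
}
```

Keeping `format_version` in the payload lets a future importer detect and migrate older exports.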
---
### 2.3 PDF (.pdf)
**Description**: Print-ready formatted document.
**Technical Approach**:
- Use `pdfkit` or `puppeteer` (headless Chrome) for generation
- Puppeteer recommended for complex layouts/CSS support
- Template options:
  - Simple: Title + content (minimal styling)
  - Full: Events listed with diary page formatted
- Page breaks handled for multi-day exports
**Complexity**: Medium - requires additional dependency
**Priority**: Medium - high user demand for print/export
---
### 2.4 HTML (.html)
**Description**: Web-viewable static pages.
**Technical Approach**:
- Single HTML file with embedded CSS
- Include basic navigation for multi-day exports
- Responsive design with print media queries
- Structure:
```html
<!DOCTYPE html>
<html>
<head>
  <title>DearDiary Export</title>
  <style>
    body { font-family: system-ui; max-width: 800px; margin: 0 auto; padding: 2rem; }
    .entry { margin-bottom: 2rem; }
    .meta { color: #666; font-size: 0.9rem; }
  </style>
</head>
<body>
  <h1>January 2024</h1>
  <div class="entry">
    <h2>January 15, 2024</h2>
    <div class="meta">5 events</div>
    <p>Diary content...</p>
  </div>
</body>
</html>
```
**Complexity**: Low-Medium - string generation with CSS
**Priority**: Medium - good for web publishing
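
Because diary text is user-written, string-built HTML needs escaping before it is interpolated into the template. A small sketch, assuming hypothetical `escapeHtml` and `renderEntryHtml` helpers and an `HtmlEntry` shape invented for this example:

```typescript
// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical per-day input for the template.
interface HtmlEntry {
  heading: string;
  eventCount: number;
  content: string;
}

// Renders one `.entry` block matching the structure shown above.
function renderEntryHtml(entry: HtmlEntry): string {
  return [
    '<div class="entry">',
    `<h2>${escapeHtml(entry.heading)}</h2>`,
    `<div class="meta">${entry.eventCount} events</div>`,
    `<p>${escapeHtml(entry.content)}</p>`,
    "</div>",
  ].join("\n");
}
```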
---
### 2.5 ePub (.epub)
**Description**: Ebook format for e-readers.
**Technical Approach**:
- Use `epub-gen` or similar library
- Structure: One chapter per day or per month
- Include cover image with app branding
- Metadata: Title, author, generated date
**Complexity**: High - requires ebook-specific libraries
**Priority**: Low - niche use case, can be deprioritized
---
## 3. Export Scope
### 3.1 Single Diary
- Export one day's journal + events
- API: `GET /api/v1/export?date=2024-01-15`
- Returns single entry with all related data
### 3.2 Date Range
- Export journals and events between start and end dates (inclusive)
- API: `GET /api/v1/export?start=2024-01-01&end=2024-01-31`
- Batch query: Prisma `where: { date: { gte: start, lte: end } }`
### 3.3 All Data
- Export entire user dataset
- Include settings, metadata
- Requires pagination for large datasets
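
The three scopes can be derived from the query string and turned into a Prisma-style `where` fragment. The sketch below assumes dates are stored as `YYYY-MM-DD` strings (so lexicographic comparison matches chronological order); `Scope`, `parseScope`, and `toDateFilter` are hypothetical names.

```typescript
type Scope =
  | { kind: "single"; date: string }
  | { kind: "range"; start: string; end: string }
  | { kind: "all" };

const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Decide the scope from ?date=... or ?start=...&end=...; no parameters means "all".
function parseScope(query: URLSearchParams): Scope {
  const date = query.get("date");
  const start = query.get("start");
  const end = query.get("end");
  if (date !== null) {
    if (!ISO_DATE.test(date)) throw new Error("date must be YYYY-MM-DD");
    return { kind: "single", date };
  }
  if (start !== null || end !== null) {
    if (start === null || end === null) throw new Error("start and end must be given together");
    if (!ISO_DATE.test(start) || !ISO_DATE.test(end)) throw new Error("start/end must be YYYY-MM-DD");
    if (start > end) throw new Error("start must not be after end");
    return { kind: "range", start, end };
  }
  return { kind: "all" };
}

// Where-clause fragment in the shape Prisma accepts for a string `date` column.
function toDateFilter(scope: Scope): { date?: string | { gte: string; lte: string } } {
  switch (scope.kind) {
    case "single":
      return { date: scope.date };
    case "range":
      return { date: { gte: scope.start, lte: scope.end } };
    case "all":
      return {};
  }
}
```

Rejecting a reversed or half-specified range up front keeps the "empty date range" edge case distinguishable from a malformed request.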
---
## 4. Include/Exclude Options
### 4.1 Content Filters
| Option | Description | Implementation |
|--------|-------------|----------------|
| `events_only` | Raw events without AI-generated diaries | Filter journals from response |
| `diaries_only` | Only generated diary pages | Filter events from response |
| `with_media` | Include media file references | Include `mediaPath` field |
| `without_media` | Exclude media references | Omit `mediaPath` field |
### 4.2 Data Structure Options
```typescript
interface ExportOptions {
  format: 'md' | 'json' | 'pdf' | 'html' | 'epub';
  scope: 'single' | 'range' | 'all';
  date?: string;
  startDate?: string;
  endDate?: string;
  include: {
    events: boolean;
    journals: boolean;
    media: boolean;
    settings: boolean;
  };
  organization: 'single_file' | 'folder';
  compress: boolean;
}
```
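
One way the `include` flags could be applied to a loaded day before formatting. `RawDay`, `RawEvent`, and `applyInclude` are hypothetical names; only `mediaPath` comes from the filter table above.

```typescript
interface IncludeFlags {
  events: boolean;
  journals: boolean;
  media: boolean;
}

// Hypothetical loaded shapes; `mediaPath` is the field named in the filter table.
interface RawEvent {
  id: string;
  content: string;
  mediaPath?: string | null;
}

interface RawDay {
  date: string;
  journal: { title: string; content: string } | null;
  events: RawEvent[];
}

// Returns a filtered copy; the input day is left untouched.
function applyInclude(day: RawDay, include: IncludeFlags): RawDay {
  const events: RawEvent[] = include.events
    ? day.events.map((e) => {
        if (include.media) return e;
        // Drop the media reference but keep the rest of the event.
        const { mediaPath: _omitted, ...rest } = e;
        return rest;
      })
    : [];
  return {
    date: day.date,
    journal: include.journals ? day.journal : null,
    events,
  };
}
```

With this shape, `events_only` corresponds to `{ events: true, journals: false }` and `diaries_only` to the reverse.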
---
## 5. File Organization
### 5.1 Single File
- All content in one file (`.md`, `.json`, `.html`)
- Best for: small exports, JSON backups
- Simple to implement
### 5.2 Folder Structure
```
export-2024-01-15/
├── index.html # Main navigation
├── 2024-01-15/
│ ├── journal.md # Diary page
│ ├── events.md # Raw events
│ └── media/ # Photos, voice memos
├── 2024-01-14/
│ └── ...
└── manifest.json # Export metadata
```
- Best for: large exports with media
- Use ZIP compression for download
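
The folder layout above can be planned as a list of relative paths before anything is written, which is also convenient input for `manifest.json`. A sketch with a hypothetical `planExportPaths` helper; media files are left out because their names depend on the stored uploads.

```typescript
// Plan the relative paths of a folder export, newest day first.
function planExportPaths(exportDate: string, days: string[]): string[] {
  const root = `export-${exportDate}`;
  // YYYY-MM-DD strings sort chronologically; reverse for newest first.
  const sorted = [...days].sort().reverse();
  const paths = [`${root}/index.html`];
  for (const day of sorted) {
    paths.push(`${root}/${day}/journal.md`);
    paths.push(`${root}/${day}/events.md`);
  }
  paths.push(`${root}/manifest.json`);
  return paths;
}
```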
---
## 6. Compression Options
### 6.1 ZIP Archive
- Default for folder exports > 10MB
- Use the `archiver` package; Bun ships gzip helpers such as `Bun.gzipSync` but, as far as we know, no built-in `Bun.zip()`
- Include manifest with export details
**Implementation**:
```typescript
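// Example: ZIP export flow.
// createTempDir, generateFiles, zip, and serveFile are placeholder helpers still to be written.
async function exportZip(options: ExportOptions) {
  const tempDir = await createTempDir(); // scratch directory for this export
  await generateFiles(tempDir, options); // write the folder structure from section 5.2
  const zipPath = `${tempDir}.zip`;
  await zip(tempDir, zipPath); // e.g. implemented with the `archiver` package
  return serveFile(zipPath); // stream the archive back to the client
}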
```
---
## 7. Streaming Large Exports
### 7.1 Problem
- Large exports (years of data) can exceed memory
- Need progressive loading and streaming response
### 7.2 Solution: Server-Sent Events (SSE)
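**API Design** (the browser `EventSource` API only issues GET requests, so a POST-initiated stream has to be read with `fetch()` and a streamed response body):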
```
POST /api/v1/export
Content-Type: application/json
{
"format": "json",
"startDate": "2020-01-01",
"endDate": "2024-01-15"
}
```
**Response** (`text/event-stream`, sent incrementally; on the wire each event ends with a blank line):
```
event: progress
data: {"percent": 10, "stage": "loading_events"}
event: data
data: {"date": "2020-01-01", ...}
event: progress
data: {"percent": 20, "stage": "loading_journals"}
event: data
data: {"date": "2020-01-02", ...}
event: complete
data: {"total_entries": 1000, "export_size": "5MB"}
```
### 7.3 Implementation Notes
- Use Prisma cursor-based pagination for memory efficiency
- Stream directly to response without buffering
- Provide progress updates every N records
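
The notes above can be sketched as an async generator that pulls one page at a time and yields ready-to-send SSE frames, so only a single page is held in memory. Everything here is illustrative: `FetchPage`, `streamExport`, and `sseFrame` are hypothetical names, and the `loading_entries` stage label is an assumption. With Prisma, the page fetcher would typically wrap `findMany` with `take`, `skip: 1`, and `cursor: { id }`.

```typescript
interface Row {
  id: string;
  date: string;
}

// Hypothetical page fetcher: up to `take` rows after `cursor`, ordered by id.
type FetchPage = (cursor: string | null, take: number) => Promise<Row[]>;

// One SSE frame: event name, JSON payload, and the terminating blank line.
function sseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

async function* streamExport(
  fetchPage: FetchPage,
  total: number,
  pageSize = 100,
): AsyncGenerator<string> {
  let cursor: string | null = null;
  let sent = 0;
  while (true) {
    const page: Row[] = await fetchPage(cursor, pageSize);
    if (page.length === 0) break;
    for (const row of page) {
      yield sseFrame("data", row);
    }
    sent += page.length;
    cursor = page[page.length - 1].id;
    // Progress is reported once per page, i.e. every `pageSize` records.
    const percent = total > 0 ? Math.min(100, Math.round((sent / total) * 100)) : 100;
    yield sseFrame("progress", { percent, stage: "loading_entries" });
    if (page.length < pageSize) break; // short page means the data is exhausted
  }
  yield sseFrame("complete", { total_entries: sent });
}
```

Each yielded string can be written straight to the response stream, which avoids buffering the full export.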
---
## 8. Privacy & Security
### 8.1 Authentication
- Require valid API key for all export endpoints
- User can only export their own data
### 8.2 Sensitive Data Handling
- **Option**: Password-protect exports
  - Use AES-256 encryption for ZIP (requires a ZIP library with encryption support; `archiver` on its own does not provide it)
  - Prompt for password in UI
- **Option**: Redact sensitive entries
  - Tag certain events as "private"
  - Exclude from export by default
### 8.3 Media Files
- Generate signed URLs for media export
- Set expiration (24h default)
- Don't include raw API keys in export
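
Signed, expiring media URLs can be built with an HMAC over the path and expiry time. This is a sketch, not the project's scheme: `signMediaUrl`, `verifyMediaUrl`, and the `expires`/`sig` query parameter names are assumptions; the `node:crypto` calls are standard and also available under Bun.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const DAY_MS = 24 * 60 * 60 * 1000; // 24h default expiration

// HMAC-SHA256 over "path:expires", hex encoded.
function sign(path: string, expires: number, secret: string): string {
  return createHmac("sha256", secret).update(`${path}:${expires}`).digest("hex");
}

function signMediaUrl(path: string, secret: string, now = Date.now(), ttlMs = DAY_MS): string {
  const expires = now + ttlMs;
  return `${path}?expires=${expires}&sig=${sign(path, expires, secret)}`;
}

function verifyMediaUrl(
  path: string,
  expires: number,
  sig: string,
  secret: string,
  now = Date.now(),
): boolean {
  if (now > expires) return false; // link has expired
  const expected = Buffer.from(sign(path, expires, secret), "hex");
  const actual = Buffer.from(sig, "hex");
  // Constant-time comparison; lengths must match before timingSafeEqual is called.
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}
```

The signing secret stays on the server, so the export itself never contains a reusable credential.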
### 8.4 Audit Logging
- Log export requests (who, when, scope)
- Store in new `ExportLog` model
---
## 9. Database Schema Changes
### 9.1 New Models
```prisma
model ExportLog {
  id          String    @id @default(uuid())
  userId      String
  format      String
  scope       String
  startDate   String?
  endDate     String?
  recordCount Int
  sizeBytes   Int?
  status      String    @default("pending")
  createdAt   DateTime  @default(now())
  completedAt DateTime?
  user        User      @relation(fields: [userId], references: [id], onDelete: Cascade)
}
model ScheduledExport {
  id          String    @id @default(uuid())
  userId      String
  name        String
  format      String
  scope       String    @default("all")
  frequency   String    @default("weekly")
  includeJson Json?
  enabled     Boolean   @default(true)
  lastRunAt   DateTime?
  createdAt   DateTime  @default(now())
  updatedAt   DateTime  @updatedAt
  user        User      @relation(fields: [userId], references: [id], onDelete: Cascade)
}
```
---
## 10. API Changes
### 10.1 New Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/export` | Create export job |
| GET | `/api/v1/export/:id` | Get export status |
| GET | `/api/v1/export/:id/download` | Download export file |
| GET | `/api/v1/exports` | List export history |
| DELETE | `/api/v1/export/:id` | Delete export |
| GET | `/api/v1/scheduled-exports` | List scheduled exports |
| POST | `/api/v1/scheduled-exports` | Create schedule |
| PUT | `/api/v1/scheduled-exports/:id` | Update schedule |
| DELETE | `/api/v1/scheduled-exports/:id` | Delete schedule |
### 10.2 Request/Response Examples
**Create Export**:
```typescript
// POST /api/v1/export
interface CreateExportRequest {
  format: 'md' | 'json' | 'pdf' | 'html' | 'epub';
  date?: string;       // single day
  startDate?: string;  // range start
  endDate?: string;    // range end
  include: {
    events: boolean;
    journals: boolean;
    media: boolean;
    settings: boolean;
  };
  organization: 'single_file' | 'folder';
  compress: boolean;
  password?: string;   // optional ZIP password
}
interface ExportResponse {
  id: string;
  status: 'pending' | 'processing' | 'completed' | 'failed';
  progress: number;
  downloadUrl?: string;
  expiresAt?: string;
}
```
---
## 11. UI/UX Considerations
### 11.1 Export Page Location
- Add to Settings page as "Export Data" section
- Or create dedicated `/export` route
### 11.2 Export Modal
```
┌─────────────────────────────────────────┐
│ Export Your Data │
├─────────────────────────────────────────┤
│ │
│ Format: [Markdown ▼] │
│ ○ Markdown │
│ ○ JSON │
│ ○ PDF │
│ ○ HTML │
│ ○ ePub │
│ │
│ Scope: ○ This month │
│ ○ This year │
│ ○ All time │
│ ○ Custom range [____] │
│ │
│ Include: ☑ Generated diaries │
│ ☑ Raw events │
│ ☐ Media files │
│ ☐ Settings │
│ │
│ Options: ○ Single file │
│ ○ Folder (with ZIP) │
│ │
│ ☐ Password protect │
│ [________] │
│ │
│ [Cancel] [Export] │
└─────────────────────────────────────────┘
```
### 11.3 Progress View
- Show progress bar during export
- Estimated time remaining
- Cancel button for large exports
- Email notification option (future)
### 11.4 Export History
- List of past exports with:
  - Date, format, scope
  - Size, record count
  - Download link (with expiration)
  - Delete button
---
## 12. Scheduled Exports
### 12.1 Configuration Options
| Frequency | Description |
|-----------|-------------|
| `daily` | Every day at configured time |
| `weekly` | Every Sunday |
| `monthly` | First day of month |
| `quarterly` | Every 3 months |
### 12.2 Implementation
- Use cron-style scheduling
- Run as background job (a `setInterval` timer in the server process, or a dedicated worker)
- Store exports in cloud storage (S3-compatible) or local
- Send notification when ready
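
A scheduler needs to know when each `ScheduledExport` is next due. The sketch below computes that from the frequency table, under two assumptions that the table does not settle: runs happen at 00:00 UTC, and `quarterly` means the first day of the next calendar quarter. `nextRun` is a hypothetical helper.

```typescript
type Frequency = "daily" | "weekly" | "monthly" | "quarterly";

// Next run strictly after `from`, at 00:00 UTC (assumed run time).
function nextRun(frequency: Frequency, from: Date): Date {
  const y = from.getUTCFullYear();
  const m = from.getUTCMonth();
  const d = from.getUTCDate();
  switch (frequency) {
    case "daily":
      return new Date(Date.UTC(y, m, d + 1));
    case "weekly": {
      // getUTCDay() is 0 for Sunday, so a Sunday maps to the following Sunday.
      const daysUntilSunday = 7 - from.getUTCDay();
      return new Date(Date.UTC(y, m, d + daysUntilSunday));
    }
    case "monthly":
      return new Date(Date.UTC(y, m + 1, 1));
    case "quarterly":
      // First month of the next calendar quarter (Jan, Apr, Jul, Oct).
      return new Date(Date.UTC(y, m - (m % 3) + 3, 1));
  }
}
```

`Date.UTC` normalizes overflowing day and month values, so month and year boundaries need no special handling. A periodic timer can then compare `nextRun(frequency, lastRunAt)` with the current time to decide whether a schedule is due.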
### 12.3 Use Cases
- Automated weekly backups
- Monthly archive generation
- Quarterly review compilation
---
## 13. Implementation Roadmap
### Phase 1: Core Export (Week 1-2)
- [ ] Add `ExportLog` model to schema
- [ ] Implement JSON export endpoint
- [ ] Implement Markdown export endpoint
- [ ] Add single date/range query support
- [ ] Basic export UI in Settings
**Complexity**: 3/5
**Priority**: High
### Phase 2: Advanced Formats (Week 3)
- [ ] HTML export
- [ ] PDF export (using puppeteer)
- [ ] ePub export (optional)
**Complexity**: 4/5
**Priority**: Medium
### Phase 3: Large Exports (Week 4)
- [ ] Streaming with SSE
- [ ] ZIP compression
- [ ] Progress reporting
**Complexity**: 5/5
**Priority**: Medium
### Phase 4: Automation (Week 5)
- [ ] Scheduled exports model
- [ ] Background job scheduler
- [ ] Scheduled exports UI
**Complexity**: 4/5
**Priority**: Low
### Phase 5: Security & Polish (Week 6)
- [ ] Password-protected ZIPs
- [ ] Export audit logging
- [ ] Media file handling
- [ ] Edge cases and testing
**Complexity**: 3/5
**Priority**: Medium
---
## 14. Dependencies Required
| Package | Purpose | Version |
|---------|---------|---------|
| `pdfkit` | PDF generation | ^0.14.0 |
| `puppeteer` | HTML to PDF | ^21.0.0 |
| `archiver` | ZIP creation | ^6.0.0 |
| `epub-gen` | ePub creation | ^0.1.0 |
| `jszip` | Client-side ZIP | ^3.10.0 |
---
## 15. Testing Considerations
### 15.1 Unit Tests
- Export formatters (MD, JSON, HTML)
- Date range filtering
- Include/exclude logic
### 15.2 Integration Tests
- Full export workflow
- Large dataset performance
- Streaming response handling
### 15.3 Edge Cases
- Empty date range
- Missing media files
- Export during active generation
- Concurrent export requests
---
## 16. Priority Recommendation
| Feature | Priority | Rationale |
|---------|----------|-----------|
| JSON/Markdown export | P0 | Core requirement for backups |
| Single/range export | P0 | Essential scope control |
| Export UI | P0 | User-facing feature |
| PDF export | P1 | High user demand |
| HTML export | P1 | Good alternative to PDF |
| Streaming exports | P2 | Performance for large data |
| ZIP compression | P2 | Usability for folder exports |
| ePub export | P3 | Niche, can skip |
| Scheduled exports | P3 | Automation, lower urgency |
| Password protection | P4 | Advanced, security theater |
---
## 17. Open Questions
1. **Storage**: Should exports be stored temporarily or generated on-demand?
2. **Retention**: How long to keep export downloads available?
3. **Media handling**: Include actual files or just references?
4. **Third-party sync**: Export to Google Drive, Dropbox?
5. **Incremental exports**: Only export new data since last export?
---
## 18. Summary
This feature set provides comprehensive data export capabilities while maintaining security and user privacy. Starting with JSON/Markdown exports should cover the most common use cases (backups, migration). PDF and HTML add print/web options. Streaming and compression enable handling of large datasets. Scheduled exports provide automation for power users.
Recommend implementing Phase 1 first to establish core functionality, then iterate based on user feedback.