# Data Export Feature - DearDiary

Comprehensive research document for implementing a data export feature.

---

## 1. Feature Overview

Allow users to export their diary data in multiple formats with flexible scope and options. This feature enables users to:
- Back up their data locally
- Migrate to other journaling platforms
- Create offline archives
- Share selected entries

---

## 2. Export Formats

### 2.1 Markdown (.md)

**Description**: Human-readable plain text format with frontmatter metadata.

**Technical Approach**:
- Output: one `.md` file per day, or a single combined file
- Use YAML frontmatter for metadata (date, title, event count, generation time)
- Structure:
```markdown
---
date: 2024-01-15
title: A Quiet Morning
event_count: 5
generated_at: 2024-01-15T20:30:00Z
---

# January 15, 2024

## Events
[08:30] Had coffee and read news
[12:00] Team meeting about Q1 goals

## Diary Page

The morning started quietly...
```

**Complexity**: Low - straightforward string generation
**Priority**: High - most versatile, easy to implement

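
The string generation described above could look roughly like the following sketch. The `DayEntry` and `DiaryEvent` shapes and the `renderMarkdownDay` name are assumptions made for illustration, not existing DearDiary code. Times are rendered in UTC, and the title is quoted with `JSON.stringify` so that a colon in a title does not break the YAML frontmatter.

```typescript
// Hypothetical shapes for illustration; the real models may differ.
interface DiaryEvent {
  createdAt: string; // ISO 8601 timestamp, e.g. "2024-01-15T08:30:00Z"
  content: string;
}

interface DayEntry {
  date: string; // YYYY-MM-DD
  title: string;
  generatedAt: string;
  events: DiaryEvent[];
  diaryPage: string;
}

const MONTHS = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

// "2024-01-15" -> "January 15, 2024"
function longDate(date: string): string {
  const [year, month, day] = date.split("-").map(Number);
  return `${MONTHS[month - 1]} ${day}, ${year}`;
}

// Formats the UTC time of an ISO timestamp as HH:MM.
function timeLabel(iso: string): string {
  return new Date(iso).toISOString().slice(11, 16);
}

function renderMarkdownDay(entry: DayEntry): string {
  const lines = [
    "---",
    `date: ${entry.date}`,
    // A JSON string literal is also a valid double-quoted YAML scalar.
    `title: ${JSON.stringify(entry.title)}`,
    `event_count: ${entry.events.length}`,
    `generated_at: ${entry.generatedAt}`,
    "---",
    "",
    `# ${longDate(entry.date)}`,
    "",
    "## Events",
    ...entry.events.map((e) => `[${timeLabel(e.createdAt)}] ${e.content}`),
    "",
    "## Diary Page",
    "",
    entry.diaryPage,
    "",
  ];
  return lines.join("\n");
}
```
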

---

### 2.2 JSON (.json)

**Description**: Machine-readable structured format for programmatic use.

**Technical Approach**:
```json
{
  "exported_at": "2024-01-15T20:30:00Z",
  "user_id": "user-uuid",
  "format_version": "1.0",
  "entries": [
    {
      "date": "2024-01-15",
      "journal": {
        "title": "A Quiet Morning",
        "content": "The morning started quietly...",
        "generated_at": "2024-01-15T20:30:00Z"
      },
      "events": [
        {
          "id": "event-uuid",
          "type": "text",
          "content": "Had coffee and read news",
          "created_at": "2024-01-15T08:30:00Z",
          "metadata": {}
        }
      ]
    }
  ]
}
```

**Complexity**: Low - Prisma query results are plain objects that serialize directly with `JSON.stringify`
**Priority**: High - essential for backups/migrations

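
A minimal sketch of assembling this envelope from flat query results. The row interfaces and `buildJsonExport` are hypothetical and simply mirror the field names in the example above. Grouping events by the UTC day of `created_at` is an assumption that may not match how the app assigns events to days.

```typescript
// Hypothetical row shapes; field names mirror the JSON example above.
interface EventRow {
  id: string;
  type: string;
  content: string;
  created_at: string;
  metadata: Record<string, unknown>;
}

interface JournalRow {
  date: string;
  title: string;
  content: string;
  generated_at: string;
}

interface ExportEntry {
  date: string;
  journal: Omit<JournalRow, "date"> | null;
  events: EventRow[];
}

function buildJsonExport(
  userId: string,
  journals: JournalRow[],
  events: EventRow[],
  now: Date = new Date(),
): string {
  const byDate = new Map<string, ExportEntry>();
  const entryFor = (date: string): ExportEntry => {
    let entry = byDate.get(date);
    if (!entry) {
      entry = { date, journal: null, events: [] };
      byDate.set(date, entry);
    }
    return entry;
  };

  for (const { date, ...journal } of journals) {
    entryFor(date).journal = journal;
  }
  // Assumption: an event belongs to the UTC calendar day of created_at.
  for (const event of events) {
    entryFor(event.created_at.slice(0, 10)).events.push(event);
  }

  // YYYY-MM-DD strings sort chronologically.
  const entries = Array.from(byDate.values()).sort((a, b) => a.date.localeCompare(b.date));
  return JSON.stringify(
    { exported_at: now.toISOString(), user_id: userId, format_version: "1.0", entries },
    null,
    2,
  );
}
```
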

---

### 2.3 PDF (.pdf)

**Description**: Print-ready formatted document.

**Technical Approach**:
- Use `pdfkit` or `puppeteer` (headless Chrome) for generation
- Puppeteer recommended for complex layouts/CSS support
- Template options:
  - Simple: Title + content (minimal styling)
  - Full: Events listed with diary page formatted
- Page breaks handled for multi-day exports

**Complexity**: Medium - requires additional dependency
**Priority**: Medium - high user demand for print/export


---

### 2.4 HTML (.html)

**Description**: Web-viewable static pages.

**Technical Approach**:
- Single HTML file with embedded CSS
- Include basic navigation for multi-day exports
- Responsive design with print media queries
- Structure:
```html
<!DOCTYPE html>
<html>
<head>
  <title>DearDiary Export</title>
  <style>
    body { font-family: system-ui; max-width: 800px; margin: 0 auto; padding: 2rem; }
    .entry { margin-bottom: 2rem; }
    .meta { color: #666; font-size: 0.9rem; }
  </style>
</head>
<body>
  <h1>January 2024</h1>
  <div class="entry">
    <h2>January 15, 2024</h2>
    <div class="meta">5 events</div>
    <p>Diary content...</p>
  </div>
</body>
</html>
```

**Complexity**: Low-Medium - string generation with CSS
**Priority**: Medium - good for web publishing

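
Because diary text is user-written, any string-based HTML generator should escape it before interpolation. The sketch below shows one way to produce the structure above. `HtmlEntry`, `escapeHtml`, and `renderHtmlExport` are illustrative names, and the inline stylesheet is abbreviated.

```typescript
// Hypothetical input shape for one day of the export.
interface HtmlEntry {
  heading: string; // e.g. "January 15, 2024"
  eventCount: number;
  paragraphs: string[];
}

// Escapes the five characters that are significant in HTML text and attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

function renderHtmlExport(pageTitle: string, entries: HtmlEntry[]): string {
  const body = entries
    .map((entry) =>
      [
        '  <div class="entry">',
        `    <h2>${escapeHtml(entry.heading)}</h2>`,
        `    <div class="meta">${entry.eventCount} events</div>`,
        ...entry.paragraphs.map((p) => `    <p>${escapeHtml(p)}</p>`),
        "  </div>",
      ].join("\n"),
    )
    .join("\n");

  return [
    "<!DOCTYPE html>",
    "<html>",
    "<head>",
    '  <meta charset="utf-8">',
    "  <title>DearDiary Export</title>",
    "  <style>",
    "    body { font-family: system-ui; max-width: 800px; margin: 0 auto; padding: 2rem; }",
    "    .entry { margin-bottom: 2rem; }",
    "    .meta { color: #666; font-size: 0.9rem; }",
    "  </style>",
    "</head>",
    "<body>",
    `  <h1>${escapeHtml(pageTitle)}</h1>`,
    body,
    "</body>",
    "</html>",
    "",
  ].join("\n");
}
```
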

---

### 2.5 ePub (.epub)

**Description**: Ebook format for e-readers.

**Technical Approach**:
- Use `epub-gen` or similar library
- Structure: One chapter per day or per month
- Include cover image with app branding
- Metadata: Title, author, generated date

**Complexity**: High - requires ebook-specific libraries
**Priority**: Low - niche use case, can be deprioritized


---

## 3. Export Scope

### 3.1 Single Diary
- Export one day's journal + events
- API: `GET /api/v1/export?date=2024-01-15`
- Returns single entry with all related data

### 3.2 Date Range
- Export all entries between the start and end dates (inclusive)
- API: `GET /api/v1/export?start=2024-01-01&end=2024-01-31`
- Batch query: Prisma `where: { date: { gte: start, lte: end } }`

### 3.3 All Data
- Export entire user dataset
- Include settings, metadata
- Requires pagination for large datasets

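
The three scopes map naturally onto a small discriminated union. The following sketch resolves the `date`, `start`, and `end` query parameters shown above into one of them and rejects ambiguous or malformed input. `ExportScope` and `resolveScope` are assumed names, and treating a request with no parameters as "all data" is a design guess rather than documented behavior.

```typescript
type ExportScope =
  | { kind: "single"; date: string }
  | { kind: "range"; start: string; end: string }
  | { kind: "all" };

const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Accepts only real calendar dates in YYYY-MM-DD form (rejects e.g. 2024-02-30).
function isValidDate(value: string): boolean {
  if (!ISO_DATE.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  return !isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}

function resolveScope(query: { date?: string; start?: string; end?: string }): ExportScope {
  const { date, start, end } = query;

  if (date !== undefined) {
    if (start !== undefined || end !== undefined) {
      throw new Error("Use either date or start/end, not both");
    }
    if (!isValidDate(date)) throw new Error(`Invalid date: ${date}`);
    return { kind: "single", date };
  }

  if (start !== undefined || end !== undefined) {
    if (start === undefined || end === undefined) {
      throw new Error("Both start and end are required for a range");
    }
    if (!isValidDate(start) || !isValidDate(end)) {
      throw new Error("Invalid start or end date");
    }
    // YYYY-MM-DD strings sort chronologically, so string comparison is enough.
    if (start > end) throw new Error("start must not be after end");
    return { kind: "range", start, end };
  }

  // Assumption: no parameters means "export everything".
  return { kind: "all" };
}
```
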

---

## 4. Include/Exclude Options

### 4.1 Content Filters
| Option | Description | Implementation |
|--------|-------------|----------------|
| `events_only` | Raw events without AI-generated diaries | Filter journals from response |
| `diaries_only` | Only generated diary pages | Filter events from response |
| `with_media` | Include media file references | Include `mediaPath` field |
| `without_media` | Exclude media references | Omit `mediaPath` field |

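
One way to implement the table above is to translate the filter names into include flags once, then apply those flags to each entry. Everything in this sketch (`ContentFilter`, `IncludeFlags`, `toIncludeFlags`, `applyFilters`, and the entry shapes) is hypothetical. It also assumes that media references are omitted unless `with_media` is given, which the table does not specify.

```typescript
type ContentFilter = "events_only" | "diaries_only" | "with_media" | "without_media";

interface IncludeFlags {
  events: boolean;
  journals: boolean;
  media: boolean;
}

function toIncludeFlags(filters: ContentFilter[]): IncludeFlags {
  const has = (f: ContentFilter): boolean => filters.indexOf(f) !== -1;
  if (has("events_only") && has("diaries_only")) {
    throw new Error("events_only and diaries_only are mutually exclusive");
  }
  if (has("with_media") && has("without_media")) {
    throw new Error("with_media and without_media are mutually exclusive");
  }
  return {
    events: !has("diaries_only"),
    journals: !has("events_only"),
    // Assumption: media references are left out unless explicitly requested.
    media: has("with_media"),
  };
}

// Hypothetical entry shapes for illustration.
interface RawEvent {
  id: string;
  content: string;
  mediaPath?: string;
}

interface RawEntry {
  date: string;
  journal?: { title: string; content: string };
  events: RawEvent[];
}

function applyFilters(entry: RawEntry, include: IncludeFlags): RawEntry {
  const events: RawEvent[] = include.events
    ? entry.events.map(({ mediaPath, ...rest }) =>
        include.media && mediaPath !== undefined ? { ...rest, mediaPath } : rest,
      )
    : [];
  return {
    date: entry.date,
    journal: include.journals ? entry.journal : undefined,
    events,
  };
}
```

Resolving the flags up front keeps conflicting combinations (for example `events_only` together with `diaries_only`) out of the per-entry code path.
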
### 4.2 Data Structure Options
```typescript
interface ExportOptions {
  format: 'md' | 'json' | 'pdf' | 'html' | 'epub';
  scope: 'single' | 'range' | 'all';
  date?: string;
  startDate?: string;
  endDate?: string;
  include: {
    events: boolean;
    journals: boolean;
    media: boolean;
    settings: boolean;
  };
  organization: 'single_file' | 'folder';
  compress: boolean;
}
```


---

## 5. File Organization

### 5.1 Single File
- All content in one file (`.md`, `.json`, `.html`)
- Best for: small exports, JSON backups
- Simple to implement

### 5.2 Folder Structure
```
export-2024-01-15/
├── index.html          # Main navigation
├── 2024-01-15/
│   ├── journal.md      # Diary page
│   ├── events.md       # Raw events
│   └── media/          # Photos, voice memos
├── 2024-01-14/
│   └── ...
└── manifest.json       # Export metadata
```

- Best for: large exports with media
- Use ZIP compression for download

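
Before writing anything to disk, the folder layout can be planned as a flat list of relative paths, which is easy to unit test and to feed into a ZIP writer later. `PlannedFile` and `planFolderExport` are illustrative names, and media files are left out of this sketch because their naming is not specified above.

```typescript
interface PlannedFile {
  path: string;
  kind: "index" | "journal" | "events" | "manifest";
}

// Builds the relative paths of the tree above for the given days (YYYY-MM-DD).
function planFolderExport(exportDate: string, days: string[]): PlannedFile[] {
  const root = `export-${exportDate}`;
  const files: PlannedFile[] = [{ path: `${root}/index.html`, kind: "index" }];

  // Newest day first, matching the order shown in the tree.
  const sorted = days.slice().sort().reverse();
  for (const day of sorted) {
    files.push({ path: `${root}/${day}/journal.md`, kind: "journal" });
    files.push({ path: `${root}/${day}/events.md`, kind: "events" });
  }

  files.push({ path: `${root}/manifest.json`, kind: "manifest" });
  return files;
}
```
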

---

## 6. Compression Options

### 6.1 ZIP Archive
- Default for folder exports > 10MB
- Use the `archiver` package; `Bun.zip()` is not a documented Bun API, and Bun's built-in compression helpers (such as `Bun.gzipSync`) compress single buffers rather than building archives
- Include manifest with export details

**Implementation**:
```typescript
// Example: ZIP export flow.
// createTempDir, generateFiles, zip, and serveFile are placeholder helpers, not existing APIs.
async function exportZip(options: ExportOptions) {
  const tempDir = await createTempDir();
  await generateFiles(tempDir, options);
  const zipPath = `${tempDir}.zip`;
  await zip(tempDir, zipPath);
  return serveFile(zipPath);
}
```


---

## 7. Streaming Large Exports

### 7.1 Problem
- Large exports (years of data) can exceed memory
- Need progressive loading and streaming response

### 7.2 Solution: Server-Sent Events (SSE)

**API Design**:
```
POST /api/v1/export
Content-Type: application/json

{
  "format": "json",
  "startDate": "2020-01-01",
  "endDate": "2024-01-15"
}
```

**Response** (chunked):
```
event: progress
data: {"percent": 10, "stage": "loading_events"}

event: data
data: {"date": "2020-01-01", ...}

event: progress
data: {"percent": 20, "stage": "loading_journals"}

event: data
data: {"date": "2020-01-02", ...}

event: complete
data: {"total_entries": 1000, "export_size": "5MB"}
```

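
Each message in the response above follows the `text/event-stream` framing: an `event:` line, a `data:` line, and a blank line that terminates the event. A small sketch of producing those frames lazily is shown below. `ExportStreamEvent`, `formatSseEvent`, and `exportFrames` are assumed names, the `streaming_entries` stage label is made up, and the `export_size` field is omitted because computing it depends on the output format.

```typescript
type ExportStreamEvent =
  | { event: "progress"; data: { percent: number; stage: string } }
  | { event: "data"; data: Record<string, unknown> }
  | { event: "complete"; data: { total_entries: number } };

// Serializes one message in the text/event-stream wire format.
function formatSseEvent(message: ExportStreamEvent): string {
  return `event: ${message.event}\ndata: ${JSON.stringify(message.data)}\n\n`;
}

// Yields one frame per entry, plus a progress frame every `progressEvery` entries.
function* exportFrames(
  entries: Array<{ date: string }>,
  progressEvery = 100,
): Generator<string> {
  let sent = 0;
  for (const entry of entries) {
    yield formatSseEvent({ event: "data", data: entry });
    sent += 1;
    if (sent % progressEvery === 0) {
      const percent = Math.floor((sent / entries.length) * 100);
      yield formatSseEvent({ event: "progress", data: { percent, stage: "streaming_entries" } });
    }
  }
  yield formatSseEvent({ event: "complete", data: { total_entries: sent } });
}
```

Because the generator produces frames one at a time, the server can write each frame to the response as it is yielded instead of building the whole payload in memory.
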
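### 7.3 Implementation Notes
- Use Prisma cursor-based pagination for memory efficiency
- Stream directly to response without buffering
- Provide progress updates every N records
- The browser `EventSource` API only issues GET requests, so a client consuming the POST-initiated stream above has to read it with `fetch` and a stream reader
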
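
Cursor-based pagination can be wrapped in an async generator so that only one page of rows is held in memory at a time. In the sketch below, `PageFetcher` and `iterateAll` are hypothetical. With Prisma, the fetcher would typically wrap a `findMany` call that passes `take`, `skip: 1`, `cursor: { id }`, and an `orderBy` on the cursor field, but the exact model and field names are assumptions.

```typescript
interface Row {
  id: string;
}

// Hypothetical fetcher: returns up to `take` rows ordered by id, starting after `cursor`.
// An undefined cursor means "start from the beginning".
type PageFetcher<T extends Row> = (cursor: string | undefined, take: number) => Promise<T[]>;

// Yields every row, fetching one page at a time.
async function* iterateAll<T extends Row>(
  fetchPage: PageFetcher<T>,
  pageSize = 500,
): AsyncGenerator<T> {
  let cursor: string | undefined;
  while (true) {
    const page = await fetchPage(cursor, pageSize);
    for (const row of page) {
      yield row;
    }
    // A short page means there is nothing left to fetch.
    if (page.length < pageSize) return;
    cursor = page[page.length - 1].id;
  }
}
```
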

---

## 8. Privacy & Security

### 8.1 Authentication
- Require valid API key for all export endpoints
- User can only export their own data

### 8.2 Sensitive Data Handling
- **Option**: Password-protect exports
  - Use AES-256 encryption for ZIP
  - Prompt for password in UI
- **Option**: Redact sensitive entries
  - Tag certain events as "private"
  - Exclude from export by default

### 8.3 Media Files
- Generate signed URLs for media export
- Set expiration (24h default)
- Don't include raw API keys in export

### 8.4 Audit Logging
- Log export requests (who, when, scope)
- Store in new `ExportLog` model


---

## 9. Database Schema Changes

### 9.1 New Models

```prisma
model ExportLog {
  id          String    @id @default(uuid())
  userId      String
  format      String
  scope       String
  startDate   String?
  endDate     String?
  recordCount Int
  sizeBytes   Int?
  status      String    @default("pending")
  createdAt   DateTime  @default(now())
  completedAt DateTime?

  user User @relation(fields: [userId], references: [id], onDelete: Cascade)
}

model ScheduledExport {
  id          String    @id @default(uuid())
  userId      String
  name        String
  format      String
  scope       String    @default("all")
  frequency   String    @default("weekly")
  includeJson Json?
  enabled     Boolean   @default(true)
  lastRunAt   DateTime?
  createdAt   DateTime  @default(now())
  updatedAt   DateTime  @updatedAt

  user User @relation(fields: [userId], references: [id], onDelete: Cascade)
}
```


---

## 10. API Changes

### 10.1 New Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/export` | Create export job |
| GET | `/api/v1/export/:id` | Get export status |
| GET | `/api/v1/export/:id/download` | Download export file |
| GET | `/api/v1/exports` | List export history |
| DELETE | `/api/v1/export/:id` | Delete export |
| GET | `/api/v1/scheduled-exports` | List scheduled exports |
| POST | `/api/v1/scheduled-exports` | Create schedule |
| PUT | `/api/v1/scheduled-exports/:id` | Update schedule |
| DELETE | `/api/v1/scheduled-exports/:id` | Delete schedule |

### 10.2 Request/Response Examples

**Create Export**:
```typescript
// POST /api/v1/export
interface CreateExportRequest {
  format: 'md' | 'json' | 'pdf' | 'html' | 'epub';
  date?: string;        // single day
  startDate?: string;   // range start
  endDate?: string;     // range end
  include: {
    events: boolean;
    journals: boolean;
    media: boolean;
    settings: boolean;
  };
  organization: 'single_file' | 'folder';
  compress: boolean;
  password?: string;    // optional ZIP password
}

interface ExportResponse {
  id: string;
  status: 'pending' | 'processing' | 'completed' | 'failed';
  progress: number;
  downloadUrl?: string;
  expiresAt?: string;
}
```


---

## 11. UI/UX Considerations

### 11.1 Export Page Location
- Add to Settings page as "Export Data" section
- Or create dedicated `/export` route

### 11.2 Export Modal

```
┌─────────────────────────────────────────┐
│  Export Your Data                       │
├─────────────────────────────────────────┤
│                                         │
│  Format:  [Markdown ▼]                  │
│           ○ Markdown                    │
│           ○ JSON                        │
│           ○ PDF                         │
│           ○ HTML                        │
│           ○ ePub                        │
│                                         │
│  Scope:   ○ This month                  │
│           ○ This year                   │
│           ○ All time                    │
│           ○ Custom range [____]         │
│                                         │
│  Include: ☑ Generated diaries           │
│           ☑ Raw events                  │
│           ☐ Media files                 │
│           ☐ Settings                    │
│                                         │
│  Options: ○ Single file                 │
│           ○ Folder (with ZIP)           │
│                                         │
│  ☐ Password protect                     │
│    [________]                           │
│                                         │
│         [Cancel]    [Export]            │
└─────────────────────────────────────────┘
```

### 11.3 Progress View
- Show progress bar during export
- Estimated time remaining
- Cancel button for large exports
- Email notification option (future)

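
The progress bar and the time estimate can both be derived from a running record count. The sketch below uses a simple linear extrapolation (average time per processed record multiplied by the records remaining). `ProgressSnapshot` and `estimateProgress` are illustrative names, and a real implementation might smooth the rate over a recent window instead.

```typescript
interface ProgressSnapshot {
  processed: number; // records exported so far
  total: number;     // records expected in total
  elapsedMs: number; // time since the export started
}

function estimateProgress(s: ProgressSnapshot): { percent: number; remainingMs: number | null } {
  if (s.total <= 0) return { percent: 100, remainingMs: 0 };

  const done = Math.min(s.processed, s.total);
  const percent = Math.floor((done / s.total) * 100);

  // No basis for an estimate until at least one record has been processed.
  if (done === 0 || s.elapsedMs <= 0) return { percent, remainingMs: null };

  const msPerRecord = s.elapsedMs / done;
  return { percent, remainingMs: Math.round(msPerRecord * (s.total - done)) };
}
```

For example, 250 of 1000 records after 5 seconds gives 25 percent and an estimate of 15 seconds remaining.
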
### 11.4 Export History
- List of past exports with:
  - Date, format, scope
  - Size, record count
  - Download link (with expiration)
  - Delete button


---

## 12. Scheduled Exports

### 12.1 Configuration Options
| Frequency | Description |
|-----------|-------------|
| `daily` | Every day at configured time |
| `weekly` | Every Sunday |
| `monthly` | First day of month |
| `quarterly` | Every 3 months |

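
A scheduler needs to turn each frequency into a concrete next run time. The sketch below does this in UTC. `Frequency` and `nextRunAt` are assumed names, the default run hour of 02:00 UTC is arbitrary, and `quarterly` is interpreted as the first day of January, April, July, and October, which is one possible reading of "every 3 months".

```typescript
type Frequency = "daily" | "weekly" | "monthly" | "quarterly";

// Returns the next run time strictly after `from`, computed in UTC.
function nextRunAt(frequency: Frequency, from: Date, hour = 2): Date {
  const y = from.getUTCFullYear();
  const m = from.getUTCMonth();
  const d = from.getUTCDate();
  // Date.UTC normalizes overflowing days and months (e.g. day 32, month 12).
  const at = (year: number, month: number, day: number): Date =>
    new Date(Date.UTC(year, month, day, hour));
  const isFuture = (candidate: Date): boolean => candidate.getTime() > from.getTime();

  switch (frequency) {
    case "daily": {
      const today = at(y, m, d);
      return isFuture(today) ? today : at(y, m, d + 1);
    }
    case "weekly": {
      // getUTCDay() returns 0 for Sunday.
      const daysUntilSunday = (7 - from.getUTCDay()) % 7;
      const sunday = at(y, m, d + daysUntilSunday);
      return isFuture(sunday) ? sunday : at(y, m, d + daysUntilSunday + 7);
    }
    case "monthly": {
      const firstOfMonth = at(y, m, 1);
      return isFuture(firstOfMonth) ? firstOfMonth : at(y, m + 1, 1);
    }
    case "quarterly": {
      const quarterStart = m - (m % 3);
      const firstOfQuarter = at(y, quarterStart, 1);
      return isFuture(firstOfQuarter) ? firstOfQuarter : at(y, quarterStart + 3, 1);
    }
  }
}
```

Storing the computed time alongside `lastRunAt` would let a background job simply check which schedules are due on each tick.
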
### 12.2 Implementation
- Use cron-style scheduling
- Run as background job (the global `setInterval` or a dedicated worker)
- Store exports in cloud storage (S3-compatible) or local
- Send notification when ready

### 12.3 Use Cases
- Automated weekly backups
- Monthly archive generation
- Quarterly review compilation


---

## 13. Implementation Roadmap

### Phase 1: Core Export (Week 1-2)
- [ ] Add `ExportLog` model to schema
- [ ] Implement JSON export endpoint
- [ ] Implement Markdown export endpoint
- [ ] Add single date/range query support
- [ ] Basic export UI in Settings

**Complexity**: 3/5
**Priority**: High

### Phase 2: Advanced Formats (Week 3)
- [ ] HTML export
- [ ] PDF export (using puppeteer)
- [ ] ePub export (optional)

**Complexity**: 4/5
**Priority**: Medium

### Phase 3: Large Exports (Week 4)
- [ ] Streaming with SSE
- [ ] ZIP compression
- [ ] Progress reporting

**Complexity**: 5/5
**Priority**: Medium

### Phase 4: Automation (Week 5)
- [ ] Scheduled exports model
- [ ] Background job scheduler
- [ ] Scheduled exports UI

**Complexity**: 4/5
**Priority**: Low

### Phase 5: Security & Polish (Week 6)
- [ ] Password-protected ZIPs
- [ ] Export audit logging
- [ ] Media file handling
- [ ] Edge cases and testing

**Complexity**: 3/5
**Priority**: Medium


---

## 14. Dependencies Required

| Package | Purpose | Version |
|---------|---------|---------|
| `pdfkit` | PDF generation | ^0.14.0 |
| `puppeteer` | HTML to PDF | ^21.0.0 |
| `archiver` | ZIP creation | ^6.0.0 |
| `epub-gen` | ePub creation | ^0.1.0 |
| `jszip` | Client-side ZIP | ^3.10.0 |


---

## 15. Testing Considerations

### 15.1 Unit Tests
- Export formatters (MD, JSON, HTML)
- Date range filtering
- Include/exclude logic

### 15.2 Integration Tests
- Full export workflow
- Large dataset performance
- Streaming response handling

### 15.3 Edge Cases
- Empty date range
- Missing media files
- Export during active generation
- Concurrent export requests


---

## 16. Priority Recommendation

| Feature | Priority | Rationale |
|---------|----------|-----------|
| JSON/Markdown export | P0 | Core requirement for backups |
| Single/range export | P0 | Essential scope control |
| Export UI | P0 | User-facing feature |
| PDF export | P1 | High user demand |
| HTML export | P1 | Good alternative to PDF |
| Streaming exports | P2 | Performance for large data |
| ZIP compression | P2 | Usability for folder exports |
| ePub export | P3 | Niche, can skip |
| Scheduled exports | P3 | Automation, lower urgency |
| Password protection | P4 | Advanced, security theater |


---

## 17. Open Questions

1. **Storage**: Should exports be stored temporarily or generated on-demand?
2. **Retention**: How long to keep export downloads available?
3. **Media handling**: Include actual files or just references?
4. **Third-party sync**: Export to Google Drive, Dropbox?
5. **Incremental exports**: Only export new data since last export?


---

## 18. Summary

This feature set provides comprehensive data export capabilities while maintaining security and user privacy. Starting with JSON/Markdown exports covers the most common use cases (backups, migration). PDF and HTML add print/web options. Streaming and compression enable handling of large datasets. Scheduled exports provide automation for power users.

Recommend implementing Phase 1 first to establish core functionality, then iterate based on user feedback.