lotherk 0bdd71a4ed feat: v0.1.0 - geolocation capture, calendar, search, Starlight docs site

- Automatic browser geolocation capture on event creation
- Reverse geocoding via Nominatim API for place names
- Full-text search with SQLite FTS5
- Calendar view for browsing past entries
- DateNavigator component for day navigation
- SearchModal with Ctrl+K shortcut
- QuickAddWidget with Ctrl+J shortcut
- Starlight documentation site with GitHub Pages deployment
- Multiple AI provider support (Groq, OpenAI, Anthropic, Ollama, LM Studio)
- Multi-user registration support

BREAKING: Events now include latitude/longitude/placeName fields

2026-03-27 02:27:55 +00:00

Data Export Feature - DearDiary

Comprehensive research document for implementing a data export feature.

1. Feature Overview

Allow users to export their diary data in multiple formats with flexible scope and options. This feature enables users to:

Backup their data locally
Migrate to other journaling platforms
Create offline archives
Share selected entries

2. Export Formats

2.1 Markdown (.md)

Description: Human-readable plain text format with frontmatter metadata.

Technical Approach:

Single file: One .md file per day or combined
Use YAML frontmatter for metadata (date, title, event count, generation timestamp)

Structure:

---
date: 2024-01-15
title: A Quiet Morning
event_count: 5
generated_at: 2024-01-15T20:30:00Z
---

# January 15, 2024

## Events
[08:30] Had coffee and read news
[12:00] Team meeting about Q1 goals

## Diary Page

The morning started quietly...

Complexity: Low - straightforward string generation
Priority: High - most versatile, easy to implement
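
A minimal sketch of the string generation described above. The `DayEntry` and `DiaryEvent` shapes and the `renderMarkdown` name are assumptions for illustration, not the project's actual models. The title is emitted as a quoted YAML string so that titles containing `:` or `#` stay valid, which differs slightly from the unquoted title in the sample structure.

```typescript
// Hypothetical shapes; the real Prisma models may differ.
interface DiaryEvent {
  time: string; // "HH:MM", already formatted for display
  content: string;
}

interface DayEntry {
  date: string; // ISO date, e.g. "2024-01-15"
  title: string;
  generatedAt: string; // ISO timestamp
  events: DiaryEvent[];
  body: string; // generated diary page text
}

// A JSON string literal is also a valid double-quoted YAML scalar.
function yamlString(value: string): string {
  return JSON.stringify(value);
}

function renderMarkdown(entry: DayEntry): string {
  // Format the heading in UTC so the day does not shift with the server time zone.
  const heading = new Date(`${entry.date}T00:00:00Z`).toLocaleDateString("en-US", {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  });
  const lines = [
    "---",
    `date: ${entry.date}`,
    `title: ${yamlString(entry.title)}`,
    `event_count: ${entry.events.length}`,
    `generated_at: ${entry.generatedAt}`,
    "---",
    "",
    `# ${heading}`,
    "",
    "## Events",
    ...entry.events.map((e) => `[${e.time}] ${e.content}`),
    "",
    "## Diary Page",
    "",
    entry.body,
    "",
  ];
  return lines.join("\n");
}
```

For a combined multi-day file, the same function could be called per day and the results concatenated.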

2.2 JSON (.json)

Description: Machine-readable structured format for programmatic use.

Technical Approach:

{
  "exported_at": "2024-01-15T20:30:00Z",
  "user_id": "user-uuid",
  "format_version": "1.0",
  "entries": [
    {
      "date": "2024-01-15",
      "journal": {
        "title": "A Quiet Morning",
        "content": "The morning started quietly...",
        "generated_at": "2024-01-15T20:30:00Z"
      },
      "events": [
        {
          "id": "event-uuid",
          "type": "text",
          "content": "Had coffee and read news",
          "created_at": "2024-01-15T08:30:00Z",
          "metadata": {}
        }
      ]
    }
  ]
}

Complexity: Low - Prisma query results are plain objects that serialize with JSON.stringify
Priority: High - essential for backups/migrations
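
The envelope above could be assembled roughly as follows. The interface and function names are assumptions; the field names mirror the sample JSON. Note that `Date.prototype.toISOString()` includes milliseconds, so timestamps will look like `2024-01-15T20:30:00.000Z`.

```typescript
// Types mirror the sample JSON; the names themselves are hypothetical.
interface ExportedEvent {
  id: string;
  type: string;
  content: string;
  created_at: string;
  metadata: Record<string, unknown>;
}

interface ExportedEntry {
  date: string;
  journal: { title: string; content: string; generated_at: string } | null;
  events: ExportedEvent[];
}

interface ExportEnvelope {
  exported_at: string;
  user_id: string;
  format_version: "1.0";
  entries: ExportedEntry[];
}

function buildJsonExport(
  userId: string,
  entries: ExportedEntry[],
  now: Date = new Date(),
): string {
  const envelope: ExportEnvelope = {
    exported_at: now.toISOString(),
    user_id: userId,
    format_version: "1.0",
    // Sort a copy by date so the output is stable across runs.
    entries: entries.slice().sort((a, b) => a.date.localeCompare(b.date)),
  };
  return JSON.stringify(envelope, null, 2);
}
```

Keeping `format_version` in the envelope lets a future importer detect and migrate older backups.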

2.3 PDF (.pdf)

Description: Print-ready formatted document.

Technical Approach:

Use pdfkit or puppeteer (headless Chrome) for generation
Puppeteer recommended for complex layouts/CSS support
Template options:
- Simple: Title + content (minimal styling)
- Full: Events listed with diary page formatted
Page breaks handled for multi-day exports

Complexity: Medium - requires additional dependency
Priority: Medium - high user demand for print/export

2.4 HTML (.html)

Description: Web-viewable static pages.

Technical Approach:

Single HTML file with embedded CSS
Include basic navigation for multi-day exports
Responsive design with print media queries

Structure:

<!DOCTYPE html>
<html>
<head>
  <title>DearDiary Export</title>
  <style>
    body { font-family: system-ui; max-width: 800px; margin: 0 auto; padding: 2rem; }
    .entry { margin-bottom: 2rem; }
    .meta { color: #666; font-size: 0.9rem; }
  </style>
</head>
<body>
  <h1>January 2024</h1>
  <div class="entry">
    <h2>January 15, 2024</h2>
    <div class="meta">5 events</div>
    <p>Diary content...</p>
  </div>
</body>
</html>

Complexity: Low-Medium - string generation with CSS
Priority: Medium - good for web publishing
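
Because diary text is user-supplied, string-based HTML generation needs escaping before interpolation. The sketch below shows one way to do that; `HtmlEntry`, `escapeHtml`, and `renderHtml` are illustrative names, and the CSS is copied from the structure above.

```typescript
// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical view model for one day.
interface HtmlEntry {
  heading: string; // e.g. "January 15, 2024"
  eventCount: number;
  content: string;
}

function renderHtml(title: string, entries: HtmlEntry[]): string {
  const body = entries
    .map((e) =>
      [
        '  <div class="entry">',
        `    <h2>${escapeHtml(e.heading)}</h2>`,
        `    <div class="meta">${e.eventCount} events</div>`,
        `    <p>${escapeHtml(e.content)}</p>`,
        "  </div>",
      ].join("\n"),
    )
    .join("\n");

  return `<!DOCTYPE html>
<html>
<head>
  <title>DearDiary Export</title>
  <style>
    body { font-family: system-ui; max-width: 800px; margin: 0 auto; padding: 2rem; }
    .entry { margin-bottom: 2rem; }
    .meta { color: #666; font-size: 0.9rem; }
  </style>
</head>
<body>
  <h1>${escapeHtml(title)}</h1>
${body}
</body>
</html>
`;
}
```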

2.5 ePub (.epub)

Description: Ebook format for e-readers.

Technical Approach:

Use epub-gen or similar library
Structure: One chapter per day or per month
Include cover image with app branding
Metadata: Title, author, generated date

Complexity: High - requires ebook-specific libraries
Priority: Low - niche use case, can be deprioritized

3. Export Scope

3.1 Single Diary

Export one day's journal + events
API: GET /api/v1/export?date=2024-01-15
Returns single entry with all related data

3.2 Date Range

Export all entries (journals and events) between start and end dates, inclusive
API: GET /api/v1/export?start=2024-01-01&end=2024-01-31
Batch query: Prisma where: { date: { gte: start, lte: end } }
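
One way to turn the query parameters from 3.1 and 3.2 into the `where` clause shown above. This is a sketch that assumes the `date` column stores `YYYY-MM-DD` strings, so lexicographic comparison matches chronological order; `buildDateFilter` and `isValidIsoDate` are hypothetical helpers.

```typescript
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Accept only real calendar dates in YYYY-MM-DD form.
function isValidIsoDate(value: string): boolean {
  if (!ISO_DATE.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  // Round-tripping rejects dates the engine would otherwise roll over (e.g. Feb 30).
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}

interface DateFilter {
  date: string | { gte: string; lte: string };
}

function buildDateFilter(params: { date?: string; start?: string; end?: string }): DateFilter {
  if (params.date !== undefined) {
    if (!isValidIsoDate(params.date)) throw new Error("invalid date");
    return { date: params.date };
  }
  if (params.start !== undefined && params.end !== undefined) {
    if (!isValidIsoDate(params.start) || !isValidIsoDate(params.end)) {
      throw new Error("invalid start or end date");
    }
    if (params.start > params.end) throw new Error("start must not be after end");
    return { date: { gte: params.start, lte: params.end } };
  }
  throw new Error("provide either date, or both start and end");
}
```

The returned object can be spread into a Prisma `where` alongside the user filter, for example `{ userId, ...buildDateFilter(query) }`.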

3.3 All Data

Export entire user dataset
Include settings, metadata
Requires pagination for large datasets

4. Include/Exclude Options

4.1 Content Filters

| Option | Description | Implementation |
| --- | --- | --- |
| `events_only` | Raw events without AI-generated diaries | Filter journals from response |
| `diaries_only` | Only generated diary pages | Filter events from response |
| `with_media` | Include media file references | Include `mediaPath` field |
| `without_media` | Exclude media references | Omit `mediaPath` field |
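
The filters in the table could be applied to each entry after loading, as sketched below. The `RawEntry` shape is an assumption, and so is the behaviour when neither media option is given (media references are kept). Conflicting filters are rejected rather than silently resolved.

```typescript
type ContentFilter = "events_only" | "diaries_only" | "with_media" | "without_media";

// Hypothetical in-memory shapes for one day of data.
interface RawEvent {
  id: string;
  content: string;
  mediaPath?: string | null;
}

interface RawEntry {
  date: string;
  journal: { title: string; content: string } | null;
  events: RawEvent[];
}

function applyFilters(entry: RawEntry, filters: ContentFilter[]): RawEntry {
  const has = (f: ContentFilter) => filters.includes(f);
  if (has("events_only") && has("diaries_only")) {
    throw new Error("events_only and diaries_only are mutually exclusive");
  }
  if (has("with_media") && has("without_media")) {
    throw new Error("with_media and without_media are mutually exclusive");
  }

  const events = has("diaries_only")
    ? []
    : entry.events.map((event) => {
        const copy: RawEvent = { ...event };
        // Drop the media reference entirely instead of nulling it out.
        if (has("without_media")) delete copy.mediaPath;
        return copy;
      });

  return {
    date: entry.date,
    journal: has("events_only") ? null : entry.journal,
    events,
  };
}
```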

4.2 Data Structure Options

interface ExportOptions {
  format: 'md' | 'json' | 'pdf' | 'html' | 'epub';
  scope: 'single' | 'range' | 'all';
  date?: string;
  startDate?: string;
  endDate?: string;
  include: {
    events: boolean;
    journals: boolean;
    media: boolean;
    settings: boolean;
  };
  organization: 'single_file' | 'folder';
  compress: boolean;
}
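
Since `date`, `startDate`, and `endDate` are all optional in the interface, the relationship between `scope` and the date fields has to be checked at runtime. The rules below are assumptions inferred from section 3, not a confirmed specification; `validateExportOptions` is a hypothetical helper that returns a list of problems (empty when the options are usable).

```typescript
interface ExportOptions {
  format: "md" | "json" | "pdf" | "html" | "epub";
  scope: "single" | "range" | "all";
  date?: string;
  startDate?: string;
  endDate?: string;
  include: {
    events: boolean;
    journals: boolean;
    media: boolean;
    settings: boolean;
  };
  organization: "single_file" | "folder";
  compress: boolean;
}

function validateExportOptions(options: ExportOptions): string[] {
  const errors: string[] = [];

  if (options.scope === "single" && !options.date) {
    errors.push("scope 'single' requires date");
  }
  if (options.scope === "range") {
    if (!options.startDate || !options.endDate) {
      errors.push("scope 'range' requires startDate and endDate");
    } else if (options.startDate > options.endDate) {
      // ISO YYYY-MM-DD strings compare correctly as plain strings.
      errors.push("startDate must not be after endDate");
    }
  }

  const { events, journals, media, settings } = options.include;
  if (!events && !journals && !media && !settings) {
    errors.push("include must enable at least one content type");
  }
  return errors;
}
```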

5. File Organization

5.1 Single File

All content in one file (.md, .json, .html)
Best for: small exports, JSON backups
Simple to implement

5.2 Folder Structure

export-2024-01-15/
├── index.html          # Main navigation
├── 2024-01-15/
│   ├── journal.md      # Diary page
│   ├── events.md       # Raw events
│   └── media/          # Photos, voice memos
├── 2024-01-14/
│   └── ...
└── manifest.json       # Export metadata

Best for: large exports with media
Use ZIP compression for download
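
The folder layout above can be planned as a list of relative paths before any file is written, which also gives the manifest its file list. A minimal sketch, with `planExportPaths` as a hypothetical helper; media files are left out because their names depend on the stored uploads.

```typescript
// Returns the relative paths of the generated text files, newest day first,
// matching the tree shown above.
function planExportPaths(exportDate: string, days: string[]): string[] {
  const root = `export-${exportDate}`;
  // ISO dates sort chronologically as strings; reverse for newest first.
  const newestFirst = days.slice().sort().reverse();

  const paths = [`${root}/index.html`];
  for (const day of newestFirst) {
    paths.push(`${root}/${day}/journal.md`, `${root}/${day}/events.md`);
  }
  paths.push(`${root}/manifest.json`);
  return paths;
}
```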

6. Compression Options

6.1 ZIP Archive

Default for folder exports > 10MB
Use the archiver package (Bun does not appear to ship a Bun.zip() API, so do not rely on one)
Include manifest with export details

Implementation:

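// Example: ZIP export flow.
// generateFiles, zip, and serveFile are app-specific helpers (not shown);
// zip could wrap the archiver package.
import { mkdtemp } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

async function exportZip(options: ExportOptions) {
  // Build the export tree in a fresh temporary directory
  const tempDir = await mkdtemp(join(tmpdir(), 'export-'));
  await generateFiles(tempDir, options);
  const zipPath = `${tempDir}.zip`;
  await zip(tempDir, zipPath);
  return serveFile(zipPath);
}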

7. Streaming Large Exports

7.1 Problem

Large exports (years of data) can exceed memory
Need progressive loading and streaming response

7.2 Solution: Server-Sent Events (SSE)

API Design:

POST /api/v1/export
Content-Type: application/json

{
  "format": "json",
  "startDate": "2020-01-01",
  "endDate": "2024-01-15"
}

Response (chunked):

event: progress
data: {"percent": 10, "stage": "loading_events"}

event: data
data: {"date": "2020-01-01", ...}

event: progress
data: {"percent": 20, "stage": "loading_journals"}

event: data
data: {"date": "2020-01-02", ...}

event: complete
data: {"total_entries": 1000, "export_size": "5MB"}
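
Each frame in the response above follows the SSE wire format: an `event:` line, a `data:` line, and a blank line. A sketch of producing those frames is shown here; `formatSseEvent` and `exportFrames` are hypothetical names, and the `"streaming"` stage label is an assumption. `JSON.stringify` without an indent argument never emits raw newlines, so the payload always fits on one `data:` line.

```typescript
type ExportEventName = "progress" | "data" | "complete";

// Serialize one SSE frame; the trailing blank line terminates the event.
function formatSseEvent(event: ExportEventName, payload: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Yield a data frame per entry, a progress frame every `progressEvery`
// entries, and a final complete frame.
function* exportFrames(
  entries: { date: string }[],
  progressEvery = 100,
): Generator<string> {
  let sent = 0;
  for (const entry of entries) {
    yield formatSseEvent("data", entry);
    sent += 1;
    if (sent % progressEvery === 0) {
      const percent = Math.floor((sent / entries.length) * 100);
      yield formatSseEvent("progress", { percent, stage: "streaming" });
    }
  }
  yield formatSseEvent("complete", { total_entries: sent });
}
```

The response should be served with `Content-Type: text/event-stream` so clients parse it as an event stream.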

7.3 Implementation Notes

Use Prisma cursor-based pagination for memory efficiency
Stream directly to response without buffering
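Provide progress updates every N records
The browser EventSource API only issues GET requests, so a client must read this POST response with fetch() and a stream reader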
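
Cursor-based pagination can be wrapped in an async generator so that only one page is held in memory at a time. The sketch below is deliberately independent of Prisma: `fetchPage` is a hypothetical callback, and the comment shows how it might map onto Prisma's `cursor`/`skip`/`take` arguments, assuming an `event` model ordered by `id`.

```typescript
interface Row {
  id: string;
}

// With Prisma, fetchPage might be implemented roughly as:
//   (cursor, take) => prisma.event.findMany({
//     take,
//     skip: cursor ? 1 : 0,                        // skip the cursor row itself
//     cursor: cursor ? { id: cursor } : undefined,
//     orderBy: { id: "asc" },
//   })
type FetchPage<T extends Row> = (cursor: string | undefined, take: number) => Promise<T[]>;

async function* paginate<T extends Row>(
  fetchPage: FetchPage<T>,
  pageSize = 500,
): AsyncGenerator<T> {
  let cursor: string | undefined;
  while (true) {
    const page = await fetchPage(cursor, pageSize);
    for (const row of page) yield row;
    // A short page means there is nothing left to read.
    if (page.length < pageSize) return;
    cursor = page[page.length - 1].id;
  }
}
```

A streaming handler could consume this with `for await (const row of paginate(fetchPage))` and write each row to the response as it arrives.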

8. Privacy & Security

8.1 Authentication

Require valid API key for all export endpoints
User can only export their own data

8.2 Sensitive Data Handling

Option: Password-protect exports
- Use AES-256 encryption for ZIP
- Prompt for password in UI
Option: redact sensitive entries
- Tag certain events as "private"
- Exclude from export by default

8.3 Media Files

Generate signed URLs for media export
Set expiration (24h default)
Don't include raw API keys in export
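
Signed URLs with an expiry can be built from an HMAC over the path and expiry time. This is a sketch using Node's built-in `crypto` module; the query parameter names (`expires`, `sig`), the payload layout, and the function names are assumptions, and the secret would come from server configuration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const DEFAULT_TTL_SECONDS = 24 * 60 * 60; // 24h default, as noted above

function sign(path: string, expires: number, secret: string): string {
  return createHmac("sha256", secret).update(`${path}:${expires}`).digest("hex");
}

function signMediaUrl(
  path: string,
  secret: string,
  nowMs: number = Date.now(),
  ttlSeconds: number = DEFAULT_TTL_SECONDS,
): string {
  const expires = Math.floor(nowMs / 1000) + ttlSeconds;
  return `${path}?expires=${expires}&sig=${sign(path, expires, secret)}`;
}

function verifyMediaUrl(url: string, secret: string, nowMs: number = Date.now()): boolean {
  const [path = "", query = ""] = url.split("?");
  const params = new URLSearchParams(query);
  const expires = Number(params.get("expires"));
  const sig = params.get("sig") ?? "";
  // Reject malformed or expired links before touching the signature.
  if (!Number.isFinite(expires) || expires * 1000 < nowMs) return false;

  const expected = Buffer.from(sign(path, expires, secret));
  const actual = Buffer.from(sig);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

Only the signed URL goes into the export; the secret itself never leaves the server.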

8.4 Audit Logging

Log export requests (who, when, scope)
Store in new ExportLog model

9. Database Schema Changes

9.1 New Models

model ExportLog {
  id          String   @id @default(uuid())
  userId      String
  format      String
  scope       String
  startDate   String?
  endDate     String?
  recordCount Int
  sizeBytes   Int?
  status      String   @default("pending")
  createdAt   DateTime @default(now())
  completedAt DateTime?

  user User @relation(fields: [userId], references: [id], onDelete: Cascade)
}

model ScheduledExport {
  id          String   @id @default(uuid())
  userId      String
  name        String
  format      String
  scope       String   @default("all")
  frequency   String   @default("weekly")
  includeJson Json?
  enabled     Boolean  @default(true)
  lastRunAt   DateTime?
  createdAt   DateTime @default(now())
  updatedAt   DateTime @updatedAt

  user User @relation(fields: [userId], references: [id], onDelete: Cascade)
}

10. API Changes

10.1 New Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | `/api/v1/export` | Create export job |
| GET | `/api/v1/export/:id` | Get export status |
| GET | `/api/v1/export/:id/download` | Download export file |
| GET | `/api/v1/exports` | List export history |
| DELETE | `/api/v1/export/:id` | Delete export |
| GET | `/api/v1/scheduled-exports` | List scheduled exports |
| POST | `/api/v1/scheduled-exports` | Create schedule |
| PUT | `/api/v1/scheduled-exports/:id` | Update schedule |
| DELETE | `/api/v1/scheduled-exports/:id` | Delete schedule |

10.2 Request/Response Examples

Create Export:

// POST /api/v1/export
interface CreateExportRequest {
  format: 'md' | 'json' | 'pdf' | 'html' | 'epub';
  date?: string;           // single day
  startDate?: string;      // range start
  endDate?: string;       // range end
  include: {
    events: boolean;
    journals: boolean;
    media: boolean;
    settings: boolean;
  };
  organization: 'single_file' | 'folder';
  compress: boolean;
  password?: string;      // optional ZIP password
}

interface ExportResponse {
  id: string;
  status: 'pending' | 'processing' | 'completed' | 'failed';
  progress: number;
  downloadUrl?: string;
  expiresAt?: string;
}
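
The status endpoint could derive an `ExportResponse` from a stored job record as sketched below. `ExportJob` is a hypothetical internal shape, and the 24-hour download window is an assumption (retention is still an open question in this document). `downloadUrl` and `expiresAt` are only set once the job has completed.

```typescript
type ExportStatus = "pending" | "processing" | "completed" | "failed";

// Hypothetical internal job record.
interface ExportJob {
  id: string;
  status: ExportStatus;
  processed: number; // records written so far
  total: number; // records expected
  completedAt?: Date;
}

interface ExportResponse {
  id: string;
  status: ExportStatus;
  progress: number;
  downloadUrl?: string;
  expiresAt?: string;
}

function toExportResponse(job: ExportJob, ttlHours = 24): ExportResponse {
  // Avoid dividing by zero for empty exports.
  const progress =
    job.total === 0
      ? job.status === "completed" ? 100 : 0
      : Math.min(100, Math.floor((job.processed / job.total) * 100));

  const response: ExportResponse = { id: job.id, status: job.status, progress };
  if (job.status === "completed" && job.completedAt) {
    response.downloadUrl = `/api/v1/export/${job.id}/download`;
    response.expiresAt = new Date(
      job.completedAt.getTime() + ttlHours * 60 * 60 * 1000,
    ).toISOString();
  }
  return response;
}
```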

11. UI/UX Considerations

11.1 Export Page Location

Add to Settings page as "Export Data" section
Or create dedicated /export route

11.2 Export Modal

┌─────────────────────────────────────────┐
│  Export Your Data                       │
├─────────────────────────────────────────┤
│                                         │
│  Format:  [Markdown ▼]                  │
│            ○ Markdown                   │
│            ○ JSON                       │
│            ○ PDF                        │
│            ○ HTML                       │
│            ○ ePub                       │
│                                         │
│  Scope:   ○ This month                  │
│            ○ This year                  │
│            ○ All time                   │
│            ○ Custom range    [____]     │
│                                         │
│  Include: ☑ Generated diaries           │
│           ☑ Raw events                  │
│           ☐ Media files                 │
│           ☐ Settings                    │
│                                         │
│  Options: ○ Single file                 │
│            ○ Folder (with ZIP)          │
│                                         │
│           ☐ Password protect            │
│           [________]                    │
│                                         │
│  [Cancel]              [Export]         │
└─────────────────────────────────────────┘

11.3 Progress View

Show progress bar during export
Estimated time remaining
Cancel button for large exports
Email notification option (future)

11.4 Export History

List of past exports with:
- Date, format, scope
- Size, record count
- Download link (with expiration)
- Delete button

12. Scheduled Exports

12.1 Configuration Options

| Frequency | Description |
| --- | --- |
| `daily` | Every day at configured time |
| `weekly` | Every Sunday |
| `monthly` | First day of month |
| `quarterly` | Every 3 months |

12.2 Implementation

Use cron-style scheduling
Run as background job (the global setInterval or a dedicated worker)
Store exports in cloud storage (S3-compatible) or local
Send notification when ready
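
A scheduler needs to compute when each `ScheduledExport` should run next. The sketch below works in UTC and makes two assumptions the table does not settle: runs happen at midnight rather than at a per-user configured time, and `quarterly` means the first day of the next calendar quarter. `nextRunAt` is a hypothetical helper.

```typescript
type Frequency = "daily" | "weekly" | "monthly" | "quarterly";

// Returns the first scheduled run strictly after `after`, at 00:00 UTC.
function nextRunAt(frequency: Frequency, after: Date): Date {
  const year = after.getUTCFullYear();
  const month = after.getUTCMonth();
  const day = after.getUTCDate();

  switch (frequency) {
    case "daily":
      return new Date(Date.UTC(year, month, day + 1));
    case "weekly": {
      // getUTCDay() is 0 on Sunday, so a Sunday schedules the following Sunday.
      const daysUntilSunday = 7 - after.getUTCDay();
      return new Date(Date.UTC(year, month, day + daysUntilSunday));
    }
    case "monthly":
      // Date.UTC normalizes month 12 to January of the next year.
      return new Date(Date.UTC(year, month + 1, 1));
    case "quarterly":
      // Quarters start in months 0, 3, 6, and 9.
      return new Date(Date.UTC(year, month - (month % 3) + 3, 1));
  }
}
```

A background loop could periodically select enabled schedules whose next run, computed from `lastRunAt`, is in the past, run them, and update `lastRunAt`.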

12.3 Use Cases

Automated weekly backups
Monthly archive generation
Quarterly review compilation

13. Implementation Roadmap

Phase 1: Core Export (Week 1-2)

Add ExportLog model to schema
Implement JSON export endpoint
Implement Markdown export endpoint
Add single date/range query support
Basic export UI in Settings

Complexity: 3/5
Priority: High

Phase 2: Advanced Formats (Week 3)

HTML export
PDF export (using puppeteer)
ePub export (optional)

Complexity: 4/5
Priority: Medium

Phase 3: Large Exports (Week 4)

Streaming with SSE
ZIP compression
Progress reporting

Complexity: 5/5
Priority: Medium

Phase 4: Automation (Week 5)

Scheduled exports model
Background job scheduler
Scheduled exports UI

Complexity: 4/5
Priority: Low

Phase 5: Security & Polish (Week 6)

Password-protected ZIPs
Export audit logging
Media file handling
Edge cases and testing

Complexity: 3/5
Priority: Medium

14. Dependencies Required

| Package | Purpose | Version |
| --- | --- | --- |
| `pdfkit` | PDF generation | ^0.14.0 |
| `puppeteer` | HTML to PDF | ^21.0.0 |
| `archiver` | ZIP creation | ^6.0.0 |
| `epub-gen` | ePub creation | ^0.1.0 |
| `jszip` | Client-side ZIP | ^3.10.0 |

15. Testing Considerations

15.1 Unit Tests

Export formatters (MD, JSON, HTML)
Date range filtering
Include/exclude logic

15.2 Integration Tests

Full export workflow
Large dataset performance
Streaming response handling

15.3 Edge Cases

Empty date range
Missing media files
Export during active generation
Concurrent export requests

16. Priority Recommendation

| Feature | Priority | Rationale |
| --- | --- | --- |
| JSON/Markdown export | P0 | Core requirement for backups |
| Single/range export | P0 | Essential scope control |
| Export UI | P0 | User-facing feature |
| PDF export | P1 | High user demand |
| HTML export | P1 | Good alternative to PDF |
| Streaming exports | P2 | Performance for large data |
| ZIP compression | P2 | Usability for folder exports |
| ePub export | P3 | Niche, can skip |
| Scheduled exports | P3 | Automation, lower urgency |
| Password protection | P4 | Advanced, security theater |

17. Open Questions

Storage: Should exports be stored temporarily or generated on-demand?
Retention: How long to keep export downloads available?
Media handling: Include actual files or just references?
Third-party sync: Export to Google Drive, Dropbox?
Incremental exports: Only export new data since last export?

18. Summary

This feature set provides comprehensive data export capabilities while maintaining security and user privacy. Starting with JSON/Markdown exports should cover the most common use cases (backups, migration). PDF and HTML add print/web options. Streaming and compression enable handling of large datasets. Scheduled exports provide automation for power users.

Recommend implementing Phase 1 first to establish core functionality, then iterate based on user feedback.
