MarkItUp

Convert PDFs, Office documents, images, audio, and other files (25+ formats) to clean Markdown, with AI-powered OCR for images.

Python 3.10+ · Flask 3.0+ · License: MIT

Features · Quick Start · API · Roadmap


🚀 Features

Unified Conversion Engine

  • Single API for 25+ file formats (PDF → MD, DOCX → MD, JPG → MD, etc.)
  • Smart format detection with fallback handling

🤖 AI-Powered Image OCR

  • Google Gemini 2.0 Flash extracts text with context preservation
  • Graceful degradation when the API quota is exceeded (returns a helpful message instead of failing)
  • Supports: JPG, PNG, GIF, BMP, WEBP
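The quota fallback described above could look roughly like the sketch below. It is not taken from app.py: `QuotaExceededError`, `QUOTA_MESSAGE`, and `ocr_with_fallback` are hypothetical names, and the real SDK raises its own exception type, which this stand-in only imitates.

```python
from __future__ import annotations

from typing import Callable


class QuotaExceededError(Exception):
    """Hypothetical stand-in for the provider's quota or rate-limit error."""


# Illustrative wording; the message shown by the real app may differ.
QUOTA_MESSAGE = (
    "> Image OCR is temporarily unavailable: the API quota was exceeded. "
    "Try again later or configure a different GOOGLE_API_KEY."
)


def ocr_with_fallback(image_bytes: bytes, ocr: Callable[[bytes], str]) -> str:
    """Run OCR; on a quota error, return a Markdown note instead of raising."""
    try:
        return ocr(image_bytes)
    except QuotaExceededError:
        return QUOTA_MESSAGE


def always_over_quota(image_bytes: bytes) -> str:
    """Demo OCR callable that simulates an exhausted quota."""
    raise QuotaExceededError("quota exhausted")
```

Passing the OCR call in as a function keeps the fallback independent of any one provider, so the same wrapper works with a fake in tests.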

Modern Web Interface

  • Real-time markdown preview (2000-char preview window)
  • Drag-and-drop file upload with visual feedback
  • Dark/light theme with localStorage persistence
  • Responsive design (mobile, tablet, desktop)
  • Scroll-reveal animations with IntersectionObserver

🔒 Privacy & Security

  • Files are processed locally; only images are sent to a third party (Google Gemini), and only when OCR is enabled
  • Automatic cleanup of temp files after conversion
  • Secure filename validation (Werkzeug)
  • 100MB upload limit (configurable)

📋 Supported Formats

| Category      | Formats                                |
| ------------- | -------------------------------------- |
| Documents     | PDF, DOCX, DOC, EPUB, TXT              |
| Presentations | PPTX                                   |
| Spreadsheets  | XLSX, XLS, CSV                         |
| Data & Markup | JSON, XML, HTML, HTM                   |
| Images        | JPG, JPEG, PNG, GIF, BMP, WEBP (+ OCR) |
| Audio         | MP3, WAV, M4A, FLAC                    |
| Archives      | ZIP, MSG (Outlook)                     |
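
As an illustration of how the format table might drive extension-based detection, here is a minimal sketch. The extension sets are copied from the table; `FORMATS_BY_CATEGORY`, `category_for`, and `is_image` are hypothetical names, not functions from app.py.

```python
from __future__ import annotations

from pathlib import PurePath

# Extension groups copied from the Supported Formats table.
FORMATS_BY_CATEGORY = {
    "Documents": {"pdf", "docx", "doc", "epub", "txt"},
    "Presentations": {"pptx"},
    "Spreadsheets": {"xlsx", "xls", "csv"},
    "Data & Markup": {"json", "xml", "html", "htm"},
    "Images": {"jpg", "jpeg", "png", "gif", "bmp", "webp"},
    "Audio": {"mp3", "wav", "m4a", "flac"},
    "Archives": {"zip", "msg"},
}


def category_for(filename: str) -> str | None:
    """Return the table category for a filename, or None if unsupported."""
    # Only the final suffix counts, compared case-insensitively.
    ext = PurePath(filename).suffix.lower().lstrip(".")
    for category, extensions in FORMATS_BY_CATEGORY.items():
        if ext in extensions:
            return category
    return None


def is_image(filename: str) -> bool:
    """True when the file would take the OCR path instead of MarkItDown."""
    return category_for(filename) == "Images"
```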

🛠️ Tech Stack

Backend:

  • Framework: Flask 3.0+ (lightweight, production-ready)
  • Document Processing: Microsoft MarkItDown (AI-aware format parsing)
  • Image OCR: Google Generative AI SDK with Gemini 2.0 Flash
  • File Handling: Werkzeug (secure uploads + MIME detection)
  • Config Management: python-dotenv (12-factor app pattern)

Frontend:

  • Markup: HTML5 semantic structure with ARIA labels
  • Styling: Modern CSS3 (custom properties, flexbox/grid, dark theme)
  • Interactivity: Vanilla JavaScript (no framework overhead)
  • APIs: Fetch, FormData, Clipboard, IntersectionObserver, LocalStorage

Infrastructure:

  • Python: 3.10+ (type hints, pattern matching)
  • Upload Limit: 100MB (configurable in app.py)
  • Temp Storage: ./tmp/ with auto-cleanup on completion
  • Error Handling: Graceful fallbacks for quota/rate limits

📦 Quick Start

Prerequisites

Python 3.10+
pip or uv

Installation

  1. Set up the environment:

    git clone https://github.com/indiser/MarkItUp.git
    cd MarkItUp
    python -m venv venv
    source venv/bin/activate  # Windows: venv\Scripts\activate

  2. Install dependencies:

    pip install -r requirements.txt
    # Optional: full MarkItDown support
    pip install 'markitdown[all]'

  3. Configure Gemini 2.0 Flash (optional, for image OCR):

    cat > .env << EOF
    GOOGLE_API_KEY=your_api_key_here
    EOF

    Get a free key at: https://ai.google.dev/

  4. Run:

    python app.py
    # → http://127.0.0.1:5000

📡 API Reference

GET /

Render web interface with embedded format list.


GET /api/formats

List all supported file formats.

Response:

{
  "formats": ["pdf", "docx", "xlsx", "jpg", "png", ...]
}

POST /api/preview

Generate markdown preview (first 2000 characters).

Request:

multipart/form-data
─ file: <File>

Response (Success):

{
  "preview": "# Document Title\n\nFirst 2000 chars of markdown..."
}

Response (Error):

{
  "error": "File type not supported"
}

HTTP Codes:

  • 200: Success
  • 400: Missing/invalid file
  • 500: Conversion error
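
The mapping from outcomes to the status codes above might be sketched as follows. This is a framework-free approximation, not the code in app.py: `preview_response`, `ALLOWED_EXTENSIONS`, and the "No file provided" message are assumptions made for illustration, and the `convert` callable stands in for the real conversion step.

```python
from __future__ import annotations

from pathlib import PurePath
from typing import Callable

PREVIEW_LIMIT = 2000  # preview length documented for /api/preview

# Illustrative subset; the real list is served by GET /api/formats.
ALLOWED_EXTENSIONS = frozenset({"pdf", "docx", "xlsx", "jpg", "png"})


def preview_response(
    filename: str,
    convert: Callable[[str], str],
    allowed: frozenset[str] = ALLOWED_EXTENSIONS,
) -> tuple[dict, int]:
    """Return (JSON payload, HTTP status) for a preview request."""
    if not filename:
        return {"error": "No file provided"}, 400
    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext not in allowed:
        return {"error": "File type not supported"}, 400
    try:
        markdown = convert(filename)
    except Exception as exc:  # any conversion failure maps to a 500
        return {"error": f"Conversion error: {exc}"}, 500
    # Only the first PREVIEW_LIMIT characters are returned.
    return {"preview": markdown[:PREVIEW_LIMIT]}, 200


def failing_convert(filename: str) -> str:
    """Demo converter that always fails."""
    raise RuntimeError("corrupt file")
```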

POST /api/convert

Full conversion with downloadable markdown file.

Request:

multipart/form-data
─ file: <File>

Response (Success):

  • Status: 200
  • Content-Type: text/markdown
  • Body: Markdown file contents (downloaded with a .md extension)
  • Headers: Content-Disposition: attachment

Response (Error):

{
  "error": "Conversion error: [reason]"
}

Processing Pipeline:

Upload
  ↓
Validate (extension, size)
  ↓
Save to ./tmp/
  ↓
Is Image?
  ├─ YES: Gemini 2.0 Flash OCR
  │       (+ quota fallback)
  └─ NO: MarkItDown conversion
  ↓
Stream .md download
  ↓
Cleanup ./tmp/
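
The pipeline above can be approximated in a few lines. The sketch below is an assumption-laden illustration rather than the code in app.py: `run_pipeline` is a hypothetical name, the `ocr` and `convert` callables stand in for Gemini and MarkItDown, and it writes to the system temp directory by default instead of ./tmp/ so that it runs anywhere.

```python
from __future__ import annotations

import tempfile
from pathlib import Path
from typing import Callable

IMAGE_EXTENSIONS = {"jpg", "jpeg", "png", "gif", "bmp", "webp"}
MAX_BYTES = 100 * 1024 * 1024  # documented 100MB default


def run_pipeline(
    filename: str,
    data: bytes,
    allowed: set[str],
    ocr: Callable[[Path], str],
    convert: Callable[[Path], str],
    tmp_dir: str | None = None,
) -> str:
    """Validate, save, dispatch to OCR or conversion, then clean up."""
    # Validate (extension, size)
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in allowed:
        raise ValueError("File type not supported")
    if len(data) > MAX_BYTES:
        raise ValueError("File exceeds the upload limit")

    # Save to a temp file (the app itself uses ./tmp/)
    with tempfile.NamedTemporaryFile(
        suffix=f".{ext}", dir=tmp_dir, delete=False
    ) as handle:
        handle.write(data)
        tmp_path = Path(handle.name)

    try:
        # Is Image? YES: OCR path. NO: document conversion path.
        step = ocr if ext in IMAGE_EXTENSIONS else convert
        return step(tmp_path)
    finally:
        # Cleanup runs even when the chosen step raises.
        tmp_path.unlink(missing_ok=True)
```

Putting the cleanup in a `finally` block is what keeps temp files from accumulating when a conversion fails partway through.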

🏗️ Project Structure

MarkItUp/
├── app.py                    # Flask app (300 lines)
│                             # ├─ Config & initialization
│                             # ├─ Image OCR via Gemini 2.0
│                             # └─ 3 API endpoints + helpers
│
├── requirements.txt          # Dependencies
├── .env                      # Configuration (GOOGLE_API_KEY)
├── .env.example             # Template
│
├── templates/
│   └── index.html           # SPA (500+ lines)
│                             # ├─ Semantic HTML5 structure
│                             # ├─ Embedded CSS for theming
│                             # ├─ Navbar with logo + theme toggle
│                             # ├─ Dropzone + file upload
│                             # ├─ File metadata card
│                             # ├─ Live markdown preview panel
│                             # └─ Fullscreen expand modal
│
├── static/
│   ├── style.css            # Responsive design (400 lines)
│   │                         # ├─ Dark/light theme via :root
│   │                         # ├─ Mobile-first breakpoints
│   │                         # ├─ Flexbox/Grid layouts
│   │                         # ├─ Animation keyframes
│   │                         # └─ Accessibility (focus states, contrast)
│   │
│   └── script.js            # Vanilla JS logic (400+ lines)
│                             # ├─ File upload handlers
│                             # ├─ Fetch API calls (/api/*)
│                             # ├─ Drag-and-drop events
│                             # ├─ Progress visualization
│                             # ├─ Theme persistence (localStorage)
│                             # ├─ Clipboard API
│                             # └─ Scroll reveal animations
│
└── tmp/                      # Temp uploads (auto-created, auto-cleaned)

🔐 Configuration

Environment Variables

# Required for image OCR
GOOGLE_API_KEY=your_api_key_here

# Optional (override defaults)
MAX_UPLOAD_SIZE_MB=100        # Default: 100MB
TEMP_FOLDER=./tmp             # Default: ./tmp

Runtime Tuning (app.py)

from pathlib import Path

# Adjust upload limit (Flask rejects larger request bodies with HTTP 413)
app.config['MAX_CONTENT_LENGTH'] = 50 * 1024 * 1024  # 50MB

# Change temp folder
UPLOAD_FOLDER = Path(__file__).parent / 'tmp'
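
One way the environment variables listed above could be turned into runtime settings is sketched below. The variable names and defaults come from this section; the `Settings` dataclass and `load_settings` function are hypothetical and may not match how app.py reads its configuration.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    google_api_key: str | None  # None disables image OCR
    max_content_length: int     # upload limit in bytes
    temp_folder: Path


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, applying documented defaults."""
    max_mb = int(env.get("MAX_UPLOAD_SIZE_MB", "100"))
    return Settings(
        google_api_key=env.get("GOOGLE_API_KEY") or None,
        max_content_length=max_mb * 1024 * 1024,
        temp_folder=Path(env.get("TEMP_FOLDER", "./tmp")),
    )
```

The resulting `max_content_length` is in bytes, which is the unit Flask expects for `MAX_CONTENT_LENGTH`.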

🚨 Troubleshooting

| Problem                        | Solution                                                               |
| ------------------------------ | ---------------------------------------------------------------------- |
| Port 5000 in use               | Change the port in app.py: app.run(port=5001)                          |
| Gemini quota exceeded          | Enable billing: https://console.cloud.google.com/billing               |
| MarkItDown converters missing  | pip install 'markitdown[all]'                                          |
| File upload fails              | Check: size < 100MB, format supported, write permissions on tmp/       |
| Preview shows "quota exceeded" | Switch GOOGLE_API_KEY or wait 24h for the free tier to reset           |

🧪 Development

Debug Mode

FLASK_DEBUG=1 python app.py
# Auto-reloads on code changes (FLASK_ENV was removed in Flask 2.3, so it has no effect on Flask 3.0+)

Test Endpoints

# Preview
curl -X POST -F "file=@doc.pdf" http://localhost:5000/api/preview | jq

# Convert (save output)
curl -X POST -F "file=@doc.pdf" \
  -o converted.md \
  http://localhost:5000/api/convert

# Formats list
curl http://localhost:5000/api/formats | jq .formats
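
For scripting the same calls without curl, a standard-library sketch follows. It only builds the multipart request; `BASE_URL` and `build_upload_request` are hypothetical names, and the form field name `file` is taken from the API Reference.

```python
from __future__ import annotations

import urllib.request
import uuid

BASE_URL = "http://localhost:5000"  # assumes the default dev server address


def build_upload_request(
    endpoint: str, filename: str, data: bytes
) -> urllib.request.Request:
    """Build a multipart/form-data POST with a single 'file' field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}{endpoint}",
        data=head + data + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

With the server running, the request can be sent with `urllib.request.urlopen(build_upload_request("/api/preview", "doc.pdf", pdf_bytes))` and the JSON body decoded with `json.load`.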

Code Quality

pip install black flake8 pytest
black app.py  # black formats Python files only; static/ and templates/ contain none
flake8 app.py --max-line-length=100

🎯 Future Roadmap

Backend Enhancements

Phase 1: Async Processing (Q1 2026)

  • Celery + Redis for background jobs
  • WebSocket endpoint for real-time progress
  • Batch upload with job queue
  • Scheduled conversions (convert at specific time)

Phase 2: Multi-LLM & Resilience (Q2 2026)

  • Claude 3.5 Sonnet fallback (higher OCR accuracy)
  • Provider abstraction layer (easy vendor switching)
  • OpenAI GPT-4o as secondary OCR option
  • Automatic provider failover on quota/error

Phase 3: Advanced Document Processing (Q3 2026)

  • Microsoft Document Intelligence API for:
    • Table layout preservation
    • Form field extraction
    • Handwriting recognition
  • PDF layout detection (preserve columns, headers)
  • Post-processing: clean markdown linting

Phase 4: Database & Analytics (Q3-Q4 2026)

  • PostgreSQL for conversion history
  • User authentication (JWT + OAuth2)
  • Usage analytics dashboard
  • API rate limiting with quotas
  • Conversion metrics (speed, success rate)

Phase 5: Optimization (Q4 2026)

  • Redis caching for frequently converted docs
  • ETags/conditional requests (reduce bandwidth)
  • Incremental processing (stream large PDFs)
  • CDN integration for static assets

Frontend Enhancements

Phase 1: Rich Editor (Q1 2026)

  • CodeMirror integration for live markdown editing
  • Side-by-side source/preview toggle
  • Markdown formatting toolbar (bold, italic, lists, code)
  • Syntax highlighting for code blocks (highlight.js)

Phase 2: Advanced Preview (Q2 2026)

  • Mermaid diagram rendering
  • LaTeX math equation display (KaTeX)
  • Auto-generated table of contents with anchors
  • Custom CSS injection for preview styling
  • Diff viewer for version history

Phase 3: History & Collaboration (Q2-Q3 2026)

  • Persistent conversion history (IndexedDB)
  • Favorite conversions bookmarking
  • Shareable URLs for converted markdown (short-lived)
  • Collaborative editing with WebSocket sync
  • Comments/annotations on content

Phase 4: PWA & Offline (Q3 2026)

  • Service Worker for offline caching
  • Add to home screen (manifest.json)
  • Background sync for queued uploads
  • Installable desktop app (Electron wrapper)

Phase 5: Accessibility & I18n (Q4 2026)

  • WCAG 2.1 AA compliance audit
  • Full screen reader support (ARIA)
  • Keyboard navigation (Tab, Enter, Escape)
  • 10+ language support (i18n)
  • RTL language support (Arabic, Hebrew)

Phase 6: Performance (Q4 2026)

  • Virtual scrolling for large previews
  • Code splitting / lazy routes
  • Image lazy loading in preview
  • Web Worker for markdown parsing (off main thread)
  • Compression (gzip, brotli)

🤝 Contributing

We welcome contributions! Please:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Follow code style: black app.py + flake8
  4. Commit with clear messages: git commit -m 'Add amazing feature'
  5. Push: git push origin feature/amazing-feature
  6. Open a Pull Request

📝 License

Released under the MIT License.

Made with ❤️ by MarkItUp Contributors

Powered by Microsoft MarkItDown + Google Gemini 2.0 Flash
