MarkItUp

Convert PDFs, Office documents, images, audio, and other files (25 formats in total) to clean Markdown, with AI-powered OCR for images.

Features · Quick Start · API · Roadmap

🚀 Features

✨ Unified Conversion Engine

Single API for all 25 supported file formats (PDF → MD, DOCX → MD, JPG → MD, etc.)
Smart format detection with fallback handling

🤖 AI-Powered Image OCR

Google Gemini 2.0 Flash extracts text with context preservation
Graceful degradation on quota exceeded (shows helpful message)
Supports: JPG, JPEG, PNG, GIF, BMP, WEBP

⚡ Modern Web Interface

Real-time markdown preview (2000-char preview window)
Drag-and-drop file upload with visual feedback
Dark/light theme with localStorage persistence
Responsive design (mobile, tablet, desktop)
Scroll-reveal animations with IntersectionObserver

🔒 Privacy & Security

Files are processed on the machine running the app; nothing is sent to third parties except images submitted to Gemini for OCR, when that is enabled
Automatic cleanup of temp files after conversion
Secure filename validation (Werkzeug)
100MB upload limit (configurable)
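
As a rough illustration of the validation step, the sketch below checks an upload's extension and size using only the standard library. It is not the app's actual code: the real app relies on Werkzeug's `secure_filename` and Flask's `MAX_CONTENT_LENGTH`, and `validate_upload` plus the size-limit message are hypothetical names introduced here.

```python
from pathlib import PurePath
from typing import Optional

# Extensions taken from the Supported Formats table.
ALLOWED_EXTENSIONS = {
    "pdf", "docx", "doc", "epub", "txt", "pptx", "xlsx", "xls", "csv",
    "json", "xml", "html", "htm", "jpg", "jpeg", "png", "gif", "bmp", "webp",
    "mp3", "wav", "m4a", "flac", "zip", "msg",
}
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # 100MB default limit


def validate_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return an error message, or None if the upload looks acceptable."""
    # Keep only the final path component so "../../x.pdf" cannot escape tmp/.
    name = PurePath(filename.replace("\\", "/")).name
    if not name or "." not in name:
        return "Missing/invalid file"
    extension = name.rsplit(".", 1)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return "File type not supported"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "File exceeds the 100MB upload limit"  # hypothetical wording
    return None


if __name__ == "__main__":
    for candidate in ("report.PDF", "malware.exe", "noextension"):
        print(candidate, "->", validate_upload(candidate, 1024))
```

The extension check is case-insensitive, so `report.PDF` and `report.pdf` are treated the same way.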

📋 Supported Formats

| Category | Formats |
|---|---|
| Documents | PDF, DOCX, DOC, EPUB, TXT |
| Presentations | PPTX |
| Spreadsheets | XLSX, XLS, CSV |
| Data & Markup | JSON, XML, HTML, HTM |
| Images | JPG, JPEG, PNG, GIF, BMP, WEBP (+ OCR) |
| Audio | MP3, WAV, M4A, FLAC |
| Archives | ZIP, MSG (Outlook) |

🛠️ Tech Stack

Backend:

Framework: Flask 3.0+ (lightweight, production-ready)
Document Processing: Microsoft MarkItDown (AI-aware format parsing)
Image OCR: Google Generative AI SDK with Gemini 2.0 Flash
File Handling: Werkzeug (secure uploads + MIME detection)
Config Management: python-dotenv (12-factor app pattern)

Frontend:

Markup: HTML5 semantic structure with ARIA labels
Styling: Modern CSS3 (custom properties, flexbox/grid, dark theme)
Interactivity: Vanilla JavaScript (no framework overhead)
APIs: Fetch, FormData, Clipboard, IntersectionObserver, LocalStorage

Infrastructure:

Python: 3.10+ (type hints, pattern matching)
Upload Limit: 100MB (configurable in app.py)
Temp Storage: ./tmp/ with auto-cleanup on completion
Error Handling: Graceful fallbacks for quota/rate limits

📦 Quick Start

Prerequisites

Python 3.10+
pip or uv

Installation

Setup environment:

```
git clone https://github.com/indiser/MarkItUp.git
cd MarkItUp
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
```

Install dependencies:

```
pip install -r requirements.txt
# Optional: full MarkItDown support
pip install 'markitdown[all]'
```

Configure Gemini 2.0 Flash (optional, for image OCR):
```
cat > .env << EOF
GOOGLE_API_KEY=your_api_key_here
EOF
```
Get a free key at https://ai.google.dev/

Run:

```
python app.py
# → http://127.0.0.1:5000
```

📡 API Reference

`GET /`

Render web interface with embedded format list.

`GET /api/formats`

List all supported file formats.

Response:

```
{
  "formats": ["pdf", "docx", "xlsx", "jpg", "png", ...]
}
```

`POST /api/preview`

Generate markdown preview (first 2000 characters).

Request:

```
multipart/form-data
─ file: <File>
```

Response (Success):

```
{
  "preview": "# Document Title\n\nFirst 2000 chars of markdown..."
}
```

Response (Error):

```
{
  "error": "File type not supported"
}
```

HTTP Codes:

200: Success
400: Missing/invalid file
500: Conversion error

`POST /api/convert`

Full conversion with downloadable markdown file.

Request:

```
multipart/form-data
─ file: <File>
```

Response (Success):

```
Status: 200
Content-Type: text/markdown
Body: Markdown file contents (.md extension)
Headers: Content-Disposition: attachment
```

Response (Error):

```
{
  "error": "Conversion error: [reason]"
}
```

Processing Pipeline:

```
Upload
  ↓
Validate (extension, size)
  ↓
Save to ./tmp/
  ↓
Is Image?
  ├─ YES: Gemini 2.0 Flash OCR
  │       (+ quota fallback)
  └─ NO: MarkItDown conversion
  ↓
Stream .md download
  ↓
Cleanup ./tmp/
```
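
The save, route, and cleanup steps of the pipeline above can be sketched as follows. This is a simplified illustration, not the code in app.py: `convert_upload`, `ocr_image`, and `convert_document` are hypothetical names, the two converters are injected as plain callables, and the validation and download-streaming steps are left out.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable

IMAGE_EXTENSIONS = {"jpg", "jpeg", "png", "gif", "bmp", "webp"}


def convert_upload(
    filename: str,
    data: bytes,
    ocr_image: Callable[[Path], str],
    convert_document: Callable[[Path], str],
    tmp_dir: Path,
) -> str:
    """Save the upload, route it to OCR or document conversion, then clean up."""
    tmp_dir.mkdir(parents=True, exist_ok=True)
    suffix = Path(filename).suffix.lower()
    # A unique temp name avoids collisions between concurrent uploads.
    fd, tmp_name = tempfile.mkstemp(suffix=suffix, dir=tmp_dir)
    tmp_path = Path(tmp_name)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        if suffix.lstrip(".") in IMAGE_EXTENSIONS:
            return ocr_image(tmp_path)  # "Is Image? YES" branch
        return convert_document(tmp_path)  # "Is Image? NO" branch
    finally:
        # Runs on success and on error, so tmp/ never accumulates uploads.
        tmp_path.unlink(missing_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        result = convert_upload(
            "notes.txt",
            b"hello",
            ocr_image=lambda p: "ocr",
            convert_document=lambda p: p.read_text(),
            tmp_dir=Path(scratch),
        )
        print(result)
```

Putting the cleanup in a `finally` block is one way to guarantee that temporary files are removed even when a converter raises.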

🏗️ Project Structure

```
MarkItUp/
├── app.py                    # Flask app (300 lines)
│                             # ├─ Config & initialization
│                             # ├─ Image OCR via Gemini 2.0
│                             # └─ 3 API endpoints + helpers
│
├── requirements.txt          # Dependencies
├── .env                      # Configuration (GOOGLE_API_KEY)
├── .env.example              # Template
│
├── templates/
│   └── index.html            # SPA (500+ lines)
│                             # ├─ Semantic HTML5 structure
│                             # ├─ Embedded CSS for theming
│                             # ├─ Navbar with logo + theme toggle
│                             # ├─ Dropzone + file upload
│                             # ├─ File metadata card
│                             # ├─ Live markdown preview panel
│                             # └─ Fullscreen expand modal
│
├── static/
│   ├── style.css             # Responsive design (400 lines)
│   │                         # ├─ Dark/light theme via :root
│   │                         # ├─ Mobile-first breakpoints
│   │                         # ├─ Flexbox/Grid layouts
│   │                         # ├─ Animation keyframes
│   │                         # └─ Accessibility (focus states, contrast)
│   │
│   └── script.js             # Vanilla JS logic (400+ lines)
│                             # ├─ File upload handlers
│                             # ├─ Fetch API calls (/api/*)
│                             # ├─ Drag-and-drop events
│                             # ├─ Progress visualization
│                             # ├─ Theme persistence (localStorage)
│                             # ├─ Clipboard API
│                             # └─ Scroll reveal animations
│
└── tmp/                      # Temp uploads (auto-created, auto-cleaned)
```

🔐 Configuration

Environment Variables

```
# Required only for image OCR
GOOGLE_API_KEY=your_api_key_here

# Optional (override defaults)
MAX_UPLOAD_SIZE_MB=100        # Default: 100MB
TEMP_FOLDER=./tmp             # Default: ./tmp
```
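
One plausible way to read these variables with their documented defaults is sketched below. `load_settings` and the keys of the returned dictionary are hypothetical names; in the real app, python-dotenv's `load_dotenv()` would typically populate `os.environ` from `.env` before code like this runs.

```python
import os
from pathlib import Path
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read configuration from environment variables, applying the documented defaults."""
    max_mb = int(env.get("MAX_UPLOAD_SIZE_MB", "100"))
    return {
        # Flask's MAX_CONTENT_LENGTH expects a size in bytes.
        "max_content_length": max_mb * 1024 * 1024,
        "temp_folder": Path(env.get("TEMP_FOLDER", "./tmp")),
        # Image OCR is only available when an API key is configured.
        "ocr_enabled": bool(env.get("GOOGLE_API_KEY")),
    }


if __name__ == "__main__":
    print(load_settings({"MAX_UPLOAD_SIZE_MB": "50"}))
```

Accepting the environment as a parameter makes the function easy to exercise with a plain dictionary instead of the process environment.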

Runtime Tuning (app.py)

```
from pathlib import Path

# Adjust upload limit
app.config['MAX_CONTENT_LENGTH'] = 50 * 1024 * 1024  # 50MB

# Change temp folder
UPLOAD_FOLDER = Path(__file__).parent / 'tmp'
```

🚨 Troubleshooting

| Problem | Solution |
|---|---|
| Port 5000 in use | Change: `app.run(port=5001)` in app.py |
| Gemini quota exceeded | Enable billing: https://console.cloud.google.com/billing |
| MarkItDown converters missing | `pip install 'markitdown[all]'` |
| File upload fails | Check: size < 100MB, format supported, write permissions on `tmp/` |
| Preview shows "quota exceeded" | Use a different `GOOGLE_API_KEY`, or wait for the free-tier quota to reset (up to 24h) |

🧪 Development

Debug Mode

```
FLASK_DEBUG=1 python app.py
# Auto-reloads on code changes
```

Test Endpoints

```
# Preview
curl -X POST -F "file=@doc.pdf" http://localhost:5000/api/preview | jq

# Convert (save output)
curl -X POST -F "file=@doc.pdf" \
  -o converted.md \
  http://localhost:5000/api/convert

# Formats list
curl http://localhost:5000/api/formats | jq .formats
```

Code Quality

```
pip install black flake8 pytest
black app.py
flake8 app.py --max-line-length=100
```

🎯 Future Roadmap

Backend Enhancements

Phase 1: Async Processing (Q1 2026)

Celery + Redis for background jobs
WebSocket endpoint for real-time progress
Batch upload with job queue
Scheduled conversions (convert at specific time)

Phase 2: Multi-LLM & Resilience (Q2 2026)

Claude 3.5 Sonnet fallback (potentially higher OCR accuracy)
Provider abstraction layer (easy vendor switching)
OpenAI GPT-4o as secondary OCR option
Automatic provider failover on quota/error

Phase 3: Advanced Document Processing (Q3 2026)

Microsoft Document Intelligence API for:
- Table layout preservation
- Form field extraction
- Handwriting recognition
PDF layout detection (preserve columns, headers)
Post-processing: clean markdown linting

Phase 4: Database & Analytics (Q3-Q4 2026)

PostgreSQL for conversion history
User authentication (JWT + OAuth2)
Usage analytics dashboard
API rate limiting with quotas
Conversion metrics (speed, success rate)

Phase 5: Optimization (Q4 2026)

Redis caching for frequently converted docs
ETags/conditional requests (reduce bandwidth)
Incremental processing (stream large PDFs)
CDN integration for static assets

Frontend Enhancements

Phase 1: Rich Editor (Q1 2026)

CodeMirror integration for live markdown editing
Side-by-side source/preview toggle
Markdown formatting toolbar (bold, italic, lists, code)
Syntax highlighting for code blocks (highlight.js)

Phase 2: Advanced Preview (Q2 2026)

Mermaid diagram rendering
LaTeX math equation display (KaTeX)
Auto-generated table of contents with anchors
Custom CSS injection for preview styling
Diff viewer for version history

Phase 3: History & Collaboration (Q2-Q3 2026)

Persistent conversion history (IndexedDB)
Favorite conversions bookmarking
Shareable URLs for converted markdown (short-lived)
Collaborative editing with WebSocket sync
Comments/annotations on content

Phase 4: PWA & Offline (Q3 2026)

Service Worker for offline caching
Add to home screen (manifest.json)
Background sync for queued uploads
Installable desktop app (Electron wrapper)

Phase 5: Accessibility & I18n (Q4 2026)

WCAG 2.1 AA compliance audit
Full screen reader support (ARIA)
Keyboard navigation (Tab, Enter, Escape)
10+ language support (i18n)
RTL language support (Arabic, Hebrew)

Phase 6: Performance (Q4 2026)

Virtual scrolling for large previews
Code splitting / lazy routes
Image lazy loading in preview
Web Worker for markdown parsing (off main thread)
Compression (gzip, brotli)

🤝 Contributing

We welcome contributions! Please:

Fork the repository
Create a feature branch: `git checkout -b feature/amazing-feature`
Follow code style: `black app.py` + `flake8`
Commit with clear messages: `git commit -m 'Add amazing feature'`
Push: `git push origin feature/amazing-feature`
Open a Pull Request

📝 License

MarkItUp: MIT License
Dependencies:
- Microsoft MarkItDown — MIT
- Google Generative AI SDK — Apache 2.0
- Flask — BSD-3-Clause
