---
name: textsight
description: Convert images to ASCII text using a local WASM binary. Gives agents visual understanding without cloud vision APIs. Supports JPEG, PNG, GIF. Three render modes — plain, ANSI color, half-block RGB.
metadata:
  { "openclaw": { "emoji": "👁️", "os": ["darwin", "linux"], "requires": { "bins": ["wasmtime"] } } }
---

# TextSight — Image to Text (WASM)

Convert images to ASCII art locally. Your images never leave your machine.
No cloud API calls. No privacy concerns. No API credits spent.

**Source:** https://github.com/minkymorgan/textsight

---

## Setup (one time)

```bash
# 1. Install wasmtime (if not already installed)
curl https://wasmtime.dev/install.sh -sSf | bash
# Windows: winget install bytecodealliance.wasmtime

# 2. Download the WASM binary
#    (-f fails on HTTP errors instead of saving an error page; -L follows redirects)
curl -fsSLO https://cogbot.io/tools/textsight/textsight.wasm

# Verify it downloaded correctly (should be ~3.1MB)
ls -lh textsight.wasm
```
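
A file with the right name and roughly the right size is not necessarily a valid binary (a saved error page or a truncated transfer can look plausible). As an optional extra check, every WebAssembly binary begins with the four magic bytes `00 61 73 6d` (`\0asm`). The sketch below tests for them; the function name `is_wasm` is invented for this example and is not part of TextSight or wasmtime.

```bash
# is_wasm FILE: succeed if FILE starts with the WebAssembly magic bytes (\0asm).
# Hypothetical helper for illustration only.
is_wasm() {
  local magic
  # Read the first 4 bytes and render them as one lowercase hex string.
  magic=$(head -c 4 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "0061736d" ]
}
```

Usage: `is_wasm textsight.wasm || echo "not a WASM binary; download it again"`.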

> **Note:** Keep `textsight.wasm` in the directory you'll run it from,
> or always use the full path in commands.

---

## Basic Usage

```bash
# Set path to the WASM binary (adjust as needed)
WASM="/path/to/textsight.wasm"

# Plain ASCII — default 80 chars wide
wasmtime run --dir=. -- "$WASM" photo.jpg

# Specify width (columns)
wasmtime run --dir=. -- "$WASM" photo.jpg 120

# ANSI color mode (-c flag)
wasmtime run --dir=. -- "$WASM" -c photo.jpg 80

# Half-block RGB mode (-b flag)
wasmtime run --dir=. -- "$WASM" -b photo.jpg 60
```

> ⚠️ **Critical:** The `--dir` flag must grant wasmtime access to the
> directory containing **your image file** — not just the WASM binary.
> If your image is in `/tmp/`, use `--dir=/tmp`. If it's in the current
> working directory, `--dir=.` works. Mismatching this causes a cryptic
> "file not found" error inside the WASM sandbox.

---

## Argument Order

The argument order matters:

```
wasmtime run --dir=<image_dir> -- <wasm_path> [FLAGS] <image_file> [width]
```

| Argument | Required | Default | Notes |
|----------|----------|---------|-------|
| image_file | yes | — | Path relative to `--dir` |
| width | no | 80 | Output columns |
| -c | no | — | ANSI 24-bit color mode |
| -b | no | — | Half-block RGB mode |

The flag (`-c` or `-b`) must come **before** the image filename.
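
One way to avoid ordering mistakes is to assemble the command in a small shell function that always places the flag before the filename. This is an illustrative sketch only: the function name `textsight_render` and the mode labels `plain`, `color`, and `block` are invented here and are not options of the binary. It assumes `WASM` is set as in Basic Usage and that the image is in the current directory.

```bash
# textsight_render MODE IMAGE [WIDTH]
# MODE is a label local to this wrapper (plain | color | block), not a TextSight option.
textsight_render() {
  local mode="$1" image="$2" width="${3:-80}" flag=""
  case "$mode" in
    plain) flag="" ;;
    color) flag="-c" ;;
    block) flag="-b" ;;
    *) echo "unknown mode: $mode" >&2; return 2 ;;
  esac
  # ${flag:+"$flag"} expands to nothing in plain mode, so no empty argument is passed.
  wasmtime run --dir=. -- "$WASM" ${flag:+"$flag"} "$image" "$width"
}
```

Usage: `textsight_render color photo.jpg 120` runs the same command as the `-c` example in Basic Usage, with a width of 120.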

---

## Three Render Modes

### Plain ASCII (default)
```bash
wasmtime run --dir=. -- "$WASM" photo.jpg 80 > output.txt
# Read output.txt — the image structure is readable as text characters
```
Best for: agent analysis, retro display, minimal bandwidth. Output is
pure text — agents can read it directly without any viewer.

### ANSI Color (`-c`)
```bash
# View in terminal (if it supports 24-bit color)
wasmtime run --dir=. -- "$WASM" -c photo.jpg 80

# Convert to HTML for browser viewing (requires aha)
wasmtime run --dir=. -- "$WASM" -c photo.jpg 80 | aha --black > output.html
open output.html  # macOS
xdg-open output.html  # Linux
```
Best for: terminal display, sharing as HTML. Requires `brew install aha`
or `apt install aha` for browser viewing.

### Half-Block RGB (`-b`)
```bash
wasmtime run --dir=. -- "$WASM" -b photo.jpg 80 | aha --black > output.html
open output.html  # macOS; use xdg-open on Linux
```
Best for: highest visual fidelity. Uses Unicode half-block characters
(▀ ▄) plus ANSI color — near-photorealistic in terminal and HTML.
Note: doubles the output height compared to plain mode.

---

## Mode Comparison

| Mode | Flag | File size | Visual quality | Agent-readable? |
|------|------|-----------|----------------|-----------------|
| Plain | (none) | smallest | retro ASCII | ✅ yes |
| Color | -c | medium | good color | ⚠️ ANSI escapes |
| Half-block | -b | larger | near-photo | ⚠️ ANSI escapes |

For agent analysis of image content (reading text in images, identifying
structure), **plain mode is best** — the output is clean text with no
escape codes.
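
If only color output is on hand, one option is to strip the escape codes before passing it to an agent. The sketch below assumes the color modes emit standard SGR sequences (`ESC [ ... m`), which the source does not confirm; `strip_ansi` is a name chosen for this example. Re-running in plain mode is usually simpler when the original image is still available.

```bash
# strip_ansi: remove SGR color sequences (ESC [ ... m) from stdin.
# Assumes standard SGR escapes; any other control sequences are left untouched.
strip_ansi() {
  local esc
  esc=$(printf '\033')   # literal ESC byte, portable across GNU and BSD sed
  sed "s/${esc}\[[0-9;]*m//g"
}
```

Usage: `wasmtime run --dir=. -- "$WASM" -c photo.jpg 80 | strip_ansi > output.txt`.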

---

## Handling Image Paths

The WASM sandbox cannot see your full filesystem — only directories you
explicitly grant with `--dir`. Always match `--dir` to where your image lives:

```bash
# Image in current directory
wasmtime run --dir=. -- "$WASM" photo.jpg

# Image in /tmp
wasmtime run --dir=/tmp -- "$WASM" /tmp/screenshot.png

# Image elsewhere — grant that directory
wasmtime run --dir=/Users/me/photos -- "$WASM" /Users/me/photos/img.jpg
# Or just use the filename (relative to --dir):
wasmtime run --dir=/Users/me/photos -- "$WASM" img.jpg
```

> **Tip:** It's simplest to `cd` into the directory containing your image,
> then use `--dir=.` and just the filename. This always works.
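
The `cd` step can also be folded into a small wrapper that derives `--dir` from the image path. This is a sketch, not part of TextSight: the function name `textsight_at` is hypothetical, and it relies on the behavior shown above, where a bare filename resolves relative to the granted directory. It assumes `WASM` holds the path to the binary.

```bash
# textsight_at IMAGE [WIDTH]
# Grants only the image's own directory and passes the bare filename (plain mode).
textsight_at() {
  local image="$1" width="${2:-80}"
  local dir file
  dir=$(dirname -- "$image")     # e.g. /tmp for /tmp/screenshot.png, . for photo.jpg
  file=$(basename -- "$image")   # e.g. screenshot.png
  wasmtime run --dir="$dir" -- "$WASM" "$file" "$width"
}
```

Usage: `textsight_at /tmp/screenshot.png 100 > analysis.txt`.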

---

## Practical Workflow Example

```bash
# Screenshot is at ~/Desktop/screenshot.png
cd ~/Desktop
WASM="$HOME/.openclaw/workspace/textsight/textsight.wasm"  # use $HOME: ~ does not expand inside quotes

# Quick analysis — save to text and read
wasmtime run --dir=. -- "$WASM" screenshot.png 100 > analysis.txt
cat analysis.txt
```

---

## When to Use

- Reading text from images (whiteboards, screenshots, scanned docs)
- Offline image analysis without cloud API calls
- When the `image` tool is unavailable or you want to save API credits
- Privacy-sensitive images that must not leave your machine
- Quick sanity check on image content before deciding whether to invoke a heavier vision model

## When NOT to Use

- Already-digital text (PDFs with selectable text, HTML pages) — use a text extractor instead
- When you need high-accuracy OCR — use a dedicated OCR tool (tesseract, etc.)
- When the `image` tool is available and accuracy matters more than privacy
- Very small or low-resolution images (results degrade below ~200px wide)

---

## Troubleshooting

**`Error: failed to open file 'photo.jpg'`**
Your image isn't in the directory you granted with `--dir`. Check the path
and make sure `--dir` points to the folder containing the image.

**`wasmtime: command not found`**
Install wasmtime: `curl https://wasmtime.dev/install.sh -sSf | bash`
Then open a new terminal (or `source ~/.bashrc` / `source ~/.zshrc`).

**Output is garbled symbols (not ASCII art)**
You may be passing flags in the wrong order. The `-c` or `-b` flag must
come before the filename: `textsight.wasm -c photo.jpg 80`, not after.

**Half-block mode looks like garbage in terminal**
Your terminal may not support Unicode half-blocks or 24-bit color. Try
piping through `aha` to get an HTML file instead.

**Width argument ignored**
Make sure width is the last argument and is a plain integer: `80`, not `--width=80`.
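
Several of the problems above can be caught before wasmtime runs. The preflight sketch below checks the three most mechanical ones: wasmtime on `PATH`, the image file present, and a plain-integer width. `textsight_preflight` is a hypothetical helper name, not something TextSight ships, and it checks the image path as seen from the host shell, not from inside the sandbox.

```bash
# textsight_preflight IMAGE [WIDTH]
# Prints the first problem found to stderr and returns non-zero; returns 0 if all checks pass.
textsight_preflight() {
  local image="$1" width="${2:-80}"
  if ! command -v wasmtime >/dev/null 2>&1; then
    echo "wasmtime not found on PATH" >&2; return 1
  fi
  if [ ! -f "$image" ]; then
    echo "image not found: $image" >&2; return 1
  fi
  case "$width" in
    # Reject empty values and anything containing a non-digit, such as --width=80.
    ''|*[!0-9]*) echo "width must be a plain integer, got: $width" >&2; return 1 ;;
  esac
}
```

Usage: `textsight_preflight photo.jpg 80 && wasmtime run --dir=. -- "$WASM" photo.jpg 80`.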

---

## Specs

- **Binary size:** 3.1MB WASM
- **Latency:** <1s for most images
- **Memory:** ~20MB
- **Supported formats:** JPEG, PNG, GIF
- **Platform:** Any OS with wasmtime
- **Build:** `GOOS=wasip1 GOARCH=wasm go build -o textsight.wasm .`
- **Source:** https://github.com/minkymorgan/textsight
