I saw a post from Gregor where he talked about playing Clash Royale in a terminal. That sounded hard, so here's what I built:
A Rust CLI that receives your iPhone's AirPlay stream, decodes it, renders it as pixel graphics in Kitty, and forwards your mouse clicks back as touch events on the phone.
What I used
- I used UxPlay, an open-source AirPlay receiver. UxPlay receives the H.264 video stream and dumps it to a file.
- ffmpeg then reads the raw H.264 and decodes it into RGB frames.
- The Rust CLI reads RGB frames from ffmpeg, renders them in Kitty using the graphics protocol, and captures mouse events.
- Finally, I added WebDriverAgent, an HTTP server running on the iPhone (via Xcode) that accepts touch commands. The CLI translates mouse clicks into touch events and sends them over.
The data flow looks like this:
```
iPhone ──AirPlay──▶ UxPlay ──H.264 via FIFO──▶ ffmpeg ──RGB frames──▶ cr-cli ──Kitty Graphics──▶ Kitty Terminal
   ▲                                                                    │
   │                     Coordinate mapping                             │ Mouse events
   │                                                                    ▼
iPhone ◀──Touch── WDA ◀──HTTP POST── TouchInjector ◀────────────────────┘
```

Problem 1: UxPlay's Buffered Writes
The first issue I had was that no frames were appearing. UxPlay was receiving the AirPlay stream, ffmpeg was running, but nothing came through the pipe.
I then found out that UxPlay uses fopen with default stdio buffering to write the H.264 dump file. stdio buffers are typically 4-8KB. H.264 NAL units can be small, especially P-frames, so data would sit in the buffer and never reach ffmpeg until enough accumulated.
So the fix was to change UxPlay's source (uxplay.cpp), right after the fopen call for video dumping:

```cpp
video_dumpfile = fopen(fn.c_str(), "wb");
if (video_dumpfile) {
    setvbuf(video_dumpfile, NULL, _IONBF, 0);
}
```

_IONBF disables buffering entirely. Every fwrite immediately hits the file descriptor. Frames started flowing instantly.
Problem 2: FIFO Pipes Instead of Files
Originally I tried having UxPlay write to a regular file and having ffmpeg tail it. That turned into a race condition: ffmpeg doesn't know when new data arrives, and there's no clean way to seek on a growing file.
So the solution was named pipes (FIFOs). A FIFO looks like a file but behaves like a pipe: writes block until there's a reader, and reads block until there's data. Created with mkfifo:

```rust
Command::new("mkfifo").arg(&video_dump_path).status()?;
```

UxPlay writes H.264 to the FIFO, ffmpeg reads from it.
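On the ffmpeg side, the decode step can be sketched like this. A minimal sketch, not the actual code: the FIFO path is a placeholder and the helper name is mine, but the flags are standard ffmpeg options for decoding a raw H.264 elementary stream into packed RGB on stdout.

```rust
use std::process::{Command, Stdio};

// Build the ffmpeg arguments that read raw H.264 from the FIFO and emit
// raw RGB24 frames on stdout for the CLI to consume.
fn ffmpeg_args(fifo_path: &str) -> Vec<String> {
    ["-f", "h264",        // input: a raw H.264 elementary stream
     "-i", fifo_path,     // read from the named pipe (blocks until UxPlay writes)
     "-f", "rawvideo",    // output: raw frames
     "-pix_fmt", "rgb24", // packed RGB, 3 bytes per pixel
     "-"]                 // ...to stdout
    .iter()
    .map(|s| s.to_string())
    .collect()
}

fn main() {
    let args = ffmpeg_args("/tmp/video.fifo");
    // Each read of width * height * 3 bytes from the child's stdout is one frame.
    match Command::new("ffmpeg").args(&args).stdout(Stdio::piped()).spawn() {
        Ok(_child) => { /* hand _child.stdout to the frame loop */ }
        Err(e) => eprintln!("ffmpeg not available here: {e}"),
    }
}
```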
Problem 3: Rendering in the Terminal
Kitty's graphics protocol lets you display images at pixel resolution inside the terminal. Base64-encoding entire frames into the escape sequence was too slow, so I use file-based transmission instead: you write raw RGB pixels to a temp file and Kitty reads from it:

```
\x1b_Gq=2,a=T,t=f,f=24,s=WIDTH,v=HEIGHT,c=COLS,r=ROWS;BASE64_PATH\x1b\\
```

The t=f means "read from file", f=24 means RGB24 format, and c/r is how many terminal cells to scale the image into. The path is base64-encoded.
I use two temp files and alternate between them (double buffering) so Kitty is never reading a file that's being written to mid-frame.
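A per-frame sketch of the escape builder, under my own assumptions: the temp-file naming is invented, and the hand-rolled base64 helper is only there so the sketch stays dependency-free. The escape fields mirror the protocol string above, and `frame_idx % 2` alternates the two buffer paths.

```rust
// Minimal base64 (standard alphabet) so the sketch has no dependencies.
fn b64(data: &[u8]) -> String {
    const A: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::new();
    for c in data.chunks(3) {
        let n = (c[0] as u32) << 16
            | (*c.get(1).unwrap_or(&0) as u32) << 8
            | *c.get(2).unwrap_or(&0) as u32;
        out.push(A[(n >> 18) as usize & 63] as char);
        out.push(A[(n >> 12) as usize & 63] as char);
        out.push(if c.len() > 1 { A[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if c.len() > 2 { A[n as usize & 63] as char } else { '=' });
    }
    out
}

// Build the Kitty graphics escape for one frame. The key part is alternating
// paths so Kitty never reads a half-written file.
fn kitty_escape(frame_idx: u64, w: u32, h: u32, cols: u16, rows: u16) -> String {
    let path = format!("/tmp/cr-frame-{}.rgb", frame_idx % 2); // double buffer
    format!(
        "\x1b_Gq=2,a=T,t=f,f=24,s={},v={},c={},r={};{}\x1b\\",
        w, h, cols, rows,
        b64(path.as_bytes()) // Kitty expects the payload (here a path) base64-encoded
    )
}

fn main() {
    println!("{:?}", kitty_escape(0, 1179, 2556, 60, 67));
}
```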
Problem 4: Touch Forwarding - Three Iterations
Attempt 1: Shell Command Templates
My first method was a generic shell command template system: you'd pass something like --touch "my-tool {phase} {x} {y}" and the CLI would spawn a subprocess for every touch event. That meant every mouse move spawned a process. This was painfully slow: each Command::new().status() forks a process, waits for it, and parses the exit code, so a single drag gesture spawned dozens of processes. To be fair, the current version still isn't fully optimized either, so there's more work that could be done.
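To make the cost concrete, here's a hypothetical reconstruction of that template approach; expand() and the exact substitution rules are my names and assumptions, not the actual code.

```rust
use std::process::Command;

// Substitute {phase}, {x}, {y} into the user-supplied command template.
fn expand(template: &str, phase: &str, x: i32, y: i32) -> String {
    template
        .replace("{phase}", phase)
        .replace("{x}", &x.to_string())
        .replace("{y}", &y.to_string())
}

fn main() {
    let cmd = expand("my-tool {phase} {x} {y}", "move", 120, 480);
    let parts: Vec<&str> = cmd.split_whitespace().collect();
    // Every single event paid this cost: fork, exec, wait.
    let _ = Command::new(parts[0]).args(&parts[1..]).status(); // blocks until exit
}
```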
Attempt 2: WebDriverAgent with Batched Actions
I replaced the shell template with direct HTTP calls to WebDriverAgent. WDA implements the W3C WebDriver protocol and runs on the iPhone via Xcode.
A big improvement was batching. Instead of sending individual down/move/up events, we accumulate all the points during a drag gesture and send them as a single W3C Actions sequence when the finger lifts:
```rust
match event.phase {
    TouchPhase::Begin => {
        self.pending.clear();
        self.pending.push((x, y));
    }
    TouchPhase::Move => {
        self.pending.push((x, y));
    }
    TouchPhase::End => {
        self.flush();
    }
}
```

On End we build a W3C Actions payload: a pointerMove to the start position, a pointerDown, a series of pointerMoves for the drag path, then a pointerUp.
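That payload can be sketched with plain string formatting; the real code presumably uses a JSON library, and actions_payload is a hypothetical name, but the shape follows the W3C Actions format.

```rust
// Build the W3C Actions body for a drag through `points` (device points):
// move to start, press, drag, lift.
fn actions_payload(points: &[(f64, f64)]) -> String {
    let (x0, y0) = points[0];
    let mut actions = format!(
        r#"{{"type":"pointerMove","duration":0,"x":{x0},"y":{y0}}},{{"type":"pointerDown","button":0}}"#
    );
    for &(x, y) in &points[1..] {
        actions.push_str(&format!(
            r#",{{"type":"pointerMove","duration":16,"x":{x},"y":{y}}}"#
        ));
    }
    actions.push_str(r#",{"type":"pointerUp","button":0}"#);
    format!(
        r#"{{"actions":[{{"type":"pointer","id":"finger1","parameters":{{"pointerType":"touch"}},"actions":[{actions}]}}]}}"#
    )
}

fn main() {
    // A three-point drag: press, move, lift.
    println!("{}", actions_payload(&[(100.0, 200.0), (100.0, 300.0), (100.0, 400.0)]));
}
```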
Attempt 3: Background Thread
Later I realized that the HTTP request to WDA blocks the main event loop, which meant no frames rendered while waiting for a response. The fix was to move the HTTP call to a background thread.
```rust
let (tx, rx) = mpsc::channel::<Value>();
thread::spawn(move || {
    for payload in rx {
        let _ = ureq::post(&actions_url).send_json(&payload);
    }
});
```

flush() now just sends the payload through a channel and returns immediately.
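The flush() side then reduces to a channel send. A runnable sketch with the HTTP call stubbed out so it runs anywhere; run_pipeline is a hypothetical name, and the real worker calls ureq as shown above.

```rust
use std::sync::mpsc;
use std::thread;

// flush() becomes a channel send; a background thread owns the blocking call.
// The HTTP request is stubbed as a simple count here.
fn run_pipeline(payloads: Vec<String>) -> usize {
    let (tx, rx) = mpsc::channel::<String>();
    let worker = thread::spawn(move || rx.into_iter().count());
    for p in payloads {
        tx.send(p).unwrap(); // returns immediately; frames keep rendering
    }
    drop(tx); // closing the channel ends the worker's loop
    worker.join().unwrap()
}

fn main() {
    let sent = run_pipeline(vec!["{\"actions\":[]}".to_string(); 2]);
    println!("worker handled {sent} payloads");
}
```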
Problem 5: Coordinate Mapping
Terminal cells aren't square pixels. A typical cell might be 8px wide and 16px tall. So when a user clicks a cell in the terminal, you can't just multiply by a constant, because:
- the phone image is centered in the terminal, so you subtract the viewport's left and top margins.
- the viewport's cell dimensions map to the phone's pixel dimensions. A click at 50% across the viewport should map to 50% across the phone screen.
- WDA uses "points", not pixels. An iPhone 15 Pro is 1179px wide but only 393 points. So after mapping terminal cells to device pixels, you divide by the scale factor (~3x).
```rust
let x = event.device.0 as f64 / self.scale; // pixels → points
let y = event.device.1 as f64 / self.scale;
```

Scale factors are computed automatically by querying WDA's /window/size endpoint and comparing against the known device pixel width.
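End to end, the mapping is two subtractions, two ratios, and a divide. A sketch with hypothetical names for the viewport geometry; the 1179px figure is the iPhone 15 Pro width from above, and the viewport numbers are made up.

```rust
// Terminal cell -> device pixel -> WDA point.
struct Viewport {
    left: u16, // cells of left margin (image is centered)
    top: u16,  // cells of top margin
    cols: u16, // cells the image spans horizontally
    rows: u16, // cells the image spans vertically
}

fn cell_to_point(
    col: u16, row: u16, vp: &Viewport,
    device_px: (f64, f64), scale: f64,
) -> (f64, f64) {
    // 1. Subtract the centering margins, 2. take the fraction across the viewport.
    let fx = (col - vp.left) as f64 / vp.cols as f64;
    let fy = (row - vp.top) as f64 / vp.rows as f64;
    // 3. Map proportionally into device pixels, then divide by the scale factor.
    (fx * device_px.0 / scale, fy * device_px.1 / scale)
}

fn main() {
    let vp = Viewport { left: 2, top: 1, cols: 20, rows: 40 };
    // A click dead-center in the viewport lands dead-center on the screen.
    let (x, y) = cell_to_point(12, 21, &vp, (1179.0, 2556.0), 3.0);
    println!("({x}, {y})");
}
```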
Problem 6: My stupid audio mistake
UxPlay supports audio output via GStreamer sinks, so I added -as osxaudiosink to play audio through macOS's native audio system. Then I realized no audio was playing at all.
So I dug into GStreamer plugin paths, worried about Anaconda's bundled GStreamer conflicting with Homebrew's, tried env_clear() to isolate the environment, renamed Anaconda's .dylib files...
MY DUMBASS didn't realize my phone was on silent mode.