How We Built Onepilot: A Mobile AI IDE for iPhone

The technical decisions, architecture trade-offs, and hard lessons behind building a real terminal emulator with AI agent deployment on iOS, from SSH tunnels to VT100 rendering.

9 min read | Sofiane El Mokaddam, ELM Labs

TL;DR

  • We built a full VT100 terminal emulator on iOS: not a web view, but a real terminal with ANSI escape codes, cursor positioning, and scrollback
  • SSH connections use Citadel (pure Swift) with interactive PTY sessions that survive app backgrounding
  • The agent deployment wizard lets you install and configure AI coding agents on any server from your phone
  • Agent-agnostic architecture means any LLM provider, any agent framework, any messaging channel

Why We Built a Terminal for iPhone

The idea came from a real problem. We manage servers for our own products, for clients, and for side projects. When something goes wrong at 11pm, the options on mobile are limited. Existing SSH apps let you connect and type commands, but that is where they stop.

We wanted more. We wanted to SSH into a server, deploy an AI coding agent, give it instructions over Telegram, and monitor its work, all from the phone. No laptop required. Not a remote desktop hack, not a web-based terminal, but a native iOS app that works like a real development environment.

That is Onepilot.

The Hard Problem: A Real Terminal on iOS

The first decision was the most consequential. How do you render a terminal on iOS?

Option 1: WebView with xterm.js

The easy path. Wrap xterm.js in a WKWebView, pipe SSH data through JavaScript. Most mobile terminal apps do this. It works, but the trade-offs are significant:

  • Latency. Every keystroke crosses the native-to-JS bridge, gets processed by xterm.js, then the rendered output crosses back. On a fast connection this adds 10-20ms. On a slow one, it compounds.
  • Memory. A WebView is a full browser engine. On an iPhone with 6GB RAM shared between all apps, that matters.
  • Keyboard. iOS keyboard handling in WebViews is notoriously fragile. Custom key mappings, the special character bar, and Ctrl/Escape sequences all require workarounds that break across iOS versions.

Option 2: Native VT100 rendering

The hard path. Parse raw VT100/ANSI escape sequences in Swift, maintain a character grid, render with CoreText. This is what desktop terminals like iTerm2 and Terminal.app do.

We chose this path, using SwiftTerm, an open-source terminal emulator written in Swift by Miguel de Icaza. SwiftTerm handles the VT100 state machine, cursor positioning, color attributes, scrollback buffer, and text rendering.

The result: a terminal that feels native because it is native. No bridge latency, no WebView overhead, direct CoreText rendering at 120fps on ProMotion displays.
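To give a sense of what "parsing VT100/ANSI escape sequences" involves, here is a deliberately tiny sketch of a parser state machine. It is not SwiftTerm's implementation: the names MiniParser and TerminalEvent are hypothetical, and it only recognizes plain bytes and simple CSI sequences (ESC [ params command), ignoring everything else a real emulator must handle.

```swift
// Hypothetical event type: either a plain byte or a parsed CSI sequence.
enum TerminalEvent: Equatable {
    case text(UInt8)
    case csi(params: [Int], command: UInt8)
}

// Minimal state machine. State persists across feed() calls, so an escape
// sequence split over two network chunks is still parsed correctly.
struct MiniParser {
    private enum State { case ground, escape, csi }
    private var state = State.ground
    private var params: [Int] = []
    private var current: Int? = nil

    mutating func feed(_ bytes: [UInt8]) -> [TerminalEvent] {
        var events: [TerminalEvent] = []
        for byte in bytes {
            switch state {
            case .ground:
                if byte == 0x1B { state = .escape } else { events.append(.text(byte)) }
            case .escape:
                if byte == 0x5B {            // '[' starts a CSI sequence
                    state = .csi
                    params = []
                    current = nil
                } else {
                    state = .ground          // other escapes are ignored in this sketch
                }
            case .csi:
                switch byte {
                case 0x30...0x39:            // digit: accumulate the current parameter
                    current = (current ?? 0) * 10 + Int(byte - 0x30)
                case 0x3B:                   // ';' separates parameters
                    params.append(current ?? 0)
                    current = nil
                case 0x40...0x7E:            // final byte: the command letter
                    if let value = current { params.append(value) }
                    events.append(.csi(params: params, command: byte))
                    state = .ground
                default:
                    state = .ground          // unsupported byte: abandon the sequence
                }
            }
        }
        return events
    }
}

// "ESC [ 2 ; 5 H" (move cursor to row 2, column 5) split across two chunks.
var parser = MiniParser()
let firstChunk = parser.feed([0x1B, 0x5B, 0x32])
let secondChunk = parser.feed([0x3B, 0x35, 0x48, 0x41])
```

The first call returns no events because the sequence is still open; the second completes the cursor-position command and then emits the trailing "A" as text.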

The data pipeline

Server PTY → SSH channel → Citadel → ByteBuffer → SwiftTerm feed → CoreText render

Each SSH output chunk arrives as a ByteBuffer from Citadel (a pure Swift SSH library built on SwiftNIO). We convert it to a byte array and feed it directly into SwiftTerm's terminal engine. No string encoding step, no intermediate parsing: raw bytes to terminal state to pixels.
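One reason to skip the string step can be shown in a few lines: SSH chunks can split a multi-byte UTF-8 character, so decoding each chunk to a String on its own corrupts text that a byte-level feed handles correctly. The sketch below uses a hypothetical ByteSink class as a stand-in for a terminal engine that accepts raw bytes; it does not use Citadel or SwiftTerm.

```swift
import Foundation

// Hypothetical stand-in for a terminal engine that consumes raw bytes.
final class ByteSink {
    private(set) var bytes: [UInt8] = []
    func feed(byteArray: [UInt8]) { bytes.append(contentsOf: byteArray) }
    var text: String { String(decoding: bytes, as: UTF8.self) }
}

// "café\n" in UTF-8, with the two bytes of "é" (0xC3 0xA9) landing in
// different chunks, as can happen on a real connection.
let chunks: [[UInt8]] = [[0x63, 0x61, 0x66, 0xC3], [0xA9, 0x0A]]

// Byte-level feed: the sink sees one continuous byte stream.
let sink = ByteSink()
for chunk in chunks { sink.feed(byteArray: chunk) }

// Per-chunk string decoding: each half of "é" is invalid on its own and is
// replaced with U+FFFD before the pieces are joined.
let perChunk = chunks.map { String(decoding: $0, as: UTF8.self) }.joined()

print(sink.text == perChunk ? "same" : "per-chunk decoding corrupted the text")
```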

Input flows in reverse. When you tap a key on the custom mobile keypad, it becomes a byte sequence (with proper escape codes for special keys), goes through Citadel's TTYStdinWriter, and arrives at the server's PTY. The round-trip feels instant.

SSH on iOS: The Citadel Stack

We evaluated three SSH libraries for iOS:

Library | Language | Async | PTY Support | Maintenance
NMSSH | Objective-C (libssh2) | No | Limited | Abandoned
Shout | Swift (libssh2) | No | Limited | Minimal
Citadel | Pure Swift (SwiftNIO) | Yes (async/await) | Full | Active
Citadel was the clear choice. It is built on SwiftNIO, which means proper async/await support, no blocking threads, and efficient I/O multiplexing. The PTY API is clean:

try await client.withTTY { tty in
    // tty.stdout: AsyncSequence of output chunks
    // tty.stdin: TTYStdinWriter for input
}

Surviving the background

iOS is aggressive about killing background tasks. An SSH connection that goes idle for 30 seconds can be terminated by the OS. We handle this with:

  1. Background task assertions: request extra execution time when the app is backgrounded
  2. Connection health checks: periodic keepalive packets to detect dead connections
  3. Automatic reconnection: when a connection drops, reconnect and restore the terminal state from SwiftTerm's scrollback buffer

This does not make SSH connections immortal on iOS; nothing can. But it makes them resilient enough for real work sessions.
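The reconnection step can be sketched as a retry loop with capped exponential backoff. This is an assumption on our part for illustration: the article does not say which retry policy the app uses, and the names reconnectDelay and reconnect, as well as the 0.5 s base and 30 s cap, are hypothetical.

```swift
import Foundation

/// Delay before retrying after failed attempt `attempt` (0-based):
/// base * 2^attempt, capped at `cap` seconds.
func reconnectDelay(attempt: Int, base: TimeInterval = 0.5, cap: TimeInterval = 30) -> TimeInterval {
    let exponent = Double(max(0, min(attempt, 16)))
    return min(cap, base * pow(2.0, exponent))
}

/// Calls `connect` until it succeeds or `maxAttempts` is reached.
/// Returns the number of attempts used, or nil if every attempt failed.
/// `wait` is injectable so the loop can be exercised without really sleeping.
func reconnect(
    maxAttempts: Int,
    wait: (TimeInterval) -> Void = { Thread.sleep(forTimeInterval: $0) },
    connect: () -> Bool
) -> Int? {
    for attempt in 0..<maxAttempts {
        if connect() { return attempt + 1 }
        // No point waiting after the final failed attempt.
        if attempt < maxAttempts - 1 { wait(reconnectDelay(attempt: attempt)) }
    }
    return nil
}

// Simulated connection that succeeds on the third try.
var tries = 0
var waits: [TimeInterval] = []
let attemptsUsed = reconnect(maxAttempts: 5, wait: { waits.append($0) }) {
    tries += 1
    return tries == 3
}
```

In a real app the connect closure would open the SSH session and, on success, replay saved terminal state; the backoff keeps a flaky mobile network from triggering a tight reconnect loop.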

The Mobile Keyboard Problem

A terminal needs keys that a phone keyboard does not have. Ctrl, Escape, Tab, arrow keys, pipe, tilde, backtick: these are essential for shell work, and they are missing or buried on iOS.

We built a custom keypad that sits above the standard iOS keyboard:

  • Top row: Esc, Tab, Ctrl, arrows, pipe, common symbols
  • Long-press: Hold Ctrl and tap a letter key for Ctrl+C, Ctrl+Z, etc.
  • Swipe gestures: Swipe on the terminal for scrollback navigation

The key insight was treating the keypad as a first-class input device, not an afterthought bolted onto the keyboard. Each key generates the correct byte sequence: Tab sends \t, Escape sends \x1b, Ctrl+C sends \x03. These are byte-level operations, not string operations.
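The mapping from keypad keys to bytes can be sketched as a small enum. The TerminalKey type is hypothetical, not the app's actual code. The Ctrl rule (mask the letter's ASCII value with 0x1F) and the ESC [ A/B/C/D arrow sequences are the standard conventions for a terminal in normal cursor mode; application cursor mode uses different sequences and is left out here.

```swift
// Hypothetical key model for a custom terminal keypad.
enum TerminalKey {
    case tab, escape, enter
    case up, down, right, left
    case ctrl(Character)
    case text(String)

    /// Bytes to write to the PTY for this key.
    var bytes: [UInt8] {
        switch self {
        case .tab:    return [0x09]                 // \t
        case .escape: return [0x1B]                 // ESC
        case .enter:  return [0x0D]                 // carriage return
        case .up:     return [0x1B, 0x5B, 0x41]     // ESC [ A
        case .down:   return [0x1B, 0x5B, 0x42]     // ESC [ B
        case .right:  return [0x1B, 0x5B, 0x43]     // ESC [ C
        case .left:   return [0x1B, 0x5B, 0x44]     // ESC [ D
        case .ctrl(let letter):
            // Ctrl clears the top three bits: "c" (0x63) and "C" (0x43) both give 0x03.
            guard let ascii = letter.asciiValue, (0x40...0x7E).contains(ascii) else { return [] }
            return [ascii & 0x1F]
        case .text(let string):
            return Array(string.utf8)
        }
    }
}

let interrupt = TerminalKey.ctrl("c").bytes
let suspend = TerminalKey.ctrl("z").bytes
```

Because every key resolves to a byte array, the same write path serves ordinary typing and special keys alike.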

AI Agent Deployment

This is where Onepilot diverges from every other mobile SSH app. We did not just build a terminal; we built an AI operations center.

The setup wizard

Deploying an AI coding agent on a remote server normally requires:

  1. SSH into the server
  2. Install Node.js (or Python, or whatever the agent runtime needs)
  3. Install the agent package
  4. Configure API keys for the LLM provider
  5. Configure messaging channels (Telegram, Discord, Slack)
  6. Set up the agent as a system service
  7. Configure auto-restart on crash

That is a lot of steps to do on a phone keyboard. Our setup wizard automates the entire process. You pick a server, choose your LLM provider and API key, select a messaging channel, and the wizard runs the installation commands over SSH. Under the hood, it is still running shell commands, but sequenced, error-checked, and with clear progress feedback.
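"Sequenced, error-checked, and with clear progress feedback" can be sketched as a step runner that stops at the first non-zero exit code. Everything here is hypothetical: SetupStep, SetupResult, runSetup, and the example commands (including the example-agent package name) are placeholders, and the execute closure stands in for whatever runs a command over SSH.

```swift
// Hypothetical description of one wizard step.
struct SetupStep {
    let title: String
    let command: String
}

enum SetupResult: Equatable {
    case completed
    case failed(step: String, exitCode: Int32)
}

/// Runs steps in order, reporting progress, and stops at the first failure.
/// `execute` returns the command's exit code (0 means success).
func runSetup(
    _ steps: [SetupStep],
    execute: (String) -> Int32,
    progress: (String) -> Void = { print($0) }
) -> SetupResult {
    for (index, step) in steps.enumerated() {
        progress("[\(index + 1)/\(steps.count)] \(step.title)")
        let exitCode = execute(step.command)
        guard exitCode == 0 else {
            return .failed(step: step.title, exitCode: exitCode)
        }
    }
    return .completed
}

// Placeholder commands; a real wizard would tailor these to the server.
let steps = [
    SetupStep(title: "Check runtime", command: "command -v node"),
    SetupStep(title: "Install agent", command: "npm install -g example-agent"),
    SetupStep(title: "Enable service", command: "systemctl --user enable --now example-agent"),
]

// Fake executor that fails the install step, to show the early exit.
var executed: [String] = []
let result = runSetup(steps, execute: { command in
    executed.append(command)
    return command.hasPrefix("npm") ? 1 : 0
})
```

With the fake executor above, the third step never runs, which is the behavior you want when a later step depends on an earlier one.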

Agent-agnostic architecture

We intentionally avoided coupling to a single AI provider. Onepilot supports 23+ LLM providers:

  • Commercial APIs: Claude (Anthropic), GPT (OpenAI), Gemini (Google), Mistral, Cohere
  • Local and self-hosted runtimes for open models: Ollama, LM Studio, vLLM, llama.cpp
  • Cloud platforms: AWS Bedrock, Azure OpenAI, Google Vertex AI, Together AI, Fireworks

The agent framework we use (OpenClaw) is designed to work with any OpenAI-compatible API endpoint. This means you can run a local Ollama instance on your server with a $0 API cost and still get a fully functional AI coding agent.
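In practice, "OpenAI-compatible" means the request shape stays the same and only the base URL, model name, and API key change. The sketch below builds the JSON body and endpoint URL for a chat completion; the Provider, ChatRequest, and ChatMessage types are hypothetical, the model name and key are placeholders, and this is not OpenClaw's code. The Ollama URL assumes its default local port and OpenAI-compatible /v1 path.

```swift
import Foundation

struct ChatMessage: Codable, Equatable {
    let role: String
    let content: String
}

struct ChatRequest: Codable, Equatable {
    let model: String
    let messages: [ChatMessage]
}

// Hypothetical provider description: everything provider-specific lives here.
struct Provider {
    let baseURL: URL
    let apiKey: String?

    var chatCompletionsURL: URL { baseURL.appendingPathComponent("chat/completions") }

    var headers: [String: String] {
        var result = ["Content-Type": "application/json"]
        if let apiKey = apiKey { result["Authorization"] = "Bearer \(apiKey)" }
        return result
    }
}

// Same request type, two different backends.
let ollama = Provider(baseURL: URL(string: "http://localhost:11434/v1")!, apiKey: nil)
let hosted = Provider(baseURL: URL(string: "https://api.openai.com/v1")!, apiKey: "sk-placeholder")

let chatRequest = ChatRequest(
    model: "llama3",  // placeholder model name
    messages: [ChatMessage(role: "user", content: "Summarize the last deploy log.")]
)
let body = try JSONEncoder().encode(chatRequest)
```

Sending the body is then an ordinary HTTP POST to chatCompletionsURL with the provider's headers.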

The Soul Designer

Every AI agent has a system prompt that shapes its behavior. Our Soul Designer lets you customize your agent's personality, expertise, and guardrails directly from the app. Want an agent that specializes in Python and refuses to touch production databases? Configure that in the Soul Designer. Want one that writes verbose comments and always runs tests? Set that up.

This is not just prompt engineering; it is agent configuration with persistence. The soul configuration is stored on the server alongside the agent, so it survives restarts and updates.
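One way to picture "configuration with persistence" is a small Codable value that serializes to a file and renders to a system prompt. This is purely illustrative: the SoulConfig type, its fields, and the prompt layout are hypothetical and do not describe Onepilot's actual storage format.

```swift
import Foundation

// Hypothetical shape for a persisted agent configuration.
struct SoulConfig: Codable, Equatable {
    var persona: String
    var expertise: [String]
    var guardrails: [String]

    /// Renders the configuration into a plain-text system prompt.
    var systemPrompt: String {
        var lines = [persona]
        if !expertise.isEmpty {
            lines.append("Expertise: " + expertise.joined(separator: ", "))
        }
        for rule in guardrails {
            lines.append("Rule: " + rule)
        }
        return lines.joined(separator: "\n")
    }
}

let soul = SoulConfig(
    persona: "You are a careful backend engineer.",
    expertise: ["Python"],
    guardrails: ["Never modify production databases.", "Always run tests before committing."]
)

// Serialize for storage next to the agent, then load it back.
let savedSoul = try JSONEncoder().encode(soul)
let restoredSoul = try JSONDecoder().decode(SoulConfig.self, from: savedSoul)
```

Because the stored form is data rather than a hand-edited prompt string, the same configuration can be re-rendered after an agent restart or update.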

Beyond the Terminal

A terminal is necessary but not sufficient for mobile development. Onepilot includes:

  • File browser with syntax highlighting for 20+ languages, powered by tree-sitter
  • Git integration: diffs, commit history, branch management, all through a native UI
  • Cron job manager: create, edit, enable/disable scheduled tasks
  • Multi-server management: switch between servers without disconnecting

Each of these features works over the same SSH connection. There is no separate protocol, no agent running on the server (beyond the AI agent), no special software to install. If you can SSH into it, Onepilot can manage it.

Architecture Decisions We Would Make Again

SwiftUI over UIKit. SwiftUI's declarative model made it possible to build complex UI with a small team. The conversation view, settings screens, and onboarding flows are all SwiftUI. The terminal view itself is SwiftTerm's UIKit view wrapped in a UIViewRepresentable, because terminal rendering needs pixel-level control that SwiftUI's canvas does not provide.

Citadel over libssh2 wrappers. The async/await integration alone was worth it. No callback pyramids, no thread synchronization headaches. When you are managing multiple SSH sessions concurrently (terminal + file operations + agent status checks), proper async I/O is essential.

Actor isolation for SSH state. Swift's actor model prevents data races in SSH session management. The SSHSessionManager is an actor, which means concurrent access to connection state is serialized automatically. No locks, no race conditions, no mysterious crashes.
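A minimal sketch of the idea, assuming nothing about the real SSHSessionManager beyond it being an actor: the SessionRegistry type, its methods, and the host names below are hypothetical. A hundred concurrent tasks update shared state, and the actor serializes them without any explicit lock.

```swift
// Hypothetical actor holding per-host connection state.
actor SessionRegistry {
    private var states: [String: String] = [:]   // host -> state label
    private var updates = 0

    func setState(_ state: String, for host: String) {
        states[host] = state
        updates += 1
    }

    func state(for host: String) -> String? { states[host] }

    var updateCount: Int { updates }
}

/// Fires 100 concurrent updates at one registry and reports how many landed.
func runConcurrentUpdates() async -> Int {
    let registry = SessionRegistry()
    await withTaskGroup(of: Void.self) { group in
        for index in 0..<100 {
            group.addTask {
                // Each call hops onto the actor; updates never interleave.
                await registry.setState("connected", for: "host-\(index % 4)")
            }
        }
    }
    return await registry.updateCount
}

let totalUpdates = await runConcurrentUpdates()
```

With a plain class and an unsynchronized counter, the same loop would be a data race; with an actor, every update is applied exactly once.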

Supabase for the backend. User authentication, server credential sync (encrypted), subscription management, and analytics are all handled by Supabase. The row-level security policies ensure that users can only access their own data, and we never store SSH passwords or private keys on our servers.

What We Learned

Mobile is not a limitation; it is a constraint that forces good design. Every feature had to work on a 6.1-inch screen with a touch keyboard. This constraint produced a better UX than we would have built for desktop, because we could not hide complexity behind more screen real estate.

The terminal is the foundation, not the product. Users do not want a terminal app. They want to solve problems on their servers. The terminal is the mechanism. The agent deployment, file management, and git integration are what make it useful.

Agent-agnostic beats agent-specific. We considered building our own LLM integration from scratch. Instead, we made the architecture pluggable. This turned out to be the right call, because the AI landscape moves too fast to bet on one provider. Users bring their own API keys, their own model preferences, their own workflows.

Try It

Onepilot is free to get started. SSH into your servers, deploy AI agents, and manage your infrastructure from your iPhone. No subscription required for core features.

If you are building something similar or have questions about mobile SSH, terminal emulation, or AI agent architecture, reach out to us. We are happy to share what we have learned.
