

Yuy Chat



Beautiful TUI Chat for Local AI Models

Talk to Yuuki models in your terminal.
Streaming responses. Conversation history. Zero cloud required.


Get Started    Models    Yuy CLI    Demo



License   Rust   ratatui   Tokio   Termux   Linux   macOS   Windows




+--------------------------------------------------+
| yuy-chat v0.1.0        Yuuki-best | Balanced     |
+--------------------------------------------------+
|                                                  |
|  You: Explain async/await in Rust                |
|                                                  |
|  Yuuki: async/await in Rust allows you to write  |
|  asynchronous code that looks synchronous. The   |
|  async keyword marks a function as returning a   |
|  Future, and await suspends execution until the  |
|  Future resolves...                              |
|                                                  |
+--------------------------------------------------+
| Message: _                                       |
+--------------------------------------------------+
| Enter: Send | Ctrl+C: Menu | Ctrl+S: Save        |
+--------------------------------------------------+

A full chat experience in your terminal.


Select models interactively.
Stream responses word by word.
Save and reload conversations.
Switch presets on the fly.

All running locally on your machine.
All powered by ratatui + Rust.




What is yuy-chat?


yuy-chat is a terminal user interface (TUI) application for chatting with local AI models. Built with Rust and powered by ratatui, it provides a polished, keyboard-driven interface for real-time conversations with Yuuki models -- without ever leaving the terminal.

It connects to proven inference backends (llama.cpp and llamafile) and optionally to the HuggingFace Inference API for cloud-based generation. Model discovery, conversation management, preset switching, and streaming are all handled out of the box.

yuy-chat is the companion tool to yuy (the CLI for downloading and managing models). Together they form the complete local inference toolkit for the Yuuki project.




Features


Interactive Chat

Real-time streaming responses displayed word by word. Multi-line input with Shift+Enter. Scrollable message history with keyboard navigation.


Model Selector

Auto-discovers .gguf and .llamafile models from your local directory. Navigate with arrow keys, select with Enter. Refresh without restarting.


Conversation History

Save conversations as JSON files. Load previous chats from a built-in conversation browser. Delete old sessions you no longer need.


HuggingFace Cloud

Optional API integration for cloud-based inference. Configure your HF token in the settings screen. Local and cloud models appear side by side in the selector.

Generation Presets

Three built-in modes -- Creative (0.8 temp), Balanced (0.6 temp), and Precise (0.3 temp). Cycle between them with a single keypress. Custom presets planned for v0.2.


Settings Screen

Configure models directory, HuggingFace token, default preset, history saving, and UI theme -- all from within the TUI.


Cross-Platform

Runs on Termux (Android), Linux, macOS, and Windows. Same binary, same interface, same experience. Mobile-first defaults for constrained hardware.


Lightweight

~8 MB binary. ~20 MB idle RAM. ~50 ms startup. Built with Rust for zero-overhead performance and memory safety.




Installation


Prerequisites

  • Rust 1.70 or later (1.75+ recommended)
  • An inference runtime: llama.cpp or a .llamafile model
  • AI models in GGUF or Llamafile format (use yuy to download them)

From Source

git clone https://github.com/YuuKi-OS/yuy-chat
cd yuy-chat
cargo build --release
cargo install --path .

Termux (Android)

pkg install rust git
git clone https://github.com/YuuKi-OS/yuy-chat
cd yuy-chat
cargo build --release -j 1
cargo install --path .

Note: First compilation takes longer on ARM due to CPU constraints. Use -j 1 to avoid thermal throttling. Incremental builds are fast (~10 sec).


Verify

yuy-chat --version



Quick Start


# 1. Get a model (using yuy CLI)
yuy download Yuuki-best

# 2. Install a runtime
pkg install llama-cpp        # Termux
brew install llama.cpp       # macOS

# 3. Launch the chat
yuy-chat

The interface opens in the model selector. Pick a model with arrow keys and press Enter. Start typing and hit Enter to send messages. That's it.




Keyboard Reference


Model Selector

Key        Action
Up / k     Previous model
Down / j   Next model
Enter      Select model
R          Refresh model list
Q          Quit

Chat

Key           Action
Enter         Send message
Shift+Enter   New line
Ctrl+Enter    Force send
Up / Down     Scroll history (when input is empty)
Ctrl+C        Open menu
Ctrl+L        Clear chat
Ctrl+S        Save conversation
Backspace     Delete character

Menu

Key       Action
1         Change model
2         Cycle preset
3         Save conversation
4         Load conversation
5         Clear chat
6         Settings
Q / Esc   Back to chat

Settings

Key         Action
Up / Down   Navigate settings
Enter       Edit setting
Esc         Back to menu



Generation Presets


Preset     Temperature   Top P   Best For
Creative   0.8           0.9     Stories, brainstorming, creative writing
Balanced   0.6           0.7     General chat, explanations (default)
Precise    0.3           0.5     Code, math, factual answers

Cycle presets during a chat session via Ctrl+C then 2, or set a default in the configuration file.
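
A rough sketch of how these presets could map to sampling parameters -- the enum and method names below are illustrative, not the actual yuy-chat types:

#[derive(Clone, Copy)]
enum Preset {
    Creative,
    Balanced,
    Precise,
}

impl Preset {
    // (temperature, top_p) pairs matching the table above
    fn sampling(self) -> (f32, f32) {
        match self {
            Preset::Creative => (0.8, 0.9),
            Preset::Balanced => (0.6, 0.7),
            Preset::Precise  => (0.3, 0.5),
        }
    }

    // Single-keypress cycling: Creative -> Balanced -> Precise -> Creative
    fn cycle(self) -> Preset {
        match self {
            Preset::Creative => Preset::Balanced,
            Preset::Balanced => Preset::Precise,
            Preset::Precise  => Preset::Creative,
        }
    }
}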




Supported Formats and Runtimes


Model Formats

Format      Extension    Notes
GGUF        .gguf        Recommended. Requires llama.cpp
Llamafile   .llamafile   Self-executing. Zero dependencies
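
The auto-discovery mentioned in the Model Selector feature can be pictured as a walkdir pass that keeps anything with one of these extensions. The function below is a hedged sketch, not the actual scanner.rs code:

use std::path::{Path, PathBuf};
use walkdir::WalkDir;

// Collect every .gguf / .llamafile file under the models directory.
fn scan_models(dir: &Path) -> Vec<PathBuf> {
    WalkDir::new(dir)
        .into_iter()
        .filter_map(Result::ok)               // skip unreadable entries
        .map(|entry| entry.into_path())
        .filter(|path| {
            matches!(
                path.extension().and_then(|ext| ext.to_str()),
                Some("gguf") | Some("llamafile")
            )
        })
        .collect()
}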

Inference Runtimes

Runtime           Type               Notes
llama.cpp         Local subprocess   Default. Fast, CPU-optimized
llamafile         Local executable   Bundled runtime + model
HuggingFace API   Cloud HTTP         Optional. Requires token

yuy-chat auto-detects the appropriate runtime based on the selected model's format. For GGUF models, it searches for llama-cli, llama, or main binaries in PATH.
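
A minimal sketch of that lookup, assuming the which crate listed under Dependencies (resolve_runtime and ModelFormat are illustrative names, not the real API):

use std::path::{Path, PathBuf};

enum ModelFormat {
    Gguf,
    Llamafile,
}

fn resolve_runtime(format: ModelFormat, model: &Path) -> Option<PathBuf> {
    match format {
        // GGUF needs an external llama.cpp binary somewhere in PATH.
        ModelFormat::Gguf => ["llama-cli", "llama", "main"]
            .iter()
            .find_map(|bin| which::which(bin).ok()),
        // A .llamafile bundles its own runtime, so the model file is the executable.
        ModelFormat::Llamafile => Some(model.to_path_buf()),
    }
}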




HuggingFace Integration


Cloud-based inference is optional. To enable it:

  1. Get a token from huggingface.co/settings/tokens
  2. Open Settings in yuy-chat (Ctrl+C then 6)
  3. Navigate to "HuggingFace Token" and paste it

Cloud models then appear in the selector alongside local models:

> Yuuki-best-q5.gguf          2.3 GB  [Local]
  Yuuki-3.7-q4.gguf           1.8 GB  [Local]
  Yuuki-best (via API)        Cloud   [HuggingFace]

               Local                 Cloud
Speed          Faster (no network)   Depends on connection
Privacy        100% offline          Data sent to HF API
Storage        Requires disk space   None
Availability   Always                Requires internet



Configuration


Config File

Location: ~/.config/yuy-chat/config.toml

models_dir = "/home/user/.yuuki/models"
default_preset = "Balanced"
save_history = true
theme = "Dark"

# Optional
# hf_token = "hf_xxxxxxxxxxxxx"

Priority Order

  1. TUI settings -- changes made in the settings screen
  2. Config file -- ~/.config/yuy-chat/config.toml
  3. Defaults -- sensible defaults based on platform detection
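
For any single setting this precedence boils down to an ordered fallback. The snippet below is only a sketch with assumed names (effective_models_dir is not a real yuy-chat function):

use std::path::PathBuf;

// TUI value wins over the config-file value, which wins over the platform default.
fn effective_models_dir(tui: Option<PathBuf>, config_file: Option<PathBuf>) -> PathBuf {
    tui.or(config_file).unwrap_or_else(|| {
        dirs::home_dir()
            .unwrap_or_default()
            .join(".yuuki")
            .join("models")
    })
}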

Directory Layout

~/.config/yuy-chat/
    config.toml                     # configuration
    conversations/                  # saved chats
        conversation-20260206-143022.json
        conversation-20260206-150133.json

~/.yuuki/models/                    # models (shared with yuy CLI)
    Yuuki-best/
        yuuki-best-q4_0.gguf
    Yuuki-3.7/
        yuuki-3.7-q5_k_m.gguf

Platform-specific paths

Platform   Config path
Linux      ~/.config/yuy-chat/config.toml
macOS      ~/.config/yuy-chat/config.toml
Windows    C:\Users\{user}\AppData\Roaming\yuy-chat\config.toml
Termux     /data/data/com.termux/files/home/.config/yuy-chat/config.toml



Architecture


                              User
                                |
                                v
  +-------------------------------------------------------------+
  |                      yuy-chat (Rust)                        |
  |                                                             |
  |   main.rs              Entry point + event loop             |
  |       |                crossterm polling (100ms)             |
  |       v                                                     |
  |   app.rs               State machine                        |
  |       |                ModelSelector | Chat | Menu |        |
  |       |                Settings | ConversationList          |
  |       v                                                     |
  |   ui/                  Rendering layer (ratatui)            |
  |       |                selector.rs, chat.rs, menu.rs,       |
  |       |                settings.rs, conversations.rs        |
  |       v                                                     |
  |   models/              Business logic                       |
  |                        scanner.rs, runtime.rs, hf_api.rs    |
  +-------+----------------------------+-----------------------+
          |                            |
          v                            v
  +------------------+       +-------------------+
  |  External APIs   |       |  Local Storage    |
  |  HuggingFace     |       |  ~/.config/       |
  |  Inference API   |       |  ~/.yuuki/models/ |
  +--------+---------+       +-------------------+
           |
           v
  +--------------------------------+
  |      Inference Runtimes        |
  |  llama.cpp  |  llamafile       |
  +--------------------------------+

Source Layout

yuy-chat/
    Cargo.toml                  # manifest and dependencies
    README.md
    TECHNICAL.md                # full technical specification
    src/
        main.rs                 # entry point, event loop, terminal setup
        app.rs                  # application state, message handling
        config.rs               # config load/save, presets, themes
        conversation.rs         # message storage, JSON persistence
        models/
            mod.rs              # module declarations
            scanner.rs          # auto-discovery of local + HF models
            runtime.rs          # subprocess management, streaming
            hf_api.rs           # HuggingFace Inference API client
        ui/
            mod.rs              # module declarations
            selector.rs         # model selection screen
            chat.rs             # main chat interface
            menu.rs             # options menu
            settings.rs         # configuration screen
            conversations.rs    # saved conversations browser

Data Flow

User Input --> Event Loop --> App State --> Business Logic --> UI Render
                  ^                             |
                  +-----------------------------+
                       (state mutation loop)
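
In code, that loop is roughly a crossterm poll/read cycle followed by a redraw. The sketch below uses the 100 ms poll interval from the diagram above; the quit key, app hook, and draw hook are placeholders, not the real main.rs:

use std::time::Duration;
use crossterm::event::{self, Event, KeyCode};

fn event_loop() -> anyhow::Result<()> {
    loop {
        // Wait up to 100 ms for a terminal event, then fall through to redraw.
        if event::poll(Duration::from_millis(100))? {
            if let Event::Key(key) = event::read()? {
                if key.code == KeyCode::Char('q') {
                    break; // illustrative quit key
                }
                // app.handle_key(key);  -- state mutation would happen here
            }
        }
        // terminal.draw(|frame| ui::render(frame, &app))?;  -- re-render each tick
    }
    Ok(())
}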

Design Patterns

Pattern                Implementation
State machine          AppState enum drives which screen is active and how events are routed
Async streaming        Tokio channels (mpsc) pipe inference output chunk-by-chunk to the UI
Subprocess isolation   llama.cpp runs in a spawned Child process with piped stdout
Double buffering       ratatui handles minimal redraws automatically
Lazy loading           Models and conversations are loaded on-demand, not at startup
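
The async-streaming and subprocess-isolation rows combine into one pattern: spawn llama.cpp with piped stdout and forward each line through an mpsc channel to the render loop. A hedged sketch (spawn_inference is an assumed name, not the runtime.rs API):

use std::process::Stdio;
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio::process::Command;
use tokio::sync::mpsc;

async fn spawn_inference(prompt: String) -> anyhow::Result<mpsc::Receiver<String>> {
    let (tx, rx) = mpsc::channel(64);
    // Arguments go through Command::arg(), never string interpolation (see Security).
    let mut child = Command::new("llama-cli")
        .arg("-p")
        .arg(&prompt)
        .stdout(Stdio::piped())
        .spawn()?;
    tokio::spawn(async move {
        let stdout = child.stdout.take().expect("stdout was piped above");
        let mut lines = BufReader::new(stdout).lines();
        // Push each chunk to the UI as soon as it arrives.
        while let Ok(Some(line)) = lines.next_line().await {
            if tx.send(line).await.is_err() {
                break; // receiver dropped: chat was cleared or the app exited
            }
        }
        let _ = child.wait().await; // reap the subprocess when streaming ends
    });
    Ok(rx)
}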



Technical Specifications


Project Metrics

Language:              Rust 2021 Edition
Lines of code:         ~1,453
Rust source files:     15
Modules:               5
Public functions:      ~45
Data structures:       12
Enums:                 8
Direct dependencies:   16
Binary size (release): ~8 MB

Performance

Operation                      Time
Startup (no models)            ~50 ms
Startup (10 models)            ~200 ms
Model scan (10 models)         ~100 ms
Render frame                   ~1-2 ms
Send message (pre-inference)   ~5 ms
Save conversation              ~10 ms

Memory Usage

State                 RAM
Idle (no model)       ~20 MB
Model loaded          ~50 MB
Active inference      ~100-500 MB
Peak (large models)   ~1 GB

System Requirements

Requirement   Minimum               Recommended
Rust          1.70+                 1.75+
RAM           512 MB                2 GB
Disk          50 MB (binary)        100 MB
CPU           ARM/x86 (32/64-bit)   x86_64 or ARM64
Terminal      Unicode support       Modern terminal emulator

Dependencies

Crate                       Purpose
ratatui                     Terminal UI framework
crossterm                   Cross-platform terminal control
tokio                       Async runtime
reqwest                     HTTP client (HuggingFace API)
serde + serde_json + toml   Serialization
chrono                      Timestamps for conversations
walkdir                     Recursive model directory scanning
dirs                        Cross-platform home directory
anyhow + thiserror          Error handling
colored                     Terminal colors
tracing                     Logging
which                       Binary detection in PATH



Platform Support


Platform              Status         Notes
Termux (Android)      Full support   Primary target. ARM64 tested
Linux x86_64          Full support   Ubuntu 22.04+ tested
Linux ARM64           Full support   Raspberry Pi 4 tested
macOS Intel           Full support   Catalina+ tested
macOS Apple Silicon   Full support   M1/M2 tested
Windows 10/11         Full support   Windows 11 tested
FreeBSD               Untested       Should work

Termux (Android) -- Primary Target

Optimizations applied automatically when Termux is detected:

  • Single-threaded compilation (-j 1) to prevent thermal throttling
  • Conservative I/O patterns for mobile storage
  • Simplified progress indicators for narrow terminal widths

Detection method:

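// True when running inside Termux: $PREFIX points into the com.termux app sandbox.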
std::env::var("PREFIX")
    .map(|p| p.contains("com.termux"))
    .unwrap_or(false)

macOS

  • Metal GPU acceleration available through llama.cpp
  • Homebrew for runtime installation (brew install llama.cpp)
  • Full keyboard support in Terminal.app and iTerm2

Windows

  • Windows Terminal recommended for best rendering
  • Backslash path handling automatic
  • CUDA acceleration via llama.cpp for NVIDIA GPUs



Conversation Format


Conversations are saved as JSON files in ~/.config/yuy-chat/conversations/.

Filename convention: conversation-{YYYYMMDD}-{HHMMSS}.json

{
  "messages": [
    {
      "role": "user",
      "content": "Explain async/await in Rust",
      "timestamp": "2026-02-06T14:30:22.123Z"
    },
    {
      "role": "assistant",
      "content": "async/await in Rust allows you to write...",
      "timestamp": "2026-02-06T14:30:25.456Z"
    }
  ],
  "created_at": "2026-02-06T14:30:22.123Z",
  "updated_at": "2026-02-06T14:35:10.789Z"
}
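
Deserializing this format with serde is straightforward; the structs below mirror the JSON but the names are assumptions (not the exact conversation.rs definitions), and they presume chrono's serde feature is enabled:

use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct Conversation {
    messages: Vec<Message>,
    created_at: DateTime<Utc>,
    updated_at: DateTime<Utc>,
}

#[derive(Serialize, Deserialize)]
struct Message {
    role: String, // "user" or "assistant"
    content: String,
    timestamp: DateTime<Utc>,
}

// Example round-trip:
// let convo: Conversation = serde_json::from_str(&std::fs::read_to_string(path)?)?;
// std::fs::write(path, serde_json::to_string_pretty(&convo)?)?;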



Security


Current

  • HTTPS only -- all HuggingFace API calls use TLS (rustls, no OpenSSL)
  • No shell injection -- subprocesses use Command::arg(), never string interpolation
  • Scoped file access -- all reads/writes within ~/.config/yuy-chat/ and ~/.yuuki/
  • Process isolation -- llama.cpp runs as a separate subprocess with piped I/O

Known Limitations

  • HuggingFace tokens are stored in plaintext in config.toml
  • File permissions are not enforced on config files

Planned (v0.2+)

  • System keyring integration for token storage
  • File permission enforcement (0o600 for sensitive files)
  • Encrypted token storage on Termux via libsodium
  • Input size limits and sanitization



Troubleshooting


No models found

# Check if models exist
ls ~/.yuuki/models/

# Download a model using yuy CLI
yuy download Yuuki-best

# Or place a .gguf file manually
cp your-model.gguf ~/.yuuki/models/

llama.cpp not found

# Termux
pkg install llama-cpp

# macOS
brew install llama.cpp

# Verify
which llama-cli

Permission denied on llamafile

chmod +x ~/.yuuki/models/*.llamafile

Slow responses

  • Use a smaller quantization (q4_0 instead of q8_0)
  • Check available RAM (free -h or top)
  • Switch to the Precise preset (shorter outputs)
  • Ensure no other heavy processes are running

UI rendering issues

  • Use a terminal with Unicode support
  • On Windows, use Windows Terminal (not CMD)
  • On Termux, ensure terminal encoding is UTF-8
  • Try resizing the terminal window



Roadmap


v0.2 -- Enhanced UX

  • Syntax highlighting for code blocks
  • Copy/paste support
  • Export conversations to Markdown
  • Custom system prompts
  • Vim keybindings mode
  • Custom user-defined presets

v0.3 -- Power Features

  • Multiple chat tabs
  • Search in conversation history
  • Token usage statistics
  • Model comparison mode
  • Template system for prompts

v1.0 -- Ecosystem

  • Plugin system
  • Custom themes (user-defined color schemes)
  • Conversation branching
  • Multi-modal support (images)
  • REST API server mode



Contributing


Development Setup

git clone https://github.com/YuuKi-OS/yuy-chat
cd yuy-chat

# install Rust if needed
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# build and verify
cargo build
cargo test
cargo fmt -- --check
cargo clippy

Commit Convention

<type>(<scope>): <subject>

Types: feat | fix | docs | style | refactor | test | chore

feat(chat): add multi-line input support

- Detect Shift+Enter for newlines
- Update input rendering for wrapped text
- Add cursor position tracking

Closes #12

Pull Request Checklist

  • Tests pass (cargo test)
  • Code is formatted (cargo fmt)
  • No clippy warnings (cargo clippy)
  • Documentation updated if needed
  • Commits follow the convention above

Coding Standards

  • snake_case for functions, CamelCase for types
  • Document all public functions with /// comments
  • Use Result<T> and the ? operator for error handling
  • Prefer async/await over callbacks
  • Justify any new dependency in the PR description



Design Decisions


Why a TUI instead of a GUI or web UI?

The primary target is Termux on Android. A TUI requires no display server, no browser, and minimal resources. It also works over SSH, inside tmux, and in any terminal emulator on any platform. A GUI may be added as an optional feature later.

Why ratatui?

ratatui is the most actively maintained TUI framework in the Rust ecosystem. It provides immediate-mode rendering, a rich widget library, and cross-platform terminal support through crossterm. The API is well-documented and the community is responsive.

Why subprocess spawning instead of library linking?

Linking llama.cpp as a C library adds significant build complexity, especially for cross-compilation and Termux. Spawning a subprocess is simpler, isolates crashes, and allows the user to update llama.cpp independently. Library integration is planned for v1.0.

Why Tokio for a TUI?

Inference is slow. Without async, the UI would freeze during response generation. Tokio enables non-blocking subprocess reads, smooth streaming display, and sets the foundation for future parallel features like multi-tab chat.

Why JSON for conversations instead of SQLite?

JSON files are human-readable, trivially portable, and require no additional dependency. Each conversation is self-contained. SQLite may be introduced in v1.0 if search and indexing become necessary.




Build Configuration


Release Profile

[profile.release]
opt-level = "z"         # optimize for binary size
lto = true              # link-time optimization
codegen-units = 1       # single codegen unit for better optimization
strip = true            # strip debug symbols

Environment Variables

RUST_LOG=debug yuy-chat          # enable debug logging
RUST_LOG=info yuy-chat           # info-level logging
YUY_MODELS_DIR=/path yuy-chat    # custom models directory
XDG_CONFIG_HOME=/custom yuy-chat # custom config directory

Cross-Compilation

# ARM64 (Raspberry Pi, Termux native)
rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu

# Windows (from Linux)
rustup target add x86_64-pc-windows-gnu
cargo build --release --target x86_64-pc-windows-gnu

# macOS Apple Silicon (from Linux)
rustup target add aarch64-apple-darwin
cargo build --release --target aarch64-apple-darwin



About the Yuuki Project


yuy-chat exists to serve the Yuuki project -- a code-generation LLM being trained entirely on a smartphone with zero cloud budget.

Training Details

Base model      GPT-2 (124M parameters)
Training type   Continued pre-training
Hardware        Snapdragon 685, CPU only
Training time   50+ hours
Progress        2,000 / 37,500 steps (5.3%)
Cost            $0.00

Quality Scores (Checkpoint 2000)

Language   Score
Agda       55 / 100
C          20 / 100
Assembly   15 / 100
Python     8 / 100

A fully native model (trained from scratch, not fine-tuned) is planned for v1.0. A research paper documenting the mobile training methodology is in preparation.





Project          Description
yuy              CLI for downloading, managing, and running Yuuki models
Yuuki-best       Best checkpoint model weights
Yuuki Space      Web-based interactive demo
yuuki-training   Training code and scripts




Model Weights   Live Demo   Yuy CLI


Training Code   Report Issue




License


Copyright 2026 Yuuki Project

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.



Built with patience, a phone, and zero budget.


Yuuki Project

