

Yuy Chat



Beautiful TUI Chat for Local AI Models

Talk to Yuuki models in your terminal.
Streaming responses. Conversation history. Zero cloud required.


Get Started    Models    Yuy CLI    Demo



License   Rust   ratatui   Tokio   Termux   Linux   macOS   Windows




+--------------------------------------------------+
| yuy-chat v0.1.0        Yuuki-best | Balanced     |
+--------------------------------------------------+
|                                                  |
|  You: Explain async/await in Rust                |
|                                                  |
|  Yuuki: async/await in Rust allows you to write  |
|  asynchronous code that looks synchronous. The   |
|  async keyword marks a function as returning a   |
|  Future, and await suspends execution until the  |
|  Future resolves...                              |
|                                                  |
+--------------------------------------------------+
| Message: _                                       |
+--------------------------------------------------+
| Enter: Send | Ctrl+C: Menu | Ctrl+S: Save        |
+--------------------------------------------------+

A full chat experience in your terminal.


Select models interactively.
Stream responses word by word.
Save and reload conversations.
Switch presets on the fly.

All running locally on your machine.
All powered by ratatui + Rust.




What is yuy-chat?


yuy-chat is a terminal user interface (TUI) application for chatting with local AI models. Built with Rust and powered by ratatui, it provides a polished, keyboard-driven interface for real-time conversations with Yuuki models -- without ever leaving the terminal.

It connects to proven inference backends (llama.cpp and llamafile) and optionally to the HuggingFace Inference API for cloud-based generation. Model discovery, conversation management, preset switching, and streaming are all handled out of the box.

yuy-chat is the companion tool to yuy (the CLI for downloading and managing models). Together they form the complete local inference toolkit for the Yuuki project.




Features


Interactive Chat

Real-time streaming responses displayed word by word. Multi-line input with Shift+Enter. Scrollable message history with keyboard navigation.


Model Selector

Auto-discovers .gguf and .llamafile models from your local directory. Navigate with arrow keys, select with Enter. Refresh without restarting.


Conversation History

Save conversations as JSON files. Load previous chats from a built-in conversation browser. Delete old sessions you no longer need.


HuggingFace Cloud

Optional API integration for cloud-based inference. Configure your HF token in the settings screen. Local and cloud models appear side by side in the selector.

Generation Presets

Three built-in modes -- Creative (0.8 temp), Balanced (0.6 temp), and Precise (0.3 temp). Cycle between them with a single keypress. Custom presets planned for v0.2.


Settings Screen

Configure models directory, HuggingFace token, default preset, history saving, and UI theme -- all from within the TUI.


Cross-Platform

Runs on Termux (Android), Linux, macOS, and Windows. Same binary, same interface, same experience. Mobile-first defaults for constrained hardware.


Lightweight

~8 MB binary. ~20 MB idle RAM. ~50 ms startup. Built with Rust for zero-overhead performance and memory safety.




Installation


Prerequisites

  • Rust 1.70 or later (1.75+ recommended)
  • An inference runtime: llama.cpp or a .llamafile model
  • AI models in GGUF or Llamafile format (use yuy to download them)

From Source

git clone https://github.com/YuuKi-OS/yuy-chat
cd yuy-chat
cargo build --release
cargo install --path .

Termux (Android)

pkg install rust git
git clone https://github.com/YuuKi-OS/yuy-chat
cd yuy-chat
cargo build --release -j 1
cargo install --path .

Note: First compilation takes longer on ARM due to CPU constraints. Use -j 1 to avoid thermal throttling. Incremental builds are fast (~10 sec).


Verify

yuy-chat --version



Quick Start


# 1. Get a model (using yuy CLI)
yuy download Yuuki-best

# 2. Install a runtime
pkg install llama-cpp        # Termux
brew install llama.cpp       # macOS

# 3. Launch the chat
yuy-chat

The interface opens in the model selector. Pick a model with arrow keys and press Enter. Start typing and hit Enter to send messages. That's it.




Keyboard Reference


Model Selector

Key        Action
Up / k     Previous model
Down / j   Next model
Enter      Select model
R          Refresh model list
Q          Quit

Chat

Key           Action
Enter         Send message
Shift+Enter   New line
Ctrl+Enter    Force send
Up / Down     Scroll history (when input is empty)
Ctrl+C        Open menu
Ctrl+L        Clear chat
Ctrl+S        Save conversation
Backspace     Delete character

Menu

Key       Action
1         Change model
2         Cycle preset
3         Save conversation
4         Load conversation
5         Clear chat
6         Settings
Q / Esc   Back to chat

Settings

Key         Action
Up / Down   Navigate settings
Enter       Edit setting
Esc         Back to menu



Generation Presets


Preset     Temperature   Top P   Best For
Creative   0.8           0.9     Stories, brainstorming, creative writing
Balanced   0.6           0.7     General chat, explanations (default)
Precise    0.3           0.5     Code, math, factual answers

Cycle presets during a chat session via Ctrl+C then 2, or set a default in the configuration file.
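
A rough sketch of how these presets could map to sampling parameters -- the enum and method names below are illustrative, not the actual yuy-chat types:

#[derive(Clone, Copy)]
enum Preset {
    Creative,
    Balanced,
    Precise,
}

impl Preset {
    // (temperature, top_p) pairs matching the table above
    fn sampling(self) -> (f32, f32) {
        match self {
            Preset::Creative => (0.8, 0.9),
            Preset::Balanced => (0.6, 0.7),
            Preset::Precise  => (0.3, 0.5),
        }
    }

    // Single-keypress cycling: Creative -> Balanced -> Precise -> Creative
    fn cycle(self) -> Preset {
        match self {
            Preset::Creative => Preset::Balanced,
            Preset::Balanced => Preset::Precise,
            Preset::Precise  => Preset::Creative,
        }
    }
}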




Supported Formats and Runtimes


Model Formats

Format      Extension    Notes
GGUF        .gguf        Recommended. Requires llama.cpp
Llamafile   .llamafile   Self-executing. Zero dependencies
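
The auto-discovery mentioned in the Model Selector feature can be pictured as a walkdir pass that keeps anything with one of these extensions. The function below is a hedged sketch, not the actual scanner.rs code:

use std::path::{Path, PathBuf};
use walkdir::WalkDir;

// Collect every .gguf / .llamafile file under the models directory.
fn scan_models(dir: &Path) -> Vec<PathBuf> {
    WalkDir::new(dir)
        .into_iter()
        .filter_map(Result::ok)               // skip unreadable entries
        .map(|entry| entry.into_path())
        .filter(|path| {
            matches!(
                path.extension().and_then(|ext| ext.to_str()),
                Some("gguf") | Some("llamafile")
            )
        })
        .collect()
}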

Inference Runtimes

Runtime           Type               Notes
llama.cpp         Local subprocess   Default. Fast, CPU-optimized
llamafile         Local executable   Bundled runtime + model
HuggingFace API   Cloud HTTP         Optional. Requires token

yuy-chat auto-detects the appropriate runtime based on the selected model's format. For GGUF models, it searches for llama-cli, llama, or main binaries in PATH.
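
A minimal sketch of that lookup, assuming the which crate listed under Dependencies (resolve_runtime and ModelFormat are illustrative names, not the real API):

use std::path::{Path, PathBuf};

enum ModelFormat {
    Gguf,
    Llamafile,
}

fn resolve_runtime(format: ModelFormat, model: &Path) -> Option<PathBuf> {
    match format {
        // GGUF needs an external llama.cpp binary somewhere in PATH.
        ModelFormat::Gguf => ["llama-cli", "llama", "main"]
            .iter()
            .find_map(|bin| which::which(bin).ok()),
        // A .llamafile bundles its own runtime, so the model file is the executable.
        ModelFormat::Llamafile => Some(model.to_path_buf()),
    }
}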




HuggingFace Integration


Cloud-based inference is optional. To enable it:

  1. Get a token from huggingface.co/settings/tokens
  2. Open Settings in yuy-chat (Ctrl+C then 6)
  3. Navigate to "HuggingFace Token" and paste it

Cloud models then appear in the selector alongside local models:

> Yuuki-best-q5.gguf          2.3 GB  [Local]
  Yuuki-3.7-q4.gguf           1.8 GB  [Local]
  Yuuki-best (via API)        Cloud   [HuggingFace]

               Local                 Cloud
Speed          Faster (no network)   Depends on connection
Privacy        100% offline          Data sent to HF API
Storage        Requires disk space   None
Availability   Always                Requires internet



Configuration


Config File

Location: ~/.config/yuy-chat/config.toml

models_dir = "/home/user/.yuuki/models"
default_preset = "Balanced"
save_history = true
theme = "Dark"

# Optional
# hf_token = "hf_xxxxxxxxxxxxx"

Priority Order

  1. TUI settings -- changes made in the settings screen
  2. Config file -- ~/.config/yuy-chat/config.toml
  3. Defaults -- sensible defaults based on platform detection
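
For any single setting this precedence boils down to an ordered fallback. The snippet below is only a sketch with assumed names (effective_models_dir is not a real yuy-chat function):

use std::path::PathBuf;

// TUI value wins over the config-file value, which wins over the platform default.
fn effective_models_dir(tui: Option<PathBuf>, config_file: Option<PathBuf>) -> PathBuf {
    tui.or(config_file).unwrap_or_else(|| {
        dirs::home_dir()
            .unwrap_or_default()
            .join(".yuuki")
            .join("models")
    })
}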

Directory Layout

~/.config/yuy-chat/
    config.toml                     # configuration
    conversations/                  # saved chats
        conversation-20260206-143022.json
        conversation-20260206-150133.json

~/.yuuki/models/                    # models (shared with yuy CLI)
    Yuuki-best/
        yuuki-best-q4_0.gguf
    Yuuki-3.7/
        yuuki-3.7-q5_k_m.gguf

Platform-specific paths

Platform   Config path
Linux      ~/.config/yuy-chat/config.toml
macOS      ~/.config/yuy-chat/config.toml
Windows    C:\Users\{user}\AppData\Roaming\yuy-chat\config.toml
Termux     /data/data/com.termux/files/home/.config/yuy-chat/config.toml



Architecture


                              User
                                |
                                v
  +-------------------------------------------------------------+
  |                      yuy-chat (Rust)                        |
  |                                                             |
  |   main.rs              Entry point + event loop             |
  |       |                crossterm polling (100ms)             |
  |       v                                                     |
  |   app.rs               State machine                        |
  |       |                ModelSelector | Chat | Menu |        |
  |       |                Settings | ConversationList          |
  |       v                                                     |
  |   ui/                  Rendering layer (ratatui)            |
  |       |                selector.rs, chat.rs, menu.rs,       |
  |       |                settings.rs, conversations.rs        |
  |       v                                                     |
  |   models/              Business logic                       |
  |                        scanner.rs, runtime.rs, hf_api.rs    |
  +-------+----------------------------+-----------------------+
          |                            |
          v                            v
  +------------------+       +-------------------+
  |  External APIs   |       |  Local Storage    |
  |  HuggingFace     |       |  ~/.config/       |
  |  Inference API   |       |  ~/.yuuki/models/ |
  +--------+---------+       +-------------------+
           |
           v
  +--------------------------------+
  |      Inference Runtimes        |
  |  llama.cpp  |  llamafile       |
  +--------------------------------+

Source Layout

yuy-chat/
    Cargo.toml                  # manifest and dependencies
    README.md
    TECHNICAL.md                # full technical specification
    src/
        main.rs                 # entry point, event loop, terminal setup
        app.rs                  # application state, message handling
        config.rs               # config load/save, presets, themes
        conversation.rs         # message storage, JSON persistence
        models/
            mod.rs              # module declarations
            scanner.rs          # auto-discovery of local + HF models
            runtime.rs          # subprocess management, streaming
            hf_api.rs           # HuggingFace Inference API client
        ui/
            mod.rs              # module declarations
            selector.rs         # model selection screen
            chat.rs             # main chat interface
            menu.rs             # options menu
            settings.rs         # configuration screen
            conversations.rs    # saved conversations browser

Data Flow

User Input --> Event Loop --> App State --> Business Logic --> UI Render
                  ^                             |
                  +-----------------------------+
                       (state mutation loop)
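
In code, that loop is roughly a crossterm poll/read cycle followed by a redraw. The sketch below uses the 100 ms poll interval from the diagram above; the quit key, app hook, and draw hook are placeholders, not the real main.rs:

use std::time::Duration;
use crossterm::event::{self, Event, KeyCode};

fn event_loop() -> anyhow::Result<()> {
    loop {
        // Wait up to 100 ms for a terminal event, then fall through to redraw.
        if event::poll(Duration::from_millis(100))? {
            if let Event::Key(key) = event::read()? {
                if key.code == KeyCode::Char('q') {
                    break; // illustrative quit key
                }
                // app.handle_key(key);  -- state mutation would happen here
            }
        }
        // terminal.draw(|frame| ui::render(frame, &app))?;  -- re-render each tick
    }
    Ok(())
}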

Design Patterns

Pattern                Implementation
State machine          AppState enum drives which screen is active and how events are routed
Async streaming        Tokio channels (mpsc) pipe inference output chunk-by-chunk to the UI
Subprocess isolation   llama.cpp runs in a spawned Child process with piped stdout
Double buffering       ratatui handles minimal redraws automatically
Lazy loading           Models and conversations are loaded on-demand, not at startup
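
The async-streaming and subprocess-isolation rows combine into one pattern: spawn llama.cpp with piped stdout and forward each line through an mpsc channel to the render loop. A hedged sketch (spawn_inference is an assumed name, not the runtime.rs API):

use std::process::Stdio;
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio::process::Command;
use tokio::sync::mpsc;

async fn spawn_inference(prompt: String) -> anyhow::Result<mpsc::Receiver<String>> {
    let (tx, rx) = mpsc::channel(64);
    // Arguments go through Command::arg(), never string interpolation (see Security).
    let mut child = Command::new("llama-cli")
        .arg("-p")
        .arg(&prompt)
        .stdout(Stdio::piped())
        .spawn()?;
    tokio::spawn(async move {
        let stdout = child.stdout.take().expect("stdout was piped above");
        let mut lines = BufReader::new(stdout).lines();
        // Push each chunk to the UI as soon as it arrives.
        while let Ok(Some(line)) = lines.next_line().await {
            if tx.send(line).await.is_err() {
                break; // receiver dropped: chat was cleared or the app exited
            }
        }
        let _ = child.wait().await; // reap the subprocess when streaming ends
    });
    Ok(rx)
}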



Technical Specifications


Project Metrics

Language:              Rust 2021 Edition
Lines of code:         ~1,453
Rust source files:     15
Modules:               5
Public functions:      ~45
Data structures:       12
Enums:                 8
Direct dependencies:   16
Binary size (release): ~8 MB

Performance

Operation                      Time
Startup (no models)            ~50 ms
Startup (10 models)            ~200 ms
Model scan (10 models)         ~100 ms
Render frame                   ~1-2 ms
Send message (pre-inference)   ~5 ms
Save conversation              ~10 ms

Memory Usage

State                 RAM
Idle (no model)       ~20 MB
Model loaded          ~50 MB
Active inference      ~100-500 MB
Peak (large models)   ~1 GB

System Requirements

Requirement   Minimum               Recommended
Rust          1.70+                 1.75+
RAM           512 MB                2 GB
Disk          50 MB (binary)        100 MB
CPU           ARM/x86 (32/64-bit)   x86_64 or ARM64
Terminal      Unicode support       Modern terminal emulator

Dependencies

Crate                       Purpose
ratatui                     Terminal UI framework
crossterm                   Cross-platform terminal control
tokio                       Async runtime
reqwest                     HTTP client (HuggingFace API)
serde + serde_json + toml   Serialization
chrono                      Timestamps for conversations
walkdir                     Recursive model directory scanning
dirs                        Cross-platform home directory
anyhow + thiserror          Error handling
colored                     Terminal colors
tracing                     Logging
which                       Binary detection in PATH



Platform Support


Platform              Status         Notes
Termux (Android)      Full support   Primary target. ARM64 tested
Linux x86_64          Full support   Ubuntu 22.04+ tested
Linux ARM64           Full support   Raspberry Pi 4 tested
macOS Intel           Full support   Catalina+ tested
macOS Apple Silicon   Full support   M1/M2 tested
Windows 10/11         Full support   Windows 11 tested
FreeBSD               Untested       Should work

Termux (Android) -- Primary Target

Optimizations applied automatically when Termux is detected:

  • Single-threaded compilation (-j 1) to prevent thermal throttling
  • Conservative I/O patterns for mobile storage
  • Simplified progress indicators for narrow terminal widths

Detection method:

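// True when running inside Termux: $PREFIX points into the com.termux app sandbox.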
std::env::var("PREFIX")
    .map(|p| p.contains("com.termux"))
    .unwrap_or(false)

macOS

  • Metal GPU acceleration available through llama.cpp
  • Homebrew for runtime installation (brew install llama.cpp)
  • Full keyboard support in Terminal.app and iTerm2

Windows

  • Windows Terminal recommended for best rendering
  • Backslash path handling automatic
  • CUDA acceleration via llama.cpp for NVIDIA GPUs



Conversation Format


Conversations are saved as JSON files in ~/.config/yuy-chat/conversations/.

Filename convention: conversation-{YYYYMMDD}-{HHMMSS}.json

{
  "messages": [
    {
      "role": "user",
      "content": "Explain async/await in Rust",
      "timestamp": "2026-02-06T14:30:22.123Z"
    },
    {
      "role": "assistant",
      "content": "async/await in Rust allows you to write...",
      "timestamp": "2026-02-06T14:30:25.456Z"
    }
  ],
  "created_at": "2026-02-06T14:30:22.123Z",
  "updated_at": "2026-02-06T14:35:10.789Z"
}
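
Deserializing this format with serde is straightforward; the structs below mirror the JSON but the names are assumptions (not the exact conversation.rs definitions), and they presume chrono's serde feature is enabled:

use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct Conversation {
    messages: Vec<Message>,
    created_at: DateTime<Utc>,
    updated_at: DateTime<Utc>,
}

#[derive(Serialize, Deserialize)]
struct Message {
    role: String, // "user" or "assistant"
    content: String,
    timestamp: DateTime<Utc>,
}

// Example round-trip:
// let convo: Conversation = serde_json::from_str(&std::fs::read_to_string(path)?)?;
// std::fs::write(path, serde_json::to_string_pretty(&convo)?)?;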



Security


Current

  • HTTPS only -- all HuggingFace API calls use TLS (rustls, no OpenSSL)
  • No shell injection -- subprocesses use Command::arg(), never string interpolation
  • Scoped file access -- all reads/writes within ~/.config/yuy-chat/ and ~/.yuuki/
  • Process isolation -- llama.cpp runs as a separate subprocess with piped I/O

Known Limitations

  • HuggingFace tokens are stored in plaintext in config.toml
  • File permissions are not enforced on config files

Planned (v0.2+)

  • System keyring integration for token storage
  • File permission enforcement (0o600 for sensitive files)
  • Encrypted token storage on Termux via libsodium
  • Input size limits and sanitization



Troubleshooting


No models found

# Check if models exist
ls ~/.yuuki/models/

# Download a model using yuy CLI
yuy download Yuuki-best

# Or place a .gguf file manually
cp your-model.gguf ~/.yuuki/models/

llama.cpp not found

# Termux
pkg install llama-cpp

# macOS
brew install llama.cpp

# Verify
which llama-cli

Permission denied on llamafile

chmod +x ~/.yuuki/models/*.llamafile

Slow responses

  • Use a smaller quantization (q4_0 instead of q8_0)
  • Check available RAM (free -h or top)
  • Switch to the Precise preset (shorter outputs)
  • Ensure no other heavy processes are running

UI rendering issues

  • Use a terminal with Unicode support
  • On Windows, use Windows Terminal (not CMD)
  • On Termux, ensure terminal encoding is UTF-8
  • Try resizing the terminal window



Roadmap


v0.2 -- Enhanced UX

  • Syntax highlighting for code blocks
  • Copy/paste support
  • Export conversations to Markdown
  • Custom system prompts
  • Vim keybindings mode
  • Custom user-defined presets

v0.3 -- Power Features

  • Multiple chat tabs
  • Search in conversation history
  • Token usage statistics
  • Model comparison mode
  • Template system for prompts

v1.0 -- Ecosystem

  • Plugin system
  • Custom themes (user-defined color schemes)
  • Conversation branching
  • Multi-modal support (images)
  • REST API server mode



Contributing


Development Setup

git clone https://github.com/YuuKi-OS/yuy-chat
cd yuy-chat

# install Rust if needed
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# build and verify
cargo build
cargo test
cargo fmt -- --check
cargo clippy

Commit Convention

<type>(<scope>): <subject>

Types: feat | fix | docs | style | refactor | test | chore

feat(chat): add multi-line input support

- Detect Shift+Enter for newlines
- Update input rendering for wrapped text
- Add cursor position tracking

Closes #12

Pull Request Checklist

  • Tests pass (cargo test)
  • Code is formatted (cargo fmt)
  • No clippy warnings (cargo clippy)
  • Documentation updated if needed
  • Commits follow the convention above

Coding Standards

  • snake_case for functions, CamelCase for types
  • Document all public functions with /// comments
  • Use Result<T> and the ? operator for error handling
  • Prefer async/await over callbacks
  • Justify any new dependency in the PR description



Design Decisions


Why a TUI instead of a GUI or web UI?

The primary target is Termux on Android. A TUI requires no display server, no browser, and minimal resources. It also works over SSH, inside tmux, and in any terminal emulator on any platform. A GUI may be added as an optional feature later.

Why ratatui?

ratatui is the most actively maintained TUI framework in the Rust ecosystem. It provides immediate-mode rendering, a rich widget library, and cross-platform terminal support through crossterm. The API is well-documented and the community is responsive.

Why subprocess spawning instead of library linking?

Linking llama.cpp as a C library adds significant build complexity, especially for cross-compilation and Termux. Spawning a subprocess is simpler, isolates crashes, and allows the user to update llama.cpp independently. Library integration is planned for v1.0.

Why Tokio for a TUI?

Inference is slow. Without async, the UI would freeze during response generation. Tokio enables non-blocking subprocess reads, smooth streaming display, and sets the foundation for future parallel features like multi-tab chat.

Why JSON for conversations instead of SQLite?

JSON files are human-readable, trivially portable, and require no additional dependency. Each conversation is self-contained. SQLite may be introduced in v1.0 if search and indexing become necessary.




Build Configuration


Release Profile

[profile.release]
opt-level = "z"         # optimize for binary size
lto = true              # link-time optimization
codegen-units = 1       # single codegen unit for better optimization
strip = true            # strip debug symbols

Environment Variables

RUST_LOG=debug yuy-chat          # enable debug logging
RUST_LOG=info yuy-chat           # info-level logging
YUY_MODELS_DIR=/path yuy-chat    # custom models directory
XDG_CONFIG_HOME=/custom yuy-chat # custom config directory

Cross-Compilation

# ARM64 (Raspberry Pi, Termux native)
rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu

# Windows (from Linux)
rustup target add x86_64-pc-windows-gnu
cargo build --release --target x86_64-pc-windows-gnu

# macOS Apple Silicon (from Linux)
rustup target add aarch64-apple-darwin
cargo build --release --target aarch64-apple-darwin



About the Yuuki Project


yuy-chat exists to serve the Yuuki project -- a code-generation LLM being trained entirely on a smartphone with zero cloud budget.

Training Details

Base model      GPT-2 (124M parameters)
Training type   Continued pre-training
Hardware        Snapdragon 685, CPU only
Training time   50+ hours
Progress        2,000 / 37,500 steps (5.3%)
Cost            $0.00

Quality Scores (Checkpoint 2000)

Language   Score
Agda       55 / 100
C          20 / 100
Assembly   15 / 100
Python     8 / 100

A fully native model (trained from scratch, not fine-tuned) is planned for v1.0. A research paper documenting the mobile training methodology is in preparation.





Project          Description
yuy              CLI for downloading, managing, and running Yuuki models
Yuuki-best       Best checkpoint model weights
Yuuki Space      Web-based interactive demo
yuuki-training   Training code and scripts




Model Weights   Live Demo   Yuy CLI


Training Code   Report Issue




License


Copyright 2026 Yuuki Project

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.



Built with patience, a phone, and zero budget.


Yuuki Project

