diff --git a/README.md b/README.md
old mode 100644
new mode 100755
index 3719f82..bf99c21
--- a/README.md
+++ b/README.md
@@ -1,97 +1,176 @@
-

+

-``` -$$\ $$\ -\$$\ $$ | - \$$\ $$ /$$\ $$\ $$\ $$\ - \$$$$ / $$ | $$ |$$ | $$ | - \$$ / $$ | $$ |$$ | $$ | - $$ | $$ | $$ |$$ | $$ | - $$ | \$$$$$$ |\$$$$$$$ | - \__| \______/ \____$$ | - $$\ $$ | - \$$$$$$ | - \______/ -``` +
-**The official CLI for the Yuuki project.** -Download, manage, and run Yuuki models locally. +Yuy -[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE) -[![Rust](https://img.shields.io/badge/rust-1.75%2B-orange.svg)](https://www.rust-lang.org/) -[![Platform](https://img.shields.io/badge/platform-Termux%20%7C%20Linux%20%7C%20macOS%20%7C%20Windows-green.svg)](#platform-support) -[![HuggingFace](https://img.shields.io/badge/models-HuggingFace-yellow.svg)](https://huggingface.co/OpceanAI) +

-

+# The CLI for the Yuuki Project + +**Download, manage, and run Yuuki models locally.**
+**From phones to desktops. One command at a time.** + +
+ +Get Started +   +Models +   +Demo + +

+ +[![License](https://img.shields.io/badge/Apache_2.0-222222?style=flat-square&logo=apache&logoColor=white)](LICENSE) +  +[![Rust](https://img.shields.io/badge/Rust_1.75+-222222?style=flat-square&logo=rust&logoColor=white)](https://www.rust-lang.org/) +  +[![Termux](https://img.shields.io/badge/Termux-222222?style=flat-square&logo=android&logoColor=white)](#platform-support) +  +[![Linux](https://img.shields.io/badge/Linux-222222?style=flat-square&logo=linux&logoColor=white)](#platform-support) +  +[![macOS](https://img.shields.io/badge/macOS-222222?style=flat-square&logo=apple&logoColor=white)](#platform-support) +  +[![Windows](https://img.shields.io/badge/Windows-222222?style=flat-square&logo=windows&logoColor=white)](#platform-support) + +
--- -## Table of Contents +
-- [About](#about) -- [Features](#features) -- [Installation](#installation) -- [Quick Start](#quick-start) -- [Commands](#commands) - - [download](#download) - - [run](#run) - - [list](#list) - - [info](#info) - - [remove](#remove) - - [runtime](#runtime) - - [doctor](#doctor) - - [setup](#setup) -- [Model Quantizations](#model-quantizations) -- [Runtimes](#runtimes) -- [Configuration](#configuration) -- [Architecture](#architecture) -- [Platform Support](#platform-support) -- [Platform-Specific Optimizations](#platform-specific-optimizations) -- [Design Decisions](#design-decisions) -- [Performance](#performance) -- [Security](#security) -- [Roadmap](#roadmap) -- [Contributing](#contributing) -- [About Yuuki](#about-yuuki) -- [Links](#links) -- [License](#license) + + + + + +
----
-
-## About
-
-**Yuy** is the official command-line interface for the [Yuuki project](https://huggingface.co/OpceanAI) — an LLM trained entirely on a smartphone. Yuy provides a complete toolkit for downloading, managing, and running Yuuki models on local hardware, with first-class support for mobile devices running Termux.
-
-Yuy wraps proven inference engines (llama.cpp, ollama) and provides an opinionated, streamlined experience on top of them. It handles model discovery, quantization selection, runtime management, and system diagnostics so you can go from zero to inference in three commands.
-
-```
+```bash
 yuy setup
 yuy download Yuuki-best
 yuy run Yuuki-best
 ```
+
+
+**Three commands. That's all.**

+Set up your environment, grab a model,
+and start generating code.

+Yuy handles the rest. + +
+ +
+ +
+ --- +
+ +
+ +## What is Yuy? + +
+ +
+ +**Yuy** is the command-line interface for the [Yuuki project](https://huggingface.co/OpceanAI) -- an LLM trained entirely on a smartphone with zero budget. Yuy provides a complete toolkit for downloading, managing, and running Yuuki models on any local hardware, with first-class support for mobile devices running Termux. + +Under the hood, Yuy wraps proven inference engines (**llama.cpp** and **ollama**) and delivers a streamlined experience on top of them. Model discovery, quantization selection, runtime management, and system diagnostics are handled automatically. + +
+ +--- + +
+ +
+ ## Features -- **Download models** from Hugging Face with streaming progress bars and auto-selected quantization -- **Run models** locally using llama.cpp or ollama with preset configurations -- **Manage models** — list, inspect, and remove local models -- **Runtime management** — detect, install, and configure inference runtimes -- **System diagnostics** — check hardware, dependencies, and configuration health -- **Cross-platform** — Termux (Android), Linux, macOS, and Windows -- **Mobile-first** — optimized defaults for constrained hardware -- **Zero configuration** — smart defaults that work out of the box +
+ +
+ + + + + + +
+ +

Model Downloads

+ +Stream models from Hugging Face with real-time progress bars and auto-selected quantization based on your hardware. + +
+ +

Local Inference

+ +Run models using llama.cpp or ollama. Presets for balanced, creative, and precise generation included. + +
+ +

Model Management

+ +List, inspect metadata, view available quantizations, and remove downloaded models with a single command. + +
+ +

System Diagnostics

+ +Full health check: hardware info, runtime detection, dependency status, and configuration validation. + +
+ +

Runtime Management

+ +Detect, install, and configure inference runtimes. Yuy guides you through setup on any platform. + +
+ +

Cross-Platform

+ +Termux (Android), Linux, macOS, and Windows. One codebase, consistent experience everywhere. + +
+ +

Mobile-First

+ +Optimized defaults for constrained hardware. Memory-aware quantization, conservative I/O, thermal-safe compilation. + +
+ +

Zero Configuration

+ +Smart defaults that work out of the box. Platform detection, RAM-based recommendations, runtime auto-discovery. + +
+ +
--- +
+ +
+ ## Installation +
+ +
+ ### Prerequisites - [Rust](https://www.rust-lang.org/tools/install) 1.75 or later - An inference runtime: [llama.cpp](https://github.com/ggerganov/llama.cpp) or [ollama](https://ollama.ai) (Yuy can install these for you) +
+ ### From Source ```bash @@ -100,12 +179,14 @@ cd yuy cargo build --release ``` -The binary will be at `target/release/yuy`. Optionally install it system-wide: +The binary will be at `target/release/yuy`. Install system-wide: ```bash cargo install --path . ``` +
+ ### Termux (Android) ```bash @@ -117,6 +198,8 @@ cargo build --release > **Note:** First compilation on Termux takes longer due to ARM CPU constraints. Subsequent builds use cache and are significantly faster. +
+ ### Verify Installation ```bash @@ -124,101 +207,107 @@ yuy --version yuy doctor ``` +
+ --- +
+ +
+ ## Quick Start +
+ +
+
 ```bash
-# 1. Initial setup — creates directories, detects hardware, offers runtime install
+# 1. Initial setup -- creates directories, detects hardware, offers runtime install
 yuy setup
-# 2. Download a model — auto-selects best quantization for your hardware
+# 2. Download a model -- auto-selects best quantization for your hardware
 yuy download Yuuki-best
-# 3. Run the model — interactive chat session
+# 3. Run the model -- interactive chat session
 yuy run Yuuki-best
 ```
-That's it. Yuy handles quantization selection, runtime detection, and parameter configuration automatically.
+Yuy handles quantization selection, runtime detection, and parameter configuration automatically.
+
+
--- +
+ +
+ ## Commands -### download +
+ +
+ +### `yuy download` Download Yuuki models from Hugging Face. ```bash -# Auto-select best quantization for your hardware -yuy download Yuuki-best - -# Specify a quantization -yuy download Yuuki-best --quant q8_0 - -# Download a different model -yuy download Yuuki-3.7 --quant q4_0 +yuy download Yuuki-best # auto-select quantization +yuy download Yuuki-best --quant q8_0 # specify quantization +yuy download Yuuki-3.7 --quant q4_0 # different model ``` -**What happens:** +
+How it works internally +
-1. Validates the model name against the known model registry -2. Detects your platform and available RAM +1. Validates the model name against the known registry +2. Detects platform and available RAM 3. Recommends the best quantization (or uses your override) 4. Constructs the Hugging Face download URL -5. Streams the file with a progress bar showing speed and ETA +5. Streams the file with progress bar showing speed and ETA 6. Saves to `~/.yuuki/models//` -**Available quantizations:** `q4_0`, `q5_k_m`, `q8_0`, `f32` +Available quantizations: `q4_0` | `q5_k_m` | `q8_0` | `f32` ---- +
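In practice the streaming step maps onto `reqwest` plus an `indicatif` progress bar, both of which appear in the dependency table. The sketch below is illustrative only: the URL layout and the `download_model` helper are assumptions, not the actual code in `download.rs`.

```rust
use std::{io::Write, path::PathBuf};

use anyhow::{Context, Result};
use futures_util::StreamExt;
use indicatif::{ProgressBar, ProgressStyle};

/// Illustrative sketch: stream a GGUF file from Hugging Face into
/// ~/.yuuki/models/<model>/ while reporting speed and ETA.
async fn download_model(model: &str, quant: &str, dest_dir: PathBuf) -> Result<PathBuf> {
    // Hypothetical URL layout -- the real repo/file names come from the model registry.
    let url = format!("https://huggingface.co/OpceanAI/{model}/resolve/main/{model}-{quant}.gguf");

    let resp = reqwest::get(&url).await?.error_for_status()?;
    let total = resp.content_length().unwrap_or(0);

    let pb = ProgressBar::new(total);
    pb.set_style(ProgressStyle::with_template(
        "{bytes}/{total_bytes} [{bar:30}] {bytes_per_sec} ETA {eta}",
    )?);

    std::fs::create_dir_all(&dest_dir)?;
    let path = dest_dir.join(format!("{}-{quant}.gguf", model.to_lowercase()));
    let mut file = std::fs::File::create(&path).context("creating model file")?;

    // Write each chunk straight to disk so memory stays flat even for multi-GB files.
    // bytes_stream() requires reqwest's `stream` feature.
    let mut stream = resp.bytes_stream();
    while let Some(chunk) = stream.next().await {
        let chunk = chunk?;
        file.write_all(&chunk)?;
        pb.inc(chunk.len() as u64);
    }
    pb.finish();
    Ok(path)
}
```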
-### run +
+ +### `yuy run` Run a downloaded model with an inference runtime. ```bash -# Run with defaults -yuy run Yuuki-best - -# Specify a runtime -yuy run Yuuki-best --runtime llama-cpp - -# Use a preset -yuy run Yuuki-best --preset creative +yuy run Yuuki-best # defaults +yuy run Yuuki-best --runtime llama-cpp # specify runtime +yuy run Yuuki-best --preset creative # use a preset ``` -**Presets:** +**Generation Presets:** | Preset | Temperature | Top P | Use Case | -|--------|-------------|-------|----------| -| `balanced` | 0.6 | 0.7 | General use (default) | +|:-------|:------------|:------|:---------| +| `balanced` | 0.6 | 0.7 | General use **(default)** | | `creative` | 0.8 | 0.9 | Creative writing, exploration | | `precise` | 0.3 | 0.5 | Factual, deterministic output | -Yuy detects the available runtime automatically. If both llama.cpp and ollama are installed, it defaults to llama.cpp (or your configured preference). +Yuy detects the available runtime automatically. If both are installed, it defaults to llama.cpp. -**Runtime detection order for llama.cpp:** +
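Those presets boil down to two sampling knobs. A minimal sketch of the lookup using the values from the table above; the `Preset` type is illustrative, not necessarily the real `run.rs` structure.

```rust
/// Sampling parameters handed to the runtime (values from the preset table).
struct Preset {
    temperature: f32,
    top_p: f32,
}

/// Illustrative lookup: unknown names fall back to `balanced`, the default.
fn preset_params(name: &str) -> Preset {
    match name {
        "creative" => Preset { temperature: 0.8, top_p: 0.9 },
        "precise" => Preset { temperature: 0.3, top_p: 0.5 },
        _ => Preset { temperature: 0.6, top_p: 0.7 }, // balanced (default)
    }
}
```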
-``` -llama-cli → llama → main -``` - ---- - -### list +### `yuy list` List models locally or remotely. ```bash -# List downloaded models with sizes -yuy list models - -# List all available models on Hugging Face -yuy list models --remote +yuy list models # downloaded models with sizes +yuy list models --remote # all available on Hugging Face ``` -**Example output:** +
+Example output ``` Local Models: @@ -228,25 +317,22 @@ Local Models: Total: 5.4 GB ``` ---- +
-### info +
-Display detailed information about a model. +### `yuy info` + +Display detailed model information. ```bash -# Show model info -yuy info Yuuki-best - -# Show available variants/quantizations -yuy info Yuuki-best --variants +yuy info Yuuki-best # show model info +yuy info Yuuki-best --variants # available quantizations ``` -Shows download status, file sizes, available quantizations, and the path on disk. +
---- - -### remove +### `yuy remove` Remove a downloaded model. @@ -256,46 +342,40 @@ yuy remove Yuuki-v0.1 Calculates the disk space to be freed and asks for confirmation before deletion. ---- +
-### runtime +### `yuy runtime` Manage inference runtimes. ```bash -# Check what's installed -yuy runtime check - -# Install a runtime (interactive selection) -yuy runtime install - -# Install a specific runtime -yuy runtime install llama-cpp - -# List supported runtimes -yuy runtime list +yuy runtime check # what's installed +yuy runtime install # interactive selection +yuy runtime install llama-cpp # specific runtime +yuy runtime list # supported runtimes ``` -**Installation methods by platform:** +
+Installation methods by platform +
| Platform | llama.cpp | ollama | -|----------|-----------|--------| +|:---------|:----------|:-------| | Termux | `pkg install llama-cpp` | `pkg install ollama` | | macOS | `brew install llama.cpp` | `brew install ollama` | | Linux | Binary from GitHub Releases | Official installer | | Windows | Chocolatey or manual download | Official installer | ---- +
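`yuy runtime check` only needs to know whether the known binaries resolve on `PATH` (the documented probe order for llama.cpp is `llama-cli`, then `llama`, then `main`). A hedged sketch of that check; the real `runtime.rs` logic may differ.

```rust
use std::process::{Command, Stdio};

/// Returns the first candidate binary that can be spawned.
/// A real implementation would also inspect the version output
/// instead of trusting any binary that happens to share the name.
fn detect_runtime(candidates: &[&str]) -> Option<String> {
    for bin in candidates {
        let found = Command::new(bin)
            .arg("--version")
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status()
            .is_ok();
        if found {
            return Some(bin.to_string());
        }
    }
    None
}

fn main() {
    let llama = detect_runtime(&["llama-cli", "llama", "main"]);
    let ollama = detect_runtime(&["ollama"]);
    println!("llama.cpp: {llama:?}, ollama: {ollama:?}");
}
```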
-### doctor +
+ +### `yuy doctor` Run a full system diagnostic. -```bash -yuy doctor -``` - -**Example output:** +
+Example output ``` System Information: @@ -324,49 +404,73 @@ Health Summary: System is ready to use Yuuki! ``` ---- +
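The report above reduces to a handful of facts about the machine. The structure below is an assumption made for illustration; the real `doctor.rs` types are not shown in this README.

```rust
/// Roughly what `yuy doctor` collects: platform, memory, runtimes, config health.
/// Field names are illustrative only.
struct Diagnostics {
    platform: String,          // e.g. "Termux (Android)" or "Linux x86_64"
    total_ram_gb: f64,         // feeds the quantization recommendation
    llama_cpp: Option<String>, // detected binary name, if any
    ollama: Option<String>,
    config_ok: bool,           // ~/.yuuki/config.toml parses and paths exist
}

impl Diagnostics {
    /// The final "Health Summary" line boils down to this.
    fn ready(&self) -> bool {
        self.config_ok && (self.llama_cpp.is_some() || self.ollama.is_some())
    }
}
```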
-### setup +
-First-time setup wizard. +### `yuy setup` + +First-time setup wizard. Creates the `~/.yuuki/` directory structure, detects platform and hardware, checks for runtimes, and offers to install one if none are found. ```bash yuy setup ``` -Creates the `~/.yuuki/` directory structure, detects your platform and hardware, checks for runtimes, and offers to install one if none are found. Run this once after installation. +
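First-run preparation is mostly filesystem work. A sketch under the layout documented in the Configuration section; the helper name and the written defaults are illustrative.

```rust
use std::path::PathBuf;

use anyhow::Result;

/// Roughly what `yuy setup` prepares on first run: the base directory,
/// the models folder, and a starter config if none exists yet.
fn ensure_yuuki_dirs(home: PathBuf) -> Result<PathBuf> {
    let base = home.join(".yuuki");
    std::fs::create_dir_all(base.join("models"))?;

    let config = base.join("config.toml");
    if !config.exists() {
        // Default mirrors the documented config key; the value is the desktop default.
        std::fs::write(&config, "default_quant = \"q5_k_m\"\n")?;
    }
    Ok(base)
}
```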
--- +
+ +
+ ## Model Quantizations +
+ +
+ Quantization reduces model size at the cost of some precision. Yuy automatically recommends the best option for your hardware. -| Quantization | Relative Size | Quality | Recommended For | -|-------------|---------------|---------|-----------------| -| `q4_0` | Smallest | Good | Termux, low-RAM devices (<8 GB) | -| `q5_k_m` | Medium | Better | Desktop with 8-16 GB RAM | +| Quantization | Size | Quality | Recommended For | +|:-------------|:-----|:--------|:----------------| +| `q4_0` | Smallest | Good | Termux, low-RAM devices (< 8 GB) | +| `q5_k_m` | Medium | Better | Desktop with 8--16 GB RAM | | `q8_0` | Large | Best | Desktop with 16+ GB RAM | -| `f32` | Largest | Full precision | Research, analysis | +| `f32` | Largest | Full precision | Research and analysis | **Auto-selection logic:** ``` -Termux (any RAM) → q4_0 -Linux/macOS (<8 GB) → q4_k_m -Linux/macOS (<16 GB) → q5_k_m (default) -Linux/macOS (16+ GB) → q8_0 +Termux (any RAM) q4_0 +Linux/macOS (< 8 GB) q4_k_m +Linux/macOS (< 16 GB) q5_k_m (default) +Linux/macOS (16+ GB) q8_0 ``` +
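The selection rules are simple enough to state directly in code. A sketch that follows the thresholds above; the function name and inputs are assumptions, not the real `utils.rs` API.

```rust
/// Quantization recommendation following the documented thresholds.
/// `is_termux` and `ram_gb` would come from platform detection and a RAM probe.
fn recommend_quant(is_termux: bool, ram_gb: u64) -> &'static str {
    if is_termux {
        "q4_0"
    } else if ram_gb < 8 {
        "q4_k_m"
    } else if ram_gb < 16 {
        "q5_k_m" // default
    } else {
        "q8_0"
    }
}
```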
+ --- -## Runtimes +
-Yuy delegates inference to external engines. It currently supports two runtimes: +
-### llama.cpp +## Inference Runtimes -The default and recommended runtime. Lightweight, portable, and highly optimized. +
+ +
+ +Yuy delegates inference to external engines. Two runtimes are supported. + + + + + + +
+ +

llama.cpp (default)

+ +Lightweight, portable, highly optimized. Recommended for most users. - Single binary, no dependencies - CPU-optimized with SIMD (NEON on ARM, AVX on x86) @@ -374,7 +478,7 @@ The default and recommended runtime. Lightweight, portable, and highly optimized - Low memory footprint - Ideal for Termux -**How Yuy invokes llama.cpp:** +**How Yuy invokes it:** ```bash llama-cli \ @@ -387,19 +491,36 @@ llama-cli \ --color ``` -### ollama +
-Server-based runtime with a more user-friendly model management system. +

ollama

+ +Server-based runtime with user-friendly model management. - Built-in model management - REST API for programmatic access - Can serve multiple models - Optional web UI +
+ +
+ --- +
+ +
+ ## Configuration +
+ +
+ ### Config File Location: `~/.yuuki/config.toml` @@ -413,103 +534,117 @@ default_quant = "q5_k_m" # q4_0 | q5_k_m | q8_0 | f32 ### Priority Order -Settings are resolved in this order (highest priority first): +Settings resolve in this order (highest priority first): -1. **CLI flags** — `yuy run Yuuki-best --quant q8_0` -2. **Config file** — `default_quant = "q5_k_m"` -3. **Auto-detection** — platform and hardware-based defaults +1. **CLI flags** -- `yuy run Yuuki-best --quant q8_0` +2. **Config file** -- `default_quant = "q5_k_m"` +3. **Auto-detection** -- platform and hardware-based defaults ### Directory Structure ``` ~/.yuuki/ -├── config.toml # User configuration -└── models/ # Downloaded models - ├── Yuuki-best/ - │ ├── yuuki-best-q4_0.gguf - │ └── yuuki-best-q5_k_m.gguf - ├── Yuuki-3.7/ - └── Yuuki-v0.1/ + config.toml # user configuration + models/ # downloaded models + Yuuki-best/ + yuuki-best-q4_0.gguf + yuuki-best-q5_k_m.gguf + Yuuki-3.7/ + Yuuki-v0.1/ ``` -On Termux, the base path is `/data/data/com.termux/files/home/.yuuki/`. +On Termux the base path is `/data/data/com.termux/files/home/.yuuki/`. + +
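Resolution is a chain of fallbacks over that order. A sketch assuming a `Config` struct deserialized from `config.toml` with serde; field and function names are illustrative.

```rust
use serde::Deserialize;

/// Subset of ~/.yuuki/config.toml, matching the key shown above.
#[derive(Deserialize, Default)]
struct Config {
    default_quant: Option<String>,
}

/// Highest priority first: CLI flag, then config file, then hardware-based default.
fn resolve_quant(cli_flag: Option<&str>, config: &Config, auto: &str) -> String {
    cli_flag
        .map(str::to_owned)
        .or_else(|| config.default_quant.clone())
        .unwrap_or_else(|| auto.to_owned())
}
```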
--- +
+ +
+ ## Architecture +
+ +
+ ``` -┌─────────────────────────────────────────────────────┐ -│ User │ -└───────────────────────┬─────────────────────────────┘ - │ - v -┌─────────────────────────────────────────────────────┐ -│ Yuy CLI (Rust) │ -│ │ -│ CLI Layer ──────── clap + colored │ -│ │ Argument parsing, UI, validation │ -│ v │ -│ Commands Layer ─── 8 async command modules │ -│ │ download, run, list, info, │ -│ │ remove, runtime, doctor, setup │ -│ v │ -│ Core Services ──── config.rs + utils.rs │ -│ Config management, platform │ -│ detection, formatting │ -└──────────┬──────────────────────┬────────────────────┘ - │ │ - v v - ┌─────────────────┐ ┌──────────────────┐ - │ External APIs │ │ Local Storage │ - │ Hugging Face │ │ ~/.yuuki/ │ - │ GitHub │ │ Models + Config │ - └────────┬─────────┘ └──────────────────┘ - │ - v - ┌──────────────────────────────┐ - │ Inference Runtimes │ - │ llama.cpp │ ollama │ - └──────────────────────────────┘ + User + | + v + +------------------------------------------------------------+ + | Yuy CLI (Rust) | + | | + | CLI Layer clap + colored | + | | argument parsing, UI, validation | + | v | + | Commands Layer 8 async command modules | + | | download, run, list, info, | + | | remove, runtime, doctor, setup | + | v | + | Core Services config.rs + utils.rs | + | config management, platform | + | detection, formatting | + +------------+----------------------------+------------------+ + | | + v v + +------------------+ +------------------+ + | External APIs | | Local Storage | + | Hugging Face | | ~/.yuuki/ | + | GitHub | | models + config | + +---------+--------+ +------------------+ + | + v + +-------------------------------+ + | Inference Runtimes | + | llama.cpp | ollama | + +-------------------------------+ ``` +
+ ### Source Layout ``` yuy/ -├── Cargo.toml # Project manifest and dependencies -├── README.md -├── PROJECT.md # Technical documentation -│ -└── src/ - ├── main.rs # Entry point, CLI router, error handling - ├── cli.rs # CLI definitions with clap derive macros - ├── config.rs # Configuration management, paths, constants - ├── utils.rs # Platform detection, RAM check, formatting - │ - └── commands/ - ├── mod.rs # Module declarations - ├── download.rs # Model download with streaming + progress - ├── run.rs # Model execution with runtime detection - ├── list.rs # Local and remote model listing - ├── info.rs # Model metadata and variant inspection - ├── remove.rs # Model deletion with confirmation - ├── runtime.rs # Runtime detection and installation - ├── doctor.rs # System diagnostics - └── setup.rs # First-time setup wizard + Cargo.toml # project manifest and dependencies + README.md + PROJECT.md # technical documentation + src/ + main.rs # entry point, CLI router, error handling + cli.rs # CLI definitions with clap derive macros + config.rs # configuration management, paths, constants + utils.rs # platform detection, RAM check, formatting + commands/ + mod.rs # module declarations + download.rs # model download with streaming + progress + run.rs # model execution with runtime detection + list.rs # local and remote model listing + info.rs # model metadata and variant inspection + remove.rs # model deletion with confirmation + runtime.rs # runtime detection and installation + doctor.rs # system diagnostics + setup.rs # first-time setup wizard ``` +
+ ### Design Patterns -- **Command pattern** — Each command is an isolated async module with an `execute()` entry point -- **Type-safe CLI** — `clap` derive macros ensure compile-time validation of arguments -- **Async I/O** — Tokio runtime for non-blocking downloads and process management -- **Error propagation** — `anyhow::Result` with contextual error messages throughout +| Pattern | Implementation | +|:--------|:---------------| +| Command pattern | Each command is an isolated async module with an `execute()` entry point | +| Type-safe CLI | `clap` derive macros ensure compile-time validation of arguments | +| Async I/O | Tokio runtime for non-blocking downloads and process management | +| Error propagation | `anyhow::Result` with contextual error messages throughout | + +
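Put together, the command pattern and `anyhow` propagation give every module the same shape. The following is a sketch of that shape, not the actual `commands/` source.

```rust
use std::path::Path;

use anyhow::{Context, Result};

/// Each command module exposes one async entry point returning `anyhow::Result`;
/// main.rs routes the parsed clap subcommand to the matching `execute()`.
pub async fn execute(base_dir: &Path, model: &str) -> Result<()> {
    let model_dir = base_dir.join("models").join(model);
    std::fs::metadata(&model_dir)
        .with_context(|| format!("model '{model}' is not downloaded"))?;

    // ... command-specific work (download, run, inspect, remove, ...) goes here ...
    Ok(())
}
```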
### Dependencies | Crate | Purpose | -|-------|---------| +|:------|:--------| | `clap` | CLI argument parsing with derive macros | | `tokio` | Async runtime | | `reqwest` | HTTP client for downloads | @@ -520,33 +655,43 @@ yuy/ | `anyhow` | Error handling | | `futures-util` | Stream utilities for downloads | +
+ --- +
+ +
+ ## Platform Support +
+ +
+ | Platform | Status | Notes | -|----------|--------|-------| -| Termux (Android) | Full support | Primary target, fully tested | -| Linux (x86_64) | Full support | Tested on Ubuntu 22.04+ | -| Linux (ARM64) | Full support | Tested on Raspberry Pi | -| macOS (Intel) | Full support | Tested on Big Sur+ | -| macOS (Apple Silicon) | Full support | Metal acceleration via llama.cpp | -| Windows 10/11 | Partial | Runtime auto-install not yet implemented | +|:---------|:-------|:------| +| **Termux (Android)** | Full support | Primary target, fully tested | +| **Linux x86_64** | Full support | Tested on Ubuntu 22.04+ | +| **Linux ARM64** | Full support | Tested on Raspberry Pi | +| **macOS Intel** | Full support | Tested on Big Sur+ | +| **macOS Apple Silicon** | Full support | Metal acceleration via llama.cpp | +| **Windows 10/11** | Partial | Runtime auto-install not yet implemented | ---- +
-## Platform-Specific Optimizations +
+Termux (Android) -- Primary Target +
-### Termux (Android) +Platform optimizations applied automatically: -Termux is the primary target. Yuy applies these optimizations automatically: +- Default quantization: `q4_0` (minimum memory footprint) +- Download buffer: 64 KB (conservative for mobile I/O) +- Compilation: single-threaded (`-j 1`) to avoid thermal throttling +- Progress bars: simplified for narrower terminal widths -- **Default quantization:** `q4_0` (minimum memory footprint) -- **Download buffer:** 64 KB (conservative for mobile I/O) -- **Compilation:** Single-threaded (`-j 1`) to avoid thermal throttling -- **Progress bars:** Simplified for narrower terminal widths - -**Platform detection:** +Platform detection: ```rust std::env::var("PREFIX") @@ -554,102 +699,163 @@ std::env::var("PREFIX") .unwrap_or(false) ``` -### Linux Desktop +
+ +
+Linux Desktop +
- Default quantization: `q5_k_m` - Parallel compilation - GPU support via CUDA or ROCm when available -### macOS +
+ +
+macOS +
- Metal acceleration for Apple Silicon GPUs - Homebrew-based runtime installation - `q8_0` default on machines with 16+ GB RAM -### Windows +
+ +
+Windows +
- Path handling with backslashes - Chocolatey for package management - CUDA support for NVIDIA GPUs +
+ +
+ --- +
+ +
+ ## Design Decisions -### Why Rust? +
-- **Performance** — small, fast binaries with no runtime overhead -- **Memory safety** — no garbage collector, no segfaults -- **Async ecosystem** — Tokio provides mature non-blocking I/O -- **Cross-compilation** — single codebase targets all platforms -- **Cargo** — dependency management and build system in one tool +
-### Why wrap llama.cpp instead of building a custom runtime? +
+Why Rust? +
-Pragmatism. llama.cpp has 3+ years of optimization work from 500+ contributors. It handles SIMD, GPU acceleration, quantization formats, and thousands of edge cases. Building an equivalent would take years for a single developer. Yuy provides the experience layer; llama.cpp provides the engine. +Performance with zero runtime overhead, memory safety without a garbage collector, a mature async ecosystem through Tokio, straightforward cross-compilation for all target platforms, and Cargo as a unified build and dependency system. -### Why clap for CLI? +
-clap v4 absorbed structopt, has the best documentation in the Rust CLI ecosystem, supports colored help text, and provides compile-time validation through derive macros. +
+Why wrap llama.cpp instead of building a custom runtime? +
-### Why TOML for configuration? +Pragmatism. llama.cpp has 3+ years of optimization from 500+ contributors. It handles SIMD, GPU acceleration, quantization formats, and thousands of edge cases. Building an equivalent runtime would take years for a single developer. Yuy provides the experience layer; llama.cpp provides the engine. + +
+ +
+Why clap for CLI? +
+ +clap v4 absorbed structopt, has the strongest documentation in the Rust CLI ecosystem, supports colored help text, and provides compile-time validation through derive macros. + +
+ +
+Why TOML for configuration? +
TOML is more readable than JSON, simpler than YAML, and is the standard in the Rust ecosystem (Cargo.toml). First-class serde support makes serialization trivial. -### Why async/await? +
-Large model downloads (multi-GB) must not block the UI. Async enables smooth progress bars, and sets the foundation for future parallel chunk downloads. +
+Why async/await? +
+ +Large model downloads (multi-GB) must not block the UI. Async enables smooth progress bars and sets the foundation for future parallel chunk downloads. + +
+ +
--- +
+ +
+ ## Performance -### Benchmarks +
+ +
| Operation | Target | Actual | -|-----------|--------|--------| -| CLI startup | <100 ms | ~50 ms | -| Download 1 GB | <5 min | 3-4 min (network dependent) | -| Model listing | <50 ms | ~10 ms | -| Doctor check | <200 ms | ~150 ms | - -### Binary Size +|:----------|:-------|:-------| +| CLI startup | < 100 ms | ~50 ms | +| Download 1 GB | < 5 min | 3--4 min (network dependent) | +| Model listing | < 50 ms | ~10 ms | +| Doctor check | < 200 ms | ~150 ms | ``` -Release build: ~8 MB +Binary size (release): ~8 MB +Rust source files: 15 +Lines of code: ~2,500 +Direct dependencies: 11 +Clean build time: ~2 min ``` -### Code Statistics - -``` -Rust source files: 15 -Lines of code: ~2,500 -Direct dependencies: 11 -Clean build time: ~2 min -``` +
--- +
+ +
+ ## Security -### Current Measures +
-- **URL validation** — only downloads from `https://huggingface.co/` -- **No arbitrary code execution** — Yuy spawns runtimes, never executes model content -- **Scoped file access** — all operations within `~/.yuuki/` +
+ +### Current + +- **URL validation** -- only downloads from `https://huggingface.co/` +- **No arbitrary code execution** -- Yuy spawns runtimes, never executes model content +- **Scoped file access** -- all operations within `~/.yuuki/` ### Planned (v0.2+) - SHA256 checksum verification for downloaded models -- System keyring integration for Hugging Face tokens (instead of plaintext in config) +- System keyring integration for Hugging Face tokens - File permission enforcement (`0o600` for sensitive files) - Encrypted token storage on Termux via libsodium +
+ --- +
+ +
+ ## Roadmap -### Phase 1: MVP (Complete) +
+ +
+ +### Phase 1 -- MVP (Complete) - [x] Core CLI with 8 commands - [x] Download from Hugging Face with progress bars @@ -662,7 +868,7 @@ Clean build time: ~2 min - [x] Auto-selection of quantization - [x] Colored terminal output -### Phase 2: Core Features (In Progress) +### Phase 2 -- Core Features (In Progress) - [ ] Resume interrupted downloads - [ ] Parallel chunk downloads @@ -672,57 +878,50 @@ Clean build time: ~2 min - [ ] Unit and integration tests - [ ] CI/CD with GitHub Actions -### Phase 3: Advanced Features (Planned) +### Phase 3 -- Advanced Features (Planned) - [ ] Persistent conversation sessions - ``` - ~/.yuuki/conversations/ - ├── session-2026-01-15.json - └── session-2026-01-16.json - ``` - [ ] Template system for custom prompts - ```bash - yuy template create coding-assistant - yuy run Yuuki-best --template coding-assistant - ``` - [ ] Custom user-defined presets - ```toml - [presets.my-creative] - temperature = 0.9 - top_p = 0.95 - top_k = 50 - ``` - [ ] llama.cpp library integration (bypass CLI spawning) - [ ] Training code download command -### Phase 4: Ecosystem (Future) +### Phase 4 -- Ecosystem (Future) - [ ] Plugin system - [ ] Optional web UI - [ ] REST API server mode - [ ] Auto-updates -- [ ] Optional telemetry (opt-in) - [ ] Community model hub with ratings - [ ] Fine-tuning helpers +
+ --- +
+ +
+ ## Contributing +
+ +
+ ### Development Setup ```bash -# Clone git clone https://github.com/YuuKi-OS/yuy cd yuy -# Install Rust (if needed) +# install Rust if needed curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -# Install dev tools +# install dev tools cargo install cargo-watch cargo-edit -# Verify +# verify cargo check cargo test cargo fmt -- --check @@ -735,9 +934,9 @@ cargo clippy (): ``` -**Types:** `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore` +Types: `feat` | `fix` | `docs` | `style` | `refactor` | `test` | `chore` -**Example:** +Example: ``` feat(download): add resume capability @@ -765,54 +964,100 @@ Closes #42 - Prefer `async/await` over callbacks - Justify any new dependency +
+ --- -## About Yuuki +
-Yuy exists to serve the [Yuuki project](https://huggingface.co/OpceanAI/Yuuki-best) — a code-generation LLM being trained entirely on a smartphone (Redmi 12, Snapdragon 685, CPU only) with zero cloud budget. +
-**Key facts about the model:** +## About the Yuuki Project -| Detail | Value | -|--------|-------| +
+ +
+ +Yuy exists to serve the [Yuuki project](https://huggingface.co/OpceanAI/Yuuki-best) -- a code-generation LLM being trained entirely on a smartphone (Redmi 12, Snapdragon 685, CPU only) with zero cloud budget. + + + + + + +
+ +**Training Details** + +| | | +|:--|:--| | Base model | GPT-2 (124M parameters) | -| Training type | Continued pre-training (fine-tuning) | +| Training type | Continued pre-training | | Hardware | Snapdragon 685, CPU only | | Training time | 50+ hours | | Progress | 2,000 / 37,500 steps (5.3%) | | Cost | $0.00 | -| Best language | Agda (55/100) | -| License | Apache 2.0 | -**Current quality scores (Checkpoint 2000):** + + +**Quality Scores (Checkpoint 2000)** | Language | Score | -|----------|-------| -| Agda | 55/100 | -| C | 20/100 | -| Assembly | 15/100 | -| Python | 8/100 | +|:---------|:------| +| Agda | 55 / 100 | +| C | 20 / 100 | +| Assembly | 15 / 100 | +| Python | 8 / 100 | + +
A fully native model (trained from scratch, not fine-tuned) is planned for v1.0. A research paper documenting the mobile training methodology is in preparation. +
+ --- +
+ +
+ ## Links -| Resource | URL | -|----------|-----| -| Model weights (recommended) | https://huggingface.co/OpceanAI/Yuuki-best | -| Original model (historical) | https://huggingface.co/OpceanAI/Yuuki | -| Interactive demo | https://huggingface.co/spaces/OpceanAI/Yuuki | -| Training code | https://github.com/YuuKi-OS/yuuki-training | -| CLI source (this repo) | https://github.com/YuuKi-OS/yuy | -| Issues | https://github.com/YuuKi-OS/yuy/issues | +
+ +
+ +
+ +[![Model Weights](https://img.shields.io/badge/Model_Weights-Hugging_Face-ffd21e?style=for-the-badge&logo=huggingface&logoColor=black)](https://huggingface.co/OpceanAI/Yuuki-best) +  +[![Live Demo](https://img.shields.io/badge/Live_Demo-Spaces-ffd21e?style=for-the-badge&logo=huggingface&logoColor=black)](https://huggingface.co/spaces/OpceanAI/Yuuki) +  +[![Source Code](https://img.shields.io/badge/Source_Code-GitHub-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/YuuKi-OS/yuy) + +
+ +[![Training Code](https://img.shields.io/badge/Training_Code-GitHub-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/YuuKi-OS/yuuki-training) +  +[![Original Model](https://img.shields.io/badge/Original_Model-Historical-555555?style=for-the-badge)](https://huggingface.co/OpceanAI/Yuuki) +  +[![Issues](https://img.shields.io/badge/Report_Issue-GitHub-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/YuuKi-OS/yuy/issues) + +
+ +
--- +
+ +
+ ## License -Licensed under the **Apache License, Version 2.0**. +
+ +
``` Copyright 2026 Yuuki Project @@ -830,9 +1075,20 @@ See the License for the specific language governing permissions and limitations under the License. ``` +
+ --- -

- Built with patience, a phone, and zero budget.
- Yuuki Project -

+
+ +
+ +**Built with patience, a phone, and zero budget.** + +
+ +[![Yuuki Project](https://img.shields.io/badge/Yuuki_Project-2026-000000?style=for-the-badge)](https://huggingface.co/OpceanAI) + +
+ +