Architecture

Overview

Existing site → <script widget.js> ──┐
Admin UI (React) ────────────────────┴→ NeNe Corpus API (PHP 8.4) → MySQL
                                               │
                                               ↓ server-side only
                                         Claude API (tool_use)

NeNe Corpus is a PHP 8.4 application built on NENE2, an internal framework designed for compatibility with Japanese shared hosting. A single codebase serves both Tier A (shared hosting) and Tier B (Docker/VPS).


PHP Layer Structure

Handler → UseCase → RepositoryInterface → PdoRepository
Layer      | Responsibility                                            | Restrictions
Handler    | HTTP parse, DTO construction, UseCase call, JSON response | No SQL, no business logic
UseCase    | Business logic, orchestration                             | No $_SERVER, no PDO
Repository | SQL / persistence only                                    | No HTTP, no session logic
Llm/       | Claude API (tool_use)                                     | No domain invariants

All PHP files use declare(strict_types=1). Classes are final readonly where possible.
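
As a rough illustration of how these layers fit together, the sketch below mirrors the Handler → UseCase → RepositoryInterface → PdoRepository chain. It is written in TypeScript (the language of frontend/) rather than PHP, and every name in it (StoredMessage, MessageRepository, InMemoryMessageRepository, SaveMessageUseCase, handleSaveMessage) is hypothetical and not taken from the codebase.

```typescript
// Hypothetical names throughout: a TypeScript stand-in for the PHP layers.

interface StoredMessage {
  readonly session_id: string;
  readonly body: string;
}

// RepositoryInterface: persistence only, no HTTP and no business rules.
interface MessageRepository {
  save(message: StoredMessage): void;
  countBySession(sessionId: string): number;
}

// Stand-in for PdoRepository: keeps rows in memory instead of MySQL.
class InMemoryMessageRepository implements MessageRepository {
  private readonly rows: StoredMessage[] = [];

  save(message: StoredMessage): void {
    this.rows.push(message);
  }

  countBySession(sessionId: string): number {
    return this.rows.filter((row) => row.session_id === sessionId).length;
  }
}

// UseCase: business logic only. It depends on the interface, never on the driver.
class SaveMessageUseCase {
  private readonly repo: MessageRepository;

  constructor(repo: MessageRepository) {
    this.repo = repo;
  }

  execute(input: StoredMessage): number {
    if (input.body.trim() === "") {
      throw new Error("empty message");
    }
    this.repo.save(input);
    return this.repo.countBySession(input.session_id);
  }
}

// Handler: parse the request, build the DTO, call the UseCase, return JSON.
function handleSaveMessage(rawBody: string, useCase: SaveMessageUseCase): string {
  const parsed = JSON.parse(rawBody) as { session_id?: unknown; body?: unknown };
  const dto: StoredMessage = {
    session_id: String(parsed.session_id ?? ""),
    body: String(parsed.body ?? ""),
  };
  return JSON.stringify({ message_count: useCase.execute(dto) });
}
```

Swapping InMemoryMessageRepository for a database-backed implementation would leave the UseCase and the handler untouched, which is the reason persistence is reached only through the interface.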


Module Structure (src/)

Modules are organized domain-first. Layer-first folders (src/Handlers/, src/Repositories/) are not used.

src/
  AdminAuth/      # JWT auth, admin user management
  Ingestion/      # PDF / CSV / text ingestion pipeline
  Source/         # Source file metadata
  Document/       # Logical document CRUD
  Chunk/          # Text segments for search and citations
  Chat/           # Sync JSON chat (SendChatMessageUseCase)
  Session/        # Chat session management
  Search/         # Full-text search (LIKE + scoring)
  Llm/            # Claude API orchestration (tool_use, max 3 rounds)
  RateLimit/      # Rate limiting (session / IP)
  Appearance/     # Widget appearance settings
  Settings/       # LLM settings (API key management)
  ChatLimits/     # Usage limits (per-IP / global)
  Notification/   # Email notifications (daily report, limit alerts)
  Mail/           # SMTP mailer (PHPMailer wrapper)
  Install/        # Web installer (Tier A)
  Config/         # .env update utilities

Database Schema

Table                 | Purpose
sources               | Uploaded source files (pdf / csv / text)
documents             | Logical documents derived from sources
chunks                | Searchable text segments with citation data
chat_sessions         | Consumer chat sessions
chat_messages         | Messages with citations_json
rate_limit_buckets    | IP/session rate limiting
admin_users           | Admin accounts
admin_password_resets | Password reset tokens (SHA-256 hash, 1-hour TTL)
appearance_settings   | Widget theme, hero, chat, layout JSON
corpus_chat_settings  | LLM config (model, prompt, fallback)
chat_limits_settings  | Usage limits (char count, intervals, daily budgets)

Migrations: Phinx (database/migrations/). Schema snapshots: database/schema/.


Frontend (frontend/)

npm workspaces monorepo:

frontend/
  apps/admin/       # Admin SPA — Tailwind CSS v4 + React
  apps/widget/      # Embed widget — BEM + CSS variables → widget.js
  packages/
    api-client/     # snake_case types + fetch helpers
    i18n/           # Message keys + catalogs (ja/en/de/fr/zh-Hans/pt-BR)
    tokens/         # BEM class constants + CSS variable names

Rule          | Detail
Widget styles | BEM + var(--nc-*) only. No Tailwind in widget
Admin styles  | Tailwind utility classes in TSX
JSON          | snake_case throughout (no client-side renaming)
i18n          | All UI strings in locale catalogs
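
The styling and JSON rules can be sketched as follows. This is a minimal illustration of what packages/tokens and packages/api-client might contain, not their actual contents: the block name nc-widget, the variable names --nc-accent and --nc-radius, the helpers bem, cssVar and parseChatReply, and the ChatReply fields are all assumptions made for the example.

```typescript
// Hypothetical tokens: one BEM block name plus CSS variable names under --nc-*.
const BLOCK = "nc-widget";

const cssVars = {
  accent: "--nc-accent",
  radius: "--nc-radius",
} as const;

// Build a BEM class name: block, block__element, or block__element--modifier.
function bem(element?: string, modifier?: string): string {
  let cls = BLOCK;
  if (element) cls += `__${element}`;
  if (modifier) cls += `--${modifier}`;
  return cls;
}

// Wrap a token in var(...) so widget styles never hard-code a value.
function cssVar(name: keyof typeof cssVars): string {
  return `var(${cssVars[name]})`;
}

// Hypothetical api-client type: field names stay snake_case, exactly as sent by the API.
interface ChatReply {
  session_id: string;
  reply_text: string;
  citations: { chunk_id: number; source_title: string }[];
}

// Parse a response body without renaming any keys to camelCase.
function parseChatReply(json: string): ChatReply {
  return JSON.parse(json) as ChatReply;
}
```

With these helpers, bem("bubble", "user") yields nc-widget__bubble--user and cssVar("accent") yields var(--nc-accent), so the widget can be styled without any Tailwind classes.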

Sync JSON Chat

NeNe Corpus uses synchronous JSON chat: each message is handled in a single HTTP request/response. SSE / token streaming is intentionally not implemented, to preserve Tier A shared-hosting compatibility (the ignore_user_abort pattern is unreliable on shared hosts).

Claude tool_use orchestration runs server-side with a maximum of 3 rounds per message.
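
A bounded tool_use loop of this kind might look like the sketch below. It assumes that a "round" means one model call, that the model call is a blocking callback (matching the single request/response flow), and that a fallback reply is returned when the budget runs out. The names ModelTurn, CallModel, RunTool and answerMessage are hypothetical, and the real orchestration lives in the PHP Llm/ module.

```typescript
// One model turn: either a final text answer or a request to run a tool.
type ModelTurn =
  | { kind: "text"; text: string }
  | { kind: "tool_use"; tool: string; input: string };

// Hypothetical blocking callbacks standing in for the Claude API and the tool runner.
type CallModel = (transcript: string[]) => ModelTurn;
type RunTool = (tool: string, input: string) => string;

const MAX_ROUNDS = 3;

function answerMessage(
  userMessage: string,
  callModel: CallModel,
  runTool: RunTool,
  fallback: string,
): { reply: string; rounds: number } {
  const transcript: string[] = [`user: ${userMessage}`];

  for (let round = 1; round <= MAX_ROUNDS; round++) {
    const turn = callModel(transcript);
    if (turn.kind === "text") {
      // The model produced a final answer within the budget.
      return { reply: turn.text, rounds: round };
    }
    if (round === MAX_ROUNDS) {
      // No round is left to consume a tool result, so skip running the tool.
      break;
    }
    // Run the requested tool server-side and feed its result into the next round.
    transcript.push(`tool_use: ${turn.tool}(${turn.input})`);
    transcript.push(`tool_result: ${runTool(turn.tool, turn.input)}`);
  }

  // Round budget exhausted without a final text answer.
  return { reply: fallback, rounds: MAX_ROUNDS };
}
```

Because the loop is capped, the whole exchange still fits inside one HTTP request: the caller receives either the model's final text or the fallback after at most three model calls.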