Query arXiv, Semantic Scholar, ScienceDirect, Springer Nature (browser + API), DOAJ, Europe PMC, OpenAlex, BASE, CORE, NASA ADS, IEEE Xplore, Zenodo, Crossref, DBLP, HAL, PubMed, PubMed Central, and bioRxiv/medRxiv simultaneously. Results are deduplicated by DOI so you never see the same paper twice.
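As a rough illustration of DOI-based deduplication, the sketch below keeps the first record seen for each normalized DOI. It is not MOSAIC's actual implementation: `normalize_doi`, `dedupe_by_doi`, and the record fields (`title`, `doi`, `source`) are assumptions made for the example.

```python
from typing import Iterable

# Prefixes that commonly wrap a bare DOI (assumed list, not exhaustive).
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip a leading resolver URL or 'doi:' scheme."""
    doi = doi.strip().lower()
    for prefix in _DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def dedupe_by_doi(papers: Iterable[dict]) -> list[dict]:
    """Keep the first record per DOI; records without a DOI pass through."""
    seen: set[str] = set()
    unique: list[dict] = []
    for paper in papers:
        doi = paper.get("doi")
        if not doi:
            unique.append(paper)  # nothing to match on, so keep it
            continue
        key = normalize_doi(doi)
        if key not in seen:
            seen.add(key)
            unique.append(paper)
    return unique


# Hypothetical records returned by three different sources.
results = [
    {"title": "Attention Is All You Need", "doi": "10.48550/arXiv.1706.03762", "source": "arxiv"},
    {"title": "Attention is all you need", "doi": "https://doi.org/10.48550/ARXIV.1706.03762", "source": "openalex"},
    {"title": "A preprint without a DOI", "doi": None, "source": "hal"},
]
unique = dedupe_by_doi(results)
print([p["source"] for p in unique])  # ['arxiv', 'hal']
```

DOIs are case-insensitive, which is why the sketch lower-cases them before comparing.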
PDF Download
Download open-access PDFs directly. When no PDF link is known, MOSAIC queries Unpaywall to find a legal open-access copy automatically.
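A minimal sketch of what such an Unpaywall lookup might look like, assuming the public Unpaywall v2 REST endpoint and its `best_oa_location` / `oa_locations` / `url_for_pdf` response fields. The helper names (`unpaywall_url`, `pick_pdf_url`, `find_oa_pdf`) are hypothetical and do not describe MOSAIC's internals.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def unpaywall_url(doi: str, email: str) -> str:
    """Build the Unpaywall v2 lookup URL for a DOI (email is required by the API)."""
    query = urllib.parse.urlencode({"email": email})
    return f"https://api.unpaywall.org/v2/{urllib.parse.quote(doi)}?{query}"


def pick_pdf_url(record: dict) -> Optional[str]:
    """Return a direct PDF link from an Unpaywall record, preferring the best location."""
    best = record.get("best_oa_location") or {}
    if best.get("url_for_pdf"):
        return best["url_for_pdf"]
    for location in record.get("oa_locations") or []:
        if location.get("url_for_pdf"):
            return location["url_for_pdf"]
    return None  # no open-access PDF known


def find_oa_pdf(doi: str, email: str) -> Optional[str]:
    """Query Unpaywall over the network and return a PDF URL, if any."""
    with urllib.request.urlopen(unpaywall_url(doi, email), timeout=30) as resp:
        return pick_pdf_url(json.load(resp))
```

Calling `find_oa_pdf("10.1038/nature12373", "you@example.com")` would perform the network request; the parsing step is kept separate so it can be exercised offline.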
Local Cache
All search results and download history are stored in a local SQLite database. Re-run queries instantly without hitting the network.
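The caching idea can be sketched with Python's standard `sqlite3` module. The table name, columns, and the `cached_search` helper below are assumptions for illustration only; MOSAIC's real schema is not described here.

```python
import json
import sqlite3

# In-memory database for the example; a real cache would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS search_cache ("
    "query TEXT PRIMARY KEY, results TEXT NOT NULL)"
)


def cached_search(conn, query, fetch):
    """Return cached results for `query`, calling `fetch` only on a cache miss."""
    row = conn.execute(
        "SELECT results FROM search_cache WHERE query = ?", (query,)
    ).fetchone()
    if row is not None:
        return json.loads(row[0])  # cache hit: no network needed
    results = fetch(query)
    conn.execute(
        "INSERT OR REPLACE INTO search_cache (query, results) VALUES (?, ?)",
        (query, json.dumps(results)),
    )
    conn.commit()
    return results


calls = []


def fake_fetch(query):
    """Stand-in for a network search; records each call."""
    calls.append(query)
    return [{"title": "Example paper", "doi": "10.1234/example"}]


first = cached_search(conn, "diffusion models", fake_fetch)
second = cached_search(conn, "diffusion models", fake_fetch)
print(len(calls))  # the fetcher ran only for the first call
```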
Source-aware
Enable or disable individual sources, set per-source API keys, and control rate limits, all from a single TOML config file.
Rich Terminal UI
Results are displayed as a formatted table with open-access and PDF indicators. Progress spinners keep you informed during long searches.
Web Interface
Launch a browser-based UI with `mosaic ui`. Search, filter, download, and export from a responsive dashboard with dark mode, per-source progress, and keyboard shortcuts.
Local RAG
Index your paper library and ask questions in natural language: synthesis, gap analysis, method comparison, structured extraction. Runs fully locally via Ollama or any OpenAI-compatible server. No data leaves your machine.
Relevance Ranking
Re-rank any result set by relevance to your query with `--sort relevance`. The default scorer is BM25 (instant, no model required). Optionally delegate to your configured LLM for higher-quality scores.
NotebookLM Integration
Turn any search into a Google NotebookLM notebook: podcast, video overview, slides, quiz, flashcards, mind map, and briefing doc, all queued in one command with `mosaic notebook create`.
Citation Network
Explore hub papers and topic clusters from your local citation graph. BFS subgraph from any seed query, Louvain or connected-components clustering, and graph export to JSON (D3/Gephi), Graphviz DOT, or Mermaid, all with `mosaic network`.
Compare Papers
Generate a structured comparison table (method, dataset, metric, result) across any set of cached papers using an LLM. Falls back gracefully to metadata fields without one. Export to Markdown, CSV, or JSON with `mosaic compare`.
Claude Code Skill & AI Agent Mode
Install the bundled Claude Code skill with `mosaic skill install`, which gives Claude Code full knowledge of every command, source, and option. The `--json` flag on `search` and `similar` emits a structured JSON envelope to stdout for piping, scripting, and CI pipelines.
Open & Extensible
Each source is a small self-contained class. Adding a new database takes fewer than 50 lines of Python.
Custom Sources
Wire any number of JSON REST APIs as new search sources directly in `config.toml`: one block per source, no Python required. Supports GET and POST, nested field paths, API keys, and author objects.
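To make "nested field paths" and "author objects" concrete, here is a small sketch of how a dotted path could be resolved against a JSON response. The dotted-path syntax, the `get_path` helper, and the sample response shape are all assumptions for illustration; the actual `config.toml` syntax may differ.

```python
from typing import Any


def get_path(obj: Any, path: str, default: Any = None) -> Any:
    """Follow a dotted path such as 'meta.ids.doi' or 'hits.0.title' through dicts and lists."""
    current = obj
    for part in path.split("."):
        if isinstance(current, dict) and part in current:
            current = current[part]
        elif isinstance(current, list) and part.isdigit() and int(part) < len(current):
            current = current[int(part)]
        else:
            return default  # path does not exist in this response
    return current


# Hypothetical item from a custom JSON REST API.
item = {
    "meta": {"title": "Example paper", "ids": {"doi": "10.1234/example"}},
    "authors": [{"name": "A. Author"}, {"name": "B. Author"}],
}

title = get_path(item, "meta.title")
doi = get_path(item, "meta.ids.doi")
# Author objects: pull one field out of each object in the list.
authors = [a["name"] for a in get_path(item, "authors", [])]
print(title, doi, authors)
```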
Zotero Integration
Push results directly into your Zotero library with `--zotero`. Works with Zotero desktop (local API) and Zotero web (api.zotero.org). Organise into named collections and link downloaded PDFs as attachments.
```bash
# Core install: all 21 sources and the full CLI
pipx install mosaic-search

# Everything at once: RAG, web UI, Louvain clustering, browser sessions, NotebookLM
pipx install 'mosaic-search[all]'
playwright install chromium   # browser binary (needed for auth and NotebookLM)

mosaic config --unpaywall-email you@example.com
mosaic search "attention is all you need" --oa-only --download
```
```bash
pipx inject mosaic-search sqlite-vec   # install vector extension once
mosaic index                           # embed all cached papers
mosaic ask "What are the main approaches to graph neural networks?" --show-sources
mosaic ask "What open problems remain in protein structure prediction?" --mode gaps
mosaic chat                            # interactive multi-turn session
```
Runs entirely on your machine via Ollama or any OpenAI-compatible server. See the RAG guide for details.
After enriching your cache with citation edges, explore the topology of your corpus: identify hub papers, cluster by community, and export the graph for downstream tools, all without leaving the terminal.
```bash
# Enrich the citation graph (OpenAlex + CrossRef edges)
mosaic index --enrich-citations

# Most-connected papers across the full graph
mosaic network --top 10

# Topic subgraph with cluster report
mosaic network --query "transformer attention" --depth 2 --cluster --top 5

# Export for D3.js / Gephi / Mermaid
mosaic network --output graph.json
mosaic network --output graph.md
```
Louvain clustering requires networkx (`pipx inject mosaic-search networkx`); without it, MOSAIC falls back to connected components automatically.
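The two graph operations mentioned here, a depth-limited BFS subgraph and connected-components clustering, can be sketched with the standard library alone. This is an illustrative sketch on a toy undirected citation graph, not MOSAIC's code; the function names and the adjacency-dict representation are assumptions.

```python
from collections import deque


def build_adjacency(edges):
    """Turn (a, b) edge pairs into an undirected adjacency dict."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_subgraph(adj, seeds, depth):
    """Return every node within `depth` hops of any seed node."""
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, dist = frontier.popleft()
        if dist == depth:
            continue  # do not expand past the depth limit
        for neighbour in adj.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return seen


def connected_components(adj, nodes):
    """Group `nodes` into components, following only edges between them."""
    remaining = set(nodes)
    components = []
    while remaining:
        start = remaining.pop()
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adj.get(node, ()):
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Toy graph: a chain A-B-C-D and a separate pair E-F.
adj = build_adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")])
near_a = bfs_subgraph(adj, ["A"], depth=2)
clusters = connected_components(adj, adj.keys())
print(sorted(near_a), sorted(sorted(c) for c in clusters))
```

Connected components is a coarse clustering: it only separates papers with no citation path between them, whereas Louvain can split a single connected component into denser communities.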
Generate a structured comparison table across any set of cached papers. With an LLM configured, it extracts dimensions (method, dataset, metric, result) from each abstract. Without one, metadata fields are used and a notice is printed, so the command never fails silently.
```bash
# Compare top-cited papers on a topic
mosaic compare --query "diffusion models" --sort citations -n 15 --output comparison.md

# Custom dimensions from a BibTeX file
mosaic compare --from refs.bib --dimensions "method,dataset,BLEU,limitations"
```
Re-rank any result set by relevance to the query. BM25 is the default scorer (no model, no network, instant). Configure your LLM for higher-quality scores.
```bash
mosaic search "diffusion models" --sort relevance            # live, ranked
mosaic search "diffusion models" --cached --sort relevance   # offline, from local cache
```
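For readers unfamiliar with BM25, the sketch below scores a few titles against a query using the standard Okapi BM25 formula with a non-negative IDF variant. The tokenizer, the parameter values `k1=1.5` and `b=0.75`, and the sample titles are assumptions for the example and may not match what MOSAIC uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    query_terms = sorted(set(tokenize(query)))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalisation: longer documents are penalised via b.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "Denoising diffusion probabilistic models",
    "Graph neural networks for molecules",
    "Score-based generative models and diffusion",
]
scores = bm25_scores("diffusion models", docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print([docs[i] for i in ranked])
```

BM25 matches on shared terms rather than meaning, which is why it needs no model and runs offline; an LLM-based scorer can additionally reward paraphrases that share no words with the query.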
Turn any search into a Google NotebookLM notebook in one command. Podcast, video overview, slides, quiz, flashcards, mind map, briefing doc: all queued automatically.
Claude Code Skill & AI agent mode: `mosaic skill install`
MOSAIC ships a bundled Claude Code skill. Install it once and the `/mosaic` slash command gives Claude Code expert knowledge of every command, source shorthand, filter, export format, and scripting pattern, so you can describe your bibliography goal in plain English and let Claude Code build and run the right commands for you.
```bash
# Install into the current project's .claude/skills/ directory
mosaic skill install

# Or globally, for all your projects
mosaic skill install --global
```
The `search` and `similar` commands support `--json` for structured stdout: a clean `{status, query, count, papers[], errors[]}` envelope designed for piping, agent scripts, and CI:
```bash
# Pipe directly to jq
mosaic search "attention mechanism" --max 30 --oa-only --json \
  | jq -r '.papers[] | "\(.year) \(.doi) \(.title)"'

# Combine file export and stdout JSON in one run
mosaic search "FDTD methods" --json --output refs.bib
```
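The same envelope can be consumed from a Python script instead of `jq`. The sketch below assumes only the top-level keys named above and the `year`, `doi`, and `title` paper fields used in the `jq` example; the `"ok"` status value, the helper names, and the sample payload are assumptions, not documented behaviour.

```python
import json
import subprocess


def run_search(query, max_results=30):
    """Run `mosaic search ... --json` and return the decoded envelope (needs mosaic on PATH)."""
    proc = subprocess.run(
        ["mosaic", "search", query, "--max", str(max_results), "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)


def summarize(envelope):
    """Format each paper as 'year doi title', mirroring the jq one-liner."""
    return [
        f'{paper.get("year")} {paper.get("doi")} {paper.get("title")}'
        for paper in envelope.get("papers", [])
    ]


# Hypothetical payload in the documented envelope shape.
sample = json.loads(
    """
    {
      "status": "ok",
      "query": "attention mechanism",
      "count": 1,
      "papers": [
        {"year": 2017, "doi": "10.48550/arXiv.1706.03762", "title": "Attention Is All You Need"}
      ],
      "errors": []
    }
    """
)
lines = summarize(sample)
print("\n".join(lines))
```

In a real script, `summarize(run_search("attention mechanism"))` would replace the sample payload, and a non-empty `errors` list is worth checking before trusting `count`.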
Chief Yak Shaver & Accidental Package Maintainer: a Fortran programmer who needed one paper, opened 21 browser tabs, and six months later found himself maintaining a Python library.