MOSAIC: Multi-source Scientific Article Indexer and Collector

a vivid mosaic of open scientific literature, assembled in seconds

Quick start

bash
# Core install: all 21 sources and the full CLI
pipx install mosaic-search

# Everything at once: RAG, web UI, Louvain clustering, browser sessions, NotebookLM
pipx install 'mosaic-search[all]'
playwright install chromium   # browser binary (needed for auth and NotebookLM)

mosaic config --unpaywall-email you@example.com
mosaic search "attention is all you need" --oa-only --download

AI features

Local RAG: mosaic index / ask / chat

Index your cached papers once and interrogate your library in natural language. Four structured analysis modes produce cited, grounded answers:

| Mode | What it produces |
| --- | --- |
| synthesis | Comprehensive state-of-the-art summary |
| gaps | Open problems, contradictions, methodological limitations |
| compare | Side-by-side comparison of methods, datasets, metrics, results |
| extract | Per-paper structured extraction: Task · Method · Dataset · Metric · Result |

bash
pipx inject mosaic-search sqlite-vec          # install vector extension once

mosaic index                                  # embed all cached papers
mosaic ask "What are the main approaches to graph neural networks?" --show-sources
mosaic ask "What open problems remain in protein structure prediction?" --mode gaps
mosaic chat                                   # interactive multi-turn session

Runs entirely on your machine via Ollama or any OpenAI-compatible server.

→ RAG guide


Citation network: mosaic network

After enriching your cache with citation edges, explore the topology of your corpus: identify hub papers, cluster by community, and export the graph for downstream tools, all without leaving the terminal.

bash
# Enrich the citation graph (OpenAlex + CrossRef edges)
mosaic index --enrich-citations

# Most-connected papers across the full graph
mosaic network --top 10

# Topic subgraph with cluster report
mosaic network --query "transformer attention" --depth 2 --cluster --top 5

# Export for D3.js / Gephi / Mermaid
mosaic network --output graph.json
mosaic network --output graph.md

Louvain clustering via networkx (pipx inject mosaic-search networkx); falls back to connected components automatically.
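
The fallback described above can be illustrated with a minimal sketch. This is not MOSAIC's actual implementation: the function names `cluster` and `connected_components`, the toy paper IDs, and the choice to treat citation edges as undirected are all assumptions made for illustration. It uses networkx's `louvain_communities` when networkx can be imported, and a plain breadth-first search over connected components otherwise.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group nodes into connected components with a breadth-first search."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        # Assumption: citation direction is ignored for clustering.
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def cluster(edges, seed=0):
    """Return (method, communities), using Louvain only if networkx is available."""
    try:
        import networkx as nx
        from networkx.algorithms.community import louvain_communities
    except ImportError:
        # networkx missing (or too old to ship Louvain): use the fallback.
        return "connected-components", connected_components(edges)

    graph = nx.Graph()
    graph.add_edges_from(edges)
    return "louvain", [set(c) for c in louvain_communities(graph, seed=seed)]


# Toy citation edges with hypothetical paper IDs: two disconnected triangles.
edges = [("p1", "p2"), ("p2", "p3"), ("p1", "p3"),
         ("p4", "p5"), ("p5", "p6"), ("p4", "p6")]
method, communities = cluster(edges)
print(method, sorted(sorted(c) for c in communities))
```

Connected components are a coarser grouping than Louvain communities: a densely connected corpus may collapse into a single component, which is one reason to install networkx when cluster reports matter.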

→ Citation Network guide


Compare papers: mosaic compare

Generate a structured comparison table across any set of cached papers. With an LLM configured, it extracts dimensions (method, dataset, metric, result) from each abstract. Without one, metadata fields are used and a notice is printed, so the command never fails silently.

bash
# Compare top-cited papers on a topic
mosaic compare --query "diffusion models" --sort citations -n 15 --output comparison.md

# Custom dimensions from a BibTeX file
mosaic compare --from refs.bib --dimensions "method,dataset,BLEU,limitations"

→ Compare Papers guide


Relevance ranking: --sort relevance

Re-rank any result set by semantic similarity to the query. BM25 is the default (no model, no network, instant); configure your LLM for higher-quality scores.

bash
mosaic search "diffusion models" --sort relevance          # live, ranked
mosaic search "diffusion models" --cached --sort relevance # offline, from local cache
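
To show why the BM25 default needs neither a model nor a network, here is a minimal, self-contained sketch of BM25 scoring over a few toy titles. It is an illustration only: MOSAIC's actual tokenizer, scored fields, and parameters are not described here, so `tokenize`, `bm25_rank`, the sample titles, and the conventional defaults `k1=1.5` and `b=0.75` are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return (index, score) pairs sorted by descending BM25 score."""
    if not documents:
        return []
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    query_terms = set(tokenize(query))
    scores = []
    for index, toks in enumerate(tokenized):
        term_freq = Counter(toks)
        score = 0.0
        for term in query_terms:
            freq = term_freq[term]
            if freq == 0:
                continue
            # Non-negative IDF variant: ln((N - n + 0.5) / (n + 0.5) + 1).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            norm = freq + k1 * (1.0 - b + b * len(toks) / avg_len)
            score += idf * freq * (k1 + 1.0) / norm
        scores.append((index, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy titles, invented for illustration.
titles = [
    "Diffusion models for image synthesis",
    "A survey of graph neural networks",
    "Latent diffusion for protein structure prediction",
]
ranked = bm25_rank("diffusion models", titles)
for index, score in ranked:
    print(f"{score:.3f}  {titles[index]}")
```

BM25 is purely lexical: a title that matches both query terms outranks one that matches a single term, and a title sharing no terms scores zero however related it is. That limitation is what an LLM-based scorer is meant to address.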

→ Relevance ranking guide


NotebookLM: mosaic notebook

Turn any search into a Google NotebookLM notebook in one command. Podcast, video overview, slides, quiz, flashcards, mind map, briefing doc: all queued automatically.

bash
mosaic notebook create "Transformers" --query "attention mechanism" --oa-only --podcast --briefing

→ NotebookLM guide


Claude Code Skill & AI agent mode: mosaic skill install

MOSAIC ships a bundled Claude Code skill. Install it once and the /mosaic slash command gives Claude Code expert knowledge of every command, source shorthand, filter, export format, and scripting pattern, so you can describe your bibliography goal in plain English and let Claude Code build and run the right commands for you.

bash
# Install into the current project's .claude/skills/ directory
mosaic skill install

# Or globally, for all your projects
mosaic skill install --global

All search and similar commands support --json for structured stdout: a clean {status, query, count, papers[], errors[]} envelope designed for piping, agent scripts, and CI:

bash
# Pipe directly to jq
mosaic search "attention mechanism" --max 30 --oa-only --json \
  | jq -r '.papers[] | "\(.year)  \(.doi)  \(.title)"'

# Combine file export and stdout JSON in one run
mosaic search "FDTD methods" --json --output refs.bib
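
For agent scripts that prefer Python to jq, the same envelope can be consumed with the standard library. The sketch below is hedged: only the top-level keys (status, query, count, papers, errors) and the per-paper fields year, doi, and title appear in the examples above. The helper name `summarize`, the sample values (including the status string "ok"), and the assumption that fields may be missing or null are all invented for illustration.

```python
import json


def summarize(envelope_text):
    """Parse a --json envelope; return (count, errors, 'year  doi  title' lines)."""
    envelope = json.loads(envelope_text)
    lines = []
    for paper in envelope.get("papers", []):
        # Assumption: some sources may omit a field or return null; show "-" instead.
        year = paper.get("year") or "-"
        doi = paper.get("doi") or "-"
        title = paper.get("title") or "-"
        lines.append(f"{year}  {doi}  {title}")
    return envelope.get("count", len(lines)), envelope.get("errors", []), lines


# Hypothetical sample envelope; every value here is made up for the example.
sample = """
{"status": "ok", "query": "attention mechanism", "count": 1,
 "papers": [{"year": 2020, "doi": "10.0000/example", "title": "An example paper"}],
 "errors": []}
"""
count, errors, lines = summarize(sample)
print(count, errors)
for line in lines:
    print(line)
```

In a real pipeline the text passed to `summarize` would be the stdout of a `mosaic search ... --json` run; checking `errors` before using `papers` lets a CI job react to a partial failure rather than to an empty result.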

→ Agent Workflows guide


Architecture

Authors

Stefano Zaghi · stefano.zaghi@gmail.com

Chief Yak Shaver & Accidental Package Maintainer: Fortran programmer who needed one paper, opened 21 browser tabs, and six months later found himself maintaining a Python library

Andrea Giulianini

Grand Pixel Overlord & Architect of the Sacred Button: world-class web UI designer, responsible for making MOSAIC actually look good

Claude (Anthropic)

Omniscient Code Oracle & Tireless Rubber Duck: AI pair programmer, responsible for writing the boring parts so humans don't have to

Contributions are welcome.

License

MOSAIC is available under your choice of license: GPL-3.0-or-later, BSD-2-Clause, BSD-3-Clause, or MIT. See LICENSE.gpl3.md, LICENSE.bsd-2.md, LICENSE.bsd-3.md, LICENSE.mit.md.

© Stefano Zaghi