Usage
Basic search
mosaic search "transformer attention mechanism"

Queries all enabled sources and prints a results table:
# Title Authors Year Source OA PDF
1 Attention Is All You Need Vaswani et al. 2017 Semantic Scholar yes ✓
2 Attention Is All You Need Vaswani et al. 2017 arXiv yes ✓
...
10 result(s)

Duplicate entries (same DOI) are merged automatically.
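The merge step can be pictured with a short sketch. It is illustrative only, not MOSAIC's actual code: the record layout (plain dicts with a `doi` key), the case-insensitive DOI comparison, and the "first record wins, later records fill gaps" policy are all assumptions.

```python
from typing import Iterable


def merge_by_doi(records: Iterable[dict]) -> list[dict]:
    """Merge records that share a DOI; records without a DOI pass through."""
    merged: dict[str, dict] = {}
    result: list[dict] = []
    for rec in records:
        # Assumption: DOIs are compared case-insensitively.
        doi = (rec.get("doi") or "").strip().lower()
        if not doi:
            result.append(dict(rec))
            continue
        if doi not in merged:
            merged[doi] = dict(rec)
            result.append(merged[doi])
        else:
            # Assumption: the first record wins; later ones only fill gaps.
            for key, value in rec.items():
                if merged[doi].get(key) in (None, ""):
                    merged[doi][key] = value
    return result
```

Under these assumptions, a record that lacks a PDF link picks one up from a later duplicate, while its original source label is kept.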

Limit results per source
mosaic search "CRISPR gene editing" -n 25

The -n / --max flag controls how many results are requested from each source (default: 10).
Open-access filter
mosaic search "quantum computing" --oa-only

Hides papers that have neither is_open_access=true nor a known PDF link.
Search a single source
mosaic search "RNA velocity" --source epmc
mosaic search "neural ODE" --source arxiv
mosaic search "drug discovery" --source ss

| Shorthand | Source |
|---|---|
| arxiv | arXiv |
| ss | Semantic Scholar |
| sd | ScienceDirect |
| doaj | DOAJ |
| epmc | Europe PMC |
| pubmed | PubMed |
| pmc | PubMed Central |
| rxiv | bioRxiv / medRxiv |
Download PDFs
mosaic search "diffusion models image generation" --oa-only --download

For each result that has a PDF link (or a DOI that Unpaywall can resolve), the file is saved to ~/mosaic-papers/ with the naming pattern:

{FirstAuthorLastName}_{Year}_{Title_slug}.pdf

Already-downloaded files are skipped automatically.
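As a rough illustration of the naming pattern and the skip check, here is a minimal sketch. The slug rules (non-alphanumeric runs become underscores, 60-character cap), the way the last name is taken from the author string, and both function names are assumptions made for the example, not MOSAIC's documented behaviour.

```python
import re
from pathlib import Path


def paper_filename(first_author: str, year: int, title: str,
                   max_slug_len: int = 60) -> str:
    """Build {FirstAuthorLastName}_{Year}_{Title_slug}.pdf (hypothetical rules)."""
    parts = first_author.split()
    # Assumption: the last whitespace-separated token is the last name.
    last_name = parts[-1] if parts else "Unknown"
    # Assumption: runs of non-alphanumeric characters become one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    slug = slug[:max_slug_len].rstrip("_")
    return f"{last_name}_{year}_{slug}.pdf"


def needs_download(directory: Path, filename: str) -> bool:
    """Return False when the target file already exists, so it can be skipped."""
    return not (directory / filename).exists()


name = paper_filename("Ashish Vaswani", 2017, "Attention Is All You Need")
```

With these rules the example paper maps to `Vaswani_2017_Attention_Is_All_You_Need.pdf`.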

Download by DOI
mosaic get 10.48550/arXiv.1706.03762

Fetches a single paper by DOI. MOSAIC first checks the local cache, then tries Unpaywall if no PDF URL is known.

Bulk download from BibTeX or CSV
# Download all DOIs found in a BibTeX export (Zotero, JabRef, Mendeley…)
mosaic get --from refs.bib
# Download from a CSV with a 'doi' column
mosaic get --from references.csv
# Mark unresolvable papers as skipped instead of failed
mosaic get --from refs.bib --oa-only
# Save to a custom directory
mosaic get --from refs.bib --download-dir ~/papers

MOSAIC extracts every doi = {…} field from .bib files (no extra dependency) or reads the doi column from .csv files, deduplicates, and prints a per-entry result followed by a summary:
Found 42 DOI(s) in refs.bib
✓ 2017_arXiv_Vaswani_Attention_Is_All_You_Need.pdf
✓ 2019_Semantic_Scholar_Devlin_BERT.pdf
– 10.1016/j.celrep.2020.107834 (no OA copy)
…
Done: 35 downloaded, 4 failed, 3 skipped (no OA copy)

Filter by year
# Exact year
mosaic search "transformer" --year 2017
# Inclusive range
mosaic search "diffusion models" -y 2020-2023
# Explicit list of years
mosaic search "BERT" -y 2019,2020,2021

The year filter is passed to each source's native API where supported (arXiv, Semantic Scholar, ScienceDirect, Europe PMC, DOAJ) and is also applied as a post-processing step on all results as a safety net.
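The three accepted forms (exact year, inclusive range, comma-separated list) and the post-processing safety net could look roughly like the sketch below. The function names are hypothetical, and dropping papers that have no year is an assumption rather than documented behaviour.

```python
def parse_years(spec: str) -> set[int]:
    """Expand '2017', '2020-2023' or '2019,2020,2021' into a set of years."""
    years: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
            years.update(range(start, end + 1))  # the range is inclusive
        elif part:
            years.add(int(part))
    return years


def filter_by_year(records: list[dict], spec: str) -> list[dict]:
    """Post-processing pass applied to results from every source."""
    wanted = parse_years(spec)
    # Assumption: papers with a missing year are dropped.
    return [r for r in records if r.get("year") in wanted]
```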
Filter by author
# Single author substring match
mosaic search "attention" --author Vaswani
# Multiple authors (OR logic — paper must match at least one)
mosaic search "graph neural" -a Kipf -a Velickovic

Author matching is case-insensitive and substring-based, so --author Hinton matches "Geoffrey E. Hinton".
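A minimal sketch of the matching rule described above: case-insensitive substring comparison, with OR logic across repeated `-a` values. The function name and the list-of-strings author representation are assumptions.

```python
def matches_any_author(paper_authors: list[str], wanted: list[str]) -> bool:
    """True if any requested name is a substring of any author name."""
    lowered = [author.lower() for author in paper_authors]
    return any(name.lower() in author for name in wanted for author in lowered)
```

In this sketch the comparison is on raw characters, so `Velickovic` would not match an author stored as "Veličković". Whether MOSAIC normalises diacritics is not stated here.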
Filter by journal
# Substring match against the journal name
mosaic search "protein structure" --journal "Nature"
mosaic search "RNA sequencing" -j "Nucleic Acids"

Combining filters
Filters compose with AND logic (year AND author AND journal must all match):
# Papers by Vaswani from 2017, open-access, download PDFs
mosaic search "attention mechanism" -y 2017 -a Vaswani --oa-only --download
# Papers in Nature between 2020 and 2023, from Europe PMC
mosaic search "CRISPR" -j Nature -y 2020-2023 --source epmc -n 25
# Broad search, open-access only, download everything, from arXiv
mosaic search "large language model" -n 50 --source arxiv --oa-only --download
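One way to picture the AND composition is as a chain of predicates that must all hold. This is a sketch under assumed field names (`year`, `authors`, `is_open_access`, `pdf_url`), and the helper `all_of` is hypothetical.

```python
from typing import Callable

Paper = dict
Predicate = Callable[[Paper], bool]


def all_of(*predicates: Predicate) -> Predicate:
    """Combine predicates with AND: a paper is kept only if every one passes."""
    return lambda paper: all(p(paper) for p in predicates)


def year_is_2017(paper: Paper) -> bool:
    return paper.get("year") == 2017


def by_vaswani(paper: Paper) -> bool:
    return any("vaswani" in a.lower() for a in paper.get("authors", []))


def is_open(paper: Paper) -> bool:
    # Mirrors --oa-only: keep if flagged open access OR a PDF link is known.
    return bool(paper.get("is_open_access") or paper.get("pdf_url"))


keep = all_of(year_is_2017, by_vaswani, is_open)
```

`keep` corresponds loosely to `-y 2017 -a Vaswani --oa-only` in the first example above.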
Verbose mode
Add --verbose to any search to see a per-source breakdown and deduplication report printed before the results table:
mosaic search "transformer attention" --verbose

╭─ Search stats ────────────────────────────────────────────────╮
│ Sources arXiv, Semantic Scholar, OpenAlex, Crossref │
│ Raw arXiv=12 Semantic Scholar=18 OpenAlex=15 … → 54 total │
│ Unique 31 papers (23 merged by DOI) │
│ Filters none │
╰───────────────────────────────────────────────────────────────╯

Useful for tuning source selection and understanding which sources contribute unique results.
Sort results
Use --sort to rank results after the search:
# Most-cited papers first (citation count from Semantic Scholar and OpenAlex)
mosaic search "transformer attention" --sort citations
# Newest papers first
mosaic search "diffusion models" --sort year
# Combine with other flags
mosaic search "protein folding" --oa-only --sort citations -n 20

When --sort citations is active, the results table gains a Cited column showing the citation count for each paper. Papers from sources that do not return citation data (arXiv, DOAJ, …) show – and are placed after all papers with known counts.
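The "unknown counts go last" ordering can be expressed with a two-part sort key. This is a sketch, not MOSAIC's implementation; the `citation_count` field name is taken from the CSV column list, and `None` standing for "no data" is an assumption.

```python
def sort_by_citations(papers: list[dict]) -> list[dict]:
    """Most-cited first; papers without citation data are placed at the end."""
    def key(paper: dict) -> tuple[bool, int]:
        count = paper.get("citation_count")
        # False sorts before True, so known counts come first;
        # negating the count gives descending order within that group.
        return (count is None, -(count or 0))
    return sorted(papers, key=key)


def cited_cell(paper: dict) -> str:
    """Text for the Cited column: the count, or an en dash when unknown."""
    count = paper.get("citation_count")
    return "–" if count is None else str(count)
```

A paper with a known count of 0 still sorts ahead of one with no data, because the first element of the key separates the two groups.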
Save results to a file
Use --output / -o to write results to disk in any of five formats. The format is inferred from the file extension:
mosaic search "transformer attention" --output results.csv
mosaic search "diffusion models" -y 2022-2024 --oa-only --output refs.bib
mosaic search "CRISPR" --output papers.json
mosaic search "protein folding" --output summary.md
mosaic search "RNA velocity" --output report.markdown

Supported formats

| Extension | Format | Best for |
|---|---|---|
| .csv | CSV table | Excel, Google Sheets, data analysis |
| .json | JSON array | Scripting, pipelines, custom tooling |
| .bib | BibTeX | LaTeX, Zotero, JabRef, Mendeley |
| .md | Markdown table | Quick README or wiki table |
| .markdown | Markdown sections | Detailed per-paper notes, static-site generators |
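Inferring the format from the file extension amounts to a lookup on the path suffix. The sketch below assumes a case-insensitive match and an error on unknown extensions; the internal format names and the function name are invented for illustration.

```python
from pathlib import Path

# Hypothetical internal names for the five documented formats.
FORMATS = {
    ".csv": "csv",
    ".json": "json",
    ".bib": "bibtex",
    ".md": "markdown-table",
    ".markdown": "markdown-sections",
}


def infer_format(path: str) -> str:
    """Map an output path to a format name based on its extension."""
    suffix = Path(path).suffix.lower()  # assumption: case-insensitive
    if suffix not in FORMATS:
        raise ValueError(f"unsupported output extension: {suffix!r}")
    return FORMATS[suffix]
```

Note that `.md` and `.markdown` select different layouts, so the extension is a real choice and not just a spelling variant.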
Format details
.csv — 14 columns: title, authors (semicolon-separated), year, doi, arxiv_id, journal, volume, issue, pages, source, is_open_access, citation_count, pdf_url, url.
.json — JSON array of objects; authors is a native JSON array; null for missing fields; pretty-printed with 2-space indentation.
.bib — @article for papers with a journal, @misc for preprints. Auto-generated cite key: LastName + Year + FirstTitleWord (e.g. Vaswani2017Attention). ArXiv papers get eprint and eprinttype=arXiv fields. Open-access papers get note={Open Access}.
.md — a single compact Markdown table (columns: #, Title, Authors, Year, DOI, Source, OA, PDF).
.markdown — one ## Title section per paper, each containing a key/value table of all available fields (abstract included); empty fields are omitted.
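For the .bib rules above, a small sketch shows how the cite key and entry type might be derived. It reproduces the documented example (`Vaswani2017Attention`), but the handling of multi-word last names, leading stop words, and missing fields is an assumption.

```python
import re


def cite_key(first_author: str, year: int, title: str) -> str:
    """LastName + Year + FirstTitleWord, e.g. Vaswani2017Attention."""
    name_parts = first_author.split()
    # Assumption: the last whitespace-separated token is the last name.
    last_name = name_parts[-1] if name_parts else "Anon"
    words = re.findall(r"[A-Za-z0-9]+", title)
    # Assumption: the first word is used as-is, even if it is "The" or "A".
    first_word = words[0] if words else ""
    return f"{last_name}{year}{first_word}"


def entry_type(paper: dict) -> str:
    """@article when a journal is present, @misc otherwise (preprints)."""
    return "@article" if paper.get("journal") else "@misc"
```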
Export multiple formats in one command
--output is repeatable — pass it more than once to write several files simultaneously without re-running the search:
# Write BibTeX, CSV, and a Markdown summary in one go
mosaic search "large language models" -n 30 --oa-only \
  --output refs.bib \
  --output results.csv \
  --output summary.md

Combine with other flags
# Open-access papers from 2020–2024, sorted by citations, saved as BibTeX
mosaic search "diffusion models" -y 2020-2024 --oa-only --sort citations \
  --output diffusion.bib
# Single-source search saved as JSON for scripting
mosaic search "RNA splicing" --source epmc -n 50 --output splicing.json
# Author filter + journal filter, saved as detailed Markdown notes
mosaic search "graph neural" -a Kipf -j "ICLR" --output gnns.markdown

Works with mosaic similar too
--output is available on the similar command with the same formats:
# Find related papers and export to BibTeX for LaTeX
mosaic similar 10.48550/arXiv.1706.03762 --output related.bib
# Export to multiple formats at once
mosaic similar arxiv:1706.03762 -n 30 --sort citations \
  --output related.bib \
  --output related.json

Export to Zotero
Push results directly into your Zotero library — no copy-paste required.
# Local API (Zotero must be running)
mosaic search "CRISPR" --oa-only --zotero
# Push to a named collection
mosaic search "transformers" --zotero --zotero-collection "Deep Learning"
# Download PDFs and link them in Zotero
mosaic search "diffusion models" --download --zotero --zotero-collection "Generative AI"

For the web API (Zotero not running locally), configure once:

mosaic config --zotero-key YOUR_API_KEY

Then use --zotero as normal; MOSAIC will talk to api.zotero.org.

See the Zotero Integration guide for the full setup and all options.
Find similar papers
mosaic similar discovers related literature from any DOI or arXiv ID — no search query needed.
mosaic similar 10.48550/arXiv.1706.03762

Similar to: Attention Is All You Need
# Title Authors Year Source OA PDF
1 BERT: Pre-training of Deep Bidirectional… Devlin et al. 2019 OpenAlex yes ✓
2 Language Models are Few-Shot Learners Brown et al. 2020 OpenAlex no –
...

# arXiv prefix, sort by citations, open-access only
mosaic similar arxiv:1706.03762 -n 20 --sort citations --oa-only
# Save to BibTeX for Zotero / LaTeX
mosaic similar 10.48550/arXiv.1706.03762 --output related.bib

Two sources contribute results:
- OpenAlex related_works: always queried, no key required
- Semantic Scholar recommendations: used when ss-key is set in config (dramatically increases recall)
See the Find Similar Papers guide for the full reference, identifier formats, and workflow tips.

