Authenticated Access
Some publishers and institutional repositories do not offer a public API but do grant access to logged-in users — for example via a university single sign-on (Shibboleth/SAML), a personal account, or a library proxy.
MOSAIC can open a real browser window, let you log in interactively, save the session (cookies and local storage), and reuse it in future runs to download PDFs you are legally entitled to access — without asking you to log in again every time.
Prerequisite
This feature requires the [browser] optional extra and at least one Playwright browser. See Installation for setup instructions.
How it works
Saving a session
mosaic auth login <name> --url <login-url>
- A headed (visible) browser window opens at <login-url>.
- You log in manually (SSO, username/password, 2FA), whatever the site requires.
- Wait until you are fully logged in: the publisher page should show your name or a "Log out" link. Do not press Enter during the login redirect chain.
- Switch back to the terminal and press Enter.
- MOSAIC saves the browser session (cookies and local storage) to ~/.config/mosaic/sessions/<name>.json and records the login domain alongside it.
Complete the login before pressing Enter
If you press Enter while the browser is still going through SSO redirects, the session will be saved in a partially authenticated state and downloads will fail silently. Always verify the publisher page shows you as logged in before returning to the terminal.
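The login flow above can be sketched with Playwright's Python API. This is a minimal illustration, not MOSAIC's actual implementation: the function names `session_path`, `login_domain`, and `save_session` are hypothetical, and how MOSAIC records the login domain alongside the file is not shown here.

```python
from pathlib import Path
from urllib.parse import urlparse

SESSIONS_DIR = Path.home() / ".config" / "mosaic" / "sessions"


def session_path(name: str, base: Path = SESSIONS_DIR) -> Path:
    """Return the storageState file path for a session label."""
    return base / f"{name}.json"


def login_domain(url: str) -> str:
    """Extract the host that a session would be associated with."""
    return urlparse(url).hostname or ""


def save_session(name: str, url: str, base: Path = SESSIONS_DIR) -> Path:
    """Open a headed browser, wait for a manual login, then save the session."""
    # Imported lazily so the helpers above work without the browser extra.
    from playwright.sync_api import sync_playwright

    target = session_path(name, base)
    target.parent.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.firefox.launch(headless=False)  # visible window
        context = browser.new_context()
        page = context.new_page()
        page.goto(url)
        # Blocks until the user confirms that the login has fully completed.
        input("Finish logging in, then press Enter here... ")
        context.storage_state(path=str(target))  # cookies + local storage
        browser.close()
    return target
```

The state is written only after Enter is pressed, which is why pressing Enter in the middle of an SSO redirect captures a half-authenticated session.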
Automatic download fallback
When you run mosaic search --download or mosaic get, the downloader tries three strategies in order:
- Known PDF URL: used directly if the search source returned one.
- Unpaywall: resolves a legal open-access copy by DOI.
- Browser session: if the first two strategies fail, MOSAIC iterates over all saved sessions, opens a headless browser with the session cookies, navigates to the paper page, and downloads the PDF automatically.
No extra flags are needed — the browser step runs silently whenever saved sessions are found and the other methods come up empty.
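The ordered fallback described above follows a common "first strategy that succeeds wins" pattern. The sketch below shows that pattern only; the three strategy functions are hypothetical stand-ins that return fake bytes, and MOSAIC's real downloader may be structured differently.

```python
from typing import Callable, Iterable, Optional, Tuple

# A strategy takes a paper record and returns PDF bytes, or None on failure.
Strategy = Callable[[dict], Optional[bytes]]


def download_with_fallback(
    paper: dict, strategies: Iterable[Tuple[str, Strategy]]
) -> Optional[Tuple[str, bytes]]:
    """Try each strategy in order; return (strategy_name, pdf_bytes) or None."""
    for name, strategy in strategies:
        try:
            pdf = strategy(paper)
        except Exception:
            pdf = None  # a failing strategy hands over to the next one
        if pdf:
            return name, pdf
    return None


# Hypothetical stand-ins for the three strategies (no network access).
def known_pdf_url(paper: dict) -> Optional[bytes]:
    return b"%PDF-direct" if paper.get("pdf_url") else None


def unpaywall(paper: dict) -> Optional[bytes]:
    return b"%PDF-oa" if paper.get("open_access") else None


def browser_session(paper: dict) -> Optional[bytes]:
    return b"%PDF-session" if paper.get("subscribed") else None


STRATEGIES = [
    ("known_pdf_url", known_pdf_url),
    ("unpaywall", unpaywall),
    ("browser_session", browser_session),
]
```

With these stand-ins, a record that only has `subscribed` set falls through the first two strategies and is served by `browser_session`.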
Domain matching
Each session is associated with the domain of the --url you passed at login time (e.g. link.springer.com for a Springer login). When downloading, MOSAIC first tries to match the paper's URL domain against saved session domains. If no direct match is found (e.g. the DOI resolves via an intermediate hub like linkinghub.elsevier.com), MOSAIC falls back to trying all saved sessions in order — the browser follows the full redirect chain including JavaScript redirects.
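A minimal sketch of this matching rule, assuming sessions are available as a mapping from session name to login domain and that a "match" means an exact host comparison (both are assumptions; `candidate_sessions` is a hypothetical name):

```python
from typing import Dict, List
from urllib.parse import urlparse


def candidate_sessions(paper_url: str, sessions: Dict[str, str]) -> List[str]:
    """Return session names to try, given {session_name: login_domain}."""
    host = urlparse(paper_url).hostname or ""
    matches = [name for name, domain in sessions.items() if domain == host]
    # No direct match: fall back to every saved session, in saved order.
    return matches or list(sessions)
```

For example, a paper URL on link.springer.com selects only the Springer session, while a linkinghub.elsevier.com URL matches nothing directly and therefore yields every saved session.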
Publisher compatibility
Not all publishers are equally automatable. The feature works best with publishers that use standard cookie-based session management.
| Publisher | Search | PDF download | Notes |
|---|---|---|---|
| Springer Nature | ✅ | ✅ | Search is public (no login needed); session used for PDF download of subscribed content |
| Wiley | — | ✅ | Standard cookie session |
| Taylor & Francis | — | ✅ | Standard cookie session |
| Cambridge University Press | — | ✅ | Standard cookie session |
| ScienceDirect (Elsevier) | ✅ | ⚠️ | Search works via browser session. PDF download blocked by Cloudflare on the /pdfft/ endpoint — falls back to Unpaywall. See ScienceDirect notes below. |
| Scopus (Elsevier) | ✅ | — | Search works via browser session (shares id.elsevier.com SSO with ScienceDirect). Scopus does not expose PDF links; Unpaywall is used as fallback. See Scopus notes below. |
Wiley, Taylor & Francis, Cambridge
These publishers do not yet have a dedicated browser-based search source. The saved session is used only for PDF downloading of papers found via other sources (arXiv, Semantic Scholar, OpenAlex, Springer, etc.).
Springer Nature
Springer search works without a saved session — the search interface is publicly accessible and MOSAIC uses headless Firefox automatically. A session is only needed to download PDFs of subscribed articles.
Login URL: https://link.springer.com (homepage). Click Log in → Log in via Institution from there.
Session storage
Sessions are stored as standard Playwright storageState files:
~/.config/mosaic/sessions/
  springer.json
  wiley.json
  myuni.json
Each file contains only browser cookies and local storage; no passwords are ever stored.
Sessions expire when the website's cookies expire (typically days to weeks for most publishers, up to a year for Cloudflare-protected sites like ScienceDirect). Re-run mosaic auth login to refresh.
MOSAIC checks cookie expiry timestamps when deciding whether to activate a browser-based search source. If all timed cookies in a session file have passed their expiry, the source is excluded from active sources at startup — you will not see it in results and no browser is launched. Run mosaic auth status to see which sessions are still valid.
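The expiry rule can be illustrated against the Playwright storageState layout, in which each entry of the top-level `cookies` list carries an `expires` Unix timestamp and session cookies use `-1`. The function below is a sketch of the rule as described, not MOSAIC's code. Treating a file with no timed cookies as valid is an assumption made here, because the text does not say how that case is handled.

```python
import json
import time
from pathlib import Path
from typing import Optional


def session_is_valid(state: dict, now: Optional[float] = None) -> bool:
    """True if at least one timed cookie in a storageState dict is unexpired."""
    now = time.time() if now is None else now
    # expires == -1 marks a session cookie, which carries no timestamp.
    timed = [
        cookie["expires"]
        for cookie in state.get("cookies", [])
        if cookie.get("expires", -1) > 0
    ]
    if not timed:
        return True  # assumption: nothing to judge by, so do not exclude
    return any(expiry > now for expiry in timed)


def load_state(path: Path) -> dict:
    """Read a saved storageState file from disk."""
    return json.loads(path.read_text(encoding="utf-8"))
```

A call such as `session_is_valid(load_state(Path.home() / ".config/mosaic/sessions/springer.json"))` would then give a rough equivalent of the Valid column.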
Commands
mosaic auth login
mosaic auth login [OPTIONS] NAME
| Argument / Option | Description |
|---|---|
| NAME | Arbitrary label for the session (e.g. springer, myuni) |
| --url / -u | URL to open in the browser (required) |
MOSAIC tries browsers in order: Firefox → Chromium → WebKit. The first one that is installed is used automatically.
Why Firefox first?
Firefox's TLS fingerprint passes Cloudflare Bot Management on sites like ScienceDirect where headless Chromium is blocked. Using the same browser for both login and headless reuse also ensures that Cloudflare session cookies (cf_clearance) remain valid — they are bound to the browser's TLS fingerprint.
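The browser preference order amounts to picking the first installed entry from a fixed list. In the sketch below, `first_installed` and `installed_via_playwright` are hypothetical names; the probe relies on Playwright's `BrowserType.executable_path` property and is only one possible way to detect an installed browser.

```python
from pathlib import Path
from typing import Callable, Optional, Sequence

BROWSER_ORDER = ("firefox", "chromium", "webkit")


def first_installed(
    is_installed: Callable[[str], bool],
    order: Sequence[str] = BROWSER_ORDER,
) -> Optional[str]:
    """Return the first browser in `order` that the probe reports as installed."""
    for name in order:
        if is_installed(name):
            return name
    return None


def installed_via_playwright(name: str) -> bool:
    """Check whether Playwright's bundled executable for `name` exists on disk."""
    # Imported lazily so first_installed() works without the browser extra.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        return Path(getattr(p, name).executable_path).exists()
```

Calling `first_installed(installed_via_playwright)` returns `"firefox"` whenever Firefox is installed, which keeps the login and the later headless reuse on the same browser.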
Examples:
# Log in to Springer Nature
mosaic auth login springer --url https://link.springer.com
# Log in to Wiley
mosaic auth login wiley --url https://onlinelibrary.wiley.com
# Log in via your university SSO
mosaic auth login myuni --url https://library.myuni.edu/login
# Log in to ScienceDirect (Elsevier) — see compatibility note above
mosaic auth login elsevier --url https://www.sciencedirect.com
mosaic auth status
List all saved sessions:
mosaic auth status
Name      Domain              Saved             Valid   Path
elsevier  www.sciencedirect…  2026-03-09 11:21  ✓       ~/.config/mosaic/sessions/elsevier.json
springer  link.springer.com   2026-03-09 10:14  ✓       ~/.config/mosaic/sessions/springer.json
myuni     library.myuni.edu   2026-02-01 09:15  ✗ exp   ~/.config/mosaic/sessions/myuni.json
- Domain: which URLs the session will be tried for during automatic download.
- Valid: MOSAIC inspects cookie expiry timestamps in the saved file. ✓ means at least one timed cookie is still active; ✗ expired means all timed cookies have passed their expiry date and the session will be excluded from active sources until refreshed.

mosaic auth logout
Remove a saved session:
mosaic auth logout springer
ScienceDirect (Elsevier)
For ScienceDirect, a browser session enables search, while PDF download support is only partial (see PDF download limitation below). The behaviour depends on which credentials are configured:
| Credentials | What MOSAIC does |
|---|---|
| API key | Uses the Elsevier Article Search API (fast, reliable). PDF via Unpaywall. |
| Browser session (no API key) | Uses headless Firefox to run searches on sciencedirect.com. PDF via Unpaywall only. |
| Neither | ScienceDirect source is skipped entirely. |
The API key always takes precedence when both are present.
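The precedence in the table above reduces to a three-way decision. The function below is an illustrative sketch; `sciencedirect_mode` and its return labels are hypothetical and not taken from MOSAIC.

```python
from typing import Optional


def sciencedirect_mode(api_key: Optional[str], has_valid_session: bool) -> str:
    """Pick how the ScienceDirect source runs: 'api', 'browser', or 'skip'."""
    if api_key:
        return "api"  # the API key wins even when a session also exists
    if has_valid_session:
        return "browser"  # headless search on sciencedirect.com
    return "skip"  # neither credential is available
```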
Saving the ScienceDirect session
mosaic auth login elsevier --url https://www.sciencedirect.com
Complete the full institutional SSO flow until your name appears on ScienceDirect, then press Enter. Do not press Enter during intermediate redirects.
Same browser for login and search
MOSAIC uses Firefox for both headed login and headless search. The Cloudflare cf_clearance cookie is bound to the browser's TLS fingerprint — if you log in with Chromium and search with Firefox (or vice versa) the session is rejected and you will be redirected to the SSO page. Ensure Firefox is installed (playwright install firefox) before logging in.
PDF download limitation
The ScienceDirect PDF endpoint (/pdfft/) enforces Cloudflare Bot Management rules that are stricter than article pages. Even with a valid institutional session, automated PDF downloads from this endpoint are blocked. The browser session therefore enables search only; PDF retrieval always falls back to Unpaywall for open-access copies.
For reliable PDF access to subscribed content, use the API key combined with campus IP or institutional VPN — see ScienceDirect configuration.
Session expiry and warnings
Elsevier's Cloudflare session cookies (cf_clearance) typically expire after a year, while the shorter-lived __cf_bm cookie (30 minutes) is refreshed automatically during each browser session.
MOSAIC detects expiry at two points:
- At startup: cookie timestamps in the session file are checked. An expired session is excluded from active sources before any browser is launched. mosaic auth status shows a ✗ in the Valid column.
- During search: if the headless browser is redirected to the Elsevier SSO page mid-search, MOSAIC prints a clear warning and skips the source:
ScienceDirect session has expired. Run: mosaic auth login elsevier --url https://www.sciencedirect.com
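One plausible way to detect the mid-search redirect is to compare the host of the page's final URL against the SSO host. This is a sketch under that assumption: `redirected_to_sso` is a hypothetical helper, and the text does not state how MOSAIC actually detects the redirect.

```python
from urllib.parse import urlparse

SSO_HOST = "id.elsevier.com"


def redirected_to_sso(final_url: str, sso_host: str = SSO_HOST) -> bool:
    """True if the browser ended up on the SSO host (or one of its subdomains)."""
    host = urlparse(final_url).hostname or ""
    return host == sso_host or host.endswith("." + sso_host)
```

After a navigation, the current page URL would be passed as `final_url`; a `True` result corresponds to the warning shown above.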
To refresh the session:
mosaic auth login elsevier --url https://www.sciencedirect.com
Scopus (Elsevier)
Scopus and ScienceDirect are both Elsevier products that use the same id.elsevier.com SSO. A single browser session can be used for both.
| Credentials | What MOSAIC does |
|---|---|
| API key | Uses the Elsevier Scopus Search API (fast, reliable). |
| Browser session (no API key) | Uses headless Firefox to search via the advanced-search form. |
| Neither | Scopus source is skipped entirely. |
Saving the Scopus session
mosaic auth login scopus --url https://www.scopus.com
Complete the full institutional SSO flow until the Scopus homepage shows your name, then press Enter.
Share the session with ScienceDirect
Because both sites use id.elsevier.com for authentication, a Scopus login will also enable the ScienceDirect browser source (and vice versa). You only need to save one session — MOSAIC matches saved sessions by domain, and the Elsevier SSO cookies are valid across scopus.com and sciencedirect.com.
Same browser for login and search
Use Firefox for both headed login and headless search (same as ScienceDirect). Install it before logging in: playwright install firefox.
Notes on search and selectors
Scopus uses a heavily JavaScript-rendered interface. The browser source submits queries via the advanced-search form using full Scopus boolean syntax (e.g. TITLE-ABS-KEY("transformer") AND PUBYEAR > 2019). Because the Scopus frontend can change independently of MOSAIC releases, the CSS selectors used to extract result rows may occasionally need updating.
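Queries in the boolean syntax shown above can be assembled with a small helper. The sketch below covers only the two constructs from the example (`TITLE-ABS-KEY` and `PUBYEAR >`); `scopus_query` is a hypothetical name and is not part of MOSAIC.

```python
from typing import Iterable, Optional


def scopus_query(terms: Iterable[str], year_after: Optional[int] = None) -> str:
    """Build a Scopus advanced-search string from phrases and an optional year."""
    # Drop embedded double quotes so that each phrase stays a single literal.
    clauses = ['TITLE-ABS-KEY("{}")'.format(term.replace('"', "")) for term in terms]
    query = " AND ".join(clauses)
    if year_after is not None:
        year_clause = f"PUBYEAR > {year_after}"
        query = f"{query} AND {year_clause}" if query else year_clause
    return query
```

`scopus_query(["transformer"], 2019)` reproduces the example string from the paragraph above.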
If search returns empty results after a known-working login, open an issue at github.com/szaghi/mosaic.
Legal and ethical notice
This feature is designed for users who have legitimate access to the content they download — through a personal subscription, an institutional licence, or any other legal right. It automates what you would do manually in a browser.
MOSAIC does not circumvent paywalls, DRM, or access controls for content you do not have the right to access. Using this feature to download content without authorisation may violate the site's terms of service and applicable law.

