scitex_scholar.citation_graph
Citation Graph Module
Build and analyze citation networks for academic papers using CrossRef data.
This module provides tools to:
- Extract citation relationships
- Calculate paper similarity (co-citation, bibliographic coupling)
- Build citation network graphs
- Export for visualization (D3.js, vis.js, Cytoscape)
- Example (local SQLite):
>>> from scitex_scholar.citation_graph import CitationGraphBuilder
>>> builder = CitationGraphBuilder(db_path="/path/to/crossref.db")
>>> graph = builder.build("10.1038/s41586-020-2008-3", top_n=20)
- Example (HTTP via crossref-local):
>>> builder = CitationGraphBuilder(api_url="http://localhost:31291")
>>> graph = builder.build("10.1038/s41586-020-2008-3", top_n=20)
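The two similarity measures named in the module summary have standard definitions: bibliographic coupling counts the references two papers share, and co-citation counts the papers that cite both. The sketch below computes both on a toy reference map. The paper identifiers and helper names are made up for illustration, and this is not necessarily how the module computes similarity internally.

```python
from typing import Dict, Set

# Toy citation data: each paper maps to the set of papers it cites.
# The identifiers are invented for illustration only.
references: Dict[str, Set[str]] = {
    "A": {"X", "Y", "Z"},
    "B": {"Y", "Z", "W"},
    "C": {"A", "B"},
    "D": {"A", "B", "X"},
}


def bibliographic_coupling(p: str, q: str) -> int:
    """Number of references that p and q have in common."""
    return len(references.get(p, set()) & references.get(q, set()))


def co_citation(p: str, q: str) -> int:
    """Number of papers whose reference lists contain both p and q."""
    return sum(1 for refs in references.values() if p in refs and q in refs)


coupling_ab = bibliographic_coupling("A", "B")  # A and B both cite Y and Z
cocitation_ab = co_citation("A", "B")           # C and D each cite both A and B
```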
- class scitex_scholar.citation_graph.CitationGraphBuilder(db_path=None, api_url=None)[source]
Bases: object
Build citation network graphs for academic papers.
Auto-detects backend via crossref_local.Config (DB → HTTP).
- Example (auto-detect):
>>> builder = CitationGraphBuilder()
>>> graph = builder.build("10.1038/s41586-020-2008-3", top_n=20)
- Example (explicit SQLite):
>>> builder = CitationGraphBuilder(db_path="/path/to/crossref.db")
- Example (explicit HTTP):
>>> builder = CitationGraphBuilder(api_url="http://localhost:31291")
- __init__(db_path=None, api_url=None)[source]
Initialize builder with database path, HTTP API URL, or auto-detect.
When no arguments are given, delegates to crossref_local.Config for auto-detection, checking in this order:
1. CROSSREF_LOCAL_MODE env var (explicit “db” or “http”)
2. CROSSREF_LOCAL_API_URL env var → HTTP mode
3. Local DB file existence → DB mode
4. Fallback to HTTP mode
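The detection order above can be sketched as a small pure function. This is an illustrative reimplementation, not the code in crossref_local.Config: the function name `resolve_mode` and its parameters are hypothetical, and only the two environment variable names come from the description.

```python
import os
from typing import Mapping, Optional


def resolve_mode(env: Mapping[str, str], db_file: Optional[str]) -> str:
    """Return "db" or "http" following the four-step order described above.

    Hypothetical helper: `env` stands in for os.environ and `db_file`
    for whatever default database path the real configuration uses.
    """
    mode = env.get("CROSSREF_LOCAL_MODE", "").strip().lower()
    if mode in ("db", "http"):
        return mode  # 1. an explicit mode wins
    if env.get("CROSSREF_LOCAL_API_URL"):
        return "http"  # 2. a configured API URL implies HTTP mode
    if db_file and os.path.isfile(db_file):
        return "db"  # 3. an existing local DB file implies DB mode
    return "http"  # 4. fallback
```

Passing the environment as an argument rather than reading `os.environ` directly keeps the function easy to test; a caller would use `resolve_mode(os.environ, some_path)`.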
- build(seed_doi, top_n=20, weight_coupling=2.0, weight_cocitation=2.0, weight_direct=1.0)[source]
Build citation network around a seed paper.
- Parameters:
- Return type:
- Returns:
CitationGraph object with nodes and edges
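The three weight parameters suggest that candidate papers are ranked by a weighted combination of bibliographic coupling, co-citation, and direct citation counts, with the `top_n` best kept. The documentation does not state the exact formula, so the sketch below assumes a plain linear sum. The function name, the candidate DOIs, and the counts are all hypothetical.

```python
from typing import Dict, List, Tuple


def combined_score(
    coupling: int,
    cocitation: int,
    direct: int,
    weight_coupling: float = 2.0,
    weight_cocitation: float = 2.0,
    weight_direct: float = 1.0,
) -> float:
    """Assumed scoring rule: a linear combination of the three counts."""
    return (
        weight_coupling * coupling
        + weight_cocitation * cocitation
        + weight_direct * direct
    )


# Hypothetical candidates: DOI -> (coupling, co-citation, direct) counts.
candidates: Dict[str, Tuple[int, int, int]] = {
    "10.1000/example-a": (3, 1, 1),
    "10.1000/example-b": (0, 2, 0),
    "10.1000/example-c": (1, 0, 1),
}

top_n = 2
ranked: List[Tuple[str, float]] = sorted(
    ((doi, combined_score(*counts)) for doi, counts in candidates.items()),
    key=lambda item: item[1],
    reverse=True,
)[:top_n]
```

With the default weights, shared references and shared citers count twice as much as a direct citation link, which favors topically related papers over papers that merely cite the seed.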
- _build_citation_edges(dois)[source]
Build citation edges between papers in the network.
- Parameters:
- Return type:
- Returns:
List of CitationEdge objects
- build_from_dois(dois, num_related_per_doi=20, weight_coupling=2.0, weight_cocitation=2.0, weight_direct=1.0)[source]
Build citation network from multiple seed DOIs.
Combines similarity scores from all seeds to find papers related to the entire set, producing a richer connected graph.
- Parameters:
- Return type:
- Returns:
CitationGraph with all seeds + related papers + edges
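One way to combine similarity scores across seeds is to sum each candidate's per-seed scores and rank by the total, so that papers related to several seeds rise to the top. The sketch below assumes that summation rule, which the documentation does not confirm. The function name `combine_seed_scores`, the `limit` parameter, and the score values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def combine_seed_scores(
    per_seed_scores: Dict[str, Dict[str, float]],
    limit: int,
) -> List[Tuple[str, float]]:
    """Sum each candidate's scores over all seeds and keep the best `limit`.

    `per_seed_scores` maps a seed DOI to {candidate DOI: similarity score}.
    Seeds are excluded from the candidate list because they are already
    part of the graph.
    """
    seeds = set(per_seed_scores)
    totals: Dict[str, float] = defaultdict(float)
    for scores in per_seed_scores.values():
        for doi, score in scores.items():
            if doi not in seeds:
                totals[doi] += score
    # Highest total first; ties broken by DOI for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Hypothetical scores for two seeds, s1 and s2.
example_scores = {
    "s1": {"p": 2.0, "q": 1.0, "s2": 5.0},
    "s2": {"p": 3.0, "r": 1.5},
}
related = combine_seed_scores(example_scores, limit=2)
```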
- build_from_query(query, num_related_per_doi=20, search_limit=10, weight_coupling=2.0, weight_cocitation=2.0, weight_direct=1.0)[source]
Build citation network from a text query.
Searches local databases, extracts DOIs from results, then delegates to build_from_dois().
- Parameters:
query (str) – Search query (e.g. “hippocampal sharp wave ripples”)
num_related_per_doi (int) – Related papers per seed DOI
search_limit (int) – Max papers to fetch from search
weight_coupling (float) – Weight for bibliographic coupling
weight_cocitation (float) – Weight for co-citation
weight_direct (float) – Weight for direct citations
- Return type:
- Returns:
CitationGraph with search-discovered seeds + related papers
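The middle step of this method, turning search results into seed DOIs, can be sketched as follows. The shape of a search result (a mapping with an optional "doi" key) and the helper name `extract_seed_dois` are assumptions; the real search backend may return different records. DOIs are lowercased because DOI names are case-insensitive.

```python
from itertools import islice
from typing import Any, Iterable, List, Mapping


def extract_seed_dois(
    results: Iterable[Mapping[str, Any]],
    search_limit: int = 10,
) -> List[str]:
    """Collect unique DOIs from at most `search_limit` search results.

    Records without a DOI are skipped; order of first appearance is kept.
    """
    seen = set()
    dois: List[str] = []
    for record in islice(results, search_limit):
        doi = str(record.get("doi") or "").strip().lower()
        if doi and doi not in seen:
            seen.add(doi)
            dois.append(doi)
    return dois


# Hypothetical search results for illustration.
example_results = [
    {"doi": "10.1000/Example-A", "title": "First hit"},
    {"title": "Hit without a DOI"},
    {"doi": "10.1000/example-a", "title": "Duplicate of the first hit"},
    {"doi": "10.1000/example-b", "title": "Second distinct hit"},
]
seed_dois = extract_seed_dois(example_results)
```

The resulting list is what would then be handed to build_from_dois().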
- export_json(graph, output_path)[source]
Export graph to JSON file for visualization.
- Parameters:
graph (CitationGraph) – CitationGraph to export
output_path (str) – Path to output JSON file
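The JSON schema written by export_json is not documented here. As a rough illustration of what a visualization-ready file can look like, the sketch below writes the node-link layout commonly used in D3.js force-graph examples ("nodes" with an "id", "links" with "source" and "target"). The function name, the key names, and the sample records are assumptions, not the library's actual output format.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def export_node_link_json(
    nodes: List[Dict[str, Any]],
    edges: List[Dict[str, Any]],
    output_path: str,
) -> Dict[str, Any]:
    """Write a node-link JSON file and return the payload that was written."""
    payload = {
        # Each node's DOI becomes the "id" that links refer to.
        "nodes": [
            {"id": n["doi"], **{k: v for k, v in n.items() if k != "doi"}}
            for n in nodes
        ],
        "links": [
            {
                "source": e["source"],
                "target": e["target"],
                "type": e.get("edge_type", "cites"),
                "weight": e.get("weight", 1.0),
            }
            for e in edges
        ],
    }
    Path(output_path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload


# Hypothetical graph content for illustration.
example_nodes = [
    {"doi": "10.1000/example-a", "title": "Seed paper", "is_seed": True},
    {"doi": "10.1000/example-b", "title": "Related paper", "is_seed": False},
]
example_edges = [{"source": "10.1000/example-b", "target": "10.1000/example-a"}]
```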
- class scitex_scholar.citation_graph.PaperNode(doi, title='', year=0, authors=<factory>, journal='', citation_count=0, similarity_score=0.0, is_seed=False, metadata=<factory>)[source]
Bases: object
Represents a paper in the citation network.
- __init__(doi, title='', year=0, authors=<factory>, journal='', citation_count=0, similarity_score=0.0, is_seed=False, metadata=<factory>)
- class scitex_scholar.citation_graph.CitationEdge(source, target, edge_type='cites', weight=1.0)[source]
Bases: object
Represents a citation relationship between papers.
- __init__(source, target, edge_type='cites', weight=1.0)
- class scitex_scholar.citation_graph.CitationGraph(seed_doi, seed_dois=<factory>, nodes=<factory>, edges=<factory>, metadata=<factory>)[source]
Bases: object
Represents a complete citation network.
- edges: List[CitationEdge]
- to_networkx()[source]
Convert to NetworkX DiGraph with node attributes.
- Returns:
Directed graph with node attributes: title, short_title, year, citations, similarity, journal.
- Return type:
networkx.DiGraph
- __init__(seed_doi, seed_dois=<factory>, nodes=<factory>, edges=<factory>, metadata=<factory>)
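The `<factory>` defaults in the three signatures above are how Sphinx renders dataclass fields declared with `default_factory`. The sketch below reconstructs local stand-ins for the three classes from those signatures. Field names and defaults follow the signatures, and `edges: List[CitationEdge]` is documented; all other type annotations are guesses, and the sample DOIs are invented.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PaperNode:
    doi: str
    title: str = ""
    year: int = 0
    authors: List[str] = field(default_factory=list)  # assumed element type
    journal: str = ""
    citation_count: int = 0
    similarity_score: float = 0.0
    is_seed: bool = False
    metadata: Dict[str, Any] = field(default_factory=dict)  # assumed type


@dataclass
class CitationEdge:
    source: str
    target: str
    edge_type: str = "cites"
    weight: float = 1.0


@dataclass
class CitationGraph:
    seed_doi: str
    seed_dois: List[str] = field(default_factory=list)
    nodes: List[PaperNode] = field(default_factory=list)  # assumed type
    edges: List[CitationEdge] = field(default_factory=list)
    metadata: Dict[str, Any] = field(default_factory=dict)  # assumed type


# Assemble a two-paper graph by hand (hypothetical DOIs).
demo = CitationGraph(seed_doi="10.1000/example-a")
demo.nodes.append(PaperNode(doi="10.1000/example-a", is_seed=True))
demo.nodes.append(PaperNode(doi="10.1000/example-b", similarity_score=4.0))
demo.edges.append(CitationEdge(source="10.1000/example-b", target="10.1000/example-a"))
```

Using `default_factory` gives every instance its own list or dict, so appending to one graph's `nodes` never affects another graph.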
- scitex_scholar.citation_graph.plot_citation_graph(graph, backend='auto', output=None, **kwargs)[source]
Visualize a citation graph with pluggable backends.
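A pluggable-backend function with an 'auto' option typically probes which optional packages are installed and picks the first match. The sketch below shows that pattern using `importlib.util.find_spec`. The preference order, the helper names, and the error raised when nothing is installed are assumptions; the actual selection logic in plot_citation_graph may differ.

```python
import importlib.util
from typing import Callable, Sequence

# Assumed preference order for 'auto'; the library's real order may differ.
STATIC_BACKENDS = ("figrecipe", "scitex.plt", "matplotlib")


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a spec is malformed.
        return False


def pick_backend(
    backend: str = "auto",
    available: Callable[[str], bool] = is_importable,
    candidates: Sequence[str] = STATIC_BACKENDS,
) -> str:
    """Resolve 'auto' to the first available candidate; pass others through."""
    if backend != "auto":
        return backend
    for name in candidates:
        if available(name):
            return name
    raise RuntimeError("no plotting backend is installed")
```

The `available` parameter exists so the choice can be tested without installing any plotting package, for example `pick_backend("auto", available=lambda name: name == "matplotlib")`.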
- Parameters:
graph (CitationGraph or networkx.DiGraph) – Citation network to visualize. CitationGraph is auto-converted via to_networkx().
backend (str) – Rendering backend: ‘auto’, ‘figrecipe’, ‘scitex.plt’, ‘matplotlib’, or ‘pyvis’. Default ‘auto’ picks the best available.
output (str, optional) – Output file path. Required for ‘pyvis’ backend (HTML). For static backends, saves the figure to this path.
**kwargs – Backend-specific keyword arguments (layout, seed, figsize, etc.).
- Returns:
Backend-specific result. Static backends return {'fig', 'ax', 'pos', 'backend'}. Pyvis returns {'output', 'backend'}.
- Return type: