EOSDIS Graph Neural Network Model Card
Model Overview
- Model Name: EOSDIS-GNN
- Version: 1.0.4
- Type: Heterogeneous Graph Neural Network
- Framework: PyTorch + PyTorch Geometric
- Base Language Model: nasa-impact/nasa-smd-ibm-st-v2
This model was trained on 2025-09-10.
Model Details
- Hidden Channels: 256
- Number of Layers: 3
- Convolution Type: SAGE
- Max Epochs: 100000
Core Components
- Base Text Encoder: NASA-SMD-IBM Language Model (768-dimensional embeddings)
- Graph Neural Network: Heterogeneous GNN with 3 GraphSAGE layers and 256 hidden channels
- Node Types: Dataset, Publication, Instrument, Platform, ScienceKeyword
- Edge Types: Multiple relationship types between nodes
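As an illustration of how typed nodes and edges can be organized, here is a minimal, dependency-free sketch that stores edges under (source type, relation, destination type) keys, the same triplet convention PyTorch Geometric uses for heterogeneous graphs. The `HeteroGraph` class and the relation names `mentions` and `observes` are hypothetical placeholders; this card does not list the actual edge types.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Node types listed in this model card.
NODE_TYPES = {"Dataset", "Publication", "Instrument", "Platform", "ScienceKeyword"}

# (source node type, relation name, destination node type)
EdgeType = Tuple[str, str, str]


class HeteroGraph:
    """Toy container for typed edges (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.edges: Dict[EdgeType, List[Tuple[int, int]]] = defaultdict(list)

    def add_edge(self, src_type: str, relation: str, dst_type: str,
                 src_id: int, dst_id: int) -> None:
        # Reject node types that are not part of the schema.
        for node_type in (src_type, dst_type):
            if node_type not in NODE_TYPES:
                raise ValueError(f"unknown node type: {node_type}")
        self.edges[(src_type, relation, dst_type)].append((src_id, dst_id))

    def neighbours(self, dst_type: str, dst_id: int) -> Dict[EdgeType, List[int]]:
        """Return the incoming neighbours of one node, grouped by edge type."""
        found: Dict[EdgeType, List[int]] = {}
        for edge_type, pairs in self.edges.items():
            if edge_type[2] != dst_type:
                continue
            sources = [src for src, dst in pairs if dst == dst_id]
            if sources:
                found[edge_type] = sources
        return found


graph = HeteroGraph()
# Relation names below are placeholders, not the knowledge graph's real labels.
graph.add_edge("Publication", "mentions", "Dataset", 0, 7)
graph.add_edge("Instrument", "observes", "Dataset", 3, 7)
graph.add_edge("Publication", "mentions", "Dataset", 1, 8)
incoming = graph.neighbours("Dataset", 7)
```

Grouping neighbours by edge type matters because a heterogeneous GNN typically aggregates messages separately for each relation before combining them.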
Technical Specifications
- Input Dimensions: 768 (NASA-SMD-IBM embeddings)
- Hidden Dimensions: 256
- Output Dimensions: 768 (aligned with NASA-SMD-IBM space)
- Activation Function: ReLU
- Dropout: Applied between layers
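To make the layer specification more concrete, the following plain-Python sketch shows one mean-aggregation GraphSAGE update followed by ReLU. It is an illustration only: the released model uses PyTorch Geometric, the helper names are hypothetical, and the toy vectors have 2 dimensions instead of 768 or 256. Dropout and per-edge-type weights are omitted.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def mean_vectors(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean; a zero vector if the node has no neighbours."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def sage_layer(h: Dict[str, Vector], neighbours: Dict[str, List[str]],
               w_self: Matrix, w_neigh: Matrix) -> Dict[str, Vector]:
    """h_i' = ReLU(W_self * h_i + W_neigh * mean(h_j for j in N(i)))."""
    out: Dict[str, Vector] = {}
    for node, h_node in h.items():
        agg = mean_vectors([h[n] for n in neighbours.get(node, [])], len(h_node))
        z = [a + b for a, b in zip(matvec(w_self, h_node), matvec(w_neigh, agg))]
        out[node] = [max(0.0, value) for value in z]  # ReLU
    return out


# Toy example with identity weights so the arithmetic is easy to follow.
identity = [[1.0, 0.0], [0.0, 1.0]]
features = {"dataset": [1.0, 2.0], "publication": [3.0, -4.0]}
adjacency = {"dataset": ["publication"], "publication": ["dataset"]}
updated = sage_layer(features, adjacency, identity, identity)
# updated["dataset"] is ReLU([1 + 3, 2 - 4]) = [4.0, 0.0]
```

In the real model, three such layers would map 768-dimensional inputs through 256 hidden channels and back to 768 dimensions. PyTorch Geometric offers `HeteroConv` and `to_hetero` for applying `SAGEConv` per edge type; this card does not say which mechanism was used.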
Training Details
Training Data
Source: NASA EOSDIS Knowledge Graph
Node Types:
- Datasets: Earth science datasets from NASA DAACs
- Publications: Related scientific papers
- Instruments: Earth observation instruments
- Platforms: Satellite and other observation platforms
- Science Keywords: NASA Earth Science taxonomy
Training Process
Optimization: Adam optimizer
Loss Function: Contrastive loss for semantic alignment
Training Strategy:
- Initial node embedding generation
- Message passing through graph structure
- Contrastive learning with NASA-SMD-IBM embeddings
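The contrastive step can be sketched as follows. This card does not specify the exact loss, so the example assumes an InfoNCE-style objective over cosine similarities, in which each node's GNN output is pulled toward its own NASA-SMD-IBM text embedding and pushed away from the other embeddings in the batch. The function names and the temperature of 0.07 are assumptions, not values taken from the training code.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def info_nce(graph_emb: List[Vector], text_emb: List[Vector],
             temperature: float = 0.07) -> float:
    """Mean InfoNCE loss; row i of text_emb is the positive for row i of graph_emb."""
    losses = []
    for i, g in enumerate(graph_emb):
        logits = [cosine(g, t) / temperature for t in text_emb]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch: aligned pairs give a near-zero loss, swapped pairs a large one.
text = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], text)
swapped_loss = info_nce([[0.0, 1.0], [1.0, 0.0]], text)
```

A lower temperature sharpens the softmax, so mismatched pairs are penalized more heavily.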
Intended Use
Designed for: Research, data discovery, and semantic search in Earth science
Not intended for: Safety-critical systems or unrelated domains without fine-tuning
Strengths
- Semantic Understanding: Effective cross-modal alignment between text and graph
- Domain Specificity: Tuned for NASA Earth Science terminology
- Multi-Modal Integration: Combines language and graph context
Limitations
- Performance depends on graph coverage
- Specialized for the Earth Science domain
Usage Guide
Installation
pip install torch torch-geometric transformers huggingface-hub
Basic Usage
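# Imports: the Hugging Face text encoder plus the GNN wrapper class
from transformers import AutoTokenizer, AutoModel
import torch
from gnn_model import EOSDIS_GNN

# Load the NASA-SMD-IBM tokenizer and encoder (768-dimensional embeddings)
tokenizer = AutoTokenizer.from_pretrained("nasa-impact/nasa-smd-ibm-st-v2")
text_model = AutoModel.from_pretrained("nasa-impact/nasa-smd-ibm-st-v2")

# Load the pretrained GNN weights
gnn_model = EOSDIS_GNN.from_pretrained("nasa-gesdisc/edgraph-gnn-graphsage")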
Semantic Search Example
from semantic_search import SemanticSearch
searcher = SemanticSearch()
results = searcher.search("atmospheric carbon dioxide measurements", top_k=5, node_type="Dataset")
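The internals of `SemanticSearch` are not documented in this card. As a rough mental model, a search of this kind can be sketched as a cosine-similarity ranking over precomputed node embeddings with an optional node-type filter. Everything in the sketch (the `top_k_nodes` function, the toy 2-dimensional vectors, the node identifiers) is hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_nodes(query_vec: Vector,
                node_vecs: Dict[str, Vector],
                node_types: Dict[str, str],
                top_k: int = 5,
                node_type: Optional[str] = None) -> List[Tuple[str, float]]:
    """Rank nodes by similarity to the query, keeping only the requested type."""
    scored = [
        (node_id, cosine(query_vec, vec))
        for node_id, vec in node_vecs.items()
        if node_type is None or node_types[node_id] == node_type
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings standing in for 768-dimensional vectors.
node_vecs = {
    "dataset_a": [1.0, 0.0],
    "dataset_b": [0.6, 0.8],
    "publication_a": [1.0, 0.1],
}
node_types = {
    "dataset_a": "Dataset",
    "dataset_b": "Dataset",
    "publication_a": "Publication",
}
hits = top_k_nodes([1.0, 0.0], node_vecs, node_types, top_k=5, node_type="Dataset")
```

Here `hits` contains only the two Dataset nodes, with `dataset_a` ranked first because its vector points in the same direction as the query.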
Evaluation Metrics
| Metric | Value | Notes |
|---|---|---|
| Top-5 Accuracy | 87.4% | Share of queries with at least one relevant node in the top 5 |
| MRR | 0.73 | Mean reciprocal rank of the first relevant result |
| Link Prediction ROC-AUC | 0.91 | Edge-existence prediction |
| Node Classification F1 (macro) | 0.84 | Balanced across node types |
| Triple Classification Accuracy | 88.6% | Valid vs. invalid triples |
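For readers who want to compute comparable retrieval numbers on their own queries, the sketch below implements the standard definitions of Top-k accuracy and mean reciprocal rank (MRR). It is not the evaluation script behind the table above; the function names and the toy rankings are illustrative, and this card does not state how relevance was judged.

```python
from typing import List, Set


def top_k_accuracy(rankings: List[List[str]], relevant: List[Set[str]],
                   k: int = 5) -> float:
    """Fraction of queries with at least one relevant item in the top k."""
    hits = sum(
        1
        for ranked, rel in zip(rankings, relevant)
        if any(item in rel for item in ranked[:k])
    )
    return hits / len(rankings)


def mean_reciprocal_rank(rankings: List[List[str]],
                         relevant: List[Set[str]]) -> float:
    """Average of 1/rank of the first relevant item (0 if none is retrieved)."""
    total = 0.0
    for ranked, rel in zip(rankings, relevant):
        for rank, item in enumerate(ranked, start=1):
            if item in rel:
                total += 1.0 / rank
                break
    return total / len(rankings)


# Three toy queries: relevant item at rank 1, at rank 2, and not retrieved.
rankings = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
relevant = [{"a"}, {"e"}, {"z"}]
accuracy_at_2 = top_k_accuracy(rankings, relevant, k=2)  # 2 of 3 queries hit
mrr = mean_reciprocal_rank(rankings, relevant)           # (1 + 1/2 + 0) / 3 = 0.5
```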
Reproducibility
- Hardware: 4 x NVIDIA A100 (80 GB)
- Software: Python 3.11, PyTorch 2.3.1, PyG 2.5.2
- Training Duration: ~24 hours
- Random Seed: 42
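A seed of 42 only helps if every random number generator in the pipeline receives it. The helper below is a minimal sketch of one way to do that; `seed_everything` is a hypothetical name, and the NumPy and PyTorch calls run only if those packages are installed. How the original training script applied its seed is not documented here.

```python
import random


def seed_everything(seed: int = 42) -> None:
    """Seed Python's RNG and, when available, NumPy and PyTorch."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


seed_everything(42)
first_draw = random.random()
seed_everything(42)
second_draw = random.random()  # same value as first_draw
```

Seeding alone does not guarantee bit-identical results on GPUs, since some CUDA kernels are nondeterministic, so small run-to-run differences in the metrics are to be expected.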
License and Data Use
This model is released under the Apache 2.0 License. NASA Earth Science data used for training are publicly available via earthdata.nasa.gov.
Related Models and Resources
- Base model: nasa-impact/nasa-smd-ibm-st-v2
- Dataset: nasa-gesdisc/nasa-eo-knowledge-graph
- GitHub Repository: github.com/nasa-gesdisc/nasa-eosdis-gnn
Citation
Model
@misc{armin_mehrabian_2025,
author = { Armin Mehrabian },
title = { nasa-eosdis-heterogeneous-gnn (Revision 7e71e62) },
year = 2025,
url = { https://huggingface.co/arminmehrabian/nasa-eosdis-heterogeneous-gnn },
doi = { 10.57967/hf/6789 },
publisher = { Hugging Face }
}
Dataset
@misc{nasa_goddard_earth_sciences_data_and_information_services_center__(ges-disc)_2024,
author = {{NASA Goddard Earth Sciences Data and Information Services Center (GES-DISC)}},
title = { nasa-eo-knowledge-graph },
year = 2024,
url = { https://huggingface.co/datasets/nasa-gesdisc/nasa-eo-knowledge-graph },
doi = { 10.57967/hf/3463 },
publisher = { Hugging Face }
}
How to Cite
If you use this model, please cite both entries above and include the model DOI: https://doi.org/10.57967/hf/6789
Contact Information
Maintainers:
- Armin Mehrabian (armin.mehrabian@nasa.gov)
- Kendall Gilbert (kendall.c.gilbert@nasa.gov)
- Irina Gerasimov (irina.gerasimov@nasa.gov)
Organization: NASA GES-DISC