EOSDIS Graph Neural Network Model Card

Model Overview

  • Model Name: EOSDIS-GNN
  • Version: 1.0.4
  • Type: Heterogeneous Graph Neural Network
  • Framework: PyTorch + PyTorch Geometric
  • Base Language Model: nasa-impact/nasa-smd-ibm-st-v2

This model was trained on 2025-09-10.


Model Details

  • Hidden Channels: 256
  • Number of Layers: 3
  • Convolution Type: SAGE
  • Max Epochs: 100000

Core Components

  • Base Text Encoder: NASA-SMD-IBM Language Model (768-dimensional embeddings)
  • Graph Neural Network: Heterogeneous GNN with 3 GraphSAGE layers
  • Node Types: Dataset, Publication, Instrument, Platform, ScienceKeyword
  • Edge Types: Multiple relationship types between nodes

Technical Specifications

  • Input Dimensions: 768 (NASA-SMD-IBM embeddings)
  • Hidden Dimensions: 256
  • Output Dimensions: 768 (aligned with NASA-SMD-IBM space)
  • Activation Function: ReLU
  • Dropout: Applied between layers

Training Details

Training Data

  • Source: NASA EOSDIS Knowledge Graph

  • Node Types:

    • Datasets: Earth science datasets from NASA DAACs
    • Publications: Related scientific papers
    • Instruments: Earth observation instruments
    • Platforms: Satellite and other observation platforms
    • Science Keywords: NASA Earth Science taxonomy

Training Process

  • Optimization: Adam optimizer

  • Loss Function: Contrastive loss for semantic alignment

  • Training Strategy:

    • Initial node embedding generation
    • Message passing through graph structure
    • Contrastive learning with NASA-SMD-IBM embeddings
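The card does not specify which contrastive objective was used. One common choice for aligning two embedding spaces is a symmetric InfoNCE loss, sketched below under that assumption; the function name `info_nce_loss` and the temperature of 0.07 are illustrative, not taken from the training code.

```python
import torch
import torch.nn.functional as F


def info_nce_loss(gnn_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE: row i of gnn_emb should match row i of text_emb."""
    g = F.normalize(gnn_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = g @ t.T / temperature  # [N, N] scaled cosine similarities
    targets = torch.arange(g.size(0), device=g.device)
    # Average the graph-to-text and text-to-graph cross-entropy terms.
    loss_g2t = F.cross_entropy(logits, targets)
    loss_t2g = F.cross_entropy(logits.T, targets)
    return 0.5 * (loss_g2t + loss_t2g)
```

Here `gnn_emb` would be the GNN output for a batch of nodes and `text_emb` the NASA-SMD-IBM embeddings of the same nodes, both of shape [N, 768]; the other nodes in the batch act as negatives.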

Intended Use

  • Designed for: Research, data discovery, and semantic search in Earth science
  • Not intended for: Safety-critical systems or unrelated domains without fine-tuning


Strengths

  1. Semantic Understanding: Effective cross-modal alignment between text and graph
  2. Domain Specificity: Tuned for NASA Earth Science terminology
  3. Multi-Modal Integration: Combines language and graph context

Limitations

  • Performance depends on graph coverage
  • Specialized for the Earth Science domain

Usage Guide

Installation

pip install torch torch-geometric transformers huggingface-hub

Basic Usage

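from transformers import AutoTokenizer, AutoModel
import torch
# gnn_model.py (defining EOSDIS_GNN) must be importable, e.g. from the working directory
from gnn_model import EOSDIS_GNN

# Load the base text encoder (768-dimensional embeddings)
tokenizer = AutoTokenizer.from_pretrained("nasa-impact/nasa-smd-ibm-st-v2")
text_model = AutoModel.from_pretrained("nasa-impact/nasa-smd-ibm-st-v2")
# Load the pretrained GNN weights
gnn_model = EOSDIS_GNN.from_pretrained("nasa-gesdisc/edgraph-gnn-graphsage")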
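Once the tokenizer and text model are loaded, a query string has to be turned into a single 768-dimensional vector. The card does not say how token embeddings are pooled; the sketch below assumes masked mean pooling, a common choice for sentence-embedding models. The helper name `mean_pool` is hypothetical.

```python
import torch


def mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # [B, T, 1]
    summed = (last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)  # avoid division by zero
    return summed / counts


# With the objects loaded above, a query embedding might be computed as:
#   batch = tokenizer(["sea surface temperature"], padding=True,
#                     truncation=True, return_tensors="pt")
#   with torch.no_grad():
#       hidden = text_model(**batch).last_hidden_state
#   query_emb = mean_pool(hidden, batch["attention_mask"])  # shape [1, 768]
```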

Semantic Search Example

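# semantic_search.py (defining SemanticSearch) must be importable
from semantic_search import SemanticSearch

searcher = SemanticSearch()
# Retrieve the 5 best-matching Dataset nodes for a free-text query
results = searcher.search("atmospheric carbon dioxide measurements", top_k=5, node_type="Dataset")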
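The internals of `SemanticSearch` are not documented in this card. Conceptually, a search of this kind can be reduced to ranking node embeddings by cosine similarity to the query embedding, as in the following sketch. The function `top_k_cosine` and its arguments are hypothetical and stand in for whatever the real class does.

```python
import numpy as np


def top_k_cosine(query_emb, node_embs, node_ids, k=5):
    """Return the k (node_id, score) pairs with the highest cosine similarity."""
    q = np.asarray(query_emb, dtype=np.float64)
    n = np.asarray(node_embs, dtype=np.float64)
    # L2-normalise so that the dot product equals cosine similarity.
    q = q / max(np.linalg.norm(q), 1e-12)
    n = n / np.maximum(np.linalg.norm(n, axis=1, keepdims=True), 1e-12)
    scores = n @ q
    order = np.argsort(-scores)[:k]  # indices of the k largest scores
    return [(node_ids[i], float(scores[i])) for i in order]
```

Restricting the search to one node type, as `node_type="Dataset"` does above, would amount to passing only that type's embeddings and identifiers.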

Evaluation Metrics

Metric                         | Value | Notes
-------------------------------|-------|------------------------------------------------
Top-5 Accuracy                 | 87.4% | Probability that one of top-5 nodes is relevant
MRR                            | 0.73  | Ranking quality
Link Prediction ROC-AUC        | 0.91  | Edge-existence prediction
Node Classification F1 (macro) | 0.84  | Balanced across node types
Triple Classification Accuracy | 88.6% | Valid vs. invalid triples
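For readers unfamiliar with the two retrieval metrics, the sketch below shows how Top-k accuracy and MRR are conventionally computed from ranked result lists. It uses the standard definitions and made-up toy data; it is not the evaluation script behind the numbers reported here.

```python
def top_k_accuracy(ranked_lists, relevant_sets, k=5):
    """Fraction of queries with at least one relevant item in the top k."""
    hits = sum(
        1 for ranked, relevant in zip(ranked_lists, relevant_sets)
        if any(item in relevant for item in ranked[:k])
    )
    return hits / len(ranked_lists)


def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Mean of 1/rank of the first relevant item (0 if none is retrieved)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Toy example: query 1 finds its relevant item at rank 2, query 2 finds none.
ranked = [["d1", "d2", "d3"], ["d4", "d5", "d6"]]
relevant = [{"d2"}, {"d9"}]
print(top_k_accuracy(ranked, relevant, k=3))   # 0.5
print(mean_reciprocal_rank(ranked, relevant))  # 0.25
```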

Reproducibility

  • Hardware: 4 x NVIDIA A100 (80 GB)
  • Software: Python 3.11, PyTorch 2.3.1, PyG 2.5.2
  • Training Duration: ~24 hours
  • Random Seed: 42
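The card reports a random seed of 42 but does not say which random number generators were seeded. A typical way to fix the seed across Python, NumPy, and PyTorch is sketched below; the helper `set_seed` is hypothetical, and bit-identical results can still depend on hardware and library versions.

```python
import random

import numpy as np
import torch


def set_seed(seed=42):
    """Seed the Python, NumPy, and PyTorch random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        # Seed every visible GPU explicitly.
        torch.cuda.manual_seed_all(seed)


set_seed(42)
```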

License and Data Use

This model is released under the Apache 2.0 License. NASA Earth Science data used for training are publicly available via earthdata.nasa.gov.


Related Models and Resources


Citation

Model

@misc{armin_mehrabian_2025,
  author       = { Armin Mehrabian },
  title        = { nasa-eosdis-heterogeneous-gnn (Revision 7e71e62) },
  year         = 2025,
  url          = { https://huggingface.co/arminmehrabian/nasa-eosdis-heterogeneous-gnn },
  doi          = { 10.57967/hf/6789 },
  publisher    = { Hugging Face }
}

Dataset

@misc{nasa_goddard_earth_sciences_data_and_information_services_center__(ges-disc)_2024,
  author       = {{NASA Goddard Earth Sciences Data and Information Services Center (GES-DISC)}},
  title        = { nasa-eo-knowledge-graph },
  year         = 2024,
  url          = { https://huggingface.co/datasets/nasa-gesdisc/nasa-eo-knowledge-graph },
  doi          = { 10.57967/hf/3463 },
  publisher    = { Hugging Face }
}

How to Cite

If you use this model, please cite both entries above and include the model DOI: https://doi.org/10.57967/hf/6789


Contact Information

Maintainers:

Organization: NASA GES-DISC
