  • Gather ICD-10 data

    • Obtain dataset from CMS/CDC or Kaggle
    • Download CSV of ICD-10-CM codes (~70k entries)
    • Load data into application or database
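
The loading step above could be sketched as follows. This is a minimal illustration, not the project's actual loader: the column names `code` and `description` are assumptions about the CSV layout, and a three-row in-memory sample stands in for the real download.

```python
import csv
import io

def load_icd10_codes(csv_file):
    """Read rows into a dict keyed by code.

    Assumes the file has 'code' and 'description' columns; the real
    dataset may use different headers.
    """
    reader = csv.DictReader(csv_file)
    codes = {}
    for row in reader:
        code = row["code"].strip().upper()
        if code:  # skip rows without a code
            codes[code] = row["description"].strip()
    return codes

# Tiny in-memory sample standing in for the downloaded CSV
SAMPLE_CSV = """code,description
E10.9,Type 1 diabetes mellitus without complications
I21.02,ST elevation (STEMI) myocardial infarction involving left anterior descending coronary artery
I21.9,"Acute myocardial infarction, unspecified"
"""

ICD10 = load_icd10_codes(io.StringIO(SAMPLE_CSV))
```

For the real file, `open(path, newline="", encoding="utf-8")` can be passed in place of the `io.StringIO` object.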
  • Build search/lookup functionality

    • Implement keyword filter for description matching
    • Generate embeddings for each ICD description (offline)
    • Build vector index (FAISS, Annoy, or brute-force NumPy)
    • Embed user query and perform nearest-neighbor search
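
A rough sketch of the two search paths above, using the brute-force NumPy option. The `toy_embed` function is a hashed bag-of-words placeholder, not a real embedding model; it is an assumption made only so the example runs without downloading a model. All function names here are illustrative.

```python
import re
import zlib
import numpy as np

DIM = 256  # illustrative vector size; a real model fixes its own dimension

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_filter(codes, query):
    """Return codes whose description contains every query token."""
    terms = set(tokenize(query))
    return [c for c, d in codes.items() if terms <= set(tokenize(d))]

def toy_embed(text):
    """Placeholder embedding: hashed bag-of-words, L2-normalised."""
    vec = np.zeros(DIM, dtype=np.float32)
    for tok in tokenize(text):
        vec[zlib.crc32(tok.encode("utf-8")) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def build_index(codes):
    """Embed every description once (the offline step)."""
    keys = list(codes)
    matrix = np.stack([toy_embed(codes[k]) for k in keys])
    return keys, matrix

def nearest(query, keys, matrix, top_k=3):
    """Rows are unit-length, so the dot product is cosine similarity."""
    scores = matrix @ toy_embed(query)
    order = np.argsort(-scores)[:top_k]
    return [(keys[i], float(scores[i])) for i in order]

CODES = {
    "E10.9": "Type 1 diabetes mellitus without complications",
    "E11.9": "Type 2 diabetes mellitus without complications",
    "I21.9": "Acute myocardial infarction, unspecified",
}
KEYS, MATRIX = build_index(CODES)
```

With roughly 70k codes, a single matrix-vector product like this is usually fast enough on CPU; FAISS or Annoy becomes more attractive if latency or memory turns out to be a problem.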
  • Combine code and description lookup into MCP API

    • Accept input as code (lookup definition) or description (search codes)
    • Return list of candidate codes with descriptions
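
One way the combined entry point might decide between the two modes is a regular expression for the ICD-10-CM code shape (letter, digit, alphanumeric, optional decimal part). The sketch below assumes a substring search as a stand-in for the real search backend; `lookup` and `search_descriptions` are hypothetical names.

```python
import re

# Letter, digit, alphanumeric, then an optional "." plus 1-4 alphanumerics
ICD10_PATTERN = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")

CODES = {
    "E10.9": "Type 1 diabetes mellitus without complications",
    "I21.9": "Acute myocardial infarction, unspecified",
}

def search_descriptions(query, limit=5):
    """Placeholder search: case-insensitive substring match."""
    q = query.lower()
    hits = [{"code": c, "description": d}
            for c, d in CODES.items() if q in d.lower()]
    return hits[:limit]

def lookup(user_input, limit=5):
    """Treat code-shaped input as a code lookup, anything else as a search."""
    text = user_input.strip()
    code = text.upper()
    if ICD10_PATTERN.match(code) and code in CODES:
        return [{"code": code, "description": CODES[code]}]
    # Unknown codes and free text both fall through to description search
    return search_descriptions(text, limit)
```

Falling through to the search when a code-shaped string is not in the table avoids dead ends for short inputs such as "A1C" that happen to match the pattern.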
  • Integrate LLM for refinement (optional)

    • Use GPT-4 or Claude to select best code from top-N results
    • Prompt LLM to generate short rationale for selected code
    • Cache LLM prompts and responses to conserve tokens
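
The caching idea above can be illustrated by keying responses on a hash of the full prompt. This is only a sketch: `stub_llm` is a hypothetical stand-in for a GPT-4 or Claude call, and the prompt wording is an assumption.

```python
import hashlib

def build_prompt(query, candidates):
    """Assemble a selection prompt from the top-N candidates."""
    lines = [f"{c['code']}: {c['description']}" for c in candidates]
    return (
        "Select the single best ICD-10-CM code for the clinical text below "
        "and give a one-sentence rationale.\n"
        f"Text: {query}\nCandidates:\n" + "\n".join(lines)
    )

def cached_completion(prompt, llm, cache):
    """Call the LLM only when this exact prompt has not been seen before."""
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = llm(prompt)
    return cache[key]

# Hypothetical stand-in for a real API call; counts how often it is invoked
CALLS = {"count": 0}

def stub_llm(prompt):
    CALLS["count"] += 1
    return "E10.9: matches the stated diagnosis with no complications mentioned."

CACHE = {}
CANDIDATES = [
    {"code": "E10.9", "description": "Type 1 diabetes mellitus without complications"},
    {"code": "E11.9", "description": "Type 2 diabetes mellitus without complications"},
]
PROMPT = build_prompt("Type 1 diabetes mellitus", CANDIDATES)
first = cached_completion(PROMPT, stub_llm, CACHE)
second = cached_completion(PROMPT, stub_llm, CACHE)  # served from the cache
```

Because the key covers the whole prompt, any change to the candidate list or wording produces a new entry rather than a stale answer.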
  • Build MCP server (Gradio App)

    • Create Gradio UI with text input and output area
    • Implement backend logic to expose an API endpoint or stdio interface per the MCP specification
    • Tag Space with “mcp-server-track” and configure /api route
    • Test connectivity with MCP client (e.g., Cursor IDE or Claude Desktop)
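
A possible shape for the Gradio app is sketched below. It assumes a recent Gradio release installed with the MCP extra (`pip install "gradio[mcp]"`), where `launch(mcp_server=True)` exposes the function as an MCP tool; the exact endpoint path and version requirements should be checked against the Gradio documentation. The handler and its small code table are illustrative only.

```python
CODES = {
    "E10.9": "Type 1 diabetes mellitus without complications",
    "I21.9": "Acute myocardial infarction, unspecified",
}

def icd10_lookup(text: str) -> str:
    """Look up an ICD-10-CM code or search code descriptions.

    Args:
        text: An ICD-10-CM code such as "E10.9", or a free-text description.

    Returns:
        Matching "code: description" lines, one per line.
    """
    query = text.strip()
    if not query:
        return "No match found."
    if query.upper() in CODES:
        return f"{query.upper()}: {CODES[query.upper()]}"
    matches = [f"{c}: {d}" for c, d in CODES.items() if query.lower() in d.lower()]
    return "\n".join(matches) if matches else "No match found."

def build_app():
    """Wrap the handler in a Gradio UI (imported lazily; assumes gradio is installed)."""
    import gradio as gr
    return gr.Interface(
        fn=icd10_lookup,
        inputs=gr.Textbox(label="Code or description"),
        outputs=gr.Textbox(label="Matches"),
        title="MedCodeMCP",
    )

# To serve the UI and the MCP tool together (assumed Gradio MCP support):
#     build_app().launch(mcp_server=True)
```

Keeping the handler free of Gradio imports means it can be unit-tested on its own; the type hints and docstring matter because Gradio uses them to describe the tool to MCP clients.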
  • Test with realistic inputs

    • Simple case: “Type 1 diabetes mellitus” → expect E10.9
    • Complex case: “Acute MI involving LAD” → expect I21.02 (STEMI involving the left anterior descending coronary artery) or a related I21 code
    • Edge case: typos or lay terms (e.g., “heart attack”) → verify that semantic search handles them, or add spell-check
    • Compare tool output to expected codes (taken from reference lists or ChatGPT suggestions)
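
The test cases above could be collected into a small harness along these lines. The first two expected codes come from the plan; the expectation for "heart attack" (I21.9) is an assumption added for illustration, and `stub_search` is a hypothetical stand-in that returns canned results instead of calling the real tool.

```python
# (query, expected code); the third expectation is an assumption
TEST_CASES = [
    ("Type 1 diabetes mellitus", "E10.9"),
    ("Acute MI involving LAD", "I21.02"),
    ("heart attack", "I21.9"),
]

def category(code):
    """Three-character ICD-10 category, e.g. 'I21' for 'I21.02'."""
    return code.split(".")[0]

def evaluate(search_fn, cases, top_k=5):
    """Label each case as an exact hit, a related-category hit, or a miss."""
    report = []
    for query, expected in cases:
        returned = list(search_fn(query))[:top_k]
        if expected in returned:
            status = "exact"
        elif any(category(c) == category(expected) for c in returned):
            status = "related"
        else:
            status = "miss"
        report.append((query, expected, status))
    return report

def stub_search(query):
    """Hypothetical stand-in for the real search; returns canned code lists."""
    canned = {
        "Type 1 diabetes mellitus": ["E10.9", "E10.65"],
        "Acute MI involving LAD": ["I21.09", "I21.9"],
    }
    return canned.get(query, [])

REPORT = evaluate(stub_search, TEST_CASES)
```

Separating "exact" from "related" mirrors the plan's "or related code" allowance while still making it visible when the top results miss the category entirely.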
  • Optimize and cache

    • Precompute embeddings for entire code database
    • Cache embeddings of frequent queries
    • Cache LLM explanations in memory or simple key-value store
    • Choose deployment hardware (GPU-backed if running a local embedding model; CPU if embeddings are precomputed)
  • Polish documentation & demo

    • Write README.md with tool description, architecture outline, research citations, and sponsor acknowledgments
    • Prepare 2–3 minute demo video showing Gradio UI and AI agent calling the MCP server
    • Share project on community channels (Discord, YouTube) for feedback and visibility