<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[ByteLab]]></title><description><![CDATA[Test]]></description><link>https://tanmoymandal.dev</link><image><url>https://cdn.hashnode.com/uploads/logos/69a9f9c9b8d41e6dede3e6c7/38960246-daa3-43d9-9853-6bc6e3e6b5d9.png</url><title>ByteLab</title><link>https://tanmoymandal.dev</link></image><generator>RSS for Node</generator><lastBuildDate>Wed, 29 Apr 2026 02:35:42 GMT</lastBuildDate><atom:link href="https://tanmoymandal.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Effortlessly Upgrade Your Spring Boot Application with GitHub Copilot's Custom Agent]]></title><description><![CDATA[Upgrading a Spring Boot application across major versions can be daunting. Breaking API changes, dependency incompatibilities, package reorganizations, and test failures can turn what should be a rout]]></description><link>https://tanmoymandal.dev/effortlessly-upgrade-your-spring-boot-application-with-github-copilot-s-custom-agent</link><guid isPermaLink="true">https://tanmoymandal.dev/effortlessly-upgrade-your-spring-boot-application-with-github-copilot-s-custom-agent</guid><category><![CDATA[Springboot]]></category><category><![CDATA[springbootai]]></category><category><![CDATA[spring ai]]></category><category><![CDATA[GitHub]]></category><category><![CDATA[copilot AI]]></category><category><![CDATA[github copilot]]></category><category><![CDATA[custom-agents]]></category><dc:creator><![CDATA[Tanmoy Mandal]]></dc:creator><pubDate>Sun, 29 Mar 2026 16:55:16 GMT</pubDate><content:encoded><![CDATA[<p>Upgrading a Spring Boot application across major versions can be daunting. 
Breaking API changes, dependency incompatibilities, package reorganizations, and test failures can turn what should be a routine maintenance task into a multi-day ordeal. What if you could automate this entire process with a single command?</p>
<p>In this article, I'll show you how to use a custom GitHub Copilot agent to automatically upgrade a multi-module Spring Boot application from version 3.5.0 to 4.0.5, complete with dependency updates, code migrations, and test validations.</p>
<h2>Prerequisites</h2>
<p>Before we dive in, I'm assuming you:</p>
<ul>
<li><p>Have <strong>GitHub Copilot</strong> installed and activated in VS Code</p>
</li>
<li><p>Are familiar with basic Copilot interactions (chat, inline suggestions)</p>
</li>
<li><p>Understand what Spring Boot is and have worked with Maven projects</p>
</li>
<li><p>Have a Spring Boot project that needs upgrading</p>
</li>
</ul>
<p>If you're new to GitHub Copilot, check out the <a href="https://docs.github.com/en/copilot">official documentation</a> to get started.</p>
<h2>What Are Custom Agents in GitHub Copilot?</h2>
<p>GitHub Copilot Workspace introduces a powerful concept: <strong>custom agents</strong>. Think of them as specialized AI assistants with domain-specific expertise.</p>
<h3>Understanding the Agent Hierarchy</h3>
<p><strong>Base Agent (GitHub Copilot)</strong><br />Your primary AI coding assistant that handles general programming tasks, questions, and code generation.</p>
<p><strong>Custom Agents</strong><br />Specialized agents configured for specific tasks or domains. They extend Copilot's capabilities with:</p>
<ul>
<li><p>Domain-specific knowledge</p>
</li>
<li><p>Predefined workflows</p>
</li>
<li><p>Custom instructions and prompts</p>
</li>
<li><p>Access to specialized tools</p>
</li>
</ul>
<p><strong>Sub-Agents</strong><br />Purpose-built agents that work under a parent agent to handle specific subtasks. For complex operations, a custom agent might orchestrate multiple sub-agents, each handling a specific aspect of the task.</p>
<h3>The Spring Boot Upgrader Agent Ecosystem</h3>
<p>To address the complexity of Spring Boot upgrades, I built the <strong>Spring Boot Upgrader Agent</strong>—a purpose-built solution that transforms what used to be a multi-day manual task into an automated, reliable workflow. The agent is available as an open-source project at <a href="https://github.com/tanmoymandal/gh-copilot-agents">github.com/tanmoymandal/gh-copilot-agents</a>.</p>
<p>The Spring Boot Upgrader is a <strong>parent agent</strong> that orchestrates several specialized <strong>sub-agents</strong>:</p>
<table>
<thead>
<tr>
<th>Sub-Agent</th>
<th>Purpose</th>
</tr>
</thead>
<tbody><tr>
<td><strong>SB Version Detector</strong></td>
<td>Scans pom.xml/build.gradle to detect current Spring Boot, Java, and dependency versions</td>
</tr>
<tr>
<td><strong>SB Docs Fetcher</strong></td>
<td>Retrieves Spring Boot 4.0 release notes, migration guides, and ecosystem documentation</td>
</tr>
<tr>
<td><strong>SB Dependency Upgrader</strong></td>
<td>Updates all Spring Boot and related dependencies, handles Jakarta EE migrations, and resolves API removals</td>
</tr>
<tr>
<td><strong>SB Test Updater</strong></td>
<td>Fixes test compilation issues, updates deprecated test APIs, and runs the test suite</td>
</tr>
<tr>
<td><strong>SB Vulnerability Scanner</strong></td>
<td>Scans dependencies for CVEs before and after upgrade</td>
</tr>
<tr>
<td><strong>SB Upgrade Reporter</strong></td>
<td>Generates a comprehensive upgrade report document</td>
</tr>
</tbody></table>
<p>This orchestrated approach ensures each aspect of the upgrade is handled by a specialized component, resulting in a thorough and reliable upgrade process.</p>
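<p>To make the orchestration concrete, here is a rough, paraphrased sketch of what the workflow portion of such a parent agent's instructions might contain. This is illustrative only, not the literal contents of the repository's <code>.agent.md</code>:</p>

```markdown
## Workflow

1. Call `runSubagent` with **SB Version Detector** to capture the current
   Spring Boot, Java, and dependency versions.
2. Call **SB Docs Fetcher** to pull the 4.0 release notes and migration guides.
3. Call **SB Vulnerability Scanner** to record a pre-upgrade CVE baseline.
4. Call **SB Dependency Upgrader**, then **SB Test Updater**; repeat until the
   build compiles and the test suite passes.
5. Call **SB Vulnerability Scanner** again, then **SB Upgrade Reporter** to
   write UPGRADE_REPORT.md.
```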
<h2>Getting the Spring Boot Upgrader Agent</h2>
<p>The Spring Boot Upgrader agent is available as an open-source project that you can easily integrate into your workspace.</p>
<h3>Step 1: Clone the Agent Repository</h3>
<pre><code class="language-bash">git clone https://github.com/tanmoymandal/gh-copilot-agents.git
cd gh-copilot-agents
</code></pre>
<h3>Step 2: Understand the Agent Configuration</h3>
<p>The agent is configured using a <code>.agent.md</code> file with YAML frontmatter. Here's a simplified view of the structure:</p>
<pre><code class="language-markdown">---
name: Spring Boot Upgrade to 4.0.x
description: &gt;
  Use when upgrading a Spring Boot project to version 4.0.x. 
  Orchestrates full upgrade workflow: version detection, Java version selection, 
  dependency upgrade, test updates, vulnerability scanning, and upgrade report generation.
applyTo:
  - filePattern: '**/pom.xml'
  - filePattern: '**/build.gradle*'
tools:
  - runSubagent
  - read_file
  - replace_string_in_file
  - run_in_terminal
---

# Agent Instructions

[Detailed workflow instructions for the agent...]
</code></pre>
<p><strong>Key Configuration Elements:</strong></p>
<ul>
<li><p><code>name</code>: The identifier you'll use to invoke this agent</p>
</li>
<li><p><code>description</code>: What the agent does and when to use it</p>
</li>
<li><p><code>applyTo</code>: File patterns that trigger the agent's availability (Maven/Gradle files)</p>
</li>
<li><p><code>tools</code>: Which Copilot tools the agent can use</p>
</li>
<li><p><strong>Instructions</strong>: Detailed workflow steps the agent follows</p>
</li>
</ul>
<h3>Step 3: Install the Agent in Your Project</h3>
<p>There are two ways to make the agent available:</p>
<p><strong>Option A: Workspace-Level (Recommended for Team Projects)</strong></p>
<p>Copy the <code>.agent.md</code> file to a <code>.github/copilot/agents/</code> directory in your project:</p>
<pre><code class="language-bash">mkdir -p &lt;your-project&gt;/.github/copilot/agents
cp spring-boot-upgrade/.agent.md &lt;your-project&gt;/.github/copilot/agents/spring-boot-upgrader.agent.md
</code></pre>
<p><strong>Option B: User-Level (Recommended for Personal Use)</strong></p>
<p>Place the agent in your global Copilot configuration:</p>
<pre><code class="language-bash">mkdir -p ~/.copilot/agents
cp spring-boot-upgrade/.agent.md ~/.copilot/agents/spring-boot-upgrader.agent.md
</code></pre>
<p>After copying, reload VS Code or restart the Copilot extension to make the agent available.</p>
<h2>Real-World Example: Upgrading my-awesome-app</h2>
<p>Let's walk through upgrading a real multi-module Spring Boot application. <code>my-awesome-app</code> is a sample project available at <a href="https://github.com/tanmoymandal/my-awesome-app">https://github.com/tanmoymandal/my-awesome-app</a>.</p>
<h3>Project Structure</h3>
<pre><code class="language-plaintext">my-awesome-app/
├── pom.xml                    # Parent POM (Spring Boot 3.5.0)
├── dataaccess/                # Shared JPA entities and repositories
│   ├── pom.xml
│   └── src/main/java/...
├── api/                       # REST API module
│   ├── pom.xml
│   └── src/main/java/...
└── batch/                     # Spring Batch jobs
    ├── pom.xml
    └── src/main/java/...
</code></pre>
<h3>Initial State</h3>
<ul>
<li><p><strong>Spring Boot</strong>: 3.5.0</p>
</li>
<li><p><strong>Java</strong>: 17</p>
</li>
<li><p><strong>Spring Batch</strong>: 5.2.2</p>
</li>
<li><p><strong>Spring Data JPA</strong>: 3.5.0</p>
</li>
<li><p><strong>Known Vulnerabilities</strong>: 16 CVEs (5 High, 6 Medium, 5 Low)</p>
</li>
</ul>
<h3>Step 1: Clone the Sample App</h3>
<pre><code class="language-bash">git clone https://github.com/tanmoymandal/my-awesome-app.git
cd my-awesome-app
</code></pre>
<h3>Step 2: Open in VS Code</h3>
<pre><code class="language-bash">code .
</code></pre>
<h3>Step 3: Invoke the Spring Boot Upgrader Agent</h3>
<p>Open the GitHub Copilot Chat (Cmd+Shift+I on Mac, Ctrl+Shift+I on Windows/Linux) and type:</p>
<pre><code class="language-plaintext">@Spring Boot Upgrade to 4.0.x upgrade this project to Spring Boot 4.0.x
</code></pre>
<p>Or simply:</p>
<pre><code class="language-plaintext">Upgrade this Spring Boot project to version 4.0.x
</code></pre>
<p>If the agent is configured correctly with the <code>applyTo</code> patterns, it will automatically detect that you're in a Maven project and activate.</p>
<h3>Step 4: Watch the Magic Happen</h3>
<p>The agent will:</p>
<ol>
<li><p><strong>Detect Current Versions</strong> 🔍</p>
<pre><code class="language-plaintext">Analyzing pom.xml files...
Detected: Spring Boot 3.5.0, Java 17
</code></pre>
</li>
<li><p><strong>Fetch Migration Documentation</strong> 📖</p>
<pre><code class="language-plaintext">Retrieving Spring Boot 4.0 migration guide...
Analyzing breaking changes...
</code></pre>
</li>
<li><p><strong>Upgrade Dependencies</strong> ⬆️</p>
<pre><code class="language-plaintext">Updating parent POM to Spring Boot 4.0.5...
Updating Spring Batch 5.2.2 → 6.0.3...
Resolving starter modularization changes...
</code></pre>
</li>
<li><p><strong>Apply Code Migrations</strong> 🔧</p>
<pre><code class="language-plaintext">Updating Spring Batch imports (package restructure)...
Replacing deprecated APIs...
</code></pre>
</li>
<li><p><strong>Update Tests</strong></p>
<pre><code class="language-plaintext">Fixing test dependencies...
Running test suite... 23/23 tests passed
</code></pre>
</li>
<li><p><strong>Vulnerability Scan</strong></p>
<pre><code class="language-plaintext">Pre-upgrade: 16 CVEs (5 High)
Post-upgrade: 0 CVEs
</code></pre>
</li>
<li><p><strong>Generate Report</strong></p>
<pre><code class="language-plaintext">Creating UPGRADE_REPORT.md...
</code></pre>
</li>
</ol>
<h3>What Actually Changed?</h3>
<h4>1. Parent POM Update</h4>
<pre><code class="language-diff"> &lt;parent&gt;
     &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
     &lt;artifactId&gt;spring-boot-starter-parent&lt;/artifactId&gt;
-    &lt;version&gt;3.5.0&lt;/version&gt;
+    &lt;version&gt;4.0.5&lt;/version&gt;
     &lt;relativePath/&gt;
 &lt;/parent&gt;

 &lt;properties&gt;
-    &lt;java.version&gt;17&lt;/java.version&gt;
+    &lt;java.version&gt;25&lt;/java.version&gt;
 &lt;/properties&gt;
</code></pre>
<h4>2. Starter Modularization (api/pom.xml)</h4>
<p>Spring Boot 4.0 decomposed monolithic starters:</p>
<pre><code class="language-diff"> &lt;dependency&gt;
     &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
-    &lt;artifactId&gt;spring-boot-starter-web&lt;/artifactId&gt;
+    &lt;artifactId&gt;spring-boot-starter-webmvc&lt;/artifactId&gt;
 &lt;/dependency&gt;

 &lt;dependency&gt;
     &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
-    &lt;artifactId&gt;spring-boot-starter-test&lt;/artifactId&gt;
+    &lt;artifactId&gt;spring-boot-starter-webmvc-test&lt;/artifactId&gt;
     &lt;scope&gt;test&lt;/scope&gt;
 &lt;/dependency&gt;

+&lt;dependency&gt;
+    &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
+    &lt;artifactId&gt;spring-boot-starter-data-jpa-test&lt;/artifactId&gt;
+    &lt;scope&gt;test&lt;/scope&gt;
+&lt;/dependency&gt;
</code></pre>
<h4>3. Batch Module Updates (batch/pom.xml)</h4>
<pre><code class="language-diff"> &lt;dependency&gt;
     &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
-    &lt;artifactId&gt;spring-boot-starter-batch&lt;/artifactId&gt;
+    &lt;artifactId&gt;spring-boot-starter-batch-jdbc&lt;/artifactId&gt;
 &lt;/dependency&gt;

 &lt;dependency&gt;
-    &lt;groupId&gt;org.springframework.batch&lt;/groupId&gt;
-    &lt;artifactId&gt;spring-batch-test&lt;/artifactId&gt;
+    &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
+    &lt;artifactId&gt;spring-boot-starter-batch-test&lt;/artifactId&gt;
     &lt;scope&gt;test&lt;/scope&gt;
 &lt;/dependency&gt;
</code></pre>
<h4>4. Spring Batch 6.0 Package Restructure</h4>
<p>Spring Batch 6.0 reorganized core types into sub-packages:</p>
<pre><code class="language-diff">-import org.springframework.batch.core.Job;
-import org.springframework.batch.core.Step;
+import org.springframework.batch.core.job.Job;
+import org.springframework.batch.core.step.Step;

-import org.springframework.batch.item.ItemProcessor;
-import org.springframework.batch.item.ItemWriter;
+import org.springframework.batch.infrastructure.item.ItemProcessor;
+import org.springframework.batch.infrastructure.item.ItemWriter;

-import org.springframework.batch.item.data.RepositoryItemReader;
-import org.springframework.batch.item.data.builder.RepositoryItemReaderBuilder;
+import org.springframework.batch.infrastructure.item.data.RepositoryItemReader;
+import org.springframework.batch.infrastructure.item.data.builder.RepositoryItemReaderBuilder;
</code></pre>
<p>These changes were applied automatically across:</p>
<ul>
<li><p><code>ProductReportJobConfig.java</code></p>
</li>
<li><p><code>UserSyncJobConfig.java</code></p>
</li>
<li><p>Both batch job test files</p>
</li>
</ul>
<h3>The Final Result</h3>
<p>After the agent completes its work, you get:</p>
<p><strong>✅ Fully Upgraded Application</strong></p>
<ul>
<li><p>Spring Boot 3.5.0 → <strong>4.0.5</strong></p>
</li>
<li><p>Java 17 → <strong>25</strong></p>
</li>
<li><p>All dependencies updated to compatible versions</p>
</li>
</ul>
<p><strong>✅ Zero Vulnerabilities</strong></p>
<ul>
<li>Eliminated all 16 CVEs (5 High, 6 Medium, 5 Low)</li>
</ul>
<p><strong>✅ All Tests Passing</strong></p>
<ul>
<li><p>23/23 tests pass with zero failures</p>
</li>
<li><p>Test infrastructure updated to Spring Boot 4.0 conventions</p>
</li>
</ul>
<p><strong>✅ Comprehensive Documentation</strong></p>
<ul>
<li><p><code>UPGRADE_REPORT.md</code> with full change log</p>
</li>
<li><p>Before/after comparison tables</p>
</li>
<li><p>CVE remediation details</p>
</li>
</ul>
<h2>When to Use This Agent</h2>
<p>The Spring Boot Upgrader agent is ideal for:</p>
<p>✅ <strong>Major Version Upgrades</strong> - Spring Boot 3.x to 4.x migrations<br />✅ <strong>Multi-Module Projects</strong> - Handles complex Maven/Gradle structures<br />✅ <strong>Security Compliance</strong> - Eliminates known CVEs through upgrades<br />✅ <strong>CI/CD Modernization</strong> - Part of dependency update automation<br />✅ <strong>Java Version Migrations</strong> - Coordinated Java + framework upgrades</p>
<h2>⚠️ Important Considerations</h2>
<p>While the agent is powerful, keep in mind:</p>
<ol>
<li><p><strong>Always Review Changes</strong> - The agent makes extensive modifications. Review all changes before committing.</p>
</li>
<li><p><strong>Test Thoroughly</strong> - While the agent runs tests, you should perform additional integration and manual testing.</p>
</li>
<li><p><strong>Backup First</strong> - Commit your current state or work in a branch before running the upgrade.</p>
</li>
<li><p><strong>Custom Code</strong> - The agent handles framework migrations but can't understand all custom business logic. Manual review is required.</p>
</li>
<li><p><strong>Version Compatibility</strong> - Ensure your application is compatible with Java 25 and Spring Boot 4.0's requirements.</p>
</li>
</ol>
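<p>One minimal way to follow the "backup first" advice is to isolate the agent's work on a dedicated branch before invoking it. The snippet below sets up a throwaway repository purely for illustration; in a real project you would run only the last two commands, and the branch name is an arbitrary choice:</p>

```shell
set -e
cd "$(mktemp -d)"
git init -q my-awesome-app && cd my-awesome-app
# checkpoint the current state, then branch so the agent's edits stay isolated
git -c user.name=demo -c user.email=demo@example.com \
    commit --allow-empty -qm "checkpoint before Spring Boot 4 upgrade"
git checkout -qb upgrade/spring-boot-4
git branch --show-current   # prints: upgrade/spring-boot-4
```

<p>If the upgrade goes sideways, checking out your default branch gets you back to the pre-upgrade state instantly.</p>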
<h2>Inspecting the Upgrade Report</h2>
<p>The generated <code>UPGRADE_REPORT.md</code> contains:</p>
<pre><code class="language-markdown">## Executive Summary
- Before/after version matrix
- Dependency upgrade count
- Vulnerability remediation summary
- Test pass rates

## Version Changes
- Core framework versions
- Spring ecosystem dependencies
- Third-party library updates

## Migration Changes
- API breaking changes applied
- Package reorganizations
- Code transformations performed

## Test Results
- Module-by-module test results
- Failure details (if any)
- Coverage statistics

## Vulnerability Assessment
- Before/after CVE comparison
- Severity breakdown
- Remediation details

## Recommendations
- Post-upgrade tasks
- Performance considerations
- Further improvements
- Detected areas for enhancement (based on project analysis)
</code></pre>
<p>The agent intelligently analyzes your codebase during the upgrade process and may suggest areas for improvement—such as deprecated patterns it detected, opportunities for modernization, or performance optimizations that align with the new Spring Boot version's capabilities.</p>
<h2>Next Steps</h2>
<p>After a successful upgrade:</p>
<ol>
<li><p><strong>Review the changes</strong> carefully using <code>git diff</code></p>
</li>
<li><p><strong>Run your full test suite</strong> (unit, integration, E2E)</p>
</li>
<li><p><strong>Test in a staging environment</strong> before production</p>
</li>
<li><p><strong>Update your CI/CD pipelines</strong> for Java 25</p>
</li>
<li><p><strong>Review and commit</strong> the <code>UPGRADE_REPORT.md</code></p>
</li>
</ol>
<h2>Conclusion</h2>
<p>Custom GitHub Copilot agents represent a paradigm shift in how we approach complex, repetitive development tasks. The Spring Boot Upgrader agent demonstrates how domain-specific AI assistants can handle tasks that typically require hours of manual work—dependency analysis, migration guide research, code transformations, testing, and documentation—all in a single automated workflow.</p>
<p>By leveraging sub-agents for specialized tasks and orchestrating them intelligently, we can achieve results that are both faster and more reliable than manual upgrades.</p>
<h3>Try It Yourself</h3>
<ol>
<li><p><strong>Clone the agent repository</strong>: <a href="https://github.com/tanmoymandal/gh-copilot-agents">gh-copilot-agents</a></p>
</li>
<li><p><strong>Try the sample app</strong>: <a href="https://github.com/tanmoymandal/my-awesome-app">my-awesome-app</a></p>
</li>
<li><p><strong>Follow instructions in this article</strong> to see how you can test out this agent against the sample app</p>
</li>
</ol>
<h3>Resources</h3>
<ul>
<li><p><a href="https://docs.github.com/en/copilot">GitHub Copilot Documentation</a></p>
</li>
<li><p><a href="https://github.com/spring-projects/spring-boot/wiki/Spring-Boot-4.0-Release-Notes">Spring Boot 4.0 Release Notes</a></p>
</li>
<li><p><a href="https://spring.io/blog/2024/08/01/spring-batch-6-0-migration-guide">Spring Batch 6.0 Migration Guide</a></p>
</li>
<li><p><a href="https://github.com/tanmoymandal/gh-copilot-agents">Custom Agent Repository</a></p>
</li>
<li><p><a href="https://github.com/tanmoymandal/my-awesome-app">Sample Application</a></p>
</li>
</ul>
<hr />
<p><em>Have you used custom GitHub Copilot agents in your workflow? What tasks would you like to automate? Share your thoughts.</em></p>
<hr />
]]></content:encoded></item><item><title><![CDATA[Build a Production-Ready MCP Server with Spring Boot 4 & Spring AI 1.1]]></title><description><![CDATA[Give any AI agent full CRUD control over a database — in pure Java, zero Python.

Why This Matters
The Model Context Protocol (MCP) landed in late 2024 and spread fast. Within months, every major AI c]]></description><link>https://tanmoymandal.dev/build-a-production-ready-mcp-server-with-spring-boot-4-spring-ai-1-1</link><guid isPermaLink="true">https://tanmoymandal.dev/build-a-production-ready-mcp-server-with-spring-boot-4-spring-ai-1-1</guid><category><![CDATA[Springboot]]></category><category><![CDATA[AI]]></category><category><![CDATA[tools]]></category><category><![CDATA[mcp]]></category><category><![CDATA[claude.ai]]></category><category><![CDATA[springboot4]]></category><dc:creator><![CDATA[Tanmoy Mandal]]></dc:creator><pubDate>Tue, 24 Mar 2026 00:42:14 GMT</pubDate><content:encoded><![CDATA[<p><em>Give any AI agent full CRUD control over a database — in pure Java, zero Python.</em></p>
<hr />
<h2>Why This Matters</h2>
<p>The Model Context Protocol (MCP) landed in late 2024 and spread fast. Within months, every major AI client — Claude Desktop, Cursor, Windsurf, and dozens of others — adopted it as the standard way for LLMs to talk to external tools.</p>
<p>The problem? Nearly all the tutorials are Python.</p>
<p>The Java ecosystem is massively underserved here. If you're running Spring Boot microservices — and a huge portion of the enterprise world is — you deserve a first-class, idiomatic MCP story. That's what this article delivers.</p>
<p>By the end, you'll have:</p>
<ul>
<li><p>A fully working <strong>MCP server</strong> exposing 10 Todo management tools</p>
</li>
<li><p>A <strong>Spring AI 1.1.2</strong> + <strong>Spring Boot 4.0.x</strong> setup using the official starters</p>
</li>
<li><p>Clean <strong>service/tools separation</strong> so your business logic stays testable</p>
</li>
<li><p><strong>Unit tests</strong> with JUnit 5 + Mockito</p>
</li>
<li><p>A clear mental model of how MCP fits into your architecture</p>
</li>
</ul>
<hr />
<h2>What Is MCP? (The 90-Second Version)</h2>
<p>MCP is a client–server protocol that lets AI models call external tools in a standardized way. Think of it as <strong>USB-C for AI integrations</strong> — one protocol, any tool, any client.</p>
<pre><code class="language-plaintext">MCP Client (Claude Desktop / Cursor / your app)
        │
        │  SSE or STDIO transport
        │
MCP Server (your Spring Boot app)
        │
        ├── Tool: createTodo
        ├── Tool: getAllTodos
        ├── Tool: completeTodo
        └── Tool: getStats
</code></pre>
<p>When a user asks Claude <em>"What are my critical tasks today?"</em>, the LLM:</p>
<ol>
<li><p>Recognizes it needs external data</p>
</li>
<li><p>Calls your <code>getTodosByPriority</code> tool with <code>priority="CRITICAL"</code></p>
</li>
<li><p>Reads the JSON response</p>
</li>
<li><p>Formulates a natural language answer</p>
</li>
</ol>
<p>Your Java code runs. The AI gets the data. The user gets a useful answer. No hallucination, no guessing.</p>
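<p>Under the hood this is a JSON-RPC 2.0 exchange. The tool call in step 2 travels over the wire roughly like this (field values are illustrative):</p>

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "getTodosByPriority",
    "arguments": { "priority": "CRITICAL" }
  }
}
```

<p>The server's response carries your tool's return value as text content, which the model then reads and summarizes for the user.</p>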
<hr />
<h2>Project Overview</h2>
<p><strong>Stack:</strong></p>
<ul>
<li><p>Spring Boot <strong>4.0.3</strong></p>
</li>
<li><p>Spring AI <strong>1.1.2</strong> (latest stable as of March 2026)</p>
</li>
<li><p>Spring Data JPA + <strong>H2</strong> (swap to PostgreSQL for production)</p>
</li>
<li><p>Java <strong>25</strong></p>
</li>
</ul>
<p><strong>MCP Tools exposed:</strong></p>
<table>
<thead>
<tr>
<th>Tool</th>
<th>Description</th>
</tr>
</thead>
<tbody><tr>
<td><code>createTodo</code></td>
<td>Create a new task with title, description, priority</td>
</tr>
<tr>
<td><code>getTodoById</code></td>
<td>Fetch a single task by id</td>
</tr>
<tr>
<td><code>getAllTodos</code></td>
<td>List all tasks</td>
</tr>
<tr>
<td><code>getTodosByStatus</code></td>
<td>Filter by PENDING / IN_PROGRESS / COMPLETED / CANCELLED</td>
</tr>
<tr>
<td><code>getTodosByPriority</code></td>
<td>Filter by LOW / MEDIUM / HIGH / CRITICAL</td>
</tr>
<tr>
<td><code>searchTodos</code></td>
<td>Full-text search on title and description</td>
</tr>
<tr>
<td><code>updateTodo</code></td>
<td>Partial update — only pass fields you want to change</td>
</tr>
<tr>
<td><code>completeTodo</code></td>
<td>Mark a task done, records completion timestamp</td>
</tr>
<tr>
<td><code>deleteTodo</code></td>
<td>Permanently remove a task</td>
</tr>
<tr>
<td><code>getStats</code></td>
<td>Aggregate counts — great for dashboard summaries</td>
</tr>
</tbody></table>
<hr />
<h2>Project Structure</h2>
<pre><code class="language-plaintext">todo-mcp-server/
├── pom.xml
└── src/main/java/com/tanmoymandal/mcp/todo/
    ├── TodoMcpServerApplication.java       ← main class
    ├── model/
    │   └── Todo.java                       ← JPA entity + enums
    ├── repository/
    │   └── TodoRepository.java             ← Spring Data JPA
    ├── service/
    │   └── TodoService.java                ← business logic (no MCP dependency)
    ├── tools/
    │   └── TodoMcpTools.java               ← @Tool annotations live here
    └── config/
        ├── McpServerConfig.java            ← registers tools with MCP
        └── DataInitializer.java            ← seeds demo data on startup
</code></pre>
<blockquote>
<p><strong>Key design decision:</strong> <code>TodoService</code> has zero dependency on Spring AI. The <code>TodoMcpTools</code> class is the adapter between the AI world and your domain. This separation keeps your business logic independently testable and reusable outside of MCP context.</p>
</blockquote>
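<p><code>TodoService</code> itself isn't reproduced in this article, but its shape is what makes the separation work. Below is a hypothetical, heavily simplified sketch in which an in-memory map stands in for the JPA repository; the point is that the class compiles and tests with no Spring AI or MCP types anywhere:</p>

```java
import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: the real TodoService delegates to TodoRepository,
// but the business rules look the same either way.
class TodoService {

    record Todo(long id, String title, String status, LocalDateTime completedAt) {}

    private final Map<Long, Todo> store = new HashMap<>();
    private final AtomicLong ids = new AtomicLong();

    // create a PENDING todo and assign it a generated id
    Todo create(String title) {
        Todo todo = new Todo(ids.incrementAndGet(), title, "PENDING", null);
        store.put(todo.id(), todo);
        return todo;
    }

    // mark a todo COMPLETED, recording the completion timestamp;
    // an empty Optional means the id was unknown
    Optional<Todo> complete(long id) {
        return Optional.ofNullable(store.get(id)).map(existing -> {
            Todo done = new Todo(existing.id(), existing.title(),
                    "COMPLETED", LocalDateTime.now());
            store.put(id, done);
            return done;
        });
    }
}
```

<p>Because nothing here knows about MCP, the same service could just as easily back a REST controller or a CLI; <code>TodoMcpTools</code> is only one adapter over it.</p>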
<hr />
<h2>Step 1 — The <code>pom.xml</code></h2>
<p>The single most important thing here is the <strong>Spring AI BOM</strong> — it ensures all Spring AI artifacts are version-compatible with no manual version juggling.</p>
<pre><code class="language-xml">&lt;parent&gt;
    &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
    &lt;artifactId&gt;spring-boot-starter-parent&lt;/artifactId&gt;
    &lt;version&gt;4.0.3&lt;/version&gt;
&lt;/parent&gt;

&lt;properties&gt;
    &lt;java.version&gt;25&lt;/java.version&gt;
    &lt;spring-ai.version&gt;1.1.2&lt;/spring-ai.version&gt;
&lt;/properties&gt;

&lt;dependencyManagement&gt;
    &lt;dependencies&gt;
        &lt;dependency&gt;
            &lt;groupId&gt;org.springframework.ai&lt;/groupId&gt;
            &lt;artifactId&gt;spring-ai-bom&lt;/artifactId&gt;
            &lt;version&gt;${spring-ai.version}&lt;/version&gt;
            &lt;type&gt;pom&lt;/type&gt;
            &lt;scope&gt;import&lt;/scope&gt;
        &lt;/dependency&gt;
    &lt;/dependencies&gt;
&lt;/dependencyManagement&gt;

&lt;dependencies&gt;
    &lt;!-- MCP Server over HTTP/SSE using Spring MVC --&gt;
    &lt;dependency&gt;
        &lt;groupId&gt;org.springframework.ai&lt;/groupId&gt;
        &lt;artifactId&gt;spring-ai-starter-mcp-server-webmvc&lt;/artifactId&gt;
    &lt;/dependency&gt;

    &lt;!-- Web layer (required by webmvc transport) --&gt;
    &lt;dependency&gt;
        &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
        &lt;artifactId&gt;spring-boot-starter-web&lt;/artifactId&gt;
    &lt;/dependency&gt;

    &lt;!-- JPA + H2 --&gt;
    &lt;dependency&gt;
        &lt;groupId&gt;org.springframework.boot&lt;/groupId&gt;
        &lt;artifactId&gt;spring-boot-starter-data-jpa&lt;/artifactId&gt;
    &lt;/dependency&gt;
    &lt;dependency&gt;
        &lt;groupId&gt;com.h2database&lt;/groupId&gt;
        &lt;artifactId&gt;h2&lt;/artifactId&gt;
        &lt;scope&gt;runtime&lt;/scope&gt;
    &lt;/dependency&gt;
&lt;/dependencies&gt;
</code></pre>
<blockquote>
<p><strong>Why</strong> <code>spring-ai-starter-mcp-server-webmvc</code><strong>?</strong> This starter provides the HTTP/SSE transport layer — the way most MCP clients connect to remote servers. It auto-configures the <code>/sse</code> and <code>/mcp/messages</code> endpoints for you. The alternative <code>spring-ai-starter-mcp-server</code> is STDIO only (suitable for local subprocess invocation). Use <code>webmvc</code> for anything network-accessible.</p>
</blockquote>
<hr />
<h2>Step 2 — <code>application.yml</code></h2>
<pre><code class="language-yaml">server:
  port: 8080

spring:
  datasource:
    url: jdbc:h2:mem:tododb;DB_CLOSE_DELAY=-1
    driver-class-name: org.h2.Driver
    username: sa
    password:
  jpa:
    hibernate:
      ddl-auto: create-drop
  h2:
    console:
      enabled: true

  ai:
    mcp:
      server:
        name: todo-mcp-server
        version: 1.0.0
        type: SYNC
        instructions: |
          This server manages Todo/Task items.
          Tools available: createTodo, getTodoById, getAllTodos,
          getTodosByStatus, getTodosByPriority, searchTodos,
          updateTodo, completeTodo, deleteTodo, getStats.
        capabilities:
          tool: true
</code></pre>
<p>The <code>instructions</code> field is important — it's sent to the AI client during the handshake and helps the LLM understand what your server does before it even reads the individual tool descriptions.</p>
<hr />
<h2>Step 3 — The Entity</h2>
<pre><code class="language-java">@Entity
@Table(name = "todos")
public class Todo {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @NotBlank
    private String title;

    @Column(length = 2000)
    private String description;

    @Enumerated(EnumType.STRING)
    private Priority priority = Priority.MEDIUM;

    @Enumerated(EnumType.STRING)
    private Status status = Status.PENDING;

    private LocalDateTime createdAt;
    private LocalDateTime updatedAt;
    private LocalDateTime completedAt;

    public enum Priority { LOW, MEDIUM, HIGH, CRITICAL }
    public enum Status   { PENDING, IN_PROGRESS, COMPLETED, CANCELLED }

    @PrePersist
    protected void onCreate() {
        createdAt = updatedAt = LocalDateTime.now();
    }

    @PreUpdate
    protected void onUpdate() {
        updatedAt = LocalDateTime.now();
    }

    // ... getters/setters
}
</code></pre>
<hr />
<h2>Step 4 — The Tools Class (The Heart of It All)</h2>
<p>This is where Spring AI's <code>@Tool</code> annotation does the heavy lifting. Every annotated method becomes a discoverable MCP tool with an auto-generated JSON Schema for its parameters.</p>
<pre><code class="language-java">@Service
public class TodoMcpTools {

    private final TodoService service;
    private final ObjectMapper objectMapper;

    public TodoMcpTools(TodoService service, ObjectMapper objectMapper) {
        this.service = service;
        this.objectMapper = objectMapper;
    }

    @Tool(name = "createTodo",
          description = """
              Creates a new Todo item. Returns the created Todo as JSON
              including the auto-generated id.
              Priority: LOW, MEDIUM, HIGH, CRITICAL (defaults to MEDIUM).
              """)
    public String createTodo(
            @ToolParam(description = "Short, descriptive title. Required.")
            String title,

            @ToolParam(description = "Optional detailed description.", required = false)
            String description,

            @ToolParam(description = "Priority: LOW, MEDIUM, HIGH, CRITICAL.", required = false)
            String priority) {

        try {
            Todo todo = service.create(title, description, priority);
            return toJson(todoToMap(todo));
        } catch (Exception e) {
            return errorJson("createTodo failed: " + e.getMessage());
        }
    }

    @Tool(name = "completeTodo",
          description = """
              Marks a Todo as COMPLETED and records the completion timestamp.
              Returns the updated Todo as JSON.
              """)
    public String completeTodo(
            @ToolParam(description = "The numeric id of the todo to complete.")
            Long id) {

        return service.complete(id)
                .map(t -&gt; toJson(todoToMap(t)))
                .orElseGet(() -&gt; errorJson("Todo with id=%d not found".formatted(id)));
    }

    @Tool(name = "getStats",
          description = """
              Returns aggregate statistics: counts by status and priority.
              Great for summaries and dashboards.
              """)
    public String getStats() {
        return toJson(service.getStats());
    }

    // ... remaining tools follow the same pattern
}
</code></pre>
<blockquote>
<p><strong>Write tool descriptions for the LLM, not for humans.</strong> Be explicit about valid enum values, what null means, and exactly what the return value contains. The LLM reads these descriptions to decide which tool to call and how to call it correctly.</p>
</blockquote>
<hr />
<h2>Step 5 — Register the Tools</h2>
<pre><code class="language-java">@Configuration
public class McpServerConfig {

    @Bean
    public ToolCallbackProvider todoToolCallbacks(TodoMcpTools todoMcpTools) {
        return MethodToolCallbackProvider.builder()
                .toolObjects(todoMcpTools)
                .build();
    }
}
</code></pre>
<p>That's it. <code>MethodToolCallbackProvider</code> reflects over your <code>@Tool</code> methods, generates the MCP tool descriptors with full JSON Schema from the method signatures, and Spring AI's auto-configuration registers them on the MCP endpoint automatically.</p>
<hr />
<h2>Step 6 — Run It</h2>
<pre><code class="language-bash">./mvnw spring-boot:run
</code></pre>
<p>You'll see in the logs:</p>
<pre><code class="language-plaintext">Registered MCP tools: [createTodo, getTodoById, getAllTodos,
  getTodosByStatus, getTodosByPriority, searchTodos,
  updateTodo, completeTodo, deleteTodo, getStats]
Started TodoMcpServerApplication on port 8080
</code></pre>
<p>The server is live at:</p>
<ul>
<li><p><strong>SSE endpoint:</strong> <code>http://localhost:8080/sse</code></p>
</li>
<li><p><strong>H2 Console:</strong> <code>http://localhost:8080/h2-console</code></p>
</li>
<li><p><strong>Health check:</strong> <code>http://localhost:8080/actuator/health</code></p>
</li>
</ul>
<hr />
<h2>Connecting Claude Desktop</h2>
<p>Add this to your Claude Desktop <code>claude_desktop_config.json</code>:</p>
<pre><code class="language-json">{
  "mcpServers": {
    "todo-manager": {
      "type": "sse",
      "url": "http://localhost:8080/sse"
    }
  }
}
</code></pre>
<p>Restart Claude Desktop. You'll see a hammer icon in the chat UI — that means your tools are live. Now ask:</p>
<blockquote>
<p><em>"What are my critical priority tasks?"</em></p>
<p><em>"Mark task 5 as complete."</em></p>
<p><em>"Give me a summary of all my todos."</em></p>
</blockquote>
<p>Claude will call your Spring Boot server, get real data back, and ground its answers in that data instead of guessing.</p>
<hr />
<h2>Unit Testing the Service Layer</h2>
<p>Because <code>TodoService</code> has no Spring AI dependency, it tests exactly like any other Spring service:</p>
<pre><code class="language-java">@ExtendWith(MockitoExtension.class)
class TodoServiceTest {

    @Mock  TodoRepository repository;
    @InjectMocks TodoService service;

    @Test
    void create_shouldPersistAndReturnTodo() {
        Todo expected = new Todo("Fix bug", "NPE on line 42", Todo.Priority.HIGH);
        expected.setId(1L);
        when(repository.save(any())).thenReturn(expected);

        Todo result = service.create("Fix bug", "NPE on line 42", "HIGH");

        assertThat(result.getTitle()).isEqualTo("Fix bug");
        assertThat(result.getPriority()).isEqualTo(Todo.Priority.HIGH);
        verify(repository).save(any(Todo.class));
    }

    @Test
    void complete_shouldSetStatusAndTimestamp() {
        Todo todo = new Todo("Task", null, Todo.Priority.MEDIUM);
        todo.setId(1L);
        when(repository.findById(1L)).thenReturn(Optional.of(todo));
        when(repository.save(any())).thenAnswer(i -&gt; i.getArguments()[0]);

        Optional&lt;Todo&gt; result = service.complete(1L);

        assertThat(result).isPresent();
        assertThat(result.get().getStatus()).isEqualTo(Todo.Status.COMPLETED);
        assertThat(result.get().getCompletedAt()).isNotNull();
    }
}
</code></pre>
<hr />
<h2>Production Considerations</h2>
<p>When you're ready to move beyond the demo, here's what to address:</p>
<p><strong>Database:</strong> Swap H2 for PostgreSQL by replacing the datasource config and adding the PostgreSQL driver. Zero code changes needed — that's the beauty of Spring Data JPA.</p>
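<p>As a rough sketch (the host, database name, and credentials below are placeholders, not from this project), the swap amounts to the PostgreSQL driver dependency plus configuration along these lines:</p>
<pre><code class="language-yaml">spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/tododb   # placeholder host and database
    username: todo_user                            # placeholder credentials
    password: ${DB_PASSWORD}
  jpa:
    hibernate:
      ddl-auto: validate   # prefer Flyway/Liquibase migrations over create-drop in production
</code></pre>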
<p><strong>Security:</strong> The MCP spec requires OAuth2 for HTTP-exposed servers. Spring AI 1.1.x has a companion <code>mcp-server-security</code> module. Add it and a <code>SecurityFilterChain</code> bean — the Spring team published a detailed walkthrough on the Spring blog.</p>
<p><strong>Observability:</strong> Add <code>spring-boot-starter-actuator</code> with Micrometer. Your MCP tool call counts, latencies, and error rates surface as Prometheus metrics automatically.</p>
<p><strong>Packaging:</strong> Build a Docker image with <code>./mvnw spring-boot:build-image</code> — Spring Boot's Cloud Native Buildpacks produce a production-grade container with no Dockerfile required.</p>
<hr />
<h2>What We Built</h2>
<p>In one Spring Boot application we've produced an MCP server that:</p>
<ul>
<li><p>Exposes <strong>10 fully-described tools</strong> discoverable by any MCP client</p>
</li>
<li><p>Uses <strong>standard Spring idioms</strong> — JPA, <code>@Service</code>, <code>@Bean</code>, <code>@Transactional</code></p>
</li>
<li><p>Keeps business logic <strong>completely decoupled</strong> from the AI/MCP layer</p>
</li>
<li><p>Is <strong>unit-testable</strong> without any AI dependencies</p>
</li>
<li><p>Seeds itself with demo data for instant exploration</p>
</li>
<li><p>Connects to <strong>Claude Desktop, Cursor, or any MCP-compatible client</strong> in 30 seconds</p>
</li>
</ul>
<p>The Java ecosystem is ready for MCP. Spring AI 1.1.x gives you a first-class, annotation-driven path that feels exactly like the Spring you already know — no Python required.</p>
<hr />
<h2>Full Source Code</h2>
<p>The complete project is available on GitHub: <a href="https://github.com/tanmoymandal/todo-mcp-server">https://github.com/tanmoymandal/todo-mcp-server</a></p>
<hr />
<p><strong>Tags:</strong> <code>#Java</code> <code>#SpringBoot</code> <code>#SpringAI</code> <code>#MCP</code> <code>#ModelContextProtocol</code> <code>#AI</code> <code>#LLM</code> <code>#Claude</code></p>
]]></content:encoded></item><item><title><![CDATA[The Ultimate AI Glossary: 60+ Terms Every Developer Should Know in 2026]]></title><description><![CDATA[From Transformers to RAG, Agents to Embeddings — decoded.

Whether you're diving into your first machine learning project or architecting enterprise AI systems, the landscape of AI terminology can fee]]></description><link>https://tanmoymandal.dev/the-ultimate-ai-glossary-60-terms-every-developer-should-know-in-2026</link><guid isPermaLink="true">https://tanmoymandal.dev/the-ultimate-ai-glossary-60-terms-every-developer-should-know-in-2026</guid><category><![CDATA[AI]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[llm]]></category><category><![CDATA[Deep Learning]]></category><category><![CDATA[generative ai]]></category><category><![CDATA[ai-glossary]]></category><category><![CDATA[mlops]]></category><category><![CDATA[RAG ]]></category><category><![CDATA[transformers]]></category><dc:creator><![CDATA[Tanmoy Mandal]]></dc:creator><pubDate>Tue, 17 Mar 2026 14:00:47 GMT</pubDate><content:encoded><![CDATA[<hr />
<p><em>From Transformers to RAG, Agents to Embeddings — decoded.</em></p>
<hr />
<p>Whether you're diving into your first machine learning project or architecting enterprise AI systems, the landscape of AI terminology can feel overwhelming. This glossary cuts through the noise with clear, developer-friendly definitions — organized by category so you can jump straight to what you need.</p>
<p>Bookmark this. You'll be back.</p>
<hr />
<h2>🧠 Foundation: Core AI Concepts</h2>
<h3>Artificial Intelligence (AI)</h3>
<p>The broad field of building systems that can perform tasks typically requiring human intelligence — reasoning, understanding language, recognizing patterns, and making decisions. AI is the umbrella; everything else in this glossary lives under it.</p>
<h3>Machine Learning (ML)</h3>
<p>A subset of AI where systems <em>learn from data</em> rather than following explicitly programmed rules. Instead of writing <code>if (temperature &gt; 100) return "hot"</code>, you feed examples and let the algorithm figure out the pattern.</p>
<h3>Deep Learning (DL)</h3>
<p>A subset of machine learning that uses <em>neural networks with many layers</em> (hence "deep"). Deep learning powers most modern breakthroughs — image recognition, speech synthesis, large language models.</p>
<h3>Neural Network</h3>
<p>A computational model loosely inspired by the human brain. It consists of interconnected <em>nodes (neurons)</em> organized in layers that transform input data into output predictions. Each connection has a <em>weight</em> that gets tuned during training.</p>
<h3>Algorithm</h3>
<p>A set of rules or instructions that a model follows to make decisions or learn from data. In ML, the algorithm defines <em>how</em> the model learns — gradient descent, backpropagation, etc.</p>
<h3>Model</h3>
<p>The trained artifact that results from running an ML algorithm on data. When people say "deploy the model," they mean the weights and architecture that now encode learned knowledge.</p>
<hr />
<h2>📊 Data &amp; Training</h2>
<h3>Training Data</h3>
<p>The dataset used to teach a model. Quality and quantity both matter enormously. Biased training data → biased model. Insufficient training data → an underfit model.</p>
<h3>Test Data / Validation Data</h3>
<p>Held-out datasets used to evaluate model performance <em>after</em> training. Validation data guides hyperparameter tuning during training; test data gives the final performance estimate.</p>
<h3>Overfitting</h3>
<p>When a model learns the training data <em>too well</em> — including its noise and quirks — and fails to generalize to new data. Classic symptom: 99% training accuracy, 60% test accuracy.</p>
<h3>Underfitting</h3>
<p>The opposite problem: the model is too simple to capture the underlying patterns. Both low training accuracy and low test accuracy.</p>
<h3>Supervised Learning</h3>
<p>Training with <em>labeled examples</em> — input/output pairs. The model learns to map inputs to correct outputs. Most classification and regression tasks are supervised.</p>
<h3>Unsupervised Learning</h3>
<p>Training on <em>unlabeled data</em> to discover hidden structure. Clustering (grouping similar items) and dimensionality reduction are common unsupervised tasks.</p>
<h3>Reinforcement Learning (RL)</h3>
<p>A paradigm where an <em>agent</em> learns by taking actions in an environment and receiving <em>rewards or penalties</em>. Used in game-playing AIs (AlphaGo) and increasingly in fine-tuning LLMs.</p>
<h3>Fine-Tuning</h3>
<p>Taking a pre-trained model and continuing to train it on a smaller, task-specific dataset. Much cheaper than training from scratch and usually yields excellent results for specialized domains.</p>
<h3>RLHF (Reinforcement Learning from Human Feedback)</h3>
<p>A fine-tuning technique where human raters score model outputs, and those scores train a <em>reward model</em> that guides further RL training. Core to how models like Claude and ChatGPT are aligned.</p>
<h3>Batch Size</h3>
<p>The number of training examples processed together before updating model weights. Larger batches = more stable gradients but more memory. Smaller batches = noisier gradients but potentially better generalization.</p>
<h3>Epoch</h3>
<p>One complete pass through the entire training dataset. Training typically runs for multiple epochs.</p>
<h3>Learning Rate</h3>
<p>A hyperparameter that controls <em>how much</em> to adjust model weights per update. Too high = unstable training. Too low = painfully slow convergence.</p>
<hr />
<h2>🤖 Large Language Models (LLMs)</h2>
<h3>Large Language Model (LLM)</h3>
<p>A neural network trained on massive text corpora to understand and generate human language. "Large" refers to billions of parameters. GPT-4, Claude, Gemini, and Llama are LLMs.</p>
<h3>Transformer</h3>
<p>The neural network architecture that powers virtually all modern LLMs. Introduced in the 2017 paper <em>"Attention Is All You Need"</em>, it replaced RNNs with a mechanism called <em>self-attention</em> that processes all tokens simultaneously.</p>
<h3>Attention Mechanism</h3>
<p>The core innovation of Transformers. Allows the model to weigh the importance of different parts of the input when generating each output token. "Attending" to the right context is what makes LLMs coherent.</p>
<h3>Token</h3>
<p>The basic unit of text that LLMs process. Not quite words — tokens are chunks of characters (e.g., "transformer" might be one token; "unbelievable" might be two). Most LLMs use ~4 characters per token on average.</p>
<h3>Context Window</h3>
<p>The maximum number of tokens an LLM can process in a single interaction — both input and output combined. GPT-4 Turbo has 128K tokens; Claude has up to 200K. Larger context = better for long documents.</p>
<h3>Prompt</h3>
<p>The input text you send to an LLM to get a response. Prompt design significantly affects output quality — hence the discipline of <em>prompt engineering</em>.</p>
<h3>Prompt Engineering</h3>
<p>The art and science of crafting prompts to elicit better responses from LLMs. Techniques include chain-of-thought prompting, few-shot examples, role assignment, and structured output requests.</p>
<h3>Few-Shot Prompting</h3>
<p>Including a few examples of the task in the prompt (e.g., 2–5 input/output pairs) to help the model understand what you want without any fine-tuning.</p>
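<p>A minimal illustration (the reviews here are invented): the labeled examples establish the pattern, and the unlabeled final line prompts the model to continue it.</p>
<pre><code class="language-plaintext">Classify the sentiment of each review as positive or negative.

Review: "Absolutely loved it, would buy again."  Sentiment: positive
Review: "Broke after two days."  Sentiment: negative
Review: "Exceeded my expectations."  Sentiment:
</code></pre>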
<h3>Zero-Shot Prompting</h3>
<p>Asking the model to perform a task with <em>no examples</em> — just a description. Works surprisingly well with modern LLMs due to their broad pre-training.</p>
<h3>Chain-of-Thought (CoT)</h3>
<p>A prompting technique where you ask the model to reason step-by-step before giving its final answer. Dramatically improves performance on multi-step reasoning and math problems.</p>
<h3>Temperature</h3>
<p>A parameter (0.0 to 2.0) that controls output randomness. Temperature 0 = deterministic, always picks the most likely token. Temperature 1+ = more creative and varied. For code generation, use low temperatures; for brainstorming, use higher.</p>
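<p>Conceptually, temperature divides the model's raw scores (logits) before the softmax turns them into probabilities. A minimal sketch with toy logits (not from any real model):</p>
<pre><code class="language-java">public class TemperatureDemo {

    // Scale logits by 1/T, then softmax: lower T sharpens the distribution,
    // higher T flattens it. Subtracting the max logit is for numeric stability.
    static double[] softmax(double[] logits, double temperature) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : logits) max = Math.max(max, l);

        double[] out = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i != logits.length; i++) {
            out[i] = Math.exp((logits[i] - max) / temperature);
            sum += out[i];
        }
        for (int i = 0; i != out.length; i++) out[i] /= sum;
        return out;
    }

    public static void main(String[] args) {
        double[] logits = { 2.0, 1.0, 0.5 };
        double[] cold = softmax(logits, 0.2);  // near-deterministic
        double[] hot  = softmax(logits, 1.5);  // flatter, more varied
        System.out.printf("T=0.2 top prob: %.3f%n", cold[0]);
        System.out.printf("T=1.5 top prob: %.3f%n", hot[0]);
    }
}
</code></pre>
<p>At T=0.2 the top token takes nearly all of the probability mass; at T=1.5 the distribution flattens, so sampling becomes noticeably more varied.</p>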
<h3>Top-P (Nucleus Sampling)</h3>
<p>An alternative to temperature for controlling randomness. Instead of adjusting probabilities, it restricts sampling to the smallest set of tokens whose cumulative probability exceeds P. <code>top_p=0.9</code> is a common default.</p>
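<p>The selection rule is simple to sketch: sort the token probabilities, then keep the smallest top slice whose cumulative probability reaches P (toy distribution below, not from any real model):</p>
<pre><code class="language-java">import java.util.Arrays;

public class TopPDemo {

    // Returns how many of the highest-probability tokens survive the
    // nucleus cutoff: the smallest prefix whose cumulative probability
    // reaches p. Everything outside the nucleus is excluded from sampling.
    static int nucleusSize(double[] probs, double p) {
        double[] sorted = probs.clone();
        Arrays.sort(sorted);               // ascending order
        double cum = 0.0;
        int kept = 0;
        for (int i = sorted.length - 1; i >= 0; i--) {
            cum += sorted[i];
            kept++;
            if (cum >= p) break;
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] probs = { 0.55, 0.25, 0.12, 0.05, 0.03 };
        // Top 3 tokens cover 0.92 cumulative probability, so top_p=0.9 keeps 3.
        System.out.println(nucleusSize(probs, 0.9));
    }
}
</code></pre>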
<h3>Hallucination</h3>
<p>When an LLM confidently generates factually incorrect information. A fundamental challenge in LLMs — they optimize for <em>plausible</em> text, not necessarily <em>true</em> text.</p>
<h3>Grounding</h3>
<p>Connecting model outputs to verifiable, external sources of truth to reduce hallucinations. Retrieval-Augmented Generation (RAG) is the primary grounding technique.</p>
<hr />
<h2>🔍 RAG &amp; Retrieval</h2>
<h3>RAG (Retrieval-Augmented Generation)</h3>
<p>An architecture that combines an LLM with a <em>retrieval system</em>. Instead of relying solely on training knowledge, the model retrieves relevant documents at inference time and uses them as context. Dramatically reduces hallucinations for knowledge-intensive tasks.</p>
<h3>Vector Database</h3>
<p>A database optimized for storing and querying <em>embeddings</em> (high-dimensional vectors). Used heavily in RAG systems to find semantically similar documents. Examples: Pinecone, Weaviate, Chroma, pgvector.</p>
<h3>Embedding</h3>
<p>A numerical vector representation of text (or images, audio, etc.) that captures semantic meaning. Similar concepts cluster together in embedding space. <code>"cat"</code> and <code>"kitten"</code> will have similar embeddings; <code>"cat"</code> and <code>"blockchain"</code> will not.</p>
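<p>Similarity in embedding space is usually measured with cosine similarity. A toy sketch with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):</p>
<pre><code class="language-java">public class CosineSimilarity {

    // Cosine similarity: dot(a, b) / (norm(a) * norm(b)).
    // 1.0 means same direction; values near 0 mean unrelated.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, na = 0.0, nb = 0.0;
        for (int i = 0; i != a.length; i++) {
            dot += a[i] * b[i];
            na  += a[i] * a[i];
            nb  += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        // Invented vectors chosen so related concepts point the same way.
        double[] cat    = { 0.90, 0.80, 0.10 };
        double[] kitten = { 0.85, 0.75, 0.15 };
        double[] chain  = { 0.10, 0.20, 0.90 };
        System.out.printf("cat vs kitten:     %.3f%n", cosine(cat, kitten));
        System.out.printf("cat vs blockchain: %.3f%n", cosine(cat, chain));
    }
}
</code></pre>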
<h3>Semantic Search</h3>
<p>Search based on <em>meaning</em> rather than keyword matching. Uses embeddings to find documents that are conceptually relevant, even if they don't share the exact same words.</p>
<h3>Chunking</h3>
<p>The process of splitting large documents into smaller pieces before embedding and storing them in a vector database. Chunk size is a critical tuning parameter in RAG — too large loses precision, too small loses context.</p>
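<p>A minimal fixed-size chunker with overlap, character-based for brevity (production splitters usually work on sentences or tokens, and the size/overlap values below are arbitrary):</p>
<pre><code class="language-java">public class Chunker {

    // Fixed-size chunking with overlap: adjacent chunks share the last
    // `overlap` characters so text cut at a boundary survives in one piece.
    static String[] chunk(String text, int size, int overlap) {
        int step = size - overlap;
        int n = (Math.max(text.length() - overlap, 1) + step - 1) / step;
        String[] chunks = new String[n];
        for (int i = 0; i != n; i++) {
            int start = i * step;
            int end = Math.min(start + size, text.length());
            chunks[i] = text.substring(start, end);
        }
        return chunks;
    }

    public static void main(String[] args) {
        String doc = "abcdefghij";            // stand-in for a long document
        for (String part : chunk(doc, 4, 1)) {
            System.out.println(part);         // prints abcd, defg, ghij
        }
    }
}
</code></pre>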
<h3>Reranking</h3>
<p>A second-pass step in RAG that takes the top-k retrieved chunks and re-scores them using a more powerful (but slower) cross-encoder model, before passing the best results to the LLM.</p>
<hr />
<h2>🛠️ Agents &amp; Tools</h2>
<h3>AI Agent</h3>
<p>An LLM-powered system that can <em>reason, plan, and take actions</em> autonomously — not just generate text. Agents decide what tools to call, observe results, and iterate until a goal is achieved.</p>
<h3>Tool Use / Function Calling</h3>
<p>The ability for an LLM to call external functions or APIs as part of generating a response. The model outputs a structured "call this function with these arguments" rather than raw text.</p>
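<p>Field names differ across providers, but the shape is always structured data rather than prose. A hypothetical call (the tool name and arguments here are illustrative, not any specific vendor's schema) might look like:</p>
<pre><code class="language-json">{
  "type": "tool_call",
  "name": "getWeather",
  "arguments": {
    "city": "Kolkata",
    "unit": "celsius"
  }
}
</code></pre>
<p>Your application executes the named function with those arguments and feeds the result back to the model, which then continues the response.</p>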
<h3>Agentic Loop</h3>
<p>The iterative cycle an AI agent follows: observe → think → act → observe → repeat, until the task is complete or a stopping condition is met.</p>
<h3>Multi-Agent System</h3>
<p>An architecture where multiple specialized AI agents collaborate — one might browse the web, another writes code, another reviews it. Frameworks like LangGraph and AutoGen implement this.</p>
<h3>ReAct (Reason + Act)</h3>
<p>A prompting framework for agents that interleaves reasoning ("Thought: ...") with actions ("Action: search[...]") and observations. Makes agent behavior more transparent and debuggable.</p>
<h3>MCP (Model Context Protocol)</h3>
<p>An open protocol developed by Anthropic that standardizes how AI models connect to external tools, data sources, and services. Think of it as USB-C for AI integrations — a universal interface for connecting models to the world.</p>
<h3>Orchestration</h3>
<p>The layer that manages the flow of an AI system — routing between agents, managing state, handling retries, and coordinating tool calls. LangChain, LlamaIndex, and LangGraph are popular orchestration frameworks.</p>
<hr />
<h2>⚙️ Model Architecture &amp; Inference</h2>
<h3>Parameters</h3>
<p>The learned numerical weights inside a model. "A 70B model" has 70 billion parameters. More parameters generally means more capability, but also more compute and memory.</p>
<h3>Inference</h3>
<p>Running a trained model to generate predictions or responses. Distinct from <em>training</em>. When you call an LLM API, you're doing inference.</p>
<h3>Quantization</h3>
<p>Reducing the numerical precision of model weights (e.g., from 32-bit floats to 4-bit integers) to decrease memory usage and speed up inference. Essential for running large models on consumer hardware.</p>
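<p>The memory savings are easy to estimate: weight memory is roughly parameter count times bytes per weight. A back-of-the-envelope sketch that ignores activations, KV cache, and runtime overhead:</p>
<pre><code class="language-java">public class QuantizationMath {

    // Approximate weight-memory footprint in gigabytes.
    static double gigabytes(long params, double bytesPerWeight) {
        return params * bytesPerWeight / 1e9;
    }

    public static void main(String[] args) {
        long params = 70_000_000_000L;  // a "70B" model
        System.out.printf("fp16 (2 bytes/weight):  %.0f GB%n", gigabytes(params, 2.0));
        System.out.printf("int4 (0.5 bytes/weight): %.1f GB%n", gigabytes(params, 0.5));
    }
}
</code></pre>
<p>Going from 16-bit to 4-bit weights cuts a 70B model from roughly 140 GB to roughly 35 GB, which is the difference between a GPU cluster and a single high-end workstation.</p>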
<h3>Latency vs. Throughput</h3>
<p>Two key inference metrics. <em>Latency</em> is how long a single request takes (user-facing). <em>Throughput</em> is how many requests per second the system handles. There's often a tradeoff.</p>
<h3>TTFT (Time to First Token)</h3>
<p>The latency between sending a request and receiving the <em>first token</em> of the response. Critical for user experience in streaming applications.</p>
<h3>Structured Output</h3>
<p>Constraining an LLM to generate responses in a specific format (JSON, XML, etc.) rather than free text. Used when downstream code needs to parse the response programmatically.</p>
<h3>System Prompt</h3>
<p>Instructions sent to an LLM that set the context, persona, or rules for the conversation — separate from the user's message. Most API-based LLMs support a dedicated system prompt field.</p>
<hr />
<h2>🎨 Generative AI (Images, Audio, Video)</h2>
<h3>Generative AI</h3>
<p>AI systems that can <em>create new content</em> — text, images, audio, video, code — rather than just classifying or analyzing existing content.</p>
<h3>Diffusion Model</h3>
<p>The architecture behind most modern image generation models (Stable Diffusion, DALL-E, Midjourney). Works by learning to <em>reverse</em> a noise-addition process — starting from random noise and gradually denoising into a coherent image.</p>
<h3>Text-to-Image</h3>
<p>Generating images from natural language descriptions. The prompt <code>"a photorealistic astronaut riding a horse on Mars, golden hour lighting"</code> produces an image.</p>
<h3>Multimodal Model</h3>
<p>A model that can process and generate <em>multiple types of data</em> — text, images, audio, video. GPT-4o and Claude 3.5 are multimodal — they can see images and respond in text.</p>
<h3>Latent Space</h3>
<p>The compressed, abstract representation of data learned by a model. Diffusion models generate images by navigating latent space. Embeddings <em>are</em> points in latent space.</p>
<hr />
<h2>🔐 Safety, Alignment &amp; Ethics</h2>
<h3>Alignment</h3>
<p>The challenge of ensuring AI systems behave in accordance with human values and intentions. Misaligned AI does what it was <em>literally trained to do</em>, not necessarily what we <em>actually want</em>.</p>
<h3>Constitutional AI</h3>
<p>An Anthropic technique where a model critiques and revises its own outputs based on a set of principles (a "constitution"), reducing reliance on human feedback for every edge case.</p>
<h3>Guardrails</h3>
<p>Constraints applied to model inputs or outputs to prevent unsafe, harmful, or off-topic responses. Can be implemented at the prompt level, via fine-tuning, or with a separate classifier.</p>
<h3>Jailbreak</h3>
<p>An attempt to bypass an LLM's safety guardrails through clever prompting — often by roleplay scenarios, hypothetical framings, or encoded instructions.</p>
<h3>Bias</h3>
<p>Systematic errors in model outputs reflecting unfair prejudices from training data. Algorithmic bias can perpetuate or amplify societal inequalities if left unchecked.</p>
<hr />
<h2>📐 Evaluation &amp; Metrics</h2>
<h3>Benchmark</h3>
<p>A standardized dataset and evaluation protocol used to compare model capabilities. Common LLM benchmarks include MMLU, HumanEval (coding), and MATH.</p>
<h3>Perplexity</h3>
<p>A measure of how well a language model predicts a sample of text. Lower perplexity = better. Mostly used internally during training; less useful for task-specific evaluation.</p>
<h3>BLEU / ROUGE</h3>
<p>Automated metrics for evaluating text generation quality by comparing to reference outputs. BLEU is common for translation; ROUGE for summarization. Both have limitations — high scores don't always mean high quality.</p>
<h3>Evals (Evaluations)</h3>
<p>The practice of systematically testing AI model outputs against desired behavior. Moving from vibes-based to eval-driven development is the mark of a mature AI engineering team.</p>
<hr />
<h2>🚀 Deployment &amp; Infrastructure</h2>
<h3>API (Application Programming Interface)</h3>
<p>The interface through which you call an LLM programmatically. Send a prompt → receive a response. OpenAI, Anthropic, Google, and others expose their models via REST APIs.</p>
<h3>Self-Hosted / On-Premises</h3>
<p>Running an LLM on your own infrastructure rather than via a cloud API. Required for air-gapped environments, data privacy requirements, or cost optimization at scale.</p>
<h3>GPU (Graphics Processing Unit)</h3>
<p>The hardware backbone of AI. GPUs excel at the massively parallel matrix multiplications that neural networks require. NVIDIA's H100 and A100 are the current gold standard for AI training and inference.</p>
<h3>Model Serving</h3>
<p>The infrastructure that takes a trained model and makes it available as a service — handling request routing, batching, scaling, and versioning. Tools include NVIDIA Triton, vLLM, and Ray Serve.</p>
<h3>Streaming</h3>
<p>Returning LLM output <em>token by token</em> as it's generated rather than waiting for the full response. Makes UX feel much more responsive.</p>
<hr />
<h2>🧩 Quick Reference Cheat Sheet</h2>
<table>
<thead>
<tr>
<th>Term</th>
<th>One-Line Definition</th>
</tr>
</thead>
<tbody><tr>
<td>LLM</td>
<td>Large neural net trained on text to understand and generate language</td>
</tr>
<tr>
<td>RAG</td>
<td>Retrieval + generation to ground LLMs in real documents</td>
</tr>
<tr>
<td>Embedding</td>
<td>Numerical vector representing meaning</td>
</tr>
<tr>
<td>Token</td>
<td>Basic text unit an LLM processes</td>
</tr>
<tr>
<td>Fine-tuning</td>
<td>Adapting a pre-trained model for a specific task</td>
</tr>
<tr>
<td>Agent</td>
<td>LLM + tools + reasoning loop = autonomous task execution</td>
</tr>
<tr>
<td>Hallucination</td>
<td>Model confidently saying something false</td>
</tr>
<tr>
<td>Temperature</td>
<td>Controls how random/creative output is</td>
</tr>
<tr>
<td>Context Window</td>
<td>Max tokens the model can "see" at once</td>
</tr>
<tr>
<td>Quantization</td>
<td>Compressing model weights to run on less memory</td>
</tr>
</tbody></table>
<hr />
<h2>Wrapping Up</h2>
<p>AI terminology evolves fast — new terms emerge with every major paper and product launch. The best way to stay current is to read primary sources (arXiv, research blogs from Anthropic, Google DeepMind, Meta AI), build things, and stay curious.</p>
<p>Got a term that should be in here? Drop a comment below.</p>
]]></content:encoded></item></channel></rss>