<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Hullabalooing</title>
    <link>https://cruver.ai/</link>
    <description>Recent content on Hullabalooing</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Mon, 23 Mar 2026 18:00:00 -0600</lastBuildDate>
    <atom:link href="https://cruver.ai/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>It Turns Out My Router Was a National Security Decision</title>
      <link>https://cruver.ai/homelab/posts/it-turns-out-my-router-was-a-national-security-decision-1774311445/</link>
      <pubDate>Mon, 23 Mar 2026 18:00:00 -0600</pubDate>
      <guid>https://cruver.ai/homelab/posts/it-turns-out-my-router-was-a-national-security-decision-1774311445/</guid>
      <description>&lt;p&gt;In March 2026, the FCC designated all consumer routers manufactured outside the United States as a national security risk, banning new foreign-made models from sale.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; Eero, Netgear, and Google Nest are all affected, not just Chinese manufacturers, because they all produce hardware overseas.&lt;/p&gt;&#xA;&lt;p&gt;The reasoning in the FCC&amp;rsquo;s National Security Determination is specific. Three Chinese nation-state operations (Volt Typhoon, Flax Typhoon, and Salt Typhoon) used compromised consumer routers to build botnets targeting US critical infrastructure: communications, energy, transportation, and water systems. The FBI and DOJ shut down one of these botnets in 2024. The ban follows directly from those incidents.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <title>Three Production Apps, Zero Code: What Keip Actually Looks Like in Practice</title>
      <link>https://cruver.ai/homelab/posts/keip-connect-real-world-apps/</link>
      <pubDate>Fri, 13 Mar 2026 00:00:00 -0600</pubDate>
      <guid>https://cruver.ai/homelab/posts/keip-connect-real-world-apps/</guid>
      <description>&lt;h2 id=&#34;three-production-apps-zero-code-what-keip-actually-looks-like-in-practice&#34;&gt;Three Production Apps, Zero Code: What Keip Actually Looks Like in Practice&lt;a class=&#34;anchor&#34; href=&#34;#three-production-apps-zero-code-what-keip-actually-looks-like-in-practice&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;In the past month I shipped three integration apps to my home cluster: a Spanish translator that replies in audio, a health tracking bot I can query from any room in my house, and a camera alert system that decides for itself what&amp;rsquo;s worth telling me about. Each one took under an hour to get running. None of them required me to write a single line of application code.&lt;/p&gt;</description>
    </item>
    <item>
      <title>My Data, My Stack</title>
      <link>https://cruver.ai/homelab/posts/my-data-my-stack/</link>
      <pubDate>Sat, 07 Mar 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/homelab/posts/my-data-my-stack/</guid>
      <description>&lt;p&gt;I want to introduce a concept I&amp;rsquo;ll be coming back to throughout this blog: Personal Data Sovereignty, or PDS. It&amp;rsquo;s the idea that individuals can and should have meaningful control over where their data lives, who can access it, and what happens to it. Not as a legal right, but as something you actually build.&lt;/p&gt;&#xA;&lt;p&gt;&lt;em&gt;If you&amp;rsquo;ve been reading this blog for a while, some of what follows will look familiar. I&amp;rsquo;ve written about the network setup, the GPU server, the second brain, the AI inference stack. This post isn&amp;rsquo;t about new technical ground; it&amp;rsquo;s more philosophical. PDS is the concept I&amp;rsquo;ve been building toward without naming it directly, and I wanted to put it in one place before the series goes any further.&lt;/em&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <title>I Pushed Local LLMs Harder. Here&#39;s What Two Models Actually Did.</title>
      <link>https://cruver.ai/gpu-ai/posts/local-llm-two-models-real-project/</link>
      <pubDate>Mon, 02 Mar 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/gpu-ai/posts/local-llm-two-models-real-project/</guid>
      <description>&lt;p&gt;In Part 1 of this series, I set up Claude Code against local LLMs on dual MI60 GPUs and watched it scaffold a Flask application from scratch. Small tasks worked. Complex ones did not. I ended with three ideas I wanted to test: running a dense model, trying Claude Code&amp;rsquo;s agent teams feature, and building a persistent memory layer for coding sessions.&lt;/p&gt;&#xA;&lt;p&gt;I started the experiment that mattered most: giving a local LLM a project with real scope and seeing what happened. I ran the same project against two different models. The results were instructive, and not in the direction I expected.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Enterprise Integration Patterns Aren&#39;t Dead; They&#39;re Running on Kubernetes and Orchestrating AI</title>
      <link>https://cruver.ai/writing/posts/keip-eip-ai-kubernetes/</link>
      <pubDate>Tue, 24 Feb 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/writing/posts/keip-eip-ai-kubernetes/</guid>
      <description>&lt;h2 id=&#34;enterprise-integration-patterns-arent-dead-theyre-running-on-kubernetes-and-orchestrating-ai&#34;&gt;Enterprise Integration Patterns Aren&amp;rsquo;t Dead; They&amp;rsquo;re Running on Kubernetes and Orchestrating AI&lt;a class=&#34;anchor&#34; href=&#34;#enterprise-integration-patterns-arent-dead-theyre-running-on-kubernetes-and-orchestrating-ai&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;Keip is a Kubernetes operator for Enterprise Integration Patterns. Gregor Hohpe and Bobby Woolf documented these patterns in 2003: content-based routers, message transformers, splitters, aggregators, dead letter channels. Spring Integration has implemented them for years, but deploying Spring Integration on Kubernetes has always been harder than it should be. Keip fixes that. It turns integration routes into native Kubernetes resources, and I use it to run an LLM-powered media analysis pipeline on my home cluster.&lt;/p&gt;
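&lt;p&gt;&lt;em&gt;As a rough sketch of what that looks like in practice: the resource kind, API group, and spec fields below are assumptions for illustration, not Keip&amp;rsquo;s confirmed schema.&lt;/em&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-bash&#34;&gt;# Hypothetical sketch: apply an integration route as a Kubernetes resource.
# The kind, apiVersion, and spec field names are assumptions, not Keip&amp;#39;s published schema.
kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: keip.example.com/v1alpha1
kind: IntegrationRoute
metadata:
  name: media-analysis
spec:
  # Spring Integration route definition supplied via a ConfigMap (assumed field)
  routeConfigMap: media-analysis-route
EOF

# The operator reconciles it like any other Kubernetes object:
kubectl get integrationroutes&lt;/code&gt;&lt;/pre&gt;</description>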
    </item>
    <item>
      <title>I Ran Claude Code on Local LLMs for a Month. Here&#39;s What Worked for Me.</title>
      <link>https://cruver.ai/gpu-ai/posts/claude-code-local-llms-what-worked/</link>
      <pubDate>Mon, 16 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://cruver.ai/gpu-ai/posts/claude-code-local-llms-what-worked/</guid>
      <description>&lt;p&gt;I have been running local LLMs on dual AMD MI60 GPUs for over a year, and recently pointed Claude Code at them with one question in mind: how close can local models actually get to Anthropic&amp;rsquo;s Claude for building real software on my own hardware?&lt;/p&gt;&#xA;&lt;p&gt;This is the first post in a series documenting that experiment. Future posts will cover specific agentic coding sessions, hardware and model setup, and the tools I am building to close the gap.&lt;/p&gt;</description>
    </item>
    <item>
      <title>I Built an AI Health Coach That Actually Knows Me</title>
      <link>https://cruver.ai/health-tracking/posts/health-tracking-overview-1768764745/</link>
      <pubDate>Wed, 11 Feb 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/health-tracking/posts/health-tracking-overview-1768764745/</guid>
      <description>&lt;p&gt;Every health app wants to coach you, but none of them know you. They do not know your baseline. They do not know that your blood pressure spikes after certain meals, or that your ketones tank when you are stressed. They offer generic advice based on population averages and call it personalized.&lt;/p&gt;&#xA;&lt;p&gt;So I built my own. It runs on my infrastructure, stores everything in a time-series database I control, and uses Claude to understand what I am eating, estimate the macros, and give me context on every measurement I log. Not generic advice. Insights based on &lt;em&gt;my&lt;/em&gt; data.&lt;/p&gt;</description>
    </item>
    <item>
      <title>A Second Brain That My AI and I Share</title>
      <link>https://cruver.ai/second-brain/posts/a-second-brain-that-my-ai-and-i-share/</link>
      <pubDate>Fri, 30 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://cruver.ai/second-brain/posts/a-second-brain-that-my-ai-and-i-share/</guid>
      <description>&lt;p&gt;My AI assistant and I share the same brain. Not metaphorically: we literally read and write to the same knowledge base. When I update a project note in Emacs, Nabu (my AI) sees the change. When Nabu logs something it learned, I see it in my daily file. This post is about how it works.&lt;/p&gt;&#xA;&lt;h2 id=&#34;how-it-works&#34;&gt;How It Works&lt;a class=&#34;anchor&#34; href=&#34;#how-it-works&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;Nabu connects to my self-hosted Matrix server and has access to my org-roam knowledge base via an MCP server I built. But it&amp;rsquo;s not just access; it&amp;rsquo;s &lt;em&gt;integration&lt;/em&gt;. Nabu treats org-roam as its source of truth:&lt;/p&gt;</description>
    </item>
    <item>
      <title>Why Your News Feed Feels Coordinated</title>
      <link>https://cruver.ai/signalscope/posts/why-your-news-feed-feels-coordinated-1769718132/</link>
      <pubDate>Thu, 29 Jan 2026 13:20:00 +0000</pubDate>
      <guid>https://cruver.ai/signalscope/posts/why-your-news-feed-feels-coordinated-1769718132/</guid>
      <description>&lt;p&gt;Some years ago, I started to notice on Facebook that friends, family members, and people I had not talked to in years were all posting the same arguments using nearly identical phrasing. Not just agreeing on issues, which is normal, but using the same specific words and framing. They had clearly heard it somewhere, probably cable news or talk radio, and were repeating it verbatim.&lt;/p&gt;&#xA;&lt;p&gt;Sometime later, I started to see news reports about foreign propaganda campaigns: Russian troll farms, Chinese state media, Iranian influence networks. The pattern I had noticed on Facebook was not just me being paranoid.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Task-Decoupled Agents: Breaking the Monolithic Context Problem</title>
      <link>https://cruver.ai/writing/posts/task-decoupled-agents-1769113856/</link>
      <pubDate>Thu, 22 Jan 2026 13:30:56 +0000</pubDate>
      <guid>https://cruver.ai/writing/posts/task-decoupled-agents-1769113856/</guid>
      <description>&lt;p&gt;A recent paper from Tsinghua and Renmin University caught my attention: &lt;a href=&#34;https://arxiv.org/abs/2601.07577&#34;&gt;Task-Decoupled Planning for Long-Horizon Agents&lt;/a&gt;. The core insight resonated with a problem I&amp;rsquo;ve been wrestling with on a Java code modernization project at work, and I think the pattern deserves more attention.&lt;/p&gt;&#xA;&lt;h2 id=&#34;the-entangled-context-problem&#34;&gt;The Entangled Context Problem&lt;a class=&#34;anchor&#34; href=&#34;#the-entangled-context-problem&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;Most LLM agent architectures fall into two camps: step-wise planning (reactive, one step at a time) or one-shot planning (generate the whole plan upfront). Both share a fatal flaw: they maintain a monolithic reasoning context that spans the entire task history.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Building a Voice-First Cyberdeck with the Jetson Orin</title>
      <link>https://cruver.ai/cyberdeck/posts/building-voice-first-cyberdeck-1769033451/</link>
      <pubDate>Wed, 21 Jan 2026 18:30:00 +0000</pubDate>
      <guid>https://cruver.ai/cyberdeck/posts/building-voice-first-cyberdeck-1769033451/</guid>
      <description>&lt;p&gt;I had an NVIDIA Jetson Orin Nano sitting around collecting dust, along with a 5-inch touchscreen monitor, and thought it would be fun to build a voice interface for my second brain. Not because I needed one, but because the idea of talking to a device with an oscilloscope-style voice visualization seemed like a cool weekend project.&lt;/p&gt;&#xA;&lt;h2 id=&#34;why-build-a-cyberdeck&#34;&gt;Why Build a Cyberdeck?&lt;a class=&#34;anchor&#34; href=&#34;#why-build-a-cyberdeck&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;Honestly, I just thought it would be neat. I had the parts, I had the time, and the concept of a voice-first chat device in a custom enclosure sounded fun to build. I had no productivity goals, no problem to solve, just a project for the sake of building something interesting.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Building a Mobile Interface for My Second Brain with n8n and Matrix</title>
      <link>https://cruver.ai/second-brain/posts/n8n-mobile-access-via-matrix-1769025724/</link>
      <pubDate>Wed, 21 Jan 2026 14:00:00 +0000</pubDate>
      <guid>https://cruver.ai/second-brain/posts/n8n-mobile-access-via-matrix-1769025724/</guid>
      <description>&lt;p&gt;My org-roam knowledge base is the center of how I think and work. But it lives in Emacs, which means I can only capture ideas when I&amp;rsquo;m at my computer. I wanted a way to feed my second brain from anywhere, using just my phone—but I refused to put my notes in someone else&amp;rsquo;s cloud. The solution had to be entirely self-hosted. What I built combines Matrix for messaging, n8n for orchestration, and an MCP server that bridges to org-roam, all running on my own infrastructure.&lt;/p&gt;</description>
    </item>
    <item>
      <title>ComfyUI on MI60</title>
      <link>https://cruver.ai/gpu-ai/posts/comfyui-mi60/</link>
      <pubDate>Wed, 21 Jan 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/gpu-ai/posts/comfyui-mi60/</guid>
      <description>&lt;p&gt;Node-based Stable Diffusion UI with AMD MI60 GPU acceleration via ROCm.&lt;/p&gt;&#xA;&lt;h2 id=&#34;overview&#34;&gt;Overview&lt;a class=&#34;anchor&#34; href=&#34;#overview&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;ComfyUI runs as a containerized service using a custom ROCm 5.7 image built for the MI60&amp;rsquo;s gfx906 architecture. It can run standalone on both GPUs or alongside vLLM in the &lt;code&gt;image-chat&lt;/code&gt; configuration.&lt;/p&gt;&#xA;&lt;h2 id=&#34;quick-start&#34;&gt;Quick Start&lt;a class=&#34;anchor&#34; href=&#34;#quick-start&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#e2e4e5;background-color:#282a36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#78787e&#34;&gt;# Build the container image&lt;/span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#ff5c57&#34;&gt;cd&lt;/span&gt; comfyui&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;make build&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#78787e&#34;&gt;# Start ComfyUI (standalone, both GPUs)&lt;/span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;make up&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#78787e&#34;&gt;# Or use the image-chat config (vLLM on GPU0, ComfyUI on GPU1)&lt;/span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;curl -X POST -H &lt;span style=&#34;color:#5af78e&#34;&gt;&amp;#34;Content-Type: application/json&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#5af78e&#34;&gt;\&#xA;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  -d &lt;span style=&#34;color:#5af78e&#34;&gt;&amp;#39;{&amp;#34;config&amp;#34;:&amp;#34;image-chat&amp;#34;}&amp;#39;&lt;/span&gt; http://localhost:9100/switch&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Access the UI at &lt;a href=&#34;http://localhost:8188&#34;&gt;http://localhost:8188&lt;/a&gt;.&lt;/p&gt;</description>
    </item>
    <item>
      <title>GPU Configuration Management</title>
      <link>https://cruver.ai/gpu-ai/posts/gpu-config-management/</link>
      <pubDate>Wed, 21 Jan 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/gpu-ai/posts/gpu-config-management/</guid>
      <description>&lt;p&gt;Dynamic switching between GPU configurations based on workload needs.&lt;/p&gt;&#xA;&lt;h2 id=&#34;overview&#34;&gt;Overview&lt;a class=&#34;anchor&#34; href=&#34;#overview&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;Rather than running a single fixed configuration, I dynamically switch between GPU configurations. A Python service manages the transitions via an HTTP API.&lt;/p&gt;&#xA;&lt;h2 id=&#34;available-configurations&#34;&gt;Available Configurations&lt;a class=&#34;anchor&#34; href=&#34;#available-configurations&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;table&gt;&#xA;  &lt;thead&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;th&gt;Config&lt;/th&gt;&#xA;          &lt;th&gt;Model&lt;/th&gt;&#xA;          &lt;th&gt;Use Case&lt;/th&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/thead&gt;&#xA;  &lt;tbody&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;&lt;code&gt;big-chat&lt;/code&gt;&lt;/td&gt;&#xA;          &lt;td&gt;Llama 3.3 70B (TP=2)&lt;/td&gt;&#xA;          &lt;td&gt;Maximum quality chat&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;&lt;code&gt;coder&lt;/code&gt;&lt;/td&gt;&#xA;          &lt;td&gt;Qwen3 32B (TP=2)&lt;/td&gt;&#xA;          &lt;td&gt;Coding tasks with thinking mode&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;&lt;code&gt;dual-chat&lt;/code&gt;&lt;/td&gt;&#xA;          &lt;td&gt;Qwen3 8B + 14B&lt;/td&gt;&#xA;          &lt;td&gt;Two models, different tasks&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;&lt;code&gt;image-chat&lt;/code&gt;&lt;/td&gt;&#xA;          &lt;td&gt;Qwen 7B + ComfyUI&lt;/td&gt;&#xA;          &lt;td&gt;Text + image generation&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/tbody&gt;&#xA;&lt;/table&gt;&#xA;&lt;p&gt;Configuration files are stored as compose YAML files.&lt;/p&gt;
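&lt;p&gt;&lt;em&gt;Switching is a single HTTP call to the state service. The &lt;code&gt;/switch&lt;/code&gt; endpoint and port below come from the ComfyUI quick start; the metrics check assumes the service&amp;rsquo;s standard Prometheus &lt;code&gt;/metrics&lt;/code&gt; path.&lt;/em&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-bash&#34;&gt;# Switch the GPUs to the coding configuration
curl -X POST -H &amp;#34;Content-Type: application/json&amp;#34; \
  -d &amp;#39;{&amp;#34;config&amp;#34;:&amp;#34;coder&amp;#34;}&amp;#39; http://localhost:9100/switch

# Confirm the active configuration from the metrics endpoint (assumed path)
curl -s http://localhost:9100/metrics | grep gpu_config_info&lt;/code&gt;&lt;/pre&gt;</description>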
    </item>
    <item>
      <title>GPU Metrics and Monitoring</title>
      <link>https://cruver.ai/gpu-ai/posts/gpu-metrics-monitoring/</link>
      <pubDate>Wed, 21 Jan 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/gpu-ai/posts/gpu-metrics-monitoring/</guid>
      <description>&lt;p&gt;Prometheus metrics and Grafana dashboards for GPU health and state monitoring.&lt;/p&gt;&#xA;&lt;h2 id=&#34;prometheus-metrics&#34;&gt;Prometheus Metrics&lt;a class=&#34;anchor&#34; href=&#34;#prometheus-metrics&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;The &lt;code&gt;gpu-state-service&lt;/code&gt; exposes Prometheus metrics on port 9100:&lt;/p&gt;&#xA;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#e2e4e5;background-color:#282a36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-text&#34; data-lang=&#34;text&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;# GPU state and configuration&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gpu_config_info{config=&amp;#34;big-chat&amp;#34;,model=&amp;#34;llama-3.3-70b&amp;#34;} 1&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gpu_state_info{state=&amp;#34;ready&amp;#34;} 1&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gpu_ready 1&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gpu_config_switches_total 5&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gpu_config_uptime_seconds 3600&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;# Temperature monitoring (from separate AMD GPU exporter on port 9101)&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;amd_gpu_temperature_junction_celsius{gpu=&amp;#34;0&amp;#34;} 72&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;amd_gpu_temperature_junction_celsius{gpu=&amp;#34;1&amp;#34;} 75&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;amd_gpu_utilization_percent{gpu=&amp;#34;0&amp;#34;} 85&#xA;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;amd_gpu_power_watts{gpu=&amp;#34;0&amp;#34;} 180&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;prometheus-endpoints&#34;&gt;Prometheus Endpoints&lt;a class=&#34;anchor&#34; href=&#34;#prometheus-endpoints&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;table&gt;&#xA;  &lt;thead&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;th&gt;Port&lt;/th&gt;&#xA;          &lt;th&gt;Service&lt;/th&gt;&#xA;          &lt;th&gt;Metrics&lt;/th&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/thead&gt;&#xA;  &lt;tbody&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;9100&lt;/td&gt;&#xA;          &lt;td&gt;gpu-state-service&lt;/td&gt;&#xA;          &lt;td&gt;Config state, switches, uptime&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;9101&lt;/td&gt;&#xA;          &lt;td&gt;AMD GPU exporter&lt;/td&gt;&#xA;          &lt;td&gt;Temperature, utilization, power&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;      &lt;tr&gt;&#xA;          &lt;td&gt;9102&lt;/td&gt;&#xA;          &lt;td&gt;NVIDIA GPU exporter&lt;/td&gt;&#xA;          &lt;td&gt;RTX 2080 metrics (if present)&lt;/td&gt;&#xA;      &lt;/tr&gt;&#xA;  &lt;/tbody&gt;&#xA;&lt;/table&gt;&#xA;&lt;h2 id=&#34;temperature-alerts&#34;&gt;Temperature Alerts&lt;a class=&#34;anchor&#34; href=&#34;#temperature-alerts&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;The service monitors junction temperatures and sends ntfy notifications:&lt;/p&gt;
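&lt;p&gt;&lt;em&gt;ntfy alerts are plain HTTP POSTs. Here is a sketch of what the alert call could look like; the server URL, topic, and threshold are placeholders, not the service&amp;rsquo;s actual configuration.&lt;/em&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-bash&#34;&gt;# Send a high-priority notification when a junction temperature crosses a threshold
curl -H &amp;#34;Title: GPU temperature warning&amp;#34; \
  -H &amp;#34;Priority: high&amp;#34; \
  -d &amp;#34;GPU0 junction at 95C (threshold 90C)&amp;#34; \
  https://ntfy.example.com/gpu-alerts&lt;/code&gt;&lt;/pre&gt;</description>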
    </item>
    <item>
      <title>MI60 Hardware Setup</title>
      <link>https://cruver.ai/gpu-ai/posts/mi60-hardware-setup/</link>
      <pubDate>Wed, 21 Jan 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/gpu-ai/posts/mi60-hardware-setup/</guid>
      <description>&lt;p&gt;How to get an AMD Instinct MI60 running for AI workloads, including cooling solutions, BIOS settings, and fan control.&lt;/p&gt;&#xA;&lt;h2 id=&#34;introduction&#34;&gt;Introduction&lt;a class=&#34;anchor&#34; href=&#34;#introduction&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;The AMD Instinct MI60 is a powerful server GPU featuring 32 GB of HBM2 VRAM and PCIe 3.0 connectivity. It remains a good choice for budget-conscious AI developers looking for high VRAM capacity at a fraction of modern GPU prices. With proper setup, the MI60 can handle local LLM inference, Whisper transcription, Stable Diffusion, and other workloads.&lt;/p&gt;</description>
    </item>
    <item>
      <title>vLLM Inference on MI60</title>
      <link>https://cruver.ai/gpu-ai/posts/vllm-inference-mi60/</link>
      <pubDate>Wed, 21 Jan 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/gpu-ai/posts/vllm-inference-mi60/</guid>
      <description>&lt;p&gt;Production inference using vLLM with tensor parallelism across dual MI60 GPUs.&lt;/p&gt;&#xA;&lt;h2 id=&#34;why-vllm&#34;&gt;Why vLLM?&lt;a class=&#34;anchor&#34; href=&#34;#why-vllm&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;After evaluating Ollama, llama.cpp, and vLLM for MI60 inference, &lt;strong&gt;vLLM&lt;/strong&gt; emerged as the best choice:&lt;/p&gt;&#xA;&lt;ol&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;Tensor Parallelism&lt;/strong&gt;: Native support for splitting large models across multiple GPUs. With dual MI60s (64GB total), I can run 70B parameter models that wouldn&amp;rsquo;t fit on a single GPU.&lt;/p&gt;&#xA;&lt;/li&gt;&#xA;&lt;li&gt;&#xA;&lt;p&gt;&lt;strong&gt;PagedAttention&lt;/strong&gt;: More efficient memory management, allowing higher GPU memory utilization (90%) without OOM errors.&lt;/p&gt;
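&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;em&gt;A minimal sketch of the serving command under those settings; the flags are standard vLLM options, while the exact model ID and port are assumptions:&lt;/em&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-bash&#34;&gt;# Serve Llama 3.3 70B split across both MI60s via tensor parallelism
vllm serve meta-llama/Llama-3.3-70B-Instruct \
  --tensor-parallel-size 2 \
  --gpu-memory-utilization 0.90 \
  --max-model-len 32768 \
  --port 8000&lt;/code&gt;&lt;/pre&gt;</description>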
    </item>
    <item>
      <title>Publishing with Org-roam and Hugo</title>
      <link>https://cruver.ai/second-brain/posts/publishing-with-org-roam-and-hugo-1768946341/</link>
      <pubDate>Tue, 20 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://cruver.ai/second-brain/posts/publishing-with-org-roam-and-hugo-1768946341/</guid>
      <description>&lt;p&gt;Most publishing systems treat writing and publishing as separate activities. You draft in one tool, then copy content into a CMS, tweak formatting, and hit publish. The writing environment and the publishing system are disconnected by design.&lt;/p&gt;&#xA;&lt;p&gt;I wanted something different: a system where publishing is just another view of my thinking. Where the same notes I write for myself can become public posts with minimal friction. Where the blog is an extension of my knowledge base, not a separate silo.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Running Claude Code with Local LLMs via vLLM and LiteLLM</title>
      <link>https://cruver.ai/gpu-ai/posts/claude-code-local-llms-1768966311/</link>
      <pubDate>Tue, 20 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://cruver.ai/gpu-ai/posts/claude-code-local-llms-1768966311/</guid>
      <description>&lt;p&gt;Every query to Claude Code means sending my source code to Anthropic&amp;rsquo;s servers. For proprietary codebases, that&amp;rsquo;s a non-starter. With vLLM and LiteLLM, I can point Claude Code at my own hardware, keeping my code on my network while maintaining the same workflow.&lt;/p&gt;&#xA;&lt;h2 id=&#34;the-architecture&#34;&gt;The Architecture&lt;a class=&#34;anchor&#34; href=&#34;#the-architecture&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;The trick is that Claude Code expects the Anthropic Messages API, but local inference servers speak OpenAI&amp;rsquo;s API format. LiteLLM bridges this gap. It accepts Anthropic-formatted requests and translates them to OpenAI format for my local vLLM instance.&lt;/p&gt;
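&lt;p&gt;&lt;em&gt;A minimal sketch of that bridge, assuming a vLLM server is already running; the hostnames, ports, and model names are placeholders:&lt;/em&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-bash&#34;&gt;# Map a friendly model name to the local vLLM endpoint
cat &amp;gt; litellm-config.yaml &amp;lt;&amp;lt;EOF
model_list:
  - model_name: local-coder
    litellm_params:
      model: hosted_vllm/Qwen/Qwen3-32B
      api_base: http://gpu-server:8000/v1
EOF

# Start the LiteLLM proxy, which accepts Anthropic-formatted requests
litellm --config litellm-config.yaml --port 4000

# Point Claude Code at the proxy instead of Anthropic
export ANTHROPIC_BASE_URL=http://localhost:4000
export ANTHROPIC_AUTH_TOKEN=placeholder-key   # placeholder; LiteLLM can enforce its own keys
export ANTHROPIC_MODEL=local-coder&lt;/code&gt;&lt;/pre&gt;</description>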
    </item>
    <item>
      <title>Building a Second Brain in Emacs with Org-roam</title>
      <link>https://cruver.ai/second-brain/posts/building-a-second-brain-in-emacs-1768831790/</link>
      <pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://cruver.ai/second-brain/posts/building-a-second-brain-in-emacs-1768831790/</guid>
      <description>&lt;p&gt;I&amp;rsquo;ve been building what I call my &amp;ldquo;Second Brain,&amp;rdquo; an Emacs-based knowledge management system that doesn&amp;rsquo;t just store notes, but actively helps me think. It&amp;rsquo;s built on org-roam, extended with semantic search via vector embeddings and proactive surfacing of relevant information. The goal isn&amp;rsquo;t to replace my thinking but to augment it, surfacing connections I might miss and nudging me toward follow-ups I&amp;rsquo;ve forgotten.&lt;/p&gt;&#xA;&lt;h2 id=&#34;why-emacs-and-org-roam&#34;&gt;Why Emacs and Org-roam&lt;a class=&#34;anchor&#34; href=&#34;#why-emacs-and-org-roam&#34;&gt;#&lt;/a&gt;&lt;/h2&gt;&#xA;&lt;p&gt;Before diving into what I&amp;rsquo;ve built, it&amp;rsquo;s worth explaining the foundation I chose.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Data Sovereignty and My Network Router</title>
      <link>https://cruver.ai/homelab/posts/data-sovereignty-and-my-network-router-1768778676/</link>
      <pubDate>Sun, 18 Jan 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/homelab/posts/data-sovereignty-and-my-network-router-1768778676/</guid>
      <description>&lt;p&gt;Consumer routers frustrated me for years before I understood why. It wasn&amp;rsquo;t the hardware; my ASUS RT-AC5300 was perfectly capable for most tasks. It was the assumption baked into every consumer device: that I didn&amp;rsquo;t need to see what was happening on my own network.&lt;/p&gt;&#xA;&lt;p&gt;At some point, I wanted my router to do something it wouldn&amp;rsquo;t do. I don&amp;rsquo;t even remember what it was now. I tried open-source firmware, which worked for a while, until I hit another wall: underpowered CPUs, limited RAM, confusing web interfaces, and a general lack of configurability.&lt;/p&gt;</description>
    </item>
    <item>
      <title>An Affordable AI Server</title>
      <link>https://cruver.ai/gpu-ai/posts/an-affordable-ai-server-1768704467/</link>
      <pubDate>Sat, 17 Jan 2026 00:00:00 -0700</pubDate>
      <guid>https://cruver.ai/gpu-ai/posts/an-affordable-ai-server-1768704467/</guid>
      <description>&lt;p&gt;Two AMD MI60s from eBay cost me about $1,000 total and gave me 64GB of VRAM. That&amp;rsquo;s enough to run Llama 3.3 70B at home with a 32K context window.&lt;/p&gt;&#xA;&lt;p&gt;When I started looking into running large language models locally, the obvious limiting factor was VRAM. Consumer GPUs top out at 24GB, and that&amp;rsquo;s on an RTX 4090 at the high end. I wanted to run 70B parameter models locally, on hardware I own.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
