<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://blog.nigiva.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://blog.nigiva.com/" rel="alternate" type="text/html" /><updated>2026-05-12T18:36:44+02:00</updated><id>https://blog.nigiva.com/feed.xml</id><title type="html">Nigiva</title><subtitle>Thoughts on machine learning, deep learning, and software engineering</subtitle><author><name>Nigiva</name><email>blog@nigiva.com</email></author><entry><title type="html">Data URLs versus rotating presigned HTTPS: latency and cache in multimodal chat APIs</title><link href="https://blog.nigiva.com/2026/05/10/data-vs-presigned-url-llm-images.html" rel="alternate" type="text/html" title="Data URLs versus rotating presigned HTTPS: latency and cache in multimodal chat APIs" /><published>2026-05-10T00:00:00+02:00</published><updated>2026-05-10T00:00:00+02:00</updated><id>https://blog.nigiva.com/2026/05/10/data-vs-presigned-url-llm-images</id><content type="html" xml:base="https://blog.nigiva.com/2026/05/10/data-vs-presigned-url-llm-images.html"><![CDATA[<div class="callout tldr">
  <span class="callout-icon" aria-hidden="true">
    <svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M16 4h2a2 2 0 0 1 2 2v14a2 2 0 0 1-2 2H6a2 2 0 0 1-2-2V6a2 2 0 0 1 2-2h2" /><rect width="8" height="4" x="8" y="2" rx="1" ry="1" /><path d="M9 14h6" /><path d="M9 18h6" /><path d="M9 22h6" /></svg>
  </span>
  <div class="callout-body">
    <div class="callout-title">TL;DR</div>
    <div class="callout-md">
<p>On every tested model, mean <code class="language-plaintext highlighter-rouge">data:</code> completion latency beat fresh presigned HTTPS (<strong>+13%</strong>, <strong>+23%</strong>, <strong>+39%</strong> presigned slowdown, by model).</p>

<p>Caching tracks the <strong>image</strong>, not the <strong>URL text</strong>, for Gemini, OpenAI, and Anthropic here. On every stack, rotating presigned URLs did not wipe the cache merely because the link string changed.</p>

    </div>
  </div>
</div>

<h2 id="from-habit-to-hypothesis">From habit to hypothesis</h2>

<p>I got used to presigned HTTPS URLs for multimodal payloads. Stored chat JSON stayed small because I did not embed full image bytes in every message. The provider fetches the PNG over HTTPS on their side. I assumed I was moving traffic off my laptops and servers.</p>

<p>A teammate asked a simple follow-up: aside from RAM and log size, does that pattern really make completions faster? This post is what I measured.</p>

<p>My guess was that presigned would win on elapsed time, because a download at the provider should beat me re-sending Base64 on every turn. I had no data, only a gut feeling.</p>

<p>That picture weakens when you remember requests already leave my machines on fast datacenter uplinks. OVH, AWS, and similar hosts are built to accept uploads. Assuming the provider's fetch always beats one more POST with the image inlined is a rough shortcut in that setup. Slow home upload, mobile networks, or strict bandwidth caps can still change the tradeoff. So I ran a benchmark.</p>

<h2 id="ocr-as-the-benchmark-task">OCR as the benchmark task</h2>

<p>OCR was a good fit. Current multimodal models already do well on clean synthetic screenshots. Outputs are nearly repeatable. I can generate endless labeled pairs in code instead of tuning hand-written creative prompts.</p>

<p>The synthetic pages stick to one recipe: five OCR lines of about ten words each, light backgrounds, fixed layout.</p>

<p>The setup ties each page to fixed ground-truth text baked into the image. Each assistant reply gets compared to that target string with <code class="language-plaintext highlighter-rouge">rapidfuzz</code>, yielding a similarity score from <strong>0</strong> (no overlap) through <strong>1</strong> (exact transcript). <code class="language-plaintext highlighter-rouge">temperature</code> is <strong>0</strong>. Replies stay inside one fenced Markdown code block.</p>
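<p>The scoring step can be sketched in a few lines. This uses the standard-library <code>difflib</code> as a stand-in for the post's <code>rapidfuzz</code> scorer (the exact rapidfuzz scorer is not named above); both map onto the same 0-to-1 scale.</p>

```python
from difflib import SequenceMatcher


def transcript_score(reply: str, target: str) -> float:
    """Similarity between an assistant transcript and the baked-in ground truth.

    Returns 0.0 (no overlap) through 1.0 (exact transcript), matching the
    scale described above. difflib stands in for rapidfuzz here; swap in a
    rapidfuzz scorer for the real harness.
    """
    return SequenceMatcher(None, reply, target).ratio()
```

<p>An exact transcript scores <strong>1.0</strong>; a transcript with only casing drift lands strictly between 0 and 1.</p>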

<p>Images live in Cloudflare R2 for <code class="language-plaintext highlighter-rouge">presigned</code>. <code class="language-plaintext highlighter-rouge">data:</code> reads files from disk and sends Base64.</p>
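<p>The <code>data:</code> lane boils down to one encoding step; a minimal sketch (the helper name is mine):</p>

```python
import base64


def to_data_url(png_bytes: bytes) -> str:
    # Inline the PNG so the multimodal POST carries the image bytes itself.
    return "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")


# PNG magic bytes stand in for a real rendered page here.
sample = b"\x89PNG\r\n\x1a\n" + bytes(16)
url = to_data_url(sample)
```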

<p>Main numbers rotate presigned URLs on every replay (including older history rows) so logs never reuse stale URL strings by accident. I also ran <code class="language-plaintext highlighter-rouge">presigned</code> with one fixed HTTPS URL whenever the same PNG comes back.</p>

<p>Latency versus <code class="language-plaintext highlighter-rouge">data:</code> looked the same in both setups, so rewriting the signature bought no latency in what I captured. Nothing in these traces suggested multimodal backends reuse prior HTTPS fetches just because the PNG URL matched an earlier request; completions still behaved as if vendors pay for the HTTPS pull plus decode every time, not as if a matching URL unlocks a warmed shortcut.</p>

<p>I use three vendors' smaller multimodal models (Gemini, GPT, Claude; exact names below). Larger tiers should behave in the same ballpark, but this run does not prove that for every model.</p>

<h2 id="avoid-mixing-cached-work-between-lanes">Avoid mixing cached work between lanes</h2>

<p>To compare <code class="language-plaintext highlighter-rouge">data:</code> with <code class="language-plaintext highlighter-rouge">presigned</code> without one side reusing image cache hits meant for the other, I generate paired PNG layouts (A: white background, dark text; B: off-white background, charcoal text). Same wording, slightly different RGB, different hashes and bucket keys, hard to spot by eye. <code class="language-plaintext highlighter-rouge">data</code> or <code class="language-plaintext highlighter-rouge">presigned</code> receives A or B at random each time.</p>
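<p>Schematically, the pairing works because any RGB nudge changes the rendered bytes and therefore the hash. The palette values below are illustrative, and hashing a string stands in for hashing the real PNG raster:</p>

```python
import hashlib

# Hypothetical palette pair: same wording, slightly different RGB per variant.
PALETTES = {
    "a": {"background": (255, 255, 255), "ink": (25, 25, 25)},   # white / dark
    "b": {"background": (250, 249, 244), "ink": (54, 54, 58)},   # off-white / charcoal
}


def page_digest(words: str, variant: str) -> str:
    # Stand-in for hashing the rendered PNG: identical wording with a nudged
    # palette still yields a different digest, so lanes never share bytes.
    palette = PALETTES[variant]
    raster = f"{words}|{palette['background']}|{palette['ink']}".encode()
    return hashlib.sha256(raster).hexdigest()
```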

<p>Example from <strong>series 6</strong>, <strong>turn 6</strong> in the generator (<code class="language-plaintext highlighter-rouge">a</code> / <code class="language-plaintext highlighter-rouge">b</code> files). Same transcript; two skins so bytes never match between lanes while the page still reads the same to a human.</p>

<p><img src="/assets/images/posts/2026-05-10-data-vs-presigned/serie_006_turn_06a.png" alt="OCR synthetic page, pair variant a (series 6, turn 6)." class="align-center" />
<em>Variant <code class="language-plaintext highlighter-rouge">a</code> in the generator: higher-contrast white backdrop and dark type.</em></p>

<p><img src="/assets/images/posts/2026-05-10-data-vs-presigned/serie_006_turn_06b.png" alt="OCR synthetic page, pair variant b (series 6, turn 6)." class="align-center" />
<em>Variant <code class="language-plaintext highlighter-rouge">b</code> in the generator: slightly tinted backdrop and softened ink so the raster hash diverges.</em></p>

<p>For <code class="language-plaintext highlighter-rouge">presigned</code>, histories use a fresh signed GET URL whenever an image appears, including turns already stored earlier in the thread.</p>
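<p>That rewrite can be sketched like this. The message shape (OpenAI-style content parts carrying a bucket <code>key</code>) and the <code>sign</code> callable are assumptions for illustration, not the harness's actual types:</p>

```python
from copy import deepcopy


def resign_history(messages: list, sign) -> list:
    """Copy a chat history, giving every image part a fresh signed GET URL.

    Older turns are rewritten too, so no stale URL string survives a replay.
    `sign(key)` is a hypothetical presigner (e.g. wrapping an S3/R2 SDK call).
    """
    out = deepcopy(messages)
    for message in out:
        parts = message.get("content")
        if not isinstance(parts, list):
            continue
        for part in parts:
            if isinstance(part, dict) and part.get("type") == "image_url":
                part["image_url"]["url"] = sign(part["image_url"]["key"])
    return out
```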

<h2 id="method-shape-and-timer">Method shape and timer</h2>

<p>The benchmark harness ran on an OVH VPS in a datacenter. I did not drive it from my home machine on purpose: residential upload is often the weak link for large multimodal payloads, and I wanted timings closer to what you get from a production-style host on a datacenter link. Presigned objects live in Cloudflare R2, so the vendor fetches PNGs from R2 over HTTPS while the API client runs on that same VPS. That way both <code class="language-plaintext highlighter-rouge">data:</code> reads from local disk and <code class="language-plaintext highlighter-rouge">presigned</code> fetch paths are not dominated by consumer broadband jitter or caps.</p>

<p>Runs use series and turn depth.</p>

<p>Ten independent chats (series) with unrelated text. Inside each, ten turns indexed <strong>0</strong> through <strong>9</strong>; calls run strictly one after another.</p>

<p>Turn <strong>0</strong>: one PNG plus a short OCR instruction. Turns <code class="language-plaintext highlighter-rouge">&gt; 0</code>: send full history plus one new PNG.</p>
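<p>In request terms, one timed call can be assembled like this. The OCR brief wording and helper name are mine; the content-part shape follows the common multimodal chat format:</p>

```python
def build_turn(history: list, png_ref: str, turn_index: int) -> list:
    """Messages for one timed call.

    Turn 0 sends a single PNG plus a short OCR instruction; later turns send
    the full prior history plus one new PNG. `png_ref` is either a data: URL
    or a presigned HTTPS URL, so both lanes share this code path.
    """
    user_message = {
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe every line of this page."},
            {"type": "image_url", "image_url": {"url": png_ref}},
        ],
    }
    return history + [user_message] if turn_index > 0 else [user_message]
```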

<p><code class="language-plaintext highlighter-rouge">total_time_ms</code> runs from starting the multimodal request until the last assistant token arrives. Signing, disk read, and Base64 encoding (when used) finish before that timer starts.</p>
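<p>A sketch of that timer boundary; <code>complete</code> is a hypothetical client callable that returns once the last assistant token arrives:</p>

```python
import base64
import time


def timed_call(png_bytes: bytes, complete):
    # Disk read, Base64 encoding (and, on the presigned lane, URL signing)
    # all finish BEFORE the timer starts.
    payload = "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")
    start = time.perf_counter()
    reply = complete(payload)  # request out .. last assistant token in
    total_time_ms = (time.perf_counter() - start) * 1000.0
    return reply, total_time_ms
```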

<p>Replay outline (one series, one delivery path):</p>

<pre><code class="language-mermaid">flowchart LR
  subgraph warmup [Before timed calls]
    R2[R2 PNG objects uploaded]
    IDX[Index rows per serie turn method]
  end
  subgraph per_series_per_method [Replay one series]
    T0["Turn 0: user PNG plus OCR brief"]
    T0 --&gt; A0["Assistant OCR"]
    A0 --&gt; T1["Turn 1: full history plus new PNG"]
    T1 --&gt; A1["Assistant OCR"]
    A1 --&gt; Tn["Turns 2 through 9 repeat pattern"]
  end
  warmup --&gt; per_series_per_method
</code></pre>

<p><strong>Wire formats</strong></p>

<table>
  <thead>
    <tr>
      <th>Method</th>
      <th>Payload shape</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">data</code></td>
      <td><code class="language-plaintext highlighter-rouge">data:image/png;base64,...</code> inside the multimodal POST</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">presigned</code></td>
      <td>Fresh Cloudflare R2 HTTPS GET URL for each emission and replay</td>
    </tr>
  </tbody>
</table>

<p>PNG canvas <strong>1920 x 1080</strong>. Model identifiers in logs: <code class="language-plaintext highlighter-rouge">gemini-flash-2.5</code> (<code class="language-plaintext highlighter-rouge">flash-2.5</code>), <code class="language-plaintext highlighter-rouge">gpt-5.4-mini</code>, <code class="language-plaintext highlighter-rouge">claude-haiku-4.5</code> (<code class="language-plaintext highlighter-rouge">haiku-4.5</code>).</p>

<h3 id="operational-notes">Operational notes</h3>

<div class="callout note">
  <span class="callout-icon">💡</span>
  <div class="callout-body">
    <div class="callout-title">Note</div>
    <ul>
<li>OpenAI and Gemini had prompt caching enabled with vendor defaults in this harness. Claude requires explicit <code>cache_control</code> on the multimodal payloads you intend to cache; without it, Anthropic prompt caching stays off.</li>
      <li>Vendors expose cached-input totals under different shapes; compare rows cautiously rather than stacking them blindly.</li>
      <li>Where Claude caching was active, ephemeral TTL stayed at <strong>5 minutes</strong>.</li>
      <li>Some stacks prefetch HTTPS-linked PNG bytes on the client before timing. That work stays outside <code>total_time_ms</code>.</li>
    </ul>
  </div>
</div>
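<p>Because those cached-input totals live under different field names, the harness has to normalize them before comparing rows. A sketch of that normalization; the field names below reflect the public usage shapes as I understand them, so verify against current vendor docs before trusting them:</p>

```python
def cached_input_tokens(vendor: str, usage: dict) -> int:
    """Pull the cached-input counter out of a vendor-specific usage dict."""
    if vendor == "openai":      # chat completions usage block
        return usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
    if vendor == "anthropic":   # messages API usage block
        return usage.get("cache_read_input_tokens", 0)
    if vendor == "gemini":      # REST usageMetadata block
        return usage.get("cachedContentTokenCount", 0)
    raise ValueError(f"unknown vendor: {vendor}")
```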

<p>This benchmark compares inlined PNG bytes in <code class="language-plaintext highlighter-rouge">data:</code> requests with presigned HTTPS <code class="language-plaintext highlighter-rouge">GET</code>s routed through Cloudflare R2.</p>

<h2 id="question-1-same-accuracy-on-both-lanes">Question 1: Same accuracy on both lanes?</h2>

<p>If <code class="language-plaintext highlighter-rouge">rapidfuzz</code> scores disagree between the two lanes, the latency numbers are pointless.</p>

<p><strong>Result:</strong> mean score <strong>1.0</strong> (100 transcripts per model and method). Choosing <code class="language-plaintext highlighter-rouge">data:</code> or <code class="language-plaintext highlighter-rouge">presigned</code> did not change OCR accuracy here.</p>

<h2 id="question-2-does-url-rotation-break-cache-counters">Question 2: Does URL rotation break cache counters?</h2>

<p>Side question: every presigned URL string is new each time. Do vendors treat the image as brand new and zero the cached-input totals they expose?</p>

<p>Across replay with fresh signatures, including older turns, <code class="language-plaintext highlighter-rouge">claude-haiku-4.5</code> still showed equal mean cached-input token counts for <code class="language-plaintext highlighter-rouge">data</code> and <code class="language-plaintext highlighter-rouge">presigned</code>. That lines up with the <strong>image</strong> driving those counters, not the URL alone, inside each completion payload.</p>

<p>Separate runs reused one stable presigned HTTPS string per PNG, without issuing a fresh signature when the same image reappeared in history. Cached-input counters moved the same way as in the rotated-URL batches, which suggests ordinary static CDN links fall under the same story: rewriting the HTTPS string does not, by itself, clear those counters.</p>

<table>
  <thead>
    <tr>
      <th>Combination</th>
      <th style="text-align: right">Cached input tokens (<code class="language-plaintext highlighter-rouge">mean</code>)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>gemini-flash-2.5 / <code class="language-plaintext highlighter-rouge">data</code></td>
      <td style="text-align: right">964</td>
    </tr>
    <tr>
      <td>gemini-flash-2.5 / <code class="language-plaintext highlighter-rouge">presigned</code></td>
      <td style="text-align: right">823</td>
    </tr>
    <tr>
      <td>gpt-5.4-mini / <code class="language-plaintext highlighter-rouge">data</code></td>
      <td style="text-align: right">8474</td>
    </tr>
    <tr>
      <td>gpt-5.4-mini / <code class="language-plaintext highlighter-rouge">presigned</code></td>
      <td style="text-align: right">8950</td>
    </tr>
    <tr>
      <td>claude-haiku-4.5 / <code class="language-plaintext highlighter-rouge">data</code></td>
      <td style="text-align: right">7175</td>
    </tr>
    <tr>
      <td>claude-haiku-4.5 / <code class="language-plaintext highlighter-rouge">presigned</code></td>
      <td style="text-align: right">7175</td>
    </tr>
  </tbody>
</table>

<p>Claude shows the same mean cached-input count for <code class="language-plaintext highlighter-rouge">data</code> and <code class="language-plaintext highlighter-rouge">presigned</code>. I thought Gemini and OpenAI would too. They do not. I only set explicit <code class="language-plaintext highlighter-rouge">cache_control</code> for Claude here. For Gemini and OpenAI I left vendor defaults. Something in those defaults likely treats <code class="language-plaintext highlighter-rouge">data:</code> and presigned paths differently. I do not know every knob on their side.</p>

<p>I first ran Claude without setting <code class="language-plaintext highlighter-rouge">cache_control</code>. That was a mistake: Anthropic only enables this cache path when you mark it explicitly. Many <code class="language-plaintext highlighter-rouge">data:</code> requests came back as <code class="language-plaintext highlighter-rouge">BadRequestError</code> with a PNG download timeout message from Anthropic's side. My client retries each call up to three times, which sometimes unstuck things, but mostly it looked like <code class="language-plaintext highlighter-rouge">data:</code> image sends freezing and then succeeding late. Turning <code class="language-plaintext highlighter-rouge">cache_control</code> on flattened the failures and trimmed spend. Numbers in the main table still skew faster on <code class="language-plaintext highlighter-rouge">data:</code>. During that bad window, <code class="language-plaintext highlighter-rouge">presigned</code> survived more cleanly, which matters if you optimize for uptime. None of those failure runs feed the published means above.</p>
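<p>The retry behavior mentioned above can be sketched as a plain wrapper; the backoff shape is my choice, the post only states "up to three times":</p>

```python
import time


def with_retries(call, attempts: int = 3, base_delay: float = 1.0):
    """Retry a completion call, doubling the pause after each failure."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise               # out of attempts: surface the error
            time.sleep(base_delay * (2 ** attempt))
```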

<h2 id="question-3-which-path-finishes-faster">Question 3: Which path finishes faster?</h2>

<p>Each cell pools 100 timed calls (10 series, 10 turns). Totals come from <code class="language-plaintext highlighter-rouge">summary_by_model_method</code>.</p>

<table>
  <thead>
    <tr>
      <th>Model</th>
      <th style="text-align: right">Data mean (<code class="language-plaintext highlighter-rouge">ms</code>)</th>
      <th style="text-align: right">Presigned mean (<code class="language-plaintext highlighter-rouge">ms</code>)</th>
      <th style="text-align: right">Presigned slowdown</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>gemini-flash-2.5</td>
      <td style="text-align: right">2832</td>
      <td style="text-align: right">3200</td>
      <td style="text-align: right"><strong>+13%</strong></td>
    </tr>
    <tr>
      <td>gpt-5.4-mini</td>
      <td style="text-align: right">1854</td>
      <td style="text-align: right">2280</td>
      <td style="text-align: right"><strong>+23%</strong></td>
    </tr>
    <tr>
      <td>claude-haiku-4.5</td>
      <td style="text-align: right">2689</td>
      <td style="text-align: right">3749</td>
      <td style="text-align: right"><strong>+39%</strong></td>
    </tr>
  </tbody>
</table>

<p><code class="language-plaintext highlighter-rouge">presigned</code> means the vendor pulls the PNG over HTTPS. <code class="language-plaintext highlighter-rouge">data:</code> means you already inlined the PNG in the HTTPS POST.</p>
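<p>The slowdown column is just the ratio of the two means, expressed as a percentage over 100%; recomputing it from the table:</p>

```python
def presigned_slowdown(data_ms: float, presigned_ms: float) -> str:
    # Relative slowdown of the presigned mean over the data: mean.
    return f"+{round(presigned_ms / data_ms * 100 - 100)}%"


for model, data_ms, presigned_ms in [
    ("gemini-flash-2.5", 2832, 3200),
    ("gpt-5.4-mini", 1854, 2280),
    ("claude-haiku-4.5", 2689, 3749),
]:
    print(model, presigned_slowdown(data_ms, presigned_ms))
```

<p>This reproduces the <strong>+13%</strong> / <strong>+23%</strong> / <strong>+39%</strong> column above.</p>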

<h2 id="figure-latency-vs-turn-index">Figure: latency vs turn index</h2>

<p>Averages across ten chats per plotted turn. Solid blue is <code class="language-plaintext highlighter-rouge">data</code>. Dashed red is <code class="language-plaintext highlighter-rouge">presigned</code>. Markers: circle <code class="language-plaintext highlighter-rouge">gemini-flash-2.5</code>, square <code class="language-plaintext highlighter-rouge">gpt-5.4-mini</code>, triangle <code class="language-plaintext highlighter-rouge">claude-haiku-4.5</code>.</p>

<p><img src="/assets/images/posts/2026-05-10-data-vs-presigned/latency_by_turn.png" alt="Mean completion latency for gemini-flash-2.5, gpt-5.4-mini, and claude-haiku-4.5. Data URLs versus presigned HTTPS. Ten conversational turns." class="align-center" /></p>

<p>Individual turns swing a lot (queueing, longer context). The averaged table still shows <code class="language-plaintext highlighter-rouge">presigned</code> slower. Gemini can dip below <code class="language-plaintext highlighter-rouge">data:</code> on early turns, so I rely on the means and the chart together.</p>

<h2 id="what-caught-me-off-guard">What caught me off guard</h2>

<p>I expected a tie or a <code class="language-plaintext highlighter-rouge">presigned</code> win because my day job habits favor small logs and less RAM, not milliseconds. Mean <code class="language-plaintext highlighter-rouge">total_time_ms</code> rose about <strong>13%</strong> to <strong>40%</strong> when the vendor had to GET from R2 instead of parsing inline Base64 (model dependent), on the same OVH VPS with matching upload and download speed.</p>

<p>Cached-input counts also did not match my neat guess.</p>

<h2 id="closing-take">Closing take</h2>

<p>If you only care about shortest median completion time on this OCR setup, pick <code class="language-plaintext highlighter-rouge">data:</code>. Presigned still wins when upload is slow, when you reuse the same images in long chats, or when you need less RAM or smaller stored chats. Those goals sit next to latency; they do not replace it.</p>

<p>I only wired this benchmark through OpenAI, Google Gemini, and Anthropic. I still expect the same directional story on other multimodal hosts, but treat that as guesswork until somebody runs the same split on their APIs.</p>]]></content><author><name>Nigiva</name><email>blog@nigiva.com</email></author><category term="ml" /><category term="multimodal" /><category term="benchmarking" /><category term="llm" /><category term="agents" /><summary type="html"><![CDATA[Benchmarks whether sending PNG bytes inline as Base64 data URLs or as Cloudflare R2 presigned HTTPS URLs changes multimodal completion latency. Three LLMs on synthetic OCR.]]></summary></entry><entry><title type="html">Cobjectric: Measuring Parsing Quality for Structured Data</title><link href="https://blog.nigiva.com/2026/05/09/cobjectric-metrics-for-complex-objects.html" rel="alternate" type="text/html" title="Cobjectric: Measuring Parsing Quality for Structured Data" /><published>2026-05-09T00:00:00+02:00</published><updated>2026-05-09T00:00:00+02:00</updated><id>https://blog.nigiva.com/2026/05/09/cobjectric-metrics-for-complex-objects</id><content type="html" xml:base="https://blog.nigiva.com/2026/05/09/cobjectric-metrics-for-complex-objects.html"><![CDATA[<div class="callout tldr">
  <span class="callout-icon" aria-hidden="true">
    <svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M16 4h2a2 2 0 0 1 2 2v14a2 2 0 0 1-2 2H6a2 2 0 0 1-2-2V6a2 2 0 0 1 2-2h2" /><rect width="8" height="4" x="8" y="2" rx="1" ry="1" /><path d="M9 14h6" /><path d="M9 18h6" /><path d="M9 22h6" /></svg>
  </span>
  <div class="callout-body">
    <div class="callout-title">TL;DR</div>
    <div class="callout-md">
<p><strong>Cobjectric</strong> scores structured objects with <strong><code class="language-plaintext highlighter-rouge">compute_fill_rate</code></strong>, <strong><code class="language-plaintext highlighter-rouge">compute_fill_rate_accuracy</code></strong>, and <strong><code class="language-plaintext highlighter-rouge">compute_similarity</code></strong>.</p>

<p>It started as a benchmark harness for <strong>curriculum vitae (CV) parsing</strong>, but the same recipe generalizes to <strong>API payloads</strong>, <strong>configs</strong>, <strong>migration QA</strong>, and any nested <strong><code class="language-plaintext highlighter-rouge">dict</code> / JSON</strong> you care about.</p>

<p>For Specs, pandas export, and API tables, read <strong><a href="https://cobjectric.nigiva.com/">docs</a></strong>.</p>

    </div>
  </div>
</div>

<h2 id="where-this-came-from">Where this came from</h2>

<p>I built Cobjectric because I kept comparing <strong>parsed CVs</strong> against a schema and a labeled extract. The painful bit is rarely strict equality everywhere. Outputs are <strong>almost right</strong>: extra spaces, different casing, harmless punctuation, <strong>lists in another order</strong>, or a field present on one side but missing on the other.</p>

<p>That pattern is not <strong>CV-specific</strong>. Once you model your payload as a <strong><code class="language-plaintext highlighter-rouge">BaseModel</code></strong>, you get repeatable metrics you can log, aggregate, and compare across prompts or pipelines.</p>

<h2 id="fill-rate-how-complete-is-one-object">Fill rate: how complete is one object?</h2>

<p>Fill rate answers a simple question: <strong>which fields look filled vs missing</strong> for a single instance?</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">cobjectric</span> <span class="kn">import</span> <span class="n">BaseModel</span>


<span class="k">class</span> <span class="nc">Person</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">str</span>
    <span class="n">age</span><span class="p">:</span> <span class="nb">int</span>
    <span class="n">email</span><span class="p">:</span> <span class="nb">str</span>


<span class="n">person</span> <span class="o">=</span> <span class="n">Person</span><span class="p">.</span><span class="nf">from_dict</span><span class="p">(</span>
    <span class="p">{</span>
        <span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">John Doe</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">age</span><span class="sh">"</span><span class="p">:</span> <span class="mi">30</span><span class="p">,</span>
    <span class="p">}</span>
<span class="p">)</span>

<span class="n">result</span> <span class="o">=</span> <span class="n">person</span><span class="p">.</span><span class="nf">compute_fill_rate</span><span class="p">()</span>
<span class="nf">print</span><span class="p">(</span><span class="n">result</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">name</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">result</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">age</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">result</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">email</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">result</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1.0
1.0
0.0
0.667
</code></pre></div></div>

<p>You land around <strong>66.7%</strong> mean completeness when <strong>2 / 3</strong> fields are present.</p>

<div class="callout note">
  <span class="callout-icon">📝</span>
  <div class="callout-body">
    <div class="callout-title">Note</div>
    <p>
      Think of per-field scores as <strong>1.0</strong> when the field is present and valid,
      and <strong>0.0</strong> when it is missing or fails validation.
      If you need weighted summaries, Spec weights apply here too.
    </p>
  </div>
</div>

<h2 id="fill-rate-accuracy-did-we-miss-the-same-fields">Fill rate accuracy: did we miss the same fields?</h2>

<p>Fill rate accuracy compares <strong>two objects</strong>, but still focuses on <strong>presence</strong>, not semantic equality.
That is useful when you want to know whether your extractor <strong>skipped the same sections</strong> as your reference label.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">got</span> <span class="o">=</span> <span class="n">Person</span><span class="p">.</span><span class="nf">from_dict</span><span class="p">({</span><span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">John</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">age</span><span class="sh">"</span><span class="p">:</span> <span class="mi">30</span><span class="p">})</span>
<span class="n">expected</span> <span class="o">=</span> <span class="n">Person</span><span class="p">.</span><span class="nf">from_dict</span><span class="p">(</span>
    <span class="p">{</span>
        <span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Jane</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">age</span><span class="sh">"</span><span class="p">:</span> <span class="mi">25</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">email</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">jane@example.com</span><span class="sh">"</span><span class="p">,</span>
    <span class="p">}</span>
<span class="p">)</span>

<span class="n">accuracy</span> <span class="o">=</span> <span class="n">got</span><span class="p">.</span><span class="nf">compute_fill_rate_accuracy</span><span class="p">(</span><span class="n">expected</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">accuracy</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">name</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">accuracy</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">age</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">accuracy</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">email</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">accuracy</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1.0
1.0
0.0
0.667
</code></pre></div></div>

<p>Here <strong>66.7%</strong> means <strong>2 / 3</strong> fields share the same filled-or-missing pattern (both sides have <code class="language-plaintext highlighter-rouge">name</code> and <code class="language-plaintext highlighter-rouge">age</code>, only <code class="language-plaintext highlighter-rouge">expected</code> has <code class="language-plaintext highlighter-rouge">email</code>).</p>
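<p>The presence-agreement idea is easy to restate outside the library. This plain-dict sketch is not Cobjectric's internals, just the scoring rule in miniature:</p>

```python
def fill_rate_accuracy(got: dict, expected: dict, fields: list) -> float:
    """Score 1.0 per field when both sides agree on filled vs missing,
    0.0 when one side has the field and the other does not.
    Values are ignored at this stage; only presence matters."""
    scores = [
        1.0 if (field in got) == (field in expected) else 0.0
        for field in fields
    ]
    return sum(scores) / len(scores)
```

<p>For the example above (<code>name</code> and <code>age</code> on both sides, <code>email</code> only on <code>expected</code>) this yields 2 / 3 ≈ 0.667.</p>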

<h2 id="similarity-near-matches-for-noisy-text">Similarity: near matches for noisy text</h2>

<p>When both sides have text, you usually care about <strong>near matches</strong>, not character-by-character identity. Think casing edits, spacing, light paraphrases, or abbreviations, not only literal typos.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">cobjectric</span> <span class="kn">import</span> <span class="n">BaseModel</span>
<span class="kn">from</span> <span class="n">cobjectric.specs</span> <span class="kn">import</span> <span class="n">TextSpec</span>


<span class="k">class</span> <span class="nc">Article</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
    <span class="n">title</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="nc">TextSpec</span><span class="p">(</span><span class="n">scorer</span><span class="o">=</span><span class="sh">"</span><span class="s">WRatio</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">content</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="nc">TextSpec</span><span class="p">(</span><span class="n">scorer</span><span class="o">=</span><span class="sh">"</span><span class="s">WRatio</span><span class="sh">"</span><span class="p">)</span>


<span class="n">reference</span> <span class="o">=</span> <span class="n">Article</span><span class="p">.</span><span class="nf">from_dict</span><span class="p">(</span>
    <span class="p">{</span>
        <span class="sh">"</span><span class="s">title</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Introduction to Machine Learning</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">content</span><span class="sh">"</span><span class="p">:</span> <span class="p">(</span>
            <span class="sh">"</span><span class="s">Machine learning is a subset of artificial intelligence.</span><span class="sh">"</span>
        <span class="p">),</span>
    <span class="p">}</span>
<span class="p">)</span>

<span class="n">parsed</span> <span class="o">=</span> <span class="n">Article</span><span class="p">.</span><span class="nf">from_dict</span><span class="p">(</span>
    <span class="p">{</span>
        <span class="sh">"</span><span class="s">title</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Introduction to machine learning</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">content</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Machine learning is a subset of AI.</span><span class="sh">"</span><span class="p">,</span>
    <span class="p">}</span>
<span class="p">)</span>

<span class="n">similarity</span> <span class="o">=</span> <span class="n">parsed</span><span class="p">.</span><span class="nf">compute_similarity</span><span class="p">(</span><span class="n">reference</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">similarity</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">title</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">similarity</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">content</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">similarity</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1.0
0.8735294117647059
0.9367647058823529
</code></pre></div></div>

<p>That is the practical win: casing changes can score <strong>100%</strong>, while light paraphrases still land around <strong>87.4%</strong> on <code class="language-plaintext highlighter-rouge">content</code>, so the overall score stays near <strong>93.7%</strong>.</p>

<p>Other slots should stay <strong>exact</strong> once normalization runs: IDs, enums, fixed taxonomy labels, SKUs. <strong><code class="language-plaintext highlighter-rouge">KeywordSpec</code></strong> uses <strong>exact similarity</strong> on those strings (with preprocessing such as stripping whitespace and optional int-to-string coercion), so you do not get partial fuzzy credit when the value must match.</p>
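<p>To picture that "equal or wrong" contract, here is a minimal stdlib-only sketch of keyword-style similarity. This is my own illustration of the idea, not Cobjectric's internal code, and the <code>coerce_int</code> flag is a hypothetical stand-in for the optional int-to-string coercion mentioned above:</p>

```python
def keyword_similarity(parsed, reference, coerce_int=True):
    """Exact match after light normalization: 1.0 or 0.0, no partial credit."""
    def normalize(value):
        # Optionally coerce ints so 42 and "42" compare equal.
        if coerce_int and isinstance(value, int):
            value = str(value)
        # Strip surrounding whitespace; the token itself stays untouched.
        return value.strip() if isinstance(value, str) else value

    return 1.0 if normalize(parsed) == normalize(reference) else 0.0


print(keyword_similarity("  SKU-123 ", "SKU-123"))  # 1.0
print(keyword_similarity(42, "42"))                 # 1.0
print(keyword_similarity("SKU-123", "sku-124"))     # 0.0: no fuzzy credit
```

<p>The point of the sketch is the cliff: a one-character drift on an ID scores <strong>0.0</strong>, exactly what you want when the value is a code, not prose.</p>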

<div class="callout tip">
  <span class="callout-icon">💡</span>
  <div class="callout-body">
    <div class="callout-title">Tip</div>
    <p>
      Use <code>TextSpec</code> for free-form prose so normalization (case, spacing, accents) and RapidFuzz-backed similarity stay consistent. Tune <code>scorer</code> (for example <code>WRatio</code>) when you need stricter or looser fuzzy behavior.
    </p>
    <p>
      Use <code>KeywordSpec</code> when the contract is effectively "equal or wrong": matching normalized tokens must score <strong>1.0</strong>, anything else scores <strong>0.0</strong>.
      For fully custom rules you can still attach <code>similarity_func</code> or use helpers such as <code>exact_similarity</code> from <code>cobjectric.similarity</code>; see <a href="https://cobjectric.nigiva.com/similarity/">Similarity</a> and <a href="https://cobjectric.nigiva.com/specs/">Pre-defined Specs</a>.
    </p>
  </div>
</div>

<h2 id="lists-match-items-even-when-order-shifts">Lists: match items even when order shifts</h2>

<p>Models often emit <strong>arrays</strong> of nested objects. Pairwise index alignment works when order is stable.
When it is not, <strong><code class="language-plaintext highlighter-rouge">ListCompareStrategy.OPTIMAL_ASSIGNMENT</code></strong> finds a strong one-to-one pairing.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">cobjectric</span> <span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">Spec</span><span class="p">,</span> <span class="n">ListCompareStrategy</span>
<span class="kn">from</span> <span class="n">cobjectric.specs</span> <span class="kn">import</span> <span class="n">KeywordSpec</span>


<span class="k">class</span> <span class="nc">Skill</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="nc">KeywordSpec</span><span class="p">()</span>
    <span class="n">level</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="nc">KeywordSpec</span><span class="p">()</span>


<span class="k">class</span> <span class="nc">Developer</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
    <span class="n">skills</span><span class="p">:</span> <span class="nb">list</span><span class="p">[</span><span class="n">Skill</span><span class="p">]</span> <span class="o">=</span> <span class="nc">Spec</span><span class="p">(</span>
        <span class="n">list_compare_strategy</span><span class="o">=</span><span class="n">ListCompareStrategy</span><span class="p">.</span><span class="n">OPTIMAL_ASSIGNMENT</span>
    <span class="p">)</span>


<span class="n">reference</span> <span class="o">=</span> <span class="n">Developer</span><span class="p">.</span><span class="nf">from_dict</span><span class="p">(</span>
    <span class="p">{</span>
        <span class="sh">"</span><span class="s">skills</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span>
            <span class="p">{</span><span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Python</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">level</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Expert</span><span class="sh">"</span><span class="p">},</span>
            <span class="p">{</span><span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">JavaScript</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">level</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Intermediate</span><span class="sh">"</span><span class="p">},</span>
            <span class="p">{</span><span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">SQL</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">level</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Advanced</span><span class="sh">"</span><span class="p">},</span>
        <span class="p">]</span>
    <span class="p">}</span>
<span class="p">)</span>

<span class="n">parsed</span> <span class="o">=</span> <span class="n">Developer</span><span class="p">.</span><span class="nf">from_dict</span><span class="p">(</span>
    <span class="p">{</span>
        <span class="sh">"</span><span class="s">skills</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span>
            <span class="p">{</span><span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">JavaScript</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">level</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Intermediate</span><span class="sh">"</span><span class="p">},</span>
            <span class="p">{</span><span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">SQL</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">level</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Advanced</span><span class="sh">"</span><span class="p">},</span>
            <span class="p">{</span><span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Python</span><span class="sh">"</span><span class="p">,</span> <span class="sh">"</span><span class="s">level</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Expert</span><span class="sh">"</span><span class="p">},</span>
        <span class="p">]</span>
    <span class="p">}</span>
<span class="p">)</span>

<span class="n">similarity</span> <span class="o">=</span> <span class="n">parsed</span><span class="p">.</span><span class="nf">compute_similarity</span><span class="p">(</span><span class="n">reference</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">similarity</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1.0
</code></pre></div></div>

<p><strong>100%</strong> here means every aligned pair matches on structured fields, even though the incoming list was rotated.</p>

<div class="callout warning">
  <span class="callout-icon">⚠️</span>
  <div class="callout-body">
    <div class="callout-title">Warning</div>
    <p>
      Default <strong>pairwise</strong> alignment compares index <code>i</code> on both sides.
      If your generator shuffles sections (skills, roles, bullet lists), pairwise similarity will look unfairly bad even when the content is right.
      Reach for <strong>Levenshtein</strong> when order is mostly stable but items insert or drop,
      or <strong>optimal assignment</strong> when order is unreliable (SciPy required for that strategy).
    </p>
  </div>
</div>
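<p>To see why the strategy choice matters, here is a tiny stdlib-only illustration of pairwise versus optimal assignment. It brute-forces permutations instead of calling SciPy's solver, so it is only a sketch for short lists, not how the library computes the pairing:</p>

```python
from itertools import permutations


def pairwise_score(parsed, reference, sim):
    """Compare index i to index i on both sides; rotation is punished."""
    return sum(sim(p, r) for p, r in zip(parsed, reference)) / len(reference)


def optimal_assignment_score(parsed, reference, sim):
    """Best one-to-one pairing; brute force stands in for SciPy here."""
    return max(
        sum(sim(p, r) for p, r in zip(perm, reference)) / len(reference)
        for perm in permutations(parsed)
    )


exact = lambda a, b: 1.0 if a == b else 0.0
reference = ["Python", "JavaScript", "SQL"]
rotated = ["JavaScript", "SQL", "Python"]

print(pairwise_score(rotated, reference, exact))            # 0.0
print(optimal_assignment_score(rotated, reference, exact))  # 1.0
```

<p>Same content, same items: pairwise sees three mismatches and reports <strong>0.0</strong>, while the best one-to-one pairing recovers <strong>1.0</strong>, which is the rotated-skills result from the example above.</p>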

<h2 id="case-study-a-cv-shaped-schema-with-all-three-metrics">Case study: a CV-shaped schema with all three metrics</h2>

<p>This mirrors my original use case: nested <strong><code class="language-plaintext highlighter-rouge">Experience</code></strong> rows plus fuzzy <strong><code class="language-plaintext highlighter-rouge">summary</code></strong> text. Assume <strong><code class="language-plaintext highlighter-rouge">Experience</code></strong>, <strong><code class="language-plaintext highlighter-rouge">CV</code></strong>, <strong><code class="language-plaintext highlighter-rouge">reference_cv</code></strong>, and <strong><code class="language-plaintext highlighter-rouge">llm_cv</code></strong> match the expanded snippet below.</p>


<p>On this toy pair, <strong>completeness lines up</strong>, but <strong>wording still drifts</strong>, which is exactly when you want all three APIs:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">fill_only</span> <span class="o">=</span> <span class="n">llm_cv</span><span class="p">.</span><span class="nf">compute_fill_rate</span><span class="p">()</span>
<span class="n">presence_match</span> <span class="o">=</span> <span class="n">llm_cv</span><span class="p">.</span><span class="nf">compute_fill_rate_accuracy</span><span class="p">(</span><span class="n">reference_cv</span><span class="p">)</span>
<span class="n">value_match</span> <span class="o">=</span> <span class="n">llm_cv</span><span class="p">.</span><span class="nf">compute_similarity</span><span class="p">(</span><span class="n">reference_cv</span><span class="p">)</span>

<span class="nf">print</span><span class="p">(</span><span class="n">fill_only</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
<span class="nf">print</span><span class="p">(</span><span class="n">presence_match</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
<span class="nf">print</span><span class="p">(</span><span class="n">value_match</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
</code></pre></div></div>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1.0
1.0
0.9755555555555555
</code></pre></div></div>

<p><strong>Readout:</strong> fill rate is <strong>100%</strong> because the model output is fully populated. Fill-rate accuracy is also <strong>100%</strong> because the same slots are filled on both sides. Similarity is <strong>~97.6%</strong> because names and summaries are close, not identical, even when companies and descriptions line up.</p>

<details>
<summary>Show full CV example (models, payloads, all metrics)</summary>
<div class="details-content">

    <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">cobjectric</span> <span class="kn">import</span> <span class="n">BaseModel</span><span class="p">,</span> <span class="n">Spec</span><span class="p">,</span> <span class="n">ListCompareStrategy</span>
<span class="kn">from</span> <span class="n">cobjectric.specs</span> <span class="kn">import</span> <span class="n">KeywordSpec</span><span class="p">,</span> <span class="n">TextSpec</span>


<span class="k">class</span> <span class="nc">Experience</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
    <span class="n">company</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="nc">KeywordSpec</span><span class="p">()</span>
    <span class="n">title</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="nc">TextSpec</span><span class="p">()</span>
    <span class="n">description</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="nc">TextSpec</span><span class="p">(</span><span class="n">scorer</span><span class="o">=</span><span class="sh">"</span><span class="s">WRatio</span><span class="sh">"</span><span class="p">)</span>


<span class="k">class</span> <span class="nc">CV</span><span class="p">(</span><span class="n">BaseModel</span><span class="p">):</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="nc">TextSpec</span><span class="p">()</span>
    <span class="n">summary</span><span class="p">:</span> <span class="nb">str</span> <span class="o">=</span> <span class="nc">TextSpec</span><span class="p">(</span><span class="n">scorer</span><span class="o">=</span><span class="sh">"</span><span class="s">WRatio</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">experiences</span><span class="p">:</span> <span class="nb">list</span><span class="p">[</span><span class="n">Experience</span><span class="p">]</span> <span class="o">=</span> <span class="nc">Spec</span><span class="p">(</span>
        <span class="n">list_compare_strategy</span><span class="o">=</span><span class="n">ListCompareStrategy</span><span class="p">.</span><span class="n">OPTIMAL_ASSIGNMENT</span>
    <span class="p">)</span>


<span class="n">reference_cv</span> <span class="o">=</span> <span class="n">CV</span><span class="p">.</span><span class="nf">from_dict</span><span class="p">(</span>
    <span class="p">{</span>
        <span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Jean-Pierre Dupont</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">summary</span><span class="sh">"</span><span class="p">:</span> <span class="p">(</span>
            <span class="sh">"</span><span class="s">Senior Software Engineer with 10 years of experience </span><span class="sh">"</span>
            <span class="sh">"</span><span class="s">in Python and ML.</span><span class="sh">"</span>
        <span class="p">),</span>
        <span class="sh">"</span><span class="s">experiences</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span>
            <span class="p">{</span>
                <span class="sh">"</span><span class="s">company</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">TechCorp</span><span class="sh">"</span><span class="p">,</span>
                <span class="sh">"</span><span class="s">title</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Senior Software Engineer</span><span class="sh">"</span><span class="p">,</span>
                <span class="sh">"</span><span class="s">description</span><span class="sh">"</span><span class="p">:</span> <span class="p">(</span>
                    <span class="sh">"</span><span class="s">Led development of ML pipelines. </span><span class="sh">"</span>
                    <span class="sh">"</span><span class="s">Managed team of 5 engineers.</span><span class="sh">"</span>
                <span class="p">),</span>
            <span class="p">}</span>
        <span class="p">],</span>
    <span class="p">}</span>
<span class="p">)</span>

<span class="n">llm_cv</span> <span class="o">=</span> <span class="n">CV</span><span class="p">.</span><span class="nf">from_dict</span><span class="p">(</span>
    <span class="p">{</span>
        <span class="sh">"</span><span class="s">name</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Jean Pierre Dupont</span><span class="sh">"</span><span class="p">,</span>
        <span class="sh">"</span><span class="s">summary</span><span class="sh">"</span><span class="p">:</span> <span class="p">(</span>
            <span class="sh">"</span><span class="s">Senior Software Engineer with 10+ years experience </span><span class="sh">"</span>
            <span class="sh">"</span><span class="s">in Python &amp; ML</span><span class="sh">"</span>
        <span class="p">),</span>
        <span class="sh">"</span><span class="s">experiences</span><span class="sh">"</span><span class="p">:</span> <span class="p">[</span>
            <span class="p">{</span>
                <span class="sh">"</span><span class="s">company</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">TechCorp</span><span class="sh">"</span><span class="p">,</span>
                <span class="sh">"</span><span class="s">title</span><span class="sh">"</span><span class="p">:</span> <span class="sh">"</span><span class="s">Senior Software Engineer</span><span class="sh">"</span><span class="p">,</span>
                <span class="sh">"</span><span class="s">description</span><span class="sh">"</span><span class="p">:</span> <span class="p">(</span>
                    <span class="sh">"</span><span class="s">Led development of ML pipelines.  </span><span class="sh">"</span>
                    <span class="sh">"</span><span class="s">Managed team of 5 engineers.</span><span class="sh">"</span>
                <span class="p">),</span>
            <span class="p">}</span>
        <span class="p">],</span>
    <span class="p">}</span>
<span class="p">)</span>

<span class="n">fill_only</span> <span class="o">=</span> <span class="n">llm_cv</span><span class="p">.</span><span class="nf">compute_fill_rate</span><span class="p">()</span>
<span class="n">presence_match</span> <span class="o">=</span> <span class="n">llm_cv</span><span class="p">.</span><span class="nf">compute_fill_rate_accuracy</span><span class="p">(</span><span class="n">reference_cv</span><span class="p">)</span>
<span class="n">value_match</span> <span class="o">=</span> <span class="n">llm_cv</span><span class="p">.</span><span class="nf">compute_similarity</span><span class="p">(</span><span class="n">reference_cv</span><span class="p">)</span>

<span class="nf">print</span><span class="p">(</span><span class="n">fill_only</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
<span class="nf">print</span><span class="p">(</span><span class="n">presence_match</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
<span class="nf">print</span><span class="p">(</span><span class="n">value_match</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">name</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">value_match</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">summary</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">value_match</span><span class="p">.</span><span class="n">fields</span><span class="p">.</span><span class="n">experiences</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">fields</span><span class="p">.</span><span class="n">description</span><span class="p">.</span><span class="n">value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">value_match</span><span class="p">.</span><span class="nf">mean</span><span class="p">())</span>
</code></pre></div>    </div>

    <div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1.0
1.0
0.9444444444444444
0.9333333333333332
1.0
0.9755555555555555
</code></pre></div>    </div>

  </div>
</details>

<h2 id="beyond-hiring-cvs">Beyond hiring CVs</h2>

<p>Once the metrics exist, you can reuse them anywhere structured output shows up:</p>

<ul>
  <li><strong>API contracts</strong>: did we return the same shaped object across versions?</li>
  <li><strong>LLM benchmarks</strong>: swap prompts or models, keep the schema fixed, log means.</li>
  <li><strong>Data quality</strong>: measure completeness before pushing rows downstream.</li>
  <li><strong>Migration QA</strong>: compare legacy vs new serializers field by field.</li>
</ul>

<h2 id="built-in-specs-quick-map">Built-in Specs (quick map)</h2>

<p>If you want batteries-included normalizers and similarity defaults, Cobjectric ships <strong><code class="language-plaintext highlighter-rouge">KeywordSpec</code></strong>, <strong><code class="language-plaintext highlighter-rouge">TextSpec</code></strong>, <strong><code class="language-plaintext highlighter-rouge">NumericSpec</code></strong>, <strong><code class="language-plaintext highlighter-rouge">BooleanSpec</code></strong>, and <strong><code class="language-plaintext highlighter-rouge">DatetimeSpec</code></strong>.</p>

<details>
<summary>Expand Spec cheat sheet</summary>
<div class="details-content">

    <table>
      <thead>
        <tr>
          <th>Spec</th>
          <th>Good for</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <td><code class="language-plaintext highlighter-rouge">KeywordSpec</code></td>
          <td>IDs, enums, codes (<code class="language-plaintext highlighter-rouge">strip</code>, optional int-to-string coercion)</td>
        </tr>
        <tr>
          <td><code class="language-plaintext highlighter-rouge">TextSpec</code></td>
          <td>Long prose with normalization + RapidFuzz similarity</td>
        </tr>
        <tr>
          <td><code class="language-plaintext highlighter-rouge">NumericSpec</code></td>
          <td>JSON number quirks + tolerant similarity</td>
        </tr>
        <tr>
          <td><code class="language-plaintext highlighter-rouge">BooleanSpec</code></td>
          <td>Loose truthy parsing</td>
        </tr>
        <tr>
          <td><code class="language-plaintext highlighter-rouge">DatetimeSpec</code></td>
          <td>ISO-ish timestamps with optional tolerance</td>
        </tr>
      </tbody>
    </table>

    <p>For field-level weights, custom normalizers, and aggregation helpers, read <strong><a href="https://cobjectric.nigiva.com/specs/">Pre-defined Specs</a></strong> and <strong><a href="https://cobjectric.nigiva.com/field_specs/">Field Specifications</a></strong>.</p>

  </div>
</details>
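<p>As a mental model for the tolerance-style specs in that table, here is a hand-rolled sketch of a tolerant datetime comparison. This is my illustration of the idea, not <code>DatetimeSpec</code>'s actual signature or scoring curve:</p>

```python
from datetime import datetime, timedelta


def datetime_similarity(parsed, reference, tolerance=timedelta(minutes=1)):
    """Score 1.0 when timestamps fall within the tolerance window, else 0.0."""
    delta = abs(parsed - reference)
    return 1.0 if delta <= tolerance else 0.0


ref = datetime.fromisoformat("2026-05-10T12:00:00")
close = datetime.fromisoformat("2026-05-10T12:00:30")
far = datetime.fromisoformat("2026-05-10T13:00:00")

print(datetime_similarity(close, ref))  # 1.0: within the 1-minute window
print(datetime_similarity(far, ref))    # 0.0
```

<p>The design question a tolerance answers is "how much clock noise is acceptable before the field counts as wrong" and that threshold belongs in the spec, not in per-test assertions.</p>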

<div class="callout note">
  <span class="callout-icon">📝</span>
  <div class="callout-body">
    <div class="callout-title">Limits</div>
    <p>
      Fuzzy scores depend on RapidFuzz and your chosen <code>scorer</code>, so pin versions when you compare runs over time.
      <strong>Optimal assignment</strong> for lists needs SciPy.
      Similarity returns <strong>0.0</strong> when one side is missing while the other is filled, even if the gap is only on one nested field.
      When you need guarantees beyond strings and tolerances, pair these metrics with schema validation or task-specific checks.
    </p>
  </div>
</div>
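<p>The missing-versus-filled rule from the Limits note is easy to reproduce in a sketch. This is a plain function of my own, not the library's API; the both-missing branch returning <code>None</code> (excluded from the mean) is an assumption for illustration:</p>

```python
def field_similarity(parsed, reference, sim):
    """One-sided gap -> 0.0 before any fuzzy comparison runs."""
    if parsed is None and reference is None:
        return None  # nothing to compare; assumed excluded from the mean
    if parsed is None or reference is None:
        return 0.0   # filled on one side only is a hard miss
    return sim(parsed, reference)


exact = lambda a, b: 1.0 if a == b else 0.0
print(field_similarity(None, "filled", exact))  # 0.0
print(field_similarity("x", "x", exact))        # 1.0
```

<p>That hard <strong>0.0</strong> is why a single missing nested field can pull a whole object's mean down noticeably.</p>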

<h2 id="where-to-go-next">Where to go next</h2>

<p>Install from PyPI:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pip <span class="nb">install </span>cobjectric
</code></pre></div></div>

<p>Docs live at <strong><a href="https://cobjectric.nigiva.com/">cobjectric.nigiva.com</a></strong> (quick start, list strategies, pandas export). Source and issues are on <strong><a href="https://github.com/nigiva/cobjectric">GitHub</a></strong>.</p>]]></content><author><name>Nigiva</name><email>blog@nigiva.com</email></author><category term="project" /><category term="python" /><category term="ml" /><category term="tooling" /><summary type="html"><![CDATA[A Python library for fill rate, fill-rate accuracy, and fuzzy similarity on structured payloads such as CV parses or LLM JSON.]]></summary></entry><entry><title type="html">Hello world</title><link href="https://blog.nigiva.com/2026/05/01/hello-world.html" rel="alternate" type="text/html" title="Hello world" /><published>2026-05-01T00:00:00+02:00</published><updated>2026-05-01T00:00:00+02:00</updated><id>https://blog.nigiva.com/2026/05/01/hello-world</id><content type="html" xml:base="https://blog.nigiva.com/2026/05/01/hello-world.html"><![CDATA[<div class="callout tldr">
  <span class="callout-icon" aria-hidden="true">
    <svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M16 4h2a2 2 0 0 1 2 2v14a2 2 0 0 1-2 2H6a2 2 0 0 1-2-2V6a2 2 0 0 1 2-2h2" /><rect width="8" height="4" x="8" y="2" rx="1" ry="1" /><path d="M9 14h6" /><path d="M9 18h6" /><path d="M9 22h6" /></svg>
  </span>
  <div class="callout-body">
    <div class="callout-title">TL;DR</div>
    <div class="callout-md">
<p>This blog is my <strong>public lab notebook</strong>. I'm aiming for evidence: questions, controlled setups or benchmarks, and honest conclusions about what works <strong>where</strong>.</p>

<p>Posts reflect <strong>my</strong> views and <strong>my</strong> hardware unless I say otherwise.</p>

<p>Read <strong><a href="/about/">About</a></strong> for background, keys, and how email works here.</p>

    </div>
  </div>
</div>

<h2 id="disclaimers">Disclaimers</h2>

<p><strong>What's here is my opinion</strong>, not my employers', clients', or sponsors'. When something overlaps with work I'll still separate personal takes from official positions.</p>

<p><strong>Experiments run on my dime</strong>: my hardware, cloud credits I pay for, and code I own unless a note explicitly says I got sponsored GPU time or borrowed gear. When that happens I'll flag it up front.</p>

<p>If you spot a mistake or disagree with an argument, email <strong><a href="mailto:blog@nigiva.com">blog@nigiva.com</a></strong>. That's the inbox I actually read for serious replies.</p>

<h2 id="the-mantra-this-site-keeps-coming-back-to">The mantra this site keeps coming back to</h2>

<p>I care about answers you can <strong>actually stress-test</strong>.</p>

<p>When it's realistic I write from something close to the <strong>scientific method</strong>: nail down the claim, freeze the scenario, compare options with <strong>benchmarks</strong> or reproducible demos, then say clearly <strong>what's better here</strong> (not universal slogans). When evidence is thin I'll say so and treat the piece as a notebook entry, not a final verdict.</p>

<p>That's a big part of why I publish: writing forces assumptions, harness design, and failure modes out into the open.</p>

<p><strong>Replication</strong> fits in the same picture: document procedures clearly enough that people can rerun them. When calibration matters, redo external papers or older versions of my own notes. <strong>Same curve or a different one</strong>, both teach you something.</p>

<h2 id="why-bother-publishing">Why bother publishing</h2>

<p>Writing keeps me honest. Turning fuzzy intuition into prose pushes me toward clearer assumptions, tighter experiments, and explicit limits.</p>

<p>This site is also <strong>external memory</strong>. Over a year I ship tons of fragments that never leave local repos or notebooks. Putting some of that in public helps me see what actually moved instead of feeling stuck.</p>

<p>Friends sometimes ask how I think about a topic. Instead of repeating long chats I can send <strong>one link</strong> that already has trade-offs, failures, or partial wins.</p>

<p>Publishing also lets me <strong>stress-test ideas</strong>. Some posts back hunches with numbers; others will age badly and deserve updates. Both are useful.</p>

<p>For <strong>opinion-heavy posts</strong>, keeping a dated version in public helps me stay <strong>honest about drift</strong>: I can point to what I believed earlier, admit when something aged poorly, show where I changed my mind, and explain <strong>why</strong> instead of pretending my takes were always consistent.</p>

<h2 id="what-youll-find">What you'll find</h2>

<ul>
  <li><strong>ML engineering</strong> notes.</li>
  <li><strong>Agentic systems</strong> experiments.</li>
  <li><strong>Reverse-engineering</strong> curiosity, written responsibly.</li>
  <li><strong>Art-adjacent</strong> tangents when they cross tooling.</li>
  <li><strong>Projects</strong> I'm working on.</li>
  <li>And maybe more… 😉</li>
</ul>

<div class="callout info">
  <span class="callout-icon">ℹ️</span>
  <div class="callout-body">
    <div class="callout-title">Info</div>
    <p>
      Want plain Markdown instead of HTML? Swap <strong><code>.html</code></strong> for
      <strong><code>.md</code></strong> on the URL path.
      For this note that's <a href="/2026/05/01/hello-world.md"><code>/2026/05/01/hello-world.md</code></a>
      (front matter included). Handy if you want to edit offline or paste into an agent.
    </p>
  </div>
</div>]]></content><author><name>Nigiva</name><email>blog@nigiva.com</email></author><category term="blog" /><category term="meta" /><summary type="html"><![CDATA[Why this blog exists, the evidence-first mindset, disclaimers, and what to expect.]]></summary></entry></feed>