Feature
AI Detection Scoring
An honest, in-house heuristic. We tell you what it measures, what it doesn't, and why we don't claim parity with third-party detectors.
Every document Inksong processes gets a before/after score between 0 and 100. Higher numbers mean the text reads, by our measurement, as more machine-generated. Lower numbers mean it reads as more human. The drop between before and after is the signal that the humanization actually did something.
The score is a heuristic. We label it as such on every page where it appears. We're not a detector business and we don't pretend to be one — the score exists so you have a feedback loop on the rewrite, not so you can publish a claim about it.
How it works
What's happening under the hood
The score is built from five signals. The first is burstiness: the variance in sentence length across the document. Human writing tends to swing widely — a long subordinated sentence followed by a fragment. Most LLM output settles into a uniform mid-length cadence. Low variance, high score.
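A minimal sketch of the burstiness signal, assuming a naive regex sentence splitter and word counts as the unit of length. The function names are illustrative rather than Inksong's actual code, and the step that maps raw variance onto the score (low variance pushing the score up) is left out.

```python
import re
from statistics import pvariance


def sentence_lengths(text: str) -> list[int]:
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Length of each sentence in whitespace-separated words.
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Population variance of sentence length, in words."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # a single sentence has no variance to speak of
    return float(pvariance(lengths))


uniform = "One two three. Four five six. Seven eight nine."
varied = "Stop. This sentence is quite a bit longer than the first one."
print(burstiness(uniform), burstiness(varied))
```

Under these assumptions the uniform sample has zero variance and the varied sample a much larger one, which is the contrast the signal is meant to capture.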
The second is token-level repetition density, computed over bigrams. Models tend to lean on the same stock phrases (“it is important,” “in order to,” “moreover, the”) at characteristic rates, and those phrases show up as recurring word pairs. We measure how often a document falls into those grooves. The third is hedge-word density: the frequency of qualifiers like “may,” “could,” “arguably,” and “tends to.” LLM defaults skew higher than most human writing.
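The two signals above could be sketched roughly as follows. This is one interpretation, not the production method: repetition is measured as the share of bigram occurrences whose bigram appears more than once in the same document, and the hedge list is a hypothetical one seeded only with the qualifiers named above.

```python
import re
from collections import Counter

# Hypothetical hedge list, limited to the qualifiers mentioned in the text.
HEDGES = ("may", "could", "arguably", "tends to")
HEDGE_RE = re.compile(r"\b(?:" + "|".join(re.escape(h) for h in HEDGES) + r")\b")


def tokenize(text: str) -> list[str]:
    # Lowercase word tokens; punctuation is dropped.
    return re.findall(r"[a-z']+", text.lower())


def bigram_repetition_density(text: str) -> float:
    """Fraction of bigram occurrences that belong to a repeated bigram."""
    tokens = tokenize(text)
    bigrams = list(zip(tokens, tokens[1:]))
    if not bigrams:
        return 0.0
    counts = Counter(bigrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(bigrams)


def hedge_density(text: str) -> float:
    """Hedge matches per token; multi-word hedges count once."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = len(HEDGE_RE.findall(" ".join(tokens)))
    return hits / len(tokens)


print(bigram_repetition_density("It is important to plan. It is important to act."))
print(hedge_density("This may work and could fail."))
```

A real implementation might instead compare bigram rates against a reference list of stock phrases; the within-document count here is simply the smallest version that runs.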
The fourth is vocabulary entropy, measured as the n-gram predictability of the document under a baseline language model. Predictable token sequences score higher; surprising ones score lower. The fifth is the function-word distribution: the ratios of articles, prepositions, and conjunctions. Function-word fingerprints are surprisingly stable within a corpus, and AI text has a recognizable one.
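As a rough illustration of these last two signals, the sketch below swaps in an add-one-smoothed unigram model for the baseline language model (the simplest n-gram case, chosen for brevity) and uses small, hypothetical function-word lists. Neither the model nor the word lists come from Inksong; they are stand-ins.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately short function-word lists.
FUNCTION_WORDS = {
    "article": {"a", "an", "the"},
    "preposition": {"of", "in", "on", "to", "for", "with", "at", "by", "from"},
    "conjunction": {"and", "but", "or", "nor", "so", "yet"},
}


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())


def mean_surprisal(text: str, baseline_text: str) -> float:
    """Average bits per token under an add-one-smoothed unigram baseline.

    Lower values mean more predictable text under the baseline.
    """
    base = Counter(tokenize(baseline_text))
    total = sum(base.values())
    vocab = len(base) + 1  # one extra slot for unseen words
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    bits = [-math.log2((base[t] + 1) / (total + vocab)) for t in tokens]
    return sum(bits) / len(bits)


def function_word_profile(text: str) -> dict[str, float]:
    """Share of tokens falling in each function-word class."""
    tokens = tokenize(text)
    n = len(tokens) or 1  # avoid dividing by zero on empty input
    return {
        cls: sum(t in words for t in tokens) / n
        for cls, words in FUNCTION_WORDS.items()
    }


baseline = "the cat sat on the mat"
print(mean_surprisal("the the", baseline), mean_surprisal("zebra quasar", baseline))
print(function_word_profile("The cat and the dog sat on a mat"))
```

Comparing a document's profile with a reference profile (for example by a distance between the two ratio vectors) would be the natural next step, but the choice of distance is not specified in the text and is left open here.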
The five signals are combined into a weighted sum, normalized to 0–100, calibrated against an internal corpus of known-AI and known-human samples drawn from public datasets and our own test materials. Calibration is recalculated as the corpus grows.
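The combination step might look something like the sketch below. It assumes each signal has already been rescaled to the range 0 to 1, with 1 meaning "more machine-like", and it uses equal placeholder weights; the actual weights and calibration procedure are not given here, so every number in the block is hypothetical.

```python
# Placeholder weights: equal by assumption, not taken from the real calibration.
WEIGHTS = {
    "burstiness": 0.2,
    "bigram_repetition": 0.2,
    "hedge_density": 0.2,
    "predictability": 0.2,
    "function_words": 0.2,
}


def clamp01(x: float) -> float:
    return min(max(x, 0.0), 1.0)


def combine(signals: dict[str, float]) -> float:
    """Weighted sum of pre-scaled signals, normalized to 0-100."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[name] * clamp01(signals[name]) for name in WEIGHTS)
    return round(100 * weighted / total_weight, 1)


before = combine({name: 0.8 for name in WEIGHTS})
after = combine({name: 0.3 for name in WEIGHTS})
print(before, after, round(before - after, 1))
```

Dividing by the total weight keeps the result on the 0 to 100 scale even if the weights are later retuned and no longer sum to one.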
Here is what we are explicit about: this is not Turnitin and it is not GPTZero. We do not have parity with commercial detectors. The score does not measure quality, factuality, or whether your professor will accept the work. It measures statistical patterns that correlate with machine generation under our heuristic, nothing more. Use it as a feedback loop, not as proof.
Example
See it in action
Before: In today's rapidly evolving digital landscape, businesses must leverage cutting-edge technologies to remain competitive. By implementing strategic initiatives that prioritize innovation and customer-centricity, organizations can unlock new opportunities for growth. Moreover, the integration of data-driven decision-making frameworks enables companies to optimize their operations and deliver enhanced value to stakeholders.
After: Most businesses don't actually need cutting-edge anything; they need to figure out which technology problem is costing them money and fix that one. Strategic initiatives are mostly a way to spend a quarter pretending to plan. If you want growth, listen to customers. If you want better decisions, look at your data. The rest is conference talk.
Benefits
Why this matters
Instant feedback loop
See the before/after delta on every document. If the score didn't move, the rewrite didn't work and you'll know immediately.
Transparent algorithm
We publish the five signals we measure. No black box, no marketing math. You can argue with the methodology and we'll listen.
Costs you nothing
The score is included on every humanization run regardless of plan. Free, Pro, and Enterprise all see the same heuristic.
Start humanizing today
5 documents free a month, no card needed. Three minutes to your first humanized doc.
- 5 documents/month on the free tier
- No credit card required
- Cancel or upgrade anytime