Website Keyword Intelligence
Detrended Signal (Topic Residuals)
Most linguistic data follows **Zipf’s Law**, where the frequency of a word is roughly inversely proportional to its frequency rank. This dashboard calculates the "Expected Frequency" curve from that law and subtracts it from the "Actual Frequency." The resulting **Residual Signal** highlights keywords that appear more often than their rank alone predicts, which suggests they are emphasized for this specific topic rather than reflecting the baseline distribution of natural language.
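As a rough illustration of the detrending step, the sketch below fits the simplest possible Zipf curve (frequency proportional to 1/rank, scaled so the expected counts sum to the observed total) and subtracts it from the actual counts. The function name `zipf_residuals` and the `exponent` parameter are assumptions made for this example; the dashboard's actual curve fit may differ.

```python
from collections import Counter


def zipf_residuals(tokens, exponent=1.0):
    """Return {word: actual_count - expected_count} under a simple Zipf curve.

    Assumption: expected count at rank r is proportional to 1 / r**exponent,
    normalised so that all expected counts add up to the total token count.
    """
    counts = Counter(tokens)
    ranked = counts.most_common()  # (word, count) pairs, highest count first
    if not ranked:
        return {}

    total = sum(counts.values())
    # Normalising constant: sum of 1 / r**exponent over all observed ranks.
    norm = sum(1.0 / (r ** exponent) for r in range(1, len(ranked) + 1))

    residuals = {}
    for rank, (word, actual) in enumerate(ranked, start=1):
        expected = total / (norm * rank ** exponent)
        residuals[word] = actual - expected
    return residuals
```

A positive residual means the word occurs more often than its rank predicts; a word list that follows the curve exactly yields residuals near zero.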
**Term Frequency-Inverse Document Frequency (TF-IDF)** weighting penalizes "Stopwords" and "Web Boilerplate" (e.g., *Menu, Home, Click*), which appear on almost every page. The horizontal bars show each term's score; a longer bar indicates a word that is frequent on this URL but rare across the reference documents, and that therefore carries more **Information Gain** for this specific URL.
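The weighting can be sketched in a few lines. This is a minimal version under stated assumptions: the reference `corpus` is a list of already tokenised pages, the function name `tf_idf` is hypothetical, and the smoothed IDF formula is one common variant, not necessarily the one this dashboard uses.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus):
    """Score each term of doc_tokens against corpus (a list of token lists)."""
    n_docs = len(corpus)
    doc_sets = [set(doc) for doc in corpus]
    term_counts = Counter(doc_tokens)

    scores = {}
    for term, count in term_counts.items():
        tf = count / len(doc_tokens)
        # Document frequency: how many reference pages contain the term.
        df = sum(1 for doc in doc_sets if term in doc)
        # Smoothed IDF (an assumption): never negative, never divides by zero.
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = tf * idf
    return scores
```

With this variant, a term that appears on every reference page keeps an IDF of exactly 1, while rarer terms are scaled up, so boilerplate words sink to the bottom of the ranking.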
Single words often lack intent. By extracting **Bi-grams** (contiguous sequences of two words), the engine identifies "Semantic Clusters": multi-word phrases that act as a single concept. This surfaces compound topics (e.g., *"machine learning"*) that single-word counters miss.
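A minimal sketch of bi-gram counting follows. The tokenising regex and the function name `top_bigrams` are assumptions for illustration; the engine's own tokeniser is not described here.

```python
import re
from collections import Counter


def top_bigrams(text, n=5):
    """Return the n most frequent adjacent word pairs in text."""
    # Assumed tokeniser: lowercase runs of letters, digits, and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    # Pair each token with its successor to form contiguous two-word windows.
    pairs = zip(tokens, tokens[1:])
    return Counter(" ".join(pair) for pair in pairs).most_common(n)
```

Note that this simple version pairs words across sentence boundaries; splitting the text into sentences first would avoid spurious pairs.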
Before processing, the engine performs a "Surgical Scrub" of the HTML. It identifies and removes non-content elements, including `nav`, `footer`, `script`, and `style`, so that the statistical analysis is performed only on the **Primary Content Body**.
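The scrub can be approximated with Python's standard-library `html.parser`. In this sketch the class `ContentExtractor` and the helper `extract_content` are hypothetical names, and the skip list contains only the four tags named above; a production scraper would likely need a longer list and more robust handling of malformed markup.

```python
from html.parser import HTMLParser

# Elements whose text should never reach the statistical analysis.
SKIP_TAGS = {"nav", "footer", "script", "style"}


class ContentExtractor(HTMLParser):
    """Collect visible text, ignoring everything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside any skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_content(html):
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Tracking a depth counter instead of a boolean keeps nested skipped elements (for example a `script` inside a `nav`) from re-enabling text capture too early.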