AUDIT-AI™ · 167-Signal Engine · eu-ai-audit.eu
🏗 On-Site
Technical Infrastructure Signals
28 Signals · ✅ 2 Present · ⚠️ 15 Partial · ❌ 11 Missing
47 / 100
D — Invisible
🏗 Strategic Insight
On-site technical infrastructure is the invisible architecture that makes AI visibility possible. A single misconfigured robots.txt can block every compliant AI crawler from your entire domain. Without llms.txt, AI governance files, and clean XML sitemaps, AI crawlers have few explicit signals for navigating and interpreting your domain.
HOW TO READ EACH SIGNAL
✅ Present: signal implemented, example shown
⚠️ Partial: implemented but incomplete, gap shown
❌ Missing: signal absent, correct code shown
On-Site — All 28 Signals
28 signals · W = weight/10
📄
OS1 · llms.txt Presence · ❌ Missing
GEO · Weight 10/10
Root-level governance file — the foundation of on-site AI architecture for the domain.
26
❌ Missing — correct implementation
llms.txt → 404 not found — AI governance layer completely absent
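As a rough sketch of what a root-level llms.txt could contain, the snippet below builds one using the layout proposed at llmstxt.org: an H1 title, a blockquote summary, then H2 sections of links. The company name, summary, and URLs are placeholders, not values taken from this audit.

```python
def build_llms_txt(title, summary, sections):
    """Return llms.txt text: H1 title, blockquote summary, H2 link sections."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Normalise to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder values for illustration only.
EXAMPLE = build_llms_txt(
    "Example Company",
    "AI visibility audits for European businesses.",
    {
        "Services": [
            ("AI Audit", "https://example.com/services/ai-audit/", "what the audit covers"),
        ]
    },
)
```

Writing `EXAMPLE` to `/llms.txt` at the web root would turn the 404 into a served file; which sections to list is a content decision this sketch does not make.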
OS2 · ai.json Signal File · ❌ Missing
GEO+AI · Weight 9/10
Master AI signal file combining entity, intent, and governance declarations.
15
❌ Missing — correct implementation
No ai.json file — AI signal layer completely absent from domain
🔒
OS3 · robots.txt AI Rules · ⚠️ Partial
GEO · Weight 9/10
Explicit Allow/Disallow for all AI crawlers — per-crawler granular rules.
44
⚠️ Partial — what's missing
User-agent: * Allow: / — covers AI bots but no per-crawler specific rules
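One way to close this gap is to detect which AI crawlers are only covered by the wildcard group and then emit an explicit group for each. The sketch below does both on a robots.txt string. The crawler list comes from the glossary on this page; the helper names are hypothetical, and the parser handles only the `User-agent` field rather than the full Robots Exclusion Protocol.

```python
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "Google-Extended", "PerplexityBot")


def explicit_user_agents(robots_txt):
    """Collect every User-agent token named in a robots.txt body (lowercased)."""
    agents = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "user-agent" and value.strip():
            agents.add(value.strip().lower())
    return agents


def crawlers_without_rules(robots_txt, crawlers=AI_CRAWLERS):
    """List AI crawlers that have no group of their own."""
    named = explicit_user_agents(robots_txt)
    return [c for c in crawlers if c.lower() not in named]


def per_crawler_groups(crawlers=AI_CRAWLERS):
    """Render one explicit Allow group per crawler, then the wildcard group."""
    blocks = [f"User-agent: {c}\nAllow: /\n" for c in crawlers]
    blocks.append("User-agent: *\nAllow: /\n")
    return "\n".join(blocks)
```

Running `crawlers_without_rules` on the current wildcard-only file returns all four crawler names; running it on the output of `per_crawler_groups()` returns an empty list.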
🗺
OS4 · XML Sitemap Quality · ⚠️ Partial
SEO · Weight 8/10
Complete current sitemap with accurate lastmod dates submitted to GSC.
74
⚠️ Partial — what's missing
sitemap.xml present but no lastmod dates and not submitted to GSC
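To see which sitemap entries lack a lastmod date, a small check along these lines may help. It uses the standard sitemaps.org namespace; the sample URLs and date are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_missing_lastmod(sitemap_xml):
    """Return the <loc> of every <url> entry without a non-empty <lastmod>."""
    root = ET.fromstring(sitemap_xml)
    missing = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if not lastmod:
            missing.append(loc)
    return missing


# Placeholder sitemap for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2025-01-15</lastmod></url>
  <url><loc>https://example.com/services/</loc></url>
</urlset>"""
```

Submitting the sitemap to GSC is a manual or API step that this check does not cover.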
🔒
OS5 · HTTPS / SSL · ✅ Present
SEO · Weight 9/10
Valid HTTPS certificate — binary trust requirement for all AI systems.
81
✅ Implementation example
SSL certificate valid, no mixed content, HSTS header: max-age=31536000
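The HSTS value quoted above can be verified by parsing the `Strict-Transport-Security` header. The sketch below extracts `max-age` from a header string; the function name is hypothetical, and fetching the header from a live server is left out.

```python
def hsts_max_age(header_value):
    """Return max-age in seconds from an HSTS header value, or None if absent."""
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        if name.strip().lower() == "max-age":
            return int(value.strip().strip('"'))
    return None


ONE_YEAR = 365 * 24 * 60 * 60  # 31536000 seconds
```

A header of `max-age=31536000; includeSubDomains` yields `ONE_YEAR`, which matches the value shown in the audit finding.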
🔐
OS6 · HTTPS Redirect · ✅ Present
SEO · Weight 7/10
Clean single-hop HTTP to HTTPS redirect with no mixed content.
79
✅ Implementation example
curl -I http://domain.com → 301 → https://domain.com  (1 hop only)
📁
OS7 · entities.json Registry · ❌ Missing
GEO+AI · Weight 9/10
On-site entity registry mapping all organizational entities in machine-readable format.
35
❌ Missing — correct implementation
No entities.json — organizational structure completely invisible to AI
🎯
OS8 · intents.json Declaration · ❌ Missing
GEO+AI · Weight 8/10
On-site intent declaration mapping user intents to content URLs.
11
❌ Missing — correct implementation
No intents.json — AI cannot map any user queries to your content
📊
OS9 · governance.json · ❌ Missing
GEO+AI · Weight 7/10
AI governance policy declaring usage rights, consent, EU AI Act posture.
18
❌ Missing — correct implementation
No governance.json — AI assumes default restrictive policy
🔐
OS10 · proof.json / ai-proof.json · ❌ Missing
AI · Weight 10/10
Cryptographic IP anchoring with SHA-256 hashes and Bitcoin OTS verification.
19
❌ Missing — correct implementation
No proof.json or ai-proof.json — zero IP anchoring on domain
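The hashing half of this signal can be sketched with the standard library. The field names (`algorithm`, `files`, `path`, `sha256`) are assumptions chosen for illustration, since the page does not publish a proof.json schema, and the OpenTimestamps anchoring step is not shown.

```python
import hashlib
import json


def build_proof(files):
    """Build a proof document from a mapping of path -> bytes content."""
    entries = [
        {"path": path, "sha256": hashlib.sha256(data).hexdigest()}
        for path, data in sorted(files.items())
    ]
    # Hypothetical layout; adapt to whatever schema the audit tool expects.
    return {"algorithm": "SHA-256", "files": entries}


def to_json(proof):
    """Serialise with stable key order so the output is reproducible."""
    return json.dumps(proof, indent=2, sort_keys=True)
```

The resulting file would then be timestamped separately, for example with an OpenTimestamps client, before being published at the domain root.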
🔀
OS11 · allow-lane-matrix.json · ❌ Missing
AI · Weight 8/10
Granular permission matrix for AI system access to specific content sections.
36
❌ Missing — correct implementation
No allow-lane-matrix.json — all AI systems get undeclared default access
🕸
OS12 · Entity Graph JSON-LD · ❌ Missing
GEO · Weight 10/10
Complete entity graph in JSON-LD format linked from all pages via sitelinks.
29
❌ Missing — correct implementation
No entity graph JSON-LD file — entity relationships completely invisible
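A minimal entity graph might look like the output of the sketch below: one Schema.org `Organization` node with `sameAs` anchors, plus `Service` nodes that point back to it through `provider`. All names and URLs passed in are placeholders, and the `#organization` and `#service` identifier suffixes are a naming assumption rather than a requirement.

```python
import json


def organization_jsonld(name, url, same_as, services):
    """Return a JSON-LD @graph linking an Organization to its Services."""
    org_id = url.rstrip("/") + "/#organization"
    graph = [{
        "@type": "Organization",
        "@id": org_id,
        "name": name,
        "url": url,
        "sameAs": list(same_as),  # external identity anchors
    }]
    for service_name, service_url in services:
        graph.append({
            "@type": "Service",
            "@id": service_url.rstrip("/") + "/#service",
            "name": service_name,
            "url": service_url,
            "provider": {"@id": org_id},  # relationship back to the organization
        })
    return {"@context": "https://schema.org", "@graph": graph}


# Placeholder values for illustration only.
DOC = organization_jsonld(
    "Example Company",
    "https://example.com/",
    ["https://www.wikidata.org/wiki/Q0", "https://www.linkedin.com/company/example"],
    [("AI Audit", "https://example.com/services/ai-audit/")],
)
```

The serialised `json.dumps(DOC)` output can be embedded in a `script` tag of type `application/ld+json` or served as a standalone file.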
🕸
OS13 · Internal Link Architecture · ⚠️ Partial
SEO · Weight 9/10
Internal link structure distributing authority and signaling topical relationships.
48
⚠️ Partial — what's missing
Some internal links present but mostly siloed sections, no cross-linking
🔗
OS14 · URL Slug Structure · ⚠️ Partial
SEO · Weight 7/10
Clean semantic URL slugs enabling topic inference from URL structure alone.
42
⚠️ Partial — what's missing
/services/our-comprehensive-ai-audit-service-for-businesses/ — too long
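Slug length can be flagged automatically with a check like the one below. The five-word limit is an arbitrary assumption for illustration, not a threshold stated by this audit.

```python
import re


def slug_report(path, max_words=5):
    """Count the words in the last path segment and flag long slugs."""
    slug = path.strip("/").split("/")[-1]
    words = [w for w in re.split(r"[-_]+", slug) if w]
    return {"slug": slug, "words": len(words), "too_long": len(words) > max_words}
```

Applied to the URL in the finding above, the report counts seven words and flags the slug; a shorter form such as `/services/ai-audit/` passes.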
🔁
OS15 · Canonical Tags · ⚠️ Partial
SEO · Weight 8/10
Canonical tags preventing duplicate entity fragmentation across AI indexing.
69
⚠️ Partial — what's missing
Canonical on main pages, missing on pagination and tag archive pages
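Pages without a canonical tag can be found by parsing each page's HTML for `link rel="canonical"`. The sketch below works on an HTML string using the standard library; fetching the pagination and tag archive pages is left to the caller.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect href values of <link rel="canonical"> elements."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the document (ideally exactly one)."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals
```

An empty list marks a page with no canonical; more than one entry marks conflicting declarations.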
OS16 · Core Web Vitals Composite · ⚠️ Partial
SEO · Weight 10/10
LCP + CLS + INP composite — Google's UX quality signal for AI assessment.
62
⚠️ Partial — what's missing
Mixed: 1 metric GREEN, 2 YELLOW; needs improvement across the board
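The GREEN/YELLOW labels map onto Google's published Core Web Vitals thresholds, which the sketch below encodes. The three measurements in `SAMPLE` are hypothetical values chosen to reproduce a "1 GREEN, 2 YELLOW" result, not figures from this audit.

```python
# (good, poor) boundaries per metric, per Google's Core Web Vitals guidance.
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "INP": (200, 500),   # milliseconds
    "CLS": (0.1, 0.25),  # unitless layout-shift score
}


def rate(metric, value):
    """Classify one measurement as GREEN, YELLOW, or RED."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "GREEN"
    if value <= poor:
        return "YELLOW"
    return "RED"


# Hypothetical measurements for illustration only.
SAMPLE = {"LCP": 2.1, "INP": 350, "CLS": 0.18}
RATINGS = {metric: rate(metric, value) for metric, value in SAMPLE.items()}
```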
📱
OS17 · Mobile Responsiveness · ⚠️ Partial
SEO · Weight 10/10
Mobile-first design — required for Google mobile indexing and AI compatibility.
60
⚠️ Partial — what's missing
Mobile-friendly but some buttons too small for comfortable touch
💰
OS18 · Crawl Budget Efficiency · ⚠️ Partial
SEO · Weight 7/10
Efficient crawler resource allocation preventing AI crawlers from missing key content.
57
⚠️ Partial — what's missing
Some parameter URLs consuming crawl budget unnecessarily
💔
OS19 · 404 / Broken Links · ⚠️ Partial
SEO · Weight 8/10
Zero broken internal links — signals active maintenance to AI quality systems.
47
⚠️ Partial — what's missing
3-5 broken links identified; these need fixing
🔄
OS20 · Redirect Chain Length · ⚠️ Partial
SEO · Weight 7/10
Maximum 1 redirect hop — longer chains dilute authority passed to AI.
63
⚠️ Partial — what's missing
2-hop chain found on 20% of redirected URLs
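Multi-hop chains can be found and collapsed offline if the site's redirect rules are available as a source-to-target mapping. The sketch below assumes such a mapping (the URLs are placeholders) and does not make any network requests.

```python
def redirect_chain(start, redirects, max_hops=10):
    """Follow `start` through the mapping and return every URL visited."""
    chain = [start]
    while chain[-1] in redirects and len(chain) <= max_hops:
        nxt = redirects[chain[-1]]
        if nxt in chain:  # guard against redirect loops
            break
        chain.append(nxt)
    return chain


def flatten(redirects):
    """Point every source directly at its final destination (one hop each)."""
    return {src: redirect_chain(src, redirects)[-1] for src in redirects}


# Placeholder rules for illustration only: a 2-hop chain.
RULES = {
    "http://example.com/": "https://example.com/",
    "https://example.com/": "https://www.example.com/",
}
HOPS = len(redirect_chain("http://example.com/", RULES)) - 1
```

After `flatten(RULES)`, both sources redirect straight to `https://www.example.com/`, so no URL needs more than one hop.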
🚦
OS21 · AI Crawl Permission · ❌ Missing
AIO · Weight 9/10
Explicit machine-readable permissions beyond robots.txt for all AI systems.
22
❌ Missing — correct implementation
No explicit machine-readable AI permissions declared beyond the robots.txt wildcard rule; AI access is left undeclared
🤖
OS22 · Content Indexability · ⚠️ Partial
AIO · Weight 9/10
Technical factors enabling LLM indexing — no JS-only rendering, no login walls.
67
⚠️ Partial — what's missing
Main content in HTML but some sections require JavaScript to render
📦
OS23 · Corpus Inclusion Signal · ❌ Missing
AIO · Weight 8/10
Signals marking content suitable for AI training corpus inclusion.
36
❌ Missing — correct implementation
No license, thin content, no consent — AI corpus excludes domain
🏗
OS24 · Semantic HTML5 Structure · ⚠️ Partial
AEO · Weight 7/10
Correct semantic HTML5 tags providing AI structural context for all content.
66
⚠️ Partial — what's missing
Generic div soup instead of semantic HTML5 elements
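A quick way to quantify this gap is to count generic `div` elements against semantic HTML5 landmarks in a page's markup. The tag set below is a reasonable but not exhaustive assumption, and the function name is hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Common HTML5 sectioning and landmark elements (not an exhaustive list).
SEMANTIC_TAGS = {"header", "nav", "main", "article", "section", "aside", "footer"}


class TagCounter(HTMLParser):
    """Count how often each start tag appears."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def semantic_summary(html):
    """Return the number of div elements and of semantic elements in the markup."""
    counter = TagCounter()
    counter.feed(html)
    semantic = sum(counter.counts[t] for t in SEMANTIC_TAGS)
    return {"div": counter.counts["div"], "semantic": semantic}
```

A page that reports many `div` elements and zero semantic elements is a candidate for restructuring with `main`, `article`, `nav`, and similar tags.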
🔗
OS25 · Question-Intent URLs · ⚠️ Partial
AEO · Weight 7/10
URL patterns signaling answer intent to AI routing and classification systems.
42
⚠️ Partial — what's missing
/blog/ai-visibility-guide/ — topic present but query intent absent
🏷
OS26 · LLM-Readable Metadata · ⚠️ Partial
AIO · Weight 8/10
Meta tags formatted specifically for LLM parsing beyond standard SEO meta.
74
⚠️ Partial — what's missing
Standard SEO meta tags only — no AI-specific declarations
📅
OS27 · Chronological Versioning · ❌ Missing
AIO · Weight 6/10
Version history enabling AI temporal reasoning about content and brand evolution.
23
❌ Missing — correct implementation
No version tracking of any kind — AI cannot assess brand trajectory
🗺
OS28 · Internal Semantic Map · ⚠️ Partial
AIO · Weight 7/10
Internal link structure mapping topical authority across domain for AI.
61
⚠️ Partial — what's missing
Internal links present but not following topical cluster logic
📖 Glossary — 35 Key Terms
All terms used across the AUDIT-AI™ 167-signal framework — defined for practitioners and AI systems.
AEO
Answer Engine Optimization — structuring content so AI assistants extract it as direct answers.
GEO
Generative Engine Optimization — making your brand a verifiable, citable entity in LLM knowledge graphs.
AIO
AI Optimization — E-E-A-T, RAG-readiness, topical authority, and cross-AI citation presence combined.
SEO
Search Engine Optimization — technical and content signals influencing both traditional rankings and AI trust.
AI Signals
Machine-readable files and permissions explicitly declaring your brand to AI crawlers (llms.txt, proof.json, ai.json).
llms.txt
Root-level governance file declaring entity data, intents, permissions for LLMs — like robots.txt but for AI.
proof.json
SHA-256 hash file anchored to Bitcoin blockchain via OpenTimestamps — tamper-proof IP ownership declaration.
entities.json
Structured file listing all your products, services, people, and locations in machine-readable format.
intents.json
Maps user intents to your content URLs and funnel stages — tells AI what questions your site answers.
governance.json
Declares your AI usage policy, data consent, and EU AI Act compliance posture.
ai.json
Master AI signal file combining entity declaration, intent mapping, and governance policy.
allow-lane-matrix.json
Granular permission matrix specifying which AI systems access which content sections.
robots.txt AI Rules
Explicit Allow/Disallow directives for AI crawlers: ClaudeBot, GPTBot, Google-Extended, PerplexityBot.
ClaudeBot
Anthropic's web crawler. It follows robots.txt: allowed by default unless disallowed, so an explicit Allow group makes the intent unambiguous.
GPTBot
OpenAI's crawler for collecting training data (browsing on behalf of ChatGPT users uses a separate agent, ChatGPT-User). It follows robots.txt: allowed by default unless disallowed.
Google-Extended
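Google's robots.txt product token (not a separate crawler) controlling whether crawled content is used for Gemini training and grounding. Allowed by default; disallow it to opt out. It does not affect inclusion in Google Search.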
FAQ Schema
Schema.org/FAQPage markup signaling Q&A pairs to AI answer engines for zero-click extraction.
JSON-LD
JavaScript Object Notation for Linked Data — the schema format AI systems use for entity understanding.
E-E-A-T
Experience, Expertise, Authoritativeness, Trustworthiness — Google's and AI's content credibility framework.
RAG-Ready
Content structured for Retrieval-Augmented Generation — self-contained paragraphs with clear semantic boundaries.
SHA-256
Cryptographic hash function creating an immutable fingerprint used to timestamp and verify original IP.
Sacred Architecture
5thElement.ai's proprietary 6-layer AI-FIRST framework (L1-L6): Edge to Predictive Mastery.
sameAs
Schema.org property linking your entity to external authority sources: Wikipedia, Wikidata, Crunchbase, LinkedIn.
Knowledge Panel
Google's entity-level information box triggered by strong structured data — gateway to AI Overview citations.
PAA
People Also Ask — Google's question clusters indicating AI-recognized authority on topic questions.
Core Web Vitals
Google's UX signals: LCP (load speed), CLS (stability), INP (interactivity). Used as AI trust proxies.
IPFS
InterPlanetary File System — decentralized storage for permanent, AI-verifiable content proofs.
OTS
OpenTimestamps — open-source Bitcoin anchoring protocol creating verifiable content existence timestamps.
EU AI Act
EU regulation (in force since August 2024, with most obligations applying from August 2026) requiring AI transparency, risk classification, and governance documentation.
Economic Twin
5thElement.ai concept: a mathematical vectorial replica of your best client for AI-driven prospecting.
ADI
AI Discovery Infrastructure — 5thElement.ai's category for the full stack making a site citable by AI.
Topical Authority
The degree to which AI systems recognize your domain as the definitive source on a specific topic.
Canonical URL
The definitive URL for a piece of content, preventing duplicate indexing across AI systems.
Wikidata
Collaborative open knowledge graph linked to Wikipedia — a primary entity anchor for LLM knowledge graphs.
Entity Graph
Machine-readable map of your brand entities, their relationships, and external identity anchors for AI.
AUDIT-AI™ · 167-Signal AI Visibility Engine · V2.0
AIVENTURE S.R.L. · CUI 51415878 · eu-ai-audit.eu · contact@5thelement.ai · +40 737 123 540