AUDIT-AI™ · AI Signals — Crawler & Trust Layer

🤖 Strategic Insight

AI Signals are the identity layer of the AI economy — cryptographic proofs, crawler permissions, and machine-readable governance files that transform your domain from an HTML document into a verifiable, citable sovereign entity that AI systems can trust.

HOW TO READ EACH SIGNAL

✅ Present: signal implemented, example shown

⚠️ Partial: implemented but incomplete, gap shown

❌ Missing: signal absent, correct code shown

AI Signals — All 32 Signals

🤖

AI1 ClaudeBot Allow Directive ❌ Missing

ON-SITE · Weight 10/10

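Explicit Allow for Anthropic's ClaudeBot in robots.txt. Crawlers are allowed by default under the Robots Exclusion Protocol, so this rule matters when a wildcard Disallow would otherwise block ClaudeBot.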

❌ Missing — correct implementation

# No ClaudeBot rule — blocked by Disallow: *
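A minimal sketch of how this finding can be checked and fixed, using Python's standard urllib.robotparser. The two robots.txt bodies and the example.com URL are illustrative assumptions, not the audited site's actual files. It shows a wildcard group blocking ClaudeBot, an explicit ClaudeBot group restoring access, and an empty robots.txt allowing everything.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical "before" file: a wildcard group that blocks every crawler.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical "after" file: an explicit ClaudeBot group ahead of the wildcard.
WITH_CLAUDEBOT = """\
User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /
"""

URL = "https://example.com/pricing"  # placeholder URL

blocked_before = is_allowed(BLOCK_ALL, "ClaudeBot", URL)
allowed_after = is_allowed(WITH_CLAUDEBOT, "ClaudeBot", URL)
allowed_when_empty = is_allowed("", "ClaudeBot", URL)  # no rules at all
```

The last check is the important one when reading the finding: a missing ClaudeBot rule only blocks the crawler if some other group, typically the wildcard, disallows the path.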

🤖

AI2 GPTBot Allow Directive ❌ Missing

ON-SITE · Weight 10/10

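Explicit Allow for OpenAI's GPTBot, which collects content for model training. ChatGPT search and browsing use separate agents (OAI-SearchBot and ChatGPT-User), which need their own rules.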

❌ Missing — correct implementation

# GPTBot absent: no dedicated group, so the wildcard Disallow applies

🔍

AI3 Google-Extended Allow ❌ Missing

ON-SITE · Weight 9/10

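Rule for Google-Extended, a robots.txt product token rather than a separate crawler. It controls whether content Google crawls may be used for Gemini training and grounding. Use is permitted unless the token is disallowed.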

❌ Missing — correct implementation

# Google-Extended absent: wildcard Disallow applies, so Gemini grounding is blocked

🔎

AI4 PerplexityBot Allow ❌ Missing

ON-SITE · Weight 9/10

Explicit Allow for PerplexityBot — determines Perplexity citation eligibility.

❌ Missing — correct implementation

# No PerplexityBot rule — citations impossible

🌐

AI5 Meta-AI Allow CCBot ❌ Missing

ON-SITE · Weight 8/10

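Allow directive for CCBot, Common Crawl's crawler. Its public corpus is widely used for model training, including Meta's Llama models. Meta also runs its own crawler, meta-externalagent.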

❌ Missing — correct implementation

# CCBot blocked: pages are left out of new Common Crawl snapshots
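The crawler rules in AI2 through AI5 can be generated together. The sketch below is one possible approach, assuming the four agent tokens named above and a site that blocks everything else through the wildcard group. The function name build_robots and the example.com URL are hypothetical. The generated file is then parsed with the standard library to confirm the result.

```python
from urllib.robotparser import RobotFileParser

# Agent tokens taken from the signal list above.
AI_AGENTS = ["GPTBot", "Google-Extended", "PerplexityBot", "CCBot"]


def build_robots(agents: list[str], blocked_paths: list[str]) -> str:
    """Emit one Allow group per agent, then a wildcard group with Disallow rules."""
    lines: list[str] = []
    for agent in agents:
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in blocked_paths]
    return "\n".join(lines) + "\n"


robots_txt = build_robots(AI_AGENTS, ["/"])

# Verify the generated rules with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
home = "https://example.com/"  # placeholder URL
access = {agent: parser.can_fetch(agent, home) for agent in AI_AGENTS}
access["UnlistedBot"] = parser.can_fetch("UnlistedBot", home)
```

Each listed agent gets its own group because a crawler follows the most specific group that names it and ignores the wildcard group once such a match exists.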

🔐

AI6 proof.json SHA-256 ⚠️ Partial

ON-SITE · Weight 10/10

Cryptographic identity file with SHA-256 hashes anchored to the Bitcoin blockchain.

⚠️ Partial — what's missing

{"sha256":"31aa7a...","ots_status":"pending"}
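As a rough illustration of the hashing half of this signal, the sketch below computes a SHA-256 digest with Python's hashlib and wraps it in a record using the two field names visible in the snippet above (sha256, ots_status). The function names and the sample content are assumptions. The OpenTimestamps anchoring step, which would move the status from pending to confirmed, is handled by a separate client and is not shown.

```python
import hashlib
import json


def build_proof(content: bytes, ots_status: str = "pending") -> dict:
    """Hash the content and return a proof record with the fields shown above."""
    digest = hashlib.sha256(content).hexdigest()
    return {"sha256": digest, "ots_status": ots_status}


def matches(content: bytes, proof: dict) -> bool:
    """Check that content still hashes to the digest stored in the proof."""
    return hashlib.sha256(content).hexdigest() == proof["sha256"]


page = b"abc"  # stand-in for the real file bytes
proof = build_proof(page)
proof_json = json.dumps(proof, separators=(",", ":"))

unchanged = matches(page, proof)
tampered = matches(page + b"!", proof)
```

Any change to the content, even one byte, produces a different digest, which is what lets a verifier detect edits made after the timestamp.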

🔑

AI7 Cryptographic IP Proof ❌ Missing

ON-SITE · Weight 9/10

Bitcoin-anchored prior art proof with OpenTimestamps verification file.

❌ Missing — correct implementation

No proof.json, no .ots — IP completely unverifiable by AI

⛓

AI8 IPFS / Blockchain Reg. ❌ Missing

OFF-SITE · Weight 8/10

Decentralized permanent content registration on IPFS or blockchain.

❌ Missing — correct implementation

No IPFS registration — proof only on your own server

📋

AI9 session.json ❌ Missing

ON-SITE · Weight 7/10

Declares content versioning state for AI temporal reasoning.

❌ Missing — correct implementation

# /session.json → 404 not found

🏷

AI10 aliases.json ⚠️ Partial

ON-SITE · Weight 7/10

Declares all brand name variants and aliases in machine-readable format.

⚠️ Partial — what's missing

{"aliases":["5thElement"]}  // incomplete list

📜

AI11 policy.json ❌ Missing

ON-SITE · Weight 8/10

Declares which AI systems may use your content and under what conditions.

❌ Missing — correct implementation

# No policy.json — AI uses default assumptions

⚡

AI12 actions.json ❌ Missing

ON-SITE · Weight 7/10

Declares specific AI-mediated actions available on your domain.

❌ Missing — correct implementation

# No actions.json — AI cannot discover capabilities

🛡

AI13 Hallucination Prevention ❌ Missing

ON-SITE · Weight 9/10

Structured data providing AI systems with verifiable facts to prevent confabulation.

❌ Missing — correct implementation

No structured identity layer — AI hallucinates brand facts

🕸

AI14 Entity-Graph Completeness ❌ Missing

ON-SITE · Weight 9/10

Completeness of machine-readable entity graph covering all relationships.

❌ Missing — correct implementation

No entity graph — AI cannot map your organization
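One common starting point for an entity graph is a schema.org Organization object in JSON-LD. The sketch below builds one in Python. The property names (@context, @type, name, url, alternateName, sameAs) are standard schema.org vocabulary, while the company name, URLs, and the function name are placeholders chosen for illustration.

```python
import json


def organization_jsonld(name: str, url: str, aliases: list[str], same_as: list[str]) -> dict:
    """Build a schema.org Organization node with aliases and external identity links."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "alternateName": aliases,
        "sameAs": same_as,
    }


# Placeholder values; replace with the real entity's data.
entity = organization_jsonld(
    name="Example Co",
    url="https://example.com/",
    aliases=["ExampleCo", "Example Company"],
    same_as=[
        "https://www.wikidata.org/wiki/Q0",
        "https://www.linkedin.com/company/example-co",
    ],
)

# JSON-LD is usually embedded in the page head inside a script tag.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(entity, indent=2)
    + "\n</script>"
)
```

The sameAs links carry most of the weight here, since they tie the on-site entity to records that exist independently of the domain.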

📄

AI15 AI-Readable Density ⚠️ Partial

ON-PAGE · Weight 8/10

Information density optimized for LLM tokenization — facts per paragraph.

⚠️ Partial — what's missing

4-5 facts per paragraph, some generic filler sentences

🔒

AI16 Confidentiality Tags ❌ Missing

ON-SITE · Weight 7/10

Machine-readable declarations of which content is public vs restricted.

❌ Missing — correct implementation

# No boundary declarations — AI cannot assess access rights

✅

AI17 Training Data Consent ❌ Missing

ON-SITE · Weight 8/10

Explicit consent declaration for AI training data usage in machine-readable format.

❌ Missing — correct implementation

# No consent signal: each AI system falls back to its own default assumption

🇪🇺

AI18 EU AI Act Tag ⚠️ Partial

ON-SITE · Weight 9/10

Declaration of EU AI Act readiness — Article 50 transparency tags.

⚠️ Partial — what's missing

{"eu_ai_act":"in_progress"}

📊

AI19 AI Governance Statement ❌ Missing

ON-SITE · Weight 8/10

Human and machine-readable AI governance statement on domain.

❌ Missing — correct implementation

No governance declaration anywhere on domain

🔀

AI20 allow-lane-matrix.json ❌ Missing

ON-SITE · Weight 8/10

Granular matrix specifying which AI systems may access which content.

❌ Missing — correct implementation

# No allow-lane-matrix.json present on domain
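The source names this file but does not publish its format, so the following is only a sketch under an assumed schema: a "lanes" object that maps each agent token to the path prefixes it may read, with "*" as the fallback. Every key and the function name lane_allows are hypothetical.

```python
import json

# Hypothetical schema: agent token -> list of allowed path prefixes.
MATRIX_JSON = """
{
  "lanes": {
    "ClaudeBot": ["/blog/", "/docs/"],
    "GPTBot": ["/blog/"],
    "*": []
  }
}
"""


def lane_allows(matrix: dict, agent: str, path: str) -> bool:
    """Return True if the agent's lane (or the '*' fallback) covers the path."""
    lanes = matrix["lanes"]
    prefixes = lanes.get(agent, lanes.get("*", []))
    return any(path.startswith(prefix) for prefix in prefixes)


matrix = json.loads(MATRIX_JSON)

claude_docs = lane_allows(matrix, "ClaudeBot", "/docs/setup")
gpt_docs = lane_allows(matrix, "GPTBot", "/docs/setup")
other_blog = lane_allows(matrix, "OtherBot", "/blog/post-1")
```

A file like this is advisory. Crawlers enforce robots.txt, so any matrix of this kind would need matching robots.txt rules to have effect.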

✨

AI21 Gemini Grounding Verified ❌ Missing

OFF-SITE · Weight 9/10

Verification that content is actively used as Gemini factual grounding.

❌ Missing — correct implementation

Google-Extended blocked: Gemini cannot use this content for grounding

📈

AI22 Perplexity Citation Score ❌ Missing

OFF-SITE · Weight 8/10

Frequency and quality of citations in Perplexity answer responses.

❌ Missing — correct implementation

Zero Perplexity citations found — domain invisible in answer layer

🌐

AI23 AI Knowledge Graph Entry ❌ Missing

OFF-SITE · Weight 9/10

Verified entry in Google KG, Bing KG, or equivalent AI entity databases.

❌ Missing — correct implementation

No Knowledge Graph entry — brand is just a string of text

🧠

AI24 LLM Embedding Proximity ❌ Missing

OFF-SITE · Weight 8/10

Distance of brand entity to relevant concept clusters in LLM embedding space.

❌ Missing — correct implementation

Brand not in embedding space — category completely invisible

📐

AI25 Vectorial Brand Rep. ❌ Missing

OFF-SITE · Weight 7/10

Quality and consistency of brand's vectorial representation across AI models.

❌ Missing — correct implementation

Contradictory or absent brand description across all LLMs

💬

AI26 AI Answer Coverage ❌ Missing

OFF-SITE · Weight 9/10

Aggregate presence across ChatGPT, Gemini, Claude, Perplexity, Copilot.

❌ Missing — correct implementation

Brand absent or hallucinated across all AI systems tested

🎯

AI27 intents.json v1.1+ ❌ Missing

ON-SITE · Weight 8/10

Presence and validity of intents.json with 6-funnel-stage coverage.

❌ Missing — correct implementation

# /intents.json → 404 not found

🏆

AI28 ai-proof.json ❌ Missing

ON-SITE · Weight 8/10

Enhanced proof schema with OTS verification and Bitcoin block anchors.

❌ Missing — correct implementation

# No ai-proof.json — no verifiable IP registry on domain

📁

AI29 entity-index SSOT ❌ Missing

ON-SITE · Weight 7/10

Master manifest of all AI-readable files on your domain.

❌ Missing — correct implementation

# No entity-index.json — AI cannot discover signal files

🏅

AI30 AI-Ready Score Decl. ❌ Missing

ON-SITE · Weight 7/10

Self-declared AI readiness score in machine-readable format on domain.

❌ Missing — correct implementation

# No AI-Ready score declaration found on domain

⚓

AI31 Zero-Hallucination Anchors ❌ Missing

ON-SITE · Weight 8/10

Structured factual anchors preventing AI confabulation about your brand.

❌ Missing — correct implementation

No structured factual anchors — AI invents brand details

🔄

AI32 Cross-AI Consistency ❌ Missing

OFF-SITE · Weight 9/10

Consistency of brand representation across ChatGPT, Gemini, Claude, Perplexity.

❌ Missing — correct implementation

Contradictory descriptions across LLMs — entity fragmented

Run a live audit on your domain

167 signals · instant results · action plan · €49 setup · eu-ai-audit.eu

📖 Glossary — 35 Key Terms

All terms used across the AUDIT-AI™ 167-signal framework — defined for practitioners and AI systems.

AEO

Answer Engine Optimization — structuring content so AI assistants extract it as direct answers without the user clicking.

GEO

Generative Engine Optimization — making your brand a verifiable, citable entity in LLM knowledge graphs.

AIO

AI Optimization — E-E-A-T, RAG-readiness, topical authority, and cross-AI citation presence combined.

SEO

Search Engine Optimization — technical and content signals influencing both traditional rankings and AI trust.

AI Signals

Machine-readable files and permissions explicitly declaring your brand to AI crawlers (llms.txt, proof.json, ai.json).

llms.txt

Proposed root-level Markdown file that gives LLMs a concise summary of a site and links to its key pages. Unlike robots.txt, it does not grant or deny crawler access.

proof.json

SHA-256 hash file anchored to the Bitcoin blockchain via OpenTimestamps. It gives tamper-evident proof that content existed at a given time, which supports an IP ownership claim.

entities.json

Structured file listing all your products, services, people, and locations in machine-readable format.

intents.json

Maps user intents to your content URLs and funnel stages — tells AI what questions your site answers.

governance.json

Declares your AI usage policy, data consent, and EU AI Act compliance posture.

ai.json

Master AI signal file combining entity declaration, intent mapping, and governance policy.

allow-lane-matrix.json

Granular permission matrix specifying which AI systems access which content sections under what conditions.

robots.txt AI Rules

Explicit Allow/Disallow directives for AI crawlers: ClaudeBot, GPTBot, Google-Extended, PerplexityBot.

ClaudeBot

Anthropic's crawler, which collects public web content for Claude model training. Allowed by default unless robots.txt disallows it, for example through a wildcard group.

GPTBot

OpenAI's crawler for training data. ChatGPT search and browsing use separate agents (OAI-SearchBot, ChatGPT-User). Allowed by default unless robots.txt disallows it.

Google-Extended

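Google's robots.txt product token controlling use of crawled content for Gemini training and grounding. It is an opt-out control: use is permitted unless disallowed. It does not affect Google Search or AI Overviews.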

FAQ Schema

Schema.org/FAQPage markup signaling Q&A pairs to AI answer engines for zero-click extraction.

JSON-LD

JavaScript Object Notation for Linked Data — the schema format AI systems use for entity understanding.

E-E-A-T

Experience, Expertise, Authoritativeness, Trustworthiness: the credibility criteria from Google's Search Quality Rater Guidelines, applied here to AI trust as well.

RAG-Ready

Content structured for Retrieval-Augmented Generation — self-contained paragraphs with clear semantic boundaries.

SHA-256

Cryptographic hash function creating an immutable fingerprint used to timestamp and verify original IP.

Sacred Architecture

5thElement.ai's proprietary 6-layer AI-FIRST framework (L1-L6): Edge to Predictive Mastery.

sameAs

Schema.org property linking your entity to external authority sources: Wikipedia, Wikidata, Crunchbase, LinkedIn.

Knowledge Panel

Google's entity-level information box triggered by strong structured data — gateway to AI Overview citations.

PAA

People Also Ask — Google's question clusters indicating AI-recognized authority on topic questions.

Core Web Vitals

Google's UX signals: LCP (load speed), CLS (stability), INP (interactivity). Treated in this framework as proxies for AI trust.

IPFS

InterPlanetary File System — decentralized storage for permanent, AI-verifiable content proofs.

OTS

OpenTimestamps — open-source Bitcoin anchoring protocol creating verifiable content existence timestamps.

EU AI Act

Regulation (EU) 2024/1689, in force since 1 August 2024, with most obligations applying from 2 August 2026. Requires AI transparency, risk classification, and governance documentation.

Economic Twin

5thElement.ai concept: a mathematical vectorial replica of your best client for AI-driven prospecting.

ADI

AI Discovery Infrastructure — 5thElement.ai's category for the full stack making a site citable by AI.

Topical Authority

The degree to which AI systems recognize your domain as the definitive source on a specific topic.

Canonical URL

The definitive URL for a piece of content, preventing duplicate indexing across AI systems.

Wikidata

Collaborative open knowledge graph linked to Wikipedia — a primary entity anchor for LLM knowledge graphs.

Entity Graph

Machine-readable map of your brand entities, their relationships, and external identity anchors for AI.

AUDIT-AI™ · 167-Signal AI Visibility Engine · V2.0

AIVENTURE S.R.L. · CUI 51415878 · eu-ai-audit.eu · contact@5thelement.ai · +40 737 123 540
