This week, I share my findings from analyzing 1.2 million ChatGPT responses to answer one question: How do you improve your chances of getting cited?
For 20 years, SEOs have written "ultimate guides" designed to keep humans on the page. We write long intros. We stretch insights across the draft and into the conclusion. We build suspense toward the final call to action.
The data shows that this style of writing is not ideal for AI visibility.
After analyzing 1.2 million verified ChatGPT citations, I found a pattern so consistent it has a P-Value of 0.0: the “ski ramp.” ChatGPT pays disproportionate attention to the top 30% of your content. Furthermore, I found five clear characteristics of content that gets cited. To win in the AI era, you need to start writing like a journalist.
1. Which Sections Of A Text Are Most Likely To Be Cited By ChatGPT?
Little is known about which parts of a text LLMs cite. We analyzed 18,012 citations and found a "ski ramp" distribution.
- 44.2% of all citations come from the first 30% of text (the intro). The AI reads like a journalist. It grabs the “Who, What, Where” from the top. If your key insight is in the intro, the chances it gets cited are high.
- 31.1% of citations come from the 30-70% range of a text (the middle). If you bury your key product features in paragraph 12 of a 20-paragraph post, the AI is 2.5x less likely to cite it.
- 24.7% of citations come from the last third of an article (the conclusion). It proves the AI does wake up at the end (much like humans). It skips the actual footer (see the 90-100% drop-off), but it loves the “Summary” or “Conclusion” section right before the footer.
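If you want to run the same kind of check against your own citation data, the bucketing itself is simple. A minimal sketch in Python (the positions below are invented sample values, not the study's data):

```python
from collections import Counter

# Sketch: bucket citation positions (0.0 = top of the page, 1.0 = bottom)
# into the three ski-ramp zones above. Positions are invented sample values.
citation_positions = [0.05, 0.12, 0.22, 0.35, 0.48, 0.61, 0.83, 0.95]

def bucket(position: float) -> str:
    """Map a relative position in the source document to a ski-ramp zone."""
    if position <= 0.30:
        return "intro (0-30%)"
    if position <= 0.70:
        return "middle (30-70%)"
    return "conclusion (70-100%)"

shares = Counter(bucket(p) for p in citation_positions)
total = len(citation_positions)
for zone, count in shares.items():
    print(f"{zone}: {count / total:.1%} of citations")
```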
Possible explanations for the ski ramp pattern are training and efficiency:
- LLMs are trained on journalism and academic papers, which follow the “BLUF” (Bottom Line Up Front) structure. The model learns that the most “weighted” information is always at the top.
- While modern models can read up to 1 million tokens for a single interaction (~700,000-800,000 words), they aim to establish the frame as fast as possible, then interpret everything else through that frame.
A sample of 18,000 out of 1.2 million citations gives us all the insight we need. The P-Value of this analysis is 0.0, meaning the result is statistically indisputable. I split the data into batches (randomized validation splits) to demonstrate the stability of the results.
- Batch 1 was slightly flatter, but batches 2, 3, and 4 are almost identical.
- Conclusion: Because batches 2, 3, and 4 locked onto the exact same pattern, the data is stable across all 1.2 million citations.
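For readers who want to replicate this kind of stability check, here is a hedged sketch of randomized validation splits. The positions are simulated with numpy (a top-heavy toy distribution), not drawn from the real dataset:

```python
import numpy as np

# Sketch: randomized validation splits. The positions are simulated
# (a top-heavy toy distribution) standing in for the 18,012 real depths.
rng = np.random.default_rng(42)
positions = rng.beta(a=1.2, b=2.5, size=18_000)

batches = np.array_split(rng.permutation(positions), 4)
bins = [0.0, 0.3, 0.7, 1.0]  # intro / middle / conclusion boundaries

for i, batch in enumerate(batches, start=1):
    counts, _ = np.histogram(batch, bins=bins)
    shares = counts / counts.sum()
    print(f"Batch {i}: intro {shares[0]:.1%}, middle {shares[1]:.1%}, "
          f"conclusion {shares[2]:.1%}")
```

If all four batches show roughly the same intro/middle/conclusion split, the pattern is stable rather than an artifact of one slice of the data.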
While these batches confirm the macro-level stability of where ChatGPT looks across a document, they raise a new question about its granular behavior: Does this top-heavy bias persist even within a single block of text, or does the AI’s focus change when it reads more deeply? Having established that the data is statistically indisputable at scale, I wanted to “zoom in” to the paragraph level.
A deep analysis of 1,000 pieces of highly cited content shows that 53% of citations come from the middle of a paragraph. Only 24.5% come from the first sentence and 22.5% from the last sentence of a paragraph. ChatGPT is not "lazy," reading only the first sentence of every paragraph. It reads deeply.
Takeaway: You don’t need to force the answer into the first sentence of every paragraph. ChatGPT seeks the sentence with the highest “information gain” (the most complete use of relevant entities and additive, expansive information), regardless of whether that sentence is first, second, or fifth in the paragraph. Combined with the ski ramp pattern, we can conclude that the highest chances for citations come from the paragraphs in the first 20% of the page.
2. What Makes ChatGPT More Likely To Cite Chunks?
We know where in content ChatGPT likes to cite from, but what are the characteristics that influence citation likelihood?
The analysis shows five winning characteristics:
- Definitive language.
- Conversational question-answer structure.
- Entity richness.
- Balanced sentiment.
- Simple writing.
1. Definitive Vs. Vague Language
Citation winners are almost 2x more likely (36.2% vs. 20.2%) to contain definitive language ("is defined as," "refers to"). The cited language doesn't have to be a verbatim definition, but the relationships between concepts have to be clear.
Possible explanations for the impact of direct, declarative writing:
- In a vector database, the word “is” acts as a strong bridge connecting a subject to its definition. When a user asks “What is X?” the model searches for the strongest vector path, which is almost always a direct “X is Y” sentence structure.
- The model tries to answer the user immediately. It prefers a text that allows it to resolve the query in a single sentence (Zero-Shot) rather than synthesizing an answer from five paragraphs.
Takeaway: Start your articles with a direct statement.
- Bad: “In this fast-paced world, automation is becoming key…”
- Good: “Demo automation is the process of using software to…”
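The vector-path explanation above is easy to test yourself with off-the-shelf sentence embeddings. A minimal sketch using the same all-MiniLM-L6-v2 model named in the methodology (the query wording and the completion of the truncated "Good" example are my assumptions; ChatGPT's actual retrieval setup is unknown):

```python
from sentence_transformers import SentenceTransformer, util

# Sketch: compare how closely each opening line matches the user's query.
# The completion of the article's truncated "Good" example is mine.
model = SentenceTransformer("all-MiniLM-L6-v2")

query = "What is demo automation?"
vague = "In this fast-paced world, automation is becoming key to success."
direct = ("Demo automation is the process of using software to create "
          "and deliver product demos.")

q_vec, vague_vec, direct_vec = model.encode([query, vague, direct])
print("vague intro  :", util.cos_sim(q_vec, vague_vec).item())
print("direct intro :", util.cos_sim(q_vec, direct_vec).item())
```

The direct "X is Y" opener should score noticeably higher against the query, which is exactly the "strongest vector path" behavior described above.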
2. Conversational Writing
Text that gets cited is 2x more likely (18% vs. 8.9%) to contain a question mark. When we talk about conversational writing, we mean the interplay between questions and answers.
Start with the user’s query as a question, then answer it immediately. For example:
- Winner Style: “What is Programmatic SEO? It is…”
- Loser Style: “In this article, we will discuss the various nuances of…”
78.4% of citations with questions come from headings. The AI is treating your H2 tag as the user prompt and the paragraph immediately following it as the generated response.
Example winner structure (the 78%):
- Heading: "When did SEO start?" (the literal query)
- First sentence: "SEO started in…" (a direct answer)
That specific example wins because of what I call "entity echoing": The header asks about SEO, and the very first word of the answer is "SEO."
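Here is a minimal sketch of how you could audit your own pages for this pattern: find question headings, grab the paragraph that follows, and check for an entity echo. The HTML and the crude echo heuristic are mine, not part of the study:

```python
from bs4 import BeautifulSoup

# Sketch: extract "question heading -> first paragraph" pairs, the structure
# the data favors. The HTML and the echo heuristic below are invented.
html = """
<h2>When did SEO start?</h2>
<p>SEO started in the mid-1990s, when early search engines began indexing the web.</p>
<h2>The History Of SEO</h2>
<p>In this section, we will discuss the various nuances of search.</p>
"""

soup = BeautifulSoup(html, "html.parser")
for heading in soup.find_all("h2"):
    question = heading.get_text(strip=True)
    paragraph = heading.find_next_sibling("p")
    if not question.endswith("?") or paragraph is None:
        continue  # only keep headings that pose a literal query
    answer = paragraph.get_text(strip=True)
    first_word = answer.split()[0].strip(".,")
    echoes = first_word.lower() in question.lower()
    print(f"Q: {question}")
    print(f"A: {answer}")
    print(f"Entity echo: {echoes}\n")
```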
3. Entity Richness
Normal English text has an "entity density" (the share of words that are proper nouns like brands, tools, and people) of ~5-8%. Heavily cited text has an entity density of 20.6%!
- The 5-8% figure is a linguistic benchmark derived from standard corpora like the Brown Corpus (1 million words of representative English text) and the Penn Treebank (Wall Street Journal text).
Example:
- Loser sentence: “There are many good tools for this task.” (0% Density)
- Winner sentence: “Top tools include Salesforce, HubSpot, and Pipedrive.” (30% Density)
LLMs are probabilistic. Generic advice ("choose a good tool") is risky and vague, but a specific entity ("choose Salesforce") is grounded and verifiable. The model prioritizes sentences that contain "anchors" (entities) because they lower the perplexity (confusion) of the answer.
A sentence with three entities carries more “bits” of information than a sentence with 0 entities. So, don’t be afraid of namedropping (yes, even your competitors).
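A quick way to approximate entity density on your own drafts is to count proper nouns. A minimal sketch with spaCy (the study doesn't name its tagger, so treat the exact percentages as illustrative):

```python
import spacy

# Sketch: approximate "entity density" as the share of alphabetic tokens
# tagged as proper nouns. spaCy's small English model stands in here;
# the study's exact tagger isn't named, so numbers will vary slightly.
nlp = spacy.load("en_core_web_sm")

def entity_density(text: str) -> float:
    doc = nlp(text)
    tokens = [t for t in doc if t.is_alpha]
    proper_nouns = [t for t in tokens if t.pos_ == "PROPN"]
    return len(proper_nouns) / len(tokens) if tokens else 0.0

print(entity_density("There are many good tools for this task."))               # 0.0
print(entity_density("Top tools include Salesforce, HubSpot, and Pipedrive."))  # roughly 0.4 here
```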
4. Balanced Sentiment
In my analysis, the cited text has a balanced subjectivity score of 0.47. The subjectivity score is a standard metric in natural language processing (NLP) that measures the amount of personal opinion, emotion, or judgment in a piece of text.
The score runs on a scale from 0.0 to 1.0:
- 0.0 (Pure Objectivity): The text contains only verifiable facts. No adjectives, no feelings. Example: “The iPhone 15 was released in September 2023.”
- 1.0 (Pure Subjectivity): The text contains only personal opinions, emotions, or intense descriptors. Example: “The iPhone 15 is an absolutely stunning masterpiece that I love.”
AI doesn’t want dry Wikipedia text (0.1), nor does it want unhinged opinion (0.9). It wants the “analyst voice.” It prefers sentences that explain how a fact applies, rather than just stating the stat alone.
The "winning" tone looks like this (Score ~0.5): "While the iPhone 15 features a standard A16 chip (fact), its performance in low-light photography makes it a superior choice for content creators (analysis/opinion)."
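If you want to check your own copy against this scale, TextBlob exposes a subjectivity score on the same 0.0-1.0 range. A small sketch (the study's exact NLP stack isn't named, so the scores are illustrative rather than a reproduction of its numbers):

```python
from textblob import TextBlob

# Sketch: TextBlob reports subjectivity from 0.0 (objective) to 1.0
# (subjective), the same scale described above. Scores are illustrative.
examples = {
    "pure fact":    "The iPhone 15 was released in September 2023.",
    "pure opinion": "The iPhone 15 is an absolutely stunning masterpiece that I love.",
    "analyst":      ("While the iPhone 15 features a standard A16 chip, its performance "
                     "in low-light photography makes it a superior choice for content creators."),
}

for label, text in examples.items():
    print(f"{label}: subjectivity = {TextBlob(text).sentiment.subjectivity:.2f}")
```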
5. Business-Grade Writing
Business-grade writing (think The Economist or Harvard Business Review) gets more citations. “Winners” have a Flesch-Kincaid score of 16 (college level) compared to the “losers” with 19.1 (Academic/PhD level).
Even for complex topics, complexity in the writing hurts. A grade 19 score means sentences are long, winding, and filled with multisyllabic jargon. The AI prefers simple subject-verb-object structures with short to moderately long sentences because they are easier to extract facts from.
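You can grade your own drafts the same way with textstat's Flesch-Kincaid function. A minimal sketch (the sample sentences are mine, not drawn from the dataset):

```python
import textstat

# Sketch: textstat's Flesch-Kincaid grade returns a US school-grade level.
# The sample sentences below are invented for illustration.
business_grade = ("Demo automation uses software to build and share product demos. "
                  "Sales teams use it to qualify buyers before the first call.")
academic_grade = ("Notwithstanding the multifaceted considerations inherent in the "
                  "operationalization of demonstration-centric automation paradigms, "
                  "organizations frequently encounter significant implementation impediments.")

print("business-grade sample:", textstat.flesch_kincaid_grade(business_grade))
print("academic-grade sample:", textstat.flesch_kincaid_grade(academic_grade))
```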
Conclusion
The “ski ramp” pattern quantifies a misalignment between narrative writing and information retrieval. The algorithm interprets the slow reveal as a lack of confidence. It prioritizes the immediate classification of entities and facts.
High-visibility content functions more like a structured briefing than a story.
This imposes a “clarity tax” on the writer. The winners in this dataset rely on business-grade vocabulary and high entity density, disproving the theory that AI rewards “dumbing down” content (with exceptions).
We're not writing only for robots … yet. But the gap between human preferences and machine constraints is closing. In business writing, humans scan for insights. By front-loading the conclusion, we satisfy both the algorithm's architecture and the human reader's limited time.
Methodology
To understand exactly where and why AI cites content, we analyzed the raw data.
All data in this research comes from Gauge.
- Gauge provided roughly 3 million AI answers from ChatGPT, alongside 30 million citations. Each citation URL's web content was scraped at the time of the answer to provide a direct correlation between the true web content and the answer itself. Both raw HTML and plaintext were scraped.
1. The Dataset
We started with a universe of 1.2 million search results and AI-generated answers. From this, we isolated 18,012 verified citations for positional analysis and 11,022 citations for “linguistic DNA” analysis.
- Significance: This sample size is large enough to produce a P-Value of 0.0 (p < 0.0001), meaning the patterns we found are statistically indisputable.
2. The “Harvester” Engine
To find exactly which sentence the AI was quoting, we used semantic embeddings (a Neural Network approach).
- The Model: We used all-MiniLM-L6-v2, a sentence-transformer model that understands meaning, not just keywords.
- The Process: We converted every AI answer and every sentence of the source text into 384-dimensional vectors. We then matched them using cosine similarity.
- The Filter: We applied a strict similarity threshold (0.55) to discard weak matches or hallucinations, ensuring we only analyzed high-confidence citations.
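Put together, the matching step looks roughly like the sketch below. Only the model name and the 0.55 threshold come from the methodology; the answer and source sentences are invented for illustration:

```python
from sentence_transformers import SentenceTransformer, util

# Sketch of the matching step: embed the AI answer and every source sentence,
# then keep matches above the 0.55 cosine-similarity threshold.
model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional sentence embeddings

answer = "SEO started in the mid-1990s when the first crawlers began indexing the web."
source_sentences = [
    "Search engine optimization began in the mid-1990s as early crawlers indexed the web.",
    "Our agency offers monthly reporting dashboards for every client.",
    "Contact us today for a free consultation.",
]

answer_vec = model.encode(answer, convert_to_tensor=True)
source_vecs = model.encode(source_sentences, convert_to_tensor=True)

similarities = util.cos_sim(answer_vec, source_vecs)[0].tolist()
THRESHOLD = 0.55  # discard weak matches and hallucinations
for sentence, score in zip(source_sentences, similarities):
    verdict = "citation match" if score >= THRESHOLD else "discarded"
    print(f"{score:.2f} {verdict}: {sentence}")
```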
3. The Metrics
Once we found the exact match, we measured two things:
- Positional Depth: We calculated exactly where the cited text appeared in the HTML (e.g., at the 10% mark vs. the 90% mark); a code sketch of this measurement follows this list.
- Linguistic DNA: We compared “winners” (cited intros) vs. “losers” (skipped intros) using Natural Language Processing (NLP) to measure:
- Definition Rate: Presence of definitive verbs (is, are, refers to).
- Entity Density: Frequency of proper nouns (brands, tools, people).
- Subjectivity: A sentiment score from 0.0 (Fact) to 1.0 (Opinion).
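Here is a hedged sketch of the positional-depth and definition-rate measurements (the helper names, the verb pattern, and the example text are mine, not the study's code); entity density and subjectivity are sketched earlier in the article:

```python
import re

# Sketch of two of the metrics, with made-up example text. The helper names
# and the definitive-verb pattern are assumptions, not the study's code.
DEFINITIVE = re.compile(r"\b(is defined as|refers to|is|are)\b", re.IGNORECASE)

def positional_depth(full_text: str, cited_text: str) -> float | None:
    """Where the citation starts in the source, from 0.0 (top) to 1.0 (bottom)."""
    index = full_text.find(cited_text)
    return index / len(full_text) if index != -1 else None

def has_definitive_language(sentence: str) -> bool:
    """Flag sentences that state relationships directly ('X is Y')."""
    return bool(DEFINITIVE.search(sentence))

page_text = ("Demo automation is the process of using software to build demos. "
             + "Filler paragraph about industry trends. " * 10)
cited = "Demo automation is the process of using software to build demos."

print("depth:", positional_depth(page_text, cited))       # 0.0 -> cited from the intro
print("definitive:", has_definitive_language(cited))      # True
```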
Featured Image: Paulo Bobita/Search Engine Journal