For NSFW Pony Diffusion V6 XL, open prompts with the score_9, score_8_up, score_7_up cluster, set CLIP skip 2, use Euler a or DPM++ 2M Karras at 25-40 steps, generate at 1024px, and control content with rating_explicit. Pony responds to booru-style tags far better than to full sentences.
Pony Diffusion V6 XL is the most capable NSFW checkpoint most people will ever use, but it is also the most misunderstood. People load it expecting a normal SDXL model, type a descriptive sentence, get a muddy result, and conclude the model is bad. The model is excellent. It just speaks a specific language, and once you learn that language it becomes the most controllable NSFW model available.
This guide is a complete walkthrough of how to actually drive Pony Diffusion: the score tag system, CLIP skip, samplers and steps, rating and source tags, resolution, and full prompt templates you can copy. For where Pony sits against other models, see our best NSFW checkpoints guide.
The Score Tag System Explained
The defining feature of Pony Diffusion is its score tag system. During training, every image in the dataset was assigned a quality score, and those scores were baked in as tags. That means you can directly request a quality tier in your prompt. The standard opening for any Pony prompt is this cluster: score_9, score_8_up, score_7_up, score_6_up. Reading it as plain English, it says give me score 9, and everything score 8 and above, and score 7 and above, and score 6 and above. The overlapping ranges stack the model’s bias firmly toward its best material.
You can shorten it to just score_9, score_8_up, score_7_up with little loss. What you should not do is omit it entirely. Without score tags, Pony averages across its entire quality range, including weak training images, and the output looks flat. The score cluster is non-negotiable: it goes at the very front of every prompt, before your subject tags.

A common myth is that adding more score variants always helps. It does not. The four-tag cluster is enough. Stacking score_5_up and below actually pulls in lower-quality bias and can slightly degrade results.
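The rule above (cluster first, no extra low tiers) can be sketched as a small helper. This is an illustrative sketch only: the function and constant names are made up for this example and are not part of Pony or of any UI.

```python
# Sketch only: names here are illustrative, not part of any Pony or UI API.
SCORE_CLUSTER = ["score_9", "score_8_up", "score_7_up", "score_6_up"]


def with_score_cluster(prompt: str) -> str:
    """Return the prompt with the four-tag score cluster placed first."""
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    # Drop any score tags already present so the cluster is not duplicated
    # and lower tiers such as score_5_up are not carried along.
    subject_tags = [t for t in tags if not t.startswith("score_")]
    return ", ".join(SCORE_CLUSTER + subject_tags)


print(with_score_cluster("1girl, silver hair, score_5_up, red dress"))
```

Running every prompt through a helper like this keeps the cluster at the front even when tags are assembled from several sources.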
CLIP Skip: Set It to 2
CLIP skip controls how deep into the text encoder the model reads before generating. Pony Diffusion V6 XL was trained with CLIP skip 2, and it expects that setting at generation time. This is a frequent source of confusion because many other SDXL checkpoints prefer CLIP skip 1. If your Pony output looks washed out, has muddy anatomy, or ignores tags, CLIP skip is the first thing to check.
In AUTOMATIC1111 and Forge, CLIP skip lives in Settings under Stable Diffusion, or you can add it to the quick settings bar for fast access. In ComfyUI, use a CLIP Set Last Layer node set to -2. Set it once and leave it for all Pony work.
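If you drive AUTOMATIC1111 through its HTTP API rather than the UI, CLIP skip can be set per request. The sketch below only builds the request body and assumes the web UI was started with the --api flag. CLIP_stop_at_last_layers is the internal name of the Clip skip setting in AUTOMATIC1111; whether your Forge build accepts the same override is an assumption worth checking.

```python
import json


def txt2img_payload(prompt: str, clip_skip: int = 2) -> dict:
    """Build a txt2img request body that forces CLIP skip for this request."""
    return {
        "prompt": prompt,
        # Per-request override of the "Clip skip" setting.
        "override_settings": {"CLIP_stop_at_last_layers": clip_skip},
    }


def comfy_last_layer(clip_skip: int) -> int:
    """ComfyUI's CLIP Set Last Layer counts from the end, so skip 2 is -2."""
    return -clip_skip


payload = txt2img_payload("score_9, score_8_up, score_7_up, 1girl")
print(json.dumps(payload, indent=2))
print(comfy_last_layer(2))
```

The payload would be sent as JSON in a POST to the /sdapi/v1/txt2img endpoint of a running instance.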
Samplers and Steps for Pony
Pony is not fussy about samplers, but two stand out. Euler a is the fast, forgiving default. It converges quickly, handles complex prompts gracefully, and is the right pick for everyday batch work and prompt exploration. DPM++ 2M Karras is the detail option. It produces slightly crisper textures and edges, which matters for hero shots, at the cost of a few extra seconds per image.
For steps, the useful range is 25 to 40. Thirty steps is a reliable default that balances quality and speed. Going above 40 rarely improves anything visible and just burns time. Going below 25 starts to leave noise and undercooked detail. CFG should sit between 5 and 7 for Pony, with 6 being a safe middle. Higher CFG pushes contrast and saturation too hard and can melt anatomy. Our full CFG and sampler settings guide goes deeper on this.
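The ranges above are easy to encode as a sanity check before a long batch run. This is a minimal sketch: the PonySettings class and check_settings function are hypothetical names, and the bounds are simply the ones stated in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PonySettings:
    sampler: str = "Euler a"
    steps: int = 30
    cfg: float = 6.0


def check_settings(s: PonySettings) -> List[str]:
    """Return notes for values outside the ranges this guide recommends."""
    notes = []
    if not 25 <= s.steps <= 40:
        notes.append(f"steps={s.steps} is outside the 25-40 range")
    if not 5 <= s.cfg <= 7:
        notes.append(f"cfg={s.cfg} is outside the 5-7 range")
    return notes


print(check_settings(PonySettings()))
print(check_settings(PonySettings(sampler="DPM++ 2M Karras", steps=60, cfg=9)))
```

The sampler is deliberately not checked, since Pony is not fussy about it.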

Rating Tags and Source Tags
Pony controls content level with dedicated rating tags. The three you need are rating_safe, rating_questionable, and rating_explicit. For NSFW output, place rating_explicit near the front of your prompt, right after the score cluster. Skipping the rating tag entirely leaves the model guessing, and results become inconsistent. Always state a rating.
Source tags steer the visual style. source_anime pushes a clean anime render, source_cartoon leans Western cartoon, and source_furry shifts toward that community’s style. source_pony does the same for pony art and can also help anchor the model’s house style. If your output drifts toward an unwanted aesthetic, an explicit source tag pulls it back. All of this content is intended strictly for adult, fictional characters, and you should keep your prompts unambiguous on that point.
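The ordering described in this section (score tags, then rating, then an optional source tag, then everything else) can be sketched as below. The place_rating helper is hypothetical; it only rearranges tags and does nothing model-specific.

```python
from typing import Optional

RATINGS = {"rating_safe", "rating_questionable", "rating_explicit"}


def place_rating(prompt: str, rating: str, source: Optional[str] = None) -> str:
    """Put one rating tag (and optional source tag) right after the score tags."""
    if rating not in RATINGS:
        raise ValueError(f"unknown rating tag: {rating}")
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    score = [t for t in tags if t.startswith("score_")]
    # Remove any rating/source tags already present so only one of each remains.
    rest = [t for t in tags
            if not t.startswith(("score_", "rating_", "source_"))]
    head = score + [rating] + ([source] if source else [])
    return ", ".join(head + rest)


print(place_rating("score_9, score_8_up, 1girl, solo, rating_safe",
                   "rating_explicit", "source_anime"))
```

Raising on an unknown rating is a design choice for this sketch; it catches typos such as rating_explict before they reach the model as meaningless tags.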
Resolution and Aspect Ratios
Pony Diffusion V6 XL is SDXL-based, so it expects SDXL-native resolutions. Generate at 1024×1024 for square, 832×1216 for portrait, and 1216×832 for landscape. Generating below 1024px produces soft, detail-poor images because the model was never trained at those sizes. For larger final output, generate a 1024px base and then run Hires Fix to upscale, rather than generating directly at a huge resolution, which causes duplicated limbs and bodies.
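If you start from a target frame such as 1920×1080, you can pick the closest of the three native sizes listed above instead of generating at the target directly. This is a sketch under the assumption that only those three sizes are in play; pick_native_size is an illustrative name.

```python
import math
from typing import Tuple

# The three SDXL-native sizes named in this guide: square, portrait, landscape.
NATIVE_SIZES = [(1024, 1024), (832, 1216), (1216, 832)]


def pick_native_size(target_w: int, target_h: int) -> Tuple[int, int]:
    """Return the native size whose aspect ratio is closest to the target's."""
    target = math.log(target_w / target_h)
    # Compare in log space so "twice as wide" and "twice as tall" are
    # treated as equally far from square.
    return min(NATIVE_SIZES,
               key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


print(pick_native_size(1920, 1080))
print(pick_native_size(1080, 1350))
```

The chosen base is then upscaled toward the real target in a second pass.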
A Complete Pony Prompt Template
Here is a copy-ready structure. Positive prompt: score_9, score_8_up, score_7_up, score_6_up, rating_explicit, source_anime, 1girl, solo, [character description], [outfit or state], [pose], [setting], detailed background, soft lighting, masterpiece. Negative prompt: score_4, score_5, score_6, low quality, worst quality, bad anatomy, bad hands, extra fingers, deformed, blurry, watermark, text.
Notice the negative prompt uses low score tags. Because Pony understands the score system, putting score_4, score_5, score_6 in the negative prompt actively pushes the model away from low-quality output. This trick depends on Pony’s score tags, so it does not carry over to checkpoints trained without them. For more reusable negatives, see the negative prompts master list.

LoRAs and Character Consistency on Pony
Pony has a vast LoRA ecosystem, but compatibility matters. Only use LoRAs whose Civitai page lists Pony Diffusion V6 XL as the base model. A LoRA trained on standard SDXL tends to inject artifacts or do nothing useful, and an SD 1.5 LoRA is architecturally incompatible with any SDXL-based checkpoint. Apply Pony LoRAs at 0.6 to 0.9 strength, and reduce strength if a LoRA is overpowering the score tags. For recurring characters, combine a Pony character LoRA with seed anchoring, a workflow covered in our character consistency techniques guide. To build your own Pony-compatible LoRA, follow the NSFW LoRA training guide.
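In AUTOMATIC1111-style UIs a LoRA is applied with a prompt token of the form <lora:filename:strength>. The sketch below formats that token and clamps the strength to the 0.6 to 0.9 range suggested above. The clamping is an illustrative choice for this example, and my_pony_character is a placeholder file name, not a real LoRA. ComfyUI loads LoRAs through a node instead, so this applies only to UIs that use the token syntax.

```python
def lora_tag(name: str, strength: float = 0.8) -> str:
    """Format an A1111-style LoRA token, keeping strength within 0.6-0.9."""
    clamped = min(0.9, max(0.6, strength))
    # ":g" drops trailing zeros, so 0.80 is written as 0.8.
    return f"<lora:{name}:{clamped:g}>"


prompt = ", ".join([
    "score_9, score_8_up, score_7_up",
    "1girl, solo",
    lora_tag("my_pony_character", 1.2),  # placeholder name; 1.2 is clamped
])
print(prompt)
```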
Put these habits together and Pony stops being temperamental. The checklist: score cluster up front, CLIP skip 2, Euler a or DPM++ 2M Karras at 30 steps, CFG 6, rating_explicit stated, 1024px base, and tag-based prompting throughout. Master that checklist and Pony Diffusion becomes the most obedient NSFW model you can run.
Common Pony Diffusion Mistakes
The single most common Pony mistake is omitting the score tags. Pony V6 was trained with a quality-scoring system, and prompts that skip score_9, score_8_up, score_7_up produce noticeably flatter, lower-detail output. The tags are not optional flavor; they are load-bearing. The second mistake is leaving CLIP skip at 1. Pony expects CLIP skip 2, and the wrong value subtly degrades coherence in a way that is easy to miss until you compare side by side.
A third frequent error is writing Pony prompts as natural-language prose. Pony responds best to comma-separated tags, not flowing sentences. Write 1girl, silver hair, red dress, sitting, window light rather than a paragraph. Save the prose style for Illustrious. Finally, many users set CFG too high. Pony holds together best at CFG 5 to 7; pushing past 9 produces the over-saturated, deep-fried look.
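The four mistakes in this section can be turned into a rough pre-flight check. This is a heuristic sketch, not a parser: lint_pony_prompt is a made-up name, and the "more than four words in one tag" test for prose is an arbitrary threshold chosen for illustration.

```python
from typing import List


def lint_pony_prompt(prompt: str, cfg: float, clip_skip: int) -> List[str]:
    """Flag the common Pony mistakes described in this section."""
    issues = []
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    if not any(t.startswith("score_") for t in tags):
        issues.append("missing score tags")
    if clip_skip != 2:
        issues.append("CLIP skip should be 2")
    # Heuristic: a comma-separated chunk with many words is likely a sentence.
    if any(len(t.split()) > 4 for t in tags):
        issues.append("looks like prose; split into tags")
    if cfg > 7:
        issues.append("CFG is above 7")
    return issues


print(lint_pony_prompt(
    "a girl with long hair standing in a detailed room", cfg=10, clip_skip=1))
print(lint_pony_prompt(
    "score_9, score_8_up, score_7_up, 1girl, silver hair, red dress, "
    "sitting, window light", cfg=6, clip_skip=2))
```

The first call reports all four issues and the second reports none.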
Pony LoRAs and the Wider Ecosystem
Pony V6 has one of the largest NSFW LoRA ecosystems of any base model, with thousands of character, style, and concept LoRAs on Civitai trained specifically against it. A LoRA trained on a different base, such as a stock SDXL or Illustrious LoRA, will usually load on Pony but rarely behaves as intended, so always check the LoRA base model before downloading. Pony-native LoRAs are tagged clearly on their model pages.
Because Pony shifts the text encoder behavior, some general SDXL tools and embeddings behave differently on it. Negative embeddings built for vanilla SDXL often do little on Pony. Build your negatives from plain tags instead, and see our negative prompts master list for templates that hold up on Pony. For broader model selection, our NSFW checkpoints guide places Pony against the alternatives.
Resolution and Aspect Ratios for Pony
Pony V6 is an SDXL model, so it expects a total pixel budget near 1024×1024. For portraits, 832×1216 keeps the SDXL training distribution intact while giving a vertical frame; for wide scenes, 1216×832 works the same way. Avoid generating far above 1024 in a single pass, since SDXL models including Pony produce duplicated limbs and torsos at large native resolutions. Generate at a sane base size, then scale up with a hires pass for clean high-resolution output.
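To see what a hires pass produces from a native base, the arithmetic can be sketched as follows. The 1.5 scale and the snapping to multiples of 8 are assumptions for this example, not Pony requirements. In AUTOMATIC1111's txt2img API the corresponding request fields are enable_hr and hr_scale; other UIs expose the same idea differently.

```python
from typing import Tuple


def hires_target(base_w: int, base_h: int, scale: float = 1.5,
                 multiple: int = 8) -> Tuple[int, int]:
    """Return the upscaled size, snapped to a multiple of `multiple`."""
    def snap(value: int) -> int:
        return int(round(value * scale / multiple)) * multiple
    return snap(base_w), snap(base_h)


base = (832, 1216)  # portrait base from this section
final = hires_target(*base, scale=1.5)
print(base, "->", final)

# Assumed request fragment for an A1111-style txt2img call.
payload = {"width": base[0], "height": base[1],
           "enable_hr": True, "hr_scale": 1.5}
print(payload)
```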
Frequently Asked Questions
What are score tags in Pony Diffusion?
Score tags are Pony Diffusion’s built-in quality control system. Tags like score_9, score_8_up, and score_7_up tell the model what quality tier you want. The standard opening cluster is score_9, score_8_up, score_7_up, score_6_up, which biases the model toward its highest-rated training examples and is the single most important prompt habit for Pony.
What CLIP skip should I use with Pony Diffusion?
Pony Diffusion V6 XL works best at CLIP skip 2. This is different from many SDXL models that prefer CLIP skip 1. Setting CLIP skip to 2 matches how the model was trained on tag data and gives noticeably cleaner anatomy and prompt obedience. Most UIs let you set this in the settings panel.
Which sampler is best for Pony Diffusion?
Euler a and DPM++ 2M Karras are the two reliable samplers for Pony. Euler a is fast and forgiving for everyday work, while DPM++ 2M Karras gives slightly sharper detail at the cost of a few extra seconds. Both work well in the 25 to 40 step range, with 30 steps being a solid default.
How do rating tags work in Pony Diffusion?
Pony uses dedicated rating tags to control content level: rating_safe, rating_questionable, and rating_explicit. For NSFW output you place rating_explicit early in the prompt. Omitting a rating tag leaves the result unpredictable, so always include one. This system gives precise, reliable control over how explicit the output is.
Does Pony Diffusion understand natural language prompts?
Not really. Pony Diffusion is trained on booru-style tags, so it responds far better to comma-separated tags than to full sentences. Write 1girl, long hair, standing, detailed background rather than a girl with long hair standing in a detailed room. Tag-based prompting is essential for good Pony results.
What resolution should I generate at with Pony Diffusion?
Pony Diffusion V6 XL is an SDXL model, so generate at 1024×1024 or other SDXL-native resolutions like 832×1216 for portraits and 1216×832 for landscapes. Generating below 1024px produces soft, low-detail output. For larger final images, use Hires Fix to upscale from a 1024px base.
Why do my Pony Diffusion images look washed out?
Washed-out Pony output usually means CLIP skip is wrong, CFG is too high, or the score tags are missing. Set CLIP skip to 2, keep CFG between 5 and 7, and always open with the score_9 cluster. If anatomy is still off, add source_anime or source_pony and a strong negative prompt to anchor the style.
Can I use LoRAs with Pony Diffusion?
Yes, but use LoRAs trained specifically for Pony Diffusion V6 XL. LoRAs trained on other SDXL bases or on SD 1.5 will not work correctly and often produce artifacts. Civitai lets you filter LoRAs by base model, so select Pony as the filter. Apply Pony LoRAs at 0.6 to 0.9 strength for best results.



