Embedding Visual Diagrams (UML / Mermaid / PlantUML) Across Spec Kit Workflows #694

Nicered · 2025-10-01T04:12:15Z

Context

Spec Kit’s workflow — /specify → /plan → /tasks → /implement — currently relies heavily on text-based documentation. While this is useful for details, complex flows, algorithms, and architecture decisions are often harder to follow in text alone. Contributors (especially newcomers) could benefit from a more visual representation of these processes.

Problem

Text-only documentation can be verbose and harder to parse.
Complex user flows, architecture decisions, or validation pipelines require repeated explanation.
Contributors may struggle to understand the overall context without a visual overview.

Proposal

Introduce visual diagrams (UML, Mermaid, PlantUML) into various Spec Kit documents:

Specs: illustrate system architecture, request/response flows, validation pipelines.
Plans: visualize design options, dependencies, and branching strategies.
Tasks: show decomposition of work, algorithm flows, or state transitions.
RFCs / Decision Records: capture trade-offs, before/after states, and impact analysis.

Recommended tooling:

Mermaid: lightweight, GitHub-renderable (flowcharts, sequence/state diagrams).
PlantUML: more detailed class, component, or architecture diagrams.
Diagrams can be embedded in Markdown or stored as .puml files for contributors to edit.

Example (Mermaid Sample)

```mermaid
sequenceDiagram
  user->>specify: submit feature prompt
  specify->>agent: generate spec
  agent->>specify: return spec document
  user->>plan: propose architecture
  plan->>agent: assess design options
  agent->>plan: return detailed plan
  plan->>tasks: decompose into tasks
  tasks->>agent: generate task list
```

Expected Benefits

Faster comprehension: contributors can scan diagrams instead of reading long text.
Shared visual language: reduces ambiguity, supports collaboration.
Consistency: diagrams-as-code ensure visuals are version-controlled.
Improved onboarding: new contributors understand workflows more quickly.

Discussion Points

Should we standardize on Mermaid or PlantUML, or allow both?
Where should diagrams be mandatory (specs, plans, RFCs) vs optional?
How do we ensure diagrams remain up-to-date (review checklists, CI validation)?
Should we provide starter templates (common UML patterns) for contributors?
Which areas should we prioritize first (e.g., spec flows, planning architecture)?
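
On the CI validation question, one lightweight option is a script that scans Markdown files for Mermaid blocks and flags the obviously broken ones. The sketch below is only an illustration: the function names and the list of diagram keywords are assumptions (the list is a partial subset, not everything Mermaid accepts), it looks only at the first meaningful line of each block, and it is not a real Mermaid parser.

```python
import re

FENCE = "`" * 3  # built this way so the example can itself live inside a fenced block

# Assumed, partial list of diagram keywords; extend it to whatever the project allows.
KNOWN_DIAGRAM_TYPES = (
    "sequenceDiagram", "flowchart", "graph", "stateDiagram",
    "classDiagram", "erDiagram", "C4Context",
)

MERMAID_BLOCK = re.compile(
    r"^" + re.escape(FENCE) + r"mermaid[ \t]*\n(.*?)^" + re.escape(FENCE) + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_mermaid_blocks(markdown: str) -> list[str]:
    """Return the body of every fenced block tagged as mermaid."""
    return [match.group(1) for match in MERMAID_BLOCK.finditer(markdown)]


def check_mermaid_blocks(markdown: str) -> list[str]:
    """Return one message per block that is empty or starts with an unknown keyword."""
    problems = []
    for index, block in enumerate(extract_mermaid_blocks(markdown), start=1):
        # Ignore blank lines and Mermaid comment lines (those starting with %%).
        lines = [
            line.strip() for line in block.splitlines()
            if line.strip() and not line.strip().startswith("%%")
        ]
        if not lines:
            problems.append(f"block {index}: empty diagram")
        elif not lines[0].startswith(KNOWN_DIAGRAM_TYPES):
            problems.append(f"block {index}: unrecognized diagram type {lines[0]!r}")
    return problems
```

A CI job could run `check_mermaid_blocks` over each changed spec or plan file and fail when the returned list is non-empty. Keeping a diagram consistent with the text around it would still need a review checklist, since a check like this only catches syntax-level slips.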

Zyzzx · 2025-10-01T20:26:54Z

This concept is something I have been pondering. I have been trying to come up with some test projects to see whether LLMs architect software processes better from textual descriptions or from things like state/model diagrams. I think it is worth hammering on. I found spec-kit not too long ago, and the way it works makes it easy to test a lot of different strategies quickly, since all of it is just text files of prompts and templates.

Since I am more of an architect than a developer these days, I tend to think about projects as an architect. I would like to use Mermaid to specify project layout very efficiently, or even an abstract layer above it that defines how an app or service is put together logically. Then I would provide use cases, and then behaviors, more like BDD. From that, it seems development can be broken down into distinct phases (even in separate files, for context), and they could be done in parallel once the top-level scaffolding and interfaces are created.

0 replies

yubrshen · 2025-10-03T19:55:08Z

From my experience playing with text-to-diagram tools, I suggest considering the Graphviz (.dot) format.
I found PlantUML too limited in expression, and Mermaid may not produce a readable layout beyond toy problems.
Graphviz's layout capability is so far the most capable and versatile.
Its syntax can be quite simple, with more detailed specification possible when needed.

Actually, I'm trying to use Graphviz syntax in prompts to express my design intent.
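
To make that concrete, here is a minimal sketch of building a small DOT graph and wrapping it in a prompt. The helper names, the prompt wording, and the edge labels are assumptions made for illustration; only the /specify, /plan, /tasks, /implement sequence comes from this thread.

```python
def quote(text: str) -> str:
    """Wrap text in double quotes for DOT, escaping embedded double quotes."""
    return '"' + text.replace('"', '\\"') + '"'


def to_dot(name: str, edges: list[tuple[str, str, str]]) -> str:
    """Build a left-to-right directed graph from (source, target, label) triples."""
    lines = [f"digraph {name} {{", "  rankdir=LR;"]
    for source, target, label in edges:
        lines.append(f"  {quote(source)} -> {quote(target)} [label={quote(label)}];")
    lines.append("}")
    return "\n".join(lines)


def build_prompt(dot: str) -> str:
    """Prefix the graph with a short instruction (hypothetical wording)."""
    return "Treat the following Graphviz graph as the intended design:\n\n" + dot


# Edge labels are made up for this example.
WORKFLOW_EDGES = [
    ("specify", "plan", "spec"),
    ("plan", "tasks", "plan"),
    ("tasks", "implement", "task list"),
]

if __name__ == "__main__":
    print(build_prompt(to_dot("workflow", WORKFLOW_EDGES)))
```

Sticking to plain nodes, edges, and labels like this keeps the graph simple for a model to read; layout attributes can be added later when rendering for people.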

2 replies

Zyzzx Oct 3, 2025

I have always been a fan of Graphviz, but it can be complex, probably complex enough to confuse an LLM. Mermaid is fairly simple and seems to be understood by LLMs fairly well. I am going to see how complex I can make a Mermaid diagram before it falters.

I have found that some of the big (100B+) models are able to understand my descriptions of hierarchies pretty well, but I want a more succinct way to specify relationships, both logical and physical. My experience maybe a year ago was that models could only handle Graphviz for simple relationship or state diagrams, and even now I think having them understand arbitrary Graphviz and emit it with the desired layout and design will still be a stretch.

Nicered Oct 8, 2025
Author

I completely agree that Graphviz remains the most powerful and flexible tool for producing precise and readable layouts, especially for complex system or data-flow diagrams. Its layout algorithms (like dot, neato, fdp, sfdp) offer a level of spatial control that neither Mermaid nor PlantUML can match.

However, when working with Large Language Models, the challenge is not about expressive power but about interpretability and consistency.
Graphviz syntax, while elegant for humans familiar with it, is structurally dense and full of optional parameters — which often confuses the token-based reasoning of LLMs. On the other hand, Mermaid and PlantUML trade some expressive flexibility for simplicity, making them easier for models to parse, predict, and reproduce consistently in conversation or generation loops.

That said, your idea of using Graphviz syntax within prompts to convey design intent is very compelling.
It could serve as a meta-language that bridges human design logic and machine reasoning, particularly if the model is fine-tuned or guided with schema-aware parsing.
In essence, Graphviz is ideal for the final expression of structure, while Mermaid/PlantUML are better for interactive reasoning and iteration during early design exploration.

Nicered · 2025-10-08T00:11:36Z

Author

From my experience applying both Mermaid and PlantUML to real-world system architecture design, I’ve found each tool serves different purposes effectively:

Mermaid works well for process-level visualization — such as flowcharts for system workflows, internal API request/response sequences, and algorithmic logic flows using sequence diagrams.

PlantUML is better suited for structural representation — including class relationships, data models, and component dependencies.

However, when attempting to represent a complete architecture view that integrates services, databases, API gateways, and external systems, even Mermaid’s C4 model support remains somewhat limited. The notation lacks fine-grained layout control and expressive depth to capture multi-layered system interactions or deployment contexts accurately.

In practice, this means that while Mermaid and PlantUML are excellent for documenting parts of the system, additional tooling or hybrid approaches (e.g., Graphviz or manual diagram layering) may be needed to depict the entire architectural landscape.
This insight could inform how Spec Kit decides to balance readability, expressiveness, and maintainability in visual documentation standards.

0 replies

jzhangrpia · 2025-10-09T11:52:43Z

It would be great if, in the plan phase, the agent could use C4, state machine, and sequence diagrams to communicate interactively with the user and finalize the design. So far, I think these are the most useful diagram types and also the easiest to understand.

0 replies

anchildress1 · 2025-10-12T01:17:25Z

I know you're not necessarily talking about permanent diagrams that the user would always want to commit, but if you want to provide that option, then Mermaid is the only one of these formats currently supported in GFM. Also, while I agree that Graphviz provides more out-of-the-box control than the others, it also comes with the biggest learning curve and thus the most potential failure points for AI integration (as you've already pointed out).

I've been using Mermaid in my personal flows for quite literally everything ever since GitHub started rendering it natively, so 2 years give or take (I'm a senior dev with 7+ years in backend enterprise distributed systems). It can definitely be challenging to put together a single Mermaid diagram with a high level of complexity. However, you don't necessarily need (I'd even say want) the entire system outlined in a single diagram at that level anyway.

If you work across abstracted layers within a complex system instead, then it works great! More out of inane curiosity than anything, I put together a couple docs. The first is a completely fictional system that I spent exactly three turns generating with Copilot + Mermaid VSC Extension (although the last was a snafu, so it really was fine after the second). I used C4 for this example, just cause it's the one that stuck out after reading this thread. 😆

The second doc is an analysis generated with ChatGPT that covers a) native diagram support, b) its estimate of AI's ability to both generate and understand each diagram format, and c) its recommendation for which format this repo specifically should consider for AI integration, based on the results of the first two.

As an aside, I have a chat mode in awesome-copilot that's geared towards the documentation side of this exact same setup. In my experience, it's the simplest learning curve (esp for new devs) and the easiest to iterate with any AI tool I've tried it with (regularly, that's Copilot, Codex, and Verdent). So if you're taking a poll, Mermaid gets my vote! 🗳️

I put both files in this repo as reference (mostly because my comp was being a pita with quarantine, but also this is simpler anyway). 😉

0 replies

arcturien · 2025-10-23T20:47:46Z

I've been able to generate PUML schemas with Perplexity from an old project's source code, and then ask Claude 4 in VS Code agent mode to write the new code in another language.

This worked great, as I can share the PUML schemas both with human developers and with Claude 4 in VS Code to generate code.
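
For readers who want a feel for what such a schema looks like, here is a small sketch that derives a PlantUML class diagram from Python source using the standard `ast` module. This is a swapped-in, deterministic technique for illustration, not the Perplexity-based workflow described above; the function name is hypothetical, and it only picks up classes and simple single-name base classes.

```python
import ast


def python_to_plantuml(source: str) -> str:
    """Emit a PlantUML class diagram listing classes and their inheritance links."""
    tree = ast.parse(source)
    lines = ["@startuml"]
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}")
            for base in node.bases:
                # Only plain names such as Shape are handled;
                # dotted bases such as pkg.Shape are skipped.
                if isinstance(base, ast.Name):
                    lines.append(f"{base.id} <|-- {node.name}")
    lines.append("@enduml")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example source.
    example = "class Shape:\n    pass\n\nclass Circle(Shape):\n    pass\n"
    print(python_to_plantuml(example))
```

The resulting text can be saved as a .puml file and shared with both people and coding agents.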

I definitely think that UML schemas written as text, or specs in other formats (Word/PDF docs), should be available in Spec Kit.

Note: I'm using https://www.plantuml.com/ to preview my PUML schemas, and it's working fine for me.

@+
rv.

0 replies
