xml

package
v0.0.0-...-3c4e83a Latest
Warning

This package is not in the latest version of its module.

Published: Feb 11, 2026 License: EPL-2.0 Imports: 5 Imported by: 0

Documentation

Overview

Package xml provides XML structure definitions and types for DOCX documents.

This package contains the core XML structures used by go-stencil to parse and manipulate DOCX files. DOCX files are essentially ZIP archives containing XML files that define the document structure, content, and formatting.

Structure Organization

The package is organized into logical files based on XML element types:

  • types.go: Core interfaces (BodyElement, ParagraphContent), the RawXMLElement struct, and common types
  • document.go: Top-level Document and Body structures
  • paragraph.go: Paragraph elements and their properties (alignment, spacing, etc.)
  • run.go: Run elements (text runs with formatting), Text, and Break elements
  • table.go: Table structures (Table, TableRow, TableCell) and their properties

Key Concepts

BodyElement: Top-level elements that can appear in a document body (paragraphs, tables).

ParagraphContent: Elements that can appear within a paragraph (runs, hyperlinks, breaks).

Run: A contiguous sequence of text with consistent formatting. Runs are the atomic units of text formatting in DOCX files.

Usage

This package is primarily used internally by the stencil package for DOCX parsing and rendering. Most users will interact with these types through the main stencil package API, which re-exports the common types.

Example of working with document structure:

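// Field names follow the struct definitions documented below:
// Document.Body is a *Body, Run carries its text in the Text field,
// and Text stores its character data in Content.
doc := &xml.Document{
    Body: &xml.Body{
        Elements: []xml.BodyElement{
            &xml.Paragraph{
                Content: []xml.ParagraphContent{
                    &xml.Run{
                        Text: &xml.Text{Content: "Hello, world!"},
                    },
                },
            },
        },
    },
}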

XML Namespaces

DOCX XML uses several namespaces:

  • w: (word processing) - Main WordprocessingML namespace
  • r: (relationships) - Relationships namespace
  • a: (drawing) - DrawingML namespace

These are defined in the XML tags throughout the structures.
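
As an illustration of how these prefixes resolve, the sketch below uses only the standard encoding/xml package (not this package's types) to print the namespace of each element in a tiny document. The three URIs are the ones conventionally bound to w:, r:, and a: in DOCX parts; the constant names, the sample string, and the elementNames helper are assumptions made for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Namespace URIs conventionally bound to the w:, r: and a: prefixes.
const (
	nsW = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
	nsR = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
	nsA = "http://schemas.openxmlformats.org/drawingml/2006/main"
)

// sample is a hypothetical, minimal document part used only for this demo.
const sample = `<w:document xmlns:w="` + nsW + `"><w:body><w:p/></w:body></w:document>`

// elementNames returns the resolved name of every start element in src.
// Decoder.Token translates prefixes such as w: into their namespace URI.
func elementNames(src string) ([]xml.Name, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	var names []xml.Name
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		if se, ok := tok.(xml.StartElement); ok {
			names = append(names, se.Name)
		}
	}
}

func main() {
	names, err := elementNames(sample)
	if err != nil {
		panic(err)
	}
	for _, n := range names {
		fmt.Printf("%s is WordprocessingML: %v\n", n.Local, n.Space == nsW)
	}
}
```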

Types

type Alignment

type Alignment struct {
	Val string `xml:"val,attr"`
}

Alignment represents text alignment

func (Alignment) MarshalXML

func (a Alignment) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Alignment

type Body

type Body struct {
	// Elements maintains the order of all body elements
	Elements []BodyElement `xml:"-"`
	// SectionProperties at the end of the body (critical for Word compatibility)
	SectionProperties *RawXMLElement `xml:"-"`
}

Body represents the document body

func (Body) MarshalXML

func (b Body) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling to preserve element order

func (*Body) UnmarshalXML

func (b *Body) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML implements custom XML unmarshaling to preserve element order
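
The general technique behind order-preserving unmarshaling can be sketched with the standard library alone. The code below is not this package's implementation: the element interface and the para, table, and body types are hypothetical stand-ins, and unknown children are simply skipped here, whereas the real Body also tracks SectionProperties. It only shows how a token loop keeps paragraphs and tables in document order, which struct tags on two separate slices could not do.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// element is a hypothetical stand-in for an ordered body element.
type element interface{ kind() string }

type para struct {
	Text string `xml:"r>t"`
}

type table struct {
	Rows []struct{} `xml:"tr"`
}

func (para) kind() string  { return "p" }
func (table) kind() string { return "tbl" }

// body keeps its children in one slice so their relative order survives.
type body struct {
	Elements []element
}

// UnmarshalXML walks the child tokens one by one and appends each decoded
// child to Elements in the order it appears in the input.
func (b *body) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	for {
		tok, err := d.Token()
		if err != nil {
			return err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			switch t.Name.Local {
			case "p":
				var p para
				if err := d.DecodeElement(&p, &t); err != nil {
					return err
				}
				b.Elements = append(b.Elements, p)
			case "tbl":
				var tb table
				if err := d.DecodeElement(&tb, &t); err != nil {
					return err
				}
				b.Elements = append(b.Elements, tb)
			default:
				// This sketch drops unknown children.
				if err := d.Skip(); err != nil {
					return err
				}
			}
		case xml.EndElement:
			// Children are fully consumed above, so this closes the body.
			return nil
		}
	}
}

// kinds parses src and reports the kind of each child in order.
func kinds(src string) ([]string, error) {
	var b body
	if err := xml.Unmarshal([]byte(src), &b); err != nil {
		return nil, err
	}
	out := make([]string, 0, len(b.Elements))
	for _, e := range b.Elements {
		out = append(out, e.kind())
	}
	return out, nil
}

func main() {
	got, err := kinds(`<body><p><r><t>a</t></r></p><tbl><tr/></tbl><p/></body>`)
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```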

type BodyElement

type BodyElement interface {
	// contains filtered or unexported methods
}

BodyElement represents any element that can appear in a document body

type BorderProperties

type BorderProperties struct {
	Val        string `xml:"val,attr,omitempty"`
	Sz         string `xml:"sz,attr,omitempty"`
	Space      string `xml:"space,attr,omitempty"`
	Color      string `xml:"color,attr,omitempty"`
	ThemeColor string `xml:"themeColor,attr,omitempty"`
	ThemeShade string `xml:"themeShade,attr,omitempty"`
}

BorderProperties represents border styling

func (BorderProperties) MarshalXML

func (b BorderProperties) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for BorderProperties

type Break

type Break struct {
	Type string `xml:"type,attr,omitempty"`
}

Break represents a line break

func (*Break) MarshalXML

func (b *Break) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements xml.Marshaler to ensure Break is self-closing

type CellMargin

type CellMargin struct {
	Width int    `xml:"w,attr"`
	Type  string `xml:"type,attr"`
}

CellMargin represents a single cell margin

func (CellMargin) MarshalXML

func (m CellMargin) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for CellMargin

type Color

type Color struct {
	Val string `xml:"val,attr"`
}

Color represents text color

type Document

type Document struct {
	XMLName xml.Name   `xml:"document"`
	Body    *Body      `xml:"body"`
	Attrs   []xml.Attr `xml:"-"` // Preserve root element attributes (namespaces)
}

Document represents a Word document structure

func ParseDocument

func ParseDocument(r io.Reader) (*Document, error)

ParseDocument parses Word document XML read from r.
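
Because a DOCX file is a ZIP archive, the reader handed to a parser usually comes from the archive entry that holds the main document part. The sketch below uses only archive/zip and assumes the conventional entry name word/document.xml; openDocumentXML, buildSample, and readPart are hypothetical helpers for this example, and the call into this package is left out so the program stays self-contained.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// openDocumentXML returns a reader for the main document part of a DOCX
// archive. The entry name word/document.xml is the conventional location.
func openDocumentXML(zr *zip.Reader) (io.ReadCloser, error) {
	for _, f := range zr.File {
		if f.Name == "word/document.xml" {
			return f.Open()
		}
	}
	return nil, fmt.Errorf("word/document.xml not found")
}

// buildSample creates a tiny in-memory archive so the demo needs no files.
func buildSample(content string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create("word/document.xml")
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(w, content); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readPart opens the archive bytes and returns the document part as text.
// In real use the io.Reader would be passed to a parser instead.
func readPart(data []byte) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", err
	}
	rc, err := openDocumentXML(zr)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	b, err := io.ReadAll(rc)
	return string(b), err
}

func main() {
	data, err := buildSample("<document/>")
	if err != nil {
		panic(err)
	}
	part, err := readPart(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(part)
}
```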

func (*Document) ExtractNamespaces

func (doc *Document) ExtractNamespaces() map[string]string

ExtractNamespaces returns all namespace declarations found in the document attributes, as a map of prefix -> namespace URI.

func (*Document) MergeNamespaces

func (doc *Document) MergeNamespaces(additionalNamespaces map[string]string)

MergeNamespaces adds namespace declarations to the document attributes. If a prefix already exists, the existing declaration is preserved (first wins).

func (*Document) UnmarshalXML

func (doc *Document) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML implements custom XML unmarshaling to preserve root attributes

type Empty

type Empty struct{}

Empty represents an empty element (used for boolean properties)
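
A field-less struct behind a pointer gives presence-based booleans: the pointer is non-nil exactly when the element occurs in the input. The sketch below demonstrates that with the standard encoding/xml decoder; the empty and runProps types and the parseProps helper are hypothetical miniatures, not this package's definitions.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// empty mirrors the idea of an element with no attributes or content.
type empty struct{}

// runProps is a hypothetical two-field miniature of run properties.
type runProps struct {
	Bold   *empty `xml:"b"`
	Italic *empty `xml:"i"`
}

// parseProps decodes src; a non-nil pointer means the element was present.
func parseProps(src string) (runProps, error) {
	var p runProps
	err := xml.Unmarshal([]byte(src), &p)
	return p, err
}

func main() {
	p, err := parseProps(`<rPr><b/></rPr>`)
	if err != nil {
		panic(err)
	}
	fmt.Println("bold:", p.Bold != nil, "italic:", p.Italic != nil)
}
```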

type Font

type Font struct {
	ASCII         string `xml:"ascii,attr,omitempty"`
	HAnsi         string `xml:"hAnsi,attr,omitempty"`
	CS            string `xml:"cs,attr,omitempty"`
	EastAsia      string `xml:"eastAsia,attr,omitempty"`
	ASCIITheme    string `xml:"asciiTheme,attr,omitempty"`
	HAnsiTheme    string `xml:"hAnsiTheme,attr,omitempty"`
	CSTheme       string `xml:"csTheme,attr,omitempty"`
	EastAsiaTheme string `xml:"eastAsiaTheme,attr,omitempty"`
	Hint          string `xml:"hint,attr,omitempty"`
}

Font represents font information

func (Font) MarshalXML

func (f Font) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Font

type GridColumn

type GridColumn struct {
	Width int `xml:"w,attr"`
}

GridColumn represents a table column

func (GridColumn) MarshalXML

func (g GridColumn) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for GridColumn

type GridSpan

type GridSpan struct {
	Val int `xml:"val,attr"`
}

GridSpan represents cell column span

type Height

type Height struct {
	Val int `xml:"val,attr"`
}

Height represents row height

func (Height) MarshalXML

func (h Height) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Height

type Hyperlink

type Hyperlink struct {
	ID      string `xml:"http://schemas.openxmlformats.org/officeDocument/2006/relationships id,attr"`
	History string `xml:"history,attr,omitempty"`
	Runs    []Run  `xml:"r"`
}

Hyperlink represents a hyperlink in the document

func (*Hyperlink) GetText

func (h *Hyperlink) GetText() string

GetText returns the concatenated text of all runs in a hyperlink
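
Concatenating run text is straightforward, but a run may carry no text at all (for example one that holds only a break), so a nil check is needed. The following is a minimal sketch with hypothetical text and run types and a joinRuns helper; it is not the package's GetText implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// text and run are hypothetical miniatures of the documented types.
type text struct{ Content string }

type run struct{ Text *text }

// getText returns the run's text, or "" when the run has no text element.
func (r run) getText() string {
	if r.Text == nil {
		return ""
	}
	return r.Text.Content
}

// joinRuns concatenates the text of all runs in order.
func joinRuns(runs []run) string {
	var sb strings.Builder
	for _, r := range runs {
		sb.WriteString(r.getText())
	}
	return sb.String()
}

func main() {
	runs := []run{
		{Text: &text{Content: "Hello, "}},
		{}, // a run without text, e.g. one holding only a break
		{Text: &text{Content: "world!"}},
	}
	fmt.Println(joinRuns(runs))
}
```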

func (Hyperlink) MarshalXML

func (h Hyperlink) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Hyperlink to ensure proper namespacing

type Indentation

type Indentation struct {
	Left      string `xml:"left,attr,omitempty"`
	Right     string `xml:"right,attr,omitempty"`
	Start     string `xml:"start,attr,omitempty"`
	End       string `xml:"end,attr,omitempty"`
	FirstLine string `xml:"firstLine,attr,omitempty"`
	Hanging   string `xml:"hanging,attr,omitempty"`
}

Indentation represents paragraph indentation

func (Indentation) MarshalXML

func (i Indentation) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Indentation

type Kern

type Kern struct {
	Val int `xml:"val,attr"`
}

Kern represents character kerning

func (Kern) MarshalXML

func (k Kern) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Kern

type Lang

type Lang struct {
	Val      string `xml:"val,attr,omitempty"`
	EastAsia string `xml:"eastAsia,attr,omitempty"`
	Bidi     string `xml:"bidi,attr,omitempty"`
}

Lang represents language settings

func (Lang) MarshalXML

func (l Lang) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Lang

type Paragraph

type Paragraph struct {
	Properties *ParagraphProperties `xml:"pPr"`
	// Attrs preserves paragraph-level attributes (e.g. w14:paraId, w:rsidR).
	Attrs []xml.Attr `xml:"-"`
	// Content maintains the order of runs and hyperlinks
	Content []ParagraphContent `xml:"-"`
	// Legacy fields for backward compatibility during transition
	Runs       []Run       `xml:"-"`
	Hyperlinks []Hyperlink `xml:"-"`
}

Paragraph represents a paragraph in the document

func (*Paragraph) GetText

func (p *Paragraph) GetText() string

GetText returns the concatenated text of all runs in a paragraph

func (Paragraph) MarshalXML

func (p Paragraph) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Paragraph to ensure proper namespacing

func (*Paragraph) UnmarshalXML

func (p *Paragraph) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML implements custom XML unmarshaling to preserve element order

type ParagraphContent

type ParagraphContent interface {
	// contains filtered or unexported methods
}

ParagraphContent represents any content that can appear in a paragraph

type ParagraphProperties

type ParagraphProperties struct {
	Style          *Style         `xml:"pStyle"`
	Tabs           *Tabs          `xml:"tabs"`
	OverflowPunct  bool           `xml:"-"` // Stored as flag
	AutoSpaceDE    bool           `xml:"-"` // Stored as flag
	AutoSpaceDN    bool           `xml:"-"` // Stored as flag
	AdjustRightInd bool           `xml:"-"` // Stored as flag
	Alignment      *Alignment     `xml:"jc"`
	Indentation    *Indentation   `xml:"ind"`
	Spacing        *Spacing       `xml:"spacing"`
	TextAlignment  *TextAlignment `xml:"-"`   // Stored as string
	RunProperties  *RunProperties `xml:"rPr"` // Default run properties for paragraph
	// RawXML stores unparsed XML elements to preserve all paragraph properties
	RawXML []RawXMLElement `xml:"-"`
	// RawXMLMarkers stores marker strings for RawXML elements (used during marshaling)
	RawXMLMarkers []string `xml:"-"`
}

ParagraphProperties represents paragraph formatting properties

func (ParagraphProperties) MarshalXML

func (p ParagraphProperties) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for ParagraphProperties

func (*ParagraphProperties) UnmarshalXML

func (p *ParagraphProperties) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML implements custom XML unmarshaling to preserve unknown elements

type ProofErr

type ProofErr struct {
	Type  string     `xml:"type,attr,omitempty"`
	Attrs []xml.Attr `xml:"-"`
}

ProofErr represents a spell/grammar proofing marker within a paragraph.

func (ProofErr) MarshalXML

func (p ProofErr) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML writes proofErr as a self-closing WordprocessingML element.

func (*ProofErr) UnmarshalXML

func (p *ProofErr) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML preserves proofErr attributes such as w:type.

type RawXMLElement

type RawXMLElement struct {
	XMLName xml.Name
	Attrs   []xml.Attr
	Content []byte
}

RawXMLElement represents a raw XML element that we preserve but don't parse
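
One way to capture an element without interpreting it is encoding/xml's innerxml and any-attribute tags. The sketch below shows that technique on a hypothetical rawElement type with the same three fields; how this package actually fills RawXMLElement is not stated in the documentation, so treat the tags and the capture helper as assumptions.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// rawElement is a hypothetical type that keeps an element's name,
// attributes, and verbatim inner XML without parsing the children.
type rawElement struct {
	XMLName xml.Name
	Attrs   []xml.Attr `xml:",any,attr"`
	Content []byte     `xml:",innerxml"`
}

// capture decodes the root element of src into a rawElement.
func capture(src string) (rawElement, error) {
	var r rawElement
	err := xml.Unmarshal([]byte(src), &r)
	return r, err
}

func main() {
	r, err := capture(`<sectPr id="1"><pgSz w="12240" h="15840"/></sectPr>`)
	if err != nil {
		panic(err)
	}
	fmt.Println("name:", r.XMLName.Local)
	fmt.Println("attrs:", len(r.Attrs))
	fmt.Println("inner:", string(r.Content))
}
```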

type Run

type Run struct {
	Properties *RunProperties `xml:"rPr"`
	Text       *Text          `xml:"t"`
	Break      *Break         `xml:"br"`
	// Attrs preserves run-level attributes (e.g. w:rsidRPr).
	Attrs []xml.Attr `xml:"-"`
	// RawXML stores unparsed XML elements (like drawings) to preserve them
	RawXML []RawXMLElement `xml:"-"`
}

Run represents a run of text with common properties

func (*Run) GetText

func (r *Run) GetText() string

GetText returns the text content of a run

func (Run) MarshalXML

func (r Run) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Run to ensure proper namespacing

func (*Run) UnmarshalXML

func (r *Run) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML implements custom XML unmarshaling to preserve unknown elements

type RunProperties

type RunProperties struct {
	Bold          *Empty          `xml:"b"`
	BoldCs        *Empty          `xml:"bCs"`
	Italic        *Empty          `xml:"i"`
	ItalicCs      *Empty          `xml:"iCs"`
	Underline     *UnderlineStyle `xml:"u"`
	Strike        *Empty          `xml:"strike"`
	VerticalAlign *VerticalAlign  `xml:"vertAlign"`
	Color         *Color          `xml:"color"`
	Size          *Size           `xml:"sz"`
	SizeCs        *Size           `xml:"szCs"` // Complex script size
	Kern          *Kern           `xml:"kern"` // Character kerning
	Lang          *Lang           `xml:"lang"` // Language settings
	Font          *Font           `xml:"rFonts"`
	Style         *RunStyle       `xml:"rStyle"`
}

RunProperties represents run formatting properties

type RunStyle

type RunStyle struct {
	Val string `xml:"val,attr"`
}

RunStyle represents a run style reference

func (RunStyle) MarshalXML

func (s RunStyle) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for RunStyle

type Shading

type Shading struct {
	Val       string `xml:"val,attr,omitempty"`
	Color     string `xml:"color,attr,omitempty"`
	Fill      string `xml:"fill,attr,omitempty"`
	ThemeFill string `xml:"themeFill,attr,omitempty"`
}

Shading represents cell or paragraph shading

func (Shading) MarshalXML

func (s Shading) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Shading

type Size

type Size struct {
	Val int `xml:"val,attr"`
}

Size represents font size
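
WordprocessingML expresses the w:sz value in half-points, so a Val of 24 corresponds to 12 pt text. The conversion helpers below are hypothetical conveniences for this example and are not part of the package.

```go
package main

import (
	"fmt"
	"math"
)

// halfPointsToPoints converts a w:sz value (half-points) to points.
func halfPointsToPoints(val int) float64 {
	return float64(val) / 2
}

// pointsToHalfPoints converts points to the nearest half-point value.
func pointsToHalfPoints(pt float64) int {
	return int(math.Round(pt * 2))
}

func main() {
	fmt.Println(halfPointsToPoints(24))   // 24 half-points
	fmt.Println(pointsToHalfPoints(10.5)) // 10.5 pt
}
```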

func (Size) MarshalXML

func (s Size) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Size

type Spacing

type Spacing struct {
	Before   int    `xml:"before,attr,omitempty"`
	After    int    `xml:"after,attr,omitempty"`
	Line     int    `xml:"line,attr,omitempty"`
	LineRule string `xml:"lineRule,attr,omitempty"`
}

Spacing represents paragraph spacing

func (Spacing) MarshalXML

func (s Spacing) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Spacing

type Style

type Style struct {
	Val string `xml:"val,attr"`
}

Style represents a style reference

func (Style) MarshalXML

func (s Style) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Style

type Tab

type Tab struct {
	Val string `xml:"val,attr"`
	Pos string `xml:"pos,attr"`
}

Tab represents a single tab stop

func (Tab) MarshalXML

func (t Tab) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Tab

type Table

type Table struct {
	Properties *TableProperties `xml:"tblPr"`
	Grid       *TableGrid       `xml:"tblGrid"`
	Rows       []TableRow       `xml:"tr"`
}

Table represents a table in the document

func (Table) MarshalXML

func (t Table) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Table to ensure proper namespacing

func (*Table) UnmarshalXML

func (t *Table) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML implements custom XML unmarshaling to keep namespace scope in sync.

type TableBorders

type TableBorders struct {
	Top     *BorderProperties `xml:"top"`
	Left    *BorderProperties `xml:"left"`
	Bottom  *BorderProperties `xml:"bottom"`
	Right   *BorderProperties `xml:"right"`
	InsideH *BorderProperties `xml:"insideH"`
	InsideV *BorderProperties `xml:"insideV"`
}

TableBorders represents borders for a table (w:tblBorders). This includes inner borders (insideH, insideV) in addition to outer borders.

func (TableBorders) MarshalXML

func (b TableBorders) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableBorders

type TableCell

type TableCell struct {
	Properties *TableCellProperties `xml:"tcPr"`
	Paragraphs []Paragraph          `xml:"p"`
}

TableCell represents a cell in a table

func (*TableCell) GetText

func (c *TableCell) GetText() string

GetText returns the concatenated text of all paragraphs in a cell

func (TableCell) MarshalXML

func (c TableCell) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableCell to ensure proper namespacing

func (*TableCell) UnmarshalXML

func (c *TableCell) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML implements custom XML unmarshaling to keep namespace scope in sync.

type TableCellBorders

type TableCellBorders struct {
	Top    *BorderProperties `xml:"top"`
	Bottom *BorderProperties `xml:"bottom"`
	Left   *BorderProperties `xml:"left"`
	Right  *BorderProperties `xml:"right"`
}

TableCellBorders represents borders for a table cell

func (TableCellBorders) MarshalXML

func (b TableCellBorders) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableCellBorders

type TableCellMargins

type TableCellMargins struct {
	Left   *CellMargin `xml:"left"`
	Right  *CellMargin `xml:"right"`
	Top    *CellMargin `xml:"top"`
	Bottom *CellMargin `xml:"bottom"`
}

TableCellMargins represents default cell margins for a table

func (TableCellMargins) MarshalXML

func (m TableCellMargins) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableCellMargins

type TableCellProperties

type TableCellProperties struct {
	Width     *Width            `xml:"tcW"`
	VAlign    *VerticalAlign    `xml:"vAlign"`
	GridSpan  *GridSpan         `xml:"gridSpan"`
	Shading   *Shading          `xml:"shd"`
	TcBorders *TableCellBorders `xml:"tcBorders"`
}

TableCellProperties represents cell properties

func (TableCellProperties) MarshalXML

func (p TableCellProperties) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableCellProperties

type TableGrid

type TableGrid struct {
	Columns []GridColumn `xml:"gridCol"`
}

TableGrid represents table column definitions

func (TableGrid) MarshalXML

func (g TableGrid) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableGrid

type TableIndentation

type TableIndentation struct {
	Width int    `xml:"w,attr"`
	Type  string `xml:"type,attr"`
}

TableIndentation represents table indentation from margin

func (TableIndentation) MarshalXML

func (t TableIndentation) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableIndentation

type TableLayout

type TableLayout struct {
	Type string `xml:"type,attr"`
}

TableLayout represents table layout mode

func (TableLayout) MarshalXML

func (t TableLayout) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableLayout

type TableLook

type TableLook struct {
	Val         string `xml:"val,attr,omitempty"`
	FirstRow    string `xml:"firstRow,attr,omitempty"`
	LastRow     string `xml:"lastRow,attr,omitempty"`
	FirstColumn string `xml:"firstColumn,attr,omitempty"`
	LastColumn  string `xml:"lastColumn,attr,omitempty"`
	NoHBand     string `xml:"noHBand,attr,omitempty"`
	NoVBand     string `xml:"noVBand,attr,omitempty"`
}

TableLook represents table style options

func (TableLook) MarshalXML

func (t TableLook) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableLook

type TableProperties

type TableProperties struct {
	Style       *Style            `xml:"tblStyle"`
	Width       *Width            `xml:"tblW"`
	Indentation *TableIndentation `xml:"tblInd"`
	Borders     *TableBorders     `xml:"tblBorders"`
	Layout      *TableLayout      `xml:"tblLayout"`
	CellMargins *TableCellMargins `xml:"tblCellMar"`
	Look        *TableLook        `xml:"tblLook"`
}

TableProperties represents table formatting properties

func (TableProperties) MarshalXML

func (p TableProperties) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableProperties

type TableRow

type TableRow struct {
	Properties *TableRowProperties `xml:"trPr"`
	Cells      []TableCell         `xml:"tc"`
}

TableRow represents a row in a table

func (TableRow) MarshalXML

func (r TableRow) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableRow to ensure proper namespacing

func (*TableRow) UnmarshalXML

func (r *TableRow) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML implements custom XML unmarshaling to keep namespace scope in sync.

type TableRowProperties

type TableRowProperties struct {
	CantSplit bool    `xml:"-"` // Prevent row from splitting across pages
	Height    *Height `xml:"trHeight"`
}

TableRowProperties represents row properties

func (TableRowProperties) MarshalXML

func (p TableRowProperties) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TableRowProperties

func (*TableRowProperties) UnmarshalXML

func (p *TableRowProperties) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

UnmarshalXML implements custom XML unmarshaling for TableRowProperties

type Tabs

type Tabs struct {
	XMLName xml.Name `xml:"tabs"`
	Tab     []Tab    `xml:"tab"`
}

Tabs represents tab stops

type Text

type Text struct {
	XMLName xml.Name `xml:"t"`
	Space   string   `xml:"space,attr"`
	Content string   `xml:",chardata"`
}

Text represents text content
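
The Space field corresponds to the xml:space attribute, which is set to "preserve" when leading or trailing whitespace in the text must survive. The sketch below is an assumption about when a caller would want to set it; spaceAttr is a hypothetical helper, and the package may apply a different rule when marshaling.

```go
package main

import (
	"fmt"
	"strings"
)

// spaceAttr returns "preserve" when s starts or ends with whitespace,
// and "" otherwise. The result is meant for a Text.Space-style field.
func spaceAttr(s string) string {
	if s != strings.TrimSpace(s) {
		return "preserve"
	}
	return ""
}

func main() {
	for _, s := range []string{"Hello,", " world", "plain"} {
		fmt.Printf("%q -> %q\n", s, spaceAttr(s))
	}
}
```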

func (Text) MarshalXML

func (t Text) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for Text to ensure proper namespacing

type TextAlignment

type TextAlignment struct {
	Val string `xml:"val,attr"`
}

TextAlignment represents text alignment settings

func (TextAlignment) MarshalXML

func (t TextAlignment) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for TextAlignment

type UnderlineStyle

type UnderlineStyle struct {
	Val string `xml:"val,attr"`
}

UnderlineStyle represents underline formatting

type VerticalAlign

type VerticalAlign struct {
	Val string `xml:"val,attr"`
}

VerticalAlign represents vertical text alignment (superscript/subscript)

func (VerticalAlign) MarshalXML

func (v VerticalAlign) MarshalXML(e *xml.Encoder, start xml.StartElement) error

MarshalXML implements custom XML marshaling for VerticalAlign

type Width

type Width struct {
	Type string `xml:"type,attr"`
	Val  int    `xml:"w,attr"`
}

Width represents width settings
