python-docx is a Python library for creating and modifying Microsoft Word (.docx) documents. It supports text, images, shapes, tables and many other document features, and it works on new documents as well as existing ones. To get started, first install python-docx on your system using pip or from source.

# using pip
pip install python-docx

# using easy_install (legacy; deprecated by setuptools, prefer pip)
easy_install python-docx

# or build from source
tar xvzf python-docx-{version}.tar.gz
cd python-docx-{version}
python setup.py install

Now we can work through the basics of python-docx for creating a Word document.

Getting Started

First, we can create an empty document where we can write any text or other data.

# import docx document
from docx import Document
# initialize a document
document = Document()

Alternatively, an existing document can be opened by passing its path to Document().

document = Document(doc_path)

Next, we use python-docx functions to add content to the document.

Working with Text

python-docx provides paragraphs, headings and other elements for working with simple text.

Headings are paragraphs whose text size and style depend on the level given when the heading is created. Levels range from 0 to 9: level 0 is the document title and has the largest font, while levels 1 to 9 map to the styles Heading 1 through Heading 9. Here are some examples of headings.

# title heading (level 0)
document.add_heading("This is a title (level 0) heading", 0)

# Add other heading levels
document.add_heading("This is a level 2 heading", 2)
document.add_heading("This is a level 3 heading", 3)
document.add_heading("This is a level 5 heading", 5)
document.add_heading("This is a level 7 heading", 7)
document.add_heading("This is a level 9 heading", 9)

A paragraph is the basic block of text in a document, and Word wraps its content into lines according to where the paragraph is placed. Paragraphs have different style and alignment options, so a document can be created with text positioned and styled as required.

paragraph = document.add_paragraph("TensorFlow is a free and open-source software library for machine learning and artificial intelligence.")

A paragraph can be modified after it is created, for example by appending new text or changing its alignment.

# add more text
paragraph.add_run(" It can be used across a range of tasks for ")

# add text with styles
paragraph.add_run('training model ').bold = True # added text with bold
paragraph.add_run('and inference.').italic = True # added italic text

Paragraphs can also use other built-in styles, such as quotes. The first argument is the paragraph text and the style name is passed as style.

document.add_paragraph('I have no special talent', style='Intense Quote')

Paragraph Alignment

Paragraph formatting such as horizontal alignment, indentation and line spacing can also be applied to paragraphs. First, let's work with horizontal alignment.

from docx.enum.text import WD_ALIGN_PARAGRAPH

# Check previous alignment
print("Previous alignment", paragraph.paragraph_format.alignment)

# Align paragraph center
paragraph.paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER

Indentation is the horizontal space between a paragraph and the edges of its container. In python-docx, indents are given as lengths, so we import Inches and then indent the left side, the right side, or only the first line.

from docx.shared import Inches

paragraph_r = document.add_paragraph('This is some random paragraph for testing indentation on both (left and right) side of paragraph.')
# indent only the first line (applied to the earlier paragraph)
paragraph.paragraph_format.first_line_indent = Inches(0.5)
# paragraph indent
paragraph_r.paragraph_format.left_indent = Inches(0.5) # apply 0.5 inch left indentation
paragraph_r.paragraph_format.right_indent = Inches(1) # apply 1 inch right indentation

Line spacing is also part of paragraph formatting and is easy to use. Assigning a length such as Pt(18) sets a fixed line height.

from docx.shared import Pt
paragraph.paragraph_format.line_spacing = Pt(18)

Working with Fonts

Font properties such as color, size and font family can also be changed; they are set on a run. Here we create a paragraph and change the font size and font family of its run.

# font modifications
para_font = document.add_paragraph()
run = para_font.add_run('This is a paragraph with different font styles.\n')
run.font.size = Pt(14) # font size
run.font.name = 'Courier New' # font name

We can also apply colors and other attributes such as bold, italic and underline.

from docx.shared import RGBColor

# add underlined text with blue color
url = para_font.add_run("http://google.com")
url.font.color.rgb = RGBColor(0x00, 0x00, 0xFF)
url.font.underline = True # font underline

[Figure: output of all the paragraph code written above.]

Now we can save the document to disk.

# save document
document.save("paragraphs.docx")

python-docx has many other features, such as lists and nested lists, images, tables, and headers and footers. These will be covered in other posts that will be published soon, or you can check the official python-docx documentation.

https://python-docx.readthedocs.io/en/latest/