Adding Sections to PowerPoint Files with Python and lxml

The python-pptx library is great for creating PowerPoint presentations programmatically, but it doesn’t support PowerPoint sections — the collapsible groups you see in the slide panel. This post shows how to add sections by directly manipulating the underlying XML with lxml.

The OOXML Structure

A .pptx file is a ZIP archive containing XML files. The presentation structure lives in ppt/presentation.xml. Sections are stored inside an extension element using the p14 namespace (http://schemas.microsoft.com/office/powerpoint/2010/main), not the standard p namespace.

The correct XML structure looks like this:

<p:extLst>
 <p:ext uri="{521415D9-36F7-43E2-AB2F-B90AF26B5E84}">
   <p14:sectionLst xmlns:p14="http://schemas.microsoft.com/office/powerpoint/2010/main">
     <p14:section name="Introduction" id="{GUID-HERE}">
       <p14:sldIdLst>
         <p14:sldId id="256"/>
         <p14:sldId id="257"/>
       </p14:sldIdLst>
     </p14:section>
     <p14:section name="Background" id="{GUID-HERE}">
       <p14:sldIdLst>
         <p14:sldId id="258"/>
       </p14:sldIdLst>
     </p14:section>
     <p14:section name="Proposed Method" id="{GUID-HERE}">
       <p14:sldIdLst>
         <p14:sldId id="259"/>
         <p14:sldId id="260"/>
       </p14:sldIdLst>
     </p14:section>
     <p14:section name="Conclusion" id="{GUID-HERE}">
       <p14:sldIdLst/>
     </p14:section>
   </p14:sectionLst>
 </p:ext>
</p:extLst>

Key points:

  • p14:sectionLst must be inside a p:ext element with URI {521415D9-36F7-43E2-AB2F-B90AF26B5E84}
  • That p:ext must be inside p:extLst, which is a child of p:presentation
  • Each p14:section has a name and a unique id (GUID format)
  • Each section contains a p14:sldIdLst listing the slide IDs that belong to it
  • Empty sections (no slides) use an empty <p14:sldIdLst/>

Common Pitfall: Wrong Namespace

Why p14 and not p10?

The namespace URI contains “2010”: http://schemas.microsoft.com/office/powerpoint/2010/main. You might expect the prefix to be p10, but Microsoft uses p14 — it refers to PowerPoint 14.0 (the internal version number of Office 2010). This is a common source of confusion:

Office version    Internal version    Namespace prefix
Office 2010       14.0                p14
Office 2013       15.0                p15
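
The prefix is only a label; an XML parser matches elements by namespace URI. As a small illustration (a sketch using the standard library's xml.etree.ElementTree rather than lxml, with two made-up fragments), the same element written with two different prefixes parses to the same qualified tag:

```python
import xml.etree.ElementTree as ET

P14_NS = "http://schemas.microsoft.com/office/powerpoint/2010/main"

# Two illustrative fragments: identical except for the prefix bound to the URI.
doc_a = f'<p14:sectionLst xmlns:p14="{P14_NS}"><p14:section name="Intro"/></p14:sectionLst>'
doc_b = f'<x:sectionLst xmlns:x="{P14_NS}"><x:section name="Intro"/></x:sectionLst>'

# ElementTree stores tags as "{namespace-uri}localname", so the prefix disappears.
tags = [ET.fromstring(doc).tag for doc in (doc_a, doc_b)]
print(tags[0] == tags[1])
```

So p14 is a convention worth following for readability, but the URI is what has to be right.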

Using the standard p namespace (wrong)

If you place sections in the standard p namespace as a direct child of p:presentation:

<!-- WRONG — PowerPoint ignores this -->
<p:presentation>
...
 <p:sectionLst>
   <p:section name="Introduction" id="{GUID}">
     <p:sldIdLst>
       <p:sldId id="256"/>
     </p:sldIdLst>
   </p:section>
 </p:sectionLst>
</p:presentation>

PowerPoint will silently ignore it and show all slides in a “Default Section” instead. The file won’t be corrupted — the sections just won’t appear.
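
Because nothing fails loudly, it can help to check the placement programmatically. The following is a minimal sketch, not part of the original workflow: section_placement is a hypothetical helper, and it uses the standard library's xml.etree.ElementTree so it runs without lxml. It reports whether a section list sits in the extension element, in the wrong place, or nowhere.

```python
import xml.etree.ElementTree as ET

P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"
P14_NS = "http://schemas.microsoft.com/office/powerpoint/2010/main"
SECTION_EXT_URI = "{521415D9-36F7-43E2-AB2F-B90AF26B5E84}"
NS = {"p": P_NS, "p14": P14_NS}


def section_placement(presentation_xml):
    """Classify where (if anywhere) a section list appears in presentation.xml."""
    root = ET.fromstring(presentation_xml)
    # Correct layout: p:presentation/p:extLst/p:ext[@uri]/p14:sectionLst
    for ext in root.findall("p:extLst/p:ext", NS):
        if ext.get("uri") == SECTION_EXT_URI and ext.find("p14:sectionLst", NS) is not None:
            return "extension"
    # The pitfall: a sectionLst in the standard namespace, directly under the root
    if root.find("p:sectionLst", NS) is not None:
        return "misplaced"
    return "none"


# Two tiny hand-written documents for illustration
WRONG = (
    f'<p:presentation xmlns:p="{P_NS}"><p:sectionLst>'
    '<p:section name="Introduction"/></p:sectionLst></p:presentation>'
)
RIGHT = (
    f'<p:presentation xmlns:p="{P_NS}"><p:extLst><p:ext uri="{SECTION_EXT_URI}">'
    f'<p14:sectionLst xmlns:p14="{P14_NS}"/></p:ext></p:extLst></p:presentation>'
)

print(section_placement(WRONG), section_placement(RIGHT))
```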

Python Implementation

Step 1: Create Slides with python-pptx

from pptx import Presentation

prs = Presentation("template.pptx")

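# Add slides as usual
# Note: slide_masters[1] assumes template.pptx defines a second slide master;
# with a single master, pick a layout from slide_masters[0] instead.
title_layout = prs.slide_masters[0].slide_layouts[0]
content_layout = prs.slide_masters[1].slide_layouts[0]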

title_slide = prs.slides.add_slide(title_layout)
title_slide.shapes.title.text = "My Presentation"

intro_slide = prs.slides.add_slide(content_layout)
intro_slide.shapes.title.text = "Introduction"

bg_slide = prs.slides.add_slide(content_layout)
bg_slide.shapes.title.text = "Background"

method_slide1 = prs.slides.add_slide(content_layout)
method_slide1.shapes.title.text = "Proposed Method"

method_slide2 = prs.slides.add_slide(content_layout)
method_slide2.shapes.title.text = "Proposed Method (cont.)"

conclusion_slide = prs.slides.add_slide(content_layout)
conclusion_slide.shapes.title.text = "Conclusion"

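# Remove template placeholder slides if needed.
# Drop the relationship too, so the removed slide parts are no longer
# referenced from the presentation.
xml_slides = prs.slides._sldIdLst
for xml_slide in list(xml_slides[:2]):
    prs.part.drop_rel(xml_slide.rId)
    xml_slides.remove(xml_slide)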

Step 2: Add Sections with lxml

from lxml import etree

def add_sections(prs, sections):
   """
   Add sections to a PowerPoint presentation.

   Args:
       prs: python-pptx Presentation object
       sections: list of dicts with 'name' and optional 'slide_ids'
                 (slide IDs as strings),
                 e.g. [{"name": "Introduction", "slide_ids": ["256", "257"]},
                       {"name": "Background", "slide_ids": ["258"]}]
  """
   presentation_elm = prs.part._element

   P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"
   P14_NS = "http://schemas.microsoft.com/office/powerpoint/2010/main"
   SECTION_EXT_URI = "{521415D9-36F7-43E2-AB2F-B90AF26B5E84}"

   nsmap_p = {"p": P_NS}

   # Find or create extLst
   extLst = presentation_elm.find("p:extLst", nsmap_p)
   if extLst is None:
       extLst = etree.SubElement(presentation_elm, f"{{{P_NS}}}extLst")

   # Find or create the section extension element
   section_ext = None
   for ext in extLst.findall("p:ext", nsmap_p):
       if ext.get("uri") == SECTION_EXT_URI:
           section_ext = ext
           break

   if section_ext is None:
       section_ext = etree.SubElement(
           extLst, f"{{{P_NS}}}ext", uri=SECTION_EXT_URI
      )

   # Remove any existing sectionLst
   for old in section_ext.findall(f"{{{P14_NS}}}sectionLst"):
       section_ext.remove(old)

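   # Create new sectionLst; nsmap binds the conventional p14 prefix so the
   # saved XML reads p14:sectionLst rather than an auto-generated prefix
   sectionLst = etree.SubElement(
       section_ext, f"{{{P14_NS}}}sectionLst", nsmap={"p14": P14_NS}
   )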

   # Add each section
   for i, sec in enumerate(sections):
       section_id = f"{{{i:08d}-0000-0000-0000-000000000001}}"
       section = etree.SubElement(
           sectionLst,
           f"{{{P14_NS}}}section",
           name=sec["name"],
           id=section_id,
      )
       sldIdLst = etree.SubElement(section, f"{{{P14_NS}}}sldIdLst")
       for sid in sec.get("slide_ids", []):
           etree.SubElement(sldIdLst, f"{{{P14_NS}}}sldId", id=sid)
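
The function above derives each section id from the loop index, which gives well-formed but predictable GUIDs. If random ids are preferred, the standard library's uuid module can generate them. A minimal sketch follows; new_section_id is a hypothetical helper name, and the upper-case form is chosen only to match the style of the extension URI:

```python
import uuid


def new_section_id():
    """Return a random GUID in braces, e.g. for the id attribute of p14:section."""
    # uuid4() is random, so ids differ between runs and between sections.
    return "{" + str(uuid.uuid4()).upper() + "}"


print(new_section_id())
```

To use it, the section_id assignment inside the loop could call new_section_id() instead of formatting the index.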

Step 3: Get Slide IDs and Call

Slide IDs are numeric identifiers assigned by PowerPoint, not the slide index. You need to read them from the presentation XML:

nsmap_p = {"p": "http://schemas.openxmlformats.org/presentationml/2006/main"}
sldIdLst = prs.part._element.find(".//p:sldIdLst", nsmap_p)
slide_ids = [elem.get("id") for elem in sldIdLst]

# slide_ids is now e.g. ["256", "257", "258", "259", "260", "261"]
# Map them to your sections:
add_sections(prs, [
  {"name": "Introduction", "slide_ids": slide_ids[:2]},
  {"name": "Background", "slide_ids": [slide_ids[2]]},
  {"name": "Proposed Method", "slide_ids": slide_ids[3:5]},
  {"name": "Conclusion", "slide_ids": [slide_ids[5]]},
])

prs.save("output.pptx")
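
Slicing by hand gets error-prone as a deck grows. One possible convenience, sketched here as a hypothetical build_sections helper that is not part of python-pptx, is to describe the deck as (name, slide count) pairs and let the helper cut the ID list accordingly:

```python
def build_sections(slide_ids, layout):
    """Turn (name, slide_count) pairs into the dicts expected by add_sections.

    slide_ids: slide ID strings in presentation order
    layout:    list of (section name, number of slides) pairs, in order
    """
    total = sum(count for _, count in layout)
    if total != len(slide_ids):
        raise ValueError(
            f"layout covers {total} slides but {len(slide_ids)} IDs were given"
        )
    sections, start = [], 0
    for name, count in layout:
        # A count of 0 yields an empty list, i.e. an empty section.
        sections.append({"name": name, "slide_ids": list(slide_ids[start:start + count])})
        start += count
    return sections


example_ids = [str(n) for n in range(256, 262)]  # illustrative IDs only
example = build_sections(
    example_ids,
    [("Introduction", 2), ("Background", 1), ("Proposed Method", 2), ("Conclusion", 1)],
)
print(example)
```

The result can be passed directly as the second argument of add_sections.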

Debugging

To verify sections are correctly written, inspect the XML inside the saved .pptx:

import zipfile
from lxml import etree

with zipfile.ZipFile("output.pptx") as z:
   xml = z.read("ppt/presentation.xml")

root = etree.fromstring(xml)
print(etree.tostring(root, pretty_print=True).decode())

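Look for p14:sectionLst inside p:ext, which in turn sits inside p:extLst. The prefix itself may differ; what matters is that the element is in the 2010 namespace. If you see p:sectionLst as a direct child of p:presentation instead, sections won’t work.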
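
Reading a large XML dump by eye is tedious, so the same check can be scripted. The sketch below is one way to do it, assuming only the structure described in this post: list_sections is a hypothetical helper, and it uses the standard library's zipfile and xml.etree.ElementTree instead of lxml. The demo runs on an in-memory archive that holds nothing but presentation.xml; with a real file, pass the path to the saved .pptx.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"
P14_NS = "http://schemas.microsoft.com/office/powerpoint/2010/main"
SECTION_EXT_URI = "{521415D9-36F7-43E2-AB2F-B90AF26B5E84}"
NS = {"p": P_NS, "p14": P14_NS}


def list_sections(pptx):
    """Return [(section name, [slide IDs])] found in the expected location.

    pptx is a path or a binary file-like object, as accepted by ZipFile.
    """
    with zipfile.ZipFile(pptx) as z:
        root = ET.fromstring(z.read("ppt/presentation.xml"))
    found = []
    for ext in root.findall("p:extLst/p:ext", NS):
        if ext.get("uri") != SECTION_EXT_URI:
            continue  # some other extension, not the section list
        for section in ext.findall("p14:sectionLst/p14:section", NS):
            ids = [e.get("id") for e in section.findall("p14:sldIdLst/p14:sldId", NS)]
            found.append((section.get("name"), ids))
    return found


# Demo archive: only presentation.xml, not a complete .pptx
demo_xml = (
    f'<p:presentation xmlns:p="{P_NS}" xmlns:p14="{P14_NS}"><p:extLst>'
    f'<p:ext uri="{SECTION_EXT_URI}"><p14:sectionLst>'
    '<p14:section name="Introduction" id="{GUID-HERE}"><p14:sldIdLst>'
    '<p14:sldId id="256"/><p14:sldId id="257"/></p14:sldIdLst></p14:section>'
    '<p14:section name="Conclusion" id="{GUID-HERE}"><p14:sldIdLst/></p14:section>'
    "</p14:sectionLst></p:ext></p:extLst></p:presentation>"
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("ppt/presentation.xml", demo_xml)
buf.seek(0)

sections = list_sections(buf)
print(sections)
```

An empty result for a file that should have sections suggests the section list is missing or was written in the wrong place.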

Summary

What              How
Create slides     python-pptx as usual
Add sections      lxml: write p14:sectionLst inside p:extLst
Namespace         http://schemas.microsoft.com/office/powerpoint/2010/main (p14)
Extension URI     {521415D9-36F7-43E2-AB2F-B90AF26B5E84}
Assign slides     Reference slide IDs in each section’s p14:sldIdLst
Empty sections    Use <p14:sldIdLst/> with no children
