Support for RichTextRun in Spreadsheet #410

Open
@freb

Description

I was pulling content from an existing spreadsheet and noticed two cells that have content but return an empty string from cell.GetRawValue() and cell.GetString(). After digging into the raw XML, I noticed that the corresponding shared string looked like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst count="1" uniqueCount="1" xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
	<si>
		<r>
			<t xml:space="preserve">some content and </t>
		</r>
		<r>
			<rPr>
				<sz val="11"/>
				<color rgb="FF000000"/>
				<rFont val="Calibri"/>
				<family val="2"/>
			</rPr>
			<t>example.com</t>
		</r>
		<r>
			<rPr>
				<sz val="11"/>
				<color theme="1"/>
				<rFont val="Calibri"/>
				<family val="2"/>
				<scheme val="minor"/>
			</rPr>
			<t>and more content.</t>
		</r>
	</si>
</sst>

I figured the runs were the issue. I dug through the code, and while RichTextRun exists, it doesn't seem to be used anywhere. I also inspected all attributes on cell.X() (sml.CT_Cell) and couldn't find the runs anywhere. From what I can tell, the rich text runs are not being parsed into the CT_Cell at all.

Expected Behavior

GetFormattedValue() should not be empty when a cell has content displayed in Excel. Ideally GetString() would be updated to return a plaintext version of the content, though according to how GetString is documented, it is currently working as expected.

Actual Behavior

GetFormattedValue() returns an empty string for cells with RichTextRuns. I was also unable to find any method for accessing the raw RichTextRun content directly through cell.X().

I've attached a spreadsheet with RichTextRun content in A1: wb.xlsx.
