Error: invalid XML content

Applies to: Windows 7 Enterprise, Windows 7 Professional, Windows 7 Ultimate, Windows 7 Home Basic, Windows 7 Home Premium, Windows Server 2008 R2 Datacenter, Windows Server 2008 R2 Enterprise, Windows Server 2008 R2 for Itanium-Based Systems, Windows Server 2008 R2 Foundation, Windows Server 2008 R2 Standard, Windows Server 2008 R2 Web Edition

Symptoms

Consider the following scenario. You connect a USB flash drive to a computer that is running Windows 7 or Windows Server 2008 R2. You try to run one of the following Windows Management Instrumentation Command-line (WMIC) tool commands to query the hard disk drives on the computer:

wmic diskdrive get *

wmic diskdrive get serialNumber

In this scenario, you receive an error message that resembles the following:

Invalid XML content

Cause

This issue occurs because the XML parser treats the control characters that are included in the serial number of some drives as invalid. Therefore, the XML parser cannot parse content that includes these control characters. This behavior causes valid results for other drives to be displayed incorrectly, together with the behavior that is mentioned in the «Symptoms» section.
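To make the cause concrete, here is a minimal Python sketch of the same failure mode. It is not how WMIC or the hotfix works internally. It only shows that a strict XML parser rejects control characters outside the XML 1.0 character range, and that removing them makes the value parseable. The serial number, the element name, and the helper names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# XML 1.0 allows only: #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF.
# This pattern matches every character outside that set.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Remove characters that XML 1.0 does not allow in content."""
    return _INVALID_XML_CHARS.sub("", text)


def parses_as_xml(value: str) -> bool:
    """Wrap the value in a (hypothetical) element and try to parse it."""
    try:
        ET.fromstring(f"<SerialNumber>{value}</SerialNumber>")
        return True
    except ET.ParseError:
        return False


# Hypothetical serial number containing two control characters.
raw_serial = "AA\x01\x0200001234"
clean_serial = strip_invalid_xml_chars(raw_serial)

print(parses_as_xml(raw_serial))    # the raw value is rejected by the parser
print(parses_as_xml(clean_serial))  # the cleaned value parses
```

Dropping the characters is only one possible policy. Escaping or replacing them would be equally valid choices for a diagnostic script.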

Resolution

Hotfix information

A supported hotfix is available from Microsoft. However, this hotfix is intended to correct only the problem that is described in this article. Apply this hotfix only to systems that are experiencing the problem described in this article. This hotfix might receive additional testing. Therefore, if you are not severely affected by this problem, we recommend that you wait for the next software update that contains this hotfix.

If the hotfix is available for download, there is a «Hotfix download available» section at the top of this Knowledge Base article. If this section does not appear, contact Microsoft Customer Service and Support to obtain the hotfix.

Note If additional issues occur or if any troubleshooting is required, you might have to create a separate service request. The usual support costs will apply to additional support questions and issues that do not qualify for this specific hotfix. For a complete list of Microsoft Customer Service and Support telephone numbers or to create a separate service request, visit the following Microsoft website:

http://support.microsoft.com/contactus/?ws=support

Note The «Hotfix download available» form displays the languages for which the hotfix is available. If you do not see your language, it is because a hotfix is not available for that language.

Prerequisites

To apply this hotfix, you must be running Windows 7 Service Pack 1 (SP1) or Windows Server 2008 R2 Service Pack 1 (SP1).

For more information about how to obtain a Windows 7 or Windows Server 2008 R2 service pack, click the following article number to view the article in the Microsoft Knowledge Base:

976932 Information about Service Pack 1 for Windows 7 and for Windows Server 2008 R2

Registry information

To apply the hotfix in this package, you do not have to make any changes to the registry.

Restart requirement

You must restart the computer after you apply this hotfix.

Hotfix replacement information

This hotfix does not replace a previously released hotfix.

File information

The global version of this hotfix installs files that have the attributes that are listed in the following tables. The dates and the times for these files are listed in Coordinated Universal Time (UTC). The dates and the times for these files on your local computer are displayed in your local time together with your current daylight saving time (DST) bias. Additionally, the dates and the times may change when you perform certain operations on the files.

Windows 7 and Windows Server 2008 R2 file information notes


Important Windows 7 hotfixes and Windows Server 2008 R2 hotfixes are included in the same packages. However, hotfixes on the Hotfix Request page are listed under both operating systems. To request the hotfix package that applies to one or both operating systems, select the hotfix that is listed under «Windows 7/Windows Server 2008 R2» on the page. Always refer to the «Applies To» section in articles to determine the actual operating system that each hotfix applies to.

  • The files that apply to a specific product, milestone (RTM, SPn), and service branch (LDR, GDR) can be identified by examining the file version numbers as shown in the following table:

    Version          Product                                Milestone   Service branch
    6.1.7601.17xxx   Windows 7 and Windows Server 2008 R2   SP1         GDR
    6.1.7601.21xxx   Windows 7 and Windows Server 2008 R2   SP1         LDR

  • GDR service branches contain only those fixes that are widely released to address widespread, critical issues. LDR service branches contain hotfixes in addition to widely released fixes.

  • The MANIFEST files (.manifest) and the MUM files (.mum) that are installed for each environment are listed separately in the «Additional file information for Windows 7 and for Windows Server 2008 R2» section. MUM and MANIFEST files, and the associated security catalog (.cat) files, are critical to maintaining the state of the updated component. The security catalog files, for which the attributes are not listed, are signed with a Microsoft digital signature.
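The version-to-branch mapping in the table above can be expressed as a small lookup. The following Python sketch assumes the file version string has already been read from the file properties. The function name and the handling of unrecognized versions are illustrative choices, not part of the hotfix package.

```python
import re
from typing import Optional, Tuple

# Matches SP1 file versions of the form 6.1.7601.NNNNN, as listed in the table.
_VERSION_RE = re.compile(r"^6\.1\.7601\.(\d{5})$")


def classify_branch(file_version: str) -> Optional[Tuple[str, str]]:
    """Return (milestone, service branch) for a file version, or None.

    Only the two patterns from the table are recognized:
    6.1.7601.17xxx -> SP1 / GDR and 6.1.7601.21xxx -> SP1 / LDR.
    """
    match = _VERSION_RE.match(file_version.strip())
    if match is None:
        return None
    revision = match.group(1)
    if revision.startswith("17"):
        return ("SP1", "GDR")
    if revision.startswith("21"):
        return ("SP1", "LDR")
    return None


print(classify_branch("6.1.7601.17759"))
print(classify_branch("6.1.7601.21895"))
```

Any version outside the two listed patterns returns None rather than a guess, because the article documents only these two.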

For all supported x86-based versions of Windows 7

File name   File version     File size   Date          Time
Wmic.exe    6.1.7601.17759   396,288     11-Jan-2012   05:34
Wmic.exe    6.1.7601.21895   396,288     11-Jan-2012   05:59

For all supported x64-based versions of Windows 7 and of Windows Server 2008 R2

File name   File version     File size   Date          Time
Wmic.exe    6.1.7601.17759   566,784     11-Jan-2012   06:35
Wmic.exe    6.1.7601.21895   566,784     11-Jan-2012   06:17

For all supported IA-64-based versions of Windows Server 2008 R2

File name   File version     File size   Date          Time
Wmic.exe    6.1.7601.17759   1,074,688   11-Jan-2012   05:02
Wmic.exe    6.1.7601.21895   1,074,688   11-Jan-2012   05:15

Status

Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the «Applies to» section.

More Information

For more information about the WMIC tool, visit the following Microsoft website:

How to use the WMIC tool

For more information about software update terminology, click the following article number to view the article in the Microsoft Knowledge Base:

824684 Description of the standard terminology that is used to describe Microsoft software updates

Additional file information

Additional file information for Windows 7 and for Windows Server 2008 R2

Additional files for all supported x86-based versions of Windows 7

File name                                                                                                    File version     File size   Date (UTC)    Time (UTC)
X86_6c466840b6e63f63f610c157eb689892_31bf3856ad364e35_6.1.7601.17759_none_a7c0934712923ecf.manifest          Not applicable   712         11-Jan-2012   14:06
X86_a876a9b062f41a0d4de85d1baa24c21b_31bf3856ad364e35_6.1.7601.21895_none_458609fa6ddfc3e2.manifest          Not applicable   712         11-Jan-2012   14:06
X86_microsoft-windows-w..ommand-line-utility_31bf3856ad364e35_6.1.7601.17759_none_a38b04d82b34f3eb.manifest  Not applicable   2,606       11-Jan-2012   06:18
X86_microsoft-windows-w..ommand-line-utility_31bf3856ad364e35_6.1.7601.21895_none_a3e560cb44769e1d.manifest  Not applicable   2,606       11-Jan-2012   07:05

Additional files for all supported x64-based versions of Windows 7 and of Windows Server 2008 R2

File name                                                                                                      File version     File size   Date (UTC)    Time (UTC)
Amd64_3d500f95fcbf849689fefcb7325c786f_31bf3856ad364e35_6.1.7601.21895_none_9650dc0e92149a9f.manifest          Not applicable   1,072       11-Jan-2012   14:06
Amd64_60d1a618a98c83d3b0ba72adf281e92c_31bf3856ad364e35_6.1.7601.21895_none_cd4f676a3ce6ab90.manifest          Not applicable   716         11-Jan-2012   14:06
Amd64_6c466840b6e63f63f610c157eb689892_31bf3856ad364e35_6.1.7601.17759_none_03df2ecacaefb005.manifest          Not applicable   714         11-Jan-2012   14:06
Amd64_a876a9b062f41a0d4de85d1baa24c21b_31bf3856ad364e35_6.1.7601.21895_none_a1a4a57e263d3518.manifest          Not applicable   714         11-Jan-2012   14:06
Amd64_bde56e055e65c9622ce1740fd94aef81_31bf3856ad364e35_6.1.7601.17759_none_1477ed41b5701bb4.manifest          Not applicable   1,072       11-Jan-2012   14:06
Amd64_c9c34784d98d026b6c1e90dee14ad8b6_31bf3856ad364e35_6.1.7601.17759_none_a008e51116613975.manifest          Not applicable   716         11-Jan-2012   14:06
Amd64_microsoft-windows-w..ommand-line-utility_31bf3856ad364e35_6.1.7601.17759_none_ffa9a05be3926521.manifest  Not applicable   2,610       11-Jan-2012   07:40
Amd64_microsoft-windows-w..ommand-line-utility_31bf3856ad364e35_6.1.7601.21895_none_0003fc4efcd40f53.manifest  Not applicable   2,610       11-Jan-2012   07:31

Additional files for all supported IA-64-based versions of Windows Server 2008 R2

File name                                                                                                     File version     File size   Date (UTC)    Time (UTC)
Ia64_4f90e151ff6a00729d20707864ceceb7_31bf3856ad364e35_6.1.7601.17759_none_421d1d510aabe246.manifest          Not applicable   1,070       11-Jan-2012   14:06
Ia64_d9ae40a789be5755cf49c836060dc0f1_31bf3856ad364e35_6.1.7601.21895_none_5a456f45388820ce.manifest          Not applicable   1,070       11-Jan-2012   14:06
Ia64_microsoft-windows-w..ommand-line-utility_31bf3856ad364e35_6.1.7601.17759_none_a38ca8ce2b32fce7.manifest  Not applicable   2,608       11-Jan-2012   06:56
Ia64_microsoft-windows-w..ommand-line-utility_31bf3856ad364e35_6.1.7601.21895_none_a3e704c14474a719.manifest  Not applicable   2,608       11-Jan-2012   07:16

By Ryan Lambert — Published August 25, 2018

Update: This bug is fixed in the latest supported version’s minor releases. Upgrade to fix this 🐛 bug!

I ran into a problem when moving a database from a production PostGIS-enabled PostgreSQL
server to a development server for testing. Turns out, what I found is a bug in pg_dump related to
the XML data type. The problem encountered is filed under
bug #15342.
Tom Lane summarized the issue:

«There are two problems here: pg_dump neglects to force a safe
value of xmloption for the restore step, plus there doesn’t seem to be
a safe value for it to force :-(.»

The rest of this post explores what the problem is, how to tell if you are affected, and your options
if you find yourself in this group.

  • Who does this affect?
  • Data to reproduce the bug
  • Check for problematic XML data
  • Workaround for part of the problem
  • PostgreSQL and XML, and bug #15342
  • What to do?

Who does this affect?

This bug affects PostgreSQL databases that contain columns of the XML data type. More specifically,
it’s a problem if any of the XML data includes a <!DOCTYPE> block.
If you have XML data with the <!DOCTYPE> block, and also have true XML fragments, you are really affected.
These databases will experience headaches when restoring dump files saved using
the pg_dump or pg_dumpall utilities.

The Check for problematic XML data section below provides a query to help determine if your databases
are affected.

QGIS Layer Styles

If you have PostGIS databases that support QGIS users and those QGIS users store their styles
in the PostGIS database (look at public.layer_styles table), this affects you.
I discovered this bug because I am both an admin and analyst user of our PostGIS databases.
I use QGIS and I love that I can have my complex styles saved (and backed up!) directly in the
database with the spatial data itself.

QGIS achieves this by storing its XML style information in a table named
public.layer_styles in the database storing the PostGIS data.
Style information is stored in XML format in a column named styleqml that includes a
document type declaration
(<!DOCTYPE>). As I mentioned earlier, XML data with <!DOCTYPE> is at the core of this problem.

Data to reproduce the bug

The following SQL code will create a table with two columns and three rows of data. These three rows of
data are sufficient to replicate the error and illustrate the problem. I have replicated this issue on
PostgreSQL versions 9.5 through 11.

CREATE TABLE public.xml_doc (
    notes TEXT NOT NULL,
    data XML NOT NULL
);

-- Example derived from https://wiki.postgresql.org/wiki/XML_Support
INSERT INTO public.xml_doc 
SELECT 'Document, no DOCTYPE', 
    XMLROOT (
  XMLELEMENT (                                       
    NAME gazonk,                                
    XMLATTRIBUTES (
      'val' AS name,
      1 + 1 AS num
    ),
    XMLELEMENT (
      NAME qux,
      'foo'
    )
  ),
  VERSION '1.0',
  STANDALONE YES
)
;

-- Example XML document simplified from:  https://xmlwriter.net/xml_guide/doctype_declaration.shtml
INSERT INTO public.xml_doc 
SELECT 'Document, with DOCTYPE', 
    XMLPARSE (DOCUMENT '<?xml version="1.0" standalone="no" ?>
        <!DOCTYPE document SYSTEM "subjects.dtd">
        <document>
          <title>Subjects available in Mechanical Engineering.</title>
          <subjectID>2.303</subjectID>
        </document>')
;

-- Example derived from https://wiki.postgresql.org/wiki/XML_Support
INSERT INTO public.xml_doc
SELECT 'Content fragment', 
    XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>');

This example is derived from my original example SQL Fiddle. The SQL Fiddle version will not be maintained/updated.

Data loaded

The code above adds three rows of data with three classifications of XML data.

  • XML document, no DOCTYPE declaration
  • XML document, with DOCTYPE declaration
  • XML fragment

Only the Document, no DOCTYPE row will always load successfully from a pg_dump backup.
The Content fragment will load with the default setting, but fails with the workaround.
The Document, with DOCTYPE fails by default and only works with the workaround.
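The three outcomes above can be summarized as a small classification. The Python sketch below is only an approximation for reasoning about the cases. It uses Python's own XML parser rather than PostgreSQL's, and a plain substring test for the DOCTYPE declaration, so edge cases may be classified differently than the server would. The function and dictionary names are hypothetical.

```python
import xml.etree.ElementTree as ET


def classify_xml(value: str) -> str:
    """Classify a value as 'document', 'document_with_doctype', or 'fragment'.

    A value counts as a document if it parses as a single well-formed XML
    document; anything else is treated as a content fragment.
    """
    try:
        ET.fromstring(value)
    except ET.ParseError:
        return "fragment"
    return "document_with_doctype" if "<!DOCTYPE" in value else "document"


# Which XML OPTION settings let each class load during a restore,
# according to the behavior described in the text above.
RESTORABLE_WITH = {
    "document": {"CONTENT", "DOCUMENT"},
    "document_with_doctype": {"DOCUMENT"},
    "fragment": {"CONTENT"},
}

samples = [
    '<?xml version="1.0" standalone="yes"?><gazonk name="val" num="2"><qux>foo</qux></gazonk>',
    '<?xml version="1.0" standalone="no"?><!DOCTYPE document SYSTEM "subjects.dtd">'
    "<document><subjectID>2.303</subjectID></document>",
    "abc<foo>bar</foo><bar>foo</bar>",
]

for sample in samples:
    kind = classify_xml(sample)
    print(kind, sorted(RESTORABLE_WITH[kind]))
```

No single setting covers both the DOCTYPE document and the fragment, which is the core of the problem discussed in the rest of the post.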

The following query shows the XML data loaded and checks the XML for valid documents.

SELECT notes, data IS DOCUMENT AS xml_document, data
    FROM public.xml_doc;
┌────────────────────────┬──────────────┬──────────────────────────────────────────────────────────────────────────────────────────┐
│         notes          │ xml_document │                                           data                                           │
╞════════════════════════╪══════════════╪══════════════════════════════════════════════════════════════════════════════════════════╡
│ Document, no DOCTYPE   │ t            │ <?xml version="1.0" standalone="yes"?><gazonk name="val" num="2"><qux>foo</qux></gazonk> │
│ Document, with DOCTYPE │ t            │ <?xml version="1.0" standalone="no"?>                                                   ↵│
│                        │              │                 <!DOCTYPE document SYSTEM "subjects.dtd">                               ↵│
│                        │              │                 <document>                                                              ↵│
│                        │              │                   <title>Subjects available in Mechanical Engineering.</title>          ↵│
│                        │              │                   <subjectID>2.303</subjectID>                                          ↵│
│                        │              │                 </document>                                                              │
│ Content fragment       │ f            │ abc<foo>bar</foo><bar>foo</bar>                                                          │
└────────────────────────┴──────────────┴──────────────────────────────────────────────────────────────────────────────────────────┘
(3 rows)

pg_dump and restore

With example data loaded we can take a backup of the database using pg_dump.

pg_dump -d xml_test -f xml_test_invalid.sql

On a second PostgreSQL server, create a new database and attempt to restore the backup using psql:

psql -d postgres -c "CREATE DATABASE restore_here;"
psql -d restore_here -f xml_test_invalid.sql

It starts out so encouraging…

SET
Time: 4.502 ms
SET
Time: 3.987 ms
SET
Time: 4.818 ms
...

But then… ERROR!

ERROR:  invalid XML content
DETAIL:  line 2: StartTag: invalid element name
  <!DOCTYPE document SYSTEM "subjects.dtd">
   ^
CONTEXT:  COPY xml_doc, line 2, column data: "<?xml version="1.0"
standalone="no"?>
  <!DOCTYPE document SYSTEM "subjects.dtd">
  <document>
    <..."

One particularly unfortunate aspect of this bug is that it doesn’t throw an error during the
pg_dump process. Instead, this bug shows up first when attempting to restore
a pg_dump file that includes XML data with the <!DOCTYPE> block.

This illustrates why an untested backup is not a backup! Some problems, like this one, only show up at the tail end of the process.

Check for problematic XML data

There are three steps to see if (or how badly) you are affected.

  • Database includes XML columns?
  • XML data includes <!DOCTYPE>
  • XML data includes fragments

Database includes XML

The following query will check the active database for any columns with the XML data type.
If this query does not return any rows, you do not have XML columns in this database,
and don’t have to worry about this bug.

SELECT table_schema, table_name, column_name
    FROM information_schema.columns 
    WHERE data_type = 'xml';

┌──────────────┬────────────┬─────────────┐
│ table_schema │ table_name │ column_name │
├──────────────┼────────────┼─────────────┤
│ public       │ xml_doc    │ data        │
└──────────────┴────────────┴─────────────┘
(1 row)

If rows are returned, make note of the table(s) and column(s), those will be used in the next two queries.

XML includes <!DOCTYPE>

If the prior query returned data, you need to check those columns for XML data including <!DOCTYPE> tags.
The following query shows a rough way to check for this by first
CASTing the XML column (named data) to TEXT and using the LIKE operator.

A count greater than zero (0) indicates your database has data including the DOCTYPE declaration.
This means, at minimum, you need to use the workaround discussed below in order to use pg_dump on that
database. If the count is zero, you are not affected by this bug.

SELECT COUNT(*)
    FROM public.xml_doc
    WHERE data::TEXT LIKE '%<!DOCTYPE%';

┌───────┐
│ count │
├───────┤
│     1 │
└───────┘
(1 row)

Using pg_dumpall means these considerations apply instance-wide, not just to a single database.

Do you have XML fragments too?

If you have XML data in your PostgreSQL database including DOCTYPE, you must also check for XML
fragment data. The reason why is discussed later.

SELECT COUNT(*)
        FROM public.xml_doc
        WHERE data IS NOT DOCUMENT;

If you have a count greater than zero on this query, and also found XML data with DOCTYPE, you have a big problem with pg_dump.
You have bug #15342.
We’ll come back to this after discussing the workaround if you only have DOCTYPE XML, no fragments.
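When the first query returns more than a handful of XML columns, writing the two follow-up checks by hand gets tedious. The Python sketch below generates both check queries for one column. It only builds SQL text and does not connect to a database. The function names are hypothetical, and identifiers are double-quoted following standard SQL quoting rules.

```python
from typing import Dict


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_checks(schema: str, table: str, column: str) -> Dict[str, str]:
    """Build the DOCTYPE check and the fragment check for one XML column."""
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    col = quote_ident(column)
    return {
        "doctype": (
            f"SELECT COUNT(*) FROM {target} "
            f"WHERE {col}::TEXT LIKE '%<!DOCTYPE%';"
        ),
        "fragments": (
            f"SELECT COUNT(*) FROM {target} "
            f"WHERE {col} IS NOT DOCUMENT;"
        ),
    }


# Values as returned by the information_schema query for the example table.
checks = build_checks("public", "xml_doc", "data")
print(checks["doctype"])
print(checks["fragments"])
```

The output can be pasted into psql, one pair of queries per row returned by the information_schema query.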

Workaround for XML with DOCTYPE, but no fragments

If you have XML data with DOCTYPE included in the data (QGIS styles, for example) and no data that is just
XML fragments, the workaround is
to set an option at the top of the script: SET XML OPTION DOCUMENT;
This workaround will work, much of the time (provided no XML fragments…).
Experienced PostGIS / QGIS users have known about this issue and the workaround for quite a while;
it was logged as
a QGIS bug back in 2014
that resulted in the QGIS team adding a note about this to
their user manual:

«If you want to make a backup of your PostGIS database using the pg_dump and pg_restore commands, and the default layer styles as saved by QGIS fail to restore afterwards, you need to set the XML option to DOCUMENT and the restore will work.»

That means add this:

SET XML OPTION DOCUMENT;

One caveat of this workaround is that you must use the plain SQL format for pg_dump. The custom
format dump file will not work because you are not able to edit the output to include the XML option.

If this workaround works for you, it’s trivial to update your backup commands to start each pg_dump
file with the required command.

echo "SET XML OPTION DOCUMENT;" > ~/tmp/database_with_xml.sql
pg_dump -d database_with_xml >> ~/tmp/database_with_xml.sql
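For plain SQL dump files that already exist, the same line can be added after the fact. The following Python sketch prepends the statement to an existing file. It reads the whole file into memory, so it is only reasonable for small dumps, and the function name and file paths are hypothetical.

```python
from pathlib import Path

XML_OPTION_LINE = "SET XML OPTION DOCUMENT;"


def prepend_xml_option(dump_path: Path) -> bool:
    """Prepend the XML option statement to a plain SQL dump file.

    Returns True if the file was modified, False if the statement was
    already the first thing in the file.
    """
    text = dump_path.read_text(encoding="utf-8")
    if text.startswith(XML_OPTION_LINE):
        return False
    dump_path.write_text(XML_OPTION_LINE + "\n" + text, encoding="utf-8")
    return True


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever the plain SQL dump was written.
    dump_file = Path.home() / "tmp" / "database_with_xml.sql"
    if dump_file.exists():
        print("modified" if prepend_xml_option(dump_file) else "already set")
```

For large dumps, the shell approach above is preferable because it never holds the dump in memory.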

PostgreSQL and XML, and Bug #15342

Before going on to the final part of this bug, let’s understand a bit more about
how PostgreSQL supports XML. I’ve been lucky enough to mostly avoid XML data, so I had to dive into
the PostgreSQL docs to learn more
about how XML is handled.

«The xml type can store well-formed “documents”, as defined by the XML standard, as well as “content” fragments»

This introduces concepts of XML document and fragments. I had already shown you this concept
above in SQL code with IS DOCUMENT and IS NOT DOCUMENT in the queries.

Now refer back to the workaround above. That setting overrides the default setting of CONTENT
with the setting DOCUMENT, though this next statement from the
docs seems to indicate the workaround should not be required (emphasis mine):

«The default is CONTENT, so all forms of XML data are allowed.»

Wait, if all forms of XML data are allowed in CONTENT as this states, why do we have to use a
workaround to override that setting? There is a note further down in the PostgreSQL docs that seems to explain this:

«With the default XML option setting, you cannot directly cast character strings to type xml if they contain a document type declaration, because the definition of XML content fragment does not accept them. If you need to do that, either use XMLPARSE or change the XML option.«

DOCUMENT setting and Fragments

The above workaround of adding SET XML OPTION DOCUMENT; was so easy, right? The problem
is that when you set this option, XML fragments (accepted by the default CONTENT setting) are rejected
because they aren't documents. Revisiting our query from the beginning, notice that xml_document is false for the Content fragment row.

SELECT notes, data IS DOCUMENT AS xml_document, data
    FROM public.xml_doc;
┌────────────────────────┬──────────────┬──────────────────────────────────────────────────────────────────────────────────────────┐
│         notes          │ xml_document │                                           data                                           │
╞════════════════════════╪══════════════╪══════════════════════════════════════════════════════════════════════════════════════════╡
│ Document, no DOCTYPE   │ t            │ <?xml version="1.0" standalone="yes"?><gazonk name="val" num="2"><qux>foo</qux></gazonk> │
│ Document, with DOCTYPE │ t            │ <?xml version="1.0" standalone="no"?>                                                   ↵│
│                        │              │                 <!DOCTYPE document SYSTEM "subjects.dtd">                               ↵│
│                        │              │                 <document>                                                              ↵│
│                        │              │                   <title>Subjects available in Mechanical Engineering.</title>          ↵│
│                        │              │                   <subjectID>2.303</subjectID>                                          ↵│
│                        │              │                 </document>                                                              │
│ Content fragment       │ f            │ abc<foo>bar</foo><bar>foo</bar>                                                          │
└────────────────────────┴──────────────┴──────────────────────────────────────────────────────────────────────────────────────────┘
(3 rows)

Trying to restore the database dump now fails on the record that is an XML fragment, instead of failing
on the record with the <!DOCTYPE>.

ERROR:  invalid XML document
DETAIL:  line 1: Start tag expected, '<' not found
abc<foo>bar</foo><bar>foo</bar>
^
CONTEXT:  COPY xml_doc, line 3, column data: "abc<foo>bar</foo><bar>foo</bar>"

This illustrates the real problem. Without the workaround, it’s impossible to restore XML
data that uses <!DOCTYPE>.
With the workaround, XML fragments will fail to restore. If you have both…

If your PostgreSQL databases have XML fragments and XML with <!DOCTYPE>, pg_dump will not restore properly.
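
To find out which of these cases a database is in, one could classify the stored XML values before relying on the workaround. The following is a minimal sketch in Python and is not part of the original post. The function names classify_xml and restore_plan are hypothetical, and Python's standard-library parser only approximates PostgreSQL's IS DOCUMENT check: anything that fails to parse as a single-rooted document is treated as a fragment, which would also include malformed XML.

```python
import xml.etree.ElementTree as ET


def classify_xml(text: str) -> str:
    """Return 'doctype', 'document', or 'fragment' for one XML value."""
    try:
        # A well-formed document has exactly one root element.
        ET.fromstring(text.encode("utf-8"))
    except ET.ParseError:
        return "fragment"
    # Simple substring heuristic; assumes <!DOCTYPE does not appear in data.
    return "doctype" if "<!DOCTYPE" in text else "document"


def restore_plan(values) -> str:
    """Suggest which XML option a plain-SQL restore would need."""
    kinds = {classify_xml(v) for v in values}
    if {"doctype", "fragment"} <= kinds:
        return "conflict: neither XML option restores every row"
    if "doctype" in kinds:
        return "SET XML OPTION DOCUMENT"
    return "default CONTENT"


rows = [
    '<?xml version="1.0" standalone="yes"?>'
    '<gazonk name="val" num="2"><qux>foo</qux></gazonk>',
    '<?xml version="1.0" standalone="no"?>\n'
    '<!DOCTYPE document SYSTEM "subjects.dtd">\n'
    "<document><title>Subjects</title></document>",
    "abc<foo>bar</foo><bar>foo</bar>",
]
print([classify_xml(r) for r in rows])
print(restore_plan(rows))
```

In practice the values would come from a query against the xml columns rather than a hard-coded list.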

What to do?

If you have a database that includes XML data making pg_dump unusable, what can you do?

  • Implement more reliable backups
  • Evaluate and standardize your XML data
  • Split databases / instances

Backups

While pg_dump is a great utility for specific purposes, it should not be relied upon
as your main backup process. A tool such as pgbackrest or barman is far better suited for that task.
We use pgbackrest at RustProof Labs, and those backups do not suffer from the problem
that pg_dump does in this instance. That discussion is far beyond the scope of this post.

If you currently are scripting database backups using pg_dump or pg_dumpall, it’s worth the
effort to invest in a more reliable backup solution.

Evaluate and standardize

If your database has both XML data with DOCTYPE and XML fragments, evaluate whether one of the two
kinds can be converted to the other, either to a more formal or a less formal XML format. Whether you can do
this depends heavily on your data, systems, users, and software.

For example, our PostGIS databases are used regularly from QGIS, and I love saving our styles directly
in the database. Because of this, and the fact that QGIS’s style XML includes <!DOCTYPE>, that means having
XML fragments would cause a serious problem. Luckily, this is the only XML our databases contain and I don’t
see any new XML sources coming in anytime soon. In our case, the workaround works.

Split data

If you have both <!DOCTYPE> and XML fragments, and can’t convert one to the other, try to split the
different data out into different databases at minimum, but different PostgreSQL instances would be preferred.
Without splitting that data into different instances, pg_dumpall would be affected even if the XML sources
are in different databases.

Summary

If a PostgreSQL database stores a mix of XML data, specifically including records with
<!DOCTYPE> blocks, the pg_dump utility can generate invalid database dump files.
There is a workaround, but it only works if the database does not include any XML fragments.
If you have both <!DOCTYPE> and fragments, pg_dump is going to have a bad time.

I don’t have a good idea of how many PostgreSQL databases this affects. The PostGIS
community has a large number of users, and many of those databases support QGIS
users storing layers in the public.layer_styles table.
The workaround should work for many of those users, unless their database also stores XML fragments.

Hopefully this bug is fixed at some point. I will try to update this post to reflect the status
when that happens.


By Ryan Lambert
Published August 25, 2018
Last Updated November 01, 2022

Basically double-quote all attribute values.

You can read about it in this tutorial.

I used an on-line xml validator and this is the result:

The value following «version» in the XML declaration must be a quoted string.

Change:

<?xml version=1.0 encoding=UTF-8?>

to:

<?xml version="1.0" encoding="UTF-8"?>

Open quote is expected for attribute «xmlns:dpid» associated with an element type «dpid:DpidDatabase».

Change:

<dpid:DpidDatabase xmlns:dpid=http://ddex.net/xml/dpid/11 xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance xsi:schemaLocation=http://ddex.net/xml/dpid/11 http://ddex.net/xml/dpid/11/dpid.xsd>

to:

<dpid:DpidDatabase xmlns:dpid="http://ddex.net/xml/dpid/11" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://ddex.net/xml/dpid/11 http://ddex.net/xml/dpid/11/dpid.xsd">

Open quote is expected for attribute «SequenceNumber» associated with an element type «DpidOwner».

Change all occurrences of:

<DpidOwner SequenceNumber=XX>

so that the SequenceNumber value is double-quoted, for example:

<DpidOwner SequenceNumber="1">

You can check the result in this db<>fiddle:

sequencenumber | dpid | companyname | address
:--- | :--- | :--- | :---
<DpidOwner SequenceNumber="1"><br>  <DPID>PADPIDA2006111001O</DPID><br>  <CompanyName>234AG</CompanyName><br>  <Address>Riedtlistrasse 23, Zürich, 8006, CH</Address><br> </DpidOwner> | <DPID>PADPIDA2006111001O</DPID> | <CompanyName>234AG</CompanyName> | <Address>Riedtlistrasse 23, Zürich, 8006, CH</Address>
<DpidOwner SequenceNumber="2"><br>  <DPID>PADPIDA2007011501Q</DPID><br>  <CompanyName>OpenIMP</CompanyName><br>  <Address>8-10 Rhoda Street, London, E2 7EF, UK</Address><br> </DpidOwner> | <DPID>PADPIDA2007011501Q</DPID> | <CompanyName>OpenIMP</CompanyName> | <Address>8-10 Rhoda Street, London, E2 7EF, UK</Address>
<DpidOwner SequenceNumber="3"><br>  <DPID>PADPIDA2007040501K</DPID><br>  <CompanyName>The Harry Fox Agency</CompanyName><br>  <Address>711 Third Avenue, 8th Floor, New York, 10017, USA</Address><br> </DpidOwner> | <DPID>PADPIDA2007040501K</DPID> | <CompanyName>The Harry Fox Agency</CompanyName> | <Address>711 Third Avenue, 8th Floor, New York, 10017, USA</Address>
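
The validator check described in this answer can also be run locally. The following is a minimal sketch in Python using only the standard library. The function name first_xml_error is hypothetical, and the exact wording of the parser's error message depends on the underlying expat library, so no particular message is assumed here.

```python
import xml.etree.ElementTree as ET


def first_xml_error(text: str):
    """Return None if text parses as an XML document, else the error message."""
    try:
        ET.fromstring(text.encode("utf-8"))
    except ET.ParseError as exc:
        return str(exc)
    return None


# Attribute values without quotes, as in the failing input.
unquoted = (
    "<?xml version=1.0 encoding=UTF-8?>"
    "<DpidOwner SequenceNumber=1></DpidOwner>"
)
# The same content with every attribute value double-quoted.
quoted = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<DpidOwner SequenceNumber="1"></DpidOwner>'
)

print(first_xml_error(unquoted))  # a parser error message
print(first_xml_error(quoted))    # None
```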

erleug, 21.01.2020, 16:45 (#1)

I have files that I would like to import into a database. There is a table Test with a column of type xml, and I am trying to import into it:

COPY
  public."Test"
(
  "TestXML"
)
FROM 'D:file.xml';

This raised an error:

ERROR:  could not open file "D:file.xml" for reading: No such file or directory

I understood that the problem was most likely an incorrect path. The fix was to escape the backslashes in the file path:

COPY
  public."Test"
(
  "TestXML"
)
FROM E'D:\\file.xml';

Now the file is read, but there is a different problem. Reading the file fails with:

ERROR:  invalid XML content
CONTEXT:  COPY Test, line 1, column TestXML: "<?xml version="1.0" encoding="UTF-8"?>"

I think the problem is the line break, so I added this to the script:

DELIMITER '\n'

which fails with:

ERROR:  COPY delimiter must be a single one-byte character

I also tried:

WITH DELIMITER AS E'\n'

and that fails as well:

ERROR:  COPY delimiter cannot be newline or carriage return

The documentation says that COPY FROM recognizes escape sequences such as \n and \r, so it should accept this delimiter. Can you tell me what the mistake is?



Reply #2, 21.01.2020, 21:47

a) DELIMITER will not help you.
b) If it reports invalid content, then something really is wrong somewhere. Look at the file contents in a hex editor; some strange characters may have crept in (as far as I recall, a BOM is not recognized either).
c) To copy into the table, you will have to remove the line endings from the file.

It also never hurts to ALWAYS state the PG version you are running these experiments on.

Added 8 minutes later
To be more precise about point c):

A line ending is the record separator (for reference, DELIMITER is the field separator).
COPY loads data into tables as records, each of which may consist of fields.
You are loading into a single-column table, so one line is one record.
Consequently, if an XML element (with its content) does not fit on one line, there will be a parse error.

To illustrate, this fails:

<?xml version="1.0" encoding="UTF-8"?>
<element>
<subelement attr="XYZ"><item>ABC</item></subelement></element>
<element><subelement attr="XYZ"><item>ABC</item></subelement></element>
<element><subelement attr="XYZ"><item>ABC</item></subelement></element>

and this works:

<?xml version="1.0" encoding="UTF-8"?>
<element><subelement attr="XYZ"><item>ABC</item></subelement></element>
<element><subelement attr="XYZ"><item>ABC</item></subelement></element>
<element><subelement attr="XYZ"><item>ABC</item></subelement></element>



erleug, 22.01.2020, 11:45 (#3)

So if my file is structured like this:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
      <step>Manufacturing step 1 at this work center</step>
      <step>Manufacturing step 2 at this work center</step>
</Root>

would the correct version be this?

<?xml version="1.0" encoding="UTF-8"?>
<Root><step>Manufacturing step 1 at this work center</step><step>Manufacturing step 2 at this work center</step></Root>

Added 34 minutes later
Anyway, the import works with this file structure:

<Root><step>Manufacturing step 1 at this work center</step><step>Manufacturing step 2 at this work center</step></Root>
<Root2><step>Manufacturing step 1 at this work center</step><step>Manufacturing step 2 at this work center</step></Root2>

When I add the line with the file's encoding and version:

<?xml version="1.0" encoding="UTF-8"?>
<Root><step>Manufacturing step 1 at this work center</step><step>Manufacturing step 2 at this work center</step></Root>
<Root2><step>Manufacturing step 1 at this work center</step><step>Manufacturing step 2 at this work center</step></Root2>

it throws an error:

ERROR:  invalid XML content
CONTEXT:  COPY Test, line 1, column TestXML: "<?xml version="1.0" encoding="UTF-8"?>"

I went through the file with a hex editor and there are no extra characters in it. What could be wrong in that case?



Reply #4, 22.01.2020, 20:15

erleug wrote: What could be wrong in that case?

Try checking it with a third-party parser. In the PG code, invalid XML content is returned when libxml2 returns NULL or an error while parsing the text. So something really is wrong.
Nothing else comes to mind.
Upload the file here or to a file-sharing site and I will take a look.



erleug, 23.01.2020, 10:50 (#5)

I uploaded my file to a file-sharing site; the forum would not let me attach it here.



Reply #6, 23.01.2020, 21:59

Everything loads both with and without the <?xml ?> header (mind the trailing line ending, which produces an empty record).
Tell us the PG version number and show step by step what you are doing and how.



erleug, 24.01.2020, 12:41 (#7)

SQL Manager Lite for PostgreSQL 5.5.1 and a PostgreSQL 9.4 server.

I run this query:

COPY
  public."Test"
(
  "TestXML"
)
FROM E'D:\\file.xml';

The file is on the same disk as the server.
An error is raised:

ERROR:  invalid XML content
CONTEXT:  COPY Test, line 1, column TestXML: "<?xml version="1.0" encoding="UTF-8"?>"

I do nothing else.



Reply #8, 24.01.2020, 18:24 (marked by erleug as the solution)

erleug wrote: PostgreSQL 9.4.

Well, you should have said right away that you are a fan of antiques.
Either remove the XML header entirely,
or pull the line with the first element up onto the same line as the header.

I no longer remember what was changed in the COPY logic and which bugs were closed,
but in 9.4 this is the only way.

<?xml version="1.0" encoding="UTF-8"?><element><subelement attr="XYZ"><item>ABC</item></subelement></element>
<element><subelement attr="XYZ"><item>ABC</item></subelement></element>
<element><subelement attr="XYZ"><item>ABC</item></subelement></element>

In your example, the line break tells COPY that the data for the record has ended.
XML consisting only of the header declaration is not considered valid,
hence the expected invalid XML content error.



erleug, 27.01.2020, 12:45 (#9)

Thanks. And in newer versions, as I understand it, everything should work fine?



Reply #10, 27.01.2020, 23:33

I was wrong after all. I looked at the code again more carefully.
Since version 9.6 for certain (I did not look further back), an "empty XML" is allowed.
That is, the header may be present with no content. Formally this is invalid XML (classic parsers and checkers should not accept it), but PG allows it. I do not know for what reasons.
As a result:
# on 9.4 this is invalid XML

psql (9.4.25)
Type "help" for help.

postgres=# select '<?xml version="1.0" ?>'::xml;
ERROR:  invalid XML content
LINE 1: select '<?xml version="1.0" ?>'::xml;
               ^

# newer versions produce an empty XML value

psql (12.1)
Type "help" for help.

test=> select '<?xml version="1.0" ?>'::xml;
 xml 
-----
 
(1 row)

test=>

Which of these behaviors suits you better or worse, I cannot say.

From the standpoint of loading via COPY (each line is one record), I would remove the XML header. It plays no role when loading through COPY anyway.
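
The advice in this thread (drop the XML header and keep each document on one line) can be sketched as a small preprocessing step. This is a minimal illustration in Python, not something posted in the thread. The function name xml_to_copy_line is hypothetical, and the sketch assumes that whitespace at line boundaries is insignificant in the data. It also escapes backslashes and tabs, because COPY's text format treats a backslash as the escape character and a tab as the default column delimiter.

```python
import re

# Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
XML_DECL_RE = re.compile(r"^\s*<\?xml[^>]*\?>\s*")


def xml_to_copy_line(text: str) -> str:
    """Flatten one XML document into a single line of COPY text input."""
    # Remove the XML declaration, which plays no role when loading via COPY.
    text = XML_DECL_RE.sub("", text, count=1)
    # Join the lines, dropping indentation and line endings.
    line = "".join(part.strip() for part in text.splitlines())
    # Escape backslashes first, then tabs, as COPY text format expects.
    return line.replace("\\", "\\\\").replace("\t", "\\t")


source = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<Root>\n"
    "      <step>Manufacturing step 1 at this work center</step>\n"
    "      <step>Manufacturing step 2 at this work center</step>\n"
    "</Root>\n"
)
print(xml_to_copy_line(source))
```

Each call produces one record, so a file holding several documents would be written as one such line per document.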




Hello, forum members.

I need to load the data contained in a large XML file into a table. I created a table with a single column of type xml and then tried to run the following:

COPY xmltest FROM '/home/alexey/projects/test/ProductData.xml';

I got an error:

[2200N] ERROR: invalid XML content
Detail: line 1: XML declaration allowed only at the start of the document <?xml version="1.0" encoding="utf-8"?>

I thought Postgres did not like the encoding, so I checked the file with:

file ProductData.xml

The output was:

ProductData.xml: XML 1.0 document, UTF-8 Unicode (with BOM) text, with CRLF line terminators

I removed the BOM with:

sed -i '1s/^\xEF\xBB\xBF//' orig.txt

After that, file shows:

ProductData.xml: XML 1.0 document, UTF-8 Unicode text, with CRLF line terminators

I tried the import again with the same statement:

COPY xmltest FROM '/home/alexey/projects/test/ProductData.xml';

and got:

[2200N] ERROR: invalid XML content
Detail: line 1: Premature end of data in tag ProductData line 1
g/2001/XMLSchema" xsi:noNamespaceSchemaLocation="Productgegevens_insbou003.xsd">

Please help.

UPD: Figured it out. It turns out the line breaks and tab characters had to be removed.
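
The BOM and CRLF findings reported by the file command above can also be handled when reading the file in a script. This is a minimal sketch in Python, assuming the file is UTF-8 as reported. The function name read_xml_text is hypothetical. It relies on the standard "utf-8-sig" codec, which decodes UTF-8 and drops a leading byte order mark if one is present.

```python
def read_xml_text(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping a leading BOM and normalizing CRLF."""
    # "utf-8-sig" strips the EF BB BF byte order mark when it is present.
    text = data.decode("utf-8-sig")
    # Convert Windows line endings to plain newlines.
    return text.replace("\r\n", "\n")


raw = b'\xef\xbb\xbf<?xml version="1.0" encoding="utf-8"?>\r\n<ProductData/>\r\n'
print(repr(read_xml_text(raw)))
```

The line breaks and tabs still have to be removed afterwards, as the update above notes, before the text is suitable for COPY.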
