Exporting a document from Excel to XML. Creating and editing an XML file in Excel

While developing an electronic document management system, I had to implement data export in popular formats, in particular Microsoft Excel. The export requirements were quite simple: export data with a minimum of formatting, i.e. no merged cells, no playing with fonts, etc. The export formats were XLSX and Excel XML.

In this article I will cover Excel XML.

In any system that operates on tabular data, the need to export that data arises sooner or later. The purposes of export vary, but the export class had to meet two main requirements:

A set of functions in the class for writing cell values and rows. This is the main requirement: it implies functions that write cell values of specified types, plus the ability to write a finished row to the file.

The ability to work with an unlimited amount of data. The export class itself cannot be responsible for the volume being written, of course, but it should provide functions for writing data to disk and freeing RAM for the next portion of data.

In addition to the described requirements, it was necessary to add service functions:

  • Enabling AutoFilter
  • Compressing the file into a zip archive

Implementation

First of all, when an instance of the class is created, I check the target file name and request the number of columns and rows. The file must have a valid name, and the folder in which it will be saved must exist. Everything is as usual.
The Excel XML format allows information about the user who created the file to be stored in it, so when writing the header I record the name of the organization, information about the user and the date the file was created.

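    public function writeDocumentProperties($organization = null, $user = null)
    {
        // Element names restored from the standard SpreadsheetML DocumentProperties block.
        fwrite($this->file, '<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">');
        if (!is_null($user)) {
            fwrite($this->file, "<Author>".$user->description."</Author>");
            fwrite($this->file, "<LastAuthor>".$user->description."</LastAuthor>");
        }
        // Creation and last-saved timestamps are both set to the current time.
        $dt = new DateTime();
        $dt_string = $dt->format("Y-m-d\TH:i:s\Z");
        fwrite($this->file, "<Created>".$dt_string."</Created>");
        fwrite($this->file, "<LastSaved>".$dt_string."</LastSaved>");
        if (!is_null($organization)) {
            fwrite($this->file, "<Company>".$organization->name."</Company>");
        }
        fwrite($this->file, "<Version>12.00</Version>");
        fwrite($this->file, "</DocumentProperties>");
    }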
Admittedly, this is the function that uses the entities of the document management system: organization and user. Replacing these entities with, say, string values is not a problem.

The most interesting part of the header is the style information. Styles are implemented very conveniently in the Excel XML format, so I simply create a table of styles for strings, date/time values and hyperlinks.

    public function writeStyles()
    {
        fwrite($this->file, "");
        //default style
        fwrite($this->file, "");
        //Datetime style
        fwrite($this->file, "");
        fwrite($this->file, "");
        fwrite($this->file, "");
        //Hyperlink style
        fwrite($this->file, "");
        //Bold
        fwrite($this->file, "");
        fwrite($this->file, "");
    }
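To make the idea of a style table concrete, here is a minimal sketch of what such a block could look like in SpreadsheetML. The function name buildStyles() and the style IDs (sDateTime, sDate, sTime, sLink, sBold) are illustrative assumptions, not the values used by the original class.

```php
<?php
// Hypothetical style table: the IDs below are illustrative assumptions,
// not the original class's values.
function buildStyles(): string
{
    $styles = [
        '<Style ss:ID="Default" ss:Name="Normal"/>',
        '<Style ss:ID="sDateTime"><NumberFormat ss:Format="General Date"/></Style>',
        '<Style ss:ID="sDate"><NumberFormat ss:Format="Short Date"/></Style>',
        '<Style ss:ID="sTime"><NumberFormat ss:Format="Short Time"/></Style>',
        '<Style ss:ID="sLink"><Font ss:Color="#0000FF" ss:Underline="Single"/></Style>',
        '<Style ss:ID="sBold"><Font ss:Bold="1"/></Style>',
    ];
    // All styles live inside a single <Styles> element in the workbook header.
    return '<Styles>' . implode('', $styles) . '</Styles>';
}

echo buildStyles(), PHP_EOL;
```

A cell then refers to a style by its ID through the ss:StyleID attribute, so the style definitions are written once and reused by every row.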

With the preparatory work done, you can proceed to writing data. Opening a worksheet takes just a couple of tags; this is the moment when the number of columns and rows is used.

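    public function openWorksheet()
    {
        // Markup restored from the SpreadsheetML format; the sheet name is an assumption.
        fwrite($this->file, '<Worksheet ss:Name="Sheet1">');
        fwrite($this->file, strtr(
            '<Table ss:ExpandedColumnCount="{col_count}" ss:ExpandedRowCount="{row_count}">',
            array("{col_count}" => $this->colCount, "{row_count}" => $this->rowCount)
        ));
    }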
Writing rows is a more interesting process. The class must work quickly and process an unlimited amount of data, because there can be a hundred thousand or even a million records! If you want speed, work in memory; if you want unlimited data, work with the disk. To reconcile these requirements, I implemented the resetRow and flushRow functions.
The first clears the current row so that it can be filled with data again, and the second writes the current row to the open file on disk. Using them together lets you balance speed against the amount of memory used.

    public function resetRow()
    {
        // Start a new, empty row buffer.
        $this->currentRow = array();
    }

    public function flushRow()
    {
        // Write the buffered row to disk and release its memory.
        fwrite($this->file, implode("", $this->currentRow));
        unset($this->currentRow);
    }

Each cell is written by a function that corresponds to its data type, named appendCellxxx, where xxx is the data type. Valid data types are Num, String, Real, DateTime, Date, Time and Link. Here is an example of a function for writing a numeric value:

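    public function appendCellNum($value)
    {
        // Append the cell to the current row buffer (markup restored from the SpreadsheetML format).
        $this->currentRow[] = '<Cell><Data ss:Type="Number">'.$value.'</Data></Cell>';
    }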
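The other appendCellxxx functions follow the same pattern. As a sketch of the String variant, the example below also escapes the value, since text containing & or < would otherwise break the XML. The free function cellString() is a hypothetical stand-in, not a method of the original class.

```php
<?php
// Sketch of a string-cell writer; cellString() is a hypothetical free function,
// not a method of the original class.
function cellString(string $value): string
{
    // Escape &, <, > and quotes so the value cannot break the XML markup.
    $escaped = htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8');
    return '<Cell><Data ss:Type="String">' . $escaped . '</Data></Cell>';
}

// Collect cells in an array, the same way the current row buffer is filled.
$row = [];
$row[] = cellString('Smith & Sons <Ltd>');
echo implode('', $row), PHP_EOL;
```

Numeric cells need no escaping, which is one reason to keep a separate function per data type.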
After recording all the data, all that remains is to close the worksheet and workbook.
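Closing the worksheet is also the natural place to enable the AutoFilter mentioned among the service functions. A minimal sketch follows, assuming a hypothetical helper closeWorksheetMarkup() and assuming that the x: namespace prefix (urn:schemas-microsoft-com:office:excel) is declared on the Workbook element.

```php
<?php
// Hypothetical helper: builds the markup that ends a worksheet, optionally
// with an AutoFilter covering the header row and all data rows.
function closeWorksheetMarkup(int $colCount, int $rowCount, bool $autoFilter): string
{
    $xml = '</Table>';
    if ($autoFilter) {
        // R1C1 notation: from row 1, column 1 to the last row and last column.
        $range = sprintf('R1C1:R%dC%d', $rowCount, $colCount);
        $xml .= '<AutoFilter x:Range="' . $range
              . '" xmlns="urn:schemas-microsoft-com:office:excel"/>';
    }
    return $xml . '</Worksheet>';
}

// The workbook is closed right after its last worksheet.
echo closeWorksheetMarkup(9, 1627, true), '</Workbook>', PHP_EOL;
```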

Application

The class is used to export data obtained through the CArrayDataProvider provider. However, since the volume of exported data may be very large, a special iterator, CDataProviderIterator, is used; it fetches the returned data in batches of 100 records (you can specify a different batch size).

    public function exportExcelXML($organization, $user, &$filename)
    {
        $this->_provider = new CArrayDataProvider(/*query*/);

        Yii::import("ext.AlxdExportExcelXML.AlxdExportExcelXML");
        $export = new AlxdExportExcelXML($filename, count($this->_attributes), $this->_provider->getTotalItemCount() + 1);
        $export->openWriter();
        $export->openWorkbook();
        $export->writeDocumentProperties($organization, $user);
        $export->writeStyles();
        $export->openWorksheet();

        //title row
        $export->resetRow();
        $export->openRow(true);
        foreach ($this->_attributes as $code => $format) {
            $export->appendCellString($this->_objectref->getAttributeLabel($code));
        }
        $export->closeRow();
        $export->flushRow();

        //data rows, fetched 100 records at a time
        $rows = new CDataProviderIterator($this->_provider, 100);
        foreach ($rows as $row) {
            $export->resetRow();
            $export->openRow();
            foreach ($this->_attributes as $code => $format) {
                switch ($format->type) {
                    case "Num":
                        $export->appendCellNum($row[$code]);
                        break;
                    /*other types*/
                    default:
                        $export->appendCellString("");
                }
            }
            $export->closeRow();
            $export->flushRow();
        }

        //close all
        $export->closeWorksheet();
        $export->closeWorkbook();
        $export->closeWriter();

        //zip file
        $export->zip();
        $filename = $export->getZipFullFileName();
    }
In my case, each row is written to disk, which is quite acceptable for now, but may require changes in the future. For example, it would be wise to save not every row, but every ten or even a hundred rows at a time. Then the export speed will increase.
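The batching idea can be sketched as follows. This is only an illustration under stated assumptions: BufferedRowWriter and its methods are hypothetical names rather than part of the original class, and an in-memory stream stands in for the export file.

```php
<?php
// Minimal sketch of batched flushing. BufferedRowWriter and its methods are
// hypothetical names, not part of the original export class.
class BufferedRowWriter
{
    private $file;
    private $batchSize;
    private $buffer = [];

    public function __construct($file, int $batchSize = 100)
    {
        $this->file = $file;
        $this->batchSize = $batchSize;
    }

    // Queue one finished row; write to disk only when the batch is full.
    public function addRow(string $rowXml): void
    {
        $this->buffer[] = $rowXml;
        if (count($this->buffer) >= $this->batchSize) {
            $this->flush();
        }
    }

    // Write all buffered rows with a single fwrite and release the memory.
    public function flush(): void
    {
        if ($this->buffer !== []) {
            fwrite($this->file, implode('', $this->buffer));
            $this->buffer = [];
        }
    }
}

$file = fopen('php://memory', 'w+'); // stands in for the real export file
$writer = new BufferedRowWriter($file, 10);
for ($i = 1; $i <= 25; $i++) {
    $writer->addRow('<Row><Cell><Data ss:Type="Number">' . $i . '</Data></Cell></Row>');
}
$writer->flush(); // write the final, partially filled batch
rewind($file);
$written = stream_get_contents($file);
echo substr_count($written, '<Row>'), PHP_EOL; // prints 25
```

The batch size trades memory for fewer write calls; the final explicit flush matters, because the last batch is rarely full.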

Speed

Incidentally, I learned from my own experience how important it is to anticipate large volumes of data in a batch operation such as export.
Initially, I tried to export data using CActiveDataProvider, which took about 240 seconds for 1000 records! Switching the query to CArrayDataProvider reduced the export time for 1000 records to 0.5 seconds!
I measured the export figures specifically for this publication.
I exported 1626 records of 9 attributes each, representing information about closed incidents (see ITSM).
[Figures: initial view of the exported table; export result. Images not available.]
Export figures
Final file size: 1 312 269
Compressed file size: 141 762
Time taken: approx. 0.5 sec

Anyone interested can get the source code of my class for free. Just remember to adjust the writeDocumentProperties function so that it no longer depends on the document management system entities organization and user, or use your own similar entities with the corresponding properties.

After importing XML data, mapping the data to worksheet cells, and making changes to the data, you often need to export or save the data as an XML file.

Export XML data (max. 65,536 lines)

Export XML data (more than 65,536 rows)

    Subtract 65,537 from the total number of rows in the file. Let's denote the result as x.

    Delete x rows from the beginning of the Excel sheet.

    Export the sheet as an XML data file (the previous section describes the procedure).

    Click the Close button, but do not save the sheet. Then open the Excel sheet again.

    Delete everything after the first x rows, and then export the sheet as an XML data file (see the procedure in the previous section).

    This will prevent you from losing the rest of the data. At this point, you have two XML export files that can be combined to create a duplicate of the original sheet.
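The arithmetic of this procedure can be sketched as follows. It is a literal transcription of the steps above; the function name splitExports() and the returned array keys are hypothetical.

```php
<?php
const XML_EXPORT_ROW_LIMIT = 65537; // the figure used in the procedure above

// Returns x and the sheet row ranges covered by each of the two exports.
function splitExports(int $totalRows): array
{
    $x = max(0, $totalRows - XML_EXPORT_ROW_LIMIT);
    return [
        'x' => $x,
        // First export: the first x rows were deleted, the rest remain.
        'first_export' => [$x + 1, $totalRows],
        // Second export: everything after the first x rows was deleted.
        'second_export' => $x > 0 ? [1, $x] : null,
    ];
}

$plan = splitExports(70000);
echo $plan['x'], PHP_EOL; // prints 4463
```

For a 70,000-row sheet the second export therefore holds rows 1 to 4,463 and the first holds rows 4,464 to 70,000.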

Saving XML data in mapped cells in an XML data file

If you need to maintain backward compatibility with earlier versions of the XML functionality, you can save the file as an XML data file rather than using the Export command.

Note: If the worksheet contains titles or labels that differ from the XML element names in the XML map, Excel uses the XML element names when you export or save the XML data.

Common problems when exporting XML data

Messages similar to the following may appear when exporting XML data.

This XML map can be exported, but some required elements are not mapped

This message may appear for the following reasons.

    The XML map associated with this XML table has one or more required elements that are not mapped to it.

    The hierarchical list of elements in the XML Source task pane marks required elements with a red asterisk in the upper right corner of the icon to the left of each element. To map a required element, drag it onto the sheet where you want it to appear.

    The element represents a recursive structure.

    A typical example of a recursive structure is a hierarchy of employees and managers, in which the same XML elements are nested at several levels. Although you may have mapped all elements in the XML Source task pane, Excel does not support recursive structures that are more than one level deep, so it cannot map all elements.

    The XML table contains mixed content.

    Mixed content occurs when an element contains a child element and plain text outside of the child element. This is often the case when formatting tags (such as bold tags) are used to mark data within an element. The child element may be displayed (if supported in Excel), but the text content is lost when the data is imported and is not available when exported, meaning it is not used in either the forward or reverse operation.

Can't export XML maps in a workbook

The XML map will fail to export if the relationships of the mapped element to other elements cannot be preserved. The relationship may not survive for the following reasons.

    The mapped element's schema definition is contained in a sequence with the following attributes:

    • the maxoccurs attribute is not equal to 1;

    • the sequence has more than one direct child element, or has another compositor as a direct child.

    Non-repeating sibling elements with the same repeating parent element are mapped to different XML tables.

    Multiple duplicate elements are mapped to the same XML table, and the repetition is not defined by an ancestor.

    Children of different parent elements are mapped to the same XML table.

Additionally, you cannot export an XML map if it contains one of the following XML schema constructs.

    List of lists. One list of elements contains another list of elements.

    Denormalized data. The XML table contains an element that, according to its definition in the schema, must occur once (the maxoccurs attribute is set to 1). When you add such an element to an XML table, Excel fills the table column with multiple instances of it.

    Choice. A mapped element is part of a <choice> schema construct.

XML files represent data described with tags, or program settings. They cannot be opened for editing with an ordinary double click, because no default application is associated with the extension. But if you want a readable table that can be edited, you can open the XML file in Excel. No format converters are needed in this case. The only caveat is that this feature is available only in Office 2003 and later.

How to open XML in Excel: method one

Let's look at importing data using Excel 2016 as an example. The first and easiest way is to launch Excel first. On startup, instead of a greeting and logo, the application displays a start window with the item “Open Other Workbooks” in the left menu.

Next, choose Browse, and in the new window select XML as the file type to open. Then find the desired file in the usual way and click Open. The file is recognized not as a text document containing descriptions and tags, but as an ordinary table. Naturally, the data can be edited at your discretion, but more on that later.

How to open XML format in Excel: method two

The second method is practically no different from the first. You can open an XML file in Excel from the File menu or with the shortcut Ctrl + O.

Again, first select the file type to open, then find the desired file and click Open.

Opening XML: Method Three

There are several more ways to open XML in Excel. In the 2016 version of the program, you can use the ribbon: select the “Data” tab, and then click the button for getting external data.

In the drop-down menu, select “From Other Sources” and then “From XML Data Import” in the new menu. This is followed by the standard procedure of finding the desired file and opening it.

Editing, saving and exporting

When using any of these methods, the user gets the table structure. Editing is done in the same way as with standard XLS files. Sometimes, for ease of editing and saving data, it is advisable to use the developer menu.

In this case, you can import not all the contents of the XML file, but only what is really necessary, inserting information into the appropriate columns and rows, specifying the XML object as the data source. But to do this, you need to log into your account in the program itself using your registration with Microsoft.

You can save the changed file directly in the original format by selecting the appropriate type from the list. If the object was saved in the “native” Excel format, you can use the export function from the File menu: click Change File Type and set XML as the target format.

If you would rather not do such conversions, or you use a version of Office earlier than 2003, you will need a separate converter to open this format as a table. Many such programs are available. As a last resort, you can turn to specialized online services, which convert the format within a few tens of seconds. After that, all that remains is to download the result in XLS format to your hard drive and open it in Excel. In most cases, however, none of this is required, since Office 2003 and later can open (import) the XML format directly, and few people today use older Microsoft Office products.

Microsoft Excel is a convenient tool for organizing and structuring a wide variety of data. It allows you to process information using different methods and edit data sets.

Let's consider the possibilities of using it to generate and process web application files. Using a specific example, we will study the basics of working with XML in Excel.

How to create an XML file from Excel

XML is a file standard for transmitting data on the Web. Excel supports its export and import.

Let's look at creating an XML file using the example of a production calendar.

  1. Let's make a table from which you need to create an XML file in Excel and fill it with data.
  2. Let's create and insert an XML map with the required document structure.
  3. Export table data to XML format.

We save the file as XML.
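Outside Excel, the same table-to-XML step can be scripted. The sketch below uses PHP's DOMDocument; the sample rows and the element names (calendar, day, date, type) are hypothetical and only loosely modeled on the production calendar example.

```php
<?php
// Hypothetical sample data and element names; not taken from a real calendar.
$rows = [
    ['date' => '2024-01-01', 'type' => 'holiday'],
    ['date' => '2024-01-02', 'type' => 'workday'],
];

// Turns table rows into XML: one $rowName element per row,
// one child element per column.
function rowsToXml(array $rows, string $rootName, string $rowName): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;
    $root = $doc->appendChild($doc->createElement($rootName));
    foreach ($rows as $row) {
        $item = $root->appendChild($doc->createElement($rowName));
        foreach ($row as $column => $value) {
            $child = $doc->createElement($column);
            // A text node escapes special characters such as & automatically.
            $child->appendChild($doc->createTextNode((string) $value));
            $item->appendChild($child);
        }
    }
    return $doc->saveXML();
}

echo rowsToXml($rows, 'calendar', 'day');
```

The column names become element names here, which mirrors how Excel uses the XML element names from the map rather than the worksheet headings.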

Other ways to get XML data (schema):

  1. Download from a database, specialized business application. Schemes can be provided by commercial sites and services. Simple options are publicly available.
  2. Use ready-made samples to test XML maps. The samples contain the main elements and XML structure. Copy and paste into Notepad and save with the desired extension.


How to save an Excel file in XML format

One of the options:

  1. Click the Office button. Select “Save as” - “Other formats”.
  2. We assign a name. Select the save location and file type – XML.

More options:

  1. Download an XLS to XML converter, or find a service that allows you to export the file online.
  2. Download the XML Tools Add-in from the official Microsoft website. It is freely available.
  3. Open a new workbook: Office button – “Open”.

How to open an XML file in Excel

Click OK. You can work with the resulting table as with any Excel file.

How to Convert XML File to Excel

We edit the created table and save it in Excel format.

How to collect data from XML files in Excel

The principle of collecting information from multiple XML files is the same as the principle of transformation. When we import data into Excel, the XML map is transferred at the same time. Other data can be transferred to the same schema.

Each new file will be linked to an existing map. Each element in the table structure corresponds to an element in the map. Only one data binding is allowed.

To configure linking options, open the Map Properties tool from the Developer menu.


Possibilities:

  1. Each new file will be checked by Excel for compliance with the existing map (if we check the box next to this item).
  2. Data may be updated. Or new information will be added to the existing table (makes sense if you need to collect data from similar files).

These are all manual ways to import and export files.

If you need to create an XML data file and an XML schema file from a range of cells in a worksheet, you can use version 1.1 of the XML Tools for Excel 2003 add-in to extend the existing XML capabilities in Microsoft Excel 2007 and later versions.

Note: This add-in was developed for Excel 2003. The documentation and user interface refer to lists, which are called Excel tables in versions of the application later than Excel 2003.

For more information about working with this add-in, see Use the XML Tools add-in version 1.1 for Excel 2003.

Step 2: Convert a range of cells to an XML table

    Enter the data for which you want to create an XML data file and an XML schema file. The data must be in tabular form, in columns and rows (known as flat data).

    On the Add-Ins tab, in the Menu Commands group, click the arrow next to XML Tools, and then click Convert a Range to an XML List.

    In the text box, enter the range of cells with the data you want to convert, as an absolute reference.

    Under The first row contains column names, select No if the first row contains data, or Yes if the first row contains column headers, and then click OK.

    Excel will automatically create the XML schema, link the cells to the schema, and create the XML table.

    Important: If the Visual Basic Editor opens and you see a Visual Basic for Applications (VBA) error message, follow these steps:

      Click the OK button.

      In the highlighted line of the VBA code module, remove "50" from the line. In other words, change:
      XMLDoc As msxml2.DOMDocument50
      To:
      XMLDoc As msxml2.DOMDocument

      Press F5 to find the next line containing the text "XMLDoc As msxml2.DOMDocument50", click OK, and change the line as in the previous step.

      Press F5 again to find and change any other instances of the line.

      If you no longer see the VBA error message after you press F5, close the Visual Basic Editor to return to the workbook. The range of cells will be converted to an XML table.

      Note: To display all XML maps in a workbook, in the tab Developer in the group XML click the button Source to display the XML Source task pane. At the bottom of the XML Source task pane, click XML Maps.

      If the tab Developer is not visible, follow the first three steps in the next section to add it to the Excel ribbon.

Step 3: Export the XML table to an XML data (XML) file

Note: When creating XML maps and exporting data in Excel to XML files, there is a limit to the number of rows that can be exported. When exporting to an XML file from Excel, you can save up to 65,536 rows. If the file contains more than 65,536 rows, Excel will only be able to export the first (number of rows mod 65,537) rows. For example, if a worksheet contains 70,000 rows, Excel exports 4,463 rows (70,000 mod 65,537). We recommend one of the following: 1) use the XLSX format; 2) save the file in "XML Spreadsheet 2003 (*.xml)" format (this will lose the mappings); 3) delete all rows after 65,536 and then export again (this will keep the mappings but lose the rows at the end of the file).
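The row count in this note is simple modular arithmetic and can be checked with a short sketch. The function name exportedRowCount() is hypothetical, and the formula is a literal transcription of the rule stated in the note; it evaluates 70,000 mod 65,537 to 4,463.

```php
<?php
// Literal transcription of the rule in the note above: up to 65,536 rows
// export in full, beyond that only (rows mod 65,537) rows are exported.
function exportedRowCount(int $totalRows): int
{
    return $totalRows <= 65536 ? $totalRows : $totalRows % 65537;
}

echo exportedRowCount(70000), PHP_EOL; // prints 4463
```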

Step 4: Save the XML Schema in an XML Schema (XSD) file

© 2024 ermake.ru -- About PC repair - Information portal