PDMX, Poll & Direct Metadata for SDMX (Excel)
Introduction
The Excel PDMX tool supports the user in retrieving all major SDMX artefacts from the SDMX endpoint of his or her choice (dataflow, datastructure, conceptscheme, codelist and metadatastructure).
See below for more background info on SDMX.
It can easily be used within an office environment as a regular Excel worksheet (with macro enabled).
SDMX artefacts can be collected for two different systems, and optionally compared for any differences.
The results are saved in separate sheets per artefact and system.
PDMX can be used for different uses cases, viewing available cross-domain codelists to reuse, deciding related artefacts to update in case of new versions etc.
In the main sheet (named PDMX) the results are summarized (see figure 1 below for an impression) and the requests can be initiated by selecting the proper button in the Form (to launch by the 'Form'-button).
The retrieval of the artefacts can take some time (especially the datastructures), dependant of the number and network connection.
Currently only JSON formatted content is supported by PDMX.
In the statusbar the status of the retrieval can be followed.
Some supporting functions can be selected as well, to hide or unhide the configuration, to initialize the worksheet, to resize the column widths of all data sheets, to save summarized results to a history sheet.
The results for the different artefacts are checked for their usage as well (to detect any 'orphans' or unused artefacts, see figure 2).
The standard interrelationships of the especially the datastructure are therefore of great importance.
The tool can be used for all standard SDMX-API endpoints (to configure by their url, accept header and REST parameters, see below the options).
Some examples are already configured and applicable. The configuration can be hidden of unhidden by the button in the general form.
It is only needed to enter data in the configuration cells (yellow fields) and to select the endpoint of your choice in the main sheet (in the green or blue cell).
Warning: do not remove or delete any cells in the existing sheets or change any sheet names which may cause the tool to fail!
Usage
After downloading PDMX from the download-page or by this link, you must first enable the macro by trusting this file (at your own risk).
You can rename the file for your own convenience.
The next thing is to add and select your configuration-parameters in the configuration-part of the main sheet (called PDMX), which can be hidden.
With the Form-button all available functions can be disclosed:
- Get DFs: Retrieve all dataflows from system (data is collected in sheet DF1 or DF2)
- Get DSs: Retrieve all datastructures from system (data is collected in sheet DS1 or DS2)
- Get CSs: Retrieve all conceptschemes from system (data is collected in sheet CS1 or CS2)
- Get CLs: Retrieve all codelists from system (data is collected in sheet CL1 or CL2)
- Agency rep.: Produces summary of all retrived artefacts per agency (in sheet AG1 or AG2)
- Check data: Retrieve all data in sheet DT1 or DT2 for all dataflows (in sheet DF1 or DF2)
- Count categ.: Retrieve categories and counts only the number present
- Count constr.: Retrieve constraints and counts only the number present
- Compare DFs: Compare all retrieved dataflows of system1 with system2 (in sheet DFc)
- Compare DSs: Compare all retrieved datastructures of system1 with system2 (in sheet DSc)
- Compare CSs: Compare all retrieved conceptschemes of system1 with system2 (in sheet CSc)
- Compare CLs: Compare all retrieved codelists of system1 with system2 (in sheet CLc)
- Add history: Add current totals of summary in main sheet as a new row in the History sheet
- Unhide config: Unhide (or hide) the configuration data in the main sheet
- Resize sheets: Resize all data sheets to its initial column sizes
- Initialise: Remove all retrieved data from the worksheet (except the configuration)
- Cancel: Cancel function selection with the form menu
Note 1: The first 8 functions can be applied for the 'green' and 'blue' system (or system1 and system2).
Note 2: The right order in retrieving a complete and consistent collection of a system is dataflows, datastructures, conceptschemes and codelist.
Note 3: By hitting the little button next to label System 1 or System 2, the DFs, DSs, CSs and CLs can be retrieved in one go.
Note 4: The Check Data function may take a long time depending on your connection or system endpoint. You can abort this function at any time (the data size is marked with an asterisk as incomplete). And if you start another Excel workbook, it will automatically abort.
The results are displayed in the main PDMX-sheet, as well the url used and start of the reponse retrieved.
The previous results are moved to the next column to the right.
After retrieving the artefacts, you can examine the results in the appropriate sheets.
Artefacts used are counted by their usage (by datastructure), so artefacts not used can easily be recognized.
Older versions of artefacts can be such a reason.
Disclaimer
Please note that you use the PDMX tool at your own risk. Excelmeer is in no way liable for any negative consequences of its use.
Standard SDMX-API parameters
DETAIL-parameter: This attribute specifies the desired amount of information to be returned.
- ALLSTUBS : all artefacts should be returned as stubs, containing only identification information, as well as the artefacts' name.
- REFERENCESTUBS: referenced artefacts should be returned as stubs, containing only identification information, as well as the artefacts' name.
- REFERENCEPARTIAL: referenced item schemes should only include items used by the artefact to be returned. For example, a concept scheme would only contain the concepts used in a DSD, and its ISPARTIAL flag would be set to true. Likewise, if a dataflow has been constrained, then the codelists referenced by the DSD referenced by the dataflow should only contain the codes allowed by the content constraint
- ALLCOMPLETESTUBS: all artefacts should be returned as complete stubs, containing identification information, the artefacts' name, description,annotations and isFinal information.
- REFERENCECOMPLETESTUBS: referenced artefacts should be returned as complete stubs, containing identification information, the artefacts' name, description, annotations and isFinal information.
- FULL (default): all available information for all artefacts should be returned.
REFERENCES-parameter: This attribute instructs the web service to return (or not) the artefacts referenced by the artefact to be returned (for example, the code lists and concepts used by the data structure definition matching the query as well as the artefacts that use the matching artefact (for example, the dataflows that use the data structure definition matching the query).
- NONE (default): no references will be returned.
- PARENTS: the artefacts that use the artefact matching the query
- PARENTSANDSIBLINGS: the artefacts that use the artefact matching the query, as well as the artefacts referenced by these artefacts.
- CHILDREN: artefacts referenced by the artefact to be returned.
- DESCENDANTS: references of references, up to any level, will also be returned.
- ALL: the combination of parentsandsiblings and descendants, in addition, a concrete type of resource may also be used (for example, references=codelist).
More info on the SDMX-API can be found here.
SDMX background
SDMX, which stands for Statistical Data and Metadata eXchange, is an ISO standard designed to describe statistical data and metadata, normalise their exchange, and improve their efficient sharing across statistical and similar organisations.
SDMX is sponsored by eight international organisations including the Bank for International Settlements (BIS), European Central Bank (ECB), Eurostat (Statistical Office of the European Union), International Labour Organization (ILO), International Monetary Fund (IMF), Organisation for Economic Cooperation and Development (OECD), United Nations Statistical Division (UNSD), and World Bank.
The SDMX standard provides an integrated approach to facilitating statistical data and metadata exchange, enabling interoperable implementations within and between systems concerned with the exchange, reporting and dissemination of statistical data and their related meta-information.
More info on SDMX can be found at the SDMX website.
PS. In case you are looking for some customization in PDMX or any other Excel development, please contact Excelmeer.
Figure 1: animated example of main screen
Figure 2: example of retrieved codelists
|