Structured and interoperable beneficial ownership data
Overview
The utility and value of beneficial ownership (BO) data are enhanced when the data is available in a structured format. Structured data refers to information that is highly organised according to a predefined model. Since jurisdictions first started collecting – and, in some cases, publishing – BO information, some have done so as structured data whilst others have done so as unstructured data. Unstructured data does not follow a predefined data model: for example, a reporting person may be free to describe the relationship between a beneficial owner and a legal entity in their own words. Whilst structured data can be produced in non-digital environments, structured data that is available digitally can be more easily read and processed by machines.
The first part of this policy briefing outlines the key benefits of collecting, storing, and publishing structured BO data. Jurisdictions that have published open, structured BO data have enabled a wider range of users to carry out a broader range of data analyses, facilitating early impact of beneficial ownership transparency (BOT) reforms.[1] To maximise the impact of BOT reforms, a disclosure regime should collect, store, and share BO information as structured data. This brings three main benefits: more predictable and usable data, lower costs, and interoperability.
Structuring data creates information that is predictable. Because the structure is predefined, users know what to expect from the data, and this makes it easy to work with. These benefits apply not only to technical users: non-technical users can benefit from structured data without ever having to work with the data directly. Because structured data can be made available in formats that machines can readily process – for example, through a web interface, application programming interface (API), or in bulk format – websites, apps, and other tools can be built on it so that people can access, visualise, and interact with relevant information in a variety of non-technical ways. Structured data can be integrated into both human- and machine-led processes that are either impossible or laborious with unstructured data. For example, making structured BO data available in bulk format allows users such as Financial Intelligence Units, procurement agencies, banks, and journalists to apply data science and machine learning techniques to identify suspicious patterns of ownership, or beneficial owners that appear in other datasets of interest.
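As a minimal sketch of the kind of analysis that bulk structured data makes possible, the Python below counts how many legal entities each beneficial owner is linked to and flags owners at or above a threshold. The records, the field names (`owner_id`, `entity_id`, `share_pct`), and the threshold are all invented for illustration; they do not reflect any real register or the BODS schema, and a high count is not in itself evidence of wrongdoing.

```python
from collections import defaultdict

# Hypothetical bulk extract: one row per declared ownership relationship.
records = [
    {"owner_id": "P-001", "entity_id": "E-100", "share_pct": 60},
    {"owner_id": "P-001", "entity_id": "E-101", "share_pct": 30},
    {"owner_id": "P-001", "entity_id": "E-102", "share_pct": 25},
    {"owner_id": "P-002", "entity_id": "E-100", "share_pct": 40},
]


def owners_of_many_entities(rows, threshold):
    """Return {owner_id: sorted entity ids} for owners linked to at least `threshold` entities."""
    entities_by_owner = defaultdict(set)
    for row in rows:
        entities_by_owner[row["owner_id"]].add(row["entity_id"])
    return {
        owner: sorted(entities)
        for owner, entities in entities_by_owner.items()
        if len(entities) >= threshold
    }


flagged = owners_of_many_entities(records, threshold=3)
print(flagged)
```

Because every row has the same predictable fields, the grouping step is a few lines of code. The same question asked of free-text declarations would first require extracting owners and entities from prose.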
By removing the frictions associated with unstructured data, structured data decreases the cost to governments of collecting data and the cost to legal entities of complying with disclosure requirements. It also reduces the costs associated with maintaining and publishing data. By reducing the costs of use and analysis, structured data makes it cheaper to achieve the policy aims of BOT reforms and increases their impact. Higher up-front costs associated with setting up the required systems are expected to be offset in the long run by lower costs associated with collection, storage, publication, use, and maintenance.
At the heart of structured data is interoperability, that is, being able to readily use the data with other sources, and integrate it into different systems and processes. The transnational nature of complex BO relationships makes combining BO datasets from different jurisdictions essential to gaining full visibility of ownership structures. Meeting the additional policy objectives for which countries pursue BOT – such as improving procurement processes and enforcing sanctions and campaign financing rules – also requires that the information be combined with other datasets. When BO data is structured and interoperable it is also easier to verify, as a greater range of verification mechanisms can be used, thereby improving data quality.
These benefits would be greatest following the wide adoption of a data standard such as Open Ownership (OO)’s Beneficial Ownership Data Standard (BODS).[a] BODS is a framework for publishing structured data about beneficial ownership in a format that can be read and understood by computer systems around the world. BODS has been adopted by both governments and the private sector, and a range of tools and applications have been developed around it.[2]
The second part of this policy briefing highlights what implementers need in order to operationalise structured BO data. To do so, implementers should:
- Create an enabling environment by taking a user-centred and interactive approach, and by establishing and progressively enhancing the legal, regulatory, and political framework to achieve technical goals relating to BOT. This includes ensuring a solid legal and policy foundation in line with the Open Ownership Principles (OO Principles) and providing sufficient resources.[b]
- Establish principles for collecting and storing BO information by ensuring that, at a minimum, structured BO data:
  - identifies the people, companies, and other relevant parties disclosed in a BO declaration by using unique identifiers and sufficient descriptive fields;
  - describes the full range of relationships that can exist between parties disclosed in a BO declaration; and
  - ensures BO disclosures are auditable.
  Implementers should ensure that systems design and business processes underpin the aims of reforms on a technical level. Care should be given at the early stages of implementation to ensure the technical systems and database design meet the full functionality and access expected at the publication and data sharing stages.[c]
- Realise potential and resolve uncertainty at the publication stage. Ensuring published data is auditable by users is necessary to realise the data’s full potential. This can be achieved by making published data available in a range of ways, both for non-technical users and for technical users and systems at scale, such as:
  - per-record search via a web interface;
  - browsing records via a web interface;
  - bulk format;
  - API access.
  Implementers should also decide on an appropriate licence for the data and provide sufficient accompanying documentation in the form of a publication policy, which should aim to resolve any uncertainties over the published data.
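The minimum requirements for collecting and storing data listed above (unique identifiers, the full range of relationships, and auditability) can be sketched as a small data model. This is an illustrative outline only: the class names, field names, and identifier schemes are assumptions made for this example, not the BODS schema or any register's actual design.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Identifier:
    scheme: str  # who issued the identifier (hypothetical scheme names below)
    value: str


@dataclass
class Party:
    kind: str  # "person" or "entity"
    name: str
    identifiers: List[Identifier]


@dataclass
class Relationship:
    owner: Party
    subject: Party
    interest_type: str  # e.g. "shareholding" or "voting-rights"
    share_pct: Optional[float]
    direct: bool


@dataclass
class Declaration:
    declared_on: date
    source: str
    relationships: List[Relationship]
    replaces: Optional["Declaration"] = None  # earlier version, kept for audit


def history(declaration):
    """Return this declaration and every version it superseded, newest first."""
    versions = []
    current = declaration
    while current is not None:
        versions.append(current)
        current = current.replaces
    return versions


person = Party("person", "Example Person", [Identifier("hypothetical-national-id", "X123")])
company = Party("entity", "Example Co", [Identifier("hypothetical-company-register", "0001")])

first = Declaration(
    date(2021, 1, 10),
    "company filing",
    [Relationship(person, company, "shareholding", 30.0, True)],
)
updated = Declaration(
    date(2022, 3, 5),
    "company filing",
    [Relationship(person, company, "shareholding", 45.0, True)],
    replaces=first,
)

dates = [d.declared_on.isoformat() for d in history(updated)]
print(dates)
```

Keeping superseded declarations linked to their replacements, rather than overwriting them, is one simple way to let users see what was declared and when.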
Structured data is a core tenet of the OO Principles, as it ensures data is readily combinable with other data, predictable, and reliable.[3] The OO Principles set the standard for effective BO disclosure and establish approaches for publishing high-quality, useful data. The OO Principles help ensure that published data is usable, accurate, and interoperable.
Figure 1. Example of a beneficial ownership disclosure system using structured data
Structuring BO data improves its functionality, reduces costs across all stages, and leads to greater policy impact. To achieve this, implementers should create an enabling environment, and data should be structured in a way that identifies and describes key elements of beneficial ownership. Digital systems and administrative processes need to fit together smoothly to enable BO information to be collected, stored, maintained, exchanged, and published. Uncertainties should be removed at the sharing and publication stage by adhering to open standards and publishing a clear publication policy, including documentation and licensing information. Data can be made auditable by providing multiple ways to access data. Data standards such as BODS provide a structured data format, along with guidance for collecting, sharing, and using BO data.
Box 1. Key concepts and definitions
In order to understand how structured and interoperable BO data can contribute to meeting policy goals and the necessary policies to facilitate collecting, storing, and sharing structured BO data, it is necessary to explain a number of key concepts. Whilst some of these concepts apply more generally, the core focus is in the context of BO information.
Data is a unit of information, used by machines and people to store and communicate information. Data on its own has no inherent meaning, but acquires meaning when used or viewed in a particular context.
Structured data is data that is highly organised according to a predefined model.[d] It has sufficient content, organisation, and context to be interpretable by machines and to convey meaningful information about beneficial ownership (see Table 1). Structured data can be created in non-digital environments, but in this briefing it refers to digital data.
Machine-readable data is data in a format that can be readily processed by a machine or computer. Machine-readable data must be digital structured data.
Data is interoperable when it can be readily used with other sources of data and integrated into different systems and processes. Interoperable BO data, for example, might use a widely agreed method for describing company numbers, allowing datasets from multiple jurisdictions to be joined together.[4] Interoperable BO data may also be joined together with non-BO datasets, such as contracting data.
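As a rough illustration of interoperability, the sketch below joins BO records from two jurisdictions to a contracting dataset using a shared way of writing company identifiers. The company numbers reuse the hypothetical companies from Table 1; the key format (jurisdiction code plus registration number), the field names, and the contract records are assumptions made for this example.

```python
# Hypothetical BO records drawn from two different registers.
bo_records = [
    {"jurisdiction": "NL", "company_number": "64739564", "beneficial_owner": "Person A"},
    {"jurisdiction": "gb", "company_number": " 396654", "beneficial_owner": "Person A"},
]

# Hypothetical non-BO dataset: public contracts and their suppliers.
contracts = [
    {"contract_id": "C-1", "supplier_jurisdiction": "NL", "supplier_number": "64739564"},
    {"contract_id": "C-2", "supplier_jurisdiction": "GB", "supplier_number": "999999"},
]


def company_key(jurisdiction, number):
    """Normalise to one comparable identifier, e.g. 'NL-64739564' (assumed format)."""
    return f"{jurisdiction.strip().upper()}-{number.strip()}"


# Index beneficial owners by the shared company identifier.
owners_by_company = {}
for rec in bo_records:
    key = company_key(rec["jurisdiction"], rec["company_number"])
    owners_by_company.setdefault(key, []).append(rec["beneficial_owner"])

# Join: attach beneficial owners to every contract whose supplier is in the BO data.
matches = []
for contract in contracts:
    key = company_key(contract["supplier_jurisdiction"], contract["supplier_number"])
    if key in owners_by_company:
        matches.append(
            {
                "contract_id": contract["contract_id"],
                "company": key,
                "beneficial_owners": owners_by_company[key],
            }
        )

print(matches)
```

The join only works because both datasets describe a company in a way that can be reduced to the same key. If each source recorded company numbers differently, or only as free text, this step would need manual matching.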
A data standard provides a documented set of rules and agreements for how data is structured, published, and contextualised. It can also cover data format, definition, transmission, manipulation, use, and management. Standards provide a common language for producing and understanding data, regardless of its origin, and embed a high degree of interoperability by design. Structured datasets that do not adhere to the same data standard may still be interoperable, but joining them together requires an extra step of translation. BODS, discussed in more detail later, is a data standard which sets rules for high-quality BO data.[5]
Table 1. Unstructured (left) versus structured (right) beneficial ownership data
| Unstructured | Structured | |
|---|---|---|
| Nature of ownership or control | Nature of ownership or control | |
| This beneficial owner indirectly herself, or through her children, owns 27% of the declaring legal entity’s shares through the following shareholders of the legal entity (1) “Angerujjheit B.V.”, registration number in the Netherlands 64739564, registered office: Byterslaan 105, NL-4722GF Amsterdam, Netherlands; (2) “RigaTech Systems Ltd.”, registration number in the United Kingdom: 396654, registered office: P.O. Box 124, Company Services Ltd. Main Road, London, United Kingdom. | % Aggregate share ownership | 27 |
| | % Aggregate control via voting shares | 27 |
| | Direct share ownership in declaring entity | 0 |
| | Direct voting control over declaring entity | 0 |
| | 1.1 Intermediate legal owner(s) | |
| | Legal owner 1 | |
| | Name | Angerujjheit B.V. |
| | Registration authority | Commercial register of the Netherlands |
| | Registration number | 64739564 |
| | Legal owner 2 | |
| | Name | RigaTech Systems Ltd. |
| | Registration authority | Companies House, UK |
| | Registration number | 396654 |
On the left-hand side of this hypothetical example, data is unstructured, as all the information relating to the beneficial owner and her relationship with a company is in a single text field. On the right-hand side, data is structured, as the information is separated out into different fields in a standardised way.
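The difference between the two sides of Table 1 can be sketched in Python. The free-text statement (shortened here) has to be searched with patterns that depend on its exact wording, whereas the structured version is read by field name. The dictionary keys are invented for this illustration and are not BODS field names.

```python
import re

# Left-hand side of Table 1, shortened: everything sits in one text field.
unstructured = (
    "This beneficial owner indirectly herself, or through her children, owns 27% "
    "of the declaring legal entity's shares through the following shareholders of "
    'the legal entity (1) "Angerujjheit B.V.", registration number in the '
    'Netherlands 64739564; (2) "RigaTech Systems Ltd.", registration number in '
    "the United Kingdom: 396654."
)

# Right-hand side of Table 1: the same information in separate fields.
structured = {
    "aggregate_share_ownership_pct": 27,
    "aggregate_voting_control_pct": 27,
    "direct_share_ownership_pct": 0,
    "direct_voting_control_pct": 0,
    "intermediate_legal_owners": [
        {
            "name": "Angerujjheit B.V.",
            "registration_authority": "Commercial register of the Netherlands",
            "registration_number": "64739564",
        },
        {
            "name": "RigaTech Systems Ltd.",
            "registration_authority": "Companies House, UK",
            "registration_number": "396654",
        },
    ],
}

# Unstructured: brittle pattern matching tied to this particular phrasing.
pct_match = re.search(r"(\d+(?:\.\d+)?)\s*%", unstructured)
pct_from_text = float(pct_match.group(1)) if pct_match else None
numbers_from_text = re.findall(r"registration number[^0-9]*(\d+)", unstructured)

# Structured: direct access by field name.
pct_from_fields = structured["aggregate_share_ownership_pct"]
numbers_from_fields = [
    owner["registration_number"] for owner in structured["intermediate_legal_owners"]
]

print(pct_from_text, numbers_from_text)
print(pct_from_fields, numbers_from_fields)
```

Both approaches recover the same values here, but only because the patterns were written against this specific sentence. A declaration that wrote "twenty-seven per cent", or described the registration number differently, would defeat the patterns, while the structured lookup would be unaffected.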
Footnotes
[a] For more information, see: “Beneficial Ownership Data Standard (v0.3)”, Open Ownership, n.d., https://standard.openownership.org.
[b] For more information, see: “Open Ownership Principles”, Open Ownership, updated July 2021, https://www.openownership.org/en/principles.
[c] For more information on database design, see: “Relational database design considerations for beneficial ownership information”, Extractive Industries Transparency Initiative and Open Ownership, 16 December 2021, https://www.openownership.org/en/publications/relational-database-design-considerations-for-beneficial-ownership-information/.
[d] Formally, “structured” and “semi-structured” data are different categories. For the purposes of BOT, however, it is enough to note that the same information will often be stored in structured form (in a relational database) and published in semi-structured form (such as JavaScript Object Notation (JSON) or XML). Both structured and semi-structured data are included in the definition used in this briefing, as long as sufficient information is conveyed through structure and context.
Endnotes
[1] For examples, see: “Case studies”, Open Ownership, n.d., https://www.openownership.org/en/publication-categories/case-studies.
[2] See: “Beneficial Ownership Data Standard”, Open Ownership, n.d., https://www.openownership.org/en/topics/beneficial-ownership-data-standard.
[3] “Open Ownership Principles – Structured data”, Open Ownership, updated July 2021, https://www.openownership.org/en/principles/structured-data.
[4] The FAIR Principles offer a framework for data management and stewardship for machine-actionable data. See: “FAIR Principles”, GO FAIR, n.d., https://www.go-fair.org/fair-principles.
[5] “Beneficial Ownership Data Standard (v0.3)”, Open Ownership.