Data is complex and fascinating, originating from a variety of sources, including patients, sites, labs, wearables, and ePRO, to name a few. That variety makes precision essential. Data collection includes receiving electronic external data as well as using Clinical Data Management (CDM) systems, such as an Electronic Data Capture (EDC) database, where site research personnel enter key data points from source documents and paper or electronic medical records.
The data collected can pass through one of two critical processes: data integration or data reconciliation.
The terms sound similar, but they are not interchangeable. In fact, one of the top CDM questions we receive from Sponsors is, “What is the difference between data integration and data reconciliation? Aren’t they the same?”
In this article, we will outline data integration vs. data reconciliation and explore why the distinction matters.¹
Electronic external data is defined as ‘electronic data’ that is collected outside of the EDC. Let’s start by looking at the types of data this includes:
CDM data integration requires EDC back-end programming, time for programming validation, and recurring maintenance of the data connections. The external data vendor also needs to be aware of the request, because the vendor’s technical expertise is needed to support the EDC back-end programming: the vendor provides the outgoing programming that connects the data systems using web services or an Application Programming Interface (API).
Integration also requires programmatic manipulation of the raw external data file so that it fits the configuration requirements of the EDC system, and the process can be precarious. Any data manipulation could degrade the quality of the original raw external data. Reconfiguring these files, even with validation, might introduce human error into the programming code, which can affect the dataset.
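To make the reformatting step concrete, here is a minimal Python sketch of mapping a vendor file into an EDC-style layout. The column names (`SUBJID`, `VISITNAME`, `COLL_DT`, `RESULT`), the target field names, and the vendor date format are hypothetical; in a real study they would come from the data transfer specification and the EDC build. The sketch fails loudly instead of guessing, and it leaves result values as untouched text, since silent conversions are one way the raw data can be degraded.

```python
import csv
import io
from datetime import datetime

# Hypothetical mapping from vendor column names to EDC field names.
COLUMN_MAP = {
    "SUBJID": "subject",
    "VISITNAME": "visit",
    "COLL_DT": "collection_date",
    "RESULT": "result",
}
VENDOR_DATE_FORMAT = "%m/%d/%Y"  # assumed vendor format, e.g. 01/10/2024


def map_vendor_rows(raw_csv):
    """Return vendor rows renamed to EDC field names, with ISO 8601 dates."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    missing = set(COLUMN_MAP) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"vendor file is missing columns: {sorted(missing)}")

    mapped = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        out = {
            target: (row[source] or "").strip()
            for source, target in COLUMN_MAP.items()
        }
        try:
            parsed = datetime.strptime(out["collection_date"], VENDOR_DATE_FORMAT)
        except ValueError as exc:
            # Stop instead of guessing: a wrong date is worse than a failed load.
            raise ValueError(f"line {line_no}: unparseable collection date") from exc
        out["collection_date"] = parsed.date().isoformat()
        mapped.append(out)
    return mapped
```

Because the function only reads the raw text and builds new rows, the original vendor file stays available unchanged for comparison after the load.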
CDM data reconciliation is a data review process that compares unique identifiers in the EDC data such as subject number, visit, nominal time point, collection dates and collection times with the same data points in the electronic external data source datasets. The data points to be reconciled are defined at the project level through discussions between the Sponsor, CRO and electronic external data vendor and documented in a data cleaning plan. Discrepancies between the EDC data and the external data source are identified by CDM, and those discrepancies are addressed by the external data vendor, Clinical Research Associate (CRA), or site. After data reconciliation discrepancies are communicated to the appropriate party (e.g., through site data queries, vendor communication, Sponsor teleconferences, etc.), the data are corrected to ensure both the EDC and electronic external data are reconciled and matching.
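The comparison at the heart of reconciliation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular CDM tool: the field names, the choice of which fields identify a record and which are compared, and the discrepancy wording are all hypothetical, and a real study would define them in its data cleaning plan.

```python
# Hypothetical fields that identify a record in both sources.
KEY_FIELDS = ("subject", "visit", "timepoint")
# Hypothetical fields whose values must match once a record is paired.
COMPARE_FIELDS = ("collection_date", "collection_time")


def index_by_key(records):
    """Index records by their identifying fields for direct lookup."""
    return {tuple(rec[f] for f in KEY_FIELDS): rec for rec in records}


def reconcile(edc_records, vendor_records):
    """List discrepancies between EDC records and external vendor records."""
    edc = index_by_key(edc_records)
    vendor = index_by_key(vendor_records)
    discrepancies = []
    for key in sorted(edc.keys() | vendor.keys()):
        if key not in vendor:
            discrepancies.append({"key": key, "issue": "missing in vendor data"})
        elif key not in edc:
            discrepancies.append({"key": key, "issue": "missing in EDC"})
        else:
            for field in COMPARE_FIELDS:
                if edc[key][field] != vendor[key][field]:
                    discrepancies.append({
                        "key": key,
                        "issue": f"{field} mismatch",
                        "edc": edc[key][field],
                        "vendor": vendor[key][field],
                    })
    return discrepancies


edc_data = [
    {"subject": "001", "visit": "V1", "timepoint": "PRE",
     "collection_date": "2024-01-10", "collection_time": "08:00"},
    {"subject": "002", "visit": "V1", "timepoint": "PRE",
     "collection_date": "2024-01-11", "collection_time": "09:15"},
]
vendor_data = [
    {"subject": "001", "visit": "V1", "timepoint": "PRE",
     "collection_date": "2024-01-10", "collection_time": "08:05"},
    {"subject": "003", "visit": "V1", "timepoint": "PRE",
     "collection_date": "2024-01-12", "collection_time": "10:00"},
]

for item in reconcile(edc_data, vendor_data):
    print(item)
```

With this sample data the sketch reports three discrepancies: a collection time mismatch for subject 001, a record for subject 002 that is missing from the vendor data, and a record for subject 003 that is missing from the EDC. Note that neither dataset is modified; the output is only a list to route to the vendor, CRA, or site for resolution.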
Data transfer agreements (DTAs) and data transfer specifications (DTSs) are developed between the external data vendor and the data recipient to ensure agreement and understanding:
Data integration requires a data connection, which includes a technical mapping and programming effort to funnel data from an external data source into the EDC so that the data points display on the EDC screen. In contrast, data reconciliation refers to receiving and managing external data in its native format for cleaning and analysis.
One common misconception is that all external data sources must be integrated directly into the EDC. Integration is a nice-to-have, but it adds time to study start-up and should be thought of as optional, because the data can still be viewed in its native form or directly from the source. For the purposes of data analysis, Biometrics (Clinical Data Management and Biostatistics) can fully support handling datasets from multiple sources to perform data cleaning and statistical analysis. From the Sponsor and medical reviewer perspectives, aggregate clinical data and patient-specific data can be reviewed using reports and tools outside of the EDC, such as programmed patient profiles or data visualization software (e.g., JReview).
Data integration and data reconciliation are both critical elements in a well-designed CDM plan, but they are also aspects that are heavily impacted by the CRO partner the Sponsor chooses to execute the protocol. Protect your endpoints by selecting a CRO that has the expertise and experience to make sure your final data set is as representative and accurate as possible.
1. For the purposes of this article, the discussion of data integration and data reconciliation does not include EMR or ePRO/randomization capabilities built into the EDC. This article also does not address processes unrelated to clinical subject data, such as data pushes from the EDC to outside systems to support grants and site payments, or project tracking in a CTMS (Clinical Trial Management System).
2. ePRO collection can be a part of the EDC as a service provided by the EDC vendor or a separate third-party system that is not dynamically connected to the EDC.
3. As a side note, for safety labs, this does not include local labs where the results are entered by the research site into the EDC from local laboratory result reports.