Mehmet Can Ay edited this page Jul 17, 2024 · 2 revisions
  1. What Is PDataViewer?
  2. What Is PDataViewer For?
  3. What Problem Does PDataViewer Solve?

What Is PDataViewer?

Identifying and using cohort studies is challenging. PDataViewer is a web application designed to ease this process. It ranks cohorts by the variables required to investigate a specific research question, provides summary statistics, shows the distribution of biomarkers and participant drop-out rates, and uses an auto-harmonization tool to semantically harmonize cohort studies for research purposes.

What Is PDataViewer For?

PDataViewer is designed to:

  • Provide a general list of Parkinson's disease cohort studies along with their summary statistics (such as the total number of participants, number of participants in each diagnosis group, study location, etc.) as well as their reference and data access application links.
  • Showcase the distribution of specified variables in selected cohort studies and within each diagnosis type using box plots.
  • Provide a list of research question-specific Parkinson's disease cohort studies based on user-provided variables and showcase the distribution of these variables in each cohort.
  • Display ethnoracial diversity in each Parkinson's disease cohort study.
  • Demonstrate semantic feature mappings across cohorts.
  • Showcase participant drop-out rates in longitudinal studies for specified features.
  • Utilize INDEX (the Intelligent Data Steward Toolbox), a semantic auto-harmonization tool, to harmonize multiple cohort studies against a user-specified standard or, by default, against the Parkinson's Disease Common Data Model (PASSIONATE).
  • Showcase the embedding space of PASSIONATE variables with a t-SNE plot.
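
The research-question-specific ranking mentioned in the list above can be illustrated with a small Python sketch. It is not PDataViewer's actual implementation: the function `rank_cohorts`, the cohort names, and the variable names are all hypothetical, and the ranking criterion (how many of the requested variables a cohort reports) is an assumption chosen for illustration.

```python
def rank_cohorts(cohort_variables, required):
    """Rank cohorts by how many of the required variables each one reports.

    Returns a list of (cohort name, number of required variables found,
    sorted list of missing variables).
    """
    required = set(required)
    scored = []
    for name, variables in cohort_variables.items():
        found = required & set(variables)
        scored.append((name, len(found), sorted(required - found)))
    # Most complete cohorts first; ties are broken alphabetically by name.
    scored.sort(key=lambda row: (-row[1], row[0]))
    return scored


# Hypothetical cohorts and variables, for illustration only.
cohorts = {
    "Cohort A": ["age", "sex", "updrs_total"],
    "Cohort B": ["age", "moca"],
    "Cohort C": ["age", "sex", "updrs_total", "moca"],
}

ranking = rank_cohorts(cohorts, ["age", "updrs_total", "moca"])
for name, n_found, missing in ranking:
    print(name, n_found, missing)
```

Returning the missing variables alongside the score makes it clear why a cohort ranks lower, which matters when deciding whether a data access application is worth submitting.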

What Problem Does PDataViewer Solve?

Due to data security and patient privacy concerns, accessing actual patient-level data takes time. In some cases, variables reported in a cohort study's documentation are not present in the data itself for various reasons (e.g., clinicians might opt out of collecting a measurement because the procedure is invasive). Additionally, in longitudinal cohort studies, some participants leave the study prematurely, which reduces the availability of certain measurements at follow-up visits.

Furthermore, cohort studies do not follow a standard naming convention when reporting their measurements. For instance, the age of participants could be recorded as "Age," "age," "age_of_participant," or "participant_age". Before data-driven approaches can be applied to such datasets, their variables must be semantically harmonized, which is often a lengthy procedure, especially when dealing with a large variety of cohorts and variables.
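
To make the naming problem concrete, the sketch below maps the age variants from the paragraph above to a single canonical term. It is a deliberately simple illustration based on string normalization and an exact-match lookup table, not the method used by INDEX or PDataViewer. The `CANONICAL` table and the functions `normalize` and `harmonize` are hypothetical names introduced here.

```python
import re

# Hypothetical lookup table: normalized source name -> canonical term.
CANONICAL = {
    "age": "age",
    "age_of_participant": "age",
    "participant_age": "age",
}


def normalize(name):
    """Lowercase the name and collapse non-alphanumeric runs into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def harmonize(name):
    """Return the canonical term for a source name, or None if it is unmapped."""
    return CANONICAL.get(normalize(name))


source_names = ["Age", "age", "age_of_participant", "participant_age"]
mapped = {name: harmonize(name) for name in source_names}
print(mapped)
```

An exact-match table like this only covers spellings that someone has already listed, so every new cohort tends to require manual additions. That maintenance cost is the lengthy procedure that an automated harmonization tool is meant to shorten.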

PDataViewer aims to increase the transparency, findability, and accessibility of the data while reducing the time required for the semantic harmonization of variables. It provides insights into patient-level data, a list of research topic-specific cohorts with application links, and an auto-harmonization tool. The ultimate goal is to enhance the FAIRness of data in the Parkinson's disease cohort study field.
