Example use case:
Julia is an oncologist who specializes in female reproductive health. As part of her research, she is interested in using existing data on uterine cancers. If possible, she would like to see multiple datatypes (gross imaging, genomic data, proteomic data, histology) that come from the same patient, so she can look for shared phenotypes and test their potential as early diagnostics. Julia heard that the Cancer Data Aggregator (CDA) has made it easy to search across multiple datasets created by NCI, so she has decided to start her search there.
The CDA provides a custom Python tool for searching CDA data.
Q (short for Query) offers several ways to search and filter data, and several input modes:
- Q.() builds a query that can be used by
- Q.run() returns data for the specified search
- Q.count() returns summary information (counts) for the data that fit the specified search
- columns() returns entity field names
- unique_terms() returns entity field contents
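
To make the division of labor in the list above concrete, here is a minimal toy sketch of the same pattern: build a query object first, then either fetch the matching rows or summarize them. It does not call cdapython and does not reflect its real signatures. ToyQuery, toy_columns, toy_unique_terms, the field names, and the records are all hypothetical stand-ins invented for illustration, and the real tool may behave differently.

```python
from collections import Counter
from dataclasses import dataclass

# Invented in-memory records standing in for a remote dataset.
RECORDS = [
    {"subject": "S1", "site": "Uterus", "data_type": "genomic"},
    {"subject": "S1", "site": "Uterus", "data_type": "histology"},
    {"subject": "S2", "site": "Uterus", "data_type": "genomic"},
    {"subject": "S3", "site": "Brain", "data_type": "imaging"},
]


@dataclass(frozen=True)
class ToyQuery:
    """Hypothetical stand-in for a query object (not cdapython's Q)."""

    field: str
    value: str

    def run(self):
        """Return every record whose field equals the requested value."""
        return [r for r in RECORDS if r.get(self.field) == self.value]

    def count(self):
        """Return counts of matching records, grouped by data_type."""
        return dict(Counter(r["data_type"] for r in self.run()))


def toy_columns():
    """Return the field names that appear in the records."""
    return sorted({key for record in RECORDS for key in record})


def toy_unique_terms(field):
    """Return the distinct values stored in one field."""
    return sorted({r[field] for r in RECORDS if field in r})


# Building the query does no work; run() and count() evaluate it.
uterine = ToyQuery("site", "Uterus")
rows = uterine.run()
summary = uterine.count()
```

In this sketch, toy_columns() and toy_unique_terms("site") play the discovery role: they tell Julia which fields exist and which values she can filter on before she builds a query.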
Before Julia does any work, she needs to import these functions from cdapython.
She'll also need to import pandas to work with dataframes and itables to display them nicely. The opt. settings pre-configure how itables displays her tables, with scrolling and paging enabled.
from cdapython import Q, columns, unique_terms, query
import numpy as np
import pandas as pd
from itables import init_notebook_mode, show
init_notebook_mode(all_interactive=True)
import itables.options as opt
opt.maxBytes = 0
opt.scrollX = "200px"
opt.scrollCollapse = True
opt.paging = True
opt.maxColumns = 0
print(Q.get_version())