Find all the CPTAC subjects
I'm a researcher, and I want to reuse data from the Clinical Proteomic Tumor Analysis Consortium, but it's been stored across multiple data centers. I just want an easy way to track it all down.
First, decide which column to search. I'm looking for columns that have to do with project:
In [3]:
columns(column=["*project*"])
Out[3]:
table | column | data_type | nullable | description |
---|---|---|---|---|
member_of_research_project has the definition I'm looking for, so I'm going to search that column for cptac. I want both subject and researchsubject info, so I'm requesting rows that match cptac from those two tables, joined:
In [4]:
fetch_rows(table="subject", match_all="member_of_research_project = *cptac*", link_to_table='researchsubject')
Out[4]:
subject_id | cause_of_death | days_to_birth | days_to_death | ethnicity | race | sex | species | vital_status | researchsubject_id | member_of_research_project | primary_diagnosis_condition | primary_diagnosis_site |
---|---|---|---|---|---|---|---|---|---|---|---|---|
This looks like what I want, so I'll re-run the query, this time saving the results to a file:
In [5]:
fetch_rows(table="subject", match_all="member_of_research_project = *cptac*", link_to_table='researchsubject', return_data_as='tsv', output_file='my_file.tsv')
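Once my_file.tsv exists, a quick sanity check is to count how many distinct subjects landed in each project. The following is a minimal sketch using only the Python standard library, not part of the query tool itself. It assumes the saved file is tab-delimited with a header row carrying the same column names shown in Out[4] (subject_id and member_of_research_project); the helper name count_subjects_by_project is hypothetical.

```python
import csv
from collections import Counter


def count_subjects_by_project(tsv_path):
    """Count distinct subject_id values per member_of_research_project.

    Assumes a tab-delimited file whose header row includes the columns
    'subject_id' and 'member_of_research_project' (an assumption based on
    the column names displayed in the notebook output).
    """
    seen = set()        # (project, subject_id) pairs already counted
    counts = Counter()  # project -> number of distinct subjects
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            key = (row["member_of_research_project"], row["subject_id"])
            # A joined table can repeat a subject, so count each pair once.
            if key not in seen:
                seen.add(key)
                counts[row["member_of_research_project"]] += 1
    return counts
```

Calling count_subjects_by_project("my_file.tsv") returns a Counter keyed by project name, so .most_common() lists the projects from most to fewest subjects.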