EDRN Catalog and Archive Service


COPY_NUMBER_LEV1

Dataset Abstract:

Lung cancer is the leading cause of cancer death worldwide, with up to 25% of cases arising in never smokers. Lung cancers in smokers and never smokers have been shown to exhibit specific genetic and clinical features, which suggests that they are distinct diseases. Despite these findings, there is still much to learn about lung tumorigenesis in these two groups. We hypothesized that if smoker and never smoker lung tumors are different diseases, they are likely driven by different molecular mechanisms and are therefore likely to exhibit different patterns of copy number alterations throughout the genome. To elucidate global differences in genomic aberrations between these two groups, we used Affymetrix SNP 6.0 arrays to generate high-resolution copy number profiles for lung adenocarcinomas from 30 smokers and 30 never smokers, as well as for cell lines derived from 20 smokers and 20 light or never smokers.


The following additional information has been defined for this dataset. The information has been provided by the Principal Investigator or staff from his or her laboratory.

ProtocolID
282
ProtocolName
Identification of biomarkers for lung cancer in never smokers
DataSetName
COPY_NUMBER_LEV1
LeadPI
Wan Lam
SiteName
DataCustodian
Kelsie Thu
DataCustodianEmail
kthu@bccrc.ca
OrganSite
Lung
CollaborativeGroup
Lung and Upper Aerodigestive
MethodDetails
41 cell lines
Raw SNP array data (CEL files = level 1) were imported and processed in Partek Genomics Suite software. Partek normalization includes background subtraction, probe length adjustment, fragment length adjustment, and GC content adjustment. Analysis was performed following the "Copy Number" workflow, using SNP array data for 72 normal HapMap individuals to create a baseline for copy number. A matrix of normalized log2 intensity ratios (tumor cell line:HapMap baseline) for each probe in each sample was generated (level 2 data). Genomic segmentation was performed in Partek to generate a list of segmental copy number alterations in each sample, using a 50-marker minimum and a p-value threshold of 10^-6 to define copy number altered regions. The segmentation data were parsed into a gene-centric matrix with estimated copy number values for each gene (level 3 data). A copy number of 2 is copy neutral; a copy number < 2 is a loss and a copy number > 2 is a gain. Genes were mapped using base pair coordinates from a table of RefSeq annotated genes downloaded from the UCSC genome browser (March 2006 genome build). Gene accessions with ambiguous mapping were removed from the RefSeq table. For genes with multiple accessions, the accession with the longest transcript was used for mapping.
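
The copy number convention described above can be illustrated with a minimal Python sketch. It is not Partek's implementation: the conversion from a log2 ratio to a copy number assumes a diploid baseline (copy number = 2 x 2^ratio), and the function names are hypothetical. Only the gain/loss rule (2 is neutral, below 2 is a loss, above 2 is a gain) comes from the method description.

```python
def copy_number_from_log2_ratio(log2_ratio: float, baseline: float = 2.0) -> float:
    """Convert a log2 intensity ratio (sample:baseline) to an estimated copy number.

    Assumption: the baseline is diploid, so a ratio of 0 maps to 2 copies.
    """
    return baseline * 2.0 ** log2_ratio


def classify_copy_number(copy_number: float) -> str:
    """Apply the rule from the method details: 2 is neutral, < 2 loss, > 2 gain."""
    if copy_number < 2:
        return "loss"
    if copy_number > 2:
        return "gain"
    return "neutral"


if __name__ == "__main__":
    # Illustrative ratios only; these are not values from the dataset.
    for ratio in (-1.0, 0.0, 1.0):
        cn = copy_number_from_log2_ratio(ratio)
        print(ratio, cn, classify_copy_number(cn))
```

In practice, estimated copy numbers are rarely exactly 2, so an analysis would likely apply a tolerance band around 2; the method description does not state one, so none is assumed here.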

60 matched tumor-normal pairs
Raw SNP array data (CEL files = level 1) were imported and processed in Partek Genomics Suite software. Partek normalization includes background subtraction, probe length adjustment, fragment length adjustment, and GC content adjustment. Analysis was performed following the "Copy Number" workflow using a paired analysis. A matrix of normalized log2 intensity ratios (tumor:normal) for each probe in each sample was generated (level 2 data). Genomic segmentation was performed in Partek to generate a list of segmental copy number alterations in each sample, using a 50-marker minimum and a p-value threshold of 10^-6 to define copy number altered regions. The segmentation data were parsed into a gene-centric matrix with estimated copy number values for each gene (level 3 data). A copy number of 2 is copy neutral; a copy number < 2 is a loss and a copy number > 2 is a gain. Genes were mapped using base pair coordinates from a table of RefSeq annotated genes downloaded from the UCSC genome browser (March 2006 genome build). Gene accessions with ambiguous mapping were removed from the RefSeq table. For genes with multiple accessions, the accession with the longest transcript was used for mapping.
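
The gene-centric parsing step (segments to level 3 data) might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the custodians' actual script: the record types and function names are hypothetical, "longest transcript" is taken to mean the longest genomic span, a gene overlapped by several segments takes the value of the segment with the largest overlap, and genes outside any altered segment are assigned a copy number of 2.

```python
from typing import Dict, Iterable, List, NamedTuple


class Transcript(NamedTuple):
    gene: str
    accession: str
    chrom: str
    start: int
    end: int


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    copy_number: float


def longest_transcript_per_gene(transcripts: Iterable[Transcript]) -> Dict[str, Transcript]:
    """Keep one accession per gene: the one with the longest genomic span (assumption)."""
    best: Dict[str, Transcript] = {}
    for t in transcripts:
        current = best.get(t.gene)
        if current is None or (t.end - t.start) > (current.end - current.start):
            best[t.gene] = t
    return best


def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of base pairs shared by two intervals (0 if they do not overlap)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def gene_copy_numbers(transcripts: Iterable[Transcript],
                      segments: List[Segment],
                      neutral: float = 2.0) -> Dict[str, float]:
    """Build one sample's column of a gene-centric copy number matrix."""
    result: Dict[str, float] = {}
    for gene, t in longest_transcript_per_gene(transcripts).items():
        best_overlap, best_cn = 0, neutral
        for s in segments:
            if s.chrom != t.chrom:
                continue
            shared = overlap(t.start, t.end, s.start, s.end)
            if shared > best_overlap:
                best_overlap, best_cn = shared, s.copy_number
        result[gene] = best_cn
    return result
```

Running `gene_copy_numbers` once per sample and collecting the results by gene would give a genes-by-samples matrix comparable in shape to the level 3 data described above.
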
PubMedID
DateDatasetFrozen
QAState
Accepted
DatasetDescription
EDRN Tumor Copy Number Data
SiteID
519
AnalyticResults
This dataset includes copy number data for 60 lung adenocarcinoma tumors and 41 lung cancer cell lines.
DataDisclaimer
Data and information released from the National Cancer Institute (NCI) are provided on an "AS IS" basis, without warranty of any kind, including without limitation the warranties of merchantability, fitness for a particular purpose and non-infringement. Availability of this data and information does not constitute scientific publication. Data and/or information may contain errors or be incomplete. NCI and its employees make no representation or warranty, express or implied, including without limitation any warranties of merchantability or fitness for a particular purpose or warranties as to the identity or ownership of data or information, the quality, accuracy or completeness of data or information, or that the use of such data or information will not infringe any patent, intellectual property or proprietary rights of any party. NCI shall not be liable for any claim for any loss, harm, illness or other damage or injury arising from access to or use of data or information, including without limitation any direct, indirect, incidental, exemplary, special or consequential damages, even if advised of the possibility of such damages. In accordance with scientific standards, appropriate acknowledgment of NCI should be made in any publications or other disclosures concerning data or information made available by NCI.