Class 12: Introduction to Genome Informatics Lab
Q1 - What are the 4 candidate SNPs?
The 4 SNPs are rs12936231, rs8067378, rs9303277, and rs7216389.
Q2 - What three genes do these variants overlap or affect?
The three genes are ZPBP2, GSDMB, and ORMDL3.
Q3 - What is the location of rs8067378 and what are the different alleles for rs8067378?
The location is Chromosome 17:39,895,095 (GRCh38, forward strand). The alleles are A / C / G.
Q4 - Name at least 3 downstream genes for rs8067378?
Three downstream genes include ZPBP2, GSDMB, and ORMDL3.
Q5 - What proportion of the Mexican Ancestry in Los Angeles sample population (MXL) are homozygous for the asthma associated SNP (G|G)?
Nine individuals are homozygous G|G, which would give us a proportion of ~14.1%.
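The proportion above can be checked with a short calculation. This is a minimal sketch: the count of 9 comes from the answer, while the MXL sample total of 64 is an assumption that is consistent with the reported ~14.1% but is not stated in this write-up.

```python
def genotype_proportion(genotype_count, total_samples):
    """Return the fraction of samples that carry a given genotype."""
    return genotype_count / total_samples

gg_count = 9    # homozygous G|G individuals, from the answer above
mxl_total = 64  # assumed number of MXL samples (not stated in this write-up)

proportion = genotype_proportion(gg_count, mxl_total)
print(f"{proportion:.1%}")  # prints 14.1%
```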
Q6 - Back on the Ensembl page, use the “search for a sample” field above to find the particular sample HG00109. This is a male from the GBR population group. What is the genotype for this sample?
The genotype for this sample is G|G.
Q7 - How many sequences are there in the first file? What is the file size and format of the data? Make sure the format is fastqsanger here!
There are 3,863 sequences in fastqsanger format, with a file size of 741.9 KB.
Q8 - What is the GC content and sequence length of the second fastq file?
The GC content is 54% and the sequence length is 50-75 bp.
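For reference, GC content is the fraction of bases that are G or C, pooled over all reads. The sketch below illustrates that definition and how a read-length range is reported. The reads in `example_reads` are made up for illustration and are not taken from the lab's fastq file.

```python
def gc_content(sequences):
    """Fraction of G and C bases across all sequences, pooled together."""
    total_bases = sum(len(seq) for seq in sequences)
    if total_bases == 0:
        return 0.0
    gc_bases = sum(seq.upper().count("G") + seq.upper().count("C")
                   for seq in sequences)
    return gc_bases / total_bases

# Hypothetical reads, for illustration only
example_reads = ["GGCCAT", "ATGC"]

gc = gc_content(example_reads)
lengths = [len(seq) for seq in example_reads]
print(f"GC content: {gc:.0%}, length range: {min(lengths)}-{max(lengths)} bp")
```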
Q9 - How about per base sequence quality? Does any base have a mean quality score below 20?
No, the mean quality score stays above 20 at every base position.
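The threshold of 20 matters because a Phred score Q corresponds to an error probability of 10^(-Q/10), so Q20 means a 1% chance that the base call is wrong. In fastqsanger format each quality character encodes Q as its ASCII code minus 33. The sketch below decodes quality strings and computes the mean quality per position. The strings in `example_qualities` are hypothetical, not reads from the lab data.

```python
from itertools import zip_longest

def phred_scores(quality_string, offset=33):
    """Decode a fastqsanger quality string into Phred scores (ASCII - 33)."""
    return [ord(ch) - offset for ch in quality_string]

def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def mean_quality_per_position(quality_strings):
    """Mean Phred score at each read position; reads may differ in length."""
    per_read = [phred_scores(q) for q in quality_strings]
    means = []
    for column in zip_longest(*per_read):
        present = [q for q in column if q is not None]
        means.append(sum(present) / len(present))
    return means

# Hypothetical quality strings, for illustration only
example_qualities = ["IIII5", "II5"]

means = mean_quality_per_position(example_qualities)
# 1-based positions whose mean quality falls below 20
low_positions = [i + 1 for i, m in enumerate(means) if m < 20]
print(means, low_positions)
```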
Q10 - Where are most of the accepted hits located?
Most of the accepted hits are located on chromosome 17.
Q11 - Following Q10, is there any interesting gene around that area?
An interesting gene in that area is ORMDL3, because it is associated with asthma risk and is one of the genes linked to the SNPs examined in this lab.
Q12 - Cufflinks again produces multiple output files that you can inspect from your right-hand side Galaxy history. From the “gene expression” output, what is the FPKM for the ORMDL3 gene? What are the other genes with above zero FPKM values?
The FPKM for the ORMDL3 gene is 136,853. The other genes with above zero FPKM values are ZPBP2 (4,613.49), GSDMB (26,366.3), GSDMA (133.634) and PSMD3 (299,021).
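For context, FPKM stands for fragments per kilobase of transcript per million mapped fragments. The sketch below uses the simplified textbook formula. Cufflinks itself estimates abundances with a statistical model, so this is not its exact computation, and the input numbers are hypothetical.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Simplified FPKM: fragments per kilobase per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)

# Hypothetical values, for illustration only
example_fpkm = fpkm(fragments=100, gene_length_bp=2000,
                    total_mapped_fragments=1_000_000)
print(example_fpkm)  # prints 50.0
```

Under this formula, a small total number of mapped fragments in the denominator yields large FPKM values for the genes that do receive reads.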