This document was last compiled on 2021-12-14 21:34:11.

Loading Data

## Reading in metadata from file: results/BT642Year1/data/root_meta.txt
## Reading in data from file: results/BT642Year1/data/root_normCounts.txt
## Reading in metadata from file: results/BT642Year1/data/leaf_meta.txt
## Reading in data from file: results/BT642Year1/data/leaf_normCounts.txt
## Total number of leaf samples: 100 
## Total number of genes in leaf samples: 21183 
## No additional bad samples to remove (probably removed during normalization)
## There are no samples to remove beyond those used in the main experiment
## Variance filter for leaf samples:  Remove genes in the bottom 0.2 quantile of variability 
## Total number of genes after limiting to high variable genes : 16946 
## After filtering leaf samples: 100 samples, 16946 genes
## Total number of root samples: 96 
## Total number of genes in root samples: 23595 
## No additional bad samples to remove (probably removed during normalization)
## There are no samples to remove beyond those used in the main experiment
## Variance filter for root samples:  Remove genes in the bottom 0.2 quantile of variability 
## Total number of genes after limiting to high variable genes : 18876 
## After filtering root samples: 96 samples, 18876 genes

Basic statistics on dataset

There are 35115 genes in the S. bicolor genome (based on annotation: gene_annotation_642.tsv). After low-expression filtering and normalization, we obtain 21183 genes in the leaf samples and 23595 genes in the root samples. If we additionally apply the variance filter reported above (removing genes in the bottom 0.2 quantile of variability, i.e. keeping the top 80% most variable genes), we get 16946 genes in the leaf and 18876 in the root. (Note: we are not using this variance filter except in the clustering, and we need to double-check that it is still being used in this way.) Unlike in our previous analysis, our differential expression analysis does not first apply a leaf-specific or root-specific expression filter after normalization to remove leaf- or root-specific genes.
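The variance filter described above can be sketched as follows. This is a minimal Python illustration, not the R code used to produce this report: the function name `variance_filter`, the use of population variance as the variability measure, and the rounding rule for the number of genes dropped are all assumptions made for the example.

```python
import math
import statistics

def variance_filter(counts, drop_fraction=0.2):
    """Drop the least variable genes (hypothetical helper, not the report's code).

    counts: dict mapping gene id -> list of normalized expression values.
    drop_fraction: fraction of genes, ranked by variance, to remove.
    """
    # Variability measure is assumed to be the per-gene variance across samples.
    variances = {gene: statistics.pvariance(values) for gene, values in counts.items()}
    ranked = sorted(variances, key=variances.get)  # least variable first
    # Assumed rounding rule: round the number of dropped genes up.
    n_drop = math.ceil(drop_fraction * len(ranked))
    kept = set(ranked[n_drop:])
    return {gene: values for gene, values in counts.items() if gene in kept}

# Toy data: 10 genes whose variance increases with the gene index.
toy_counts = {f"g{i}": [0.0, float(i)] for i in range(10)}
filtered = variance_filter(toy_counts)
print(len(filtered), "genes kept out of", len(toy_counts))
```

On the toy data the two least variable genes are removed and eight remain. The counts in the log are consistent with the same proportion: 16946 of 21183 leaf genes and 18876 of 23595 root genes are each about 80%.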

Annotation on genes:

We give some basic info on the annotations on all genes / orthogroups:

DE genes - Control versus Drought

Leaf samples

For leaf samples, out of the 16946 genes considered, here is the number of genes DE for each genotype under the two drought conditions. DE is based on the global measure (i.e. the spline function fit), with the q-value required to be less than 0.05. The DE+lfc columns additionally require the global log-fold change measure, MaxOfIntervals, to be greater than 2 in absolute value.

Genotype  Preflowering (DE)  Preflowering (DE+lfc)  Postflowering (DE)  Postflowering (DE+lfc)
RT430     NA                 505                    NA                  160
BT642     9839               505                    6429                160
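The two-step selection behind the DE and DE+lfc columns can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the pipeline's implementation: the record layout (`qvalue`, `interval_lfcs`) is hypothetical, and MaxOfIntervals is assumed here to mean the per-interval log-fold change with the largest absolute value.

```python
def max_of_intervals(interval_lfcs):
    """Return the interval log-fold change with the largest absolute value.

    Assumed reading of the report's MaxOfIntervals measure.
    """
    return max(interval_lfcs, key=abs)

def call_de(genes, q_cutoff=0.05, lfc_cutoff=2.0):
    """Return (DE genes, DE+lfc genes) as two sets.

    genes: dict mapping gene id -> {"qvalue": float, "interval_lfcs": [float, ...]}
    (a hypothetical layout chosen for this example).
    """
    # Step 1: global DE call based on the q-value alone.
    de = {g for g, rec in genes.items() if rec["qvalue"] < q_cutoff}
    # Step 2: additionally require a large global log-fold change.
    de_lfc = {
        g for g in de
        if abs(max_of_intervals(genes[g]["interval_lfcs"])) > lfc_cutoff
    }
    return de, de_lfc

# Toy records, not real results.
toy_genes = {
    "geneA": {"qvalue": 0.001, "interval_lfcs": [0.5, -2.7, 1.1]},
    "geneB": {"qvalue": 0.01, "interval_lfcs": [0.3, 0.9]},
    "geneC": {"qvalue": 0.2, "interval_lfcs": [3.0]},
}
de, de_lfc = call_de(toy_genes)
print(sorted(de), sorted(de_lfc))
```

In the toy data, geneC has a large log-fold change but is excluded because it fails the q-value cutoff, which is why the DE+lfc counts can never exceed the DE counts.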

Intersect with ortholog information (only Remap Year)

We intersect the above DE/lfc results with the information about the ortholog group / gene and its relationship to the 642/430 mappings.

First we consider the relationship between 642/430 within the ortholog group:

Now the relationship of the orthogroup to 623:

Different threshold changes

The following is the number of significant genes (q-value < 0.05) for different cutoffs of the absolute log-fold change:

Threshold Pre.642 Post.642
0.8 2736 1637
1.0 1950 1045
1.5 938 394
2.0 505 160
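A threshold sweep of this kind can be sketched as follows. This is a minimal Python illustration with made-up records; the function name `count_by_threshold` and the `(qvalue, lfc)` tuple layout are assumptions for the example, and a strict `>` comparison is assumed at each cutoff.

```python
def count_by_threshold(records, thresholds=(0.8, 1.0, 1.5, 2.0), q_cutoff=0.05):
    """Count significant genes exceeding each absolute log-fold-change cutoff.

    records: list of (qvalue, global_lfc) tuples, one per gene (hypothetical layout).
    """
    # Keep only genes passing the q-value cutoff, then compare |lfc| to each threshold.
    significant = [abs(lfc) for q, lfc in records if q < q_cutoff]
    return {t: sum(1 for a in significant if a > t) for t in thresholds}

# Toy records, not real results.
toy_records = [(0.01, 0.9), (0.01, -1.2), (0.03, 2.5), (0.2, 3.0), (0.04, 0.5)]
counts = count_by_threshold(toy_records)
for threshold, n in counts.items():
    print(threshold, n)
```

Because every gene passing a stricter cutoff also passes a looser one, the counts can only decrease or stay equal as the threshold rises, as they do in the table above.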

Log-fold change based on Different Drought Intervals

The following considers the number of DE genes with log-fold change cutoffs specific to drought/recovery intervals:

Summary of the number of genes found DE with lfc > 2 in the indicated direction in drought intervals:

                      leaf_Preflowering_BT642  leaf_Postflowering_BT642
DroughtTps_up         131                      73
DroughtTps_down       59                       71
EarlyDroughtTps_up    79                       114
EarlyDroughtTps_down  9                        48
LateDroughtTps_up     254                      62
LateDroughtTps_down   223                      92
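The per-interval, per-direction tally can be sketched as below. This is an illustrative Python sketch with toy values; the function name `count_directions` and the nested-dict layout are assumptions, and the interval names are borrowed from the row labels above only as dictionary keys.

```python
def count_directions(interval_lfcs, de_genes, lfc_cutoff=2.0):
    """Count DE genes that are up or down beyond the cutoff in each interval.

    interval_lfcs: dict mapping gene id -> {interval name: log-fold change}
    de_genes: iterable of gene ids already called DE (q-value < 0.05).
    """
    counts = {}
    for gene in de_genes:
        for interval, lfc in interval_lfcs[gene].items():
            if lfc > lfc_cutoff:
                key = f"{interval}_up"
            elif lfc < -lfc_cutoff:
                key = f"{interval}_down"
            else:
                continue  # below the cutoff in this interval
            counts[key] = counts.get(key, 0) + 1
    return counts

# Toy values, not real results.
toy_lfcs = {
    "geneA": {"EarlyDroughtTps": 2.4, "LateDroughtTps": 3.1},
    "geneB": {"EarlyDroughtTps": -0.5, "LateDroughtTps": -2.2},
    "geneC": {"EarlyDroughtTps": 5.0, "LateDroughtTps": 5.0},
}
direction_counts = count_directions(toy_lfcs, de_genes=["geneA", "geneB"])
print(direction_counts)
```

In this sketch a gene can contribute to more than one row (geneA counts in both intervals), so the rows are not mutually exclusive and need not sum to the number of DE+lfc genes.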

Summary of the number of genes found DE with lfc > 2 in the indicated direction in recovery intervals (pre-flowering only):

                       leaf_Preflowering_BT642
RecoveryTps_up         6
RecoveryTps_down       19
EarlyRecoveryTps_up    265
EarlyRecoveryTps_down  111
LateRecoveryTps_up     6
LateRecoveryTps_down   23

Root samples

For root samples, out of the 18876 genes considered, here is the number of genes DE for each genotype under the two drought conditions. DE is based on the global measure (i.e. the spline function fit), with the q-value required to be less than 0.05. The DE+lfc columns additionally require the global log-fold change measure, MaxOfIntervals, to be greater than 2 in absolute value.

Genotype  Preflowering (DE)  Preflowering (DE+lfc)  Postflowering (DE)  Postflowering (DE+lfc)
RT430     NA                 915                    NA                  449
BT642     17937              915                    10358               449

Intersect with ortholog information (Remap Year)

We intersect the above DE/lfc results with the information about the ortholog group / gene and its relationship to the 642/430 mappings.

First we consider the relationship between 642/430 within the ortholog group:

Now the relationship of the orthogroup to 623:

Different threshold changes

The following is the number of significant genes (q-value < 0.05) for different cutoffs of the absolute log-fold change:

Threshold Pre.642 Post.642
0.8 5753 3088
1.0 4073 2147
1.5 1783 913
2.0 915 449

Log-fold change based on Different Drought Intervals

The following considers the number of DE genes with log-fold change cutoffs specific to drought/recovery intervals:

Summary of the number of genes found DE with lfc > 2 in the indicated direction in drought intervals:

                      root_Preflowering_BT642  root_Postflowering_BT642
DroughtTps_up         331                      189
DroughtTps_down       721                      318
EarlyDroughtTps_up    112                      187
EarlyDroughtTps_down  264                      398
LateDroughtTps_up     705                      198
LateDroughtTps_down   1404                     355

Summary of the number of genes found DE with lfc > 2 in the indicated direction in recovery intervals (pre-flowering only):

                       root_Preflowering_BT642
RecoveryTps_up         30
RecoveryTps_down       6
EarlyRecoveryTps_up    427
EarlyRecoveryTps_down  170
LateRecoveryTps_up     15
LateRecoveryTps_down   4
## [1] "2021-12-14 21:34:53 PST"
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.7     purrr_0.3.4    
##  [5] tidyr_1.1.4     tibble_3.1.6    ggplot2_3.3.5   tidyverse_1.3.1
##  [9] readr_2.0.2     rmarkdown_2.11  knitr_1.36      SCF_4.1.0      
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-7           fs_1.5.0               lubridate_1.8.0       
##  [4] bit64_4.0.5            httr_1.4.2             GenomeInfoDb_1.30.0   
##  [7] tools_4.1.2            backports_1.3.0        bslib_0.3.1           
## [10] utf8_1.2.2             R6_2.5.1               DBI_1.1.1             
## [13] BiocGenerics_0.40.0    colorspace_2.0-2       withr_2.4.2           
## [16] tidyselect_1.1.1       bit_4.0.4              compiler_4.1.2        
## [19] cli_3.1.0              rvest_1.0.2            Biobase_2.54.0        
## [22] xml2_1.3.2             sass_0.4.0             scales_1.1.1          
## [25] genefilter_1.76.0      digest_0.6.28          XVector_0.34.0        
## [28] pkgconfig_2.0.3        htmltools_0.5.2        highr_0.9             
## [31] limma_3.50.0           dbplyr_2.1.1           fastmap_1.1.0         
## [34] rlang_0.4.12           readxl_1.3.1           rstudioapi_0.13       
## [37] RSQLite_2.2.8          jquerylib_0.1.4        generics_0.1.1        
## [40] jsonlite_1.7.2         vroom_1.5.5            RCurl_1.98-1.5        
## [43] magrittr_2.0.1         GenomeInfoDbData_1.2.7 Matrix_1.3-4          
## [46] Rcpp_1.0.7             munsell_0.5.0          S4Vectors_0.32.2      
## [49] fansi_0.5.0            lifecycle_1.0.1        stringi_1.7.5         
## [52] yaml_2.2.1             zlibbioc_1.40.0        grid_4.1.2            
## [55] blob_1.2.2             parallel_4.1.2         crayon_1.4.2          
## [58] lattice_0.20-45        splines_4.1.2          Biostrings_2.60.2     
## [61] haven_2.4.3            annotate_1.72.0        hms_1.1.1             
## [64] KEGGREST_1.34.0        pillar_1.6.4           stats4_4.1.2          
## [67] reprex_2.0.1           XML_3.99-0.8           glue_1.5.0            
## [70] evaluate_0.14          modelr_0.1.8           png_0.1-7             
## [73] vctrs_0.3.8            tzdb_0.2.0             cellranger_1.1.0      
## [76] gtable_0.3.0           assertthat_0.2.1       cachem_1.0.6          
## [79] xfun_0.28              xtable_1.8-4           broom_0.7.10          
## [82] survival_3.2-13        AnnotationDbi_1.56.1   memoise_2.0.0         
## [85] IRanges_2.28.0         ellipsis_0.3.2