This document was last compiled on 2021-12-14 23:14:01.

Loading Data

## Reading in metadata from file: results/BT642Year2/data/root_meta.txt
## Reading in data from file: results/BT642Year2/data/root_normCounts.txt
## Reading in metadata from file: results/BT642Year2/data/leaf_meta.txt
## Reading in data from file: results/BT642Year2/data/leaf_normCounts.txt
## Total number of leaf samples: 125 
## Total number of genes in leaf samples: 22181 
## No additional bad samples to remove (probably removed during normalization)
## Removing the 13  samples identified as not part of the main experiment, from the leaf samples
## Variance filter for leaf samples:  Remove genes in the bottom 0.2 quantile of variability 
## Total number of genes after limiting to high variable genes : 17744 
## After filtering leaf samples: 112 samples, 17744 genes
## Total number of root samples: 133 
## Total number of genes in root samples: 23916 
## No additional bad samples to remove (probably removed during normalization)
## Removing the 13  samples identified as not part of the main experiment, from the root samples
## Variance filter for root samples:  Remove genes in the bottom 0.2 quantile of variability 
## Total number of genes after limiting to high variable genes : 19132 
## After filtering root samples: 120 samples, 19132 genes

Basic statistics on dataset

There are 35115 genes in the S. bicolor genome (based on annotation: gene_annotation_642.tsv). After low-expression filtering and normalization, we obtain 22181 genes in the leaf samples and 23916 genes in the root samples. If we additionally keep only the top 80% most variable genes (that is, remove the bottom 0.2 quantile of variability), we get 19132 genes in the root and 17744 genes in the leaf. (Note: we are not using this variance filter except in the clustering, and we need to double check that it is still being used in this way.) Unlike in our previous analysis, our differential expression analysis does not first apply a leaf-specific or root-specific expression filter after normalization to remove leaf-specific or root-specific genes.
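As a rough illustration of the variance filter reported in the log (removing genes in the bottom 0.2 quantile of variability), here is a minimal Python sketch. The function names `variance_filter` and `n_kept`, the use of population variance as the variability measure, and the rounding convention (drop the ceiling of 0.2 × number of genes) are all assumptions made for illustration; the actual pipeline's variability measure and quantile method are not shown in this report. The rounding convention was chosen only because it reproduces the gene counts in the log.

```python
import math
from statistics import pvariance


def n_kept(n_genes, drop_fraction=0.2):
    """Number of genes left after dropping the least-variable fraction.

    Assumption: the number dropped is rounded up (ceiling).
    """
    return n_genes - math.ceil(n_genes * drop_fraction)


def variance_filter(expr, drop_fraction=0.2):
    """Return the set of gene IDs kept by the variance filter.

    expr maps a gene ID to its list of normalized expression values,
    one value per sample (hypothetical input layout).
    """
    n_drop = math.ceil(len(expr) * drop_fraction)
    # Rank gene IDs from least to most variable across samples.
    ranked = sorted(expr, key=lambda gene: pvariance(expr[gene]))
    # Discard the n_drop least-variable genes and keep the rest.
    return set(ranked[n_drop:])


# Gene counts from the log: 22181 leaf genes and 23916 root genes.
print(n_kept(22181), n_kept(23916))  # 17744 19132
```

Under these assumptions the counts agree with the 17744 leaf genes and 19132 root genes reported after filtering.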

Annotation on genes:

We give some basic info on the annotations on all genes / orthogroups:

DE genes - Control versus Drought

Leaf samples

For leaf samples, out of the 17744 genes considered, here is the number of DE genes for each genotype under the two drought conditions. DE is called using the global measure, i.e., based on the spline function fit, with the q-value required to be less than 0.05. Log-fold change is based on the global log-fold change measure, MaxOfIntervals, which is required to be greater than 2 in absolute value.

Genotype  Preflowering (DE)  Preflowering (DE+lfc)  Postflowering (DE)  Postflowering (DE+lfc)
RT430     NA                 198                    NA                  34
BT642     7621               198                    910                 34

Intersect with ortholog information (only Remap Year)

We intersect the DE/lfc results above with the information about the ortholog group / gene and its relationship to the 642/430 mappings.

First we consider the relationship between 642/430 within the ortholog group:

Now the relationship of the orthogroup to 623:

Different threshold changes

The following are the numbers of significant genes (q-value < 0.05) at different cutoffs on the absolute log-fold change:

Threshold Pre.642 Post.642
0.8 1619 269
1.0 1022 178
1.5 406 81
2.0 198 34
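The threshold table above can be thought of as a sweep over cutoffs on the absolute log-fold change, restricted to genes passing the q-value cutoff. The following Python sketch shows that counting logic under stated assumptions: the function name `count_by_threshold`, the input layout (a list of `(qvalue, lfc)` pairs, one per gene), the strict inequalities, and the toy values are all hypothetical and are not taken from the pipeline.

```python
def count_by_threshold(genes, thresholds=(0.8, 1.0, 1.5, 2.0), q_cutoff=0.05):
    """Count significant genes whose |lfc| exceeds each threshold.

    genes is a list of (qvalue, lfc) pairs, where lfc stands in for the
    global log-fold change measure (hypothetical input layout).
    """
    # Keep the absolute log-fold change of genes passing the q-value cutoff.
    significant = [abs(lfc) for qvalue, lfc in genes if qvalue < q_cutoff]
    # For each threshold, count genes strictly above it in absolute value.
    return {t: sum(1 for a in significant if a > t) for t in thresholds}


# Toy input, invented for illustration only.
toy_genes = [(0.01, 2.5), (0.03, -1.2), (0.20, 3.0), (0.04, 0.9), (0.001, -0.5)]
print(count_by_threshold(toy_genes))
```

Because every gene counted at a stricter cutoff is also counted at a looser one, the counts can only decrease or stay equal as the threshold rises, which matches the pattern in the table. The 2.0 row corresponds to the DE+lfc columns reported earlier for BT642.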

Log-fold change based on Different Drought Intervals

The following considers the number of DE genes under log-fold-change cutoffs specific to the drought/recovery intervals:

Summary of the number of genes found DE with lfc > 2 in the indicated direction in drought intervals:

leaf_Preflowering_BT642 leaf_Postflowering_BT642
DroughtTps_up 52 4
DroughtTps_down 28 3
EarlyDroughtTps_up 7 0
EarlyDroughtTps_down 0 0
LateDroughtTps_up 167 34
LateDroughtTps_down 117 40

Summary of the number of genes found DE with lfc > 2 in the indicated direction in recovery intervals (pre-flowering only):

leaf_Preflowering_BT642
RecoveryTps_up 6
RecoveryTps_down 0
EarlyRecoveryTps_up 39
EarlyRecoveryTps_down 11
LateRecoveryTps_up 2
LateRecoveryTps_down 1
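The interval tables above split counts by direction. One plausible reading, sketched below in Python, is that "up" counts genes with an interval log-fold change above 2 and "down" counts genes with an interval log-fold change below -2. The function name `count_directional`, the input layout (interval name mapped to per-gene log-fold changes for genes already called DE), and the toy values are assumptions for illustration; how the pipeline defines each interval's log-fold change is not shown in this report.

```python
def count_directional(lfcs_by_interval, cutoff=2.0):
    """Count up- and down-regulated genes per interval.

    lfcs_by_interval maps an interval name (e.g. "LateDroughtTps") to the
    interval-specific log-fold changes of genes already called DE
    (hypothetical input layout).
    """
    counts = {}
    for interval, lfcs in lfcs_by_interval.items():
        # "up": log-fold change above the cutoff.
        counts[f"{interval}_up"] = sum(1 for x in lfcs if x > cutoff)
        # "down": log-fold change below the negated cutoff.
        counts[f"{interval}_down"] = sum(1 for x in lfcs if x < -cutoff)
    return counts


# Toy input, invented for illustration only.
toy_lfcs = {
    "EarlyDroughtTps": [0.4, -2.3, 2.1],
    "LateDroughtTps": [3.0, 2.6, -0.9],
}
print(count_directional(toy_lfcs))
```

Under this reading a gene can be counted in more than one interval, so the interval counts need not sum to the global DE+lfc totals.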

Root samples

For root samples, out of the 19132 genes considered, here is the number of DE genes for each genotype under the two drought conditions. DE is called using the global measure, i.e., based on the spline function fit, with the q-value required to be less than 0.05. Log-fold change is based on the global log-fold change measure, MaxOfIntervals, which is required to be greater than 2 in absolute value.

Genotype  Preflowering (DE)  Preflowering (DE+lfc)  Postflowering (DE)  Postflowering (DE+lfc)
RT430     NA                 657                    NA                  187
BT642     15882              657                    9150                187

Intersect with ortholog information (Remap Year)

We intersect the DE/lfc results above with the information about the ortholog group / gene and its relationship to the 642/430 mappings.

First we consider the relationship between 642/430 within the ortholog group:

Now the relationship of the orthogroup to 623:

Different threshold changes

The following are the numbers of significant genes (q-value < 0.05) at different cutoffs on the absolute log-fold change:

Threshold Pre.642 Post.642
0.8 4254 2101
1.0 2950 1302
1.5 1348 470
2.0 657 187

Log-fold change based on Different Drought Intervals

The following considers the number of DE genes under log-fold-change cutoffs specific to the drought/recovery intervals:

Summary of the number of genes found DE with lfc > 2 in the indicated direction in drought intervals:

root_Preflowering_BT642 root_Postflowering_BT642
DroughtTps_up 229 65
DroughtTps_down 303 138
EarlyDroughtTps_up 104 36
EarlyDroughtTps_down 121 98
LateDroughtTps_up 386 107
LateDroughtTps_down 593 269

Summary of the number of genes found DE with lfc > 2 in the indicated direction in recovery intervals (pre-flowering only):

root_Preflowering_BT642
RecoveryTps_up 73
RecoveryTps_down 15
EarlyRecoveryTps_up 326
EarlyRecoveryTps_down 177
LateRecoveryTps_up 17
LateRecoveryTps_down 0
## [1] "2021-12-14 23:14:57 PST"
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.7     purrr_0.3.4    
##  [5] tidyr_1.1.4     tibble_3.1.6    ggplot2_3.3.5   tidyverse_1.3.1
##  [9] readr_2.0.2     rmarkdown_2.11  knitr_1.36      SCF_4.1.0      
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-7           fs_1.5.0               lubridate_1.8.0       
##  [4] bit64_4.0.5            httr_1.4.2             GenomeInfoDb_1.30.0   
##  [7] tools_4.1.2            backports_1.3.0        bslib_0.3.1           
## [10] utf8_1.2.2             R6_2.5.1               DBI_1.1.1             
## [13] BiocGenerics_0.40.0    colorspace_2.0-2       withr_2.4.2           
## [16] tidyselect_1.1.1       bit_4.0.4              compiler_4.1.2        
## [19] cli_3.1.0              rvest_1.0.2            Biobase_2.54.0        
## [22] xml2_1.3.2             sass_0.4.0             scales_1.1.1          
## [25] genefilter_1.76.0      digest_0.6.28          XVector_0.34.0        
## [28] pkgconfig_2.0.3        htmltools_0.5.2        highr_0.9             
## [31] limma_3.50.0           dbplyr_2.1.1           fastmap_1.1.0         
## [34] rlang_0.4.12           readxl_1.3.1           rstudioapi_0.13       
## [37] RSQLite_2.2.8          jquerylib_0.1.4        generics_0.1.1        
## [40] jsonlite_1.7.2         vroom_1.5.5            RCurl_1.98-1.5        
## [43] magrittr_2.0.1         GenomeInfoDbData_1.2.7 Matrix_1.3-4          
## [46] Rcpp_1.0.7             munsell_0.5.0          S4Vectors_0.32.2      
## [49] fansi_0.5.0            lifecycle_1.0.1        stringi_1.7.5         
## [52] yaml_2.2.1             zlibbioc_1.40.0        grid_4.1.2            
## [55] blob_1.2.2             parallel_4.1.2         crayon_1.4.2          
## [58] lattice_0.20-45        splines_4.1.2          Biostrings_2.60.2     
## [61] haven_2.4.3            annotate_1.72.0        hms_1.1.1             
## [64] KEGGREST_1.34.0        pillar_1.6.4           stats4_4.1.2          
## [67] reprex_2.0.1           XML_3.99-0.8           glue_1.5.0            
## [70] evaluate_0.14          modelr_0.1.8           png_0.1-7             
## [73] vctrs_0.3.8            tzdb_0.2.0             cellranger_1.1.0      
## [76] gtable_0.3.0           assertthat_0.2.1       cachem_1.0.6          
## [79] xfun_0.28              xtable_1.8-4           broom_0.7.10          
## [82] survival_3.2-13        AnnotationDbi_1.56.1   memoise_2.0.0         
## [85] IRanges_2.28.0         ellipsis_0.3.2