Releases: Plant-Food-Research-Open/TEgenomeSimulator
v1.0.0
What's new in the latest TEgenomeSimulator (v1.0.0)?
- A new flag --to_mask under Custom Genome mode (--mode 1 or -M 1) enables the use of RepeatMasker to mask TEs in a user-provided real genome, followed by TE removal to generate the TE-depleted genome.
  - This option has to be used in conjunction with --repeat2, which specifies the TE library file for repeat masking.
  - Note that users still need to specify the TE library used for simulation via the argument --repeat.
  - Users may specify different TE libraries or use the same library for --repeat and --repeat2.
- A new distribution for the probability of sequence integrity has been implemented for Random Synthesized Genome mode (--mode 0 or -M 0) and Custom Genome mode (--mode 1 or -M 1).
  - Here sequence integrity refers to the length ratio of an inserted TE sequence relative to its consensus sequence. For example, if a TE sequence to be inserted into the synthetic genome is 800 bp and the corresponding consensus sequence is 1 kb, the integrity is 0.8.
  - TEgenomeSimulator now uses a beta distribution to model the sequence integrity of each TE family.
  - The alpha and beta values are 0.5 and 0.7 by default, which creates an asymmetric U-shaped distribution in which the probability of low sequence integrity (e.g. <10%) is higher than that of high integrity (e.g. >90%). You can use this online tool to see what the distribution looks like with different alpha and beta values.
  - Users can use the parameter -a or --alpha to specify the alpha value, and -b or --beta to specify the beta value.
  - The simulator also allows users to specify a fixed proportion of total TE copies to be intact TE insertions, using the parameter -i or --intact. Intact TE loci are kept at the same length as the original TE family sequence.
- A new mode, TE Composition Approximation mode (--mode 2 or -M 2), allows users to create a simulated genome whose TE composition approximates the real TE make-up of the original genome.
  - TEgenomeSimulator does this by masking and removing the original TEs using RepeatMasker with the TE library supplied via --repeat2, and then simulates TE mutations based on the prior information extracted from the RepeatMasker output file.
  - The remaining steps, random and nested TE insertions, are the same as in the first two modes.
  - Note that the simulator uses the TE library specified by the --repeat argument for TE mutagenesis and insertions. It is therefore crucial to specify the same TE library with --repeat and --repeat2 for this genome approximation simulation.
- New installation option using the Apptainer image: TEgenomeSimulator_v1.0.0.sif
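The dependency between --to_mask, --repeat and --repeat2 described in the notes above can be pictured as a small option check. This is only a sketch under assumptions: the function `check_mask_options`, its dictionary input, and the file name `te_lib.fa` are hypothetical and not part of TEgenomeSimulator; only the flag names come from the release notes.

```python
def check_mask_options(opts):
    """Return a list of problems found in a hypothetical option dictionary.

    Keys mirror the documented flags: to_mask, repeat, repeat2.
    The checks encode only what the release notes state.
    """
    problems = []
    # --repeat is still needed: it names the TE library used for simulation.
    if not opts.get("repeat"):
        problems.append("--repeat is required")
    # --to_mask must be accompanied by --repeat2 (the library used for masking).
    if opts.get("to_mask") and not opts.get("repeat2"):
        problems.append("--to_mask requires --repeat2")
    return problems


# The same library may be given for both --repeat and --repeat2.
ok = check_mask_options(
    {"to_mask": True, "repeat": "te_lib.fa", "repeat2": "te_lib.fa"}
)
missing = check_mask_options({"to_mask": True, "repeat": "te_lib.fa"})
print(ok)
print(missing)
```

How the simulator itself reports a missing --repeat2 is not stated in the notes, so the messages here are placeholders.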
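To get a feel for the sequence integrity model described above, the following sketch draws integrity values from a beta distribution with the default alpha = 0.5 and beta = 0.7 and sets aside a fixed share of intact copies. It is an illustration, not the simulator's code: the function names, the use of Python's standard `random` module, and the way the intact share is combined with the beta draws are all assumptions.

```python
import random


def sample_integrity(n_copies, alpha=0.5, beta=0.7, intact=0.0, seed=None):
    """Draw one integrity value (between 0 and 1) per TE copy.

    A fraction `intact` of the copies is fixed at 1.0 (full length);
    the remaining copies are drawn from Beta(alpha, beta).
    """
    rng = random.Random(seed)
    n_intact = round(n_copies * intact)
    values = [1.0] * n_intact
    values += [rng.betavariate(alpha, beta) for _ in range(n_copies - n_intact)]
    return values


def inserted_length(consensus_length, integrity):
    """Length in bp of an inserted copy, given its integrity."""
    return round(consensus_length * integrity)


draws = sample_integrity(20000, seed=1)
low = sum(v < 0.1 for v in draws) / len(draws)
high = sum(v > 0.9 for v in draws) / len(draws)
# With the defaults, more copies fall below 0.1 than above 0.9.
print(f"share below 0.1: {low:.3f}, share above 0.9: {high:.3f}")
print(inserted_length(1000, 0.8))  # the 1 kb consensus example: 800 bp
```

Raising alpha relative to beta shifts the mass toward high integrity, which is one way to explore the effect of -a and -b before running a simulation.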
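For TE Composition Approximation mode, the notes do not spell out which prior information is taken from the RepeatMasker output. As a rough sketch, assuming that per-family copy number and mean percent divergence are among the quantities of interest, a RepeatMasker `.out` table could be summarised as below. The function name and the three example rows are invented for illustration; only the column layout follows the standard RepeatMasker `.out` format.

```python
from collections import defaultdict


def summarise_repeatmasker_out(lines):
    """Summarise RepeatMasker .out rows per repeat family.

    Returns {family: (copy_count, mean_percent_divergence)}.
    """
    counts = defaultdict(int)
    div_sums = defaultdict(float)
    for line in lines:
        fields = line.split()
        # Data rows have at least 14 columns and start with an integer score;
        # this skips the two header lines and any blank lines.
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        family = fields[9]  # name of the matching repeat
        counts[family] += 1
        div_sums[family] += float(fields[1])  # percent divergence
    return {fam: (counts[fam], div_sums[fam] / counts[fam]) for fam in counts}


# Invented rows in .out layout, for illustration only.
example = """\
   SW   perc perc perc  query     position in query    matching  repeat        position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family  begin end (left)  ID

 1320   10.0  6.2  0.0  chr1      1     468   (9532)   +  Copia1  LTR/Copia    1     500  (0)    1
  980   20.0  1.0  2.0  chr1      900   1300  (8700)   C  Copia1  LTR/Copia    (50)  450  30     2
  450    5.0  0.0  0.0  chr1      2000  2200  (7800)   +  hAT3    DNA/hAT      1     200  (300)  3
""".splitlines()

summary = summarise_repeatmasker_out(example)
print(summary)
```

Because the family names in this table come from the library given to RepeatMasker, a summary like this only lines up with the simulation library when --repeat and --repeat2 point to the same TE library, which is the reason given in the notes for keeping them identical.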
v0.1.0
Version 0.1.0
The first release of TEgenomeSimulator.
What's Changed
- Scripted by @ting-hsuan-chen
- Fixed typo in README by @CeciliaDeng in #1
- Comments after reviewing TEgenomeSimulator by @oliviaAB in #2
New Contributors
- @CeciliaDeng made their first contribution in #1
- @oliviaAB made their first contribution in #2
Full Changelog: https://github.com/PlantandFoodResearch/TEgenomeSimulator/commits/v0.1.0