
Releases: Plant-Food-Research-Open/TEgenomeSimulator

v1.0.0

02 May 02:52
1e4ca3b


What's new in the latest TEgenomeSimulator (v1.0.0)?

  • A new flag, --to_mask, under Custom Genome mode (--mode 1 or -M 1) enables the use of RepeatMasker to mask TEs in a user-provided real genome. The masked TEs are then removed to generate the TE-depleted genome.

    • This option must be used together with --repeat2, which specifies the TE library file for repeat masking.
    • Note that users still need to specify the TE library used for simulation via the argument --repeat.
    • Users may specify different TE libraries or use the same library for --repeat and --repeat2.
  • A new distribution for the probability of sequence integrity has been implemented for Random Synthesized Genome mode (--mode 0 or -M 0) and Custom Genome mode (--mode 1 or -M 1).

    • Here sequence integrity refers to the length ratio of an inserted TE sequence relative to its consensus sequence. For example, if a TE sequence to be inserted into the synthetic genome is 800 bp and the corresponding consensus sequence is 1 kb, the integrity is 0.8.
    • TEgenomeSimulator now uses a beta distribution to model the sequence integrity of each TE family.
    • The alpha and beta values are 0.5 and 0.7 by default, which creates an asymmetric U-shaped distribution in which the probability of low sequence integrity (e.g. <10%) is higher than that of high integrity (e.g. >90%). You can use this online tool to see what the distribution looks like with different alpha and beta values.
    • Users can use the parameter -a or --alpha to specify the alpha value, and -b or --beta to specify the beta value.
    • The simulator also allows users to specify a fixed proportion of total TE copies to be intact TE insertions, using the parameter -i or --intact. Intact TE loci keep the same length as the original TE family sequence.
  • A new mode, TE Composition Approximation mode (--mode 2 or -M 2), lets users create a simulated genome whose TE composition approximates the real TE make-up of the original genome.

    • TEgenomeSimulator does this by masking and removing the original TEs using RepeatMasker with the TE library supplied via --repeat2, and then simulating TE mutations based on the prior information extracted from the RepeatMasker output file.
    • The remaining steps, random and nested TE insertions, are the same as in the first two modes.
    • Note that the simulator uses the TE library specified by the --repeat argument for TE mutagenesis and insertions. It is therefore crucial to specify the same TE library with --repeat and --repeat2 for this genome approximation simulation.
  • New installation option using the Apptainer image: TEgenomeSimulator_v1.0.0.sif
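
The sequence-integrity model described above can be illustrated with a minimal Python sketch. This is not TEgenomeSimulator's own code: the function names `sample_integrity` and `inserted_length` are hypothetical, and the way the intact proportion and the rounding are applied here is an assumption made for illustration. Only the default alpha (0.5) and beta (0.7) values and the definition of integrity come from the notes above.

```python
import random


def sample_integrity(n_copies, alpha=0.5, beta=0.7, intact=0.0, seed=None):
    """Draw one integrity value per TE copy (hypothetical helper).

    A fixed proportion `intact` of the copies is set to 1.0 (full length);
    the rest are drawn from a beta distribution, as the release notes describe.
    How the real simulator combines the two is an assumption here.
    """
    rng = random.Random(seed)
    n_intact = round(n_copies * intact)
    values = [1.0] * n_intact
    values += [rng.betavariate(alpha, beta) for _ in range(n_copies - n_intact)]
    rng.shuffle(values)  # mix intact and fragmented copies
    return values


def inserted_length(consensus_len, integrity):
    """Length in bp of the inserted copy for a given integrity (assumed rounding)."""
    return max(1, round(consensus_len * integrity))


if __name__ == "__main__":
    values = sample_integrity(20000, seed=1)
    low = sum(v < 0.1 for v in values)
    high = sum(v > 0.9 for v in values)
    # With the default parameters, low-integrity copies should outnumber
    # high-integrity ones (the asymmetric U shape).
    print("copies with integrity < 0.1:", low)
    print("copies with integrity > 0.9:", high)
    print("1 kb consensus at integrity 0.8:", inserted_length(1000, 0.8), "bp")
```

Changing `alpha` and `beta` in the call shifts the balance between heavily truncated and near-complete copies, which mirrors what -a/--alpha and -b/--beta are described as controlling.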

v0.1.0

27 Jan 02:18
bf4a3a6


Version 0.1.0

The first release of TEgenomeSimulator.

What's Changed

New Contributors

Full Changelog: https://github.com/PlantandFoodResearch/TEgenomeSimulator/commits/v0.1.0