Skip to content

Bug with clean_traj=False // improvement suggestions #1

@rafwiewiora

Description

@rafwiewiora

1) When using clean_traj=False with multiple trajectories, you call map_to_integers on each trajectory separately:

newseq, m_dict = map_to_integers(seq, seq_map)

and for each trajectory you start counting new state indexes from 0:
hence if the 1st trajectory doesn't contain all states, you'll map the 'new' states from further trajectories to index 0 etc. again

Extra suggestion: you return m_dict here:

newseq, m_dict = map_to_integers(seq, seq_map)

but use seq_map (which luckily had been modified in place by map_to_integers) eventually instead:

self.seq_map = seq_map

2) When using coarse_macrostates=True the original dtrajs are not copied and get modified:

traj[i] = stateA[0]

which is not great if you don't know about this and want to use the same dtrajs downstream...

3) You use **kwargs:

coarse_macrostates=False, **kwargs):

they don't actually get passed to anything, but cause trouble if you misspell the existing keyword arguments - there's no error, e.g. I kept misspelling clean_trajs instead of clean_traj and it took me a while to realize why my transition paths were coming out wrong.

4) fit using list dtrajs is 6x faster than with int64 numpy array, and 36x faster than with int32 numpy array -- it should convert arrays to lists.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions