Update model implementations to use flash attention

Flash attention support has been added to Keras 3.
https://github.com/keras-team/keras/blob/25d6d80a6ecd31f0da52c325cd16dbe4a29b7329/keras/src/layers/attention/multi_head_attention.py#L55

However, some of the models implemented in KerasHub override the `_compute_attention()` method, which contains the mechanism that enables flash attention. Because of the override, those models never reach the flash attention code path. Their implementations need to be updated.
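
To illustrate the problem, here is a minimal, framework-free sketch. Every class name, the `flash_attention` flag, and the returned strings are hypothetical stand-ins, not the actual Keras or KerasHub code. It only shows why a full override skips the check that lives in the base method, and one possible way an updated override could keep it.

```python
class BaseAttention:
    """Hypothetical stand-in for a base layer whose method picks the attention path."""

    def __init__(self, flash_attention=False):
        self.flash_attention = flash_attention

    def _compute_attention(self, query, key, value):
        # The enabling mechanism lives here, inside the method itself.
        if self.flash_attention:
            return "fused"
        return "unfused"


class OverridingAttention(BaseAttention):
    """Replaces the method entirely, so the flag is never consulted."""

    def _compute_attention(self, query, key, value):
        return "unfused"


class UpdatedAttention(BaseAttention):
    """Keeps its custom path but defers to the base method when the flag is set."""

    def _compute_attention(self, query, key, value):
        if self.flash_attention:
            return super()._compute_attention(query, key, value)
        return "unfused"


# Record which path each class takes when flash attention is requested.
paths = {
    cls.__name__: cls(flash_attention=True)._compute_attention(None, None, None)
    for cls in (BaseAttention, OverridingAttention, UpdatedAttention)
}
print(paths)
```

Whether a given model can delegate to the parent method like this, or has to repeat the check inside its own implementation, depends on how much custom logic its override carries.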