Mohammed Kashif


Flash Attention

Flash Attention is an IO-aware, exact attention algorithm that speeds up training and inference of LLMs and VLMs. Instead of materializing the full attention score matrix in GPU high-bandwidth memory, it tiles the computation into blocks that fit in fast on-chip SRAM and combines them with an online softmax, cutting memory traffic without changing the result.
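
To make the core idea concrete, here is a minimal NumPy sketch of tiled attention with an online softmax, compared against the naive version. This is not the real fused CUDA kernel; the function names and block size are illustrative, and it only shows why the full N x N score matrix never needs to exist at once.

```python
import numpy as np

def naive_attention(Q, K, V):
    """Standard attention: materializes the full (N, N) score matrix."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def flash_attention_sketch(Q, K, V, block_size=64):
    """Tiled attention with an online softmax: processes K/V in blocks,
    so only an (N, block_size) slice of scores exists at any time."""
    N, d = Q.shape
    out = np.zeros((N, d))
    row_max = np.full(N, -np.inf)   # running max of scores per query row
    row_sum = np.zeros(N)           # running softmax denominator per row
    for start in range(0, N, block_size):
        Kb = K[start:start + block_size]
        Vb = V[start:start + block_size]
        scores = Q @ Kb.T / np.sqrt(d)            # (N, block_size) only
        new_max = np.maximum(row_max, scores.max(axis=-1))
        # Rescale what was accumulated under the old running max.
        correction = np.exp(row_max - new_max)
        p = np.exp(scores - new_max[:, None])
        row_sum = row_sum * correction + p.sum(axis=-1)
        out = out * correction[:, None] + p @ Vb
        row_max = new_max
    return out / row_sum[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((256, 32)) for _ in range(3))
    assert np.allclose(naive_attention(Q, K, V),
                       flash_attention_sketch(Q, K, V), atol=1e-6)
    print("tiled result matches naive attention")
```

The real kernel applies the same rescaling trick per thread block on the GPU, which is what lets it avoid ever writing the attention matrix to high-bandwidth memory.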