Some interesting findings from the history of the attention mechanism

The genesis of this blog post stems from a thoughtful discussion I had with my roommate. The conversation was prompted by the recent widespread popularity of ChatGPT, which is built on the transformer architecture and has consequently put the attention mechanism back in the spotlight. While there has been much discussion of the attention mechanism, it is widely known for its "q-k-v" structure, a formulation that has been standard since the paper "Attention Is All You Need" introduced it. But this prompts the question: where did it originate? That thought-provoking question is the basis of this blog post.
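For readers who have not seen the "q-k-v" structure written out, it refers to the scaled dot-product attention defined in that paper: queries $Q$ are matched against keys $K$, and the resulting weights are used to combine the values $V$:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V,$$

where $d_k$ is the key dimension, used to scale the dot products before the softmax.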

Clarifying the boundaries of the problems ChatGPT can solve

Ruiqiang Xiao
MSc student at HKUST

My research interests include distributed robotics, mobile computing and programmable matter.