You might have wondered what underlying mechanism and models the famous ChatGPT, the Artificial Intelligence (AI) chat assistant, employs. The T of GPT, that is, the Generative Pre-trained Transformer, refers to a Machine Learning (ML) model architecture called the "Transformer" that has been shaking the field for the last few years. As the title of the seminal paper "Attention Is All You Need," which introduced the transformer architecture to the literature, suggests, the essential part of this model is "attention." In my senior thesis, I propose an alternative implementation of this attention mechanism using quantum photonic circuitry. By using a variant of the functional form of the attention calculation (the Gaussian Radial Basis Function in place of the Softmax), I present the Quantum Photonic Transformer, or QPT for short. In my work, the QPT circuitry is described and implemented in computer code as a hybrid classical-quantum transformer. Using this implementation, my simulations demonstrate the viability of the QPT through its performance on the simple problem of 2-dimensional point classification. The importance of this work stems from the possible gains in computational speed for transformer models (recall how slowly ChatGPT responds to your queries, or how it forgets what you chatted about a few prompts ago), since multiple inputs can effectively be encoded in a single quantum light mode. Further work includes both a deeper exploration of this speedup and a physical implementation of the photonic circuit described.
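To make the substitution concrete, the following is a minimal, purely classical sketch (not the thesis code) contrasting standard Softmax attention with a Gaussian Radial Basis Function variant; the exact functional form, normalization, and bandwidth parameter used in the QPT are assumptions here, shown only to illustrate the kind of replacement described above.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def rbf_attention(Q, K, V, gamma=1.0):
    # Gaussian RBF attention (illustrative form): weights proportional to
    # exp(-gamma * ||q - k||^2), normalized over keys for each query.
    sq_dists = ((Q[:, None, :] - K[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-gamma * sq_dists)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy example: 3 queries, keys, and values of dimension 4.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 3, 4))
print(softmax_attention(Q, K, V))
print(rbf_attention(Q, K, V))
```

Both functions mix the value vectors with weights that depend on query-key similarity; the RBF version replaces the exponentiated dot product with a Gaussian of the squared distance, which is the kind of kernel that lends itself to a photonic realization.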