Optical endoscopy is the most commonly applied procedure for inspecting the gastrointestinal (GI) tract. It encompasses a range of approaches and designs, from flexible optical scopes to swallowable camera capsules that acquire photographs as they advance through the digestive tract. Despite its wide use in GI diagnostics and theranostics, optical visualization allows only a superficial inspection of the wall lining (mucosa), which limits the ability to obtain information from deeper GI layers. As part of the effort to develop methods that visualize beneath the mucosal surface, we review herein progress in optoacoustic endoscopy, a technique that captures optical contrast at high resolution deep inside tissue. Optoacoustic endoscopy combines imaging of optical contrast with the resolution and depth penetration afforded by ultrasonography, thus merging characteristics that are highly advantageous for clinical applications. We discuss the current status of the technology, its key endoscopic competitors, and the challenges for clinical application. We further offer a perspective on future directions and the overall potential of the technique to complement the current state of the art.