TryHackMe

TryHackMe Evil-GPT v2: Exploring Basic Prompt Injection

Vic Galvan

28 Jul 2025 — 2 min read

"Put your LLM hacking skills to the test one more time."

Evil-GPT v2 is an easy TryHackMe room created by hadrian3689 and h4sh3m00. It involves exploiting a vulnerable large language model (LLM) to reveal a flag.

My first attempt involved simply asking for the flag. However, the LLM didn't give up its secrets that easily. Direct flag requests are immediately rejected.

However, the LLM can be fooled by utilizing a prompt injection vulnerability. Prompt Injection vulnerabilities occur when a user manipulates an LLM's behaviour or output in unintended ways. A common way to bypass an LLM's restrictions is to make it believe it's in debugging mode before making your request.

When used on this LLM, we are given the flag.

The following resources were used during the development of this post.

Resources

Tips on attacking | Tensor Trust

Rise to the top of the Tensor Trust leaderboard by fooling AI language models, and help researchers make more secure AI along the way.

Tensor Trust

TryHackMe Write-up: TryHack3M: Bricks Heist

TryHack3M: Bricks Heist is a WordPress exploitation and wallet enumeration room on TryHackMe.

HackTheBox Write-Up: ElectricBreeze-1

ElectricBreeze-1 is a very easy Sherlock created by VivisGhost on HackTheBox. Sherlocks are HackTheBox's "investigative Capture The Flags".

Starting From the Bottom, and It Feels Better Than I Thought It Would

After three years as a cybersecurity researcher, I’m rebuilding my knowledge from the ground up. By revisiting cybersecurity fundamentals and filling in gaps I skipped early on, I’m creating a stronger, more complete understanding of the field. It's going better than I expected.

TryHackMe: Active Directory Basics

Welcome to my writeup of TryHackMe's Active Directory Basics room! Let's dive into it. Windows Domains Content Windows Domain - a group of users and computers under the administration of a given business. Active Directory (AD) - a centralized repository of common components of a Windows

Resources

Read more

TryHackMe Write-up: TryHack3M: Bricks Heist

HackTheBox Write-Up: ElectricBreeze-1

Starting From the Bottom, and It Feels Better Than I Thought It Would

TryHackMe: Active Directory Basics