The Cybersecurity Incident I Encountered and How I Responded

Recently, my startup project in Japan faced a cyber attack where the attacker abused the SMS verification API, leading to an inflated SMS service bill. The situation has now calmed down. I would like to reflect on the cybersecurity incidents I have encountered over the years and the lessons I have learned from them.

A Domestic Content Platform (2016 - 2018)

This content platform was my second project during my first entrepreneurial venture. Since it was my first time starting a business, I didn't deliberately implement any defenses. Another reason was that the overall environment at that time was not as hostile as it is now, so an ordinary internet project might go a whole year with few or no DDoS attacks even without defenses. People adapt to their environment, and it is hard to think beyond the conditions of one's time.

DDoS Attack in 2017

At that time, my response was to buy "high defense" (DDoS-protected) IPs from cloud vendors. In hindsight, this was quite foolish, because the high defense products from the various cloud vendors are essentially a tax on intelligence. The real cost of a high defense IP is only a few hundred dollars a month, yet vendors price them at tens of thousands, and using one requires changing your DNS records. You never know when an attacker will strike, so in practice you only switch to high defense when under attack, and when the month is over, you switch back to a regular IP.

The problems with this protection method are:

By the time you switch to a high defense IP, the service has already been taken down;
It requires manual operation, and after modifying the resolution, the DNS cache on the user's side needs to refresh before the service can be restored;
High defense IPs are expensive; cloud vendors reason that if you are worth attacking, you must be a high-value customer, and by the logic of supply and demand they treat you as a fool to be fleeced.

However, at that time, the project was quite profitable, and I was indeed a naive fool, so I foolishly bought a month of high defense every time I was attacked, without feeling there was anything wrong with this approach.

As I mentioned earlier, the overall cybersecurity environment at that time was relatively mild. After a few attacks, the attackers realized I could quickly switch to high defense and left in frustration.

Docker Vulnerability Intrusion in 2016

Still at this content platform, the Docker daemon was listening on a public port. If the firewall allows access to that port, anyone who connects effectively gains control of the server. One day we noticed, for no obvious reason, that the server was performing poorly, and upon inspection we found it was mining cryptocurrency 😅. The mining program had been fetched by a downloader, which had been introduced through Docker. It was quite an embarrassing and even somewhat amusing episode.
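A quick way to check for this kind of exposure is to test, from a machine outside your network, whether the Docker API port accepts TCP connections. The following is a minimal sketch, assuming the daemon uses the conventional unencrypted port 2375 (the post does not say which port was exposed); is_port_open is a hypothetical helper name.

```python
import socket

DOCKER_PLAIN_PORT = 2375  # conventional unencrypted Docker API port (assumption)


def is_port_open(host: str, port: int = DOCKER_PLAIN_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: the port is closed or filtered.
        return False
```

Calling is_port_open("your.server.ip") from an external machine and getting True suggests the daemon is reachable from the internet and should be firewalled or bound to a local socket.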

XSS Attack Attempts in 2017

During that time, I often saw people posting <script>alert('test')</script>, <script src="foo/xxx.js"></script> in the comments section. I knew what they were trying to do, so I traced the foo/xxx.js to find their XSS platform domain, then found the registrant's Weibo and directly messaged them to stop their futile attempts 🤣. They replied with a string of ellipses. I felt they were probably quite young, likely a high school student, much like I was back then.

Because front-end frameworks like Vue escape HTML by default in their {{ }} interpolation syntax, I hardly needed to do anything; I just had to be careful to sanitize any HTML that was rendered as raw markup, using a library like DOMPurify.
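To illustrate what output escaping does to such a comment, here is a minimal sketch that uses Python's standard html.escape as a stand-in. Vue's own implementation is JavaScript; this only demonstrates the principle.

```python
from html import escape

comment = "<script>alert('test')</script>"

# Special characters become HTML entities, so the browser displays the
# text instead of executing it.
safe = escape(comment)
print(safe)  # &lt;script&gt;alert(&#x27;test&#x27;)&lt;/script&gt;
```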

PixelCloud (2021 - now)

PixelCloud is my third startup project, which began in 2021. By this time, the cybersecurity environment had changed completely, with automated attack bots running rampant. A moment's inattention could quickly turn you into a lamb waiting to be slaughtered. Typical evidence of this is the emergence of platforms like Shodan, which index the entire IPv4 address space. An IPv4 address consists of 4 x 8 = 32 bits, for a total of 2^32 = 4,294,967,296 addresses. As hardware has improved in line with Moore's Law, it is no longer difficult for an ordinary person to scan the entire IPv4 range, so it is normal for large cybersecurity companies to sweep the ports of every IPv4 address on a regular basis. Websites like Shodan emerged as a result, and I will come back to them later.
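A rough back-of-the-envelope calculation shows why a full scan is feasible. The probe rate below is an illustrative assumption, not a figure from this post.

```python
TOTAL_IPV4 = 2 ** 32  # 4,294,967,296 addresses


def scan_hours(probes_per_second: float, ports: int = 1) -> float:
    """Hours needed to send one probe to every IPv4 address on `ports` ports."""
    return TOTAL_IPV4 * ports / probes_per_second / 3600


# Assumed rate: 1,000,000 probes per second (illustrative only).
print(f"{scan_hours(1_000_000):.2f} hours")  # 1.19 hours
```

At that assumed rate, a single port across the whole address space takes a little over an hour.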

Attacks on Website Server Hosts

Due to the harsh network environment, almost no website operates without a CDN anymore. The website's domain resolves to the CDN, so attackers cannot learn the origin server's IP address; they can only see the CDN's cluster IPs, which means they can only attack the CDN and not your server.

However, my server was still attacked. Why? Because when an HTTPS request arrives for a server name that Nginx does not recognize (for example, a request made directly to the IP address), Nginx still presents a certificate to the client, and that certificate contains the domain name. This allows platforms like Shodan to scan your server's IP and record the domain from the certificate. Attackers can then search for the domain on Shodan and immediately learn the origin address, bypassing the CDN to attack the server directly.

The solution is to use a newer version of Nginx (1.19.4 or later), which supports the ssl_reject_handshake directive. When the requested server name does not match (for example, when accessing via IP), Nginx aborts the TLS handshake without presenting a certificate, which prevents the domain from leaking.
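As a minimal sketch of that setup, assuming Nginx 1.19.4 or later, where the directive is spelled ssl_reject_handshake in the Nginx documentation: the catch-all default server rejects handshakes for unknown names, while the real site is served by its own block. The domain example.com and the certificate file names are placeholders, not values from my servers.

```nginx
# Catch-all: any TLS handshake whose server name matches no other block
# (including direct access by IP) is rejected before a certificate is sent.
server {
    listen               443 ssl default_server;
    ssl_reject_handshake on;
}

# The real site, matched by server name.
server {
    listen              443 ssl;
    server_name         example.com;
    ssl_certificate     example.com.crt;
    ssl_certificate_key example.com.key;
}
```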

A few months ago, Mitsubishi UFJ Bank suffered a DDoS attack, and someone on Twitter wondered why such a large bank didn't use a CDN. Of course, they did; they used Akamai. They likely made the same mistake I did, allowing hackers to bypass the CDN and attack directly.

Searching for mufg on Shodan reveals the source IP immediately.

Attacks on Object Storage

Attackers used their own downlink bandwidth to download files over and over and inflate my traffic bill, which caused the cloud vendor to shut down my object storage and effectively halted the related business operations.

1 TB of traffic was consumed in just half an hour.

The solution was to require human verification in the API that issues signed download URLs. Specifically, since PixelCloud is a domestic project, we used Geetest.
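The idea can be sketched as follows: the API only hands out a short-lived signed URL once the captcha check has passed. This is a simplified stand-in using an HMAC signature, not the object storage vendor's actual presigning mechanism or Geetest's API; SECRET_KEY, issue_download_url, and is_valid are hypothetical names, and the captcha result is passed in as a plain boolean.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"replace-me"  # hypothetical signing key, kept server-side


def _signature(path: str, expires: int) -> str:
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_download_url(path: str, captcha_passed: bool,
                       ttl: int = 300, now: Optional[float] = None) -> str:
    """Issue a short-lived signed URL, but only after human verification."""
    if not captcha_passed:
        raise PermissionError("human verification required")
    expires = int((time.time() if now is None else now) + ttl)
    return f"{path}?expires={expires}&sig={_signature(path, expires)}"


def is_valid(path: str, expires: int, sig: str, now: Optional[float] = None) -> bool:
    """Check both the signature and the expiry time of a download request."""
    current = time.time() if now is None else now
    return current <= expires and hmac.compare_digest(_signature(path, expires), sig)
```

Because every URL expires quickly and cannot be minted without passing the captcha, an attacker can no longer replay one link indefinitely to burn traffic.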

Attacks on CDN

Attackers also used similar methods against the CDN, obtaining the URL of a small file and then inflating the CDN's traffic bill through downlink bandwidth. This was resolved by setting a request limit per IP on the CDN of the T cloud vendor.

DDoS Attacks on Game Servers

Game servers cannot use a CDN because game traffic operates at L4, while a CDN (Content Delivery Network) typically implies HTTP at L7. In China there is no affordable L4 protection service comparable to Cloudflare Spectrum; Tencent's EdgeOne and Cloudflare Spectrum itself are both quite expensive and unrealistic for a startup.

To defend against DDoS attacks at a low cost, my partner came up with a solution. We wrote a program that connects to the cloud vendor's API to generate four persistent L4 proxy virtual machines. If one virtual machine is taken down, or if the corresponding EIP is taken down, we generate more EIPs and virtual machines exponentially until the attack stops. For example, if one is taken down, we generate two; if two are taken down, we generate four. Once the attack stops, the excess EIPs and virtual machines are automatically deleted.
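The scaling rule described above can be sketched as a small state machine. This is only the bookkeeping, under the assumption that each takedown event is handled as it is detected; the actual calls to the cloud vendor's API for creating and deleting EIPs and virtual machines are omitted, and ProxyPool is a hypothetical name.

```python
from dataclasses import dataclass

BASELINE = 4  # persistent L4 proxies kept running at all times


@dataclass
class ProxyPool:
    healthy: int = BASELINE

    def on_takedown(self, taken_down: int) -> int:
        """Drop the dead proxies and provision twice as many replacements."""
        if not 0 <= taken_down <= self.healthy:
            raise ValueError("taken_down must be between 0 and the pool size")
        replacements = 2 * taken_down
        self.healthy = self.healthy - taken_down + replacements
        return replacements

    def on_attack_stopped(self) -> int:
        """Delete the excess proxies and return how many were removed."""
        excess = max(self.healthy - BASELINE, 0)
        self.healthy -= excess
        return excess
```

For example, starting from four proxies, losing one provisions two (five healthy), losing two more provisions four (seven healthy), and when the attack ends the three extras are deleted.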

This method is beneficial because both cloud servers and EIPs are billed hourly, so you only pay for the hours during which the attack lasts.

This method operated well for about a year. However, we encountered unexpected issues. Mainly, the U cloud vendor was unhappy.

Because we did not use the cloud vendor's high defense but instead "unilaterally" built an elastic high defense solution, the U cloud vendor was quite displeased. The account manager's exact words were, "You are the only one who has been attacked this many times and still hasn't purchased high defense." Yet when I first moved from vendor A to vendor U, I had discussed this solution in detail with their business team, and they had approved it. And at vendor A, no one ever bothered me, no matter how large the attack was.

During the Spring Festival of 2023, we faced a significant attack that took down more than ten IPs at once. The U cloud vendor then revoked our permission to apply for EIPs, forcing us to buy their high defense. However, by that time, the attack had already ended, and we barely held on, so we didn't want to buy high defense. We had to take down the dynamic DDoS defense system and could only use the existing EIPs for defense. Normally, we used two EIPs; if one was taken down, we manually re-bound a new EIP to the L4 proxy virtual machine in the U cloud vendor's backend and then manually modified the DNSPod resolution. After 24 hours, the previously taken down EIP could be used again.

After about half a year of this, we experienced a small attack approximately every two weeks, causing some players to experience lag in the game, but it wasn't a major issue until June 20. On June 20, we faced another significant attack that exhausted all available EIPs, forcing us to buy high defense. The situation was urgent, and the angry U vendor's "backend" even threatened to terminate our service. I had no choice but to purchase the U vendor's high defense under pressure, spending 15k.

PixelCloud's profit margins are much thinner than those of the previous content platform, and 15k was unacceptable. I had to start looking for other high defense solutions, and only then did I finally realize that renting a high defense server costs only a few hundred dollars a month!

The subsequent steps were straightforward: we moved the L4 proxy to the high defense server. The DDoS issue was thus completely resolved.

A Certain Content Platform in Japan (2023 - now)

SMS Abuse Round 1

This project is my fourth startup, primarily targeting the global market, and it does not require L4 protection, allowing us to use Cloudflare for free, essentially eliminating DDoS concerns.

Recently, we faced another cyber attack where the attacker inflated my SMS service bill by automating requests to the SMS verification API. This attack method is similar to the object storage download attack on PixelCloud, so my first reaction was to implement human verification as a solution. However, unlike direct file downloads, each SMS verification must go through my API. Could I use other strategies to minimize the impact of human verification on conversion rates?

I began by limiting phone numbers: if a certain phone number requested verification codes too many times, I would blacklist that number. This strategy was ineffective because the attacker did not need to own a specific phone number to attack me; they only needed the SMS to be sent, which would add a charge to my bill and thus complete the attack. Banning phone numbers therefore cost them nothing.

I then tried banning IPs; if a certain IP made too many requests, I would blacklist that IP. This strategy was also ineffective because the attacker had too many IPs, and their IPs were cheaper than my SMS costs, making this defense strategy unfeasible.

In the end, I had to resort to hCaptcha. I chose it because Google reCAPTCHA and Cloudflare Turnstile do not work in mainland China.

Round 2

Two days later, the attacker returned and continued to inflate my SMS service bill. This means they had spent two days working out how to bypass hCaptcha, which demonstrates their persistence.

However, this also increased their attack costs. Why? Previously, they only needed to fabricate a phone number in a valid format, without needing to own it, and could complete the attack by calling the API. Now, they needed a headless browser driven by a tool like Puppeteer to complete hCaptcha. If they used AI-agent-style tools, each attack would also incur token costs.

Moreover, they seemed unable to solve hCaptcha reliably every time.

Half of the attempts failed, indicating that hCaptcha was still effective.

To achieve the effect of inflating my bill, the attacker needed to ramp up the concurrency to a certain level. Using headless browsers or AI agents meant that the attacker could not complete the attack from their own computer; they had to rely heavily on cloud vendor virtual machines to achieve high concurrency.

If the attacker used cloud vendor EIPs to conduct the attack, then banning their server IPs became a cost-effective strategy.

After automatically banning 7 or 8 IPs, the attack stopped.

This attack also revealed a useful strategy: directly banning all IPs from cloud vendors. For example, the IP 89.116.154.76 in the screenshot belongs to a cloud vendor called latitude.sh. I observed the attack logs and added the commonly used cloud vendor IPs to the blacklist, achieving significant results. It's worth noting that ordinary users are unlikely to access the internet through public cloud IPs.
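A minimal sketch of such a check, using Python's standard ipaddress module. The /24 range below is a hypothetical placeholder that merely contains the address from the logs; real provider ranges should come from the provider's published prefixes or an ASN database.

```python
import ipaddress

# Hypothetical blocklist entry: not latitude.sh's actual allocation.
CLOUD_RANGES = [
    ipaddress.ip_network("89.116.154.0/24"),
]


def is_cloud_ip(ip: str) -> bool:
    """Return True if the address falls inside any blocked cloud range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CLOUD_RANGES)


print(is_cloud_ip("89.116.154.76"))  # True
print(is_cloud_ip("203.0.113.5"))    # False
```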

Round 3

Two days later, the attacker once again drained my SMS balance. Checking the logs revealed that the attacker seemed to have figured out my defense strategy, employing a more diverse range of IPs for the attack. After banning another batch of cloud vendor IPs, the attacker even managed to ensure that each IP came from a different organization, including IPs from civilian ISPs.

It seems this attacker was not to be underestimated and likely had some profit motive behind their actions. I needed to seriously develop a defense strategy; otherwise, this issue would not end.

After some thought, I implemented the following defense strategies:

Use hCaptcha;
If a certain IP makes too many requests or fails too many times, ban that IP;
If more than 3 IPs from a certain organization are banned, ban all subsequent IPs from that organization (instead of handling them manually);
(For the phone number binding interface of logged-in users) If an IP is banned and the request also contains a User ID, ban that user;
If a certain international area code makes too many requests, ban that area code. Of course, I cannot let the attackers dictate which regions I serve, so I added a whitelist of commonly used area codes that are never banned;
If a user successfully completes the verification, immediately unban that area code and the corresponding organization of that IP.
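The rules above can be sketched as a small in-memory policy object. This is an illustration rather than my production code: the per-IP and per-area thresholds and the whitelist contents are hypothetical, the hCaptcha check and failure counting are left out, and the organization for an IP is assumed to be looked up elsewhere (for example from an ASN database).

```python
from collections import defaultdict

ORG_BAN_THRESHOLD = 3      # from the list: more than 3 banned IPs per organization
MAX_REQUESTS_PER_IP = 5    # hypothetical threshold
MAX_REQUESTS_PER_AREA = 50 # hypothetical threshold
AREA_WHITELIST = {"+1", "+44", "+81"}  # hypothetical whitelist


class SmsGuard:
    def __init__(self):
        self.ip_requests = defaultdict(int)
        self.area_requests = defaultdict(int)
        self.banned_ips_by_org = defaultdict(set)
        self.banned_ips, self.banned_orgs = set(), set()
        self.banned_areas, self.banned_users = set(), set()

    def allow(self, ip, org, area_code, user_id=None) -> bool:
        """Return True if an SMS may be sent for this request."""
        if ip in self.banned_ips or org in self.banned_orgs:
            self._ban_user(user_id)
            return False
        if user_id in self.banned_users or area_code in self.banned_areas:
            return False
        self.ip_requests[ip] += 1
        self.area_requests[area_code] += 1
        if self.ip_requests[ip] > MAX_REQUESTS_PER_IP:
            self._ban_ip(ip, org)
            self._ban_user(user_id)
            return False
        if (area_code not in AREA_WHITELIST
                and self.area_requests[area_code] > MAX_REQUESTS_PER_AREA):
            self.banned_areas.add(area_code)
            return False
        return True

    def _ban_ip(self, ip, org):
        self.banned_ips.add(ip)
        self.banned_ips_by_org[org].add(ip)
        # Once enough IPs from one organization are banned, ban the whole organization.
        if len(self.banned_ips_by_org[org]) > ORG_BAN_THRESHOLD:
            self.banned_orgs.add(org)

    def _ban_user(self, user_id):
        if user_id is not None:
            self.banned_users.add(user_id)

    def on_verified(self, org, area_code):
        """A successful verification lifts the bans on the area code and organization."""
        self.banned_areas.discard(area_code)
        self.banned_orgs.discard(org)
        self.banned_ips_by_org.pop(org, None)
        self.area_requests[area_code] = 0
```

In a real deployment the counters would need time windows and shared storage (for example Redis) rather than process memory.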

After deployment, the strategy of directly banning area codes proved very effective. I don't know why the attacker did not use common area codes like +1 from the United States to inflate my bill. Did they think I would refuse American customers?

Round 4

After a year and two months, the defense strategy above was breached again. The attackers used a pool of numbers from three countries to send SMS requests simultaneously. Some of these were GB numbers, and the UK area code was on the whitelist, so the defense strategy could not block them. In addition, every request from a GB number went on to complete the verification code, so the defense strategy unbanned its IP as configured. At the same time, the attackers lowered the frequency of their requests to ensure the attack could proceed smoothly.

I was curious why the attackers were so persistent and what their motives were, so I read up on SMS pumping: the attackers partner with certain operators of number ranges and take a share of the revenue generated along the SMS delivery chain. For the attackers, my website is like a money printer.

At this point, a year and two months later, LLMs have advanced to GPT 5.5, and I no longer need to write the specific defense logic myself. I just tell the AI the direction in which to adjust the defense strategy and how to verify that it works, and I can go do other things.

The adjustment idea is to no longer use "completing the verification code" as a trust signal; only actual purchase behavior on the website can lead to unblocking.

Final Thoughts

In fact, every time a cyber attack occurs, it is quite overwhelming because I lose money, expend energy, and my services are affected.

For example, in this SMS API abuse incident, the attacker seems to be in a different time zone from me; they do not attack during my daytime but start in the early morning. After deploying a new defense strategy, I have to top up $10 to see whether the attacks still work. However, it seems that every time the attacker drains my balance, they go off to play games and do not attack again for a while, leaving me to wait. I have been going to sleep at 4:30 AM these past few days.

As one who is constantly under attack, I want to complain a little; this kind of offense and defense confrontation is really unequal. I must handle it as quickly as possible and minimize the impact of the attacks on ordinary users. Meanwhile, the attackers only need to hide in the dark, prepare well, and wait for the right moment to strike.

Integrating hCaptcha has long been on my to-do list. Due to my experience with PixelCloud, I had anticipated that such incidents would occur, but I had been too lazy to implement it until the attack happened. These hunters lurking in the dark forest are somewhat neutral because even if they do not attack, the security vulnerabilities are already there; they merely expose the problems sooner.

On the other hand, comparing projects from 2016 and 2021, due to the more frequent and widespread cyber attacks, professionals who are under attack have to spend more time and money on cybersecurity to achieve the same effects that previously required little extra attention. For instance, in 2016, content platform projects used object storage extensively without ever encountering such bizarre attacks. However, in 2021, to counter such attacks, all users had to complete a sliding verification code before using the download function. Thanks to my timely response, the attackers did not achieve their goals. Both sides expended a lot of energy on this matter, but there was no change in the front lines. The arms race resembles a prisoner's dilemma, where everyone pursues their own maximum benefit, leading to a lose-lose situation. The existence of thieves imposes a cost of locks on society as a whole.

During my rebellious teenage years (from junior high to high school), I was fascinated by hackers. Before I learned programming, I taught myself a lot about cybersecurity and even penetrated my high school's official server, gaining access through port 3389 (remote desktop), which makes me somewhat of a "script kiddie." Because of this rebellious experience, when I started working as a programmer, I paid close attention to security issues such as injection and cross-site scripting, so I have never had hackers gain system permissions because of poorly written code of mine.

Of course, there is another possibility. Professional cyber attacks often exploit undisclosed 0-day vulnerabilities, which are things that "I don't know that I don't know." It is possible that they have already compromised my server, and due to the permissions of a rootkit, they might even have higher privileges than I do, so I have been unaware and feeling good about myself.

@2025-04-22 22:30