
The Cybersecurity Incident I Encountered and How I Responded

Recently, my startup project in Japan faced a cyber attack where the attacker exploited an SMS verification API, causing my SMS service bill to skyrocket. The situation has calmed down now. I want to reflect on the cybersecurity incidents I've encountered over the years and the lessons I've learned from them.

A Domestic Content Platform (2016 - 2018)

This content platform was my second project during my first entrepreneurial venture. Since it was my first time, I didn't deliberately implement any defenses. Another reason was the environment at that time; the intensity of attacks was not as fierce as it is now, so an ordinary internet project might not experience many DDoS attacks in a year even without defenses. People can adapt to their environment, but it's hard to surpass the broader context.

DDoS Attack in 2017

At that time, the response was to buy "high defense" (DDoS-protected) IPs from cloud providers. In hindsight, this seems quite foolish because the high defense offered by the various cloud providers is essentially a stupidity tax. The underlying cost of a high defense IP is only a few hundred yuan per month, yet providers charge tens of thousands, and using one requires changing your domain's DNS resolution. You never know when an attacker will strike, so you often switch to high defense only once you are under attack, and when the month is up, you switch back to a regular IP.

The problems with this protective method are:

  1. By the time you switch to a high defense IP, the service has already been taken down;
  2. It requires manual operation, and after modifying the resolution, the DNS cache on the user's end needs to refresh before the service can be restored;
  3. High defense IPs are expensive; cloud providers see anyone who gets attacked as a high-value customer, so the price reflects supply and demand rather than any technical sophistication.

However, at that time, the project's revenue was good, and I was indeed a naive fool, so I mindlessly bought a month of high defense every time I was attacked, not thinking there was anything wrong with this approach.

As I mentioned earlier, the cybersecurity environment was relatively mild back then. After a few attacks, the attackers realized I could quickly switch to high defense and left in frustration.

Docker Vulnerability Intrusion in 2016

Still on this content platform: at that time, the Docker daemon was listening on a public port, and because the firewall allowed access to it, anyone who could reach that port could effectively take control of the server. One time, we noticed the server was underperforming for no apparent reason, and upon checking, we found it was mining cryptocurrency 😅. The mining program came from a downloader, which was introduced via Docker. It was quite an embarrassing and somewhat humorous episode.

XSS Attack Attempts in 2017

During that time, I often saw comments in the comment section like <script>alert('test')</script> and <script src='foo/xxx.js'></script>. I knew what he was trying to do, so I traced the foo/xxx.js to find his XSS platform domain, then looked up the registrant's Weibo and directly messaged him to stop his futile attempts 🤣. He replied with a string of ellipses. He seemed quite young, probably a high school student, much like my younger self.

Because front-end frameworks like Vue escape HTML automatically in their {{ }} interpolation syntax, injected script tags are rendered as plain text and I didn't need to do anything for ordinary content. I only had to make sure that wherever I inserted raw HTML, I sanitized it first with a library like DOMPurify.
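The escaping idea can be shown outside Vue. Below is a minimal Python sketch using the standard library's html.escape; the render_comment helper is a hypothetical name, and this only illustrates the principle, not how Vue or DOMPurify work internally.

```python
import html


def render_comment(text: str) -> str:
    """Escape user text so the browser displays it instead of executing it."""
    return "<p>" + html.escape(text, quote=True) + "</p>"


payload = "<script>alert('test')</script>"
# The angle brackets and quotes become HTML entities, so no script tag survives.
print(render_comment(payload))
```

Escaping like this is enough when the comment is shown as text. Sanitizing is only needed when user-supplied HTML must actually be rendered as HTML.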

PixelCloud (2021 - now)

PixelCloud is my third entrepreneurial project, started in 2021. By this time, the cybersecurity environment had changed completely, with automated attack bots rampant. A moment's inattention could quickly turn you into a lamb waiting to be slaughtered. The most typical evidence of this is the emergence of platforms like Shodan, which index nearly the entire public IPv4 space. An IPv4 address consists of 4 x 8 = 32 bits, for a total of 2^32 = 4294967296 addresses. With hardware and bandwidth improving in line with Moore's Law, it's no longer difficult for an average person to scan the entire IPv4 space. It's also normal for large cybersecurity companies to regularly scan ports across all of IPv4. Websites like Shodan emerged as a result, which I will mention later.
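To get a feel for the scale, here is a small back-of-the-envelope calculation in Python. The probe rate of 1,000,000 packets per second is a hypothetical figure chosen for illustration, not a measurement of any particular scanner.

```python
IPV4_BITS = 4 * 8
TOTAL_ADDRESSES = 2 ** IPV4_BITS


def scan_seconds(packets_per_second: int, ports: int = 1) -> float:
    """Time to send one probe per address per port at a fixed rate."""
    return TOTAL_ADDRESSES * ports / packets_per_second


print(TOTAL_ADDRESSES)  # 4294967296
hours = scan_seconds(1_000_000) / 3600
print(f"{hours:.1f} hours for one port at 1,000,000 probes/s")  # 1.2 hours
```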

Attacks on Website Server Hosts

Due to the harsh network environment, almost no website operates without a CDN anymore. When a website's domain resolves to a CDN, attackers cannot ascertain the origin server's IP address; they can only see the CDN's cluster IPs. If they want to attack, they can only target the CDN, not your server.

However, my server was still attacked. Why? Because when a client connects over HTTPS without a matching domain (for example, directly by IP), Nginx by default still completes the TLS handshake and returns a certificate to the client. Platforms like Shodan scan your server IP and learn the domain from the names in that certificate. Attackers can then search for the domain on Shodan and immediately discover the origin address, bypassing the CDN to launch an attack directly.

The solution is to use Nginx 1.19.4 or later, which supports the ssl_reject_handshake directive. With ssl_reject_handshake on; set in the default server block, a request whose domain does not match (for example, one made via IP) has its TLS handshake rejected, so no certificate is sent and the domain does not leak.

A few months ago, Mitsubishi UFJ Bank suffered a DDoS attack, and someone on Twitter wondered why such a large bank wasn't using a CDN. Of course, they were using Akamai. They likely made the same mistake as I did, allowing hackers to bypass the CDN and attack directly.

Searching for mufg on Shodan reveals the origin IP directly.

Attacks on Object Storage

Attackers used their downlink bandwidth to blow up my traffic bill, causing my object storage to be shut down by the cloud provider, halting related business operations.

1 TB of traffic was consumed in just half an hour.
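As a rough worked calculation of what that means in sustained bandwidth, assuming "1T" means a decimal terabyte (10^12 bytes):

```python
def average_gbps(total_bytes: float, seconds: float) -> float:
    """Average throughput in gigabits per second."""
    return total_bytes * 8 / seconds / 1e9


# 1 TB over 30 minutes
print(round(average_gbps(1e12, 30 * 60), 2))  # 4.44
```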

The solution was to require human verification in the API that issues signed download URLs, so a download link can only be obtained after passing a captcha. Specifically, since PixelCloud is a domestic project, we used Geetest.
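A minimal sketch of that gating idea in Python follows. Everything in it is an assumption for illustration: the captcha result is reduced to a boolean (the real Geetest server-side check is not shown), and SECRET_KEY, the URL layout, and the function names are hypothetical rather than what PixelCloud or any object storage provider uses.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"replace-me"  # hypothetical signing secret


def sign(path: str, expires: int) -> str:
    """HMAC over the path and expiry, so neither can be altered."""
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_download_url(path: str, captcha_passed: bool, ttl: int = 60,
                       now: Optional[float] = None) -> Optional[str]:
    """Hand out a short-lived signed URL only after human verification."""
    if not captcha_passed:
        return None
    expires = int(now if now is not None else time.time()) + ttl
    return f"{path}?expires={expires}&sig={sign(path, expires)}"


def is_valid(path: str, expires: int, sig: str,
             now: Optional[float] = None) -> bool:
    """Check that the URL has not expired and the signature matches."""
    current = now if now is not None else time.time()
    return current <= expires and hmac.compare_digest(sig, sign(path, expires))
```

Because each link expires quickly and cannot be minted without passing the captcha, a bot can no longer replay one URL indefinitely to burn traffic.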

Attacks on CDN

Attackers also employed similar tactics against the CDN, obtaining the URL of a small file and then using their downlink bandwidth to blow up the CDN's traffic bill. This was resolved by setting a request limit per IP on the CDN of T cloud provider.
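The CDN's limit is set in the provider's console, but the logic behind a per-IP request limit can be sketched in a few lines of Python. This is a sliding-window counter under assumed thresholds; the class name and parameters are hypothetical and do not reflect the provider's implementation.

```python
from collections import defaultdict, deque


class PerIpLimiter:
    """Allow at most max_requests per IP within a sliding time window."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = PerIpLimiter(max_requests=2, window_seconds=10)
print([limiter.allow("203.0.113.5", t) for t in (0, 1, 2, 11)])
# [True, True, False, True]
```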

DDoS Attacks on Game Servers

Game servers cannot use a CDN because game traffic runs at L4 (raw TCP/UDP), while a CDN (Content Delivery Network) generally delivers HTTP content, which is L7. Affordable L4 protection is hard to find in China: services such as Tencent's EdgeOne and Cloudflare Spectrum are quite expensive and unrealistic for entrepreneurs.

To defend against DDoS attacks at low cost, my partner came up with an idea. We wrote a program that connects to the cloud provider's API to generate four persistent L4 proxy virtual machines. If one virtual machine is taken down, or if the corresponding EIP is taken down, we would exponentially generate more EIPs and virtual machines until the attack stops. For example, if one is taken down, we generate two; if two are taken down, we generate four. After the attack stops, these extra EIPs and virtual machines are automatically deleted.
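The scaling rule can be sketched as follows. This is my reading of the description (two fresh proxies for every one knocked offline, so the replacement batch doubles as losses double); the function names are hypothetical, and the actual cloud API calls that create and delete EIPs and virtual machines are omitted.

```python
BASE_PROXIES = 4  # the four persistent L4 proxies


def proxies_to_create(taken_down: int) -> int:
    """Two new proxies for each one taken down: 1 -> 2, 2 -> 4."""
    return 2 * taken_down


def simulate(losses_per_wave):
    """Track the number of live proxies across successive attack waves."""
    alive = BASE_PROXIES
    history = []
    for lost in losses_per_wave:
        lost = min(lost, alive)
        alive = alive - lost + proxies_to_create(lost)
        history.append(alive)
    return history


print(simulate([1, 2, 4]))  # [5, 7, 11]
# Once the attack stops, the fleet is scaled back down to BASE_PROXIES.
```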

This method is beneficial because cloud servers and EIPs are billed hourly, so you only pay for the hours during which the attack persists.

This method operated well for about a year. However, we encountered unexpected issues. Mainly, U cloud provider was unhappy.

Because we weren't using the cloud provider's high defense but instead "unilaterally" built an elastic high defense solution, the U cloud provider was quite displeased. The account manager's exact words were, "You are the only one attacked so many times without purchasing high defense." Yet when I first moved from provider A to U, I had discussed this solution in detail with their business team, and they approved it. And at A, no one bothered me regardless of how large the attack was.

During the Spring Festival of 2023, we faced a significant attack that took down over ten IPs in one go. The U cloud provider then restricted our permission to apply for EIPs, forcing us to purchase their high defense. However, by that time, the attack had already ended, and I was reluctant to buy high defense. So, I took down the dynamic DDoS defense system and could only allocate defenses using the existing EIPs. Normally, I used two EIPs, and if one was taken down, I would manually rebind a new EIP to the virtual machine L4 proxy in the U cloud provider's backend and then manually modify the DNSPod resolution. After 24 hours, the previously taken down EIP would be usable again.

For roughly the next half a year, I faced small attacks about every two weeks, which caused some players to experience lag in the game, but it wasn't a major issue until June 20. On June 20, another significant attack occurred, exhausting all available EIPs. The situation was urgent, and the angry U provider's "backend" even threatened to terminate our service. At that moment, I had no choice but to purchase the U provider's high defense, costing 15k.

PixelCloud is not like the previous content platform; the profit margin is thin, and 15k is unacceptable. I had to start looking for other high defense solutions, and at this point, I finally realized that the rental cost for high defense servers was only a few hundred yuan per month!

The subsequent steps were simple: I moved the L4 proxy to the high defense server. The DDoS problem was thus completely resolved.

A Certain Content Platform in Japan (2023 - now)

SMS Abuse Round 1

This project is my fourth entrepreneurial venture, primarily targeting the global market, and it doesn't require L4 protection, allowing me to use Cloudflare for free, essentially eliminating concerns about DDoS attacks.

Recently, I faced another cyber attack where the attacker automated requests to the SMS verification API, causing my SMS service bill to explode. This attack method is quite similar to the object storage download attack I faced with PixelCloud, so my first reaction was to implement human verification as a solution. However, unlike direct file downloads, each SMS verification must go through my API. Could I use other strategies to minimize the impact of human verification on conversion rates?

I began trying to limit phone numbers; if a certain number requested verification codes too many times, I would blacklist that number. This strategy proved ineffective because the attacker didn't need to own a specific phone number to attack me; they only needed to ensure that the SMS was sent, which would incur charges on my bill, thus completing an attack. Therefore, banning phone numbers wouldn't cause them any loss.

I then tried banning IPs; if a certain IP made too many requests, I would blacklist that IP. This strategy was also ineffective because the attacker's IPs were simply too numerous, and their IPs were cheaper than my SMS costs, making this defense strategy unfeasible.

In the end, I resorted to using hCaptcha. The reason for choosing it was that Google reCAPTCHA and Cloudflare Turnstile do not work in mainland China.

Round 2

Two days later, the attacker returned, continuing to rack up charges on my SMS service bill. This means they spent two days researching how to bypass hCaptcha to attack me, showing remarkable persistence.

However, their actions also increased their attack costs. Why? Previously, they only needed to fabricate a phone number in a valid format (such as E.164), without needing to own it, and could call the API to complete the attack. Now, they needed to use a headless browser like Puppeteer to complete hCaptcha. If they used tools like AI agents, completing an attack would incur additional token costs.

Moreover, they seemed unable to guarantee success with the hCaptcha solution every time.

Half of the attempts failed, indicating that hCaptcha was still effective.

To achieve the effect of overwhelming my bill, the attacker needed to ramp up concurrency to a certain extent. Using headless browsers or AI agents meant they couldn't complete the attack from their own computers; they had to rely heavily on cloud providers' virtual machines to achieve high concurrency.

If the attacker used cloud provider EIPs to attack, then banning their server IPs became a cost-effective strategy.

After automatically banning 7 or 8 IPs, the attack stopped.

This attack also revealed a useful strategy: directly banning all IPs that belong to cloud providers. For example, the IP 89.116.154.76 in the screenshot belongs to a cloud provider called latitude.sh. I went through the attack logs and added the cloud providers the attacker used most often to the blacklist, with significant results. It's worth noting that ordinary users almost never access the internet through public cloud IPs.
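Checking a request against a list of provider ranges is straightforward with Python's ipaddress module. In this sketch the CIDR ranges are placeholders taken from the reserved documentation blocks, not real provider data; an actual list would come from the providers' published ranges or an IP-to-organization database.

```python
import ipaddress

# Placeholder ranges (documentation blocks), NOT real cloud provider ranges.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside any blocked network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in BLOCKED_NETWORKS)


print(is_blocked("192.0.2.15"))   # True
print(is_blocked("203.0.113.9"))  # False
```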

Round 3

Two days later, the attacker once again drained my SMS balance. Checking the logs revealed that the attacker seemed to have figured out my defense strategy and was using a more diverse range of IPs for the attack. After banning another batch of cloud providers, the attacker even managed to ensure that each IP came from a different organization, including some from civilian ISPs.

It seems this attacker was quite determined, likely motivated by some profit, and I needed to carefully craft my defense strategy; otherwise, this issue would not end.

After some thought, I implemented the following defense strategies:

  1. Use hCaptcha;
  2. If a certain IP makes too many requests or fails too many times, blacklist that IP;
  3. If more than 3 IPs from a certain organization are banned, blacklist all subsequent IPs from that organization (instead of handling them manually);
  4. (For the binding phone number interface of logged-in users) If an IP is banned and the request also contains a User ID, blacklist that user;
  5. If a certain international area code requests too many times, blacklist that area code. Of course, I can't just listen to the hackers; I need to add a whitelist for commonly used regions, which will never be banned;
  6. If a user successfully completes the verification, immediately unban that area code and the corresponding organization of that IP.
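The six rules above can be sketched as a single in-memory guard. This is an illustration under assumptions, not my production code: the class, method names, and thresholds are hypothetical, the hCaptcha result is passed in as a boolean, the organization of an IP is assumed to be looked up elsewhere, and counters never expire (a real deployment would use time windows and shared storage).

```python
from collections import defaultdict


class SmsGuard:
    def __init__(self, max_requests_per_ip, max_failures_per_ip,
                 max_banned_ips_per_org, max_requests_per_area_code,
                 area_code_whitelist):
        self.max_requests_per_ip = max_requests_per_ip
        self.max_failures_per_ip = max_failures_per_ip
        self.max_banned_ips_per_org = max_banned_ips_per_org
        self.max_requests_per_area_code = max_requests_per_area_code
        self.area_code_whitelist = area_code_whitelist
        self.ip_requests = defaultdict(int)
        self.ip_failures = defaultdict(int)
        self.area_requests = defaultdict(int)
        self.banned_ips_by_org = defaultdict(set)
        self.banned_ips = set()
        self.banned_orgs = set()
        self.banned_users = set()
        self.banned_area_codes = set()

    def _ban_ip(self, ip, org):
        self.banned_ips.add(ip)
        self.banned_ips_by_org[org].add(ip)
        # Rule 3: too many banned IPs from one organization bans the organization.
        if len(self.banned_ips_by_org[org]) > self.max_banned_ips_per_org:
            self.banned_orgs.add(org)

    def allow(self, ip, org, area_code, captcha_ok, user_id=None):
        """Return True if an SMS may be sent for this request."""
        if ip in self.banned_ips or org in self.banned_orgs:
            if user_id is not None:
                self.banned_users.add(user_id)  # Rule 4
            return False
        if user_id in self.banned_users or area_code in self.banned_area_codes:
            return False
        self.ip_requests[ip] += 1
        if not captcha_ok:
            self.ip_failures[ip] += 1
        # Rule 2: too many requests or too many failures bans the IP.
        if (self.ip_requests[ip] > self.max_requests_per_ip
                or self.ip_failures[ip] > self.max_failures_per_ip):
            self._ban_ip(ip, org)
            if user_id is not None:
                self.banned_users.add(user_id)  # Rule 4
            return False
        if not captcha_ok:
            return False  # Rule 1: no SMS without a passed captcha
        # Rule 5: whitelisted area codes are never counted or banned.
        if area_code not in self.area_code_whitelist:
            self.area_requests[area_code] += 1
            if self.area_requests[area_code] > self.max_requests_per_area_code:
                self.banned_area_codes.add(area_code)
                return False
        return True

    def on_verified(self, org, area_code):
        """Rule 6: a successful verification unbans the area code and organization."""
        self.banned_area_codes.discard(area_code)
        self.area_requests[area_code] = 0
        self.banned_orgs.discard(org)
        self.banned_ips_by_org[org].clear()
```

Individual IP bans are left in place by on_verified; only the broader area-code and organization bans are lifted, since those are the ones most likely to catch real users.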

After deployment, the strategy of directly banning area codes proved very effective. I don't know why the attacker didn't use common area codes like +1 from the United States to drain my bill. Did they think I would reject American customers?

Final Thoughts

In fact, every time a cyber attack occurs, I feel quite overwhelmed because I lose money, expend energy, and my services are affected.

For example, in this recent SMS API abuse incident, it seems that the attacker is in a different time zone than me. I don't get attacked during the day, but the attacks start at midnight. Every time I deploy a new defense strategy, I have to top up $10 to see if the attacker's methods are still effective. However, it seems that the attacker drains my balance and then goes off to play games, not attacking for a while, leaving me to wait. I've been going to bed at 4:30 AM these past few days.

As someone who is constantly on the receiving end, I want to vent a little: this kind of offense-defense confrontation is really unequal. I must respond as quickly as possible and minimize the impact of the attacks on ordinary users, while the attacker only needs to hide in the shadows, ready to strike at any moment.

Integrating hCaptcha has long been on my to-do list. Given my experience with PixelCloud, I anticipated that such incidents would occur, but I kept procrastinating until the attack happened. These hunters lurking in the dark forest are somewhat neutral because even if they don't attack, the security vulnerabilities are already there; they merely expose the problems sooner.

On the other hand, comparing projects from 2016 to those in 2021, due to the increased frequency and prevalence of cyber attacks, professionals on the receiving end have to spend more time and money on cybersecurity to achieve the same results that previously required little extra attention. For instance, in 2016, content platform projects extensively used object storage without ever encountering such bizarre attacks. However, in 2021, to counter such attacks, all users had to complete a slider captcha before they could use the download feature. Thanks to my timely response, the attacker did not achieve their goal. Both sides expended a lot of energy on this matter, yet there was no change in the frontline. The arms race resembles a prisoner's dilemma, where everyone seeks to maximize their own interests, leading to a lose-lose situation. The existence of thieves increases the cost of locks for society as a whole.

During my rebellious teenage years (from middle school to high school), I was very fascinated by hackers. Before I learned programming, I self-studied a lot of cybersecurity knowledge and even penetrated my high school's official website server, gaining access to its remote desktop service on port 3389, which made me somewhat of a "script kiddie." Because of this rebellious experience, when I started working in programming, I paid close attention to various cybersecurity issues, such as injections and cross-site scripting, so I have never encountered a situation where my code's lack of rigor allowed hackers to gain system access.

Of course, there is another possibility. Professional cyber attacks often exploit undisclosed 0-day vulnerabilities, which is the "I don't know what I don't know" scenario. It's possible that they have already compromised my server, and with a rootkit in place, they might even have higher privileges than I do, leaving me unaware while I continue to feel secure.

@2025-04-22 22:30