I've been on a path to become a senior developer, and one thing I kept pushing off was actually learning load testing. Not just knowing what it is, but properly sitting down, writing scripts, and running tests against something real. Today I finally did it, and it ended up teaching me something I didn't expect.
What I built to test against
Before getting into k6 itself, I needed something worth testing. I scaffolded a NestJS microservices app, a simple blogging platform with four services for phase 1: auth, user, blog, and comment. Nothing fancy, but real enough. Auth uses JWT with refresh tokens, and passwords are hashed with bcrypt.
Once the app was wired up with Docker and gRPC between services, I had something I could actually throw load at.
Getting into k6
k6 is an open-source load testing tool by Grafana Labs. You write your test scripts in JavaScript (TypeScript too, with a little setup), define your virtual users and stages, set thresholds, and run it. It then simulates concurrent users hitting your endpoints and gives you detailed metrics: latency percentiles, failure rates, throughput, all of it.
I started with a smoke test just to make sure everything was wired correctly. Passed. Then I wrote a proper auth flow test: register, login, refresh, logout, and set this as my load profile:
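As a sketch of what that flow looked like (the endpoint paths, payload fields, and base URL here are illustrative placeholders, not my exact API):

```javascript
// auth-flow.js — k6 script sketch, run with `k6 run auth-flow.js`.
// Paths like /auth/register and fields like refreshToken are assumptions.
import http from 'k6/http';
import { check, sleep } from 'k6';

const BASE = 'http://localhost:3000';

export default function () {
  // __VU and __ITER make each virtual user register a unique account
  const email = `user_${__VU}_${__ITER}@test.dev`;
  const creds = JSON.stringify({ email, password: 'S3cret!pass' });
  const params = { headers: { 'Content-Type': 'application/json' } };

  // register — triggers a bcrypt hash on the auth service
  const reg = http.post(`${BASE}/auth/register`, creds, params);
  check(reg, { registered: (r) => r.status === 201 });

  // login — triggers a bcrypt compare
  const login = http.post(`${BASE}/auth/login`, creds, params);
  check(login, { 'logged in': (r) => r.status === 200 });
  const { accessToken, refreshToken } = login.json();

  // refresh the token pair
  const refresh = http.post(
    `${BASE}/auth/refresh`,
    JSON.stringify({ refreshToken }),
    params,
  );
  check(refresh, { refreshed: (r) => r.status === 200 });

  // logout with the access token
  const logout = http.post(`${BASE}/auth/logout`, null, {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  check(logout, { 'logged out': (r) => r.status === 200 });

  sleep(1);
}
```

Each virtual user loops through this function for the duration of its stage, so every iteration exercises the full register → login → refresh → logout cycle.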
```javascript
export const options = {
  stages: [
    { duration: '30s', target: 20 },
    { duration: '1m', target: 20 },
    { duration: '30s', target: 0 },
  ],
  thresholds: {
    http_req_failed: ['rate<0.01'],
    http_req_duration: ['p(95)<600'],
  },
};
```

Ramp up to 20 virtual users, hold for a minute, ramp back down. 95th percentile latency should stay under 600ms. Seemed reasonable.
What broke
It didn't pass.
```
http_req_duration
  ✗ 'p(95)<600' p(95)=1.41s
```
95th percentile was 1.41 seconds. Average was 516ms. That's way above what I'd expect for a simple auth flow on localhost. All checks passed: every register, login, refresh, and logout returned the right status codes. But the latency was clearly off.
Digging into why
The auth flow hits bcrypt on registration. I was using a cost factor of 12, which is a common default you'll see in tutorials and starter code. But bcrypt's cost factor is exponential: a factor of 12 means 2^12 = 4096 rounds of hashing. For a single request, that's fine. Under 20 concurrent virtual users all registering and logging in at the same time, that's a lot of CPU being chewed up all at once.
I dropped the cost factor from 12 to 10 (2^10 = 1024 rounds) and ran the test again.
```
http_req_duration
  ✓ 'p(95)<1500' p(95)=362.14ms
  avg=169.06ms p(90)=325.12ms p(95)=362.14ms
```

Average dropped from 516ms to 169ms. p95 went from 1.41s to 362ms. Throughput nearly doubled, from 2400 total requests to 4360 in the same time window. Just by changing one number.
That's the kind of thing you don't notice when you're testing one request at a time in Postman or Bruno. Under load, it shows up immediately.
But don't fully trust these numbers either
Here's the thing: I also noticed something worth calling out. When I ran the same endpoints naturally (no load test, just manual API calls), /users/me averaged around 10ms, and /auth/register averaged around 138ms. That's expected.
But under k6, with all four microservices running on the same machine and sharing the same CPU, bcrypt on the auth service was eating resources and affecting everything else. The numbers I got from k6 aren't wrong, but they're biased: they reflect a single-machine setup where all services compete for the same hardware, not a real production environment where each service would have its own resources.
So the bcrypt finding is real and valid. The exact millisecond numbers? I wouldn't rely on those until I can rerun the test in a production-like setup.
Where this goes next
The next step is OpenTelemetry. k6 tells you that something is slow. OTel tells you why and where: tracing requests across service boundaries, seeing exactly where time is being spent. That's where the actual truth lives.
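For a taste of what that setup looks like, here's a minimal bootstrap sketch using the official OpenTelemetry Node.js SDK packages (the service name and the local OTLP collector URL are assumptions, not something from my app yet):

```javascript
// tracing.js — minimal OpenTelemetry bootstrap sketch for a Node/NestJS service.
// Assumes an OTLP collector listening locally; load this before the app starts.
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');

const sdk = new NodeSDK({
  serviceName: 'auth-service', // hypothetical service name
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces', // assumed local collector endpoint
  }),
  // auto-instruments HTTP, gRPC, and more without touching app code
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```

With one of these files per service, a single request's trace would span auth, user, blog, and comment, which is exactly what's needed to see where the time went.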
But for a first day with k6, I'm glad I did this. I went from never having written a load test script to finding a genuine performance issue and understanding why it happened. That's worth writing down.
If you're building anything with bcrypt and haven't thought about your cost factor under concurrent load, go test it. You might be surprised.