Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple counter fails on Next.js -- Avoid using Node.js-specific APIs #584

Open
pablote opened this issue Aug 28, 2023 · 19 comments
Open

Simple counter fails on Next.js -- Avoid using Node.js-specific APIs #584

pablote opened this issue Aug 28, 2023 · 19 comments

Comments

@pablote
Copy link

pablote commented Aug 28, 2023

Hi, I know this library is supposed to be framework agnostic, but it seems to be generating a bad interaction with Next.js. I have a counter defined like this:

let httpRequestsTotal = new client.Counter({
      name: "http_requests_total",
      help: "Total number of HTTP requests",
      labelNames: ["method", "route", "statusCode"],
    })

This counter works well if used from within an API route for example. Thing is, I want it to catch every single request, so I use it where it makes most sense, in a middleware:

import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";
import { metrics } from "@/modules/common/metrics";

export function middleware(request: NextRequest) {
  NextResponse.next();
  console.log(`${request.method} -- ${request.url}`);
  metrics.httpRequestsTotal.inc({ method: request.method, route: request.url });
}

Doing this throws the following error:

Server Error
Error: A Node.js API is used (process.uptime) which is not supported in the Edge Runtime.

This might not make a lot of sense if you don't know Next.js but some code, like the middleware, runs on a different JS runtime where it cannot be assumed that Node.js global APIs are available. My main question is: why would a counter.inc() call process.uptime? Seems unnecessary.

Ideally, no Node.js specific APIs should be used, only standard Web APIs, unless specifically required (and being able to opt-out).

thanks for the help

@Guid21
Copy link

Guid21 commented Aug 29, 2023

I too am wondering how to solve this problem

@SimenB
Copy link
Collaborator

SimenB commented Aug 29, 2023

Next.js does some weird bundling stuff and have an odd global environment. There might be something off due to that? Probably better to ask in their support channels.

I don't think we've had requests to work in a browser context before, but that should be fine. Happy to take PRs allowing it.

@pablote
Copy link
Author

pablote commented Aug 29, 2023

I run the prom-client code locally on my computer and I was able to see what the problem was. Many of the default metrics run code outside of any function, so even if you don't use them, just by importing this code gets run. Some of that code uses Node.js libraries or grobals.

I changed all the cases that caused problems so that the code would be inside the function. See these couple examples:
Screenshot 2023-08-29 at 9 54 23 AM
Screenshot 2023-08-29 at 9 54 38 AM

After doing this the error went away. That said, this is still not working well, in strange ways. I need to keep testing around, but apparently this is not gonna be easy.

I don't think we've had requests to work in a browser context before, but that should be fine. Happy to take PRs allowing it.

It's not a browser environment. It's a server-side JS runtime, only that Node.js specific APIs are not allowed. I'm not sure how Vercel runs code, maybe it's a Lambda@Edge thing but not sure. I know that Cloudfare Workers have the same restrictions.

@pablote
Copy link
Author

pablote commented Sep 7, 2023

I've managed to get this working on Next.js, but it required quite a few changes on prom-client which I'm not sure if the project would be willing to consider merging @SimenB.

You can check the changes in this fork here: https://github.com/siimon/prom-client/compare/master...pablote:prom-client:feature/next-friendly-changes?expand=1. To summarize:

  • Like I mentioned before, many of the default metrics run code outside of any function, so even if you don't use them, just by importing this code runs. Some of that code uses Node.js libraries or globals which are not allowed on Next.js edge runtime. I've moved that code within the metrics fn so that if you opt out of using them, they don't affect the app.
  • Had to change the PushGateway quite a bit:
    • Node.js url package usage replaced by WHATWG URL API
    • Node.js http/https packages replaced with the Fetch API
    • Added an extra options param to be able to replace push to always hit /metrics
    • gzip compression removed as to not depend on a Node.js module, CompressionStream API might be a replacement for it
    • These two changes break backward compat a bit, ideally this shouldn't have been part of the interface and should have been abstracted away, but it's close enough:
      • The options second ctor param is now a fetch options, not a http/https module options, they are similar but not the same.
      • The response { resp, body } are now fetch response and response.body respectively, which are not the same type as before.

@SimenB
Copy link
Collaborator

SimenB commented Sep 8, 2023

Making requires lazy and migrating to URL global should both be fine to land at least. Let's start with that? Then fetch is a natural one to target after that. Node 16 is EOL shortly, so we might get away with using global fetch as well, but I wanna evaluate that separately

@harry-gocity
Copy link

harry-gocity commented Sep 27, 2023

For simple default metrics, creating an api route works well in Next.js:

collectDefaultMetrics({});

const prometheus = async (_req: NextApiRequest, res: NextApiResponse) => {
    res.setHeader('Content-type', register.contentType);
    res.send(await register.metrics());
};

export default prometheus;

As for using prom-client anywhere else I've only been able to do so in getServerSideProps, as the middleware expects Vercel's edge runtime compatability. Quite a lot of (imo understandably) unhappy discussion about it here. It's very annoying for us as we are running dockerised node, we have nothing but the node runtime, and we get these edge runtime warnings.

In the end my getServerSideProps all have a wrapper that looks something like this:

import { GetServerSideProps, GetServerSidePropsContext } from 'next';
import { Counter, Histogram, register } from 'prom-client';

// Even though it may look like nothing has been registered yet, if next
// recompiles due to HMR prom-client will throw errors when calling
// new Histogram()/new Gauge()/new Counter() as metrics with the same name
// already exist in the registry.
// https://github.com/siimon/prom-client/issues/196
register.removeSingleMetric('nextjs_gSSP_duration_seconds');
register.removeSingleMetric('nextjs_page_requests_total');

export const histograms = {
    gSSPDurationSeconds: new Histogram({
        name: 'nextjs_gSSP_duration_seconds',
        help: 'Duration of gSSP in seconds',
        labelNames: baseLabelNames,
        buckets: [0.01, 0.05, 0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, 10],
    }),
};

export const counters = {
    pageRequestsTotal: new Counter({
        name: 'nextjs_page_requests_total',
        help: 'Count of page requests',
        labelNames: baseLabelNames,
    }),
};

// Register metrics on the default registry
register.registerMetric(histograms.gSSPDurationSeconds);
register.registerMetric(counters.pageRequestsTotal);

export const observedGetServerSideProps =
    (gSSP: GetServerSideProps, route: string) => async (context: GetServerSidePropsContext) => {
        // All incoming requests will have some derivable labels
        const baseLabels = { route, foo: 'bar' }

        // Run gSSP
        const endGSSPDurationTimer = histograms.gSSPDurationSeconds.startTimer();
        const response = await gSSP(context);
        const duration = endGSSPDurationTimer(baseLabels);

        // Capture metrics
        histograms.gSSPDurationSeconds.observe(duration);
        counters.pageRequestsTotal.labels(baseLabels).inc();

        // Return gSSP response result
        return response;
    };

That is used like this

const unwrappedGetServerSideProps: GetServerSideProps = () => {
    // your gSSP here
};

export const getServerSideProps: GetServerSideProps = observedGetServerSideProps(
    unwrappedGetServerSideProps, '/[destination]'
);

I did find that the observedGetServerSideProps had to be defined in another file, if I used something similar in the actual pages directory Next would complain at build, the same as your issue @pablote. I'm not entirely sure why a separate module gets around the warning, but Next doesn't mind 🤷🏼‍♂️

I would be very interested in improvements to allow us to use the middleware in Next rather than this wrapper, as currently we cannot use prom-client for any statically optimised pages (no getServerSideProps/getStaticProps). And the wrapping isn't ideal, it's very all-or-nothing for adding metrics to a page.

@pablote
Copy link
Author

pablote commented Sep 27, 2023

Seems like we've run into a similar picture. Using prom-client works well from getServerSideProps and the get static one as well, also from API routes. The thing is you have to add that individually everywhere, and it still doesn't catch some calls. Middleware is the natural place to do that, but like I said above, it only works if I heavily modify prom-client to not use anything Node.js specific.

I've seen some suggestions like using a custom server to host the Next.js app, but that's not an option for me.

That said, it looks like Next.js is integrating OpenTelemetry as a functionality, although it's still an experimental feature. So I'll probably not pursue this any further and see how that pans outs.

@harry-gocity
Copy link

it looks like Next.js is integrating OpenTelemetry as a functionality

Unfortunately this is where we're at too - I've only added prom-client in a couple of our next.js sites, but we have 10+ across various teams, and I need to recommend something for usage everywhere. That said I would like to avoid OpenTelemetry if I can, as we're already all-in on prom w/ push gateway for our java services.

@pablote
Copy link
Author

pablote commented Sep 28, 2023

It doesn't have to be one or the other I think. You can have an app using OT pushing metrics to a OT collector (this is very similar to pushing prom metrics to a push gateway). And then have Prometheus consume those from the collector, check this out: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/examples/demo

That said, last time I tried this in Next.js, it was only pushing trace/spans style metrics, and not counter/gauge/histogram prom style metrics. Not sure if it's a WIP and they plan to add that in the future.

@zbjornson zbjornson changed the title Simple counter fails on Next.js Simple counter fails on Next.js -- Avoid using Node.js-specific APIs Oct 25, 2023
@IainMcHugh
Copy link

IainMcHugh commented Jul 1, 2024

For simple default metrics, creating an api route works well in Next.js:

collectDefaultMetrics({});

const prometheus = async (_req: NextApiRequest, res: NextApiResponse) => {
    res.setHeader('Content-type', register.contentType);
    res.send(await register.metrics());
};

export default prometheus;

@harry-gocity @pablote I've seen a few examples online with this approach, can you confirm that calling collectDefaultMetrics({}) outside of the API handler/inside the API handler file works as intended?

I am trying to migrate a Next app away from using a custom server and currently collectDefaultMetrics is being called when the custom server is initially invoked. My intuition is that this function should be called globally when the application is started, and so I'm wondering if calling it within the api/metrics.ts will work as expected. I have tried to move this invokation inside the instrumentation.ts file but with no luck as of yet (I get a 200 empty response)

@harry-gocity
Copy link

harry-gocity commented Jul 2, 2024

can you confirm that calling collectDefaultMetrics({}) outside of the API handler/inside the API handler file works as intended?

We've since upgraded to next 14 (still pages dir) and now have an instrumentation.ts that looks like this:

export async function register() {
    console.log('Configuring instrumentation');

    if (process.env.NEXT_RUNTIME === 'nodejs') {
        const prom = await import('prom-client');

        console.log('Configuring default prom metrics');
        prom.collectDefaultMetrics({});

        console.log('Creating custom prom metrics');
        new prom.Histogram({
            ...
        });
        new prom.Counter({
            ...
        });
        new prom.Counter({
            ...
        });
    }
}

We don't call collectDefaultMetrics anywhere else. And to access the metrics, we use register.getSingleMetric rather than a reference to the created metric directly.

@IainMcHugh
Copy link

Thanks @harry-gocity I was missing the instrumentationHook flag in the next config file. Two questions:

  • Do you call await prom.register.metrics(); at the end of this register function?
  • Do you have an API route where you expose all of the metrics to be scraped from?

At the end of the register block I am able to call getSingleMetric() and it returns the correct value, however when I try to do the same thing in an API route I get undefined:

import { NextApiRequest, NextApiResponse } from 'next';
import { register } from 'prom-client';

const handler = async (_: NextApiRequest, res: NextApiResponse) => {
  const metric = register.getSingleMetric(
    'my_app_http_request_count',
  );
  console.log(metric); // returns undefined
  console.log(register.getMetricsAsArray()); // returns []
  res.send('Text');
};

export default handler;

Here is my exact instrumentation.ts file:

export async function register() {
  console.log('Configuring instrumentation');

  if (process.env.NEXT_RUNTIME === 'nodejs') {
    const prom = await import('prom-client');

    console.log('Configuring default prom metrics');
    prom.collectDefaultMetrics({
       prefix: 'my_app',
     });
    console.log('Creating custom prom metrics');

    new prom.Counter({
      name: 'my_app_http_request_count',
      help: 'Count of HTTP requests made to the app',
      labelNames: ['method'],
    });
    await prom.register.metrics(); // is this needed?
    console.log(prom.register.getMetricsAsArray()); // returns correct data
    console.log(prom.register.getSingleMetric('my_app_http_request_count')); // returnes correct data
  }
}

@harry-gocity
Copy link

Do you call await prom.register.metrics(); at the end of this register function?
Do you have an API route where you expose all of the metrics to be scraped from?

We don't call prom.register.metrics() in instrumentation - that's the function call to make to get the actual metrics string. You should call it in an api route like this:

import { NextApiRequest, NextApiResponse } from 'next';
import { register } from 'prom-client';

const prometheus = async (_req: NextApiRequest, res: NextApiResponse) => {
    res.setHeader('Content-type', register.contentType);
    res.send(await register.metrics());
};

export default prometheus;

@IainMcHugh
Copy link

After a lot of debugging I finally found what was missing. For anyone who has a similar issue, in my next.config.js I needed both the instrumentationHook and:

  ...
  experimental: {
    instrumentationHook: true,
    serverComponentsExternalPackages: ['prom-client'],
  },
  ...

and then calling await register.metrics() works within an API handler.

Also in relation to the original issue, while you can't increment a counter directly within the middleware.ts file, you can call to an exposed API from the middleware (example /pages/api/metrics/counter.ts) and increment this way.

@justenau
Copy link

I'm having issues making this work on production build. I setup instrumentation as per this example #584 (comment) , in the api route called register.metrics() and added experimental next config. If I run locally (yarn dev) everything works as expected and I'm able to increase counter from another api route or to get all the metrics from the metrics endpoint. But as soon as I run the build, this stops working.
In the logs I do see that instrumentation code runs and inside the instrumentation I'm able to get counter value or any other metric but once I call metrics route, I receive empty response.
What could be causing this to behave differently in production build?
Note: Using NextJS app router.

@harry-gocity
Copy link

FWIW I've somewhat started to give up with this lib - we're now using @vercel/otel and @opentelemetry packages instead, and scraping metrics with a prometheus exporter instead. The integration with Next.js is much cleaner, we have better tracing on GSSP (segmented by component load, fetch calls, and render) and we don't have to jump through bundling hoops to get a build passing. The only thing we still use this library for is the default metrics, the custom ones are all being migrated.

@justenau
Copy link

Just FYI I was able to resolve my issue, left more in-depth explanation here:
#640 (comment)

@santosguilherme
Copy link

@harry-gocity can you share some examples using @vercel/otel and @opentelemetry?

@LaExploradora
Copy link

@santosguilherme

I suppose this might help you.

vercel/next.js#16205 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants