added gunicorn production server documentation #1272

Closed

@evananyonga evananyonga commented Oct 22, 2020

Description

Documents how to get OpenTelemetry instrumentation working for Django behind a production server such as gunicorn, since instrumenting the application currently only works on the development server. You have to install gunicorn in your application for this change to take effect.

Fixes #1197

Documentation

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • This change requires a documentation update

How Has This Been Tested?

I have run the tox lint checks.

@evananyonga evananyonga requested review from a team, codeboten and lzchen and removed request for a team October 22, 2020 06:14

linux-foundation-easycla bot commented Oct 22, 2020

CLA Check

.. code-block:: python

    def post_fork(server, worker):
        worker(DjangoInstrumentor().instrument())
Contributor

@owais owais Oct 22, 2020

Why is the result of instrument() being passed to worker? Is it even expected or required for a worker to be called like this in post_fork? AFAIK, you can ignore the server and worker objects entirely and just perform general initialization, but maybe I'm wrong.

Author

I think you're right, actually. I was thinking about that too, but I thought instrumentation had to pass through a worker. I see now and agree that calling it inside post_fork is sufficient.

.. code-block:: python

    def post_fork(server, worker):
        worker(DjangoInstrumentor().instrument())
Contributor

@owais owais Oct 22, 2020

As far as I remember, tracing as a whole needs to be set up in post_fork, including creating the tracer provider, span processors, instrumentation, etc., especially if the batch span processor is used. Any reason to only mention instrumentation here?

Author

Are you suggesting that I create a pipeline in post_fork? I was thinking more of using opentelemetry as an agent. I could be wrong.

Author

I see that you recommended a pipeline. Let me put that in effect.


.. code-block:: python

def post_fork(server, worker):
Contributor

I have seen some problems with Gunicorn's forking process when working with OpenCensus, like here and here. The main issue was that the worker thread responsible for sending spans to the exporter (BatchSpanProcessor in this case) was not copied to the child processes, because Gunicorn uses os.fork() to spawn them. Does this post_fork() configuration fix this?

Contributor

AFAIK, this is not enough. The entire trace pipeline (particularly the batch processor) needs to be set up in post_fork for it to work.
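
To illustrate the concern raised in this thread, here is a minimal sketch using only the standard library. It is not part of the PR and does not use OpenTelemetry: the background thread merely stands in for a batch processor's export thread, and the function name count_threads_across_fork is hypothetical. It assumes a POSIX platform where os.fork() is available.

```python
import os
import threading


def count_threads_across_fork():
    """Return (threads alive in the parent, threads alive in the forked child)."""
    stop = threading.Event()
    # Stand-in for an exporter thread created before the fork.
    exporter_thread = threading.Thread(target=stop.wait, daemon=True)
    exporter_thread.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: report how many threads survived the fork, then exit
        # immediately without running any of the parent's cleanup handlers.
        try:
            os.close(read_fd)
            os.write(write_fd, str(threading.active_count()).encode())
            os.close(write_fd)
        finally:
            os._exit(0)

    # Parent process: read the child's report and reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_count = int(reader.read())
    os.waitpid(pid, 0)

    stop.set()
    exporter_thread.join()
    return parent_count, child_count
```

In CPython only the thread that called os.fork() exists in the child, so a thread started before the fork is not there to export spans. That is consistent with the advice above to build the whole pipeline inside post_fork.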

Member

@aabmass aabmass left a comment

The django example docs (and example code) should be updated too I believe. https://opentelemetry-python.readthedocs.io/en/latest/examples/django/README.html

@owais is there an elegant way to handle test vs. production to avoid instrumenting the application twice?
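
One possible guard against double instrumentation, offered only as a sketch: remember which process has already been initialized and skip repeat calls. The helper init_once and its module-level flag are hypothetical and not an OpenTelemetry API; the setup callable stands in for whatever creates the tracer provider and calls instrument(). Keying the flag on the process id means a freshly forked worker still runs setup once for itself. The sketch is not thread-safe and assumes initialization happens from a single thread at startup.

```python
import os

# Process id of the process that last ran setup, or None if none has.
_initialized_pid = None


def init_once(setup):
    """Run setup() at most once per process; return True if it ran."""
    global _initialized_pid
    pid = os.getpid()
    if _initialized_pid == pid:
        # Already initialized in this process, for example by an earlier import.
        return False
    setup()
    _initialized_pid = pid
    return True
```

Calling init_once(init_tracing_function) from several entry points would then initialize tracing on the first call in each process and do nothing on later calls.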

Contributor

owais commented Oct 23, 2020

@aabmass one could initialize tracing in wsgi.py. That way the application is only ever instrumented once, in both dev and prod, because each gunicorn worker imports wsgi.py after forking. The downside is that other Django modules are imported before wsgi.py, which results in a subtle bug (#1276).

I found that initializing tracing in both manage.py and gunicorn.config.py works well for dev and production. Add a tracing.py file to your Django project that looks something like this:

from logging import getLogger
from pkg_resources import iter_entry_points

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    BatchExportSpanProcessor,
    ConsoleSpanExporter,
)

logger = getLogger(__name__)


def init_tracing():
    # Build the whole pipeline here so that each process that calls this
    # gets its own provider, span processor and exporter.
    provider = TracerProvider()
    trace.set_tracer_provider(provider)

    provider.add_span_processor(BatchExportSpanProcessor(ConsoleSpanExporter()))
    auto_instrument()


# Ideally this function would be provided out of the box as
# `opentelemetry.instrumentation.auto_instrumentation.auto_instrument`.
def auto_instrument():
    # Load and apply every instrumentor registered under this entry point group.
    for entry_point in iter_entry_points("opentelemetry_instrumentor"):
        try:
            entry_point.load()().instrument()
        except Exception:
            logger.exception("Instrumenting of %s failed", entry_point.name)

and then import and call init_tracing() from both manage.py and gunicorn.config.py:

manage.py

#!/usr/bin/env python
"""Django's command-line utility for administrative tasks."""
import os
import sys

from djtest.tracing import init_tracing  # change 1

def main():
    """Run administrative tasks."""
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'djtest.settings')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        raise ImportError(
            "Couldn't import Django. Are you sure it's installed and "
            "available on your PYTHONPATH environment variable? Did you "
            "forget to activate a virtual environment?"
        ) from exc
    init_tracing()  # change 2
    execute_from_command_line(sys.argv)


if __name__ == '__main__':
    main()

gunicorn.config.py

from djtest.tracing import init_tracing

def post_fork(server, worker):
    init_tracing()

I've seen some projects start gunicorn through manage.py, like ./manage.py gunicorn, but I don't think this is used anymore in new Django projects since Django added a native wsgi.py to projects.
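
The auto_instrument() helper in the comment above relies on pkg_resources, which setuptools has since deprecated. Below is a hedged sketch of the same loop on top of the standard library's importlib.metadata. The function discover, the points parameter and the returned name lists are hypothetical additions for illustration; only the "opentelemetry_instrumentor" group name comes from the comment above.

```python
import logging
from importlib.metadata import entry_points

logger = logging.getLogger(__name__)


def discover(group):
    """Return the installed entry points registered under the given group."""
    found = entry_points()
    if hasattr(found, "select"):
        # Newer Python versions expose a selectable collection.
        return list(found.select(group=group))
    # Older versions return a plain dict keyed by group name.
    return list(found.get(group, []))


def auto_instrument(group="opentelemetry_instrumentor", points=None):
    """Instrument every discovered entry point; return (succeeded, failed) names."""
    if points is None:
        points = discover(group)
    succeeded, failed = [], []
    for entry_point in points:
        try:
            # load() returns the instrumentor class; instantiate and apply it.
            entry_point.load()().instrument()
            succeeded.append(entry_point.name)
        except Exception:
            logger.exception("Instrumenting of %s failed", entry_point.name)
            failed.append(entry_point.name)
    return succeeded, failed
```

As in the original helper, a failing instrumentor is logged and skipped so that one broken package does not prevent the others from being instrumented.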

@evananyonga
Author

Please check out my changes in #1286

srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020
@codeboten
Contributor

Closing this in favour of #1286

@codeboten codeboten closed this Nov 5, 2020