Debugging

This page describes how to debug in PyFlink.

Logging

Python UDFs can log contextual and debug information through the standard Python logging module.

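  from pyflink.table import DataTypes
  from pyflink.table.udf import udf

  @udf(input_types=[DataTypes.BIGINT(), DataTypes.BIGINT()], result_type=DataTypes.BIGINT())
  def add(i, j):
      import logging
      # Log a message each time the UDF is called.
      logging.info("debug")
      return i + j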
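
To make log messages carry context, a UDF body can use a named logger and include its input values. The following is a minimal sketch written as a plain Python function so that it runs without Flink; the logger name `my_udfs.add` and the function name `add_with_logging` are hypothetical, and in a job the function would be decorated with @udf as in the example above.

```python
import logging

# Hypothetical logger name; any dotted name can be used.
logger = logging.getLogger("my_udfs.add")


def add_with_logging(i, j):
    # %-style arguments are only formatted if the record is actually emitted.
    logger.info("add called with i=%s, j=%s", i, j)
    result = i + j
    logger.debug("add result=%s", result)
    return result
```

Whether the debug-level message appears depends on the log level configured for the process that runs the UDF.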

Accessing Logs

If the environment variable FLINK_HOME is set, logs are written to the log directory under FLINK_HOME. Otherwise, logs are placed in the log directory under the installation directory of the PyFlink module. You can execute the following command to find the log directory of the PyFlink module:

  $ python -c "import pyflink;import os;print(os.path.dirname(os.path.abspath(pyflink.__file__))+'/log')"
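
The lookup rule described above can also be sketched as a small Python helper. This is an illustration only: `resolve_log_dir` is a hypothetical function, not part of PyFlink, and it assumes that an empty FLINK_HOME counts as unset.

```python
import os


def resolve_log_dir(environ, pyflink_file):
    """Return the expected log directory for the given environment.

    environ:      a mapping such as os.environ
    pyflink_file: the value of pyflink.__file__
    """
    flink_home = environ.get("FLINK_HOME")
    if flink_home:
        # FLINK_HOME is set: logs go to its log subdirectory.
        return os.path.join(flink_home, "log")
    # Otherwise fall back to the log directory next to the PyFlink module.
    module_dir = os.path.dirname(os.path.abspath(pyflink_file))
    return os.path.join(module_dir, "log")
```

With PyFlink installed, it could be called as `resolve_log_dir(os.environ, pyflink.__file__)` after `import pyflink`.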

Debugging Python UDFs

You can use PyCharm's pydevd_pycharm tool to debug Python UDFs.

  1. Create a Python Remote Debug configuration in PyCharm

    run -> Python Remote Debug -> + -> choose a port (e.g. 6789)

  2. Install the pydevd-pycharm tool

    $ pip install pydevd-pycharm

  3. Add the following code to your Python UDF

    import pydevd_pycharm
    pydevd_pycharm.settrace('localhost', port=6789, stdoutToServer=True, stderrToServer=True)

  4. Start the previously created Python Remote Debug Server

  5. Run your Python code
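
Because the settrace call from step 3 tries to connect to the debug server when it runs, it can be convenient to guard it so that the UDF behaves normally when no debugging is wanted. The following is one possible sketch, not a PyFlink feature: the helper `maybe_attach_debugger` and the environment variable PYFLINK_REMOTE_DEBUG are hypothetical names, and the variable would need to be set in the environment of the process that executes the UDF.

```python
import os


def maybe_attach_debugger(environ=None, host="localhost", port=6789):
    """Attach to the PyCharm debug server only when explicitly enabled."""
    environ = os.environ if environ is None else environ
    # PYFLINK_REMOTE_DEBUG is a hypothetical switch, not a Flink option.
    if environ.get("PYFLINK_REMOTE_DEBUG") != "1":
        return False
    # Imported lazily so the package is only required when debugging is on.
    import pydevd_pycharm
    pydevd_pycharm.settrace(host, port=port,
                            stdoutToServer=True, stderrToServer=True)
    return True


def add(i, j):
    # Body of the UDF; decorate with @udf in a real job.
    maybe_attach_debugger()
    return i + j
```

The host and port must match the values chosen for the Python Remote Debug configuration in step 1.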