Resource Trackers

Purpose: understand how cluster compute resources are used by complex, intensive data engineering and model development jobs. The trackers report active containers, cores in use, and memory (RAM) committed to jobs run through the available tools.

Hadoop Tracker: provides a view of resource utilization across all YARN-managed jobs.
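
This page does not say where the Hadoop Tracker gets its data. As an illustration only, the same three figures (containers, cores, memory) can be totalled from the YARN ResourceManager REST API, which lists running applications at /ws/v1/cluster/apps. The sketch below assumes that endpoint is reachable; the function names and the rm_url argument are hypothetical, not part of the tracker.

```python
import json
from urllib.request import urlopen


def fetch_running_apps(rm_url):
    """Fetch RUNNING applications from a YARN ResourceManager.

    rm_url is an assumed base address for the ResourceManager web service;
    substitute the one for your cluster.
    """
    with urlopen(f"{rm_url}/ws/v1/cluster/apps?states=RUNNING") as resp:
        return json.load(resp)


def summarize_apps(payload):
    """Total containers, vcores and memory (MB) across the listed applications."""
    # YARN returns {"apps": null} when nothing matches the filter.
    apps = (payload.get("apps") or {}).get("app") or []
    totals = {"containers": 0, "vcores": 0, "memory_mb": 0}
    for app in apps:
        # Applications that are not running can report -1; clamp those to 0.
        totals["containers"] += max(0, app.get("runningContainers", 0))
        totals["vcores"] += max(0, app.get("allocatedVCores", 0))
        totals["memory_mb"] += max(0, app.get("allocatedMB", 0))
    return totals


# Hand-written sample in the shape of the REST response (not real cluster data).
sample = {
    "apps": {
        "app": [
            {"runningContainers": 3, "allocatedVCores": 6, "allocatedMB": 12288},
            {"runningContainers": 1, "allocatedVCores": 1, "allocatedMB": 2048},
        ]
    }
}
totals = summarize_apps(sample)
```

Against a live cluster, summarize_apps(fetch_running_apps(rm_url)) would give the current totals.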

Spark Application Tracker: provides a detailed view of resource utilization and environment variables for jobs running in Spark (e.g. Jupyter notebooks with PySpark kernels).
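
For a single Spark application, comparable detail is exposed by Spark's monitoring REST API under /api/v1/applications/<app-id>/executors and /api/v1/applications/<app-id>/environment. The sketch below is an assumption about how such figures could be read, not a description of how the Spark Application Tracker is built. It works on already-fetched JSON, and the helper names are hypothetical.

```python
def summarize_executors(executors):
    """Total cores and memory over active executors from the /executors response."""
    active = [e for e in executors if e.get("isActive", True)]
    return {
        "executors": len(active),
        "cores": sum(e.get("totalCores", 0) for e in active),
        "memory_used_bytes": sum(e.get("memoryUsed", 0) for e in active),
        "memory_max_bytes": sum(e.get("maxMemory", 0) for e in active),
    }


def spark_properties(environment):
    """Turn the [key, value] pairs in the /environment response into a dict."""
    return {key: value for key, value in environment.get("sparkProperties", [])}


# Hand-written samples in the shape of the REST responses (not real job data).
sample_executors = [
    {"id": "1", "isActive": True, "totalCores": 4, "memoryUsed": 100, "maxMemory": 1000},
    {"id": "2", "isActive": True, "totalCores": 4, "memoryUsed": 250, "maxMemory": 1000},
    {"id": "3", "isActive": False, "totalCores": 4, "memoryUsed": 0, "maxMemory": 1000},
]
sample_environment = {
    "sparkProperties": [["spark.executor.memory", "4g"], ["spark.executor.cores", "4"]]
}

executor_totals = summarize_executors(sample_executors)
properties = spark_properties(sample_environment)
```

From inside a PySpark notebook, the same configuration is also available through spark.sparkContext.getConf().getAll().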
