watchme.watchers.gpu package

Submodules

watchme.watchers.gpu.decorators module

Copyright (C) 2019 Vanessa Sochat.

This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/.

watchme.watchers.gpu.decorators.monitor_gpu(*args, **kwargs)[source]

A decorator to monitor a function every 3 seconds (or at a user-specified interval). We include one or more task names that include data we want to extract. We get the PID of the running function, and then use the gpu_task from gpu to watch it. The functools “wraps” ensures that the arguments (fargs, fkwargs) are passed from the calling function despite the wrapper. The following parameters can be provided to monitor resources:

Parameters
  • watcher – the watcher instance to use, used to save data to a “task” folder that starts with “decorator-<name>”

  • seconds – how often to collect data during the run

  • only – ignore skip and include, and only include this custom subset

  • skip – fields in the result to skip (list)

  • include – fields in the result to include back in (list)

  • create – whether to create the watcher on the fly (default False, so it must exist)

  • name – the suffix of the decorator-gpu-<name> folder. If not provided, defaults to the function name
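
The mechanism described above (wrap a function, note its PID, and collect data at a fixed interval while it runs) can be sketched as follows. This is a minimal illustration and not the watchme implementation: monitor_sketch, its sample argument, and the samples attribute are hypothetical names, a background thread stands in for whatever runner the library uses, and a placeholder sampler stands in for gpu_task.

```python
import functools
import os
import threading
import time


def monitor_sketch(seconds=3, sample=None):
    """Hypothetical stand-in for monitor_gpu: poll sample(pid) while a function runs."""
    # `sample` is a hypothetical callable; the real decorator relies on gpu_task.
    sample = sample or (lambda pid: {"pid": pid, "time": time.time()})

    def decorator(func):
        @functools.wraps(func)  # keep func's name and docstring on the wrapper
        def wrapper(*fargs, **fkwargs):
            results = []
            done = threading.Event()
            pid = os.getpid()  # PID of the process running the function

            def poll():
                # Take one sample right away, then one every `seconds`
                # until the wrapped function has finished.
                while True:
                    results.append(sample(pid))
                    if done.wait(seconds):
                        break

            thread = threading.Thread(target=poll)
            thread.start()
            try:
                value = func(*fargs, **fkwargs)
            finally:
                done.set()
                thread.join()
            wrapper.samples = results  # hypothetical place to keep the data
            return value

        return wrapper

    return decorator


@monitor_sketch(seconds=0.05)
def busy(n):
    time.sleep(0.2)
    return n * 2


print(busy(21), len(busy.samples))
```

The real decorator saves its data through the watcher instead of an attribute on the wrapper; the sketch only shows the wrapping and polling pattern.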

watchme.watchers.gpu.pynvml module

Copyright (C) 2019 Vanessa Sochat.

This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/.

The original license (2011-2015) is included below.

exception watchme.watchers.gpu.pynvml.NVMLError[source]

Bases: Exception

exception watchme.watchers.gpu.pynvml.NVMLError_AlreadyInitialized

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_CorruptedInforom

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_DriverNotLoaded

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_FunctionNotFound

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_GpuIsLost

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_InsufficientPower

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_InsufficientSize

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_InvalidArgument

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_IrqIssue

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_LibRmVersionMismatch

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_LibraryNotFound

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_NoPermission

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_NotFound

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_NotSupported

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_OperatingSystem

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_ResetRequired

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_Timeout

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_Uninitialized

Bases: watchme.watchers.gpu.pynvml.NVMLError

exception watchme.watchers.gpu.pynvml.NVMLError_Unknown

Bases: watchme.watchers.gpu.pynvml.NVMLError
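
Because every exception above derives from NVMLError, a caller can catch the base class to handle all of them at once. The sketch below assumes watchme is installed and importable as shown; count_gpus is a hypothetical helper, and the import is guarded so the example still runs on a machine without watchme or without an NVIDIA driver.

```python
def count_gpus():
    """Return the number of NVML devices, or None if NVML cannot be used."""
    try:
        # Assumption: watchme is installed. The guard keeps the sketch
        # runnable on machines where it is not.
        from watchme.watchers.gpu import pynvml
    except ImportError:
        return None

    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError as error:
        # Catches any subclass, for example NVMLError_LibraryNotFound
        # or NVMLError_DriverNotLoaded.
        print("NVML is not available: %s" % type(error).__name__)
        return None

    try:
        return int(pynvml.nvmlDeviceGetCount())
    except pynvml.NVMLError as error:
        print("Could not count devices: %s" % type(error).__name__)
        return None
    finally:
        # Release NVML once initialization has succeeded.
        pynvml.nvmlShutdown()


print(count_gpus())
```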

class watchme.watchers.gpu.pynvml.c_nvmlAccountingStats_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

gpuUtilization

Structure/Union member

isRunning

Structure/Union member

maxMemoryUsage

Structure/Union member

memoryUtilization

Structure/Union member

reserved

Structure/Union member

startTime

Structure/Union member

time

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlBAR1Memory_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

bar1Free

Structure/Union member

bar1Total

Structure/Union member

bar1Used

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlBridgeChipHierarchy_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

bridgeChipInfo

Structure/Union member

bridgeCount

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlBridgeChipInfo_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

fwVersion

Structure/Union member

type

Structure/Union member

watchme.watchers.gpu.pynvml.c_nvmlDevice_t

alias of watchme.watchers.gpu.pynvml.LP_struct_c_nvmlDevice_t

class watchme.watchers.gpu.pynvml.c_nvmlEccErrorCounts_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

deviceMemory

Structure/Union member

l1Cache

Structure/Union member

l2Cache

Structure/Union member

registerFile

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlEventData_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

device

Structure/Union member

eventData

Structure/Union member

eventType

Structure/Union member

watchme.watchers.gpu.pynvml.c_nvmlEventSet_t

alias of watchme.watchers.gpu.pynvml.LP_struct_c_nvmlEventSet_t

class watchme.watchers.gpu.pynvml.c_nvmlHwbcEntry_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

firmwareVersion

Structure/Union member

hwbcId

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlLedState_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

cause

Structure/Union member

color

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlMemory_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

free

Structure/Union member

total

Structure/Union member

used

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlPSUInfo_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

current

Structure/Union member

power

Structure/Union member

state

Structure/Union member

voltage

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlProcessInfo_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

pid

Structure/Union member

usedGpuMemory

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlSample_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

sampleValue

Structure/Union member

timeStamp

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlUnitFanInfo_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

speed

Structure/Union member

state

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlUnitFanSpeeds_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

count

Structure/Union member

fans

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlUnitInfo_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

firmwareVersion

Structure/Union member

id

Structure/Union member

name

Structure/Union member

serial

Structure/Union member

watchme.watchers.gpu.pynvml.c_nvmlUnit_t

alias of watchme.watchers.gpu.pynvml.LP_struct_c_nvmlUnit_t

class watchme.watchers.gpu.pynvml.c_nvmlUtilization_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

gpu

Structure/Union member

memory

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlValue_t[source]

Bases: _ctypes.Union

dVal

Structure/Union member

uiVal

Structure/Union member

ulVal

Structure/Union member

ullVal

Structure/Union member

class watchme.watchers.gpu.pynvml.c_nvmlViolationTime_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

referenceTime

Structure/Union member

violationTime

Structure/Union member

watchme.watchers.gpu.pynvml.ensureUtfEncoding(value)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceClearAccountingPids(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceClearCpuAffinity(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceClearEccErrorCounts(handle, counterType)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetAPIRestriction(device, apiType)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetAccountingBufferSize(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetAccountingMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetAccountingPids(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetAccountingStats(handle, pid)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetApplicationsClock(handle, type)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetAutoBoostedClocksEnabled(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetBAR1MemoryInfo(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetBoardId(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetBrand(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetBridgeChipInfo(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetClockInfo(handle, type)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetComputeMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetComputeRunningProcesses(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetCount()[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetCpuAffinity(handle, cpuSetSize)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetCurrPcieLinkGeneration(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetCurrPcieLinkWidth(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetCurrentClocksThrottleReasons(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetCurrentDriverModel(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetCurrentEccMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetCurrentGpuOperationMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetDecoderUtilization(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetDefaultApplicationsClock(handle, type)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetDetailedEccErrors(handle, errorType, counterType)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetDisplayActive(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetDisplayMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetDriverModel(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetEccMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetEncoderUtilization(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetEnforcedPowerLimit(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetFanSpeed(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetGpuOperationMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetGraphicsRunningProcesses(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetHandleByIndex(index)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetHandleByPciBusId(pciBusId)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetHandleBySerial(serial)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetHandleByUUID(uuid)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetIndex(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetInforomConfigurationChecksum(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetInforomImageVersion(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetInforomVersion(handle, infoRomObject)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetMaxClockInfo(handle, type)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetMaxPcieLinkGeneration(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetMaxPcieLinkWidth(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetMemoryErrorCounter(handle, errorType, counterType, locationType)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetMemoryInfo(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetMinorNumber(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetMultiGpuBoard(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetName(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPciInfo(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPcieReplayCounter(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPcieThroughput(device, counter)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPendingDriverModel(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPendingEccMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPendingGpuOperationMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPerformanceState(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPersistenceMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPowerManagementDefaultLimit(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPowerManagementLimit(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPowerManagementMode(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPowerState(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetPowerUsage(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetRetiredPages(device, sourceFilter)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetRetiredPagesPendingStatus(device)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetSamples(device, sampling_type, timeStamp)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetSerial(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetSupportedClocksThrottleReasons(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetSupportedEventTypes(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetSupportedGraphicsClocks(handle, memoryClockMHz)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetSupportedMemoryClocks(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetTemperature(handle, sensor)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetTemperatureThreshold(handle, threshold)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetTopologyCommonAncestor(device1, device2)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetTopologyNearestGpus(device, level)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetTotalEccErrors(handle, errorType, counterType)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetUUID(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetUtilizationRates(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetVbiosVersion(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceGetViolationStatus(device, perfPolicyType)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceOnSameBoard(handle1, handle2)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceRegisterEvents(handle, eventTypes, eventSet)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceResetApplicationsClocks(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetAPIRestriction(handle, apiType, isRestricted)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetAccountingMode(handle, mode)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetApplicationsClocks(handle, maxMemClockMHz, maxGraphicsClockMHz)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetAutoBoostedClocksEnabled(handle, enabled)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetComputeMode(handle, mode)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetCpuAffinity(handle)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetDefaultAutoBoostedClocksEnabled(handle, enabled, flags)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetDriverModel(handle, model)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetEccMode(handle, mode)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetGpuOperationMode(handle, mode)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetPersistenceMode(handle, mode)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceSetPowerManagementLimit(handle, limit)[source]
watchme.watchers.gpu.pynvml.nvmlDeviceValidateInforom(handle)[source]
watchme.watchers.gpu.pynvml.nvmlErrorString(result)[source]
watchme.watchers.gpu.pynvml.nvmlEventSetCreate()[source]
watchme.watchers.gpu.pynvml.nvmlEventSetFree(eventSet)[source]
watchme.watchers.gpu.pynvml.nvmlEventSetWait(eventSet, timeoutms)[source]
class watchme.watchers.gpu.pynvml.nvmlFriendlyObject(dictionary)[source]

Bases: object

watchme.watchers.gpu.pynvml.nvmlFriendlyObjectToStruct(obj, model)[source]
watchme.watchers.gpu.pynvml.nvmlInit()[source]
class watchme.watchers.gpu.pynvml.nvmlPciInfo_t[source]

Bases: watchme.watchers.gpu.pynvml._PrintableStructure

bus

Structure/Union member

busId

Structure/Union member

device

Structure/Union member

domain

Structure/Union member

pciDeviceId

Structure/Union member

pciSubSystemId

Structure/Union member

reserved0

Structure/Union member

reserved1

Structure/Union member

reserved2

Structure/Union member

reserved3

Structure/Union member

watchme.watchers.gpu.pynvml.nvmlShutdown()[source]
watchme.watchers.gpu.pynvml.nvmlStructToFriendlyObject(struct)[source]
watchme.watchers.gpu.pynvml.nvmlSystemGetDriverVersion()[source]
watchme.watchers.gpu.pynvml.nvmlSystemGetHicVersion()[source]
watchme.watchers.gpu.pynvml.nvmlSystemGetNVMLVersion()[source]
watchme.watchers.gpu.pynvml.nvmlSystemGetProcessName(pid)[source]
watchme.watchers.gpu.pynvml.nvmlSystemGetTopologyGpuSet(cpuNumber)[source]
watchme.watchers.gpu.pynvml.nvmlUnitGetCount()[source]
watchme.watchers.gpu.pynvml.nvmlUnitGetDeviceCount(unit)[source]
watchme.watchers.gpu.pynvml.nvmlUnitGetDevices(unit)[source]
watchme.watchers.gpu.pynvml.nvmlUnitGetFanSpeedInfo(unit)[source]
watchme.watchers.gpu.pynvml.nvmlUnitGetHandleByIndex(index)[source]
watchme.watchers.gpu.pynvml.nvmlUnitGetLedState(unit)[source]
watchme.watchers.gpu.pynvml.nvmlUnitGetPsuInfo(unit)[source]
watchme.watchers.gpu.pynvml.nvmlUnitGetTemperature(unit, type)[source]
watchme.watchers.gpu.pynvml.nvmlUnitGetUnitInfo(unit)[source]
watchme.watchers.gpu.pynvml.nvmlUnitSetLedState(unit, color)[source]
class watchme.watchers.gpu.pynvml.struct_c_nvmlDevice_t[source]

Bases: _ctypes.Structure

class watchme.watchers.gpu.pynvml.struct_c_nvmlEventSet_t[source]

Bases: _ctypes.Structure

class watchme.watchers.gpu.pynvml.struct_c_nvmlUnit_t[source]

Bases: _ctypes.Structure
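
A typical read-only session with the functions listed above initializes NVML, looks up a handle per device, reads a few structures, and shuts down. The following is a hedged sketch rather than code from watchme: gpu_snapshot and the dictionary keys are hypothetical, and it assumes the usual NVML conventions that memory values are reported in bytes and utilization rates in percent.

```python
def gpu_snapshot():
    """Return one dict per device, or an empty list if NVML is unusable."""
    try:
        # Assumption: watchme is installed and ships this module.
        from watchme.watchers.gpu import pynvml
    except ImportError:
        return []

    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return []

    devices = []
    try:
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            memory = pynvml.nvmlDeviceGetMemoryInfo(handle)       # c_nvmlMemory_t
            rates = pynvml.nvmlDeviceGetUtilizationRates(handle)  # c_nvmlUtilization_t
            devices.append({
                "index": index,
                "name": pynvml.nvmlDeviceGetName(handle),
                "memory_total": memory.total,
                "memory_used": memory.used,
                "memory_free": memory.free,
                "gpu_utilization": rates.gpu,
                "memory_utilization": rates.memory,
            })
    except pynvml.NVMLError:
        # Some devices do not support every query (NVMLError_NotSupported);
        # this sketch simply stops collecting at the first failure.
        pass
    finally:
        pynvml.nvmlShutdown()
    return devices


print(gpu_snapshot())
```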

watchme.watchers.gpu.tasks module

watchme.watchers.gpu.tasks.gpu_task(**kwargs)[source]

Get variables about the GPU of the host. No parameters are required. We’ve already instantiated the Task object and have checked that the calling host has an NVML GPU.

Parameters

skip – an optional list of (comma-separated) fields to skip. Can be in net_io_counters,net_connections,net_if_address,net_if_stats
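
The effect of a comma-separated skip value can be illustrated with a small filter over a result dictionary. This is a sketch under assumptions, not the body of gpu_task: drop_skipped is a hypothetical helper and the field names in sample are invented for the example.

```python
def drop_skipped(result, skip=""):
    """Remove the comma-separated field names in `skip` from a result dict."""
    skipped = {name.strip() for name in skip.split(",") if name.strip()}
    return {key: value for key, value in result.items() if key not in skipped}


# Hypothetical field names, used only to show the filtering.
sample = {"driver_version": "hypothetical", "device_count": 1, "memory_used": 1024}
kept = drop_skipped(sample, skip="device_count, memory_used")
print(kept)
```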

Module contents

class watchme.watchers.gpu.Task(name, params=None, **kwargs)[source]

Bases: watchme.tasks.TaskBase

assert_gpu()[source]

has_gpu is run from the get-go to check whether there are any libraries for the client to read from. If not, we alert the user and exit.

export_func()[source]

This function should return the correct task (from the tasks.py in the same folder) based on some logic applied to the params given by the user (self.params). If there is only one kind of function for the task, it is easy to import and return it here. This function should take no arguments, and should instead use the self.params already provided in the client.

required_params = []
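
The export_func contract described above (no arguments, return the task function, read everything from self.params) can be sketched with stand-in classes. Everything here is hypothetical: TaskBaseSketch is not watchme.tasks.TaskBase, gpu_task_sketch is not gpu_task, and the way the returned function is called is an assumption made for illustration.

```python
class TaskBaseSketch:
    """Hypothetical stand-in for watchme.tasks.TaskBase (not the real class)."""

    required_params = []

    def __init__(self, name, params=None, **kwargs):
        self.name = name
        self.params = params or {}

    def export_func(self):
        raise NotImplementedError


def gpu_task_sketch(**kwargs):
    """Hypothetical task function; it just echoes the keyword names it received."""
    return {"received": sorted(kwargs)}


class GpuTaskSketch(TaskBaseSketch):
    def export_func(self):
        # Only one kind of function exists for this task, so return it
        # directly; a task with several functions would branch on self.params.
        return gpu_task_sketch


task = GpuTaskSketch("task-gpu", params={"skip": "memory_used"})
func = task.export_func()
print(func(**task.params))
```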