Memory management utils

Utility functions for memory management, currently primarily for the GPU.

gpu_mem_get[source]

gpu_mem_get(id=None)

Get total, used and free memory (in MBs) for the given gpu id. If id is not passed, the currently selected torch device is used.

gpu_mem_get

  • for gpu returns GPUMemory(total, used, free)
  • for cpu returns GPUMemory(0, 0, 0)
  • for invalid gpu id returns GPUMemory(0, 0, 0)
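
For example (the printed numbers are illustrative):

from fastai.utils.mem import gpu_mem_get

mem = gpu_mem_get()    # stats for the currently selected torch device
print(mem)             # e.g. GPUMemory(total=8119, used=1322, free=6797)
mem0 = gpu_mem_get(0)  # stats for gpu 0 explicitly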

gpu_mem_get_all[source]

gpu_mem_get_all()

Get total, used and free memory (in MBs) for each available gpu.

gpu_mem_get_all

  • for gpu returns [ GPUMemory(total_0, used_0, free_0), GPUMemory(total_1, used_1, free_1), .... ]
  • for cpu returns []
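
For example, to print the stats for every visible gpu:

from fastai.utils.mem import gpu_mem_get_all

for id, mem in enumerate(gpu_mem_get_all()):
    print(f"gpu {id}: total={mem.total} used={mem.used} free={mem.free}")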

gpu_mem_get_free_no_cache[source]

gpu_mem_get_free_no_cache()

Get free memory (in MBs) for the currently selected gpu id, after emptying the cache.

gpu_mem_get_used_no_cache[source]

gpu_mem_get_used_no_cache()

Get used memory (in MBs) for the currently selected gpu id, after emptying the cache.
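
Since both functions empty the cache first, the numbers reflect memory pytorch actually holds rather than cached allocations it could reuse. For example:

from fastai.utils.mem import gpu_mem_get_free_no_cache, gpu_mem_get_used_no_cache

free = gpu_mem_get_free_no_cache()  # free MBs after the cache is emptied
used = gpu_mem_get_used_no_cache()  # used MBs after the cache is emptied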

gpu_mem_get_used_fast[source]

gpu_mem_get_used_fast(gpu_handle)

Get used memory (in MBs) for the currently selected gpu id, without emptying the cache; requires a gpu_handle argument.
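
This makes it suitable for tight measurement loops, where looking the device up on every call would add overhead. A minimal sketch, assuming the handle is obtained via pynvml (nvidia-ml-py3):

import pynvml
from fastai.utils.mem import gpu_mem_get_used_fast

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # handle for gpu 0
for _ in range(100):
    used = gpu_mem_get_used_fast(handle)       # used MBs, cheap to call repeatedly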

gpu_with_max_free_mem[source]

gpu_with_max_free_mem()

Get [gpu_id, its_free_ram] for the gpu with the most free RAM (the first such gpu, in case of a tie).

gpu_with_max_free_mem:

  • for gpu returns: gpu_with_max_free_ram_id, its_free_ram
  • for cpu returns: None, 0
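
For example, to run on whichever gpu currently has the most free RAM (a minimal sketch):

import torch
from fastai.utils.mem import gpu_with_max_free_mem

id, free = gpu_with_max_free_mem()
if id is not None:  # None means no gpu is available
    torch.cuda.set_device(id)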

preload_pytorch[source]

preload_pytorch()

preload_pytorch is helpful when GPU memory is being measured, since the first time any CUDA operation is performed by pytorch, usually about 0.5GB of GPU RAM gets consumed by the CUDA context.
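
So call it once before taking a baseline measurement, otherwise the first measured operation absorbs that one-time context cost:

from fastai.utils.mem import preload_pytorch, gpu_mem_get

preload_pytorch()         # pay the ~0.5GB CUDA context cost up front
baseline = gpu_mem_get()  # the baseline now excludes the context setup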

class GPUMemory

GPUMemory(total, used, free) :: tuple

GPUMemory is a namedtuple that is returned by functions like gpu_mem_get and gpu_mem_get_all.
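
Being a namedtuple, its fields can be read by name or unpacked:

from fastai.utils.mem import gpu_mem_get

mem = gpu_mem_get()
print(mem.free)           # fields are accessible by name
total, used, free = mem   # or it unpacks like a regular tuple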

b2mb[source]

b2mb(num)

Convert bytes (Bs) to megabytes (MBs) and round down.

b2mb is a helper utility that just does int(bytes/2**20)
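
For example:

from fastai.utils.mem import b2mb

b2mb(2**20)      # 1
b2mb(1_500_000)  # 1 (rounds down)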

Memory Tracing Utils

class GPUMemTrace[source]

GPUMemTrace(silent=False)

Trace GPU allocated and peaked memory usage

Usage examples:

from fastai.utils.mem import GPUMemTrace
memtrace = GPUMemTrace()
memtrace.start() # start tracing

def some_code(): pass

some_code()
memtrace.report() # print intermediary cumulative report
delta_used, delta_peaked = memtrace.data() # same but as data

some_code()
memtrace.report('2nd run') # print intermediary cumulative report
delta_used, delta_peaked = memtrace.data()

for i in range(10):
    memtrace.reset()
    some_code()
    memtrace.report(f'i={i}') # report for just the last code run since reset

# combine report+reset
memtrace.reset()
for i in range(10):
    some_code()
    memtrace.report_n_reset(f'i={i}') # report for just the last code run since reset

memtrace.stop() # stop the monitor thread

It can also be used as a context manager:

with GPUMemTrace() as memtrace:
    some_code()
delta_used, delta_peaked = memtrace.data()
memtrace.report("measured in ctx")

Workarounds to the leaky ipython traceback on exception

ipython has a feature where it stores the traceback (tb) with all the locals() tied into it, which prevents gc.collect() from freeing those variables and leads to a memory leak.

Therefore we cleanse the tb before handing it over to ipython. There are two ways of doing it: the gpu_mem_restore decorator and the gpu_mem_restore_ctx context manager, described next.

gpu_mem_restore[source]

gpu_mem_restore(func)

Reclaim GPU RAM if CUDA out of memory happened, or execution was interrupted

gpu_mem_restore is a decorator to be used with any function that interacts with CUDA (top-level is fine):

  • under a non-ipython environment it doesn't do anything.
  • under ipython it currently strips the tb by default only for the "CUDA out of memory" exception.

The env var FASTAI_TB_CLEAR_FRAMES changes this behavior when run under ipython, depending on its value:

  • "0": never strip tb (makes it possible to always use %debug magic, but with leaks)
  • "1": always strip tb (never need to worry about leaks, but %debug won't work)

e.g. os.environ['FASTAI_TB_CLEAR_FRAMES']="0" will set it to 0.
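
A minimal usage sketch (fit here is a hypothetical stand-in for any top-level function that interacts with CUDA):

from fastai.utils.mem import gpu_mem_restore

@gpu_mem_restore
def fit(learn):
    learn.fit_one_cycle(1, 1e-2)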

class gpu_mem_restore_ctx[source]

gpu_mem_restore_ctx()

Context manager to reclaim RAM if an exception happened under ipython.

If the function decorator is not a good option, you can use a context manager instead. For example:

with gpu_mem_restore_ctx():
    learn.fit_one_cycle(1, 1e-2)

This particular one clears the tb on any exception.