I am developing a simple REST controller with gunicorn and Flask.
On every REST call I execute the following code:
@app.route('/objects', methods=['GET'])
def get_objects():
    video_title = request.args.get('video_title')
    video_path = "../../video/" + video_title
    cl.logger.info(video_path)
    start = request.args.get('start')
    stop = request.args.get('stop')
    scene = [start, stop]
    frames = images_utils.extract_frames(video_path, scene[0], scene[1], 1)
    cl.logger.info(scene[0]+" "+scene[1])
    objects = list()
    ##objects
    model = GenericDetector('../resources/open_images/frozen_inference_graph.pb', '../resources/open_images/labels.txt')
    model.run(frames)
    for result in model.get_boxes_and_labels():
        if result is not None:
            objects.append(result)
    data = {'message': {
        'start_time': scene[0],
        'end_time': scene[1],
        'path': video_path,
        'objects': objects,
    }, 'metadata_type': 'detection'}
    return jsonify({'status': data}), 200
This code runs a frozen TensorFlow model, as shown below:
class GenericDetector(Process):
    def __init__(self, model, labels):
        # ## Load a (frozen) Tensorflow model into memory.
        self.detection_graph = tf.Graph()
        with self.detection_graph.as_default():
            od_graph_def = tf.GraphDef()
            with tf.gfile.GFile(model, 'rb') as fid:
                serialized_graph = fid.read()
                od_graph_def.ParseFromString(serialized_graph)
                tf.import_graph_def(od_graph_def, name='')
        self.boxes_and_labels = []
        # ## Loading label map
        with open(labels) as f:
            txt_labels = f.read()
            self.labels = json.loads(txt_labels)

    def run(self, frames):
        tf.reset_default_graph()
        with self.detection_graph.as_default():
            config = tf.ConfigProto()
            config.gpu_options.allow_growth = True
            with tf.Session(graph=self.detection_graph, config=config) as sess:
                image_tensor = self.detection_graph.get_tensor_by_name('image_tensor:0')
                # Each box represents a part of the image where a particular object was detected.
                detection_boxes = self.detection_graph.get_tensor_by_name('detection_boxes:0')
                # Each score represents the level of confidence for each of the objects.
                detection_scores = self.detection_graph.get_tensor_by_name('detection_scores:0')
                detection_classes = self.detection_graph.get_tensor_by_name('detection_classes:0')
                num_detections = self.detection_graph.get_tensor_by_name('num_detections:0')
                i = 0
                for frame in frames:
                    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
                    image_np_expanded = np.expand_dims(frame, axis=0)
                    # Actual detection.
                    (boxes, scores, classes, num) = sess.run(
                        [detection_boxes, detection_scores, detection_classes, num_detections],
                        feed_dict={image_tensor: image_np_expanded})
                    boxes = np.squeeze(boxes)
                    classes = np.squeeze(classes).astype(np.int32)
                    scores = np.squeeze(scores)
                    for j, box in enumerate(boxes):
                        if all(v == 0 for v in box):
                            continue
                        self.boxes_and_labels.append(
                            {
                                "ymin": str(box[0]),
                                "xmin": str(box[1]),
                                "ymax": str(box[2]),
                                "xmax": str(box[3]),
                                "label": self.labels[str(classes[j])],
                                "score": str(scores[j]),
                                "frame": i
                            })
                    i += 1
                sess.close()

    def get_boxes_and_labels(self):
        return self.boxes_and_labels
Everything seems to work as expected, but as soon as I send a second request to the server, my GPU (GTX 1050) runs out of memory:
ResourceExhaustedError (see above for traceback): OOM when allocating
tensor of shape [3,3,256,256] and type float
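If I make another call afterwards, it works most of the time. Sometimes it also works for subsequent calls. I tried running GenericDetector in a separate process (by making GenericDetector inherit from Process), but it did not help. I read that the GPU memory should be released once the process that executes the REST GET dies, so I also tried adding sleep(30) after running the TensorFlow model, with no luck. What am I doing wrong?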
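Solution:
The problem is that TensorFlow allocates GPU memory for the process, not for the Session, so closing the session is not enough (even if you set the allow_growth option). The TensorFlow documentation on GPU memory says: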
The first is the allow_growth option, which attempts to allocate only as much GPU memory based on runtime allocations: it starts out allocating very little memory, and as Sessions get run and more GPU memory is needed, we extend the GPU memory region needed by the TensorFlow process. Note that we do not release memory, since that can lead to even worse memory fragmentation.
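There is an issue on the TensorFlow GitHub with some solutions; for example, you can decorate your run method with the RunAsCUDASubprocess decorator proposed in that thread.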
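To illustrate the idea behind that suggestion, here is a minimal sketch of a decorator that runs a function in a short-lived child process, so that whatever the child allocated is returned to the system when it exits. This is not the RunAsCUDASubprocess code from the thread: run_in_subprocess and detect are hypothetical names, the TensorFlow work is replaced by a placeholder, and the sketch assumes a Unix host (it uses the "fork" start method) where the parent process has not itself initialised CUDA.

```python
import functools
import multiprocessing as mp
import os


def run_in_subprocess(func):
    """Run func in a fresh child process and return its (picklable) result.

    Hypothetical helper for illustration only.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Assumption: "fork" is available (Unix). The child inherits func
        # and its arguments, so only the result has to be pickled.
        ctx = mp.get_context("fork")
        recv_conn, send_conn = ctx.Pipe(duplex=False)

        def target():
            try:
                send_conn.send((True, func(*args, **kwargs)))
            except Exception as exc:
                # Report the failure as text; the parent re-raises it.
                send_conn.send((False, repr(exc)))
            finally:
                send_conn.close()

        proc = ctx.Process(target=target)
        proc.start()
        send_conn.close()  # the parent keeps only the read end
        try:
            # Read before join() so a large result cannot block the child.
            ok, payload = recv_conn.recv()
        except EOFError:
            ok, payload = False, "child exited without sending a result"
        proc.join()
        recv_conn.close()
        if not ok:
            raise RuntimeError(payload)
        return payload

    return wrapper


@run_in_subprocess
def detect(frames):
    # Placeholder: the real version would build the tf.Session here,
    # run the model on the frames, and return the detections.
    return {"pid": os.getpid(), "count": len(frames)}
```

Note that anything the wrapped function stores on an object (such as self.boxes_and_labels in the question) is changed only in the child and never reaches the parent, so the function has to return its results instead of keeping them as state.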