Deleting Images from a Harbor Registry

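Cleaning up images in a Docker registry has always been a chore, especially in a test environment, where a large number of builds run every day. Those builds produce many historical images, most of which are never used again.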

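In Harbor, cleaning up images is a two-step process. The first step is to delete the historical images from the UI. At that point the images are not actually removed from storage; fortunately, Harbor also provides a way to really delete them.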

Enough preamble; here are the steps.

Cleaning up images in the UI

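Deleting images in the UI by clicking through them one at a time is a disaster when a large number of images need to be removed, and it is a crude way to work in any case.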

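I wrote a simple script that does the following:

  1. Iterate over all projects
  2. Find every repository in each project that has more than 30 tags
  3. Fetch all tags of those repositories
  4. Sort the tags by creation time and keep the newest 30
  5. Delete the remaining tags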

Example script:

#! /usr/bin/env python
# -*- coding:utf-8 -*-


import requests
import json


class RequestClient(object):

    def __init__(self,login_url, username, password):
        self.username = username
        self.password =  password
        self.login_url = login_url
        self.session = requests.Session()
        self.login()

    def login(self):
        self.session.post(self.login_url, params={"principal": self.username, "password": self.password})

class ClearHarbor(object):
    
    def __init__(self, harbor_domain, password, schema="https",
                 username="admin"):
        self.schema = schema
        self.harbor_domain = harbor_domain
        self.harbor_url = self.schema + "://" + self.harbor_domain
        self.login_url = self.harbor_url + "/login"
        self.api_url = self.harbor_url + "/api"
        self.pro_url = self.api_url + "/projects"
        self.repos_url = self.api_url + "/repositories"
        self.username = username
        self.password = password
        self.client = RequestClient(self.login_url, self.username, self.password)

    def __fetch_pros_obj(self):
        # TODO
        self.pros_obj = self.client.session.get(self.pro_url).json()
        return self.pros_obj

    def fetch_pros_id(self):
        self.pros_id = []
        # TODO
        pro_res = self.__fetch_pros_obj()
        for i in pro_res:
            self.pros_id.append(i['project_id'])
        return self.pros_id

    def fetch_del_repos_name(self, pro_id):
        self.del_repos_name = []
        repos_res = self.client.session.get(self.repos_url, params={"project_id": pro_id})
        # TODO
        for repo in repos_res.json():
            if repo["tags_count"] > 30: 
                self.del_repos_name.append(repo['name'])
        return self.del_repos_name

    def fetch_del_repos(self, repo_name):
        self.del_res = []
        tag_url = self.repos_url + "/" + repo_name + "/tags"
        # TODO
        tags = self.client.session.get(tag_url).json()
        tags_sort = sorted(tags, key=lambda a: a["created"])
        #print(tags_sort) 
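        # Keep the newest 30 tags; max() avoids a negative slice bound when there are fewer than 30
        del_tags = tags_sort[:max(0, len(tags_sort) - 30)]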
        #print(del_tags)
        for tag in del_tags:
            del_repo_tag_url = tag_url + "/" + tag['name']
            print(del_repo_tag_url)
            del_res = self.client.session.delete(del_repo_tag_url)
            self.del_res.append(del_res)

        return self.del_res


if __name__ == "__main__":
   
    harbor_domain = "hub.test.com" 
    password = "xxxxxxx"
    res = ClearHarbor(harbor_domain,password)
    # Iterate over every project id
    for i in res.fetch_pros_id():
        # Fetch the repositories that have more than 30 tags
        repos = res.fetch_del_repos_name(i)
        if repos:
            print(repos)   
            for repo in repos:
                del_repos = res.fetch_del_repos(repo)
                print(del_repos)
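
The tag-selection step can also be pulled out into a small pure function, which makes it easy to try on sample data without touching Harbor. This is only a sketch under stated assumptions: the function name `select_tags_to_delete` is hypothetical, each tag is assumed to be a dict with a `created` timestamp string that sorts chronologically (as the script above already relies on), and the optional `digest` key is an assumption about the tag objects. Old tags that share a digest with a kept tag are skipped, because the official documentation quoted further down warns that deleting one tag also deletes other tags that refer to the same image.

```python
def select_tags_to_delete(tags, keep=30):
    """Return the tags to delete (oldest first), keeping the newest `keep` tags."""
    # Oldest first, using the same sort key as the script above.
    ordered = sorted(tags, key=lambda t: t["created"])
    # Number of old tags that fall outside the keep window (never negative).
    cut = max(0, len(ordered) - keep)
    kept = ordered[cut:]
    # Assumption: a tag dict may carry a "digest" key identifying the image.
    kept_digests = set(t["digest"] for t in kept if t.get("digest"))
    # Skip old tags whose image is still referenced by a tag we want to keep.
    return [t for t in ordered[:cut] if t.get("digest") not in kept_digests]
```

Printing the names returned by this function before issuing any DELETE request gives a cheap dry run of the cleanup.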

  

Cleaning up images to free space

 

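The usage is still 286 MB. Why? After reading the user guide on the official GitHub repository, I finally found the answer: deleting an image in the web UI is a soft deletion and does not free any space. After deleting in the web UI, you must stop Harbor and then perform a hard deletion, that is, free the space through garbage collection. The original text of the official documentation follows.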

Deleting repositories

Repository deletion runs in two steps.

First, delete a repository in Harbor's UI. This is soft deletion. You can delete the entire repository or just a tag of it. After the soft deletion, the repository is no longer managed in Harbor, however, the files of the repository still remain in Harbor's storage.


CAUTION: If both tag A and tag B refer to the same image, after deleting tag A, B will also get deleted. if you enabled content trust, you need to use notary command line tool to delete the tag's signature before you delete an image.

Next, delete the actual files of the repository using the registry's garbage collection(GC). Make sure that no one is pushing images or Harbor is not running at all before you perform a GC. If someone were pushing an image while GC is running, there is a risk that the image's layers will be mistakenly deleted which results in a corrupted image. So before running GC, a preferred approach is to stop Harbor first.

Run the below commands on the host which Harbor is deployed on to preview what files/images will be affected:

$ docker-compose stop

$ docker run -it --name gc --rm --volumes-from registry vmware/registry:2.6.2-photon garbage-collect --dry-run /etc/registry/config.yml

NOTE: The above option "--dry-run" will print the progress without removing any data.

Verify the result of the above test, then use the below commands to perform garbage collection and restart Harbor.

$ docker run -it --name gc --rm --volumes-from registry vmware/registry:2.6.2-photon garbage-collect  /etc/registry/config.yml

$ docker-compose start

For more information about GC, please see GC.

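The official documentation is clear: the first run (with --dry-run) only prints what would be removed, without freeing space or collecting garbage. After executing the second run, the space was freed successfully.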
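
The stop, garbage-collect, start sequence from the quoted documentation could be scripted roughly as follows. This is a minimal sketch rather than an official tool: `harbor_dir` (the directory that holds Harbor's docker-compose.yml) and the helper names are hypothetical, the image tag is copied from the quote and has to match your own installation, and the `-it` flags are left out on the assumption that the script runs unattended.

```python
import subprocess

# Image tag copied from the quoted documentation; change it to match your Harbor release.
GC_IMAGE = "vmware/registry:2.6.2-photon"
REGISTRY_CONFIG = "/etc/registry/config.yml"


def build_gc_command(dry_run=True, image=GC_IMAGE):
    """Build the `docker run ... garbage-collect` command as an argument list."""
    # "-it" from the documented command is omitted: this sketch runs unattended.
    cmd = ["docker", "run", "--name", "gc", "--rm",
           "--volumes-from", "registry", image, "garbage-collect"]
    if dry_run:
        cmd.append("--dry-run")  # only report what would be removed
    cmd.append(REGISTRY_CONFIG)
    return cmd


def run_gc(harbor_dir, dry_run=True, runner=subprocess.check_call):
    """Stop Harbor, run garbage collection, then start Harbor again."""
    runner(["docker-compose", "stop"], cwd=harbor_dir)
    try:
        runner(build_gc_command(dry_run), cwd=harbor_dir)
    finally:
        # Restart Harbor even if garbage collection failed.
        runner(["docker-compose", "start"], cwd=harbor_dir)
```

With `dry_run=True` (the default) the sketch only previews the result; passing `dry_run=False` performs the actual collection. Because Harbor is stopped for the duration, it is best run in a maintenance window.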


Run the garbage collection command docker run -it --name gc --rm --volumes-from registry vmware/registry:2.6.2-photon garbage-collect /etc/registry/config.yml, adjusting the image name to match your own deployment.


Screenshot of the space being freed

Start Harbor


Check whether the space has been freed
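
One way to compare usage before and after garbage collection is to total the file sizes under the registry's storage directory. The sketch below makes assumptions: the helper names are hypothetical, and the storage path you pass in (for example `/data/registry` on a default installation) is a guess that must be checked against your own volume configuration.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symbolic links so linked files are not counted twice.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def format_mb(num_bytes):
    """Format a byte count as megabytes with one decimal place."""
    return "%.1f MB" % (num_bytes / (1024.0 * 1024.0))
```

Recording `format_mb(dir_size_bytes(path))` once before and once after the collection shows how much space was actually reclaimed.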


 

Our space is back.

Shall we try pushing an image again?


That confirms everything is fine: the push succeeds.

Pulling works too.


That wraps up freeing registry space with garbage collection.

The official documentation is available at:

https://github.com/vmware/harbor/blob/master/docs/user_guide.md
